The MinIO AIStor Observability feature was designed specifically for the challenges of managing large-scale data infrastructure. With object-level granularity and awareness of the entire hardware stack, it delivers mission-critical information to those who need to keep the world running smoothly. Across metrics, logs, traces and health checks, MinIO distills its experience managing exabytes of data infrastructure into a simple yet powerful solution that monitors infrastructure at a granular level.
The audit log capability captures all system calls and system activity along with all user activity, delivering full visibility into who did what and when.
The error logs identify tough-to-diagnose problems such as drives that cannot connect and drives that have random read problems. These issues are fairly rare, which makes them particularly challenging for operations teams to find.
API metrics provide an overview of how the data is being accessed, with sensitivity down to the millisecond.
MinIO depends on the network and the drives to deliver its industry-leading performance. System metrics give full visibility into how they interact and where the issues in your infrastructure lurk.
While MinIO’s healing capabilities, from bit rot to drive failure, are well known, metrics on where the healing process stands or what was done were difficult to generate. With healing metrics, the operations team has all that information at its fingertips.
MinIO supports full metrics on data lifecycle management. Are objects making it where they need to go, when they are supposed to, and without unnecessary overhead? ILM metrics provide the insight.
MinIO’s rich replication capabilities require equally rich observability. Identify any bottlenecks or delays with the replication metrics and stay on top of the resilience game.
When you have millions, if not billions, of objects, a scanner comes in very handy. But who watches the scanner? With scanner metrics, it is easy to see the performance of scan jobs and identify any that fail to complete or do not complete in a timely manner.
Audit logs show logins, setting changes, cluster access and other activities, which you can sort through to check for specific activity.
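As a rough illustration of sorting through audit entries, the sketch below filters a handful of records by user and counts actions. The records and the field names (`time`, `user`, `action`) are hypothetical and chosen for readability; they are not the AIStor audit log schema.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON object per line. The field names
# ("time", "user", "action") are illustrative assumptions, not the real schema.
SAMPLE_LOG = """\
{"time": "2024-05-01T10:00:00Z", "user": "alice", "action": "login"}
{"time": "2024-05-01T10:05:00Z", "user": "bob", "action": "PutObject"}
{"time": "2024-05-01T10:07:00Z", "user": "alice", "action": "SetConfig"}
{"time": "2024-05-01T10:09:00Z", "user": "alice", "action": "PutObject"}
"""


def parse_records(text):
    """Parse newline-delimited JSON into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def actions_by_user(records, user):
    """Return the records for one user, ordered by timestamp."""
    matching = [r for r in records if r.get("user") == user]
    return sorted(matching, key=lambda r: r["time"])


def action_counts(records):
    """Count how often each action appears across all records."""
    return Counter(r.get("action", "unknown") for r in records)


records = parse_records(SAMPLE_LOG)
for entry in actions_by_user(records, "alice"):
    print(entry["time"], entry["action"])
print(action_counts(records))
```

The same pattern applies to any newline-delimited JSON export once the field names are swapped for the real ones.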
The full list of supported metrics is available and will be included in the documentation.
Yes. On the audit logs and API metrics Observability pages, you can see the API calls and events in a graph and then dig deeper into the logs.
At the moment, alerts can be configured via webhooks, which are available in the Events section.
Close to none. This is a very lightweight component that sends metrics data asynchronously.
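For readers wiring a webhook to their own alerting, a minimal receiver might look like the sketch below. It uses only the Python standard library. The payload field names (`EventName`, `Key`), the port, and the helper names `summarize_event`, `AlertHandler` and `serve` are assumptions for illustration; check them against the payload your deployment actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(body: bytes) -> str:
    """Turn a JSON webhook body into a one-line alert message."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes alike.
        return "ALERT: unparseable webhook payload"
    if not isinstance(payload, dict):
        return "ALERT: unexpected webhook payload"
    # "EventName" and "Key" are assumed field names; adjust to the real payload.
    event = payload.get("EventName", "unknown-event")
    key = payload.get("Key", "unknown-object")
    return f"ALERT: {event} on {key}"


class AlertHandler(BaseHTTPRequestHandler):
    """Accept POSTed events and print a summary for each one."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        print(summarize_event(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen for webhook deliveries until interrupted (port is an assumption)."""
    HTTPServer(("", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener; in practice the `print` call would be replaced by a hand-off to a paging or chat tool.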