Object Lifecycle Management

Use MinIO Object Lifecycle Management to create rules for time- or date-based automatic transition or expiry of objects. For object transition, MinIO automatically moves the object to a configured remote storage tier. For object expiry, MinIO automatically deletes the object.

MinIO derives its behavior and syntax from S3 lifecycle for compatibility when migrating workloads and lifecycle rules from S3 to MinIO. For example, you can export S3 lifecycle management rules and import them into MinIO, or vice versa. MinIO uses JSON to describe lifecycle management rules, so importing S3 lifecycle rules may require conversion to or from XML.
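As a rough illustration of the JSON form, the sketch below writes a single expiration rule to a file and shows how it might be imported or exported. The alias myminio, the bucket mybucket, the rule ID, and the file path are placeholders. The field names follow the S3 lifecycle schema and are an assumption here, so compare them against the output of mc ilm rule export on your own deployment. The script only prints the mc command unless MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Hypothetical location for the rules file.
RULES_FILE="${TMPDIR:-/tmp}/bucket-lifecycle.json"

# One rule: expire every object in the bucket after 365 days.
cat > "$RULES_FILE" <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-after-one-year",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 365 }
    }
  ]
}
EOF

# Apply the rules to a bucket. The reverse direction would be:
#   mc ilm rule export myminio/mybucket > bucket-lifecycle.json
$MC ilm rule import myminio/mybucket < "$RULES_FILE"
```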

Object Transition (“Tiering”)

MinIO supports creating object transition lifecycle management rules, where MinIO can automatically move an object to a remote storage “tier”. MinIO supports any of the following remote tier targets:

MinIO object transition supports use cases like moving aged data from MinIO clusters in private or public cloud infrastructure to low-cost private or public cloud storage solutions. MinIO manages retrieving tiered objects on-the-fly without any additional application-side logic.

Use the mc ilm tier add command to create a remote target for tiering data to that target. You can then use the mc ilm rule add --transition-days command to transition objects to that tier after a specified number of calendar days.
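The two commands can be sketched together as follows. This is an illustration under stated assumptions, not a complete procedure: the alias myminio, the tier name WARM, the endpoint, the credentials, and the bucket names are all placeholders, and the --transition-tier option is taken from the mc reference rather than from this page, so confirm the options with mc ilm rule add --help. By default the script prints the commands instead of running them.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Register another MinIO deployment as a remote tier named WARM.
# Endpoint, credentials, and bucket are placeholders.
$MC ilm tier add minio myminio WARM \
  --endpoint https://warm-minio.example.net \
  --access-key "${WARM_ACCESS_KEY:-EXAMPLE_ACCESS_KEY}" \
  --secret-key "${WARM_SECRET_KEY:-EXAMPLE_SECRET_KEY}" \
  --bucket tiered-data

# Transition objects in mybucket to the WARM tier after 90 calendar days.
$MC ilm rule add myminio/mybucket \
  --transition-days 90 \
  --transition-tier WARM
```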

New in version RELEASE.2022-11-10T18-20-21Z.

You can verify the tiering status of an object using mc ls against the bucket or bucket prefix. The output includes the storage tier of each object:

$ mc ls play/mybucket
[2022-11-08 11:30:24 PST]    52MB  STANDARD log-data.csv
[2022-11-09 12:20:18 PST]    120MB WARM event-2022-11-09.mp4

  • STANDARD marks objects stored on the MinIO deployment.

  • WARM marks objects stored on the remote tier with the matching name.
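For scripting, the tier column can be filtered with standard text tools. The sketch below assumes the exact column layout shown in the sample listing (timestamp, size, tier, name) and object names without spaces; the helper name objects_on_tier is hypothetical. Real mc ls output may differ between versions, so treat the field positions as an assumption.

```shell
# Sample listing copied from the mc ls example; in practice, pipe
#   mc ls play/mybucket
# into the same function.
listing='[2022-11-08 11:30:24 PST]    52MB  STANDARD log-data.csv
[2022-11-09 12:20:18 PST]    120MB WARM event-2022-11-09.mp4'

# Print the names of objects whose tier column (5th field) equals $1.
objects_on_tier() {
  awk -v tier="$1" '$5 == tier { print $6 }'
}

printf '%s\n' "$listing" | objects_on_tier WARM
```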

Important

MinIO Object Transition supports cost-saving strategies around moving older or aged data to cost-optimized remote storage tiers, such as cloud storage or high-density HDD storage.

MinIO Object Transition does not provide backup and recovery functionality. You cannot use the remote tier as a recovery source in the event of data loss in MinIO.

Use either site replication or bucket replication to support backup/recovery or BC/DR requirements.

Exclusive Access to Remote Data

MinIO requires exclusive access to the transitioned data on the remote storage tier. Object metadata on the “hot” MinIO source is strongly linked to the object data on the “warm/cold” remote tier. MinIO cannot retrieve object data without access to the remote, nor can the remote be used to restore lost metadata on the source.

All access to the transitioned objects must occur through MinIO via S3 API operations only. Manually modifying a transitioned object, whether the metadata on the “hot” MinIO tier or the object data on the remote “warm/cold” tier, may result in loss of that object data.

MinIO ignores any objects in the remote bucket or bucket prefix not explicitly managed by the MinIO deployment. Automatic transition and transparent object retrieval depend on the following assumptions:

  • No external mutation, migration, or deletion of objects on the remote storage.

  • No lifecycle management rules (e.g. transition or expiration) on the remote storage bucket.

MinIO stores all transitioned objects in the remote storage bucket or resource under a unique per-deployment prefix value. This value is not intended to support identifying the source deployment from the backend. MinIO supports an additional optional human-readable prefix when configuring the remote target, which may facilitate operations related to diagnostics, maintenance, or disaster recovery.

MinIO recommends specifying this optional prefix for remote storage tiers which contain other data, including transitioned objects from other MinIO deployments. This tutorial includes the necessary syntax for setting this prefix.
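A minimal sketch of the optional prefix, assuming two hypothetical deployments (aliases site-a and site-b) that tier into the same remote bucket. The endpoint, credentials, bucket name, prefix scheme, and the helper add_tier_with_prefix are placeholders chosen for illustration. The commands are printed rather than executed unless MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Create a WARM tier for the deployment behind mc alias $1, storing its
# transitioned objects under a human-readable, per-deployment prefix.
add_tier_with_prefix() {
  $MC ilm tier add minio "$1" WARM \
    --endpoint https://archive.example.net \
    --access-key "${ARCHIVE_ACCESS_KEY:-EXAMPLE_ACCESS_KEY}" \
    --secret-key "${ARCHIVE_SECRET_KEY:-EXAMPLE_SECRET_KEY}" \
    --bucket shared-archive \
    --prefix "tiered/$1/"
}

for site in site-a site-b; do
  add_tier_with_prefix "$site"
done
```

Keeping one prefix per source deployment makes it easier to tell the data sets apart during diagnostics, while all access still goes through the owning MinIO deployment.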

Availability of Remote Data

MinIO tiering behavior depends on the remote storage returning objects immediately (milliseconds to seconds) upon request. MinIO therefore cannot support remote storage which requires rehydration, wait periods, or manual intervention.

MinIO creates metadata for each transitioned object that identifies its location on the remote storage. Applications cannot trivially identify and access a transitioned object independent of MinIO. Availability of the transitioned data therefore depends on the same core protections that erasure coding and distributed deployment topologies provide for all objects on the MinIO deployment. Using object transition does not provide any additional business continuity or disaster recovery benefits.

Workloads that require BC/DR protections should implement MinIO server-side replication. Replication ensures objects remain preserved on the remote replication site, so that you can resynchronize from the remote in the event of partial or total data loss. See Resynchronization (Disaster Recovery) for more complete documentation on using replication to recover after partial or total data loss.

Versioned Buckets

MinIO adopts S3 behavior for transition rules on versioned buckets. Specifically, MinIO by default applies the transition operation to the current object version.

To transition noncurrent object versions, specify the --noncurrent-transition-days and --noncurrent-transition-tier options when creating the transition rule.
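For example, a single rule might cover both current and noncurrent versions. The alias, bucket, tier name, and day counts below are placeholders, and --transition-tier is assumed from the mc reference. The command is printed by default and only runs when MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Current versions move to WARM after 90 days; versions that have been
# noncurrent for 30 days move to the same tier.
$MC ilm rule add myminio/mybucket \
  --transition-days 90 \
  --transition-tier WARM \
  --noncurrent-transition-days 30 \
  --noncurrent-transition-tier WARM
```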

Object Expiration

MinIO lifecycle management supports expiring objects on a bucket. Object “expiration” involves performing a DELETE operation on the object. For example, you can create a lifecycle management rule to expire any object older than 365 days.

Use mc ilm rule add --expire-days to expire objects after a specified number of calendar days.
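A short sketch of the 365-day case mentioned above, with myminio and mybucket as placeholder names. The second command lists the bucket's rules so you can check the result. Both commands are printed instead of executed unless MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Expire (delete) objects in mybucket 365 calendar days after creation.
$MC ilm rule add myminio/mybucket --expire-days 365

# List the lifecycle rules now configured on the bucket.
$MC ilm rule ls myminio/mybucket
```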

For buckets with replication configured, MinIO does not replicate objects deleted by a lifecycle management expiration rule. See Replication of Delete Operations for more information.

Versioned Buckets

MinIO adopts S3 behavior for expiration rules on versioned buckets. MinIO has two specific default behaviors for versioned buckets:

  • MinIO applies the expiration option to only the current object version by creating a DeleteMarker as is normal with versioned delete.

    To expire noncurrent object versions, specify the --noncurrent-expire-days option when creating the expiration rule.

  • MinIO does not expire DeleteMarkers even if no other versions of that object exist.

    To expire delete markers when there are no remaining versions for that object, specify the --expire-delete-marker option when creating the expiration rule.

To expire all versions of an object after a specified number of days, use the --expire-all-object-versions flag together with the --expire-days flag. MinIO then permanently deletes the object once the specified number of days has passed.
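The options above might be used as in the following sketch. Bucket names, alias, and day counts are placeholders. Whether particular options can share one rule is not stated here, so the example keeps delete-marker expiry in its own rule as a cautious assumption; check mc ilm rule add --help for the combinations your version accepts. Commands are printed unless MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Expire current versions after 365 days (creating a delete marker) and
# remove versions that have been noncurrent for 30 days.
$MC ilm rule add myminio/versioned-bucket \
  --expire-days 365 \
  --noncurrent-expire-days 30

# Separately, remove delete markers once no other versions remain.
$MC ilm rule add myminio/versioned-bucket --expire-delete-marker

# Alternative for a different bucket: permanently delete all versions of
# an object 365 days after creation.
$MC ilm rule add myminio/scratch-bucket \
  --expire-days 365 \
  --expire-all-object-versions
```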

Lifecycle Management Object Scanner

MinIO uses a built-in scanner to actively check objects against all configured lifecycle management rules. The scanner is a low-priority process that yields to high I/O workloads to prevent performance spikes triggered by rule timing. The scanner may therefore not detect an object as eligible for a configured transition or expiration lifecycle rule until after the lifecycle rule period has passed.

Scanner performance typically depends on the available node resources, the size of the cluster, and the complexity of the bucket hierarchy (objects and prefixes). For example, a cluster that starts with 100TB of data and then grows to 200TB may require more time to scan the entire namespace of buckets and objects given the same hardware and workload. As the cluster or workload grows, scanner performance decreases because the scanner yields more frequently to preserve the priority of normal S3 operations.

You can adjust how MinIO balances the scanner performance with read/write operations using either the MINIO_SCANNER_SPEED environment variable or the scanner speed configuration setting.
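Both approaches can be sketched as follows. The value slow and the alias myminio are assumptions for illustration; consult the MinIO configuration reference for the values your release accepts. The mc command is printed rather than executed unless MC=mc is set.

```shell
MC="${MC:-echo mc}"   # dry run by default; set MC=mc to execute

# Option 1: environment variable, set in the server's environment
# before the MinIO process starts.
export MINIO_SCANNER_SPEED=slow

# Option 2: configuration setting applied to a running deployment.
$MC admin config set myminio scanner speed=slow
```

A slower scanner leaves more I/O capacity for S3 requests, at the cost of lifecycle rules taking longer to be applied to eligible objects.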

Consider regularly checking cluster metrics, capacity, and resource usage to ensure the cluster hardware is scaling alongside cluster and workload growth: