Data Lifecycle Management and Tiering

As data continues to grow, the ability to co-optimize for access, security and economics becomes a hard requirement, not a nice-to-have. This is the role of data lifecycle management. MinIO offers a unique suite of features to protect data within and across clouds - both public and private. MinIO's enterprise data lifecycle management tools, including versioning, object locking and their derivative components, cover use cases ranging from timed object expiration to policy-based tiering.

Object Expiration

Data doesn't have to live forever: MinIO lifecycle management tools let you define how long data stays on disk before being removed. The length of time is user-defined as a specific date or a number of days after which MinIO begins removing objects.

Lifecycle management rules are per-bucket, and can be built using any combination of object and tag filters. Specify no filter to set the expiry rule for the entire bucket, or specify multiple rules to craft more complex expiry behavior.
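As a rough illustration of how filters combine, the sketch below builds expiry rules as plain Python dictionaries. It assumes the AWS S3 lifecycle configuration layout used by AWS SDKs (`Rules`, `Filter`, `Expiration`); the helper name `build_expiry_rule`, the rule IDs, the prefix and the tag are hypothetical, and the exact JSON your MinIO tooling expects may differ.

```python
import json


def build_expiry_rule(rule_id, days, prefix="", tags=None):
    """Build one expiry rule as a plain dict (hypothetical helper)."""
    tag_list = [{"Key": k, "Value": v} for k, v in sorted((tags or {}).items())]
    if not tag_list:
        # Prefix only; an empty prefix matches every object in the bucket.
        rule_filter = {"Prefix": prefix}
    elif len(tag_list) == 1 and not prefix:
        # A single tag with no prefix uses the simple "Tag" form.
        rule_filter = {"Tag": tag_list[0]}
    else:
        # Several conditions are combined under "And".
        combined = {"Tags": tag_list}
        if prefix:
            combined["Prefix"] = prefix
        rule_filter = {"And": combined}
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": rule_filter,
        "Expiration": {"Days": days},
    }


config = {
    "Rules": [
        # No filter: applies to the entire bucket.
        build_expiry_rule("expire-everything", 365),
        # Prefix plus tag: a narrower rule with a shorter lifetime.
        build_expiry_rule("expire-tmp-logs", 7, prefix="logs/", tags={"class": "tmp"}),
    ]
}
print(json.dumps(config, indent=2))
```

Keeping the rules as data makes it easy to review them before they are applied to a bucket.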

MinIO object expiry rules also work with versioned buckets and offer some versioning-specific options. For example, you can specify an expiry rule that applies only to the non-current versions of objects, which preserves the benefits of object versioning without incurring long-term storage costs. Similarly, you can create a lifecycle management rule that deletes objects whose only remaining version is a delete marker.
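The two versioning-specific behaviors can be sketched as a pair of rules. This again assumes the AWS S3 lifecycle layout (`NoncurrentVersionExpiration` with `NoncurrentDays`, and `ExpiredObjectDeleteMarker` under `Expiration`); the function name and rule IDs are hypothetical.

```python
import json


def versioned_cleanup_rules(noncurrent_days):
    """Return a two-rule configuration for a versioned bucket (illustrative)."""
    return {
        "Rules": [
            {
                # Expire versions N days after they stop being the current one.
                "ID": "expire-noncurrent-versions",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            },
            {
                # Remove delete markers that have no remaining versions behind them.
                "ID": "remove-lone-delete-markers",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"ExpiredObjectDeleteMarker": True},
            },
        ]
    }


print(json.dumps(versioned_cleanup_rules(30), indent=2))
```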

Bucket expiration rules are fully compliant with MinIO WORM locking and legal holds - objects held under lock remain on disk until the lock expires or is explicitly lifted. Once an object is no longer subject to locking, MinIO begins applying the expiry rules as normal.
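The interaction between locking and expiry can be modeled as a simple eligibility check. The function below is only a conceptual sketch of the behavior described above, not MinIO's implementation; `expiry_allowed` and its parameters are hypothetical names.

```python
from datetime import datetime, timezone


def expiry_allowed(now, retain_until=None, legal_hold=False):
    """Return True when an expiry rule may act on an object (conceptual model)."""
    if legal_hold:
        # A legal hold blocks removal until it is explicitly lifted.
        return False
    if retain_until is not None and now < retain_until:
        # A retention lock blocks removal until it expires.
        return False
    return True


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
lock_end = datetime(2024, 12, 31, tzinfo=timezone.utc)

print(expiry_allowed(now, retain_until=lock_end))  # lock still active
print(expiry_allowed(now, legal_hold=True))        # held until lifted
print(expiry_allowed(now))                         # no lock: rules apply as normal
```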

MinIO object expiration lifecycle management rules are feature-and-syntax-compatible with AWS Lifecycle Management. MinIO also supports importing an existing rule in JSON format, making it simple to migrate existing AWS expiry rules.
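Before importing an existing rule set, it can help to sanity-check the JSON. The sketch below assumes an AWS-style document with a top-level `Rules` list; the sample document, the helper name `summarize_rules` and the strictness of the `Status` check are illustrative assumptions.

```python
import json


def summarize_rules(json_text):
    """Parse a lifecycle document and list (id, status, expiration) per rule."""
    config = json.loads(json_text)
    summary = []
    for rule in config.get("Rules", []):
        status = rule.get("Status")
        if status not in ("Enabled", "Disabled"):
            raise ValueError(f"rule has unexpected Status: {rule!r}")
        summary.append((rule.get("ID", "<unnamed>"), status, rule.get("Expiration", {})))
    return summary


# Hypothetical exported rule set.
exported = """
{"Rules": [{"ID": "old-logs", "Status": "Enabled",
            "Filter": {"Prefix": "logs/"}, "Expiration": {"Days": 30}}]}
"""
for rule_id, status, expiration in summarize_rules(exported):
    print(rule_id, status, expiration)
```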

Policy-Based Object Tiering

MinIO can programmatically configure object storage tiering so that objects transition from one state or class to another based on any number of variables - although the most commonly used are time and frequency of access. Tiering allows the user to optimize storage cost or functionality as data access patterns change. Tiered data storage is generally used in the following scenarios:

Across Storage Mediums

Tiering across storage mediums is the best known and most straightforward tiering use case. Here, MinIO abstracts the underlying media and co-optimizes for performance and cost. For example, data may be stored on NVMe or SSD for performance or nearline workloads but tiered to HDD media after a certain period of time or for workloads that value scale over performance. Over time, that data can be further migrated to long-term storage if appropriate.
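A time-based move from fast media to HDD can be expressed as a transition rule. The sketch below assumes the AWS SDK layout, where a rule carries a `Transitions` list with `Days` and `StorageClass`; the JSON your MinIO tooling accepts may be shaped differently, and the tier name `WARM-HDD`, the prefix and the helper name are hypothetical.

```python
import json


def build_transition_rule(rule_id, days, tier_name, prefix=""):
    """Build a rule that moves matching objects to another tier after `days`."""
    return {
        "ID": rule_id,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        # Assumption: the destination tier is referenced by name as the storage class.
        "Transitions": [{"Days": days, "StorageClass": tier_name}],
    }


rule = build_transition_rule("nvme-to-hdd", 90, "WARM-HDD", prefix="analytics/")
print(json.dumps({"Rules": [rule]}, indent=2))
```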

Figure: Data Tiering Across Private Cloud Storage Types

Across Cloud Types

A rapidly emerging use case involves using the public cloud’s inexpensive storage and compute resources as just another tier for the private cloud. In this use case, nearline, performance-oriented workloads run on the appropriate private cloud media. The amount of data is irrelevant, but the value and performance expectations are not. As data volumes increase and performance expectations decrease, enterprises can use the public cloud’s cold storage options to optimize the cost of retaining that data while keeping it accessible.

This is achieved by running MinIO on the private cloud as well as on the public cloud. Using replication, MinIO can move the data onto inexpensive public cloud options and use MinIO in the public cloud to protect and access it if necessary. In this scenario, the public cloud becomes dumb storage to MinIO in the same way that JBODs become dumb storage to MinIO. This approach avoids having to replace or expand obsolete tape infrastructure.

Figure: Data Tiering Across Hybrid Cloud Storage Types

Within a Public Cloud

MinIO commonly acts as the primary application storage layer within a public cloud. Here, as in the other use cases, MinIO is the only storage that an application accesses. Applications (and developers) don’t need to know anything other than a storage endpoint. MinIO determines which data belongs where based on administrative parameters. For example, MinIO may determine that block data should move to the object layer, and which object layer best meets the performance and economic goals of the enterprise.

MinIO combines different storage tiering layers and determines the appropriate media to deliver better economics without compromising performance. Applications simply address objects through MinIO, and MinIO transparently applies policy to move objects between tiers while retaining each object's metadata in the block layer.

Figure: Data Tiering Across Public Cloud Storage Types

