The volume group snapshot feature, introduced as alpha in Kubernetes 1.27, moved to beta in Kubernetes 1.32. It enables crash-consistent snapshots of multiple volumes through the group snapshot extension APIs. Kubernetes uses a label selector to group PersistentVolumeClaims (PVCs) for snapshotting. The primary purpose of the feature is to support workload recovery by restoring a set of snapshots, all taken at the same crash-consistent recovery point, to new volumes.
This feature is supported only for CSI volume drivers.
About volume group snapshots
Certain storage systems allow you to create crash-consistent snapshots of multiple volumes at the same time. These “group snapshots” ensure that all volumes are captured at the same point in time. On such systems, a group snapshot can be used either to populate new volumes with the snapshot data or to restore existing volumes to a previous state.
Why implement volume group snapshots?
Consistent group snapshots benefit applications that span multiple volumes, because every volume is captured at the same point in time. It is possible to quiesce the application first and then take individual snapshots one after another, but that process can be time-consuming or impractical in some scenarios. Users may therefore prefer to run occasional backups with the application quiesced and rely on consistent group snapshots for more frequent backups.
Kubernetes API for volume group snapshots
Kubernetes uses three API resources to manage volume group snapshots:
- VolumeGroupSnapshot: Created by a user to request snapshots of multiple PVCs. It includes metadata such as the creation timestamp and readiness status.
- VolumeGroupSnapshotContent: Created automatically by the snapshot controller for a dynamically provisioned VolumeGroupSnapshot. It stores details such as the group snapshot ID, and each instance maps one-to-one to a VolumeGroupSnapshot.
- VolumeGroupSnapshotClass: Defined by the cluster administrator to specify how group snapshots are created, including the driver information and the deletion policy.
These APIs are implemented as CustomResourceDefinitions (CRDs). The CRDs must be installed in the Kubernetes cluster before a CSI driver can support volume group snapshots.
Components that support volume group snapshots
Volume group snapshots are implemented in the external-snapshotter repository. Supporting them involved adding or updating several components:
- New CRDs for VolumeGroupSnapshot and the related APIs.
- Volume group snapshot logic added to the snapshot controller.
- CSI call logic added to the snapshotter sidecar.
The volume snapshot controller and the CRDs are deployed once per cluster, while a snapshotter sidecar is deployed with each CSI driver. The Kubernetes project recommends that distributors bundle the snapshot controller and CRDs as default add-ons in their cluster management process.
Beta stage improvements
- CSI specification updates: Support for VolumeGroupSnapshot is now generally available (GA) in CSI specification v1.11.0.
- Validation webhook removed: The snapshot validation webhook was deprecated in external-snapshotter v8.0.0 and has now been removed. Most of its rules moved into the CRDs as validation rules, which require Kubernetes v1.25 or later. One check did not move into the CRDs: preventing multiple default snapshot classes for the same driver. That situation still raises an error during provisioning.
- Feature gate: The --enable-volumegroup-snapshot flag has been replaced by a feature gate (--feature-gates=CSIVolumeGroupSnapshot=true). The feature is disabled by default.
- RBAC rule updates: Responsibility for dynamically creating snapshots moved from the CSI snapshotter to the common snapshot controller, and the required RBAC rules were updated accordingly.
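As a rough sketch of turning the feature gate on, the fragment below shows where the flag could go if the snapshot controller runs as a Deployment. The container name is an assumption about your installation, and only the relevant part of the pod template is shown. The CSI snapshotter sidecar would presumably need the same argument in the driver's own manifest.

```yaml
# Fragment of a snapshot controller pod template, not a complete manifest.
# The container name is an assumption; keep whatever your deployment uses.
containers:
  - name: snapshot-controller
    args:
      # Feature gate named in the beta notes; it is disabled by default.
      - "--feature-gates=CSIVolumeGroupSnapshot=true"
```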
Using Kubernetes volume group snapshots
Creating a new group snapshot
To create a group snapshot:
- Define a VolumeGroupSnapshotClass that specifies the CSI driver and the provisioning rules.
- Create a VolumeGroupSnapshot that either dynamically provisions a group snapshot or references an existing VolumeGroupSnapshotContent.
For dynamic provisioning, use a label selector to group the PVCs. The creation process produces the individual volume snapshots and a VolumeGroupSnapshotContent that holds a reference to the group snapshot on the underlying storage.
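A minimal sketch of these two steps might look like the manifests below. They assume the beta CRDs are installed and served as groupsnapshot.storage.k8s.io/v1beta1. The driver name, object names, namespace, and label are hypothetical placeholders, not values from any real cluster.

```yaml
# Step 1: a class that names the CSI driver and the deletion policy.
apiVersion: groupsnapshot.storage.k8s.io/v1beta1   # assumed beta API version
kind: VolumeGroupSnapshotClass
metadata:
  name: example-group-snapclass      # hypothetical name
driver: example.csi.k8s.io           # hypothetical CSI driver name
deletionPolicy: Delete
---
# Step 2: a dynamically provisioned group snapshot selected by label.
apiVersion: groupsnapshot.storage.k8s.io/v1beta1
kind: VolumeGroupSnapshot
metadata:
  name: example-group-snapshot       # hypothetical name
  namespace: demo-namespace          # hypothetical namespace
spec:
  volumeGroupSnapshotClassName: example-group-snapclass
  source:
    selector:
      matchLabels:
        app: example-db              # hypothetical label carried by the PVCs
```

With these manifests, every PVC in demo-namespace labeled app: example-db would be snapshotted together as one group.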
Importing an existing group snapshot
To import an existing group snapshot, manually create the following:
- VolumeSnapshotContent objects for the individual snapshots.
- A VolumeGroupSnapshotContent with references to the individual snapshot handles.
- A VolumeGroupSnapshot with a reference to the VolumeGroupSnapshotContent.
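The sketch below illustrates the last two objects in that list, assuming the v1beta1 field layout. All names and handle values are placeholders to be replaced with the identifiers reported by your storage system, and the per-snapshot VolumeSnapshotContent objects are omitted for brevity.

```yaml
# Pre-provisioned group snapshot content pointing at existing storage-side snapshots.
apiVersion: groupsnapshot.storage.k8s.io/v1beta1   # assumed beta API version
kind: VolumeGroupSnapshotContent
metadata:
  name: static-group-content         # hypothetical name
spec:
  deletionPolicy: Delete
  driver: example.csi.k8s.io         # hypothetical CSI driver name
  source:
    groupSnapshotHandles:
      volumeGroupSnapshotHandle: "<group-snapshot-handle>"   # placeholder
      volumeSnapshotHandles:
        - "<snapshot-handle-1>"      # placeholder
        - "<snapshot-handle-2>"      # placeholder
  volumeGroupSnapshotRef:
    name: static-group-snapshot
    namespace: demo-namespace        # hypothetical namespace
---
# The user-facing object that binds to the content above.
apiVersion: groupsnapshot.storage.k8s.io/v1beta1
kind: VolumeGroupSnapshot
metadata:
  name: static-group-snapshot
  namespace: demo-namespace
spec:
  source:
    volumeGroupSnapshotContentName: static-group-content
```

The two objects reference each other by name, which is how the controller pairs the imported content with the VolumeGroupSnapshot.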
Restoring from a group snapshot
Restoration involves creating new PVCs from individual snapshots within the group. Repeat this process for each snapshot to fully restore the application state.
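For one member of the group, a restore could be sketched as the PVC below, which names a VolumeSnapshot as its data source. The snapshot name, storage class, namespace, and size are hypothetical. In practice the snapshot name would be taken from one of the VolumeSnapshot objects created for the group.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc-restored         # hypothetical name
  namespace: demo-namespace          # hypothetical namespace
spec:
  storageClassName: example-storage-class   # hypothetical storage class
  dataSource:
    name: example-member-snapshot    # hypothetical: one VolumeSnapshot from the group
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi                   # should be at least the snapshot's restore size
```

One such PVC is needed per snapshot in the group, each pointing at a different VolumeSnapshot.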
Volume group snapshot support in CSI driver
To implement support, a CSI driver must:
- Implement the new Group Controller service.
- Implement the RPCs to create, delete, and retrieve group snapshots.
- Advertise the CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT capability.
The Kubernetes project recommends bundling the snapshot controller and CRDs into the cluster management process, independent of any CSI driver. The external-snapshotter sidecar watches the API server for changes and triggers the CSI operations for group snapshots.
Limitations and future plans
Current limitations include:
- Reverting existing PVCs to their previous state is not supported (creating new volumes only).
- Application consistency is limited to what the storage system provides (such as crash consistency).
In future releases, the aim is to gather feedback, increase adoption, and advance the feature to general availability (GA).