Setting Up Disaster Recovery Volumes
A disaster recovery volume is a volume that stores data in a backup cluster in case the whole main cluster goes down. Disaster recovery volumes are used to increase the resiliency of Longhorn volumes.
A disaster recovery volume does not support creating, deleting, or reverting snapshots, creating backups, or creating a PV/PVC. Users cannot update the Backup Target in Settings while any disaster recovery volume exists.
When users try to activate a disaster recovery volume, Longhorn will check the last backup of the original volume. If it hasn’t been restored, the restoration will be started, and the activate action will fail. Users need to wait for the restoration to complete before retrying.
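The activation rule above can be illustrated with a toy model. This is only a sketch of the described behavior, not Longhorn's implementation: the `DRVolume` class, its fields, and the `try_activate` and `finish_restore` helpers are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class DRVolume:
    """Hypothetical stand-in for a disaster recovery volume."""
    last_backup: str          # most recent backup of the original volume
    restored_backup: str      # backup this volume has finished restoring
    restoring: bool = False   # True while a restoration is in progress


def try_activate(vol: DRVolume) -> bool:
    """Return True if activation succeeds.

    If the last backup has not been restored yet, start the restoration
    and fail the activation, mirroring the behavior described in the text.
    """
    if vol.restored_backup != vol.last_backup:
        vol.restoring = True
        return False
    return True


def finish_restore(vol: DRVolume) -> None:
    """Mark the pending restoration as complete (illustrative only)."""
    vol.restored_backup = vol.last_backup
    vol.restoring = False


vol = DRVolume(last_backup="backup-2", restored_backup="backup-1")
first_attempt = try_activate(vol)    # fails and kicks off the restoration
finish_restore(vol)                  # the user waits for restoration to end
second_attempt = try_activate(vol)   # the retry now succeeds
```

The point of the sketch is that a failed activation is not an error state: it starts the missing restoration, and the same call succeeds once that restoration completes.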
For a disaster recovery volume, Last Backup indicates the most recent backup of its original backup volume. If the icon representing the disaster recovery volume is gray, the volume is still restoring the Last Backup and users cannot activate it yet; if the icon is blue, the volume has restored the Last Backup.
Typically, incremental restoration is triggered by the periodic backup store update. Users can set the backup store update interval in Setting - General - Backupstore Poll Interval. Note that this interval can affect the Recovery Time Objective (RTO): if it is too long, the disaster recovery volume may have a large amount of data to restore, which will take a long time. The Recovery Point Objective (RPO) is determined by the recurring backup schedule of the backup volume. See the Longhorn documentation on recurring backups for how to set one up.
For example, if the recurring backup schedule for a normal volume A creates a backup every hour, then the RPO is 1 hour.
For the RTO, assume the volume creates a backup every hour and that incrementally restoring the data of one backup takes 5 minutes.
If Backupstore Poll Interval is 30 minutes, there will be at most one backup's worth of data since the last restoration. Restoring one backup takes 5 minutes, so the RTO is 5 minutes.
If Backupstore Poll Interval is 12 hours, there will be at most 12 backups' worth of data since the last restoration. Restoring those backups takes 5 * 12 = 60 minutes, so the RTO is 60 minutes.
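The arithmetic in this example can be written as a small helper for trying other intervals. This is a rough estimate under the same simplifying assumptions as the text (every backup takes the same time to restore, and at least one backup may be pending); the function name and parameters are hypothetical and not part of Longhorn.

```python
import math


def worst_case_rto_minutes(backup_interval_min: int,
                           poll_interval_min: int,
                           restore_per_backup_min: int) -> int:
    """Estimate the worst-case RTO in minutes.

    The number of backups that can accumulate between two backup store
    polls is the poll interval divided by the backup interval, rounded
    up, and never less than one.
    """
    if min(backup_interval_min, poll_interval_min, restore_per_backup_min) <= 0:
        raise ValueError("all intervals must be positive")
    pending_backups = max(1, math.ceil(poll_interval_min / backup_interval_min))
    return pending_backups * restore_per_backup_min


# Hourly backups, 5 minutes to restore each one.
print(worst_case_rto_minutes(60, 30, 5))       # poll every 30 minutes -> 5
print(worst_case_rto_minutes(60, 12 * 60, 5))  # poll every 12 hours -> 60
```

Under these assumptions, the RPO is simply the backup interval (60 minutes here), while the RTO grows with the poll interval once it exceeds the backup interval.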