
YugabyteDB Snapshots


A distributed database is designed to withstand outages to a good degree. However, you should also maintain backups in case of “oops” scenarios like a dropped table.

The yb-admin tool can be used to manage snapshots. Here’s a brief walkthrough.

A few caveats before relying on snapshots: they are stored on the same servers as the data, so they don’t protect against disk or file system corruption. They also capture data, not schema, so a restore won’t preserve schema changes made after the restore point.

If you don’t already have a test environment, there’s a quick test setup at https://github.com/dataindataout/xtest_ansible.

The basic snapshot method

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 create_database_snapshot ysql.yugabyte
Started snapshot creation: d8da62c0-9dde-460d-944b-eb0394aa15a2

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 list_snapshots
Snapshot UUID State Creation Time
d8da62c0-9dde-460d-944b-eb0394aa15a2 COMPLETE 2023-07-17 21:50:27.499351
No snapshot restorations
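Snapshot creation is asynchronous: create_database_snapshot returns immediately, and the snapshot reaches COMPLETE a little later. If you script snapshot creation, it can help to poll list_snapshots until the state flips. A minimal bash sketch (the master addresses, retry count, and sleep interval are assumptions to adapt for your cluster):

```shell
#!/usr/bin/env bash
# Poll list_snapshots until the given snapshot UUID reaches COMPLETE.
# Usage: wait_for_snapshot <snapshot-uuid>
MASTERS="127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100"

wait_for_snapshot() {
  local uuid="$1" tries=0
  while [ "$tries" -lt 30 ]; do
    if yb-admin --master_addresses "$MASTERS" list_snapshots \
        | grep "$uuid" | grep -q COMPLETE; then
      echo "snapshot $uuid is COMPLETE"
      return 0
    fi
    sleep 2
    tries=$((tries + 1))
  done
  echo "timed out waiting for snapshot $uuid" >&2
  return 1
}
```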

Scheduling backups

The following command creates a snapshot schedule. The first argument is the interval in minutes and the second is the retention in minutes, so this schedule takes a snapshot every minute and keeps each one for 5 minutes.

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 create_snapshot_schedule 1 5 ysql.yugabyte
{
"schedule_id": "6cbce548-0c4d-4258-a633-77fe5a68f774"
}
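The JSON output makes the schedule commands easy to script. For example, you can capture the schedule ID for later edit_snapshot_schedule or delete_snapshot_schedule calls. A small helper, assuming python3 is available for the JSON parsing:

```shell
# Extract the schedule_id field from create_snapshot_schedule's JSON output.
extract_schedule_id() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)["schedule_id"])'
}

# Hypothetical usage:
# SCHEDULE_ID=$(yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
#   create_snapshot_schedule 1 5 ysql.yugabyte | extract_schedule_id)
```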

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 list_snapshot_schedules
{
"schedules": [
{
"id": "6cbce548-0c4d-4258-a633-77fe5a68f774",
"options": {
"filter": "ysql.yugabyte",
"interval": "1 min",
"retention": "5 min"
},
"snapshots": [
{
"id": "d5f8dc24-0b13-4164-ac87-496bdf5a9a67",
"snapshot_time": "2023-07-17 21:59:52.807140"
},
{
"id": "cf9f1379-0b57-4ab6-8588-a8654f71fa2d",
"snapshot_time": "2023-07-17 22:00:57.825314",
"previous_snapshot_time": "2023-07-17 21:59:52.807140"
}
]
}
]
}

Schedules are created per database/keyspace. To view the schedule and snapshot timestamps for a single database/keyspace, specify the schedule ID.

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 list_snapshot_schedules 6cbce548-0c4d-4258-a633-77fe5a68f774
{
"schedules": [
{
"id": "6cbce548-0c4d-4258-a633-77fe5a68f774",
"options": {
"filter": "ysql.yugabyte",
"interval": "1 min",
"retention": "5 min"
},
"snapshots": [
{
"id": "99800d8a-fa6c-4f9d-90cb-c36ab87fdbcf",
"snapshot_time": "2023-07-17 22:03:07.857994",
"previous_snapshot_time": "2023-07-17 22:02:02.841897"
},
{
"id": "19d941f8-4544-44b1-8138-b8704fd657ce",
"snapshot_time": "2023-07-17 22:04:12.877590",
"previous_snapshot_time": "2023-07-17 22:03:07.857994"
},
{
"id": "b98062c6-ccb0-4d10-8a25-4a545874bbab",
"snapshot_time": "2023-07-17 22:05:17.894554",
"previous_snapshot_time": "2023-07-17 22:04:12.877590"
},
{
"id": "8fb4fd87-68b7-4bd2-86f2-36da26b7175a",
"snapshot_time": "2023-07-17 22:06:22.913563",
"previous_snapshot_time": "2023-07-17 22:05:17.894554"
},
{
"id": "c7ca4452-01b3-4167-a1d4-be58dd5f0fbb",
"snapshot_time": "2023-07-17 22:07:27.934475",
"previous_snapshot_time": "2023-07-17 22:06:22.913563"
}
]
}
]
}

You can change the snapshot interval or the retention time (both in minutes) with edit_snapshot_schedule:

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 edit_snapshot_schedule 6cbce548-0c4d-4258-a633-77fe5a68f774 retention 10

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 edit_snapshot_schedule 6cbce548-0c4d-4258-a633-77fe5a68f774 interval 2

Finally, you can delete a snapshot schedule:

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 delete_snapshot_schedule 6cbce548-0c4d-4258-a633-77fe5a68f774
{
"schedule_id": "6cbce548-0c4d-4258-a633-77fe5a68f774"
}

Restoring from snapshot

Let’s create a new schedule so we can test a restore.

Here’s the table we’ll work with:

snaptest=# select * from snaptable ;
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie

Set a schedule to snapshot the snaptest database.

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 create_snapshot_schedule 1 10 ysql.snaptest
{
"schedule_id": "b0a32871-aea7-4a92-ac9c-e8b26cdd86e9"
}

Run the following four commands and observe the output. Note: restore_snapshot_schedule takes the snapshot schedule ID, not an individual snapshot ID.

ysqlsh -d snaptest -c 'delete from snaptable'
DELETE 3

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+------
(0 rows)

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 restore_snapshot_schedule b0a32871-aea7-4a92-ac9c-e8b26cdd86e9 minus 10m
{
"snapshot_id": "9f129241-095a-44b2-a783-568d9e3dbc27",
"restoration_id": "425c6842-507c-4780-9570-667a4d9fe78d"
}

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie
(3 rows)
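The relative minus form isn’t the only way to specify a restore target; yb-admin also accepts an absolute Unix timestamp in microseconds. A bash sketch for computing “5 minutes ago” in that form (the yb-admin invocation is shown only as a comment, reusing the schedule ID from above):

```shell
# Compute "5 minutes ago" as an absolute Unix timestamp in microseconds,
# suitable as a restore target for restore_snapshot_schedule.
five_min_ago_us() {
  local now_s
  now_s=$(date +%s)
  echo $(( (now_s - 300) * 1000000 ))
}

# Hypothetical usage:
# yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 \
#   restore_snapshot_schedule b0a32871-aea7-4a92-ac9c-e8b26cdd86e9 "$(five_min_ago_us)"
```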

Note that once you have restored to a point in the past, you cannot then restore to a later point. For example, having restored to 10 minutes in the past as shown above, I cannot change my mind and immediately restore the same table to 5 minutes in the past.

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 restore_snapshot_schedule b0a32871-aea7-4a92-ac9c-e8b26cdd86e9 minus 5m
Error running restore_snapshot_schedule: Not implemented (yb/master/master_snapshot_coordinator.cc:1838): Cannot perform a forward restore. Existing restoration 425c6842-507c-4780-9570-667a4d9fe78d was restored to { physical: 1689684066083926 } and completed at { physical: 1689684666898989 }, while the requested restoration for { physical: 1689684379540988 } is in between.

In a production system, of course, data is usually being written constantly. When you restore to a previous point, any data written after that point is lost. Here’s an example:

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie
(3 rows)

ysqlsh -d snaptest -c 'delete from snaptable'
DELETE 3

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+------
(0 rows)

ysqlsh -d snaptest -c "insert into snaptable values (4,'Rachel')"
INSERT 0 1

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+--------
4 | Rachel
(1 row)

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 restore_snapshot_schedule b0a32871-aea7-4a92-ac9c-e8b26cdd86e9 minus 5m
{
"snapshot_id": "f78b07d0-7c33-48db-8769-6dfb58aa73c3",
"restoration_id": "4fe67cfa-c2df-42c8-ad4e-069cf237d8ab"
}

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie
(3 rows)

Schema changes

Here’s a demonstration that a restore does not preserve schema changes made after the restore point: the added column disappears along with the data.

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie
(3 rows)

ysqlsh -d snaptest -c 'delete from snaptable'
DELETE 3

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+------
(0 rows)

ysqlsh -d snaptest -c 'alter table snaptable add column city varchar'
ALTER TABLE

ysqlsh -d snaptest -c "insert into snaptable values (4,'Rachel','Durham')"
INSERT 0 1

ysqlsh -d snaptest -c 'select * from snaptable'
id | name | city
----+--------+--------
4 | Rachel | Durham
(1 row)

yb-admin --master_addresses 127.0.0.1:7100,127.0.0.2:7100,127.0.0.3:7100 restore_snapshot_schedule b0a32871-aea7-4a92-ac9c-e8b26cdd86e9 minus 5m
{
"snapshot_id": "18fee06f-4c13-4e99-8804-eb5d68019cad",
"restoration_id": "4fc88b25-77a1-447a-a5da-6ddf69834679"
}

ysqlsh -d snaptest -c 'select * from snaptable'
id | name
----+---------
3 | Emily
2 | Martha
1 | Valerie
(3 rows)

Recommendations

Maintain snapshots via a schedule to protect yourself from those “oops” moments in production.

Tune the schedule so snapshots don’t consume excessive space, but keep the retention long enough that you have time to work through your runbook for restoring from a snapshot.

Note that you will lose data written between the point you’ve chosen to restore to and the moment you issue the restore command; weigh this against what you’re trying to recover. Minimize data loss by restoring to as recent a point in the past as you can.

Maintain independent schema backups, and be sure a snapshot is taken right after any schema change.
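Since a data restore won’t bring back lost DDL, one way to keep an independent schema backup is ysql_dump, YugabyteDB’s pg_dump-derived tool, with its --schema-only flag. A minimal sketch (the timestamped file naming is my own convention, not from this walkthrough):

```shell
# Dump only the DDL of a database to a timestamped .sql file.
dump_schema() {
  local db="$1"
  local outfile="schema_${db}_$(date +%Y%m%d%H%M%S).sql"
  ysql_dump -d "$db" --schema-only -f "$outfile" && echo "$outfile"
}

# Example: dump_schema snaptest
```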

What’s next?

There are other types of backups available in YugabyteDB that can help you meet your RPO and RTO needs: point-in-time recovery, offsite backups, etc. More on those at a later time.

For a demo of offsite backups, see the test scenario here: https://github.com/dataindataout/xtest_ansible/blob/main/scenarios/snapshot/README.md.


Author: Valerie Parham-Thompson
