Backups in SQLite

There are many ways to back up a SQLite database; watch this video to see all the options, along with the preferred ones.

Transcript

"Backing up a SQLite database is largely about copying the database to a safe, secure, backup medium.

So, how do you go about backing up or copying a SQLite database?

Well, there are actually quite a few options for copying a SQLite database. For example, you could just copy the file with the cp command, or you can use the clone command to copy the entire database into a new file, or you can use the dump command to export the entire database to a SQL script, which you could then use to recreate the database.

Or you could use the "vacuum into" command to create a vacuumed copy, or you could use the backup command, or you could use a third-party backup tool such as Litestream.

So, which of these options should you use?

Well, it depends.

If your SQLite database is offline during the backup window, then any of these options could work.

Ok, but what do I mean by offline?

Well, basically, by offline I mean that the database has no connections to it, so it's effectively closed.

But, if your database is online, meaning it's open and it has one or more active connections, then you can use several of these backup options, but there are really only two that you should strongly consider. I'll talk more about this in a moment.

Ok, let's take a closer look at each of these options.

Now, one of the nice things about SQLite is that the data is stored in a single file, well mostly, and often, you can get away with simply copying the database file to some other location, but this can be problematic in certain situations.

For example, if you try to make a backup copy of the database file while it's in the middle of a transaction, then the backup copy might contain some old data and some new data, at which point it's basically corrupted.
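
As a rough sketch, a plain file copy only makes sense when the database is offline; the destination path here is just an illustration:

    # Only safe when no connections are open (the database is "offline").
    cp imdb.db /backups/imdb.db

    # If the database runs in WAL mode, any imdb.db-wal and imdb.db-shm files
    # left behind would need to be checkpointed or copied along with it.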

The next option we'll consider is the clone command.

So let's try using the clone command to see how it works.

So I'll key in sqlite3, the SQLite command line tool, then I'll specify the database we want to clone, imdb.db, then I'll pass the string dot clone, and I'll provide a name for the new database clone.

Now, before I run this, I'll prefix this entire command with the time command, so that we can measure how long this takes to complete, and now I'll go ahead and run this command.
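
For reference, the full command described here looks roughly like this (the clone's file name is just an example):

    # Clone the database and time how long it takes, as in the video.
    time sqlite3 imdb.db ".clone imdb-clone.db"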

So, cloning the database, as we're doing right now, can work, but there are two problems with this approach.

First of all it's somewhat slow, as you'll see in a moment here, but probably the bigger issue is that cloning puts a lock on the database.

So for example, while this cloning operation is occurring, I'll open up the database in another terminal, and I'll try to perform an update, and as you can see, the update failed because the database is locked.

Now we can still perform read operations, for example I can query the titles table, and I see the results, but any write operations will fail while the cloning is occurring, so cloning isn't ideal.
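
Here's a sketch of what that second terminal session might look like; the titles table comes from the video, but the column used in the update is made up for illustration:

    # A write from another connection fails with a "database is locked" error
    # while the clone holds its lock.
    sqlite3 imdb.db "UPDATE titles SET title = title WHERE rowid = 1;"

    # Reads still succeed while the clone is running.
    sqlite3 imdb.db "SELECT COUNT(*) FROM titles;"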

Ok, so how long does cloning take? It looks like it took about 3 minutes to clone this database, which is about 11.3 gigabytes in size.

So, should you use the dot clone command to generate a backup?

Probably not, because there are better options, which you'll see in a moment here.

The next option we'll look at, is using the dot dump command, for backup purposes.

So what does the dump command do?

Well, it exports the entire database, both the schema and the data, to a text file as SQL, which you could turn around and use to regenerate a copy of the SQLite database.

Let's go ahead and use the dump command, so I'll enter the command sqlite3, followed by our database name, then I'll key in dot dump, and I'll redirect the output to an imdb dot sql file. And one last thing, I'll prefix this line with the time command, so we can measure how long it takes to complete.
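
The command as described would look something like this:

    # Dump the schema and data as SQL, timing the run.
    time sqlite3 imdb.db ".dump" > imdb.sql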

So is using the dump command a good option for generating a backup?

Well, it's not great for a few reasons. First of all, dumping is a pretty slow process, as you'll see in a moment here, and the dump command also locks the database, preventing write operations, just like the clone command did.

And the last reason I don't like the idea of using dump to generate a backup, is because the restoration process takes a bit of time as well. So why does the restore take time? Well, because you've got to run the sql script to regenerate all the tables and populate the tables with data and then add indexes and so on.
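
To make that concrete, restoring from the dump is just a matter of feeding the SQL script into a fresh database; the file names here are illustrative:

    # Rebuild the tables, data, and indexes from the SQL script.
    sqlite3 imdb-restored.db < imdb.sql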

Now, the dump command did finish faster than the clone command, as it took about one minute and forty seconds, nearly twice as fast, but it's still relatively slow when compared to the next couple of options we'll look at.

Another option for generating a backup is to use the "Vacuum into" command.

So, what's vacuuming? Well, it's a process of removing unused database pages, and it sort of defragments a database as well. Basically, vacuuming is a maintenance task that you should periodically perform on your database.

Ok, so how would you go about using the "Vacuum Into" command for backup purposes?

Well, I'll use the sqlite3 command on our database, then I'll pass in the string vacuum into, followed by the file name we want to use. And of course, I'll use the time command to measure the execution duration.
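
In command form, that looks roughly like this (the output file name is just an example):

    # Write a vacuumed copy of the database to a new file, timing the run.
    time sqlite3 imdb.db "VACUUM INTO 'imdb-vacuumed.db'"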

Ok, so vacuuming suffers from the same locking issues as cloning and dumping do, but it does finish a bit faster, in about 55 seconds, and there's a good chance your backup file could be significantly smaller than the database it was created from, because it gets vacuumed along the way, which is a small added benefit.

So, should you use Vacuum into to generate a backup file? Well, it is a better option than cloning and dumping, but it still requires locking the database, which could be problematic for you, and I think there are better options, which we'll look at right now.

The next option we'll consider is the dot backup command.

So, to run this command I'll use the time command, followed by sqlite3, then our database name, followed by the string, dot backup, and a file name for our backup.
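
Written out, the command is along these lines (again, the backup file name is just an example):

    # Use SQLite's backup mechanism via the CLI, timing the run.
    time sqlite3 imdb.db ".backup imdb-backup.db"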

Now, the backup command is the fastest of all the dot commands, as you can see it finished in about 10 seconds, and the backup process is a bit more forgiving in terms of how it locks the database.

Ok, so how does locking work with the backup command? Well, it obtains a lock on the database, but it periodically releases the lock and yields to other connections, which can allow you to perform write operations while the backup is proceeding, as long as you've set an appropriate query timeout value.
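
As a sketch, assuming the timeout being referred to is SQLite's busy_timeout setting, a writer can be told to wait for the lock rather than fail immediately; the 5000 ms value and the table and column names are made up for illustration:

    # Give the writer up to 5 seconds to acquire the lock while the backup runs.
    sqlite3 imdb.db "PRAGMA busy_timeout = 5000; UPDATE titles SET title = title WHERE rowid = 1;"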

Another nice thing about the backup option is that there are also APIs in most SQLite libraries that will perform the backup for you.

Now, there is a quirk, related to write operations that occur during a backup, that you should be aware of.

If the database connection that's performing the backup also does some write operations while the backup is occurring, then the data from those write operations is included in the backup, as you probably would have guessed.

But here's the quirk, if a different database connection performs write operations, then the entire database backup is restarted, and as you might imagine, successive write operations can postpone or even prevent the database backup from completing.

So, if at all possible, you should use the same database connection for the backup as you use for your write operations. Depending on the language and library you use, this could be a bit challenging, but the next option we'll look at can work around these types of challenges.

Ok, the last backup option we'll look at, is to perform backups using an open-source tool called Litestream.

You can find installation instructions for Litestream, on their website.

Now, Litestream is different from what we've looked at so far, in that it provides near real-time streaming backups of SQLite to S3-compatible storage services.

To use Litestream, you'll set up an S3 bucket, then you'll create some credentials that you'll export as environment variables, then you'll run litestream replicate, followed by the database file name, then you'll pass along the S3 URL for your bucket.
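
A rough sketch of that setup, with placeholder credentials and a placeholder bucket name (an S3-compatible service such as MinIO may also need its endpoint configured, per the Litestream documentation):

    # Credentials for the S3-compatible service.
    export LITESTREAM_ACCESS_KEY_ID="<access-key-id>"
    export LITESTREAM_SECRET_ACCESS_KEY="<secret-access-key>"

    # Continuously replicate imdb.db to the bucket.
    litestream replicate imdb.db s3://my-bucket/imdb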

In my example here, I'm using MinIO, an S3-compatible open-source server, but keep in mind, any S3-compatible service should work.

Now, as I mentioned a moment ago, the nice thing about Litestream is that it continuously streams SQLite changes to the S3-compatible service, which means you can quickly recover to the point of failure, and you don't have to worry about the backup process locking your database.

Now, to restore your database using Litestream, you'll run litestream restore, and we'll restore the file as imdb-ls.db, and we'll restore it from our S3 bucket.
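
In command form, that restore looks roughly like this; the -o flag names the output file, and the bucket URL matches the placeholder used above:

    # Rebuild the database locally from the replica in the bucket.
    litestream restore -o imdb-ls.db s3://my-bucket/imdb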

So, Litestream is nice because it gives you streaming backups and you can just keep on using your SQLite database, without worrying about locks or any other database backup-related concerns.

Now, the restores from Litestream can take a bit of time, because it's got to download all the backups from s3, but I think the benefits of Litestream mostly outweigh the downside of slower restores.

So in summary, you've got many options for backing up SQLite, but for most scenarios, the backup command and Litestream are typically the best options.

Hey, here at Mycelial, we're big fans of SQLite, and we're building tools around SQLite that will allow you to synchronize a SQLite database to multiple computers without conflict.

If you are interested in SQLite and what we're doing with it, please subscribe to our channel, and consider joining our newsletter to learn more."
