Is it safe to back up the database while an always-on task with db hits is running?

Hi there,

Just a quick question for today! I have an always-on task that does some data scraping and saving to db every ~17 seconds. It saves a good number of new Django objects to the db (~100 - 500, in about 8 save/bulk_create/update functions) in about 3 seconds or rarely 4-10 seconds if an API rate limit is hit. These objects are all related, and it is important that the full set is saved each time.

I'd like to schedule daily database backups, or even just manual ones without having to keep stopping and starting the task each time. Do you think the task will interfere with the db backup process or that the db backup will contain an incomplete set of entries?

If so, do you have any advice on how to check that the always-on task is on its 17 second break, or to pause the task automatically when I create a backup?

Thanks for your help!

George "Kirkmania" Kirkman

It depends on the details of how you're storing the data in the database; if you put all of the objects into the DB in the same transaction then you will normally be fine, but the exact behaviour will depend on which database you're using, and which transaction settings you've put on the database connection.

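By default, Django runs in "autocommit" mode, where each operation is committed (that is, the transaction completes) as soon as it's done. That means each of your ~8 save/bulk_create/update calls is committed separately, so a backup taken partway through a cycle could contain only part of a set, which is not what you want. But you can override that behaviour, for example by wrapping the whole set of saves in a single transaction with transaction.atomic().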

There's quite a lot of computer science theory involved in how this all works, so it's more than I can really explain in a forum post, but the Django docs page on transactions might be a useful starting point.
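
To make the "same transaction" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django, purely so that it runs on its own. The table name, the save_batch helper and the sample data are all made up for illustration, and the thread doesn't say which database is actually in use. It shows that when a batch fails partway through, none of its rows are committed.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scrape (batch INTEGER, value TEXT)")
conn.commit()

def save_batch(conn, batch_id, values):
    """Insert every row of a batch in one transaction: all rows or none."""
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        for value in values:
            if value is None:
                raise ValueError("incomplete scrape")
            conn.execute(
                "INSERT INTO scrape (batch, value) VALUES (?, ?)",
                (batch_id, value),
            )

# A complete batch is committed as a whole.
save_batch(conn, 1, ["a", "b", "c"])

# A batch that fails halfway leaves nothing behind, not even its first row.
try:
    save_batch(conn, 2, ["d", None, "f"])
except ValueError:
    pass

rows = conn.execute(
    "SELECT batch, COUNT(*) FROM scrape GROUP BY batch ORDER BY batch"
).fetchall()
print(rows)
```

In Django the analogous construct is a `with transaction.atomic():` block (from `django.db`) around the save/bulk_create/update calls of one cycle. Whether the backup itself sees a consistent snapshot is a separate question, and still depends on the database and on how the dump is taken.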

Alrighty, thanks for the info! I'll look into it. Orrrr I'll never get round to it and keep doing manual backups like a noob ;)

You can establish some communication between your always-on task and backup script. It could be something as simple as a pair of files that are touched by them. That would make it possible for the always-on task to pause.
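
A rough sketch of that pair-of-files idea, assuming both processes can see the same local directory. The file names, the function names and the polling interval are all hypothetical. Each side touches its own file first and only then checks the other one, so the two can't both decide to go ahead at the same moment.

```python
import tempfile
import time
from pathlib import Path

# Hypothetical flag-file locations; a real setup would use a fixed directory
# that both the always-on task and the backup script can access.
FLAG_DIR = Path(tempfile.mkdtemp())
BACKUP_REQUESTED = FLAG_DIR / "backup.requested"  # touched by the backup script
TASK_WRITING = FLAG_DIR / "task.writing"          # touched by the always-on task

def task_cycle(save_fn):
    """Run one save cycle of the always-on task, unless a backup is pending.

    Returns True if the data was saved, False if the cycle was skipped.
    """
    TASK_WRITING.touch()  # announce intent first, then check the other flag
    try:
        if BACKUP_REQUESTED.exists():
            return False  # back off; try again on the next cycle
        save_fn()
        return True
    finally:
        TASK_WRITING.unlink(missing_ok=True)

def run_backup(backup_fn, poll_seconds=0.1, timeout=30.0):
    """Ask the task to pause, wait for any in-progress save, then back up."""
    BACKUP_REQUESTED.touch()
    try:
        deadline = time.monotonic() + timeout
        while TASK_WRITING.exists():
            if time.monotonic() > deadline:
                raise TimeoutError("task did not finish writing in time")
            time.sleep(poll_seconds)
        backup_fn()
    finally:
        # Always clear the request so the task can resume.
        BACKUP_REQUESTED.unlink(missing_ok=True)

# Tiny demonstration with stand-in functions instead of real saves and dumps.
saved, backups = [], []
task_cycle(lambda: saved.append("batch 1"))
run_backup(lambda: backups.append("dump"))
```

One thing to watch with this approach: if the task crashes mid-save, its flag file is left behind and the backup will wait until the timeout, so some cleanup at startup would be needed.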