Managing Database Changes in Django

Introduction


Managing database changes in a team environment working on a django project can be complicated. I would imagine that there is no one size fits all solution and it would depend on team size and configuration, production database size, etc.

What I will outline here is a solution I developed for a small team that I work with on a django based web application. So far it has worked really well for allowing us to streamline database changes so we can stay in sync, deploy easily, and add small tweaks to the database (indexing, data manipulation, etc.) that is semi-automated.

The idea is based partly on Rails database migrations (though not has sleek and well put together) and a database versioning system that was used on a team that I have worked on previously using MS SQL in a corporate environment and a fairly decent team size. In one sense it is not nearly as well put together as either of these two solutions, but at the same time, it is pretty much hands off and has been working well for us for a number of months now.

Why not just use syncdb?

As nice as python manage.py syncdb is when the project is getting started and is still in early phases, it becomes less and less useful, mostly in handling schema changes. Since syncdb only creates a model if it doesn’t already exist in the database, we found ourselves outputting sql with python manage.py sql APP_NAME and then making adhoc ALTER TABLE scripts from the CREATE statements and passing them around to each other and then trying to remember the order to apply them in production.

Furthermore, syncdb left us without a way to alter data, add indexes, remove indexes, etc. Granted that wasn’t it’s intent and I am certainly not trying to beat up syncdb — it does what it does well.

The Solution

In order to address this, it seemed the most natural thing was to have some structured and ordered way to write SQL scripts that were applied in a uniform manner so that it was repeatable. This way we could run on our development databases, on test databases, and in deployment on production all the same way.

First step to achieve this was to version the database. This is accomplished with a simple table that will keep track of what it has executed along with some other metadata primarily for reference purposes.

CREATE TABLE `versions` (
    `version` VARCHAR(200) NOT NULL, 
    `date_created` DATETIME NOT NULL,
    `sql_executed` LONGTEXT NULL,
    `svn_version` int null
);

In order to populate this table and keep versions consistent, the naming scheme of the sql scripts should follow something that lends itself to easy sorting:

YYYYMMDD-##.sql

where pound signs are zero-padded integers of the changes for the day. In a team environment, it is probable that two people might be working on a database change at the same time and therefore would be use 01. Whoever committed first would be able to keep the 01, the other developer would get a conflict message and need to rename his script to 02.

These scripts live in a db/ folder in our project. The name of this folder is not important, but it is important to keep all the sql scripts in a folder not mixed in with other files.

To tie it all together there is s simple python script that gets the list of applied changes from the database, gets a list of files in the current directory in order, filters out the ones that have been applied and then processes the remaining scripts. In processing the scripts, it opens each script and splits script down into individual statements (splitting on the semicolon). After execution of an entire script file is complete, the versions table is updated to reflect the applied version.

Running python upgrade.py by itself will simply print to standard output all the statements it plans to execute, so that they can be reviewed. Running python upgrade.py —execute will actually execute the scripts.

If an error was found in executing a statement, processing stops immediately.

You can find upgrade.py in Django Snippet 849.

Improvements

There are lots of improvements to this script that I’d like to add if I got around to it. The ones I’d like to see most is support for “rollbacks”. Comments and suggestions are most welcome!