Database and Continuous Delivery
Nobody likes getting called at 2 AM. But it happens. More often than not, the reason for that page will have something to do with the database. A SQL job failed, a stored procedure didn’t get deployed, a missing index on a table. That problem doesn’t seem to happen with source code (C#, JavaScript, etc.), so why does it happen with databases?
The answer lies in the workflow, or lack thereof. A typical process will have a person or persons keep track of the changes being made in the database on a development server. When it comes time to promote those changes to a test environment or pre-production environment they will either manually generate a script, or use a tool like Red Gate’s SQL Compare, to generate the script. But development doesn’t happen in a bubble, other changes for other features are being made against the development database. Using the tool they have to make sure they only include changes that should be promoted. Even worse is when a different script is generated for each environment. With such a manual process, it is no wonder the phone goes off at 2 AM.
A more recent trend is continuous delivery, which in a nutshell, is delivering smaller changes faster to production. So now that manual process has to be done faster. Doesn’t that sound like gallons of fun? But continuous delivery is a good thing. Being able to deliver changes faster allows for faster feedback, and helps the software being developed meets the needs of the users.
Taking a step back, it is important to ask, why does this problem with databases and not source code? Most likely, the source code is packaged, either via an MSDeploy package for a website, or an MSI file for a windows service. That same package, or artifact, is deployed multiple times prior to getting to production using some sort of automated process. There is little doubt that code will work in production. That same basic philosophy should be applied to the database. Some sort of package or artifact should be generated. That same artifact should be then deployed through the various environments in an automated fashion prior to getting to production.
To do all of this requires major mind shifts, in both process and tooling. In other words, a lot of cheese needs to be moved.
High Level View of the Process
It is first important to see the new process at a 50,000-foot view. The database development process should closely match the source code process.
Story Comes in with Database Schema Change
Database work is done on a local instance of SQL Server
Schema changes are checked into source control
The continuous integration build monitoring source control kicks off and generates the database artifacts (it could be a delta script between source control and production or a database project)
An automated process, be it a build server or release management server, promotes the artifacts through the environments.
Get the Database Under Source Control
The key to this process is the database schema must be placed into source control. It is the only way to generate a single artifact. There are a number of tools out there that will read the schema of a database and convert it to script files to put into source control. A good tool will help with 95% of the scenarios and allow you to write custom migration scripts to make complex changes to the schema.
Schema means all the tables, views, stored procedures, functions, roles, users, schemas, triggers, and anything else that makes up the database. Basically everything but the data a user can manipulate should be under source control.
Artifacts
The continuous integration build monitoring source control creates some sort of artifact. That artifact is promoted through the various environments. This is very similar to MSI packages or MSDeploy packages with transform files. The MSDeploy package or MSI is generated once and deployed to the web server or application server.
The artifact being created can be any number of things. For example, the Red Gate tools create a nuget package representing the schema and migration scripts stored in source control. Or, it can be a delta script between what is in source control and what is in production. It could be a combination of the two.
Local Database Development
Most shops have a shared development database where all development work is done. This process turns that on its head. Instead, all database development work should be moved onto a local instance of the database.
The shared database model is limiting for the following reasons:
Any breaking schema change will affect other developers.
Changes not ready for production are intermingled with changes ready for production.
There is no concept of branching, while source code can be branched
The first time a database deployment is tested is when it moves to the environment above the Dev environment, which is typically called QA or testing. This is in contrast to a code deployment, which is first tested in the Dev environment.
It is the equivalent of storing the source code on single file share and having developers make changes against the same file. Unless you work for a really bad company, that approach was abandoned a long, long time ago.
Branching
Remember continuous delivery is about deploying smaller changes faster. This means the master or main branch of your source code should be ready to production at any point. This process helps with that. This allows developers to experiment with database changes, and because they are done locally those changes will not affect anyone else on the team. The delta between the database development server and production database server is what going to be pushed up to production.
Having a local database also encourages the use of feature branches. If two developers are working on a schema change that will break the source code they can create a feature branch to share. Schema changes are made in that feature branch and each developer applies them to their local database. They then fix the code in the feature branch. Once both the code and database are ready then they are merged into master and deployed to the environments at the same time.
Automated Deployments
The application deploying the source code artifacts to environments should be the same application deploying the database artifacts to the environments. Some shops have the build server; such as TeamCity or TFS Build handle the deployments. While other shops use a specific release management tool, such as Octopus Deploy or Microsoft’s Release Management. It doesn’t matter as long as the tool deploying the source code artifacts is the same tool deploying the database artifacts. It should be automated for the source code artifacts and then have a DBA deploy the database artifacts.
Obstacles
The hardest part to this process is the mind-shift for the development team and any of the support staff doing the deployments. A lot of cheese is being moved. It will take a while to build the necessary trust in the process. There might be the need for multiple manual verifications prior to deploying to production until that trust is built. That is okay, as long as those manual verifications are eventually removed.
Developing locally will fail unless it is easy to get the schema changes and any necessary data onto the local database. Figure out how those two problems will be solved before trying to implement it. Some tools, such as ones provided by Red Gate solve that problem very easily. But they also come at a cost. It is entirely possible a custom script will be needed to pull the necessary data from a testing environment. Just remember, the higher the friction, the slower the adoption.
Steps to Getting Started
Pick a tool to convert a database schema into files to be placed into source control.
Figure out how to get database schema and data down to the local databases (ideally the tool chosen in the previous step should be able to handle that).
Get the database schema under source control.
Update the continuous integration build to generate the database artifact to push to the various environments.
Update the deployment process to deploy the database artifacts to the development environment.
Work through any road blocks the deployment to the development database.
Repeat the previous two steps for each of the pre-production environments (test, pre-prod, QA, etc.)
Update the production deployment process to include the database artifact.