Avoiding Disaster Post Release

Mistakes are made. That is a fact, and it is something you must deal with and do your best to mitigate. If you are responsible for your source control repositories and/or the developers who commit to them (especially anyone with the power to release to production), then you need to keep track of what is happening in those repos.

Recently, I observed two overnight releases for different systems. Both seemed to go well and post-release smoke tests suggested that everything was working as expected.

Early the following day, we received reports that one of the systems wasn’t responding as it should and needed a hotfix made and deployed ASAP.

What should you do? Well, following the standard procedures and GitFlow, we went to spin off a hotfix branch from master, fix the issues and push it through the release process… But as it turned out, master did not contain the latest release; the release itself had never been merged into master. I hear your cries, I see the pitchforks, but do not dismay. All is not lost!

Luckily, the release had been pushed through using mostly the correct procedure. It was just a tag in develop rather than a separate release branch, and once it had gone to production it had not been merged into master. Because the tag existed, though, we could check out the commit tagged with the latest version number, which gave us a good copy of the code that went live.

From here we could fix the issue, release it to production and then merge the hotfix into master. Crisis averted, clients and account managers were happy because their system worked again, and management were relatively satisfied.
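As a rough sketch of that recovery, the git commands involved can be listed out as below. The tag `v1.4.0`, the branch name `hotfix/1.4.1`, the remote name `origin` and the helper `hotfix_commands` are all illustrative assumptions, not the names we actually used. The script only prints the commands; it does not run them.

```python
import shlex


def hotfix_commands(release_tag, hotfix_branch, remote="origin"):
    """Build the git commands for a hotfix cut from a release tag.

    All names passed in are hypothetical; nothing is executed here.
    """
    return [
        # Make sure the local clone knows about the release tag.
        ["git", "fetch", remote, "--tags"],
        # Branch from the tagged commit (the code that went live), not from master.
        ["git", "checkout", "-b", hotfix_branch, release_tag],
        # ... fix, commit, test and release from the hotfix branch here ...
        # Then bring the release and the hotfix into master together.
        ["git", "checkout", "master"],
        ["git", "merge", "--no-ff", hotfix_branch],
        ["git", "push", remote, "master"],
    ]


# Dry run: print each command so it can be reviewed before anyone types it.
for command in hotfix_commands("v1.4.0", "hotfix/1.4.1"):
    print(shlex.join(command))
```

Under GitFlow the hotfix branch would normally be merged back into develop as well, so that the fix is not lost in the next release.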

The problems we actually faced were minor, but the situation posed a real risk. Another developer could have blindly spun off a branch from master and proceeded to fix the issue. Had that fix been released, it would have been built on the previous version’s codebase, which was inconsistent with the current database schema, and it would have caused many more problems in the production environment.

In this case, that would not have happened: the issue was caused by new code, so they would never have been able to recreate it, let alone find (or write) code to solve it. But that poses another kind of risk. The developer might then have assumed that the production environment was the cause and sunk time into pointless investigations, leaving clients unable to use the system for their daily tasks.

It was also uncertain whether any commits added to develop after the version was tagged had in fact been released to production and introduced the bugs. When one thing – no matter how trivial – is not right in your source control system, you lose confidence in other aspects of it, too. We could only make a judgement call on where to branch our hotfix from, especially as the person who performed the release was not in London but on the west coast of the US, and so wasn’t available to advise or assist. What was a relatively simple fix that should have taken less than an hour took almost 4 hours, plus some extra time sorting out source control afterwards.

I will be implementing tighter procedures and a system of checks in the near future to reduce the likelihood of this recurring. After all, we are only human and mistakes happen: you forget to commit something or miss a merge.
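One possible check, sketched here under assumptions rather than as the system I will end up with, is to ask git whether a release tag is already contained in master. The function name `tag_is_merged` and the injectable `run` parameter are hypothetical; the underlying command, `git merge-base --is-ancestor`, exits with 0 when the first commit is an ancestor of the second and 1 when it is not.

```python
import subprocess


def tag_is_merged(tag, branch="master", run=subprocess.run):
    """Return True if the commit behind `tag` is already contained in `branch`.

    `git merge-base --is-ancestor A B` exits 0 when A is an ancestor of B
    and 1 when it is not; any other exit status signals an error.
    `run` can be swapped for a stub so the logic is testable without a repo.
    """
    result = run(["git", "merge-base", "--is-ancestor", tag, branch])
    if result.returncode not in (0, 1):
        raise RuntimeError(f"git could not compare {tag!r} with {branch!r}")
    return result.returncode == 0
```

Called from inside the repository, for example as `tag_is_merged("v1.4.0")` with a hypothetical tag name, a `False` result would have flagged the missing merge before anyone tried to branch a hotfix from master.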

What Should Happen?

  1. Merge your approved pull requests into develop
  2. Create a release branch from the last commit in develop that is meant to go live (and nothing after it)
  3. Perform a final round of testing and UAT on the release branch against a staging site
  4. Once signed off, tag the latest commit in the release branch with a version tag
  5. Release to production
  6. Merge the release branch into master
  7. (Optional, but recommended) merge master into develop to ensure the tag, etc. is in development
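
The git side of the steps above can be sketched as a list of commands. This is only an illustration: the version number `1.5.0`, the `release/` and `v` naming prefixes, the remote name `origin` and the helper `release_commands` are assumptions, and steps 1, 3 and 5 (pull requests, testing and the deployment itself) happen outside these commands. The script prints the commands and does not run them.

```python
import shlex


def release_commands(version, start_point="develop", remote="origin"):
    """Build the git commands for steps 2, 4, 6 and 7 of the release process.

    Branch and tag naming is hypothetical; nothing is executed here.
    """
    release_branch = f"release/{version}"
    tag = f"v{version}"
    return [
        # Step 2: cut the release branch from the commit that is to go live.
        ["git", "checkout", "-b", release_branch, start_point],
        # Step 4: after sign-off (step 3), tag the tip of the release branch.
        ["git", "tag", "-a", "-m", f"Release {version}", tag, release_branch],
        ["git", "push", remote, release_branch, tag],
        # Step 6: after going live (step 5), merge the release into master.
        ["git", "checkout", "master"],
        ["git", "merge", "--no-ff", release_branch],
        ["git", "push", remote, "master"],
        # Step 7: merge master back into develop.
        ["git", "checkout", "develop"],
        ["git", "merge", "--no-ff", "master"],
        ["git", "push", remote, "develop"],
    ]


# Dry run: print the commands as a checklist for whoever performs the release.
for command in release_commands("1.5.0"):
    print(shlex.join(command))
```

Keeping the sequence in one place makes it harder to skip step 6, which is exactly the merge that was missed in the story above.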

Check out the Source Control and Release Strategy article for more details on release management.

