Stress-Less Deployment
It’s noon at Lab Zero. Soft corporate punk plays on the hi-fi. Some people are eating lunch, talking politics. I look up from my salad. In #deploy somebody has written ‘Everything is deployed. Smoke tests are running. I’ll send an update when they’re complete.’ And I’m agitated. They added hard-boiled egg again.
Do you think of deployment as an off-hours activity with high stakes and frequent fire drills? Do you imagine a lone developer saying a prayer, releasing a thousand-line shell script from the mouse buffer, and trying to remember whether to press return? You won’t see that at Lab Zero. Today, deployment is all grown up. It’s part of our everyday business.
Why Are We So Calm During Deployment?
Stress-Less Deployment is made possible by the steady, disciplined application of a few principles, practices, and tools.
We’re Running Smoke Tests
Anything we push can be rolled back immediately. If all the smoke tests pass, we go forward. If a smoke test fails, we tell the business stakeholder what’s failing (and why, if we already know), and ask what they want to do: roll back, or go forward with the known failure. We FAIL SAFE.
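As a rough sketch, a smoke test can be as small as a script that hits a health-check endpoint right after the deploy and exits non-zero if anything looks wrong. The URL and endpoint below are hypothetical, and the script assumes Node 18+ for the built-in fetch:

    // smoke-test.js: minimal post-deploy smoke check (endpoint and URL are made up)
    // A non-zero exit lets the pipeline surface the failure and prompt the rollback decision.
    const BASE_URL = process.env.SMOKE_URL || 'https://staging.example.com';

    async function main() {
      const res = await fetch(`${BASE_URL}/healthcheck`);
      if (!res.ok) {
        console.error(`Smoke test failed: HTTP ${res.status} from ${BASE_URL}/healthcheck`);
        process.exit(1);
      }
      console.log('Smoke test passed.');
    }

    main();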
Our Staging Environment is Identical to Production
Because we deploy our code in Docker containers, we know the OS and framework software are identical across Dev, Test, Staging, and Production. This makes it much more likely that smoke tests will pass without incident, and it also makes it much simpler to troubleshoot any issues that come up.
We’ve Done Automated Testing All Along the Way
Testing tiers give us increasing levels of confidence:
Unit Tests
In unit tests all infrastructure is removed, and we’re testing code in isolation from everything else. We use RSpec for Ruby, Jest for Node and JavaScript, and ExUnit for Elixir. These tests are typically pretty low-level. An example is input validation: our unit tests make sure that we only accept certain kinds of input.
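For instance, a Jest unit test for input validation might look something like this (validateEmail is a hypothetical helper, included only to illustrate the shape of these tests):

    // validate.test.js: unit test with no infrastructure involved
    // validateEmail is a hypothetical helper, defined inline so the test is self-contained.
    const validateEmail = (input) =>
      typeof input === 'string' && /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(input);

    test('accepts a well-formed email address', () => {
      expect(validateEmail('dev@example.com')).toBe(true);
    });

    test('rejects input that is not an email address', () => {
      expect(validateEmail('not-an-email')).toBe(false);
      expect(validateEmail(null)).toBe(false);
    });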
Functional Tests
When we add in some fakes, some stubs, a mock database, and some libraries, we have a functional test that is more complex than a unit test. A functional test is more likely to fail than a unit test because of the dependencies and behaviors that are added in, which is why we pass the unit tests first and then move on to functional tests.
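Here’s a rough sketch of that in Jest, standing a faked repository in for the real database (UserService and the repository interface are hypothetical, for illustration only):

    // user-service.test.js: functional test, real service logic with a faked database
    // UserService and the repository interface are hypothetical illustrations.
    class UserService {
      constructor(repo) { this.repo = repo; }
      async register(email) {
        if (await this.repo.findByEmail(email)) throw new Error('duplicate email');
        return this.repo.insert({ email });
      }
    }

    test('register rejects duplicate email addresses', async () => {
      const fakeRepo = {
        findByEmail: jest.fn().mockResolvedValue({ id: 1, email: 'a@example.com' }),
        insert: jest.fn(),
      };
      const service = new UserService(fakeRepo);
      await expect(service.register('a@example.com')).rejects.toThrow('duplicate email');
      expect(fakeRepo.insert).not.toHaveBeenCalled();
    });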
Integration Tests
An integration test is an end-to-end test that includes everything in the environment. We build integration tests backed by Selenium or Puppeteer, using language-specific frameworks like Cucumber, WebdriverIO, and Cypress (Node). The tests are script-like and often human-readable. Integration tests are at the top of the stack and are the most like human testing, so when they pass, we get the greatest degree of confidence. An example of a common integration test is logging in: such an automated test simulates landing on the login page, typing a username and password, and successfully logging in.
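In Cypress, that login test looks roughly like this (the routes, selectors, credentials, and page text are placeholders, not a real application):

    // login.cy.js: end-to-end login test in Cypress; routes and selectors are placeholders
    describe('login', () => {
      it('lets a user sign in with valid credentials', () => {
        cy.visit('/login');
        cy.get('input[name="username"]').type('smoke-test-user');
        cy.get('input[name="password"]').type('correct-horse-battery-staple');
        cy.get('button[type="submit"]').click();
        cy.url().should('include', '/dashboard');
        cy.contains('Welcome').should('be.visible');
      });
    });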
We Use Power Tools
We leverage power tools to get the most out of the time we spend writing tests, and to increase our confidence as quickly as possible.
A Jenkins server manages all the tests, as well as the deployment, in what we call the pipeline. Our Jenkins server is typically configured to start running a suite of tests as soon as code is merged.
Sauce Labs Selenium Grid lets us simulate a wide variety of mobile and desktop browsers and configurations for our website integration tests. It keeps us from shipping a login page whose submit button is invisible on Internet Explorer.
A feature flipper lets us release new features to a small subset of users while everything stays the same for everybody else. This gives us confidence that the feature is what users need, not just that the code is working. It isn’t a defense against bad code, but it is a defense against a rollback that would impact a lot of people.
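At its simplest, a feature flipper is just a guarded branch in the code keyed to a rollout percentage. Here is a hand-rolled sketch; real projects usually reach for a library or service, and the feature name and percentage below are made up:

    // featureFlags.js: a minimal hand-rolled feature flipper, for illustration only
    const ROLLOUT_PERCENT = { newCheckout: 5 }; // expose the feature to ~5% of users

    // Hash the user id into a stable bucket so each user always gets the same experience.
    function isEnabled(feature, userId) {
      const percent = ROLLOUT_PERCENT[feature] ?? 0;
      let bucket = 0;
      for (const ch of String(userId)) bucket = (bucket * 31 + ch.charCodeAt(0)) % 100;
      return bucket < percent;
    }

    module.exports = { isEnabled };

    // Usage: only the flagged subset of users hits the new code path.
    // if (isEnabled('newCheckout', user.id)) { renderNewCheckout(); } else { renderOldCheckout(); }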
We Use Protocols that Build Confidence
The deployment process is different for each organization. Often, there is one stakeholder who wants to see valuable features get into production as early as possible, and another who wants to keep buggy features out of production as long as possible. Both of them are reachable to make a decision during a deployment.
We know in advance how and when we will communicate results of smoke tests to those stakeholders, and how we will make decisions about rolling back or going forward.
Doesn’t Anything Ever Go Wrong?
Of course. Here’s a note from the field manual:
“Operating system software upgrade could not be rolled back.”
We once had a failed node in production that could not be rolled back to a stable version of the code, because the operating system upgrade broke the new code *and* the previous stable release. The most practical approach was to patch the code to work with the update. Our tools really helped us here: we were able to build and stabilize a patch within 45 minutes of deployment, and had it in production within an hour.
None of This Is Hard
None of this is hard. The tools are actually pretty simple. They’re not expensive, either: most of it is open source, and the services are cheap compared to the value they add. The smallest startup and the largest enterprise can both take advantage of the same value. All it takes is the sense and the discipline to outline what you’re going to do, and why, and then do it the same way every time.
Which is why, on deploy day, you won’t find us sweating and swearing. At Lab Zero, any day can be deploy day.
Lab Zero is a San Francisco-based product team helping startups and Fortune 100 companies build flexible, modern, and secure solutions.