Over the last couple of years my team has gone almost 100% CI/CD (the only manual step is a human pushing the “go to prod” button), and it’s been great. We’ve found this has tightened up our engineering discipline and allowed us to deliver features for our customers faster and more confidently, with far fewer bugs. We no longer think of a deploy as a “scary” thing. It’s all blue-green with no downtime, one-click rollbacks, and full monitoring. And if we need to hold a feature back to coincide with a PR campaign, we simply put it behind a feature flag, so engineers can deploy whenever they want and stakeholders can view and use the feature whenever they want, without all customers being exposed to it.
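For anyone unfamiliar with the pattern, here is a minimal sketch of the kind of feature-flag check I mean. It is illustrative only and not our real code: the `FeatureFlag` class, the `render_dashboard` function, and the user IDs are all made-up names, and a real setup would probably read flags from a config service rather than hold them in memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag: hidden from everyone except an allowlist until released."""

    name: str
    released: bool = False  # flipped to True when the PR campaign launches
    allowed_users: set[str] = field(default_factory=set)  # e.g. stakeholders

    def is_enabled_for(self, user_id: str) -> bool:
        # The code is already deployed to prod; this only controls who sees it.
        return self.released or user_id in self.allowed_users


def render_dashboard(flag: FeatureFlag, user_id: str) -> str:
    """Pick the new or old code path for this user based on the flag."""
    if flag.is_enabled_for(user_id):
        return "new dashboard"
    return "old dashboard"


flag = FeatureFlag("new_dashboard", allowed_users={"stakeholder-1"})
print(render_dashboard(flag, "stakeholder-1"))  # stakeholder sees the new path
print(render_dashboard(flag, "customer-42"))    # regular customer does not
```

The point is that deploying the code and exposing the feature are two separate decisions, so the release date never has to dictate the deploy date.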
Recently a new “Dev Ops” manager was hired, and one of the first things he did was implement a “deploy window” policy: 1pm-5pm, Monday through Thursday. Anything outside of that window requires upper management approval (spoiler: it is never granted). This really threw a wrench into our process, since our team’s definition of done for a feature is “deployed to prod and verified”. This leaves us with a few options:
- Continuously deliver to dev/stage and wait for the “window” to deploy to prod. Downside: we queue up risk by having lots of changes go out in one deploy instead of one small change at a time. And god forbid an urgent bug fix has to go out.
- Introduce a branching strategy. Downside: unnecessary process overhead and more chances for errors (https://martinfowler.com/bliki/FeatureBranch.html), plus we still have the issue of larger deploys.
- Approach management about a 4-day work week. Downside: Only solves for Fridays.
Is this an acceptable policy for a software engineering company that considers itself Agile? It feels very Waterfall to me. And if it is acceptable, how do you recommend working with it in a way that doesn’t violate CI/CD (specifically the CI part, as we would no longer be continuously integrating to production)?
Note for clarification: While the “deploy to prod” button could technically be hit after many changes are queued up (and this is sometimes what happens), most devs prefer not to do this. If a bug is introduced, it is much easier to pinpoint the cause in one small change than in a long string of changes.