Thursday, October 3, 2019

DevOps in detail, Trunk-based Development

In this post I'm taking a short diversion to discuss one aspect that is not directly related to the implementation of a DevOps solution, but is very relevant to the success of the DevOps strategy supported by such a solution.

Gitflow vs. Trunk-based Development

As a DevOps leader in your organization, you have visibility on most SW projects being developed around you. Take a look at the development teams behind those projects and think of the processes they follow to move the code base forward.

If you work for a large corporation, most likely the teams are using Gitflow. It's been a de facto standard for some years now and has displaced the traditional feature branch-based development processes used during the pre-cloud era.

The problem with feature branch-based development was that it just couldn't keep pace with the rate of change required in the cloud era. The standardization of platforms (Linux, Android, iOS) and advances in SW packaging (snap, VMs, containers, APKs) enable fast deployment of new SW versions to users, but a feature branch-based development process cannot exploit that ability. When you're releasing twice a week, the number of open feature branches and associated merges the dev teams need to handle becomes unmanageable. A process allowing faster progress of the master branch's head was necessary.

Hence Gitflow was invented. In Gitflow, development work and release work are pipelined, so work on one does not prevent progress on the other. The frequency of releases is limited only by the time required to go through the release process. New features can be developed while existing ones are polished for release. The master branch is always tidy and shiny and points to the latest release of the SW. Everybody is happy and life smiles at you.

This figure illustrates a typical Gitflow process; you can browse here for a short summary.


Cool, Gitflow helps increase the release cadence. However, it still does not allow you to achieve Continuous Delivery, nor does it tackle the main issue that arises when developers start using feature branches: merge hell.

Merge hell and 'distance' between developers

Traditionally, feature branches have been used to craft new features into an existing SW baseline without compromising the quality and stability of the main branch. Bug fixes are also introduced through short-lived feature branches, in the absence of a better mechanism. Regardless of whether the development process is feature-based or Gitflow-based, the main issue with feature branches is that when a team or developer starts working on a feature branch, it's like an army dispatching a platoon to take an enemy outpost. If the outpost is near and easy to take, the platoon will soon complete its goal and rejoin the main army, which immediately reaps the bounty captured in the operation. However, if the outpost is in a distant land and/or is a tough target, the platoon will stay away from its comrades for quite some time. Many things might happen during that time: other platoons could be dispatched to take conflicting targets, bounties might have lost relevance by the time they're finally captured, valuable soldiers lost along the way might delay the achievement, and perhaps worst of all: when the platoon gets back with the bounty, the army might be many miles away from where they left it.

It doesn't matter if the target remains strategically sound, the bounty is still valuable and the platoon takes no losses in the op: if the army is not there when they get back, the whole op might be a blunder. Imagine that happening to a dozen platoons dispatched every week or two. There's possibly no way an army can gather all those dispersed platoons. And that's indeed the main issue with feature branches: when a feature is done and the branch is to be merged into the main code base, that code base may have moved substantially, forcing the dev team to make a big effort to adapt their changes, which were made on a code base hundreds of lines away from the current one. And that happens to each and every feature team. That phenomenon is known as merge hell, and regardless of how good the team is, how valuable the feature or how complex the code, there's no way the team can escape it.



How do we prevent merge hell? If we were in the army the answer would be "right, the outpost is a thousand miles away and is well guarded so there's no easy way to do this, period". Fortunately we're not in the army. The root of the merge hell problem is in what can be referred to as 'distance' between developers. The longer different developers work separately on the same code base, the more their versions of the code base diverge. Let's call that divergence a 'distance'. The greater the 'distance', the harder it will be to walk it back to a common point. If we could minimize that distance and keep it to a reasonable size for the number of teams working on the common code base, we would end merge hell once and for all. We need to pick close and weak outposts so our platoons can leave early in the morning and be back with any captured bounties before dusk, move the army between dusk and dawn, and start all over again the next day.
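One rough way to put a number on that 'distance', assuming the code lives in Git, is to count the commits that exist on only one side of two branches. The sketch below relies on `git rev-list --left-right --count A...B`, which prints two tab-separated counts: commits reachable only from A and commits reachable only from B. The function names, the default branch names and the choice of adding both counts together are illustrative assumptions, not an established metric.

```python
import subprocess


def parse_divergence(rev_list_output: str) -> tuple[int, int]:
    """Parse the two counts printed by `git rev-list --left-right --count A...B`."""
    left, right = rev_list_output.split()
    return int(left), int(right)


def distance(rev_list_output: str) -> int:
    """Hypothetical 'distance': total number of commits found on only one side."""
    behind, ahead = parse_divergence(rev_list_output)
    return behind + ahead


def measure(trunk: str = "master", branch: str = "HEAD") -> int:
    """Run git in the current repository and return the distance to the trunk."""
    out = subprocess.run(
        ["git", "rev-list", "--left-right", "--count", f"{trunk}...{branch}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return distance(out)
```

A branch that is 3 commits behind the trunk and 12 commits ahead of it would have a distance of 15 under this definition; a branch that was synced this morning and holds one small commit would have a distance of 1.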

Trunk-based development

Trunk-based development (TBD) is a systematic approach to avoid merge hell and achieve Continuous Deployment. To decrease developer distance, all developers sync on a single code base, 'the trunk'. Updates to that code base are submitted in small chunks, ideally sized at one day's worth of work, or even smaller. Everybody is aware of and participates in those updates on a daily basis. That way, all developers share a single, common view of the code base, like a shared mind (sort of).



TBD can be achieved by following a few simple rules:

1) no branches: at every point in time, all developers see the same code base (the trunk)
2) single source-of-truth: the trunk contains everything (this implies what's not in the trunk does not exist)
3) short-lived changes: any update to the trunk should be crafted and submitted in one day, exceptionally two (if e.g. someone falls sick before being able to submit)
4) continuous integration: each and every update to the trunk is integrated ASAP and proper feedback is provided to the update author(s)
5) broken master goes first: if feedback indicates the master branch is broken, fixing it is the single highest priority in every developer's task list
6) code review goes second: outstanding code reviews are the second highest priority in every developer's task list
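
As an illustration of rule 3, a team could periodically list the pending changes that have outlived the one-to-two-day limit. This is only a sketch: the `stale_changes` helper, the two-day threshold and the change names are assumptions made for the example, and the opening timestamps would have to come from whatever review tool the team uses.

```python
from datetime import datetime, timedelta

# Assumption: rule 3's "one day, exceptionally two" expressed as a hard limit.
MAX_AGE = timedelta(days=2)


def stale_changes(opened_at: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of pending changes that have been open longer than MAX_AGE."""
    return sorted(
        name for name, opened in opened_at.items() if now - opened > MAX_AGE
    )


# Hypothetical pending changes and the time each one was started.
pending = {
    "fix-login": datetime(2019, 10, 3, 9, 0),
    "new-report": datetime(2019, 9, 30, 9, 0),
}
overdue = stale_changes(pending, now=datetime(2019, 10, 3, 12, 0))
```

Here only `new-report` is flagged, since it has been open for more than three days.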

By following those rules, distance between developers is minimized. All developers are aware of what updates are integrated into the trunk every day. They proactively keep their copies of the trunk updated, eagerly checking outstanding updates to review and browsing comments on reviewed updates. Eventually, once your deployment process is streamlined, you can reach the nirvana of Continuous Deployment, having each and every update deployed to production promptly and safely. At that point, your job is done. There's little else you can do from the DevOps perspective to improve the business, so enjoy a well-deserved rest while you keep the DevOps engine humming.

Getting there and resistance to change

Okay, so you're convinced TBD is where you want to go in your DevOps strategy. Now all that's left is convincing everybody else that's the way to go. And that's the toughest part (and the reason I wrote this post in the first place). If you check the TBD rules above, there are a number of well-established behaviors the developers need to change, and some new ones they need to acquire.

Starting with senior developers, people feel quite comfortable with the Gitflow process. It allows a team to keep feature branches open indefinitely, even several of them in parallel, until features are done. They don't need to check the trunk every day, nor are they obliged to check on their colleagues' work at all. They can blindly move forward with features, then blame merge issues when feature integration starts causing trouble ('it works in my branch').

Moving people out of their comfort zone is no easy feat. You'll need cooperation from Managers and Product Owners in order to shift teams to TBD. The following advice may come in handy when you start walking that shaky path:

- convince senior management TBD is the way to go. For this you can use business arguments. It is well documented that Continuous Deployment brings a number of benefits to a SW business, and CD cannot be achieved without TBD. Leverage articles from the Internet, e.g. this.

- tell Product Owners how CD will improve their products and squeeze more out of their budgets. Explain that CD is hard to achieve with Gitflow. Bring them over to your side to help shift teams towards TBD.

- with the support of senior management, you can work with Managers to define goals that steer teams towards TBD. Craft concrete goals, e.g. average number of commits per day or average time a Pull Request/Merge Request remains open. Managers are good with people, so ask for their help to coach developers on their path to TBD.
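
The two example goals above can be tracked with very little code. The following is a minimal sketch, assuming the commit dates and the open/merge timestamps of each Pull Request/Merge Request have already been exported from your Git hosting tool; the function names and the sample data are hypothetical.

```python
from datetime import date, datetime, timedelta


def commits_per_day(commit_dates: list[date]) -> float:
    """Average commits per calendar day over the span covered by the data."""
    if not commit_dates:
        return 0.0
    days = (max(commit_dates) - min(commit_dates)).days + 1
    return len(commit_dates) / days


def mean_open_time(requests: list[tuple[datetime, datetime]]) -> timedelta:
    """Average time between opening and merging a request."""
    if not requests:
        return timedelta(0)
    total = sum((merged - opened for opened, merged in requests), timedelta(0))
    return total / len(requests)


# Hypothetical sample: six commits spread over three days.
commits = [date(2019, 10, 1)] * 2 + [date(2019, 10, 2)] * 3 + [date(2019, 10, 3)]

# Hypothetical sample: two requests, open for 4 hours and 8 hours.
requests = [
    (datetime(2019, 10, 1, 9, 0), datetime(2019, 10, 1, 13, 0)),
    (datetime(2019, 10, 2, 9, 0), datetime(2019, 10, 2, 17, 0)),
]

rate = commits_per_day(commits)
avg_open = mean_open_time(requests)
```

With this sample, `rate` is 2.0 commits per day and `avg_open` is 6 hours. The trend of these numbers over time is likely more useful than any single value, since a team moving towards TBD should see the first go up and the second go down.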

Conclusion

TBD is the new standard development process. It is a gateway to Continuous Deployment and brings many benefits to a SW development organization. But successfully adopting it requires discipline and motivation, and that won't change overnight. As a DevOps leader, you'll need patience, perseverance, and cooperation from other areas of the organization in order to successfully transition from Gitflow to this new process.
