Last year, I read The DevOps Handbook, which I seriously recommend to anyone who wants to be at the center of a high-performing, fast-paced, agile technology company and wants to understand what happens there and why. The DevOps Handbook is a brilliantly written deconstruction of the methods used by agile organizations in manufacturing industries, such as Toyota, and of how to apply those methods to the software industry; in short, it is a deconstruction of the DevOps movement. Written as a manual, it guides readers through the steps that both organizations and individual contributors need to follow in order to fully embrace DevOps. Focused on case studies and actionable practices, it summarizes existing knowledge into jargon-free guidelines that organizations can implement sequentially. It clarifies the concepts of DevOps by encapsulating this knowledge into a framework named the Three Ways of DevOps. Finally, it presents extensive case studies from renowned companies with strong engineering cultures, to prove that it can be done, even though doing so can seem extraordinarily difficult.
The Three Ways of DevOps represent a philosophy that defines what DevOps is through three sets of principles that interact and reinforce one another. Following these principles doesn’t guarantee immediate success in DevOps implementations, but understanding them can unlock some incredible powers within organizations. Furthermore, understanding how and when to apply them can make the difference between an organization’s success and failure. Every software engineer should be made aware of these principles, in order to be able to champion them and contribute to a strong engineering culture within their organization. A strong engineering culture is built not only on technical prowess but also on understanding the dynamics of how work flows within the organization, and how each individual contributor can make a difference. Let’s take a look at the Three Ways of DevOps.
First Way
We increase flow by making work visible, by reducing batch sizes and intervals of work, and by building quality in, preventing defects from being passed to downstream work centers. By speeding up the flow through the technology value stream, we reduce the lead time required to fulfill internal and external customer requests, further increasing the quality of our work while making us more agile and able to out-experiment the competition. (pp. 15)
The Principles of Flow are the guiding principles that define the First Way. Their focus is on enabling the fast flow of work from conception all the way to completion. This means that we ought to ensure that work can flow, as quickly as possible, between ideation, implementation, testing, quality assurance and deployment. In a traditional manufacturing process, this would be the process of taking work from an initial stage of raw materials all the way to a complete product at the end of the production line. Enabling this fast flow increases an organization's competitive advantage because software becomes easier to produce, easier to modify and easier to maintain, meaning that the organization can adapt more quickly to constant changes in its surroundings.
One way we can increase flow speed is by making work visible and limiting work-in-progress. When all work is visible, in queues, to all stakeholders, an organization can organize and prioritize that work quickly. Making work visible also helps teams focus, because everyone knows, at all times, what the priorities are. By limiting work-in-progress, we reduce interruptions throughout the day, which increases productivity, sets clear expectations and priorities, and removes incentives to multitask.
Another way of increasing flow speed is by working in small batches. Traditional approaches, such as waterfall, rely on one large batch of work being complete before the next large batch can start. It is clear by now that large batches of work reduce flow and decrease quality, and that small batches of work are preferable.
Finally, reducing hand-offs between teams and eliminating waste in work-in-progress also contribute to increasing flow speed. Each time work moves from one team to another, it requires a lot of communication and creates potential moments where work is stopped, waiting for something to be resolved. Hand-offs also inevitably lose knowledge and context that cannot be fully transferred between teams. Removing sources of waste, such as partially done work, unnecessary processes, manual and non-standard work, context switching and extra features, increases the quality of software and flow speed by allowing individual contributors to focus on specific tasks one at a time.
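As a rough illustration of the idea, a work-in-progress limit can be expressed as a board column that refuses new work once it is full. This is only a minimal sketch: the `Column` class, its `wip_limit` field and the `board_summary` helper are hypothetical names made up for this example, not something taken from the book or from any particular tool.

```python
from dataclasses import dataclass, field


class WipLimitExceeded(Exception):
    """Raised when a column is already at its work-in-progress limit."""


@dataclass
class Column:
    name: str
    wip_limit: int  # hypothetical per-column limit, chosen by the team
    items: list = field(default_factory=list)

    def pull(self, item: str) -> None:
        # Refuse new work once the limit is reached, so something has to
        # be finished before something else is started.
        if len(self.items) >= self.wip_limit:
            raise WipLimitExceeded(
                f"{self.name!r} is at its limit of {self.wip_limit}"
            )
        self.items.append(item)

    def finish(self, item: str) -> None:
        # Finishing an item frees a slot in the column.
        self.items.remove(item)


def board_summary(columns: list) -> list:
    # One line per column keeps the state of all work visible at a glance.
    return [f"{c.name}: {len(c.items)}/{c.wip_limit}" for c in columns]
```

With a limit of two, a third `pull` raises `WipLimitExceeded` until one of the first two items is finished, which is the behavior a team would otherwise have to enforce by convention.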
Second Way
We make our system of work safer by creating fast, frequent, high quality information flow throughout our value stream and our organization, which includes feedback and feedforward loops. This allows us to detect and remediate problems while they are smaller, cheaper, and easier to fix; avert problems before they cause catastrophe; and create organizational learning that we integrate into future work. (pp. 27)
The Principles of Feedback are the guiding principles behind the Second Way. At their core, they enable a “fast and constant” flow of feedback from production all the way back to engineers. These principles focus on putting the necessary environment and tools in place for engineers to monitor their work and quickly react to any adverse situations. They focus on amplifying feedback from the operational side of software to the development side of software.
This can be achieved by creating an environment in which we can work in our systems “without fear, confident that any errors will be detected quickly, long before they cause catastrophic outcomes (…)”. Giving individual contributors the confidence to work without fear is a tremendously important aspect of an engineering culture.
It is also important to be able to see problems as they occur and to ensure that teams swarm to resolve them, generating more knowledge about the systems in question along the way. We need telemetry to accompany our production environment so that we can detect incidents long before they affect customers. We can also use telemetry to validate that we are achieving our desired goals through appropriate business metrics. Swarming “is a collective behavior exhibited by entities, particularly animals, of similar size which aggregate together, perhaps milling about the same spot or perhaps moving en masse or migrating in some direction.” (Wikipedia) In the technological sense, this means that when an issue occurs, even the smallest of issues, everyone involved must unite to resolve it as quickly as possible.
Finally, we should create a culture that ensures quality is pushed as close to the source as possible. This means that quality should be enforced by our peers, rather than by top-down bureaucratic processes. We can do this by ensuring that each individual feels empowered to push for quality and can take the time needed to achieve it.
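To make the telemetry point a little more concrete, here is a minimal sketch of one possible check: tracking the error rate over the most recent requests and flagging when it crosses a threshold. The `ErrorRateMonitor` class, the window size and the threshold are assumptions invented for this example; a real setup would normally rely on a dedicated monitoring system rather than hand-written code like this.

```python
from collections import deque


class ErrorRateMonitor:
    """Tracks the outcome of the most recent requests (hypothetical helper)."""

    def __init__(self, window: int = 100, threshold: float = 0.05):
        # Only the last `window` outcomes are kept; older ones drop off.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold  # assumed alerting threshold

    def record(self, ok: bool) -> None:
        self.outcomes.append(ok)

    def error_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        failures = sum(1 for ok in self.outcomes if not ok)
        return failures / len(self.outcomes)

    def should_alert(self) -> bool:
        # Alert while the problem is still small, before customers notice.
        return self.error_rate() > self.threshold
```

The sliding window is a deliberate choice in this sketch: it keeps the signal focused on what is happening now, so a burst of recent failures is not diluted by a long history of successes.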
Third Way
(…), our goal is to create a high-trust culture, reinforcing that we are all lifelong learners who must take risks in our daily work. By applying a scientific approach to both process improvement and product development, we learn from our successes and failures, identifying which ideas don’t work and reinforcing those that do. Moreover, any local learnings are rapidly turned into global improvements, so that new techniques and practices can be used by the entire organization.
The Principles of Continual Learning are the guiding principles of the Third Way. These principles ensure that an organization values learning and creates an engineering culture that proactively encourages continuous learning as an objective, enabling “a generative, high-trust culture that supports a dynamic, disciplined, and scientific approach to experimentation and risk-taking (…)”.
We achieve such an engineering culture by enabling a “generative organization”, characterized by actively seeking and sharing information. Responsibilities are shared throughout and failure results in reflection.
By institutionalizing the practice of improving daily work, we can also create an organization that has more time to improve, rather than constantly putting out fires. Daily work should not be a punishment, and any attempt at improving it should be well received. We should aim to reserve time to pay down technical debt, fix defects, refactor and improve, both in source code and in processes.
Finally, we should seek to share local discoveries so that they become global knowledge for the entire organization. We should actively support the sharing of information, such as postmortems, as well as the sharing of code libraries and configurations within the organization, so that everyone benefits from the work and research of each individual. We want to convert individual expertise into artifacts that can be shared with everyone.
Closing Remarks
Essentially, these three pillars (The Principles of Flow, The Principles of Feedback and The Principles of Continual Learning) allow work to flow faster and feedback to be faster and more constant, enabling teams to iterate faster and reduce lead time. Overall, applying these principles should improve the quality and speed of producing software but, as with most things, they should be taken with a grain of salt. Not all practices will work the same way in all contexts, and the introduction of any of these principles should be carefully monitored, along with its results, to ensure a successful implementation.