What DevOps means to me...

Over the last year or so a bunch of presumptuous European sysadmins and developers, joined by some of their American brethren and even a couple of us antipodeans (there are others too!), have been talking about a concept called DevOps. DevOps is the merger of the realms of development and operations (and, if truth be told, elements of product management, QA, and *winces* even sales should be thrown into the mix too).

The Broken

So … why should we merge or bring together the two realms? Well there are lots of reasons but first and foremost because what we’re doing now is broken. Really, really broken. In many shops the relationship between development (or engineering) and operations is dysfunctional to the point of occasional toxicity. Here’s an example I think everyone will be at least partially familiar with: the minefield that is project to production software deployment. Curse along as I explain. Development builds an application, the new hotness which promises customers all the whizz-bang features and will make the company millions. It is built using cutting edge technology and a brand new platform and it has got to be delivered right now. Development cuts code like crazy and gets the product ready for market ahead of schedule. They throw their masterpiece over the fence to Operations to implement and dash off to the pub for the wrap party. Operations catches the deployment and is filled with horror.

The Operations team summarizes their horror and says one or more of:

  • The wonder application won’t run on our infrastructure because {it’s too old, it doesn’t have capacity, we don’t support that version}
  • The architecture of the application doesn’t match our { storage, network, deployment, security } model
  • We weren’t consulted about the { reporting, security, monitoring, backup, provisioning } and it can’t be “productionised”.

But Operations persevere and install the new hotness - cursing and bitching throughout. Sadly, after forcing the application onto infrastructure and bending and twisting the architecture to get it running, the performance of the new application can be summed up as “epic fail”.

The Operations team sighs and starts logging problems and passing issues back to the Development team. Development’s responses generally come from the following pool:

  • It’s not our fault, our code is perfect, it’s just been poorly implemented
  • Operations are stupid and don’t understand the new hotness! Why can’t they implement the cutting edge technology? Why are they so backward?
  • It runs fine on my machine…

The interactions between teams quickly become a toxic blame storm. The customers (and by extension the shareholders, investors and management) then become the losers. The loop gets closed with the company losing bucket loads of money and everyone losing their jobs. EPIC and FAIL.

What’s different about DevOps?

DevOps is all about trying to avoid that epic failure and working smarter and more efficiently at the same time. It is a framework of ideas and principles designed to foster cooperation, learning and coordination between development and operational groups. In a DevOps environment, developers and sysadmins build relationships, processes, and tools that allow them to better interact and ultimately better service the customer. DevOps is also more than just software deployment - it’s a whole new way of thinking about cooperation and coordination between the people who make the software and the people who run it. Areas like automation, monitoring, capacity planning & performance, backup & recovery, security, networking and provisioning can all benefit from using a DevOps model to enhance the nature and quality of interactions between development and operations teams. Everyone in the DevOps community has a slightly different take on “What is DevOps?” We all bring different experiences and focuses to the problem space. I personally see DevOps as having four quadrants:

Simplicity

KISS is King and in that vein this section is simple too. Design simple, repeatable, and reusable solutions. Simplicity saves documentation, training, and support time. Simplicity increases the speed of communication, avoids confusion, and helps reduce the risk of development and operational errors. Simplicity gets you to the pub faster.

Relationships

Engage early, engage often. Development teams need to embed operations people into their project and development life cycles. Invite operational people to your scrum or development meetings. Share ideas and information about product plans and new technologies. Gather operational requirements when gathering functional ones. As a project progresses, test deployment, backup, monitoring, security and configuration management as well as application functionality. The more issues you fix during the project the fewer issues you expose your customers to when the application is live. Educate operations people about the application’s architecture and the code base. The more information operations people can feed you about a problem with the code the less trouble-shooting you need to perform and the faster the problem can be fixed.

Operations people need to bring development people into the problem and change management space. Invite developers into your team meetings. Share your roadmaps and upgrade plans. Understand where future development is heading to better ensure infrastructure deployments match product requirements. Developers also bring skills, knowledge and tools that can help make your environment easier to manage, more efficient and cleaner. Learn to code or, if you’re a hack-n-slash systems programmer like me, then learn to code better. Concepts like building tools with APIs rather than closed interfaces, distributed version control, test driven development, and methodologies like Agile Development, Kanban and Scrum can revolutionise operational practises in the same way they’ve changed the way code is cut. Don’t be afraid of ideas and approaches from outside your domain - we can all learn things from how others do things, even if it’s “let’s never do it that way again…!”. And ultimately? Guess what? Yep, we’re all on the SAME team. Remember that interactions between people rank as follows, in decreasing order of effectiveness (IMHO, but backed by some research):

  1. Face to face
  2. Video conference
  3. Phone
  4. IM & IRC
  5. Email

Process & Automation

Automation

Automate, automate, automate. Build or make use of simple and extensible tools (make sure they have APIs and machine readable input and output - see James White’s Infrastructure Manifesto). Use tools like Puppet (or others) to manage your configuration. Remember to extend your automation umbrella cross-domain and end-to-end in your environment - manage development, testing, staging and production environments with the same tools and processes. Not only does this have economies of scale benefits in support and management but it means you can test deployment and management alongside functionality as your application and new code roll toward production. Finally, when building process and automation always keep the KISS principle in mind. Complexity breeds opportunities for error:

Build simple processes and tools that are easy to implement, manage and maintain.
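As a rough illustration of what “machine readable input and output” can look like, here is a minimal sketch of a tiny disk-usage check in Python. The function name `disk_report`, the field names and the 90% warning threshold are all my own assumptions for the example, not part of any particular tool or manifesto.

```python
import json
import shutil


def disk_report(path=".", warn_ratio=0.9):
    """Return disk usage for `path` as a plain dict that serialises to JSON.

    The field names and the default threshold are illustrative assumptions.
    """
    usage = shutil.disk_usage(path)
    used_ratio = usage.used / usage.total
    return {
        "path": path,
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "used_ratio": round(used_ratio, 4),
        "status": "warn" if used_ratio >= warn_ratio else "ok",
    }


report = disk_report(".")
# One JSON object on stdout: easy for another tool to parse and act on.
print(json.dumps(report))
```

Because the output is structured rather than a sentence meant for human eyes, a monitoring system, a deployment script or a developer’s test suite can all consume it without screen-scraping.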

Process

Don’t underestimate the power of process and automation. But stop treating processes as stand-alone in your teams. Link processes together across domains: software deployment, monitoring, capacity planning and other “operational” processes have their start in the development world. Software deployment is the logical conclusion of the software development life cycle and should be viewed as such rather than as a separate operational process. Another example is metrics and monitoring: it is hard to measure anything without understanding the baselines and assumptions made in the development domain. Joint processes also mean more opportunity for development and operations interaction, understanding and joint accountability. Finally, joint process development means single repositories for documentation and other opportunities for economies of scale.

Assume failure is the norm. Many shops do process engineering, ranging from hand-written lists to ISO 9001. Those processes generally have one key flaw: they focus on the outcome and its inevitable success. A simple process might provision a host:

  • Step 1: Install
  • Step 2: Cable machine
  • Step 3: Install OS, etc, etc.

Assuming all goes to process then at the end of Step x you will have a fully provisioned host. But what happens if it doesn’t go right? If your process breaks or you receive some anomalous output how does your process deal with it? Instead think about process as a journey and map out the potential pitfalls and obstacles. Treat your processes like applications and build error handling into them. You can’t predict every application or operational pitfall or issue but you can ensure that if you hit one your process isn’t derailed.
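To make “treat your processes like applications” a bit more concrete, here is a minimal sketch in Python of a process runner that expects steps to fail. The `Step` class, the `on_failure` hook and the simulated cabling failure are hypothetical names and scenarios I’ve made up for illustration; a real provisioning process would call your own tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], None]  # raises an exception on failure
    on_failure: Optional[Callable[[], None]] = None  # hypothetical recovery hook


def run_process(steps):
    """Run steps in order; on the first failure, recover and report where it broke."""
    completed = []
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            if step.on_failure is not None:
                step.on_failure()
            return {"ok": False, "failed_step": step.name,
                    "error": str(exc), "completed": completed}
        completed.append(step.name)
    return {"ok": True, "failed_step": None, "error": None, "completed": completed}


def cable_machine():
    # Simulated failure, purely for the example.
    raise RuntimeError("no free switch port")


follow_ups = []
result = run_process([
    Step("Install", lambda: None),
    Step("Cable machine", cable_machine,
         on_failure=lambda: follow_ups.append("raise ticket with network team")),
    Step("Install OS", lambda: None),
])
print(result)
print(follow_ups)
```

The point is not the code itself but the shape: every step has a defined answer to “what if this goes wrong?”, and the process tells you exactly how far it got instead of silently assuming success.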

Continuous Improvement

Don’t stop innovating and learning. Technology moves fast. So do customer requirements. Build continuous improvement and integration into your tools and processes. This is a good place for operations people to learn from (good) developers about practises like test-driven development. A good example here is to build tests for your software deployment process and infrastructure. They are often an application in their own right and should be developed and maintained correctly. Your monitoring could also be extended with behavioural testing to deliver better business value. Look at using development domain tools, like Jenkins-CI for example, to explore and measure the operational domain.
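A test for a deployment doesn’t have to be elaborate. The sketch below, in Python, compares what a deployment was supposed to produce with what is actually observed on the host. The function name `verify_deployment`, the keys and the values are all invented for the example; how you gather the observed state is up to your own tooling.

```python
def verify_deployment(expected, observed):
    """Return a list of differences between the expected and observed state."""
    problems = []
    for key, want in expected.items():
        if key not in observed:
            problems.append(f"{key}: missing")
        elif observed[key] != want:
            problems.append(f"{key}: expected {want!r}, found {observed[key]!r}")
    return problems


# Hypothetical example data: what the release should look like vs. what we found.
expected = {"app_version": "1.4.2", "config_mode": "0640", "service_running": True}
observed = {"app_version": "1.4.1", "config_mode": "0640"}

problems = verify_deployment(expected, observed)
for problem in problems:
    print(problem)
```

An empty list means the deployment matches expectations; anything else is a failed test that a CI tool could flag before the customer notices.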

Learn from mistakes and from outages. Seek root cause aggressively AND cross-domain. If you have an outage and a post-incident review then bring development and operational teams together to review the incident. Sometimes some simple code refactoring can save making infrastructure changes. Work together to fix root causes: treat them with the same process you developed to conduct project to production software deployment, rather than relegating them to incident review reports or batting issues between teams.

Me

Finally, for me DevOps is about people and the nature of the environment you want to work in. The best thing about the movement for me is that it is trying to foster behaviours and environments where people work together towards joint goals rather than at cross-purposes or at odds. That’s a world I’d much rather use my skills in.