I’m a fairly basic person. The concept of “cloud” never really made sense to me: it’s just a bunch of servers connected to a network, and it’s all spun up and down automagically. Sure, that abstracts away a lot of plumbing under the covers, but it’s basically the idea behind “cloud”. I won’t rail about how the idea of “cloud” has been perverted by people who like to use buzzwords, that’ll be in a different blog. Today’s topic is DevOps, and why I hate it.
It roughly follows the same lineage as cloud: a great idea, pioneered by organizations that were actually in a position to adopt it, has been productized and perverted by the IT industry in the pursuit of sales dollars. That’s pretty much why I hate it, but there are some real reasons underneath my distaste for DevOps that I’ll get to.
The whole concept behind DevOps doesn’t really have anything to do with developers and operations working together in unicorn-and-rainbows harmony. It’s all about computing resource cost models changing over time. When you had to manage the expense of computing resources closely, it made sense to do waterfall delivery. But as computing resources got both more ubiquitous and less expensive, that model stopped making sense. And thus, we have DevOps. What we really have is ubiquitous, cheap, reliable computing resources, and the need to consume them in a different way. The real change is not in allowing developers to deploy code to production (pause for gasps from those wacky funster auditors out there who are seeing nothing but dollar signs for their futures), but in forcing a discussion at the C-level about “Continuous Everything”. But there’s a different element to the DevOps discussion: the need for IT to improve the velocity and scope of what it can deliver.
Why does “Continuous Everything” matter?
The real revolution in application delivery isn’t DevOps, it’s Continuous-Everything (CE). I use that shorthand because I hate typing things out over and over again. What I mean by “CE” is the entire cycle:
- Continuous Requirements
- Continuous Development
- Continuous Testing
- Continuous Deployment
- Continuous Operations
This, as you can see, is a very different idea than “developers or sysadmins deploy code into production”. But there is a question: is there a quantitative or qualitative difference between “DevOps” and “Continuous Everything”? There is, and it comes down to thinking about the goals of delivery versus the “how” of delivery.
Going back in history
For many years you would have the differentiation of the programmer, the systems engineer, the systems administrator and the system operator. Roughly speaking, one wrote the code, one figured out how much hardware was needed to make the application usable, one built and maintained the system, and the last (and most important) ran the things that the others built. This makes sense when you have scarce resources: you need a lot of focus to make sure you’re not spending money you shouldn’t on expensive computing resources, so you can tolerate a lot of overhead in delivery.
The thing is, Moore’s Law held true, well, everywhere: CPU, memory, storage, network. The scarce resources are no more: there’s so much compute everywhere now that companies are even looking to run other people’s workloads on their spare CPU cycles, or have built entire businesses around providing on-demand resources. Even now, Intel is starting to deploy Rack Scale Architecture more widely, precisely to deliver those cheap resources at scale. But that’s only the resource side of things changing. What else happened that caused “DevOps” to occur?
What happened was the rise of the Single App. Flickr’s “1000 deploys a day” is a great model… when you have 1 App to deploy. When you have many apps to deploy in concert, it takes a lot more effort to scale to that level. Now we’re starting to get to the friction that is stopping DevOps adoption across broader enterprises.
The Legacy Enterprise Challenge
Yeah, we know, large enterprises are different, especially when they’re mired in their legacy environments. It is really difficult for large enterprises that invested in things like SOA, framework development and waterfall release schedules to get more “with it” with DevOps. Why? Because they really don’t have an application problem, or a deployment problem, or necessarily a resource problem: they have an architecture problem.
Large enterprises have been standardizing lots of things over the years, but the thing that has been hardest to standardize with any level of success has been applications. More precisely, the application technology stacks that are used to deliver business functions. Why?
- Partially it’s hubris: everyone likes to be an inventive, creative soul who is able to work inside their chosen framework to deliver the bestest thing ever - never mind that it’s in FORTRAN and only three people in the world really grok it well enough to understand (let alone debug, or heaven forbid, update) the code.
- Partially it’s friction: it’s really expensive to modernize applications - the code, the testing, the User Acceptance Testing, the disruption to your business, the realization that you really can’t modernize that app because its User Interface relies on a particular mechanism of delivery that can’t necessarily be automated - all of these things plus a myriad more.
- Partially it’s the need to improve velocity, which perversely means that you can’t take downtime to upgrade or modernize.
- Partially it’s the lack of enforced standardization on critical elements of your enterprise: namely Code Management, Configuration Management, and Environment Management.
- Partially it’s the resistance to automating everything, either out of fear of automating yourself out of a job, or because it will expose exactly how much you have been relying on luck to deploy, well, anything with the code/config/environment you have - or worse, how little it really takes.
- Partially it’s the technical debt that’s been building up inside the organization as the need to reduce expenses, driven by reduced margins on whatever it is you’re selling, leads your CFO to reach into the toolbox-of-one-tool and cut headcount - and who needs enterprise architects anyway?
Funny, though: addressing all of those elements seems to add up to a recipe for being successful. And I guess it is, but you know what? No one really talks about all these elements with regard to DevOps. It’s all way more… opaque than that. Talking about getting developers and operations closer together is nice, but if it’s not backed by most of what’s detailed above, well, you’re going to buy a lot of lunches for very little return.
Recipe for success
A director I worked for has a saying that “base hits win baseball games, but it’s hard to justify base hits when your management wants home runs all the time”. I’m all about base hits, and I want to detail some of the base hits that you can actually achieve to get further along the DevOps path.
1) Source Code Control
Mandate that a single, unified source code repository is used by at least all of your critical applications. This sounds like a “duh” mandate, but chances are your corporate risk people are already flagging the lack of one as a risk to your corporate survivability in the event of a disaster, and you just haven’t been hearing them. Putting DR to one side for a second, the concept of source code control is fundamental to anything in the Continuous realm: Continuous Build, Test, Deploy and the forgotten one, Operate. If you don’t have a source of truth for your code, then anything you do in your code base is likely to cause regressions and the reintroduction of bugs that you swore were fixed three releases ago but magically reappeared. Bugs don’t magically reappear, by the way; out of the entire spectrum of Things That Go Wrong In IT, bugs in software are 99% human induced. This is where having source code control works in your favor: you at least know the human that put the bug in the code, and you have the ability to diff and refactor to remove it, check the fixed version into the master tree, run it through the testbeds and break/fix testing environment (if you have one, or know what that is), and quickly deploy the fix into production.
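To make that last point concrete, here’s a toy illustration in pure Python (no real VCS - the commit history and the `find_introducing_commit` helper are invented for this sketch): with a single source of truth, you can walk history and pin down exactly which change, and which human, introduced a regression.

```python
# Toy model of a source-controlled history: an ordered list of commits,
# each recording who made it and the full file contents afterwards.
# (Hypothetical structures for illustration -- a real VCS stores much more.)

commits = [
    {"id": "a1f3", "author": "alice", "code": "def total(xs):\n    return sum(xs)\n"},
    {"id": "b7c2", "author": "bob",   "code": "def total(xs):\n    return sum(xs[1:])\n"},  # regression slips in
    {"id": "c9d4", "author": "carol", "code": "def total(xs):\n    return sum(xs[1:])  # tidy\n"},
]

def find_introducing_commit(history, is_buggy):
    """Return the first commit whose code fails the check -- i.e. the
    change (and author) that introduced the bug."""
    for commit in history:
        ns = {}
        exec(commit["code"], ns)          # load that revision's code
        if is_buggy(ns["total"]):
            return commit
    return None

# The bug report: total([1, 2, 3]) should be 6.
culprit = find_introducing_commit(commits, lambda f: f([1, 2, 3]) != 6)
print(culprit["id"], culprit["author"])   # -> b7c2 bob
```

Without the unified history, all you know is that the numbers are wrong; with it, the fix is a diff away.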
2) Narrow your standards
“Standards” is usually interpreted to mean an excuse to throw a temper tantrum because you didn’t get that toy you wanted and you want it NAOW. To avert this, I am a big proponent of providing a catalog of standards that people can select from, and making it very difficult to get outside those standards. Changing the back-and-forth from “I must use [shiny thing] or my timeline is blown” into “But here’s why I don’t waaaaaant to use any of the staaaaaandards you have provided me” makes the job of “management” easier: I’m sorry, Mr Developer, you get to justify to your CxO why you are a speshul snowflake who can’t use the standards, OR you get to justify to your CxO why the standards need to expand to include your technology stack because it’s Just Better Than What We Do Now. Either way, this argument no longer happens at the delivery step; it should be happening way up at the front of the sprint. On the Business of IT side, narrowing standards means that you can put dollar costs to the application stacks you deploy, and dollar costs to the exceptions that are requested: and that means easier decision making.
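A minimal sketch of what “making it very difficult to get outside the standards” can look like in practice - the catalog entries, costs, and the `request_stack` helper here are all invented for illustration, not a real pricing model:

```python
# Hypothetical standards catalog: the stacks the enterprise supports,
# each with a known delivery cost. Anything else needs CxO sign-off.
CATALOG = {
    "java-tomcat":  {"cost_per_release": 5_000},
    "python-flask": {"cost_per_release": 4_000},
    "dotnet-iis":   {"cost_per_release": 6_000},
}

class ExceptionRequired(Exception):
    """Raised when a team asks for a non-standard stack: the argument
    now happens up front, with the CxO, not at the delivery step."""

def request_stack(name):
    if name in CATALOG:
        return CATALOG[name]          # approved path: cost is known
    raise ExceptionRequired(
        f"{name} is not in the catalog; justify the exception to your "
        "CxO or pick a standard."
    )

print(request_stack("python-flask")["cost_per_release"])  # -> 4000
try:
    request_stack("cobol-cics")
except ExceptionRequired as err:
    print(err)
```

The point of the sketch is the shape of the gate, not the numbers: every approved stack has a dollar cost attached, and every exception generates an explicit, escalatable event instead of a quiet delivery-day surprise.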
3) Everything must be deployable
Doug and Dinsdale Piranha would be proud of this one: “nice application release you have there, be a shame if you caught fire and no one else knew how to deploy it”. Forcing deployability - code/config/environment and hands-free deployment modes - insulates your organization from the Single Points of Failure/Success you have bred. It also means, with enough discipline, that you can deploy into any environment: system/integration/UAT/break-fix/training/production. Once you cross the Rubicon of splitting apart the code from the environment, a whole raft of efficiencies open up, not to mention that this allows test automation to actually, you know, work.
Incidentally, doing this also forces your service catalog to be more robust: if an app can’t be deployed without the load balancer being able to be updated at the same time, then whoever owns the load balancer is now the critical path element owner, not the app team sitting at the sidelines bleating that they can’t deploy their app. This is the principle of ownership writ large: equipping each team to own their capability and to own exposing consumption of that capability means that they have to be on the hook to deliver. No more finger pointing that the form wasn’t filled out correctly/the form doesn’t handle our app/but the form isn’t filled out correctly/the form doesn’t cater for our port range/what do you mean you don’t know the TCP ports your app uses/Forget it, we’ll just escalate.
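One way to picture “splitting apart the code from the environment”: a single immutable artifact plus a thin per-environment overlay, so the same deploy routine works everywhere from integration to production. (The config layout and the `render_config` helper are assumptions for this sketch, not a prescription.)

```python
# One artifact, many environments: base settings travel with the code,
# environment-specific values are a thin overlay applied at deploy time.
BASE = {"app": "billing", "version": "2.4.1", "workers": 4, "debug": False}

OVERLAYS = {
    "integration": {"workers": 1, "debug": True},
    "uat":         {"workers": 2},
    "production":  {"workers": 16},
}

def render_config(env):
    """Merge the base config with an environment overlay. The artifact
    (code + version) never changes between environments; only the
    overlay does -- which is what makes test automation meaningful."""
    if env not in OVERLAYS:
        raise ValueError(f"unknown environment: {env}")
    return {**BASE, **OVERLAYS[env]}

for env in ("integration", "uat", "production"):
    cfg = render_config(env)
    print(env, cfg["version"], cfg["workers"])
```

Because every environment renders from the same base, what you tested in UAT is byte-for-byte what ships to production; only the overlay differs, and the overlay is small enough to review.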
4) Automate Everything
The difference between the new model of delivery and the old way is the standardization of technology and the degree of automation that is applied. Automation is one of those things that people don’t really want to talk about because it’s very personal: automation is seen as a job remover. As I’ve said a LOT before: automation is about getting smart people to work on smart things, not smart people working on dumb things. But it’s also about the Business of IT: there’s no reason to use cheap labor when there should be no labor. The question is just how you get from point A (expensive labor/smart people/low automation) to point B (automated delivery/smart people/high automation) without breaking your business.
At the end of the day, the thing that is going to prove or slay the devops model is whether it can be adopted by existing IT organizations and Not Break Your Business In The Meantime.