SUMMER 1994, Evanston, IL
In one of those Dickensian “It was the best of times, it was the worst of times, yada, yada” moments in my life, I found myself sharing a crappy three-bedroom, one-bathroom apartment right next to Chicago’s L train in Evanston, IL with three other guys. I slept on a shitty, uncomfortable futon, and all of my worldly possessions were crammed into plastic cubes purchased at Sam’s Club. As there were four of us and only three bedrooms, I shared one of the rooms with “Jon”.
Sadly, most of us could barely manage the reality of paying bills. Thankfully, Jon performed this task for us, but to this day I can remember the itemized receipts he would give us. They would look like:
Electric Bill ($100 / 4) = $25
Gas Bill ($50 / 4) = $12.50
Stamps (2 x $.29 / 4) = $0.15
He would literally add on the fifteen cents. I get that we were all broke, but Jon had this little mantra that he liked to yell at us, “Hey! (pause) It all adds up!”
That summer, while my other roommates were gone, I got my will to live beaten out of me by Jon. If I were sadly sitting on the floor watching TV (we had no furniture until one of the other roommates moved in) and got up to use the bathroom, Jon would yell from the kitchen, “Turn the lights off!”
“But... I’m just going to use the bathroom and come right back.”
“Hey! (pause) It all adds up!”
If I went from the bedroom to the kitchen and right back to the bedroom without turning the bedroom light off, I would get yelled at. Any protest was immediately greeted with, “It all adds up!”
I must have heard this a thousand times over the course of the year, and I made a solemn vow that I would never be this ridiculous. I would rather pay an extra nickel on my monthly electric bill than be nagged for leaving a light on in a room I was going to immediately return to. I never considered myself wasteful, but there is a point where conservation becomes obsession and the law of diminishing returns kicks in. Jon had definitely crossed that line. And yet… Over twenty years later, I was reminded of Jon’s mantra as my attention turned towards cloud computing.
WHAT IS CLOUD COMPUTING?
Cloud computing relies on leasing services that run on a provider’s hardware. Back in the olden days, if I wanted to build a website, I would go out and purchase a server and a machine to use as my database. I would set up my server in a rack, apply patches, install software, etc. Same with the database. I would have to get a purchase order, wait for the items to be delivered, physically set them up, and do a bunch of other things before I could even get started. There would be non-trivial upfront costs as well.
Today, I could go to any of a host of “cloud providers” and have a server and/or database provisioned for me in minutes. Billing is typically done by compute hours, and for a small server it can be literally pennies per hour. The cloud provider locks down the specifics of the operating system version and provides a whole bunch of other services: automated backups, guaranteed uptime, and all of the setup and teardown.
BUT ISN’T THAT JUST RENTING?
Well, that’s one way to look at it, but… One of the biggest issues with resources is making sure they are properly utilized. What if my web application allowed users to order food at a restaurant from their smartphones? If I wanted to pilot it and only offer it in the United States, there might be spikes in traffic around lunch and dinner time, but in the dead of night my server or, more likely, servers would be idle. Every time one of my resources sits idle, it’s wasting money. Hey, (pause) it all adds up!
The other issue that cloud computing helps solve is horizontal scaling. Previously, if you needed a lot of compute power, you got a big server. The more power you needed, the bigger the server. More CPUs. More RAM. This was called vertical scaling. Of course, the bigger the computer, the more expensive it is. When that big, expensive server sits idle, it is no longer pennies adding up; the cost can be substantial.
The new trend, instead of having one big machine do a lot of tasks, is to chunk the tasks out to a bunch of little machines and do things in parallel. This process is called horizontal scaling and lets the developer use fancy SAT words like parallelism.
With horizontal scaling and my theoretical order entry system, we might automatically launch extra servers during peak traffic times and then take those servers away after the traffic dissipates. That way, we are not paying for resources that are not being fully utilized. Using cloud computing, the scaling up and scaling down process can be done using technologies like auto scaling groups or Lambdas (more later).
By contrast, if I had all my own machines and managed them myself, I would be stuck planning around my worst case scenario. While I might be able to handle the dinner rush just fine, I would have overkill for 3AM. In fact, as the story goes, this is how Amazon Web Services was born. The folks at Amazon designed their peak capacity around the Christmas rush, only to find that they had a lot of idle resources in January. They created a whole business around selling their excess capacity, and it has turned out to be a win for Amazon and their customers. Amazon created a new revenue stream, and customers got the benefit of Amazon’s experience with scalability. Previously, it took a lot of experience and pain to know how to set up a server behind a load balancer. Now, it’s just a few clicks away.
A NEW WAY OF THINKING
So the term “cloud” is all the rage right now. For potential investors in software companies or service providers, there may very well be a good reason for this. By taking advantage of cloud computing, software companies can control their costs. They can start small, learn, and then scale out (if they design their system correctly). In the early days, there are very few upfront costs, so a company can do a good proof of concept that will still work at scale.
Sadly, I have seen, firsthand, several instances where someone who was aware of the hype but unaware of why the hype exists decided to make a “cloud offering” based on an existing product. While working at a poorly run Internet of Things company, I was tasked with creating our “cloud product”. I took the exact same bits that were being burned onto DVDs and installed on a server at a client site and copied them to a server in AWS. If the potential customer needed a big server, I provisioned a big server. If they needed a small server, then I provisioned a small server. Behind the scenes, we started calling this “cloud washing”. Taking on-premises bits and putting them on a remote server does not create a cloud product. If the resources do not scale in or out, then it is not really a cloud offering.
DOCKER, WHAT IS IT GOOD FOR? ABSOLUTELY NOTHING, SING IT AGAIN!
In a similar vein, I was recently tasked, along with a small team of two other people, with building a platform from the ground up for a fintech company.
Before I took on this Herculean task, I had a friend at the company with whom I shared a joke. We would just look at each other and say, “Docker, docker, docker, docker, docker…” We would say it fast and in a monotone. I didn’t say it was a funny joke, but it did capture the insane way that everyone at the company fell in love with Docker.
So, Docker is a software product that allows someone to create a container image with very specific, fine-grained dependencies. (Containers are often described as lightweight Virtual Machines, but they share the host’s kernel rather than emulating hardware.) For example, one could create an image based on Ubuntu 16.04, with MySQL 5.7 and Python 2.7.12. Anyone who ran a container from this image would be guaranteed to have those packages at the specified versions. Instead of going through a whole process of installing software at the specified versions, it is possible to build a Docker image once and store it in a Docker registry. Then, with a simple docker pull, the user has everything that they need. In all fairness, it makes development and testing way easier because all the dependencies are completely spelled out down to the smallest detail.
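As a sketch, the kind of pinned-down image described above comes from a Dockerfile along these lines (the version pins are the ones from the example above; treat the exact apt package names as an assumption):

```dockerfile
# Hypothetical Dockerfile pinning the dependencies mentioned above
FROM ubuntu:16.04

# Pin MySQL and Python so every container gets identical dependencies
# (on Ubuntu 16.04, python2.7 resolves to 2.7.12)
RUN apt-get update && \
    apt-get install -y \
        mysql-server-5.7 \
        python2.7 && \
    rm -rf /var/lib/apt/lists/*
```

Build it once, push it to a registry, and a single docker pull gives every developer and every test box the same environment.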
WHEN YOU SAY YOU’RE AGILE, BUT YOU REALLY DO WATERFALL
So the company was a big proponent of Agile methodology, in theory. While it took me a while to come to grips with the new world order of agile software development, I have come to terms with it. As Winston Churchill said of democracy, it is “the worst form of government, except for all the others.” Sure, it can be cargo cultish, but producing small chunks of working software beats talking about building software any day of the week.
Back when everyone was doing waterfall, most companies would push their own methodologies. Much like the secret sauce at any fast food restaurant is just mayonnaise and ketchup, the waterfall methodology always came down to the same five phases: Requirements, Design, Implementation, Verification, and Maintenance.
At the company claiming to do Agile, but really doing waterfall, there were five phases too. They broke down as: Navel Gazing, Wishy Washiness, Panic, Failure, and Excuse Making.
The company had an interesting philosophy, especially for a supposedly agile shop. In theory, one of the benefits of being agile is that agile teams feel fully invested in the technical decisions that are made. Every team member, theoretically, gets to put their input into the project. The platform, theoretically, is composed of lots of small decisions made democratically by the team. Theoretically, all team members have a sense of pride and ownership because the software they ship is made of their decisions and hard work.
Except… The company decided that the people who actually write software probably don’t know anything about writing software. There was a crack team of architects assigned to second-guess and micromanage the agile teams, ensuring that they were slow and inefficient, and that random technology nobody had ever used (and that didn’t make any sense) was thrown into each project. With this in mind, we had chewed up several months of Navel Gazing and Wishy Washiness with absolutely nothing to show for it.
The team and I had done numerous Proofs of Concept (POCs) and had plenty of design sessions, but… The crack team of architects always managed to shoot down anything proposed. Defeated, I allowed myself to be talked into a solution that as our architect said, “Had all of the problems already solved!”
Before I knew it, I had cloned the repository of another team and was told to go forth and prosper. I was sold that this “solution” was amazing and would do everything we needed. I spent a good amount of time figuring out what, exactly, it did as there wasn’t a single build step or instruction to be found in the repo. A few days later, I had the existing solution reverse engineered. It was a console application written in Java that would sit idle most of the time. Every five minutes, it would reach out to a queue system in Amazon to see if it had anything to do. If there was nothing to do, it would sit idle for another five minutes. If there was something, it would grab a file that was part of the queue’s message, run the Apache Pig process against the file, and output the transformed file to a dedicated S3 bucket. Then it would go back to sitting idle.
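Reverse-engineered, the worker boiled down to something like the following (a hypothetical Python reconstruction; the original was Java, and the script name and message format are made up):

```python
import json
import subprocess
import time


def handle_message(message, run_pig=subprocess.run):
    """Run the Pig transform described in a queue message.

    Returns the output key, or None when the queue had nothing to do.
    `run_pig` is injectable so the shell-out can be faked in tests.
    """
    if message is None:
        return None
    body = json.loads(message["Body"])
    input_key = body["file"]
    output_key = input_key + ".transformed"
    # The heavy lifting: shell out to the Pig wrapper script; the real
    # worker then copied the result to its dedicated S3 bucket
    run_pig(["./run_pig.sh", input_key, output_key], check=True)
    return output_key


def poll_forever(receive_message, poll_seconds=300):
    """Wake up every five minutes, ask the queue for work, then idle.

    Every second spent in this sleep is a second the server still bills for.
    """
    while True:
        handle_message(receive_message())
        time.sleep(poll_seconds)
```

The `poll_forever` loop is the whole problem in miniature: the process exists around the clock so that it can do a few minutes of work a day.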
What was so amazing about this solution that had the crack team of architects lauding it? It ran in Docker of course! All the excitement was over a simple Java console application that spent most of its life doing absolutely nothing except racking up fees on AWS...
LIGHTBULB
I tried my hardest to understand what part of the solution to save and what to throw out. In reality, all it did was poll a queue and run Pig. Most of the heavy lifting was not even done in the Java application; Java invoked a shell script which did the actual work. What was worse, the solution mostly sat idle, and while it was sitting idle, charges accumulated. If a whole bunch of files came in at once which needed to be transformed, then this one little worker would finally have something to do, but it was going to take a decent amount of time to process all the files. In short, it didn’t seem very cloudy to me. In my humble opinion, a better solution would have been to create a resource only when it was needed, to spin up as many resources simultaneously as necessary, and then, more importantly, to turn everything off when it was no longer needed, because hey (pause) it all adds up.
In the middle of the Panic phase on a random Saturday, my wife took the kids out and I found myself with some alone time. Naturally, I decided to use these few unfilled hours on an experiment to make the solution we were coerced into suck just a little bit less…
In the spirit of cloudiness, scalability, and low cost, Amazon launched its Lambda service a few years ago. Using Lambdas is a bit different from the traditional use of servers to do work. Typically, a Lambda is event driven, meaning that something, somewhere happened that kicked off the process. Lambdas are also stateless: they have no persistent memory or knowledge of what happened before they were kicked off. They are also short lived; as of this writing, they can only run for five minutes or less. While their short lived nature is a bit of a deterrent, they are billed in 100 millisecond increments. Additionally, they automatically scale themselves out. If ten events happen simultaneously, then ten Lambdas are created to handle the ten events. When the Lambdas turn themselves off, there is absolutely no charge for the time that they sit idle. This seemed like a much cloudier solution than having a single dedicated resource that handled events one at a time and kept on billing while it sat idle 97% of the time.
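A minimal sketch of the model, assuming an S3-style notification event (the handler and event shape here are a hypothetical illustration, not the actual platform code):

```python
def lambda_handler(event, context):
    """Hypothetical Lambda entry point: stateless and event driven.

    Nothing runs (or bills) until an event arrives; ten simultaneous
    events mean ten concurrent invocations of this same function.
    """
    # Pull the object keys out of an S3-style notification event
    keys = [record["s3"]["object"]["key"]
            for record in event.get("Records", [])]
    # ... transform each file here ...
    return {"processed": len(keys)}
```

When the handler returns, the invocation is over, and so is the bill.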
With that in mind and a few hours to spare, I had a prototype that performed the same functionality as the Docker solution, but scaled out and charged by the 100 milliseconds. The real work, again, was done using a bunch of bash shell scripts, and the Lambda itself had roughly one hundred lines of code. It opened up a lot of unknowns, such as how we would manage and deploy the Lambdas, along with the fear of using a technology that was not proposed by the crack team of architects. But the promise of parallel processing and cost savings was eventually too great to pass on, even though we were midway through the Panic phase.
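The prototype’s division of labor can be sketched like so (hypothetical code; the script name is made up, and the real handler also moved files in and out of S3):

```python
import subprocess


def lambda_handler(event, context, run=subprocess.run):
    """Hypothetical sketch of the prototype: the Lambda is thin glue,
    and the same bash scripts that did the heavy lifting in the Docker
    worker do it here too.  `run` is injectable for testing."""
    transformed = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        # Delegate the actual transform to the shell script
        run(["./transform.sh", key], check=True)
        transformed.append(key)
    return {"transformed": transformed}
```

One file arriving means one invocation; a thousand files arriving at once means up to a thousand invocations running in parallel, with no code changes.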
Sure enough, a scant two months later, the team and I delivered the platform completely based on Lambdas. We skipped the Failure and Excuse Making phases altogether, replacing them with Deploy and Iterate phases. While our non-technical managers worried themselves sick about our new and unproven way of doing things, a lot of analysis was done after delivery. It turns out the cost of running our Lambdas for a few months came out to roughly $6. My miserable summer of 1994, with Jon repeatedly yelling at me about costs adding up, had finally paid dividends a mere twenty-three years later, as I got the chance to design a truly cloud based system that automatically scaled horizontally and was incredibly low cost. Hey (pause), it all adds up!