SUMMER 1994, Evanston, IL
In one of those Dickensian “It was the best of times, it was the worst of times, yada, yada” moments in my life, I found myself sharing a crappy three-bedroom, one-bathroom apartment right next to Chicago’s L train in Evanston, IL with three other guys. I slept on a shitty, uncomfortable futon, and all of my worldly possessions were crammed into plastic cubes purchased at Sam’s Club. As there were four of us and only three bedrooms, I shared one of the rooms with “Jon”.
Sadly, most of us could barely manage the reality of paying bills. Thankfully, Jon performed this task for us, but to this day I can remember the itemized receipts he would give us. They would look like:
Electric Bill ($100 / 4) = $25
Gas Bill ($50 / 4) = $12.50
Stamps (2 x $.29 / 4) = $0.15
He would literally add on the fifteen cents. I get that we were all broke, but Jon had this little mantra that he liked to yell at us, “Hey! (pause) It all adds up!”
That summer, while my other roommates were gone, I got my will to live beaten out of me by Jon. If I were sadly sitting on the floor watching TV (we had no furniture until one of the other roommates moved in) and got up to use the bathroom, Jon would yell from the kitchen, “Turn the lights off!”
“But... I’m just going to use the bathroom and come right back.”
“Hey! (pause) It all adds up!”
If I went from the bedroom to the kitchen and right back to the bedroom without turning the bedroom light off, I would get yelled at. Any protest was immediately greeted with, “It all adds up!”
I must have heard this a thousand times over the course of the year, and I made a solemn vow that I would never be this ridiculous. I would rather pay an extra nickel on my monthly electric bill than be nagged for leaving a light on in a room I was going to immediately return to. I never considered myself wasteful, but there is a point where conservation becomes obsession and the law of diminishing returns kicks in. Jon had definitely crossed that line. And yet… Over twenty years later, I was reminded of Jon’s mantra as my attention turned towards cloud computing.
WHAT IS CLOUD COMPUTING?
Cloud computing relies on leasing services that run on a provider’s hardware. Back in the olden days, if I wanted to build a website, I would go out and purchase a server and a machine to use as my database. I would set up my server in a rack, apply patches, install software, etc. Same with the database. I would have to get a purchase order, wait for the items to be delivered, physically set them up, and do a bunch of other things before I could even get started. There would be non-trivial upfront costs as well.
Today, I could go to any of a host of “cloud providers” and have a server and/or database provisioned for me in minutes. Billing is typically done by compute hours, and for a small server it can be literally pennies per hour. The cloud provider locks down the specifics of the operating system version and provides a whole bunch of other services: automated backups, guaranteed uptime, and all of the setup and teardown.
BUT ISN’T THAT JUST RENTING?
Well, that’s one way to look at it, but… One of the biggest issues with resources is making sure they are properly utilized. What if my web application allowed users to order food at a restaurant from their smartphones? If I wanted to pilot it and only offer it in the United States, there might be spikes in traffic around lunch and dinner time, but in the dead of night my server or, more likely, servers would be idle. Every time one of my resources sits idle, it’s wasting money. Hey, (pause) it all adds up!
The other issue that cloud computing helps solve is horizontal scaling. Previously, if you needed a lot of compute power, you got a big server. The more power you needed, the bigger the server. More CPUs. More RAM. This was called vertical scaling. Of course, the bigger the computer, the more expensive it is. When that big, expensive server sits idle, it is no longer pennies adding up; the cost can be substantial.
The new trend, instead of having one big machine do a lot of tasks, is to chunk the tasks out to a bunch of little machines and do things in parallel. This process is called horizontal scaling and lets the developer use fancy SAT words like parallelism.
With horizontal scaling and my theoretical order entry system, we might automatically launch extra servers during peak traffic times and then take those servers away after the traffic dissipates. That way, we are not paying for resources that are not being fully utilized. Using cloud computing, the scaling up and scaling down process can be done using technologies like auto scaling groups or Lambdas (more later).
By contrast, if I had all my own machines and managed them myself, I would be stuck planning around my worst case scenario. While I might be able to handle the dinner rush just fine, I would have overkill for 3AM. In fact, as the story goes, this is how Amazon Web Services was born. The folks at Amazon designed their peak capacity around the Christmas rush, only to find that they had a lot of idle resources in January. They created a whole business around selling their excess capacity, and it has turned out to be a win for Amazon and their customers. Amazon created a new revenue stream, and customers got the benefit of Amazon’s experience with scalability. Previously, it took a lot of experience and pain to know how to set up a server behind a load balancer. Now, it’s just a few clicks away.
A NEW WAY OF THINKING
So the term “cloud” is all the rage right now. For potential investors in software companies or service providers, there may very well be a good reason for this. By taking advantage of cloud computing, software companies can control their costs. They can start small, learn, and then scale out (if they design their system correctly). In the early days, there are very few upfront costs, so a company can do a good proof of concept that will still work at scale.
Sadly, I have seen, firsthand, several instances where someone who was aware of the hype but unaware of why the hype exists decided to make a “cloud offering” based on an existing product. While working at a poorly run Internet of Things company, I was tasked with creating our “cloud product”. I took the exact same bits that were being burned onto DVDs and installed on a server at a client site and copied them to a server in AWS. If the potential customer needed a big server, I provisioned a big server. If they needed a small server, then I provisioned a small server. Behind the scenes, we started calling this “cloud washing”. Taking on-premises bits and putting them on a remote server does not create a cloud product. If the resources do not scale in or out, then it is not really a cloud offering.
DOCKER, WHAT IS IT GOOD FOR? ABSOLUTELY NOTHING, SING IT AGAIN!
In a similar vein, I was recently tasked, along with a small team of two other people, with building a platform from the ground up for a fintech company.
Before I took on this Herculean task, I had a friend at the company with whom I shared a joke. We would just look at each other and say, “Docker, docker, docker, docker, docker…” We would say it fast and in a monotone. I didn’t say it was a funny joke, but it did capture the insane way that everyone at the company fell in love with Docker.
So, Docker is a software product that allows someone to create a container image with very specific, fine-grained dependencies. (Containers are often described as lightweight Virtual Machines, but they share the host’s kernel rather than emulating hardware.) For example, one could create an image based on Ubuntu 16.04, with MySQL 5.7 and Python 2.7.12. Anyone who ran a container from this image would be guaranteed to have those packages at the specified versions. Instead of going through a whole process of installing software at the specified versions, it is possible to build a Docker image once and store it in a Docker registry. Then, with a simple docker pull, the user has everything that they need. In all fairness, it makes development and testing way easier because all the dependencies are completely spelled out down to the smallest detail.
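As a sketch, the kind of pinned-down image described above comes from a Dockerfile along these lines (the version pins are the ones from the example above; treat the exact apt package names as an assumption):

```dockerfile
# Hypothetical Dockerfile pinning the dependencies mentioned above
FROM ubuntu:16.04

# Pin MySQL and Python so every container gets identical dependencies
# (on Ubuntu 16.04, python2.7 resolves to 2.7.12)
RUN apt-get update && \
    apt-get install -y \
        mysql-server-5.7 \
        python2.7 && \
    rm -rf /var/lib/apt/lists/*
```

Build it once, push it to a registry, and a single docker pull gives every developer and every test box the same environment.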
WHEN YOU SAY YOU’RE AGILE, BUT YOU REALLY DO WATERFALL
So the company was a big proponent of Agile methodology, in theory. While it took me a while to come to grips with the new world order of agile software development, I have come to terms with it. As Winston Churchill said of democracy, it is “the worst form of government, except for all the others.” Sure, it can be cargo cultish, but producing small chunks of working software beats talking about building software any day of the week.
Back when everyone was doing waterfall, most companies would push their own methodologies. Much like the secret sauce at any fast food restaurant is just mayonnaise and ketchup, the waterfall methodology always came down to the same five phases: Requirements, Design, Implementation, Verification, and Maintenance.
At the company claiming to do Agile, but really doing waterfall, there were five phases too. They broke down as: Navel Gazing, Wishy Washiness, Panic, Failure, and Excuse Making.
The company had an interesting philosophy, especially for a supposedly agile shop. In theory, one of the benefits of being agile is that agile teams feel fully invested in the technical decisions that are made. Every team member, theoretically, gets to put their input into the project. The platform, theoretically, is composed of lots of small decisions made democratically by the team. Theoretically, all team members have a sense of pride and ownership because the software they ship is made of their decisions and hard work.
Except… The company decided that the people who actually write software probably don’t know anything about writing software. There was a crack team of architects assigned to second-guess and micromanage the agile teams, ensuring that they were slow and inefficient, and that random technology nobody had ever used (and that didn’t make any sense) was thrown into each project. With this in mind, we had chewed up several months of Navel Gazing and Wishy Washiness with absolutely nothing to show for it.
The team and I had done numerous Proofs of Concept (POCs) and had plenty of design sessions, but… The crack team of architects always managed to shoot down anything proposed. Defeated, I allowed myself to be talked into a solution that as our architect said, “Had all of the problems already solved!”
Before I knew it, I had cloned the repository of another team and was told to go forth and prosper. I was sold that this “solution” was amazing and would do everything we needed. I spent a good amount of time figuring out what, exactly, it did as there wasn’t a single build step or instruction to be found in the repo. A few days later, I had the existing solution reverse engineered. It was a console application written in Java that would sit idle most of the time. Every five minutes, it would reach out to a queue system in Amazon to see if it had anything to do. If there was nothing to do, it would sit idle for another five minutes. If there was something, it would grab a file that was part of the queue’s message, run the Apache Pig process against the file, and output the transformed file to a dedicated S3 bucket. Then it would go back to sitting idle.
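Reverse-engineered, the worker boiled down to something like the following (a hypothetical Python reconstruction; the original was Java, and the script name and message format are made up):

```python
import json
import subprocess
import time


def handle_message(message, run_pig=subprocess.run):
    """Run the Pig transform described in a queue message.

    Returns the output key, or None when the queue had nothing to do.
    `run_pig` is injectable so the shell-out can be faked in tests.
    """
    if message is None:
        return None
    body = json.loads(message["Body"])
    input_key = body["file"]
    output_key = input_key + ".transformed"
    # The heavy lifting: shell out to the Pig wrapper script; the real
    # worker then copied the result to its dedicated S3 bucket
    run_pig(["./run_pig.sh", input_key, output_key], check=True)
    return output_key


def poll_forever(receive_message, poll_seconds=300):
    """Wake up every five minutes, ask the queue for work, then idle.

    Every second spent in this sleep is a second the server still bills for.
    """
    while True:
        handle_message(receive_message())
        time.sleep(poll_seconds)
```

The `poll_forever` loop is the whole problem in miniature: the process exists around the clock so that it can do a few minutes of work a day.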
What was so amazing about this solution that had the crack team of architects lauding it? It ran in Docker of course! All the excitement was over a simple Java console application that spent most of its life doing absolutely nothing except racking up fees on AWS...
LIGHTBULB
I tried my hardest to understand what part of the solution to save and what to throw out. In reality, all it did was poll a queue and run Pig. Most of the heavy lifting was not even done in the Java application; Java invoked a shell script which did the actual work. What was worse, the solution mostly sat idle, and while it was sitting idle, charges accumulated. If a whole bunch of files came in at once which needed to be transformed, then this one little worker would finally have something to do, but it was going to take a decent amount of time to process all the files. In short, it didn’t seem very cloudy to me. In my humble opinion, a better solution would have been to create a resource only when it was needed, to spin up as many resources simultaneously as necessary, and then, more importantly, to turn everything off when it was no longer needed, because hey (pause) it all adds up.
In the middle of the Panic phase on a random Saturday, my wife took the kids out and I found myself with some alone time. Naturally, I decided to use these few unfilled hours on an experiment to make the solution we were coerced into suck just a little bit less…
In the spirit of cloudiness, scalability, and low cost, Amazon launched its Lambda service a few years ago. Using Lambdas is a bit different from the traditional use of servers to do work. Typically, a Lambda is event driven, meaning that something, somewhere happened that kicked off the process. Lambdas are also stateless: they have no persistent memory or knowledge of what happened before they were kicked off. They are also short lived; as of this writing, they can only run for five minutes or less. While their short lived nature is a bit of a deterrent, they are billed in 100 millisecond increments. Additionally, they automatically scale themselves out. If ten events happen simultaneously, then ten Lambdas are created to handle the ten events. When the Lambdas turn themselves off, there is absolutely no charge for the time that they sit idle. This seemed like a much cloudier solution than having a single dedicated resource that handled events one at a time and kept on billing while it sat idle 97% of the time.
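A minimal sketch of the model, assuming an S3-style notification event (the handler and event shape here are a hypothetical illustration, not the actual platform code):

```python
def lambda_handler(event, context):
    """Hypothetical Lambda entry point: stateless and event driven.

    Nothing runs (or bills) until an event arrives; ten simultaneous
    events mean ten concurrent invocations of this same function.
    """
    # Pull the object keys out of an S3-style notification event
    keys = [record["s3"]["object"]["key"]
            for record in event.get("Records", [])]
    # ... transform each file here ...
    return {"processed": len(keys)}
```

When the handler returns, the invocation is over, and so is the bill.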
With that in mind and a few hours to spare, I had a prototype that performed the same functionality as the Docker solution, but scaled out and charged by the 100 milliseconds. The real work, again, was done using a bunch of bash shell scripts, and the Lambda itself had roughly one hundred lines of code. It opened up a lot of unknowns, such as how we would manage and deploy the Lambdas, along with the fear of using a technology that was not proposed by the crack team of architects. But the promise of parallel processing and cost savings was eventually too great to pass on, even though we were midway through the Panic phase.
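The prototype’s division of labor can be sketched like so (hypothetical code; the script name is made up, and the real handler also moved files in and out of S3):

```python
import subprocess


def lambda_handler(event, context, run=subprocess.run):
    """Hypothetical sketch of the prototype: the Lambda is thin glue,
    and the same bash scripts that did the heavy lifting in the Docker
    worker do it here too.  `run` is injectable for testing."""
    transformed = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        # Delegate the actual transform to the shell script
        run(["./transform.sh", key], check=True)
        transformed.append(key)
    return {"transformed": transformed}
```

One file arriving means one invocation; a thousand files arriving at once means up to a thousand invocations running in parallel, with no code changes.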
Sure enough, a scant two months later, the team and I delivered the platform completely based on Lambdas. We skipped the Failure and Excuse Making phases altogether, replacing them with Deploy and Iterate phases. While our non-technical managers worried themselves sick about our new and unproven way of doing things, a lot of analysis was done after delivery. It turns out the cost of running our Lambdas for a few months came out to roughly $6. My miserable summer of 1994, with Jon repeatedly yelling at me about costs adding up, had finally paid dividends a mere twenty-three years later, as I got the chance to design a truly cloud based system that automatically scaled horizontally and was incredibly low cost. Hey (pause), it all adds up!