Spot instances for the win!

Cloud computing is supposed to be cheap, right?

No longer do we need to fork out £5-10k for some silicon and tin, and pay for space and the power, the cables and the install, etc, etc. Building in the cloud meant we could go and provision a host and leave it running for a few hours and remove it when we were done. No PO/finance hoops to jump through, no approvals needed, just provision the host and do your work.

So, in some ways this was true, there was little or no upfront cost and it’s easier to beg forgiveness than permission, right? But the fact is we’ve moved on from the times when AWS was a demo environment, or a test site, or something that the devs were just toying with. Now it’s common for AWS (or Azure, or GCE) to be your only compute environment and the fact is the bills are much bigger now. AWS has become our biggest platform cost and so we’re always looking for ways to reduce our cost commitments there.

At the same time that AWS and cloud have become mainstream for many of us, so too have microservices, and while their development and testing benefits of microservices are well recognised, the little-recognised truth is that they also cost more to run. Why? Because as much as I may be doing the same amount of ‘computing’ as I was in the monolith (though I suspect we were actually doing less there) each microservice now wants its own pool of memory. The PHP app that we ran happily on a single 2GB server with 1CPU has now been split out into 40 different components, each with its own baseline memory consumption of 100MB, so I’ve already doubled my cost base just by using a more ‘efficient’ architecture.

Of course, AWS offers many ways of reducing your compute costs with them. There are many flavours of machine available, each with memory and CPU offerings tuned to your requirements. You can get 50%+ savings on the cost of compute power by committing to paying for the system for 3 years (you want the flexible benefits of cloud computing, right?). Beware the no-upfront reservations though – you’ll lose most of the benefits of elastic computing, with very little cost-saving benefits.

You could of course use an alternative provider, Google bends over backward to prove they have a better, cheaper, IaaS, but the truth is we’re currently too in-bed and busy to move provider (we’ve only just finished migrating away from Rackspace, so we’re in no hurry to start again!)

So, how can we win this game? Spot Instances. OK, so they may get turned off at any moment, but the fact is for the majority of common machine types you will pay 20% of the on-demand price for a spot instance. Looking at the historical pricing of spot instances also gives you a pretty good idea how likely it is that a spot instance will be abruptly terminated. The fact is, if you bid at the on-demand price for a machine – i.e. what you were GOING to pay, but put it on a spot instance instead, you’ll end up paying ~20% of what you were going to and your machine will almost certainly still be there in 3 months time. As long as your bid price remains above spot price, your machine will stay on and you will pay the spot price, not your bid!

AWS Spot Price History

What if this isn’t certain enough for you? If you really want to take advantage of spot instances, build your system to accommodate failure and then hedge your bids across multiple compute pools of different instance types. You can also reserve a baseline of machines, which you calculate to be the bare minimum needed to run your apps, and then use spots to supplement that baseline pool in order to give your systems more burst capacity.

How about moving your build pipeline on to spot instances or that load test environment?

Sure, you can’t bet your house on them, but given the right risk approach to them you can certainly save a ton of money of your compute costs.

How to turn good ideas into great ones

To have a great idea, you first need to have several hundred crazy ideas. This is as true of organisations as it is of people. A culture where people feel free to speak up in meetings and throw crazy ideas about is a really useful thing to develop, as long as we can somehow sort the good from the bad, the great from the good, and the exceptional from the great.

Does your company have a culture of exploring, analysing and weighing new ideas? Typically, wild ideas are treated in one of two ways:

“Mhmm. Sounds good Jeff” (while thinking, that’s a terrible idea, hopefully this guy will stop talking about this soon).

“Yes, that sounds great” [doesn’t really understand]. The book The Mom Test describes how people will often heap superficial praise on the idea, without really understanding it well, because they like you as a person.

Instead of immediately forming an opinion on whether an idea is good or bad, a better approach is to try to put our initial gut-reactions on hold and spend 2 or 3 minutes exploring the concept in a more rigorous way.

A useful way of doing this is to use the ‘dialectic method’, otherwise known as Thesis, Antithesis, Synthesis.

In this paradigm you listen to what is being proposed and take the opposite stance playing ‘devil’s advocate’, to try and find arguments against whatever has been proposed. This criticism creates a tension which you and the proposer then try to resolve. This reconciliation between the two viewpoints usually ends up in something brand new called the ‘synthesis’.

For example:

Thesis: People can only get financial advice from a qualified financial planner, and this is expensive

Antithesis: People don’t need access to a qualified financial planner to get financial advice.

Synthesis: Can we create some software that gives the advice that a qualified financial planner would give, but at a lower price?

Thesis: We should get a ball pit for the office so that people can take naps during the working day

Antithesis: We can’t get a ball pit, the balls will get everywhere and will make the office look terrible

Synthesis: Let’s get a futon!

Thesis: Oil is becoming more and more scarce, we need to develop more efficient engines

Antithesis: Oil is becoming more scarce, but I wonder if developing more efficient engines is the only solution?

Synthesis: Build electric cars

Thesis: No one will buy electric cars, they look like a shoe and perform like a cow

Antithesis: People need to starting converting to electric cars

Synthesis: Build a sexy high-end electric supercar to show everyone the possibilities and use the proceeds to fund a mass-production car


 

How can we practice this as a company? First, someone needs to be the person in the team who starts flinging ideas about like Billy-o. Could this be you?

Then, here are some ideas about how we can use the dialectic method to explore and filter those ideas:

As a giver
If someone presents an idea, immediately take the contrary view. Do it with a cheeky smile, so everyone knows what the game is. Challenge the idea instead of the proposer. Treat it as an intellectual game.

As a receiver
If you have an idea, seek out people who are likely to think it’s a bad idea. Do you know any grumpusses? Go and find them, and pour your heart out. Keep your cool as they stomp your beautiful idea under their vile jackboots and write down all of their pearls of wisdom.  See if you can satisfy all of their beefs. If you can, you have a winner. If you can’t, then maybe we’ve all learned something today.

As a team
Run brainstorming sessions where you produce ideas on post-its. Then as a team rank them by the most controversial. Then explore the top two or three in depth in this paradigm.

 

TLDR; let’s take the lead in showing how we can challenge ideas with a smile, and use this as a way to create new even better ideas!

Making asynchronous code look synchronous in JavaScript

Why go asynchronous

Asynchronous programming is a great paradigm which offers a key benefit over its synchronous counterpart – non blocking I/O within a single threaded environment. This is achieved by allowing I/O operations such as network requests and reading files from disk to run outside of the normal flow of the program. By doing so this enables responsive user interfaces and highly performant code.

The challenges faced

To people coming from a synchronous language like PHP, the concept of asynchronous programming can seem both foreign and confusing at first, which is understandable. One moment you were programming one line at a time in a nice sequential fashion, the next thing you know you’re skipping entire chunks of code, only to jump back up to those chunks at some time later. Goto anyone? Ok, it’s not *that* bad.
Then, you have the small matter of callback hell, a name given to the mess you can find yourself in when you have asynchronous callbacks nested within asynchronous callbacks several times deep – before you know it all hell has broken loose.
Promises came along to do away with callback hell, but for all the good they did, they still did not address the issue of code not being readable in a nice sequential fashion.

Generators in ES6

With the advent of ES6, along came a seemingly unrelated paradigm – generators. Generators are a powerful construct, allowing a function to “yield” control along with an (optional) value back to the calling code, which can in turn resume the generator function, passing an (optional) value back in. This process can be repeated indefinitely.

Consider the following function, which is a generator function (note the special syntax), and look at how its called:

function *someGenerator() {
  console.log(5); // 5
  const someVal = yield 7.5;
  console.log(someVal); // 10
  const result = yield someVal * 2;
  console.log(result); // 30
}

const it = someGenerator();
const firstResult = it.next();
console.log(firstResult.value); // 7.5
const secondResult = it.next(10);
console.log(secondResult.value); // 20
result.next(30);

Can you see what’s going on? The first thing to note is that when a generator is called, an iterator is returned. An iterator is an object that knows how to access items from a collection, one item at a time, keeping track of where it is in the collection. From there, we call next on the iterator, passing control over to the generator, and running code up until the first yield statement. At this point, the yielded value is passed to the calling code, along with control. We then call next, passing in a value and with it we pass control back to the generator function. This value is assigned to the variable someVal within the generator. This process of passing values in and out of the generator continues, with console log’s providing a clearer picture of what’s going on.

One thing to note is the de-structuring of value from the result of each call to next on the iterator. This is because the iterator returns an object, containing two key value pairs, done, and value. done represents whether the iterator is complete. value contains the result of the yield statement.

Using generators with promises

This mechanism of passing control out of the generator, then at some time later resuming control should sound familiar – that’s because this is not so different from the way promises work. We call some code, then at some time later we resume control within a thenable block, with the promise result passed in.

It therefore only seems reasonable that we should be able to combine these two paradigms in some way, to provide a promise mechanism that reads synchronously, and we can!

Implementing a full library to do this is beyond the scope of this article, however the basic concepts are:

  • Write a library function that takes one argument (a generator function)
  • Within the provided generator function, each time a promise is encountered, it should be yielded (to the library function)
  • The library function manages the promise fulfillment, and depending on whether it was resolved or rejected passes control and the result back into the generator function using either next or throw
  • Yielded promises should be wrapped in a try catch
For a full working example, check out a bare bones library I wrote earlier in the year called awaiting-async, complete with unit tests providing example scenarios.

How this looks

Using a library such as this (there are plenty of them out there), we can take the following code from this:

const somePromise = Promise.resolve('some value');

somePromise
  .then(res => {
    console.log(res); // some value
  })
  .catch(err => {
    // (Error handling code would go in here)
  });
To this:
const aa = require('awaiting-async');

aa(function *() {
  const somePromise = Promise.resolve('some value');
  try {
    const result = yield somePromise;
    console.log(result); // some value
  } catch (err) {
    // (Error handling code would go in here)
  }
});

And with it, we’ve made asynchronous code look synchronous in JavaScript!

tl;dr

Generator functions can be used in ES6 to make asynchronous code look synchronous.

What went wrong? Reverse-engineering disaster

Last week, we nearly pushed a bad configuration into production, which would have broken some things and made some code changes live that were not ready. Nearly, but not quite: while we were relieved that we’d caught it in time, it was still demoralising to find out how close we had come to trouble, and a few brave souls had to work into the evening to roll back the change and make it right.

Rather than shouting and pointing fingers, the team came together, cracked open the Post-Its and Sharpies and set to engineering. The problem to be solved: what one thing could we change to make this problem less likely, or less damaging?

What happened?

The first step was for the team to build a cohesive view of what happened. We did that by using Post-Its on the wall to construct a timeline: everybody knew what they individually had done and had seen, and now we could put all of that together to describe the sequence of events in context. Importantly, we described the events that occurred not the people or feelings: “the tests passed in staging” not “QA told me there wouldn’t be a problem”.

Yes, the tests passed, but was that before or after code changes were accepted? Did the database migration start after the tests had passed? What happened between a problem being introduced, and being discovered?

Why was that bad?

Now that we know the timeline, we can start to look for correlation and insight. So the tests passed in staging, is that because the system was OK in staging, because the tests missed a case, because the wrong version of the system ran in testing, or because of a false negative in the test run? Is it expected that this code change would have been incorporated into that migration?

The timeline showed us how events met our expectations (“we waited for a green test run before starting the deployment”) or didn’t (“the tests passed despite this component being broken”, “these two components were at incompatible versions”). Where expectations were not met, we had a problem, and used the Five Whys to ask what the most…problemiest…problem was that led to the observed effect.

What do we need to solve?

We came out of this process with nine different things that contributed to our deployment issue. Nine problems are a lot to think about, so which is the most important or urgent to solve? Which one problem, if left unaddressed, is most likely to go wrong again or will do most damage if it does?

More sticky things were deployed as we dot-voted on the issues we’d raised. Each member of the team was given three stickers to distribute to the one-three issues that seemed highest priority to solve: if one’s a stand-out catastrophe, you can put all three dots on that issue.

This focused us a great deal. After the dots were counted, one problem (gaps in our understanding of what changes went into the deployment) stood out above the rest. A couple of other problems had received a few votes, but weren’t as (un)popular: the remaining six issues had zero or one dot each.

I got one less problem without ya

Having identified the one issue we wanted to address, the remaining question was how? What shall we do about it? The team opted to create a light-weight release checklist that could be used in deployment to help build the consistent view we need of what is about to be deployed. We found that we already have the information we need, so bringing it all into one place when we push a change will not slow us down much while increasing our confidence that the deployment will go smoothly.

A++++ omnishambles; would calamity again

The team agreed that going through this process was a useful activity. It uncovered some process problems, and helped us to choose the important one to solve next. More importantly, it led us to focus on what we as a team did to get to that point and what we could do to get out of it, not on what any one person “did wrong” and on finding someone to blame.

Everyone agreed that we should be doing more of these root cause analyses. Which I suppose, weirdly, means that everybody’s looking forward to the next big problem.

Taming the Beast

Security is a journey, we’ve all heard it said but how many of us believe it and who knows where they’re trying to go?  I think we do, and that’s ‘Our next audit’ – we want to breeze through each audit like passing street lamps on a motorway.

At Wealth Wizards we deal with personal data. We provide financial guidance to customers. To do this we need customers’ personal data. their valuable personal data. Not just address and email (which are actually considered freely-available, or business card data) but information on savings, investments, tax details, health conditions, etc. We don’t store credit card or bank details but we do have all the really personal stuff. What this means is that over time we will build up a large dataset of things that con men, attackers, villains and others want to get hold of.  We know this is valuable, as do our customers and we want our customers to trust us. We want to instill confidence that when a customer tells us something, it’s private and remains so. This means we know security is important, and without it, it doesn’t matter how much effort we put into building up our business, because if the data is stolen and exposed then it could be the downfall of our business.

One of the best ways for us to show prospective clients and customers that we’re serious about security is to show our credentials and accreditation. Evidence that we have a rigorous process that stands up to rigorous audits. ISO 27001 is designed to do just this. Which is why we are working towards achieving ISO 27001 this year. However, anyone who knows ISO 27001 will know it’s a beast and not for the faint of heart and so the trick is learning how we can use ISO to advantage instead of against us.  We can use ISO as a framework to build up the policies and process we use as a business.  Instead of trying to fight it, we’re going to make it help us.

We don’t just want to add security on to what we’ve done, we want to build security into what we do.  We deal with a lot of big companies. when we’re selling our products and things are getting close to signing contracts, those big companies (we’re talking tens of thousands of employees) start asking us about our processes, about our data security and more importantly, in their eyes, their data security. In other words, they start auditing us. No one likes an audit, but if you can show an auditor that you do care about things and that you do have processes then they tend to avoid asking the really difficult questions. Even then, when you really do care about things, it doesn’t matter if they do ask the difficult questions because you have an answer for them.

I’m currently going through the ‘Technical Measures’ questions with our team here and it feels endless; How can I prove what we did X, how can I prove why we did Y, how can I show what something looked like on this date compared to that date. Those are difficult questions to answer at the best of times but more so when you’re running in an elastic environment where a server instance may only exist for a day or two. What’s becoming apparent though is that ISO is asking questions that I actually want to be answered myself, regardless of what certification we go for.  I, as a sys admin, want to have a record of what happened, when and why. I also want to know that something happened because we made it happen. If I know this then I can start to answer questions about why something doesn’t work at 3 am.  So already I’m starting to find that while ISO is a beast, it can actually be tamed to be a friendly beast. On our path to ISO, we will build the framework that will define the tasks we need to do to build security into our platform, that will show the auditors what they want to see, as well as the meat behind it to prove it’s not just paperwork.

By doing a true risk assessment of our business and technical environment, we start to build an accurate picture of our weaknesses, both in terms of security as well as our processes. Once we have identified these then we can start to build suitable responses. It looks overwhelming to begin with but before long it starts to become clear that the automation that we’re building to allow hands-off delivery of our applications is also the solution we need to be able to record what was deployed when and why. The automation scripts are the perfect mechanism to build these audit trails rather than having to rely on someone to manually ensure these actions are identified!

How do we ensure there is a separation of concerns? That no one is putting back-door code into production? Why peer review of the code (both application and infrastructure) allows us to enforce this programmatically! Suddenly ISO has become my friend. Sure, it’s still a beast, but it’s not blocking our delivery but helping to define what processes we need and therefore it’s starting to write our automation algorithms. How cool is that!? Ok, perhaps cool is a little strong.

So while we’re still very much en-route, I’m confident we’re on the right path and that the next audit will be us proving we’re secure, not hiding the things we don’t want to be seen. Don’t be afraid of the beast called ISO, embrace it and use it to your advantage.

Be part of our engineering team

If you’re a software developer who wants to swap the urban jungle of London for the rolling hills of Warwickshire, look no further than Wealth Wizards. You’ll join a dynamic team of clever people, be absorbed in an energetic atmosphere, and benefit from an excellent work-life balance. Plus, you’ll be working in a team that’s breaking ground in the world of robo advice and artificial intelligence.

If Wealth Wizards sounds like your sort of company, take a look at our careers page for all the info.

Are you based in The Midlands already? Do you like beer? And DevOps? Come to out DevHops MeetUp.

Using Ansible with WordPress

WordPress is a great tool to use when creating websites as it provides flexibility when managing content.

As you may be aware, one of the operational downsides of managing websites run on WordPress is how frequent new releases are released as part of patching vulnerabilities. This comes with the overhead pain and cost of having to upgrade your WordPress instance every other week.

As we are living in a world that thrives of automation, here at Wealth Wizards we thought it would be a good idea to automate the upgrade process by using configuration management tools like Ansible combined with the power of AWS API’s.

As our WordPress sites are deployed in AWS, we decided to use the AWS API’s to provision instances, manage snapshots alongside configure and apply security groups. We then decided to use various Ansible modules to install packages, update configs, encrypt and decrypt files pushed and retrieved from AWS S3 as well as change permissions on files and directories as part of the upgrade process.

Switching from the traditional method of manually moving files using plugins and bash commands to an automated manor has allowed us to gain more control over our upgrades as well as reduce the time it takes from a day to a 2-hour process, with most of that being dedicated to AWS provisioning. Automating the process using Ansible has also given us the ability to upgrade multiple instances at once over the traditional method of doing one instance at a time.