Spot instances for the win!

Cloud computing is supposed to be cheap, right?

No longer do we need to fork out £5-10k for some silicon and tin, and pay for the space, the power, the cables, the install, etc., etc. Building in the cloud meant we could go and provision a host, leave it running for a few hours and remove it when we were done. No PO/finance hoops to jump through, no approvals needed, just provision the host and do your work.

So, in some ways this was true: there was little or no upfront cost, and it's easier to beg forgiveness than permission, right? But we've moved on from the times when AWS was a demo environment, a test site, or something the devs were just toying with. Now it's common for AWS (or Azure, or GCE) to be your only compute environment, and the bills are much bigger. AWS has become our biggest platform cost, so we're always looking for ways to reduce our cost commitments there.

At the same time that AWS and the cloud have become mainstream for many of us, so too have microservices, and while the development and testing benefits of microservices are well recognised, the little-recognised truth is that they also cost more to run. Why? Because as much as I may be doing the same amount of 'computing' as I was in the monolith (though I suspect we were actually doing less there), each microservice now wants its own pool of memory. The PHP app that we ran happily on a single 2GB server with 1 CPU has now been split out into 40 different components, each with its own baseline memory consumption of 100MB, so I've already doubled my cost base just by using a more 'efficient' architecture.

Of course, AWS offers many ways of reducing your compute costs with them. There are many flavours of machine available, each with memory and CPU offerings tuned to your requirements. You can get 50%+ savings on the cost of compute power by committing to pay for the system for 3 years (you want the flexible benefits of cloud computing, right?). Beware the no-upfront reservations though – you'll lose most of the benefits of elastic computing for very little cost saving in return.

You could of course use an alternative provider. Google bends over backwards to prove they have a better, cheaper IaaS, but the truth is we're currently too in-bed and too busy to move provider (we've only just finished migrating away from Rackspace, so we're in no hurry to start again!).

So, how can we win this game? Spot instances. OK, they may get turned off at any moment, but the fact is that for the majority of common machine types you will pay around 20% of the on-demand price for a spot instance. Looking at the historical pricing of spot instances also gives you a pretty good idea of how likely it is that a spot instance will be abruptly terminated. If you bid at the on-demand price for a machine – i.e. what you were GOING to pay anyway – but put it on a spot instance instead, you'll end up paying ~20% of what you would have, and your machine will almost certainly still be there in 3 months' time. As long as your bid price remains above the spot price, your machine stays on and you pay the spot price, not your bid!

AWS Spot Price History
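
If you want to pull the same history yourself rather than squinting at the console chart, the AWS CLI can do it. A minimal sketch (region, instance type and dates are just examples):

aws ec2 describe-spot-price-history \
  --region eu-west-1 \
  --instance-types m4.large \
  --product-descriptions "Linux/UNIX (Amazon VPC)" \
  --start-time 2017-01-01T00:00:00 \
  --end-time 2017-03-31T23:59:59 \
  --query 'SpotPriceHistory[*].[Timestamp,AvailabilityZone,SpotPrice]' \
  --output table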

What if this isn’t certain enough for you? If you really want to take advantage of spot instances, build your system to accommodate failure and then hedge your bids across multiple compute pools of different instance types. You can also reserve a baseline of machines, which you calculate to be the bare minimum needed to run your apps, and then use spots to supplement that baseline pool in order to give your systems more burst capacity.
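
To make the 'bid at the on-demand price' idea concrete, here's roughly what topping up a reserved baseline with a couple of spot instances looks like with the AWS CLI (the bid price, count and launch spec file are placeholders, not our real values):

# launch-spec.json holds the AMI, instance type, subnet, key, etc.
aws ec2 request-spot-instances \
  --spot-price "0.119" \
  --instance-count 2 \
  --type "one-time" \
  --launch-specification file://launch-spec.json

Hedging across pools is then just a case of repeating this (or using a spot fleet) across different instance types and availability zones.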

How about moving your build pipeline, or that load test environment, onto spot instances?

Sure, you can't bet your house on them, but with the right approach to risk you can certainly save a ton of money off your compute costs.

Microservices make hotfixes easier

Microservices can ease the pain of deploying hotfixes to live due to the small and bounded context of each service.

Setting the scene

For the sake of this post, imagine that your system at work is written and deployed as a monolith. Now, picture the following situation: stakeholder – “I need this fix in before X, Y, and Z”. It’s not an uncommon one.
But let’s say that X, Y, and Z are all already in the mainline branch and deployed to your systems integration environment. This presents a challenge. There are various ways you could go about approaching this – some of them messier than others.

The nitty gritty

One approach would be to individually revert the X, Y, and Z commits in Git, implement the hotfix straight onto the mainline, and deploy the latest build from there. Then, when ready (and your hotfix has been deployed to production), you would need to go back and individually revert the reverts. A second deployment would be needed to bring your systems integration environment back to where it was (now with the hotfix in there too), and life carries on. Maybe there are better ways to do this, but one way or another it's not difficult to see how much of a headache this can potentially cause.
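
For illustration, the Git bookkeeping looks roughly like this (commit hashes are placeholders):

# back the unreleased changes out of the mainline
git revert <sha-of-Z> <sha-of-Y> <sha-of-X>

# implement and commit the hotfix, then build and deploy from the mainline
git commit -m "Hotfix: ..."

# once the hotfix is live, restore X, Y and Z by reverting the reverts
git revert <sha-of-revert-X> <sha-of-revert-Y> <sha-of-revert-Z>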

Microservices to the rescue!

But then you remember that you are actually using microservices and not a monolith after all. After checking, it turns out that X, Y and Z are all changes to microservices not affected by the hotfix. Great!
Simply fix the microservice in question, and deploy this change through your environments ahead of the microservices containing X, Y, and Z, and voilà. To your stakeholders it looks like a hotfix, but to you it just feels like every other release!

Conclusion

Of course, you could still end up in a situation where a change or two needs to be backed out of one or more of your microservice mainlines in order for a hotfix to go out. However, I'm betting it will happen less often, and that it will be less of a headache than with your old monolith.


Mars Attacks!!! Ack, Ack-Ack!

Last Tuesday we saw our first (recognised) DDoS attack. At 12:09 GMT we started to see an increase in XML-RPC GET requests against our marketing site, hosted on WordPress. We don't serve XML-RPC, so we knew this was non-valid traffic for a start.

By 12:11 GMT traffic volumes were well above what the system could handle and the ELBs started to return 503 responses. By 12:20 GMT the request rate was over 250 times higher than usual. At this point, we were trying to establish what was causing the demand. We don't currently have the highest coverage of monitoring over our marketing sites, so this took us a little while. Eventually, by 12:30, using the ELB logs, we established we were seeing requests from all over the world, all making GET requests to /xmlrpc.php. We don't typically see requests from China, Serbia, Thailand and Russia, among others, so it was pretty obvious this was a straightforward DDoS attack.
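
If you end up doing the same digging, a quick and dirty way to pull the top offenders out of classic ELB access logs is something along these lines (the field positions assume the classic ELB log format, so adjust to taste):

# count requests to xmlrpc.php per client IP, busiest first
grep 'xmlrpc.php' *.log | awk '{print $3}' | cut -d: -f1 | sort | uniq -c | sort -rn | head -20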

Shortly after 12:30 GMT the request rate dropped off just as quickly as it had started, and by 12:35 GMT it was over and the site had recovered. Either the botnet got bored, it had achieved its purpose (investigation into the consequences of the attack continues with our security partner), or AWS Shield did its free, little-known job and suppressed the attack…

Whatever led to the attack, it passed as quickly as it arrived, and from initial assessment had little purpose. At least we’ve had our first taste of an attack and will be able to better tackle the next one. In the meantime, we continue to analyse logs to determine if there was any more to the attack than a simple DDoS, or if there was something more malicious intended.

Running Jenkins on Kubernetes

by Sion Williams

tl;dr

This guide will take you through the steps necessary to continuously deliver your software to end users by leveraging Amazon Web Services and Jenkins to orchestrate the software delivery pipeline. If you are not familiar with basic Kubernetes concepts, have a look at Kubernetes 101.

In order to accomplish this goal you will use the following Jenkins plugins:

  • Jenkins EC2 Plugin – starts Jenkins build slaves in AWS when builds are requested and terminates those instances when builds complete, freeing up resources for the rest of the cluster
  • Bitbucket OAuth Plugin – allows you to add your Bitbucket OAuth credentials to Jenkins

In order to deploy the application with Kubernetes you will use the following resources:

  • Deployments – replicates our application across our Kubernetes nodes and allows us to do a controlled rolling update of our software across the fleet of application instances
  • Services – load balancing and service discovery for our internal services
  • Volumes – persistent storage for containers

Credit

This article is an AWS variant of the original Google Cloud Platform article found here.

Prerequisites

  1. An Amazon Web Services Account
  2. A running Kubernetes cluster

Containers in Production

Containers are ideal for stateless applications and are meant to be ephemeral. This means no data or logs should be stored in the container otherwise they’ll be lost when the container terminates.

– Arun Gupta

The data for Jenkins is stored in the container filesystem. If the container terminates then the entire state of the application is lost. To ensure that we don’t lose our configuration each time a container restarts we need to add a Persistent Volume.

Adding a Persistent Volume

From the Jenkins documentation we know that the directory we want to persist is the Jenkins home directory, which in the container is located at /var/jenkins_home (assuming you are using the official Jenkins container). This is the directory where all our plugins are installed and where job and config information is kept.

At this point we’re faced with a chicken and egg situation; we want to mount a volume where Jenkins Home is located, but if we do that the volume will be empty. To overcome this hurdle we first need to add our volume to a sacrificial instance in AWS, install Jenkins, copy the contents of Jenkins Home to the volume, detach it, then finally add it to the container.
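
On the surrogate instance that looks something like the following (the volume and instance IDs, device name and mount point are placeholders, and it assumes /var/jenkins_home has already been populated by a Jenkins install):

# attach the EBS volume to the surrogate instance
aws ec2 attach-volume --volume-id vol-XXXXXX --instance-id i-XXXXXXXX --device /dev/xvdf

# on the instance: format, mount and copy the Jenkins home across
sudo mkfs -t ext4 /dev/xvdf
sudo mkdir -p /mnt/jenkins
sudo mount /dev/xvdf /mnt/jenkins
sudo rsync -a /var/jenkins_home/ /mnt/jenkins/
sudo umount /mnt/jenkins

# detach the volume so Kubernetes can claim it later
aws ec2 detach-volume --volume-id vol-XXXXXX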

Gotchas

Make sure that the user and group permissions in the Jenkins home are the same. Failure to do so will cause certain write operations in the container to fail. We will discuss the Security Context in more detail later in this article.

To recursively change permissions of group to equal owner, use:

$ sudo chmod -R g=u .

Now that we have our volume populated with the Jenkins data we can start writing the Kubernetes manifests. The main things of note are the name, volumeID and storage.

jenkins-pv.yml

apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins-data
spec:
  capacity:
    storage: 30Gi
  accessModes:
  - ReadWriteOnce
  awsElasticBlockStore:
    volumeID: aws://eu-west-1a/vol-XXXXXX
    fsType: ext4

With this manifest we have told Kubernetes where our volume is held. Now we need to tell Kubernetes that we want to make a claim on it. We do that with a Persistent Volume Claim.

jenkins-pvc.yml

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: jenkins-data
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi

In the file above we are telling Kubernetes that we would like to claim the full 30GB. We will associate this claim with a container in the next section.

Create a Jenkins Deployment and Service

Here you’ll create a deployment running a Jenkins container with a persistent disk attached containing the Jenkins home directory.

jenkins-deployment.yml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
  labels:
    app: jenkins
  name: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - image: jenkins:2.19.2
        imagePullPolicy: IfNotPresent
        name: jenkins
        ports:
        - containerPort: 8080
          protocol: TCP
          name: web
        - containerPort: 50000
          protocol: TCP
          name: slaves
        resources:
          limits:
            cpu: 500m
            memory: 1000Mi
          requests:
            cpu: 500m
            memory: 1000Mi
        volumeMounts:
        - mountPath: /var/jenkins_home
          name: jenkinshome
      securityContext:
        fsGroup: 1000
      volumes:
      - name: jenkinshome
        persistentVolumeClaim:
          claimName: jenkins-data

There’s a lot of information in this file. As the post is already getting long, I’m only going to pull out the most important parts.

Volume Mounts

Earlier we created a persistent volume and volume claim. We made a claim on the PersistentVolume using the PersistentVolumeClaim, and now we need to attach the claim to our container. We do this using the claim name, which hopefully you can see ties each of the manifests together. In this case jenkins-data.

Security Context

This is where I had the most problems. I found that when I used the surrogate method of getting the files onto the volume I forgot to set the correct ownership and permissions. By setting the group permissions to the same as the user, when we deploy to Kubernetes we can use the fsGroup feature. This feature lets the Jenkins user in the container have the correct permissions on the directories via the group-level permissions. We set this to 1000 as per the documentation.
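
If you hit the same problem, it's a one-off fix on the volume while it's still attached to the surrogate instance. The official Jenkins image runs as UID/GID 1000, so from inside the mounted Jenkins home something like this should do it:

# give the jenkins user (uid/gid 1000 in the official image) ownership,
# then make group permissions match the owner as described earlier
sudo chown -R 1000:1000 .
sudo chmod -R g=u .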

If all is well and good you should now be able to start each of the resources:

kubectl create -f jenkins-pv.yml -f jenkins-pvc.yml -f jenkins-deployment.yml

As long as you don't have any issues at this stage you can now expose the instance using a load balancer. In this example we are provisioning an AWS load balancer with our AWS-provided cert.

jenkins-svc.yml

apiVersion: v1
kind: Service
metadata:
  labels:
    app: jenkins
  name: jenkins
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:eu-west-1:xxxxxxxxxxxx:certificate/bac080bc-8f03-4cc0-a8b5-xxxxxxxxxxxxx"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
spec:
  ports:
  - name: securejenkinsport
    port: 443
    targetPort: 8080
  - name: slaves
    port: 50000
    protocol: TCP
    targetPort: 50000
  selector:
    app: jenkins
  type: LoadBalancer
  loadBalancerSourceRanges:
  - x.x.x.x/32

In the snippet above we also use the loadBalancerSourceRanges feature to whitelist our office. We aren’t making our CI publicly available, so this is a nice way of making it private.

I'm not going to get into the specifics of DNS etc. here, but if that's all configured you should now be able to access your Jenkins. You can get the ingress URL using the following:

kubectl get -o jsonpath="{.status.loadBalancer.ingress[0].hostname}" svc/jenkins

EC2 Plugin

I guess you're wondering: "why, after all that effort with Kubernetes, are you creating AWS instances as slaves?" Well, our cluster has a finite pool of resource. We want elasticity with the Jenkins slaves but, equally, we don't want a large pool sat idle waiting for work.

We are using the EC2 Plugin so that our builder nodes will be automatically launched as necessary when the Jenkins master requests them. Upon completion of their work they will automatically be turned down, and we don't get charged for anything that isn't running. This does come with a time penalty for spinning up new VMs, but we're OK with that. We mitigate some of that cost by leaving them up for 10 minutes after a build, so that any new builds can jump straight onto the resource.

There's a great article on how to configure this plugin here.

Bitbucket OAuth

Our Active Directory is managed externally, so integrating Jenkins with AD was a little bit of a headache. Instead, we opted to integrate Jenkins with Bitbucket OAuth, which is useful because we know all of our engineers will have accounts. The documentation is very clear and accurate, so I would recommend following that guide.

Learning to live with Kubernetes clusters

In my previous post I wrote about how we're provisioning Kubernetes clusters using kube-up and some of the problems we've come across during the deployment process. I now want to cover some of the things we've found while running the clusters. We're in an 'early' production phase at the moment. Our legacy apps are still running on our LAMP systems and our new microservices are not yet live. We only have a handful of microservices so far, so you can get an idea of what stage we're at. We're learning that some of the things we need to fix mean going back and rebuilding the clusters, but we're lucky that right now this isn't breaking anything.

We've been trying to build our Kubernetes cluster following a pretty conventional model: internet-facing components (e.g. NAT gateways and ELBs) in a DMZ, and the nodes sitting in a private subnet using NAT gateways to access the internet. Despite the fact that the kube-up scripts support the minion nodes being privately addressed, the ELBs also get created in the 'private subnet', thus preventing the ELBs from serving public traffic. There have been several comments online around this and the general consensus seems to be that it's not currently possible. We have since found, though, that there are a number of annotations available for ELBs suggesting it may be possible by using appropriate tags on subnets (we're yet to try this; I'll post an update if we have any success).

Getting ELBs to behave the way we want with SSL has also been a bit of a pain. Like many sites, we need the ELB to serve plain text on port 80 and TLS on port 443, with both listeners serving from a single backend port. As before, the docs aren't clear on this. The Service documentation tells us about the https and cert annotations but doesn't tell you the other bits that are necessary. Again, looking at the source code was a big help and we eventually got a config which worked for us.

kind: Service
apiVersion: v1
metadata:
  name: app-name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:us-west-1:xxxxxxxxxx:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
spec:
  selector:
    app:  app-name
  ports:
  - protocol: TCP
    name: secure
    port: 443
    targetPort: 80
  - protocol: TCP
    name: insecure
    port: 80
    targetPort: 80
  type: LoadBalancer


Kubernetes comes with a bunch of add-ons 'out of the box'. By default, when running kube-up, you'll get Heapster/InfluxDB/Grafana installed. You'll also get Fluentd/Elasticsearch/Kibana, along with a dashboard and a DNS system. The DNS system is pretty much essential (in fact, remember to run more than one replica: in one cluster iteration the DNS system stopped and wouldn't start again, rendering the cluster mostly useless). The other add-ons are perhaps less valuable. We've found Heapster consumes a lot of resource and gives limited (by which I mean, not what I want) info. InfluxDB is also very powerful but gets deployed onto non-persistent storage. Instead we've found it preferable to deploy Prometheus into the cluster and deploy our own updated Grafana container. Prometheus actually gives far better cluster metrics than Heapster, and there are lots of pre-built dashboards for Grafana, meaning we can get richer stats faster.

Likewise, Fluentd -> Elasticsearch gives an in-built log collection system, but the provisioned ES is non-persistent, and by having Fluentd ship the logs straight to ES you lose many of the benefits of dropping in Logstash and running grok filters to make sense of the logs. It's trivial (on the assumption you don't already have Fluentd deployed, see below!) to drop in Filebeat to replace Fluentd, and this makes it easy to add Logstash before sending the indexed logs to ES. In this instance we decided to use AWS-provided Elasticsearch to save ourselves the trouble of building the cluster. When deploying things like log collectors, make sure they're deployed as a DaemonSet, as in the sketch below. This will ensure you have an instance of the pod running on each node, regardless of how many nodes you have running, which is exactly what you want for this type of monitoring agent.
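
By way of illustration, a stripped-down DaemonSet for a log shipper looks something like this (the file name, image tag and ConfigMap name are assumptions for the example rather than our exact setup):

filebeat-ds.yml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: kube-system
spec:
  template:
    metadata:
      labels:
        app: filebeat
    spec:
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:5.2.2
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
        - name: config
          mountPath: /etc/filebeat
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: config
        configMap:
          name: filebeat-config

Because it's a DaemonSet, every node, including ones added later, runs exactly one copy of the pod.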

Unfortunately, once you've deployed a k8s cluster (through kube-up) with these add-ons enabled, it's actually pretty difficult to remove them. It's easy enough to remove a running instance of a container, but if a minion node gets deleted and a new one provisioned, the add-ons will turn up again on the new minions. This is because kube-up makes use of Salt to manage the initial minion install, and Salt kicks in again for the new machines. To date I've failed to remove the definition from Salt and have found the easiest option is just to rebuild the k8s cluster without the add-ons. To do this, export the following variables when running kube-up:

export KUBE_ENABLE_CLUSTER_MONITORING=false
export KUBE_ENABLE_NODE_LOGGING=false
export KUBE_ENABLE_CLUSTER_LOGGING=false

Of course, this means re-provisioning the cluster, but I did say we’re fortunate enough to be able to do this (at the moment at least!)

Our next task is to sufficiently harden the nodes, ideally running the system on CoreOS.

Building Kubernetes Clusters

We're in the early stages of deploying a platform built with microservices on Kubernetes. While there are a growing number of alternatives to k8s (all the cool kids are calling it k8s, or kube), Mesos, Nomad and Swarm being some of the bigger names, we came to the decision that k8s has the right balance of features, maturity and out-of-the-box ease of use to make it the right choice for us. For the time being at least.

You may know k8s was gifted to the world by Google. In many ways it's a watered-down version of the Omega and Borg engines they use to run their own apps worldwide across millions of servers, so it's got good provenance. The cynical among you (myself included) may suggest that Google gave us k8s as a way of enticing us to Google Compute Engine. K8s is clearly designed to run on GCE first and foremost, with a number of features only working there. That's not to say it can't be run in other places; it can and does get used elsewhere, from laptops and bare metal to other clouds. For example, we run it in AWS, and while functionality on AWS lags behind GCE, k8s still provides pretty much everything we need at this point.

As you can imagine, setting up k8s could be pretty complicated, given the number of elements involved in something you can trust to run your applications in production with little more than a definition in a YAML file. Fortunately, the k8s team provide a pretty thorough build script to set it up for you, called 'kube-up'.

Kube-up is a script* capable of installing a fully functional cluster on anything from your laptop (Vagrant) through Rackspace, Azure, AWS and VMware and of course onto Google compute and container engine (GCE & GKE). Configuration and customisation for your requirements is done by modifying values in the scripts, or preferably exporting the appropriate settings into your env vars before running the script.

For a couple of reasons, which seemed good at the time, we're running in AWS. While support for AWS is pretty good, the main missing feature we've noticed so far is the lack of the Ingress resource, which provides advanced L7 control such as rate limiting. It's actually pretty difficult to find good information on what is supported, both in the kube-up script and once k8s is running and in use. The best option is to read through the script, see what environment variables are mentioned and then have a play with them.

Along with a kube-up script, there is also a kube-down script (supplied in the tar file downloaded by kube-up). This can be very handy if you’re building and rebuilding clusters to better understand what you need but be warned, it also means it’s perfectly feasible to delete a cluster you didn’t want deleted.

So far I've found a few guidelines which I think should be stuck to when using kube-up. These, with a reason why, are:

Create a stand-alone config file (a list of export Env=Vars) and source that file before running kube-up, instead of modding the downloaded config files.

Having gone through the build process a couple of times now, I've come to the conclusion that the best route is to define all the EnvVar overrides in a stand-alone file and source the file before running the main kube-up script. By default, kube-up will re-download the tar and replace the script directory, blowing away any overrides you may have configured. Downloading a new version of the tar file means you benefit from any fixes and improvements; keeping your config outside it means you don't have to keep re-defining it. I should add too that I have had to hack the contents of various scripts to get them to run without errors, so using the latest version does help minimise this.
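
As a sketch, the stand-alone config is just a file of exports that gets sourced immediately before each kube-up/kube-down run (variable names come from cluster/aws/config-default.sh in the releases we've used and do change between versions, so treat these as examples rather than gospel):

# cluster-env.sh
export KUBERNETES_PROVIDER=aws
export KUBE_AWS_ZONE=eu-west-1a
export MASTER_SIZE=m4.large
export NODE_SIZE=m4.large
export NUM_NODES=4
export KUBE_AWS_INSTANCE_PREFIX=my-cluster

# then, from the directory kube-up downloaded into:
source ./cluster-env.sh && ./cluster/kube-up.sh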

Don't use the default Kubernetes cluster name; create a logical name (something that makes sense to use and still makes sense when there are three or four other clusters running alongside it).

Kube-up/down both rely on the information held in ~/.kube. This directory is created when you run kube-up and lets the kubectl tool know where to connect and what credentials to use to manage the system through the API. If you have multiple clusters and have the details for the 'wrong' cluster stored in this file, kube-down will merrily delete the wrong cluster.
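
Before running kube-down (or anything else destructive) it's worth checking which cluster your kubeconfig is currently pointed at:

kubectl config current-context
kubectl cluster-info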

In addition to this, in AWS, kube-up/down both rely heavily on AWS name tags. These tags are used during the whole lifecycle of the cluster, so they are important at all times. When kube-up provisions the cluster it will tag items so it knows which resources it manages. The same tags are used by the master to control the cluster – for example, to add the appropriate instance-specific routes to the AWS route tables. If the tags are missing, or duplicated (which can happen if you are building and tearing down clusters frequently and miss something in the tear-down), you can end up with a cluster which is reported as fully functional, but applications running in the cluster will fail to run.

One problem I found was that, having laid out a nice VPC config (including subnets and route tables) with Terraform and provisioned the system, when I came to deploying the k8s cluster the kube-up script failed to bind its route table to the subnet which I had told it to use. It failed because I had already defined one myself in Terraform. kube-up did report this as an error, but carried on and provisioned what looked like a fully functioning cluster. It wasn't until the following day that we identified that there were important per-node routes missing. kube-up had provisioned and tagged a route table. Because that table was tagged, that's the table the kube master was updating when minions were getting provisioned. The problem was that the route table was not associated with my subnet. Once I had tagged my Terraformed subnet with the appropriate k8s tag, the master would then update the correct table with new routes for minions. I had to manually copy across the routes from the other table for the existing minions.
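
For reference, the missing tag can be added to an existing subnet with the AWS CLI. The KubernetesCluster key below is what the kube-up releases we've used tag their resources with, so check the tags on the route table kube-up created before copying this verbatim (IDs and cluster name are placeholders):

aws ec2 create-tags \
  --resources subnet-XXXXXXXX \
  --tags Key=KubernetesCluster,Value=my-cluster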

Understand your network topology before building the cluster and define IP ranges for the cluster that don’t collide with your existing network and allow for more clusters to be provisioned alongside in the future. 

If, for example, you choose to deploy two separate clusters using the kube-up scripts, they will both end up with the same IP addressing, and they will only be accessible over the internet. While this isn't the end of the world, it's not ideal, and being able to access them using their private IP/name space is a huge improvement. Of course, if the kube-up provisioned IP range is the same as one of your internal networks, or you have two VPCs with the same IP ranges, it becomes impossible to do this. Having well thought-out networks and IP ranges also makes routing and security far simpler. If you know all your production services sit in a given range, you can easily configure your firewalls to restrict access to that whole range.

Although you can pre-build the VPC, networks, gateways, route tables, etc., if you do, make sure they're kube-up friendly by adding the right tags (which match the custom cluster name you defined above).

When building with default configs, kube-up will provision a new VPC in AWS. While this is great when you want to just get something up and running, it's pretty likely you'll actually want to build the cluster in a pre-existing VPC. You may also already have a way of building and managing these. We like to provision things with Terraform, and we found a way to configure kube-up to use an existing VPC (and to change its networking accordingly), but there are still a number of caveats.

K8s makes heavy use of some networking tricks to provide an easy-to-use interface, but this means that to really understand k8s (you're running your production apps on this, right? So you want a good idea how it's running, right?) you should also have a good understanding of its networks. In essence, Kubernetes makes use of two largely distinct networks. The first provides IPs to the master and nodes and allows you to reach the surface of the cluster (allowing you to manage it, deploy apps onto it, and have those apps served to the world). It uses the second network to manage where the apps are within the cluster and to allow the scheduler to do what it needs to without you having to worry about which node an app is deployed to and what port it's on. If either of these network ranges collides with one of your existing networks you can get sub-optimal behaviour, even if that just means you have to jump through hoops to reach your cluster.
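
Both ranges can be overridden when running kube-up. A hedged sketch, with variable names taken from the config scripts in the releases we've used (check cluster/aws/config-default.sh for your version, and pick CIDRs that fit your own addressing plan):

# CIDR the pods (the 'second network' above) are allocated from
export CLUSTER_IP_RANGE=172.20.64.0/18
# virtual IP range handed out to Kubernetes Services
export SERVICE_CLUSTER_IP_RANGE=172.21.0.0/16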

Update the security groups as soon as the system is built to restrict access to the nodes. We’ve built ours in a VPC with a VPN connection to our other systems, so we can restrict access to private ranges only. 

Also note that by default, although kube-up will provision a private network for you in AWS, all the nodes end up getting public addresses and a security group which allows access to these nodes from anywhere over SSH and HTTP/S for the master. This strikes me as a little scary.
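
Tightening that up is quick with the CLI once you know the security group IDs kube-up created (the group ID and CIDRs below are placeholders):

# drop the world-open SSH rule on the node security group...
aws ec2 revoke-security-group-ingress --group-id sg-XXXXXXXX --protocol tcp --port 22 --cidr 0.0.0.0/0
# ...and allow SSH only from our private/VPN range instead
aws ec2 authorize-security-group-ingress --group-id sg-XXXXXXXX --protocol tcp --port 22 --cidr 10.0.0.0/8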

  * Kube-up is in fact far more than just a single script; it downloads a whole tar file of scripts, but let's keep it simple.