In my previous post I wrote about how we’re provisioning kubernetes clusters using kube-up and some of the problems we’ve come across during the deployment process. I now want to cover some of the things we’ve found while running the clusters. We’re in an ‘early’ production phase at the moment. Our legacy apps are still running on our LAMP systems and our new micro services are not yet live. We only have a handful of micro services so far, so you can get an idea of what stage we’re at. We’re learning that some of the things we need to fix mean going back and rebuilding the clusters but we’re lucky that right now this isn’t breaking anything.
We’ve been trying to build our kubernetes cluster following a pretty conventional model; internet facing components (e.g. NAT gateways and ELBs) in a DMZ and the nodes sitting in a private subnet using NAT gateways to access the internet. Despite the fact that the kube-up scripts support the minion nodes being privately addressed, the ELBs also get created in the ‘private subnet’ thus preventing the ELBs from serving public traffic. There have been several comments online around this and the general consensus seems to suggest it’s not currently possible. We have since found though there are a number of annotations available for ELBs suggesting it may be possible by using appropriate tags on subnets (we’re yet to try this though. I’ll post an update if we have any success)
Getting ELBs to behave the way we want with SSL has also been a bit of a pain. Like many sites, we need the ELB to serve plain text on port 80 and TLS on port 443, with both listeners serving from a single backend port. As before, the docs aren’t clear on this. the Service documentation tells us about the https and cert annotations but doesn’t tell you the other bits that are necessary. Again, looking at the source code was a big help and we eventually got a config which worked for us.
kind: Service apiVersion: v1 metadata: name: app-name annotations: service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:us-west-1:xxxxxxxxxx:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx” service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http" service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443" spec: selector: app: app-name ports: - protocol: TCP name: secure port: 443 targetPort: 80 - protocol: TCP name: insecure port: 80 targetPort: 80 type: LoadBalancer
Kubernetes comes with a bunch of add-ons ‘out of the box’. By default when running kube-up, you’ll get heapster/influxDB/grafana installed. You’ll also get FluentD/Elasticsearch/Kibana along with a dashboard and DNS system. While the DNS system is pretty much necessary (in fact remember to ensure you want more than one replica, in one cluster iteration, the DNS system stopped and would’t start again rendering the cluster mostly useless.) The other add-ons are perhaps less valuable. We’ve found heapster consumes a lot of resource and gives limited (by which I mean, not what I want) info. InfluxDB is also very powerful but will get deployed into non-persistent storage. Instead we’ve found it preferable to deploy prometheus into the cluster and deploy our own updated grafana container. Prom actually gives far better cluster metrics than heapster and there are lots of pre-built dashboards for grafana meaning we can get richer stats faster.
Likewise Fluentd -> Elasticsearch gives an in-built log collection system, but the provisioned ES is non-persistent and by having fluent ship the logs straight to ES you loose many of the benefits of dropping in logstash and running grok filters to make sense of the logs. It’s trivial (on the assumption you don’t already have fluentD deployed, see below!) to drop in filebeat to replace fluentD and this makes it easy to add logstash before sending the indexed logs to ES. In this instance we decided to use AWS provided Elasticsearch to save ourselves the trouble of building the cluster. When deploying things like log collectors, make you they’re deployed as a daemonSet. This will make sure you have an instance of the pod running on each node, regardless of how many nodes you have running, which is exactly what you want for this type of monitoring agent.
Unfortunately, once you’ve deployed a k8s cluster (through kube-up) with these add-ons enabled, it’s actually pretty difficult to remove them. It’s easy enough to removing a running instance of a container, but if a minion node gets deleted and a new one provisioned, the add-ons will turn up again on the minions. This is because kube-up makes use of salt to manage the initial minion install and salt kicks in again for the new machines. To date I’ve failed to remove the definition from salt and have found the easiest option is just to rebuild the k8s cluster without the add ons. To do this, export the following variables when running kube-up:
Of course, this means re-provisioning the cluster, but I did say we’re fortunate enough to be able to do this (at the moment at least!)
Our next task is to sufficiently harden the nodes, ideally running the system on CoreOS.