Log aggregation for Docker containers in Mesos / Marathon cluster

Renat Zubairov integration best practices

This article describes several alternatives for gathering Docker container logs in the distributed environment of an Apache Mesos / Marathon cluster: syslog, container linking, the Docker REST API, embedded logging that pipes stdout/stderr, and the Mesos APIs. We’ll go through the problem statement and the different alternatives, and describe the challenges related to each of them.

tl;dr: the Docker REST API combined with intelligent ‘docker inspect’ hooks does a great job.

Background

At elastic.io we are building an integration platform for developers, with the best possible environment to code, test and run integration jobs or flows. An integration flow is a sequence of integration components that are connected to each other. Each integration component is an individual process running in a Docker container that communicates with the next component via a persistent RabbitMQ queue. We provide tooling and monitoring on top of that, and showing the logs of integration components is an important part of it.

A logging problem in Docker containers

We have a large number of Docker containers and we need to aggregate their logs so that we can show them to the users. The containers run inside Apache Mesos and are scheduled with Mesosphere Marathon on a varying number of Mesos slaves. Our goal is to support all programming languages that run inside Docker containers, so we can’t impose any specific logging framework. Our options are therefore limited to grabbing STDOUT and STDERR and pushing them to persistent storage, e.g. S3. This is, by the way, close to the 12 Factor Apps logging concept, which supports our point here.

Another requirement for the solution is that we have to encrypt the log output. Security is a very important part of what we do, and log output may contain sensitive information, so we treat logs just like user data: logs have to be encrypted with a tenant-specific key.

Implementation alternatives

After some googling we identified the following alternatives:

  • Alternative A: Mounted volume – Store container logs on the mounted volume and pick it up from there
  • Alternative B: Syslog – Aggregate logs within a container and push them somewhere over the network, e.g. via syslog
  • Alternative C: Docker API – Use the Docker REST API or CLI and attach to each container after start

Alternative A: Mounted volume

It’s a great solution if we package existing applications: just mount /var/log outside of the container and use other tools like logstash to collect the logs. So the first advantage is simplicity.

However, there are the following disadvantages:

  • We don’t know which applications will run inside our Docker containers, so we can’t assume that they log to the filesystem
  • Enforcing that logging is done to a file is also against the 12 Factor Apps logging concept
  • As we are working inside a Mesos/Marathon cluster, we would have to make sure log-collector agents are active on all Mesos slaves
  • Disk capacity: partially solved via the Mesos sandbox, but it becomes a problem again when mounting an outside volume

Because of these disadvantages we decided not to proceed with this one.

Alternative B: Syslog

We could aggregate logs within a container and then just push them to syslog. Syslog is a great Unix tool for logging with a long history, hence very stable and reliable. The solution would involve exposing a syslog port inside or outside of the container and pushing the data from the container into it. The best way to do that is by linking containers together. This solution has several advantages over the previous (filesystem mount) solution:

  • No need to do filesystem mounts, network is simpler to use
  • Syslog gets the logging information pushed to it; it doesn’t have to poll the filesystem for it
  • More flexible deployment options

There are, however, also drawbacks:

  • The container needs to know about syslog and push the logs internally, which again imposes limitations on the container
  • The container linking concept competes with some of the Mesos/Marathon concepts; without a proper network virtualisation layer it doesn’t seem right to use container linking
  • To minimise network load we would have to deploy a Docker container with syslog on all slaves
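
To illustrate the first drawback, this is roughly what every containerised application would have to do itself. It is a minimal sketch that assumes the RFC 5424 message format and UDP transport; the function names, the facility/severity defaults and the host and port in the usage note are made up for the example.

```javascript
// Sketch only: format one log line as an RFC 5424 syslog message.
// Facility 1 (user-level) and severity 6 (informational) are illustrative defaults.
function formatSyslog(opts) {
    var facility = opts.facility === undefined ? 1 : opts.facility;
    var severity = opts.severity === undefined ? 6 : opts.severity;
    var pri = facility * 8 + severity;        // PRI value: facility * 8 + severity
    var timestamp = opts.date.toISOString();  // ISO 8601 timestamp in UTC
    // "-" marks the fields left empty here: PROCID, MSGID and STRUCTURED-DATA
    return '<' + pri + '>1 ' + timestamp + ' ' + opts.hostname + ' ' +
        opts.appName + ' - - - ' + opts.message;
}

// Send one formatted line to a syslog daemon over UDP.
function sendSyslog(line, host, port, callback) {
    var dgram = require('dgram');
    var socket = dgram.createSocket('udp4');
    var payload = Buffer.from(line, 'utf8');
    socket.send(payload, 0, payload.length, port, host, function (err) {
        socket.close();
        callback(err);
    });
}
```

A call such as sendSyslog(formatSyslog({...}), 'syslog', 514, done) would then be needed for every log line, in every language we want to support, which is exactly the kind of requirement we want to avoid.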

Alternative C: Docker API

The Docker API solution involves the following steps:

  • Monitor the Mesos slaves for newly started Docker containers
  • As soon as a new container is started, attach to it and grab all STDOUT and STDERR from it
  • Encrypt the output and push it to the appropriate storage (e.g. S3)

The advantage of this solution is that no assumptions are made about the application inside the container: all logs that are sent to STDOUT and STDERR will be forwarded. The drawbacks, however, are the following:

  • Network communication is required, so we have to work with distributed ‘agents’ on localhost to minimise it
  • The logging agents need access to the Docker API, so they represent a potential security issue
  • We need a clever way to access the Docker API securely and reliably
  • We need to monitor the uptime of the logging agents to mitigate their failures

More details about this solution below.

The solution – say hello to Boatswain

As you could guess, the last solution is the one that we implemented, and I have to say that so far it works like a charm. Our distributed Docker logging agent is called Boatswain, and it does a great job of aggregating logging information from the Docker containers running on our Mesos cluster. And guess what, it’s fewer than 100 lines of code.

We are using docker-allcontainers, which uses dockerode and never-ending-stream to access the Docker API. Boatswain is notified by the local Docker daemon about starts and stops of all new and existing containers:

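var allContainers = require('docker-allcontainers');

// `listeners` holds the onContainerStart / onContainerStop handlers
var ac = allContainers({
    preheat: true, // emit 'start' events for all already running containers
    docker: null   // options passed to dockerode
})
.on('start', listeners.onContainerStart)
.on('stop', listeners.onContainerStop);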

You might wonder how we connect to the Docker API; more on that later. When a new container starts, we do a quick docker inspect and then attach to it:

var Q = require('q');

function onContainerStart(meta, container) {
    console.log('Container started: %s / %s', meta.image, meta.name);
    // inspect the container first, then attach to its output
    Q.ninvoke(container, 'inspect')
        .then(attach)
        .catch(error)
        .done();
        ...
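
When a container runs without a TTY, the Docker attach endpoint multiplexes STDOUT and STDERR into a single stream of frames, each prefixed with an 8-byte header. dockerode can split this stream for you; the hand-rolled sketch below only illustrates what arrives over the wire and is not Boatswain's actual code. The function name demuxFrames is made up for the example.

```javascript
// Sketch: split a buffer of Docker attach frames into stdout and stderr text.
// Each frame has an 8-byte header: byte 0 is the stream type (1 = stdout,
// 2 = stderr) and bytes 4-7 hold the payload length as a big-endian uint32.
function demuxFrames(buffer) {
    var out = { stdout: '', stderr: '', rest: buffer };
    var offset = 0;
    while (buffer.length - offset >= 8) {
        var type = buffer.readUInt8(offset);
        var size = buffer.readUInt32BE(offset + 4);
        if (buffer.length - offset - 8 < size) {
            break; // incomplete frame: wait for more data
        }
        var payload = buffer.toString('utf8', offset + 8, offset + 8 + size);
        if (type === 2) {
            out.stderr += payload;
        } else {
            out.stdout += payload; // anything that is not stderr is treated as stdout
        }
        offset += 8 + size;
    }
    out.rest = buffer.slice(offset); // unparsed tail, to be prepended to the next chunk
    return out;
}
```

The rest field matters because a network chunk can end in the middle of a frame; the caller is expected to concatenate it with the next chunk before parsing again.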

The resulting stream is encrypted and pushed to S3. That’s it.
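
The encryption step could look like the sketch below. The cipher (AES-256-GCM), the function names and the iv/tag/data layout are assumptions chosen for the example, not a description of what Boatswain actually uses; the upload to S3 is left out.

```javascript
var crypto = require('crypto');

// Sketch: encrypt one chunk of log output with a tenant-specific 32-byte key.
// AES-256-GCM and the { iv, tag, data } layout are assumptions for this example.
function encryptChunk(tenantKey, chunk) {
    var iv = crypto.randomBytes(12); // fresh random nonce for every chunk
    var cipher = crypto.createCipheriv('aes-256-gcm', tenantKey, iv);
    var data = Buffer.concat([cipher.update(chunk), cipher.final()]);
    return { iv: iv, tag: cipher.getAuthTag(), data: data };
}

// Reverse operation; throws if the key is wrong or the data was tampered with.
function decryptChunk(tenantKey, box) {
    var decipher = crypto.createDecipheriv('aes-256-gcm', tenantKey, box.iv);
    decipher.setAuthTag(box.tag);
    return Buffer.concat([decipher.update(box.data), decipher.final()]);
}
```

Because every chunk carries its own nonce and authentication tag, a reader can decrypt chunks independently, which suits log output that is written incrementally.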

This app is packaged as a Docker container, and we use Apache Marathon to start and monitor it. Of course we ran into several other issues, such as:

How to connect to Docker API?

As our logger process is deployed as a Marathon app, we need a secure way to give it access to the Docker daemon running on the Mesos slave. Pavel and George found a nice way to do that: they just mount the Docker socket inside the Docker container. Here is how it looks in our Marathon app descriptor file:

{
    "container": {
        "type": "DOCKER",
        "volumes": [
            {
                "containerPath": "/var/run/docker.sock",
                "hostPath": "/var/run/docker.sock",
                "mode": "RW"
            }
        ]
    }
}

As the container is running as root and the Mesos daemon is also running as root (it has to start Docker containers somehow too), we have a nice socket-based solution that imposes no network load at all. IMHO it’s the best way to use the Docker API from one of the Marathon apps.
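
With the socket mounted, any HTTP client that supports Unix sockets can talk to the Docker REST API. The sketch below uses only Node's http module to show the idea; in Boatswain this is handled by dockerode, and the function names here are made up for the example.

```javascript
var http = require('http');

// Sketch: build options for a Docker REST API call over the mounted Unix socket.
function dockerRequestOptions(path, method) {
    return {
        socketPath: '/var/run/docker.sock', // the containerPath from the descriptor
        path: path,
        method: method || 'GET'
    };
}

// List running containers via GET /containers/json and parse the JSON reply.
function listContainers(callback) {
    var req = http.request(dockerRequestOptions('/containers/json'), function (res) {
        var body = '';
        res.setEncoding('utf8');
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
            var parsed;
            try {
                parsed = JSON.parse(body);
            } catch (err) {
                return callback(err);
            }
            callback(null, parsed);
        });
    });
    req.on('error', callback);
    req.end();
}
```

No TCP port is opened on the slave, so the Docker API stays reachable only for processes that can access the socket file.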

Note: make sure your Marathon version already supports volumes in the configuration.

How to deploy the log collector app?

So how do we make sure each Mesos slave has exactly one instance of Boatswain up and running? Marathon constraints give us a good solution here. This is our Marathon app descriptor:

{
    "id": "boatswain",
    "constraints": [
        [
            "hostname",
            "UNIQUE"
        ]
    ],

Now we just need to scale our app to the exact number of slaves, and Marathon will not only distribute boatswain to all slaves but also make sure it is restarted in case of a shutdown.

There is, however, a little problem: when we increase the number of slaves we need to update the boatswain application descriptor. We could of course set a very large number of required instances in the first place, and Marathon would start only one on each slave. That would, however, leave the Boatswain deployment in ‘pending’ status in the Marathon UI, which is also not nice. We still need to see what the best solution will be here.
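
One option we could imagine is a small job that adjusts the instance count whenever the number of slaves changes. The sketch below only builds the Marathon request (PUT /v2/apps/{appId} with an instances field); how the slave count is obtained, the function name and the Marathon URL are assumptions, and this is not something we have in production.

```javascript
// Sketch: build a Marathon request that sets the instance count of an app.
// Where slaveCount comes from is left to the caller.
function buildScaleRequest(marathonUrl, appId, slaveCount) {
    if (!Number.isInteger(slaveCount) || slaveCount < 0) {
        throw new Error('slaveCount must be a non-negative integer');
    }
    var base = marathonUrl.replace(/\/+$/, ''); // drop trailing slashes
    var id = appId.replace(/^\/+/, '');         // Marathon ids may start with "/"
    return {
        method: 'PUT',
        url: base + '/v2/apps/' + id,
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ instances: slaveCount })
    };
}
```

For example, buildScaleRequest('http://marathon:8080', 'boatswain', 5) describes the call that would scale Boatswain to five instances; together with the UNIQUE hostname constraint that means one per slave in a five-slave cluster.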

How to know which app is running in which container?

As we make no assumptions about the code running inside containers, we have to find a reliable way to identify the containers and associate them with a particular integration component of a particular tenant running on our system. This is quite a significant issue: what we get over the Docker API is the container ID, which is essentially a randomly generated identifier. Neither Marathon nor Mesos gives us a reliable way to pass some identification down to the Docker container (e.g. naming the container like Marathon-App-SlaveID-Random or something similar). The solution we found was to inspect the container before attaching to it. When launching an integration component on Marathon we give it a couple of environment variables so that, for example, it can connect to RabbitMQ and decrypt messages from there. With docker inspect we gain access to these environment variables, so we can reliably identify the app inside the Docker container and encrypt the log files with the tenant-specific key.
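
The identification step can be sketched as follows. docker inspect returns the environment as an array of "KEY=value" strings under Config.Env; the variable names TENANT_ID and COMPONENT_ID and both function names are hypothetical stand-ins, not the names we actually use.

```javascript
// Sketch: turn the Config.Env array from `docker inspect` ("KEY=value" strings)
// into a plain object.
function parseEnv(inspectInfo) {
    var env = {};
    var entries = (inspectInfo.Config && inspectInfo.Config.Env) || [];
    entries.forEach(function (entry) {
        var idx = entry.indexOf('=');
        if (idx === -1) {
            env[entry] = '';
        } else {
            env[entry.slice(0, idx)] = entry.slice(idx + 1); // values may contain "="
        }
    });
    return env;
}

// TENANT_ID and COMPONENT_ID are hypothetical variable names for this example.
function identify(inspectInfo) {
    var env = parseEnv(inspectInfo);
    if (!env.TENANT_ID || !env.COMPONENT_ID) {
        return null; // not one of our integration components
    }
    return { tenantId: env.TENANT_ID, componentId: env.COMPONENT_ID };
}
```

Containers for which identify returns null can simply be skipped, so unrelated containers on the same slave never end up in a tenant's logs.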

Conclusion

We are quite happy with the resulting approach. It’s not only simple (<100 lines of code) but also a clever solution that uses the technologies at hand and imposes no requirements on the applications running in the Docker containers on top of Mesosphere Marathon and Apache Mesos.


About the Author
Renat Zubairov

Renat Zubairov is CEO and co-founder of elastic.io. He is an experienced hacker, product owner and agile evangelist. Renat speaks at international conferences and user groups and is an active open source community member. During his career Renat has worked with world-class companies like Nokia, Nokia-Siemens Networks, TCS and DHL. For the last 5 years he has been working in product start-ups in the Application Integration, Data Integration and Business Process Management areas, leading development of an Application Integration (ESB, SOA) product.

