A Mesos logging module for journald
A short introduction to Mesos
We're going to send logs from Mesos containers to systemd-journald by writing a container logging module in C++. You can skip this section if you already know what Mesos is.
Apache Mesos describes itself as a distributed systems kernel, which is appropriate in a lot of ways. For some it may sound a bit intimidating and complicated, and I think that's a bit unfortunate, because it can be explained very simply without losing too much.
Mesos offers resources in a cluster. Resource types include CPU cores, memory, disk space, network ports, GPUs, etc. Let's say I'm a developer and I want to run my application on the cluster. I get an offer from Mesos of 4 CPU cores, 8 GB of memory, 40 GB of disk space and a range of ports from 10000 to 20000.
I don't need all of it, and reply that I accept 1 CPU core, 2 GB of memory, 200 MB of disk space and one port, port 10000, and that I want to fetch https://wjoel.com/foo-standalone.jar (a self-contained "fat JAR" with no external dependencies) and run it with the command java -jar foo-standalone.jar. Mesos will create a container using cgroups (if running on Linux) to enforce limits based on the resource constraints I accepted. The container is also known as a sandbox, and we get to play in it as long as we stay within the resource limits.
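To make the offer-and-accept exchange concrete, here is a minimal sketch of the check a framework would make before replying. The Resources struct and the fitsWithin function are names I made up for illustration; they are not part of the Mesos API, which models resources in a far more general way.

```cpp
// Hypothetical, simplified stand-in for a Mesos resource offer and for the
// subset of it that a framework accepts. This is not the real Mesos API.
struct Resources {
  double cpus;
  int memMb;
  int diskMb;
  int portBegin;  // first port in the range, inclusive
  int portEnd;    // last port in the range, inclusive
};

// Returns true if everything in `wanted` is covered by `offer`.
bool fitsWithin(const Resources& wanted, const Resources& offer) {
  return wanted.cpus <= offer.cpus &&
         wanted.memMb <= offer.memMb &&
         wanted.diskMb <= offer.diskMb &&
         wanted.portBegin <= wanted.portEnd &&
         wanted.portBegin >= offer.portBegin &&
         wanted.portEnd <= offer.portEnd;
}
```

With the numbers from above, an offer of {4.0, 8192, 40960, 10000, 20000} covers a request of {1.0, 2048, 200, 10000, 10000}, so the framework can accept; a request for 8 cores would not fit and the offer would be declined.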
Developers typically don't want to bother with resource offers from Mesos. Programs that respond to resource offers from Mesos are called frameworks. One such framework is Mesosphere's Marathon, and its application specifications are essentially lists of resources and the command to run. Marathon can also ensure that applications are restarted if they die for any reason, do rolling updates, and many other useful things that developers like to have.
You may have noticed that I told Mesos to run my JAR file using Java, but didn't specify that I wanted Java to be downloaded. Hence, my application will only run if Mesos decides to run it somewhere where Java is already installed.
I could create a Docker image which includes foo-standalone.jar, Java, and any other dependencies I might need. Mesos can run Docker containers as well, either on its own or by using Docker for isolation. Alternatively, I could have included an additional URL in my reply, pointing to an archive with a full Java installation, and used that instead, all from within the container.
Container logging in Mesos
The output from my program will end up in the directory of the sandbox Mesos created, in the files stdout and stderr. That's fine in a lot of cases, since the Mesos UI has an interface to view the contents of those files and even updates the view when the files are changed.
Some people prefer to have all their logs go through systemd-journald, or journald for short, perhaps because they have already solved the problems of log forwarding and archiving for journald. We can get this today, instead of having to wait for a release that has it, because there is support for many types of modules in Mesos. There is a module type for container loggers, so let's make one for journald.
The default behavior of logging to stdout and stderr is implemented by the sandbox logger, which can be found in src/slave/container_logger/sandbox.cpp. It's alright if you don't know (or dislike) C++, because the important lines are simple enough.
process::Future<ContainerLogger::SubprocessInfo> prepare(
    const ExecutorInfo& executorInfo,
    const std::string& sandboxDirectory)
{
  ContainerLogger::SubprocessInfo info;
  info.out = SubprocessInfo::IO::PATH(path::join(sandboxDirectory, "stdout"));
  info.err = SubprocessInfo::IO::PATH(path::join(sandboxDirectory, "stderr"));
  return info;
}
In other words, direct all standard output into the file stdout in the sandbox directory, and direct all standard error into the file stderr in the sandbox directory.
A Mesos module for container logging to systemd-journald
At first I thought I'd have to intercept everything written to info.out and info.err, split it on newlines, and then send each line to journald using sd_journal_print. I was sufficiently disgusted by the idea of going back to the ancient C world of reading bytes from file descriptors to go looking for prior art. There is already a command for sending lines of text to journald called systemd-cat, and it is straightforward: it calls sd_journal_stream_fd to get a file descriptor connected to journald and uses that descriptor as its standard output and standard error.
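For the curious, the approach I decided against would have needed something like the following. This is only a sketch under my own assumptions: LineSplitter and its callback are hypothetical names, and in a real logger the callback would be the place to call sd_journal_print for each complete line.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Buffers raw bytes and hands over one complete line at a time.
// Hypothetical helper for illustration, not part of Mesos or systemd.
class LineSplitter {
public:
  explicit LineSplitter(std::function<void(const std::string&)> emit)
    : emit_(std::move(emit)) {}

  // Feed a chunk as it arrives from read(). Chunks may end in the middle
  // of a line, so the unfinished tail is kept until the next call.
  void feed(const char* data, std::size_t length) {
    buffer_.append(data, length);
    std::size_t start = 0;
    std::size_t newline;
    while ((newline = buffer_.find('\n', start)) != std::string::npos) {
      emit_(buffer_.substr(start, newline - start));
      start = newline + 1;
    }
    buffer_.erase(0, start);
  }

  // Call at end of stream to emit a last line that has no trailing newline.
  void flush() {
    if (!buffer_.empty()) {
      emit_(buffer_);
      buffer_.clear();
    }
  }

private:
  std::function<void(const std::string&)> emit_;
  std::string buffer_;
};
```

Even this small version has to remember partial lines between reads, and it still lacks the read loop itself, which is a fair amount of machinery for something journald can already do for us.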
Using sd_journal_stream_fd doesn't quite work with the example above, since it is using (file) paths, but it's possible to assign a file descriptor to info.out and info.err.
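// See the GitHub repository for how to set this to the Mesos task identifier.
std::string identifier;
// sd_journal_stream_fd returns a file descriptor, or a negative value on failure.
// The last argument (1) enables parsing of syslog-style priority prefixes.
int journal_out = sd_journal_stream_fd(identifier.c_str(), LOG_INFO, 1);
int journal_err = sd_journal_stream_fd(identifier.c_str(), LOG_ERR, 1);
info.out = SubprocessInfo::IO::FD(journal_out);
info.err = SubprocessInfo::IO::FD(journal_err);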
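Why is a bare file descriptor enough? A process's standard output is just descriptor 1, and whoever starts the process can point it anywhere. The sketch below shows the idea with a pipe standing in for the journald stream, so it runs without systemd. The names runWithStdout and captureOutput are mine, and this is not how Mesos implements its subprocess handling; it only illustrates the mechanism, assuming a POSIX system with /bin/sh.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `command` through /bin/sh with its standard output redirected to
// `outFd`. Returns the child's exit status, or -1 on failure.
int runWithStdout(const std::string& command, int outFd) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    // Child: make outFd the standard output, then run the command.
    if (dup2(outFd, STDOUT_FILENO) < 0) {
      _exit(127);
    }
    execl("/bin/sh", "sh", "-c", command.c_str(), static_cast<char*>(nullptr));
    _exit(127);  // only reached if execl failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) < 0) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// A pipe plays the role of the journald stream descriptor here.
// Only suitable for small outputs: the child is waited for before the pipe
// is read, so output larger than the pipe buffer would block the child.
std::string captureOutput(const std::string& command) {
  int fds[2];
  if (pipe(fds) != 0) {
    return "";
  }
  runWithStdout(command, fds[1]);
  close(fds[1]);  // close the write end so read() sees end-of-file
  std::string output;
  char buffer[256];
  ssize_t n;
  while ((n = read(fds[0], buffer, sizeof(buffer))) > 0) {
    output.append(buffer, static_cast<std::size_t>(n));
  }
  close(fds[0]);
  return output;
}
```

Swap the write end of the pipe for the descriptor returned by sd_journal_stream_fd and the child's output goes to journald instead, with no line splitting on our side.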
The Mesos documentation on modules is good, but some extra steps are needed for the compilation to succeed. First compile Mesos, but configure it with ../configure --enable-install-module-dependencies. You might be forgiven for missing that step, as I did, since it's not (yet?) included in the documentation and it's a new flag. I also had to install libz-dev on a clean Debian 8 installation, but once Mesos has been compiled and installed we can compile the module.
g++ -I/usr/local/lib/mesos/3rdparty/include -c -fpic \
    -o journald_container_logger.o journald_container_logger.cpp
g++ -shared -o mesos_journald_container_logger.so journald_container_logger.o -lsystemd
The GitHub repository for this code uses CMake, but the above is the essence of what it does and it's where I started before going down the CMake route.
A quick demo
First we start mesos-master as usual. I use an internal IP address here. Then we start mesos-agent with two additional flags, perhaps using sudo. We'll use Marathon to run a simple shell script.
$ mesos-master --ip=10.0.1.3
$ mesos-agent --master=10.0.1.3:5050 \
    --modules='{"libraries":[{"file":"/path/to/mesos_journald_container_logger.so",
                "modules":[{"name":"com_wjoel_JournaldLogger"}]}]}' \
    --container_logger=com_wjoel_JournaldLogger
$ /path/to/marathon-1.1.1/bin/start --master 10.0.1.3:5050
Once everything is up and running we can create an application in Marathon which echoes "hello world" every 5 seconds using a simple shell loop, with journalctl -f running in the background. In the terminal running journalctl -f we can see the output from our application (along with a failed ssh login from Colombia).
What we win and what we lose
The output from our container is now forwarded to journald. That means the output is no longer written to files in the sandbox, so we can't view it in the Mesos web interface.
This isn't an issue if log forwarding has been set up on all machines where a container might run, but wouldn't it be nice if we could have both? And while our change to make this happen was most delightfully simple, isn't it just a worse systemd-cat?
Of course it would, and sure it is. We'll take care of all that and more next time. Until then, you can find the mesos-journald-container-logger on GitHub.