Mesos container log forwarding with Filebeat
The next level
In the previous post we managed to forward container logs to journald, but
we could have done the same thing and much more if we could have forwarded
the logs to systemd-cat
instead. We wouldn't have had to write any code
at all if that was possible.
So we're going to make that possible by backporting an upcoming "external" container logger, being developed for the next release of Mesos, as a module compiled for the currently stable release. That's version 1.0.1 at the time of writing.
Then we'll use this module to set up log forwarding from Mesos containers to Graylog using Filebeat 1.3 and include additional fields from the Mesos executor's environment.
Mesos and external container loggers
Apache Mesos has had support for a LogrotateContainerLogger since version
0.27, but if you look at what that container logger you'll find that all it
does is receive input from standard input and standard error, counts the
number of bytes received, and writes to the stdout
and stdout
files in
the sandbox directory.
It calls logrotate
once a configurable number of bytes has been received.
This logic is implemented by mesos-logrotate-logger
which is a
stand-alone program. Why do we need a separate module to feed logs to
this specific external program?
MESOS-6003 was created to add a logging module for logging to any external program. Thanks to the hard work of Will Rouesnel, shepherded by Joseph Wu, this is almost completed.
What if we can't wait for the next release of Mesos? Maybe we're stuck on 1.0.1 for a while for reasons beyond our control, but we'd love to use Filebeat for log forwarding. Unfortunately we can't, because the recommended setup doesn't work.
Even if it did we'd still be missing out on important context, like what the Marathon app ID and the Mesos task ID are. Maybe there's a specific version of the Marathon app which is having issues. It would be nice if we could find all log messages from that application and version with a simple search.
We'll make that possible by backporting the external container logging module to Mesos 1.0.1 and setting it up to use a small wrapper around Filebeat to create configuration which includes the task's environment variables and forwards the log messages to Graylog. It will also work if you are using Logstash.
Creating and using the backported module
By reusing much of what we did in the last post we can create a module for Mesos 1.0.1 using the code available in the review request for the external logger module. It doesn't compile using Mesos 1.0.1 as is, but the necessary changes are trivial.
The module has been compiled for Ubuntu 14.04 (Trusty Tahr) and Debian 8 and can be downloaded from the mesos-external-container-logger repository on GitHub, but using the provided scripts you can compile it for your system.
It is enabled by starting mesos-slave
with a modules config similar
to this one and using --container-logger=org_apache_mesos_ExternalContainerLogger
.
{
"libraries": [{
"file": "/usr/lib/libexternal_container_logger-0.1.0.so",
"modules": [{
"name": "org_apache_mesos_ExternalContainerLogger",
"parameters": [{"value": "/usr/bin/filebeat-container-logger.sh",
"key": "external_logger_cmd"}]
}]
}]
}
The ExternalContainerLogger
module creates an environment variable called
MESOS_LOG_STREAM
which will have the value "STDOUT" or "STDERR" depending
on where the log is coming
from. To emulate the default sandbox container logger you could use this.
#!/bin/sh
cat > $(echo "$MESOS_LOG_STREAM" | tr '[:upper:]' '[:lower:]')
We can also emulate the LogrotateContainerLogger
.
#!/bin/sh
/usr/libexec/mesos/mesos-logrotate-logger \
--log_filename=$(echo "$MESOS_LOG_STREAM" | tr '[:upper:]' '[:lower:]')
Filebeat as the external logger process
We can use Filebeat to forward our logs to a central log server like Graylog if we're not happy with local files. We'll create a small bootstrap script to create the Filebeat configuration file and start it.
We extract all environment variables for the task from
MESOS_EXECUTORINFO_JSON
, which is set by
the module, using jq to parse the JSON
and a bit of awk
to convert it to the proper format.
The script is simple enough to include in full here but you can also find it on GitHub as examples/filebeat-container-logger.sh
#!/bin/bash
config_path="$MESOS_LOG_SANDBOX_DIRECTORY/filebeat-$MESOS_LOG_STREAM.yml"
mesos_fields=$(echo "$MESOS_EXECUTORINFO_JSON" | \
jq -r ".command.environment.variables
|map(\"\(.name):\(.value|tostring)\")|.[]" | \
# Skip empty variables, use mesos_ prefix, convert to lowercase
awk -F: 'length($2) > 0 {
$1=tolower($1);
if (!match($1, "^mesos_.*")) {
$1="mesos_" $1;
}
printf("%s: \"%s\"\n ", $1, $2);
}')
cat <<EOF > $config_path
filebeat:
prospectors:
-
paths:
- "-"
input_type: stdin
close_eof: true
fields:
mesos_log_stream: $MESOS_LOG_STREAM
mesos_log_sandbox_directory: $MESOS_LOG_SANDBOX_DIRECTORY
$mesos_fields
output:
logstash:
hosts: ["graylog.aws.wjoel.com:5044"]
EOF
/usr/bin/filebeat -c $config_path
Change the hosts
variable to your Graylog
or Logstash server. More output options are available
if you need to send the logs to something else.
All environment variables in the ExecutorInfo
JSON description,
prefixed with mesos_
and in lowercase, are included as extra fields.
This makes it easy to find all log messages for any
executor, Marathon application, or Mesos agent.
This is what we get in Graylog when we run three instances of the "hello world" Marathon app from the previous post with all of the above in place.
Making the previous post irrelevant with systemd-cat
At the end of the previous post we noted that we had implemented a worse
systemd-cat
. We can use the external logger module to provide logging
to journald with systemd-cat
. Note that this example requires that jq
is installed on the Mesos agent, just like the example above.
#!/bin/bash
log_level="info"
if [ "$MESOS_LOG_STREAM" = "STDERR" ]; then
log_level="err"
fi
task_id=$(echo "$MESOS_EXECUTORINFO_JSON" | \
jq -r '.command.environment.variables[]|select(.name=="MESOS_TASK_ID").value')
if [ "$task_id" ]; then
task_identifier="-t $task_id"
fi
systemd-cat $task_identifier -p $log_level
Finally, we can combine Filebeat with mesos-logrotate-logger
by
changing the last line of filebeat-container-logger.sh
to the
following.
log_stream=$(echo $MESOS_LOG_STREAM | tr '[:upper:]' '[:lower:]')
tee >(/usr/libexec/mesos/mesos-logrotate-logger \
--log_filename="$MESOS_LOG_SANDBOX_DIRECTORY/$log_stream" \
--logrotate_options="rotate 5" \
--max_size="10MB") | \
/usr/bin/filebeat -c $config_path
This way the logs are available both in Graylog and the sandbox, so we can view logs in the Mesos web interface as usual. In addition, log rotation will ensure that don't out of disk space due to unbounded container logs.
Caveats and possible improvements
The cost for this setup is two Filebeat processes per container. That's acceptable for most setups since it was created to be lightweight, but in some cases it could be worth exploring other options.
For example, you could run just one Filebeat process
on each Mesos agent and create config for each container in an external
logger script which is then loaded by Filebeat. You would then
have to find a way to clean up configs after containers are
terminated, but this should be possible (although ugly) with a
cron
job.
Filebeat unfortunately does not terminate when standard input
is closed, even with close_eof
. The newly added
-once flag might help, but it's so new that you would currently
have to compile Filebeat from source to enable it.
This should
not be much of an issue if you have long-running services but
otherwise you should find a way to solve this. Again, perhaps
by using a cron
to kill off old Filebeat processes.
Summary
By backporting the upcoming external container logger module to Mesos 1.0.1 we can centralize container logs by using Filebeat and Graylog today.
The external container logger is flexible enough to replace all existing container logger modules and the one for journald which we created in the previous post. It can even be used to combine the functionality of several other container loggers.
By centralizing the container logs with Filebeat we can include environment variables provided by Mesos, Marathon and other frameworks to provide more context.
The future of logging is bright for Mesos, and we can have much of it today thanks to modules. You can get this module on GitHub in the mesos-external-container-logger repository and follow the development by Will Rouesnel in MESOS-6003.
It's not perfect yet, but I hope you find this useful.