If you like to stir up the way you deploy applications by using containers then you may also like to mix up the way you run an operating system on the server itself. We see a healthy split in our user base between those that just trim down their favorite OS to run containers and those that jump onto one of the new container-only operating systems, specifically CoreOS and RancherOS. My previous blog post describes how the RightScale RightLink agent can be used on traditional Linux distros to inventory containers and perform container-level monitoring as well as application-level monitoring of apps in containers. But what if you’re using CoreOS or RancherOS?
In CoreOS there are actually two options: running RightLink at the host level and running it in a container. It is not obvious which is better and, in fact, it really depends on what your goals are.
The RightScale RightLink (version 10) agent is a lightweight agent that connects servers to the RightScale platform and enables remote management of the servers as well as monitoring and other metric collection. Generally, monitoring uses the standard Linux collectd agent, but recent versions of RightLink also incorporate some basic monitoring directly and this now extends to containers.
Running RightLink at the host level is fairly straightforward thanks to the fact that it is a (mostly) statically linked executable. The next release of RightLink will include some changes and options that remove some roadblocks, such as locating config files in read-only filesystems. Running RightLink at the host level keeps a number of tasks simple. For example, adding a user account can be accomplished by running a RightScript with the typical useradd commands. Applications can be launched using either Docker commands to start containers or by installing systemd unit files that then in turn run the apps in containers and can restart them automatically on failure.
The downside of running RightLink at the host level is that RightScripts have a limited toolset available to them. For example, there is no Ruby or Python available, so scripts are pretty much restricted to bash and the small set of executables installed with the OS.
This leads to the question of whether RightLink can be productively run in a container on CoreOS, which would allow the container to include any desired tools, such as a scripting language.
RightLink In a CoreOS Container
Running RightLink in a container is simple per se; the tricky part is to figure out how the container needs to be set up so all the desired functionality of RightLink continues to work. This comes down to the following aspects:
- persisting state across RightLink restarts
- enabling RightScripts to create systemd units
- enabling RightScripts to launch containers
- enabling managed login (i.e., RightLink to update the RightScale user’s authorized_keys file)
- enabling the built-in container monitoring
- providing the desired environment for RightScripts
None of these aspects are difficult to deal with if you follow these steps.
Running RightLink in a container is most easily accomplished by creating a systemd unit that uses Docker commands to download an image with RightLink and then runs it and restarts it in the event of a failure. This takes three steps.
1. Download the RightLink container image at boot and run an install script
The following cloud-config snippet placed in the AWS instance’s userdata will download the RightLink container image and run an install script contained in the image:
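As a rough illustration of such userdata, the shell sketch below prints a CoreOS cloud-config document that defines a one-shot unit. The image name example/rightlink:latest, the /install.sh path, the unit name, and the choice of bind-mounts are all hypothetical placeholders, not the values used by the actual RightLink image.

```shell
#!/bin/sh
# Sketch: print a CoreOS cloud-config document whose one-shot unit pulls a
# RightLink image and runs the install script contained in it.
# ASSUMPTIONS: image name, script path, unit name, and mounts are hypothetical.
set -eu

RL_IMAGE="${RL_IMAGE:-example/rightlink:latest}"   # hypothetical image name

make_cloud_config() {
  # Unquoted heredoc so that $RL_IMAGE is expanded into the document.
  cat <<EOF
#cloud-config
coreos:
  units:
    - name: rightlink-install.service
      command: start
      content: |
        [Unit]
        Description=Install RightLink from a container image (sketch)
        After=docker.service
        Requires=docker.service

        [Service]
        Type=oneshot
        ExecStartPre=/usr/bin/docker pull $RL_IMAGE
        ExecStart=/usr/bin/docker run --rm -v /etc/systemd/system:/etc/systemd/system -v /run/systemd:/run/systemd $RL_IMAGE /install.sh
EOF
}

make_cloud_config
```

The output would be pasted into the instance's userdata. The temporary container is given the host's unit directory and systemd socket directory so that the install script can do its work.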
2. Create and launch the systemd unit
The install script, which runs in a temporary container, copies the systemd unit file from within the container to the host’s /etc/systemd/system directory and tells systemd to start the real RightLink container as a service:
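A minimal sketch of such an install script follows. It assumes the host's unit directory is bind-mounted into the temporary container. The function name install_unit and the SYSTEMCTL switch (which defaults to a dry run that only prints the commands) are hypothetical and introduced here for illustration.

```shell
#!/bin/sh
# Sketch of an install script meant to run inside a temporary container.
# ASSUMPTIONS: install_unit and the SYSTEMCTL dry-run switch are hypothetical;
# the host's unit directory is assumed to be bind-mounted into the container.
set -eu

SYSTEMCTL="${SYSTEMCTL:-echo systemctl}"   # dry run by default; set to "systemctl" to act

install_unit() {
  unit_src=$1    # unit file shipped inside the image
  unit_dir=$2    # host unit directory as seen from the container
  unit_name=$(basename "$unit_src")

  # Copy the unit file into the host's unit directory.
  cp "$unit_src" "$unit_dir/$unit_name"

  # Make systemd notice the new file, then enable and start the service.
  $SYSTEMCTL daemon-reload
  $SYSTEMCTL enable "$unit_name"
  $SYSTEMCTL start "$unit_name"
}
```

A call such as `SYSTEMCTL=systemctl install_unit /opt/rightlink/rightlink.service /etc/systemd/system` would perform the real installation; the source path in that call is likewise a placeholder.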
3. Watch the systemd unit run RightLink
At this point systemd takes over and starts the unit, the key lines of which are the following:
This unit file sets up a whole slew of bind-mounts (the volume
options to docker) in order to allow RightLink to persist state,
create systemd units, launch containers and more.
Persisting state across RightLink restarts and across OS reboots
allows options configured into RightLink not to be inadvertently lost
and prevents boot scripts from being erroneously re-run in the event
that RightLink crashes. RightLink stores a small amount of state in
/var/run/rightlink and a volume bind-mount can be used to persist it.
To enable RightLink to create systemd units and launch Docker
containers it needs to be able to write units to the host’s systemd
unit directory and communicate with the systemd daemon via its socket
and dbus, and with the Docker daemon via its socket. All this is enabled
using bind-mounts.
The bind-mount of the Docker socket also enables the monitoring of
containers. For host-level monitoring RightLink needs read-only access
to host filesystems such as /sys/block, which is enabled using yet
another two bind-mounts.
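The bind-mounts described above can be sketched as a unit file. The script below prints one possible layout. The image name, the container name, and the exact host paths for the systemd and dbus sockets are assumptions made for illustration and are not taken from the actual RightLink unit file.

```shell
#!/bin/sh
# Sketch: print a systemd unit that runs RightLink in a container with the
# kinds of bind-mounts discussed above.
# ASSUMPTIONS: image name, container name, and socket paths are illustrative.
set -eu

RL_IMAGE="${RL_IMAGE:-example/rightlink:latest}"   # hypothetical image name

# Purpose of each -v line below, in order:
#   1. persist RightLink state across restarts and reboots
#   2. let RightScripts write systemd units on the host
#   3. reach the systemd daemon via its socket directory (assumed path)
#   4. reach dbus (assumed path)
#   5. reach the Docker daemon, which also enables container monitoring
#   6. read-only /proc under the /host prefix for host-level monitoring
#   7. read-only /sys/block under the /host prefix (assumed target path)
rightlink_unit() {
  # "\\" in an unquoted heredoc emits a single backslash, which systemd
  # treats as a line continuation.
  cat <<EOF
[Unit]
Description=RightLink in a container (sketch)
After=docker.service
Requires=docker.service

[Service]
Restart=always
ExecStartPre=-/usr/bin/docker rm -f rightlink
ExecStart=/usr/bin/docker run --name rightlink \\
  -v /var/run/rightlink:/var/run/rightlink \\
  -v /etc/systemd/system:/etc/systemd/system \\
  -v /run/systemd:/run/systemd \\
  -v /var/run/dbus:/var/run/dbus \\
  -v /var/run/docker.sock:/var/run/docker.sock \\
  -v /proc:/host/proc:ro \\
  -v /sys/block:/host/sys/block:ro \\
  $RL_IMAGE

[Install]
WantedBy=multi-user.target
EOF
}

rightlink_unit
```

Restart=always gives the restart-on-failure behavior, and the leading "-" on ExecStartPre tells systemd to ignore the error when no stale container exists.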
Finally, some attention needs to be paid to the base image used for the RightLink container. RightLink itself only requires a shared library for DNS name resolution and TLS certificate authority certs, making almost any image usable. However, in order to run RightScripts, most users need a minimal set of tools, possibly a scripting language such as Python or Ruby. So while a busybox-based base image augmented with the TLS CA certs is certainly possible, most users are better served with a minimal Debian, Ubuntu, or CentOS base image. These base images also leave the option open to use the built-in package manager to install any additional tools that may be needed.
All this enables RightLink to run in a container under CoreOS and to manage system services and application containers, plus perform simple host-level and container monitoring.
Collectd In a CoreOS Container
Running collectd on CoreOS turns out to be slightly more challenging than running RightLink itself, especially if one wants to use it for host-level monitoring as well as for application-level monitoring.
First of all, why not run collectd at the host level instead of trying to fit it into a container? The reason is that collectd’s internal architecture makes heavy use of shared libraries, many of which are not installed in CoreOS. The shared libraries are required for many highly desirable plugins, so making do without is not an attractive option. The upshot is that collectd pretty much has to run in a container that can provide all the required dependencies.
Unfortunately it is not possible to configure a standard binary collectd
package to run in a container because certain paths, in particular
/proc, are hard-coded in the source. While bind-mounting the host’s /proc
is required under any scenario, it is not practical to bind-mount the
host’s /proc onto the container’s /proc. Instead some other
directory needs to be used, and a convention seems to be emerging to use
a /host prefix for such purposes, i.e., to bind-mount the host’s /proc onto
/host/proc as shown above in the systemd unit for RightLink itself.
To run in a container, collectd must thus be built from
slightly modified sources in order to refer to
/host/proc. A Dockerfile that does this is available.
It starts from a standard Debian image, installs the collectd dependencies
(such as libraries to connect to MySQL, Postgres, etc.), downloads the
collectd sources, makes the
/host/proc modification and compiles collectd.
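To give a feel for the kind of source modification involved, here is a hedged sketch that rewrites hard-coded "/proc string literals in C files. The sed pattern and the helper name patch_proc_paths are assumptions for illustration; the modification actually applied in the collectd image may be done differently.

```shell
#!/bin/sh
# Sketch: rewrite hard-coded "/proc string literals in C sources so that they
# point at "/host/proc instead.
# ASSUMPTIONS: the sed pattern and the helper name are illustrative only.
set -eu

patch_proc_paths() {
  src_dir=$1
  # Match only string literals that begin with "/proc, so identifiers that
  # merely contain "proc" are left alone. Already patched literals start with
  # "/host and therefore no longer match, which makes a second run harmless.
  find "$src_dir" -type f -name '*.c' | while IFS= read -r f; do
    sed 's|"/proc|"/host/proc|g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
  done
}
```

In a Dockerfile this would run as a build step between downloading the sources and compiling them, for example `patch_proc_paths ./src`, where the directory name is again a placeholder.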
The next issue that needs solving is the configuration of
collectd. The approach taken here is to add a standard configuration in
/etc/collectd that is structured such that additional files can
be added to enable new plugins. While this sounds simple, there is an
additional wrinkle, which is that collectd must be restarted after any
config changes: There is no HUP signal or similar to make it re-read
the config. This means that either collectd runs under some monitoring
process inside the container and can be restarted that way or it runs
as pid 1 (i.e., top process) in the container and then the container
must be restarted. Since the container will already run as a systemd
unit it seemed simpler to avoid another level of monitoring inside the
container and run just collectd as pid 1. But this now means that the
configuration needs to be persisted across container restarts, which
requires a bind-mount, for example, from the host’s /etc/collectd into
the container. Because it’s also desirable to preserve the configuration
across host reboots, this bind-mount is most likely required regardless
of the approach.
All this means that in order to make a configuration change, the
files in the host’s
/etc/collectd directory need to be updated and
then systemd needs to be signaled to restart the collectd service,
which causes a container restart. A RightScript can be used
to create the systemd unit file and start collectd.
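The update-then-restart workflow can be sketched as a small helper. The unit name collectd.service, the one-file-per-plugin .conf naming, and the SYSTEMCTL dry-run switch are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: add a collectd plugin config on the host, then restart the service.
# ASSUMPTIONS: unit name, ".conf" naming, and the dry-run switch are hypothetical.
set -eu

COLLECTD_DIR="${COLLECTD_DIR:-/etc/collectd}"   # host directory bind-mounted into the container
SYSTEMCTL="${SYSTEMCTL:-echo systemctl}"        # dry run by default; set to "systemctl" to act

enable_plugin() {
  name=$1
  # The plugin configuration is read from stdin and stored as its own file.
  cat > "$COLLECTD_DIR/$name.conf"
  # collectd does not re-read its config, so the unit (and with it the
  # container) has to be restarted.
  $SYSTEMCTL restart collectd.service
}
```

For example, `printf 'LoadPlugin load\n' | enable_plugin load` would write load.conf and print the restart command that a real run would execute.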
Monitoring Applications In Containers with Collectd
The final hurdle toward a really powerful monitoring solution is
to perform application-level monitoring of apps running in their own
containers. As an example, the MySQL plugin built into collectd expects to
connect to MySQL’s port 3306 in order to issue a series of status
queries and then parse the results. The trick with containers is to
figure out how to establish that connection. The standard method when
launching multiple containers is to use the container linking feature
of Docker (i.e., the
--link flag to docker run) but that doesn’t work
here because collectd would have to be relaunched with a different
flag every time a MySQL container (or any other monitored container)
is launched or relaunched.
A more static solution is to map the MySQL port to a persistent port, which insulates its clients (collectd among them) from changes due to container restarts. The default practice would be to launch MySQL to listen on localhost port 3306 but that doesn’t work with Docker because each container has its own localhost loopback interface and cannot get to the host’s loopback interface. In other words, there are many localhosts! Having MySQL listen on the host’s public interface may be acceptable in some cases but not all, so a different solution is needed.
The emerging localhost replacement is to use the host’s interface on
the Docker bridge. Recall that Docker by default creates a local bridge
network on which each container receives an interface with its own IP
address. That’s what all the 172.17.x.x IP addresses you see for your
containers refer to. The host itself also has an interface on that network
and it is in fact the default gateway for all the containers. This means
that all containers can reach the host’s interface and it turns out that
the docker run
-p option can be used to map a container’s port to that
interface. Thus if the MySQL container’s port 3306 is mapped to the
host’s 172.17.x.x interface’s port 3306 then all other containers,
including collectd, can easily reach it and MySQL container restarts
don’t require the clients to discover any new port.
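The port mapping described above can be sketched with two small helpers. The sketch assumes the bridge interface is named docker0 and that the ip command prints addresses in its usual "inet ADDR/PREFIX" form. The helper names bridge_ip and publish_arg are hypothetical.

```shell
#!/bin/sh
# Sketch: find the host's address on the Docker bridge and build the
# "docker run -p" argument that publishes a container port on that address.
# ASSUMPTIONS: bridge named docker0, usual "ip" output format, helper names.
set -eu

# Read "ip -4 addr show" output on stdin and print the first IPv4 address.
bridge_ip() {
  awk '$1 == "inet" { split($2, a, "/"); print a[1]; exit }'
}

# Build HOST_IP:HOST_PORT:CONTAINER_PORT, using the same port on both sides.
publish_arg() {
  printf '%s:%s:%s\n' "$1" "$2" "$2"
}
```

Typical use would be `addr=$(ip -4 addr show docker0 | bridge_ip)` followed by `docker run -d -p "$(publish_arg "$addr" 3306)" mysql`. Other containers can then reach MySQL at that address on port 3306 regardless of how often the MySQL container is restarted.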
This configuration is shown in the diagram below, with port 88 on the host’s interface forwarded to RightLink and port 3306 forwarded to MySQL.
The use of the host’s interface as a replacement for the inaccessible localhost solves a number of issues and enables fairly simple linking between containers through restarts. It is certainly not the only option, especially if container cluster management software is used. But for this post, it provides a simple to understand solution that you can customize or morph into a more dynamic solution if you prefer.
The final piece of the puzzle then is a RightScript that pulls down a MySQL container image, launches it, and configures collectd with the appropriate MySQL plugin to show all the MySQL metrics. A simple form of this script is:
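A minimal sketch of such a script might look like the following. The bridge address 172.17.42.1, the image name, the credentials, the config file location, and the RUN switch (which defaults to printing commands instead of running them) are all hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: launch MySQL published on the Docker bridge and point collectd at it.
# ASSUMPTIONS: bridge address, image, credentials, file locations, and the
# RUN dry-run switch are hypothetical placeholders.
set -eu

BRIDGE_IP="${BRIDGE_IP:-172.17.42.1}"           # host address on the Docker bridge (assumed)
COLLECTD_DIR="${COLLECTD_DIR:-/etc/collectd}"   # host config directory (assumed layout)
RUN="${RUN:-echo}"                              # dry run: print commands; set RUN= to execute

# collectd MySQL plugin config; with no Port given the default 3306 is used.
mysql_plugin_conf() {
  cat <<EOF
LoadPlugin mysql
<Plugin mysql>
  <Database "app">
    Host "$BRIDGE_IP"
    User "collectd"
    Password "change-me"
  </Database>
</Plugin>
EOF
}

launch_and_monitor_mysql() {
  $RUN docker pull mysql
  # Publish MySQL on the host's bridge interface so collectd can reach it.
  $RUN docker run -d --name mysql -e MYSQL_ROOT_PASSWORD=change-me \
    -p "$BRIDGE_IP:3306:3306" mysql
  mysql_plugin_conf > "$COLLECTD_DIR/mysql.conf"
  # collectd only picks up the new plugin after a restart.
  $RUN systemctl restart collectd.service
}
```

A real script would also need to create the MySQL user that collectd logs in with, which is omitted here.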
This RightScript really should create a systemd unit to run and restart MySQL, but that, in itself, is the subject for another blog post.
Putting It All Together
All the RightLink, collectd and MySQL pieces can be seen working together in the RL10 MySQL CoreOS Container ServerTemplate. A slightly simpler RL10.2.docker1 CoreOS Container ServerTemplate without the MySQL pieces is also available. Using the first ServerTemplate, a CoreOS image can be launched in AWS, which at boot time runs RightLink in a container, then starts executing the RightScripts that run collectd in another container, run MySQL, and configure collectd to monitor MySQL. In the end, the RightScale Cloud Management dashboard shows the information about the containers running on the server (SHAs shortened here):
and the images they use:
and it shows the MySQL monitoring graphs:
This ServerTemplate is a sample starting point from which you can further customize; you would want to customize the configuration of MySQL in particular. Also, tools such as fleet or Docker Compose may come into play in order to deploy application containers on servers. The setup described here tries to be as simple and generic as possible so you know what all the pieces of the puzzle are and can then adapt them to your own deployment methodology.
In the end you are left with the question of whether to run RightLink at the host level or in a container as shown in this blog post. This question is not unique to RightLink but presents itself for most system daemons. Which option you pick depends on what you expect RightScripts to do. If you mostly perform host-level operations, such as managing user accounts, mounting disk volumes, or tweaking network configuration, then it is easier to run RightLink at the host level. If you mostly perform application-level operations, such as installing, configuring, and launching applications, or if your RightScripts make use of scripting languages (Python, Ruby, Perl, …), then running RightLink in a container is the way to go. In addition, you may have policies or comfort levels that make you want to run as much as possible in a container. Fortunately we give you the choice!
Coming up next will be a post on running RightLink in RancherOS: Stay tuned!