Friday, February 26, 2010

Introducing Linux virtual containers with LXC

In the past we have looked at using OpenVZ for container virtualization on Linux. OpenVZ is great as it allows you to run compartmentalized “servers” within an operating system so you can separate systems, much like running virtual machines on a host system.

With OpenVZ, you get the isolation benefits of virtualization without the overhead of running a full virtual machine for each system.

The downside of OpenVZ is that it isn’t in the mainline kernel. This means you need to run a kernel provided by the OpenVZ project.

By itself this isn’t necessarily a problem, provided you are running a supported Linux distribution and don’t mind a bit of lag behind upstream security fixes.

Like OpenVZ, Linux Containers (LXC) let you run processes inside containers that isolate them from the host operating system.

The project is part of the upstream kernel, which means that any Linux distribution using kernel 2.6.29 or later will have the kernel-level bits available, without relying on a third party to provide them.

For instance, Fedora 12 comes with the appropriate kernel and the user-space tools to use LXC.

To start using LXC, you must install the LXC user-space tools and have an appropriate kernel with LXC support enabled.

On Fedora 12, the kernel is provided and the user-space tools can be installed via:
# yum install lxc

The next step is to make sure the kernel properly supports LXC:
$ lxc-checkconfig

It will provide a list of capabilities; if every capability is listed as “enabled,” LXC is ready to be used with the kernel.
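
If the list is long, it can help to show only the entries that are not enabled. The following is a minimal sketch, assuming lxc-checkconfig prints one "Name: status" line per capability; lxc_missing_caps is a hypothetical helper name, not part of the LXC tools, and the exact output format may differ between LXC versions.

```shell
#!/bin/sh
# Hypothetical helper: read lxc-checkconfig output on stdin and print
# every capability whose status is anything other than "enabled".
# Assumption: each capability is reported as "Name: status".
lxc_missing_caps() {
    awk -F': *' '
        NF == 2 && $2 != "" {
            status = $2
            gsub(/\033\[[0-9;]*m/, "", status)   # drop ANSI colour codes, if any
            gsub(/[ \t]+$/, "", status)          # drop trailing whitespace
            if (status != "enabled") print $1 ": " status
        }'
}

# Usage (on a host with the LXC tools installed):
#   lxc-checkconfig | lxc_missing_caps
```

No output means every capability line was reported as enabled.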

Next, create and mount the control group (cgroup) filesystem that LXC uses, and add it to /etc/fstab so that it is mounted again at boot:
# mkdir /cgroup
# mount none -t cgroup /cgroup
# echo "none /cgroup cgroup defaults 0 0" >> /etc/fstab
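
Running the echo line above twice leaves a duplicate entry in /etc/fstab. The sketch below shows one way to make these steps safe to repeat. The function names has_cgroup_entry and ensure_fstab_entry are hypothetical, and the check relies on /proc/mounts and /etc/fstab sharing the same first three columns (device, mount point, type).

```shell
#!/bin/sh
# Succeed if the given table lists a cgroup filesystem at the mount point.
# $1 = mount point, $2 = table to search (defaults to /proc/mounts)
has_cgroup_entry() {
    awk -v mp="$1" '
        $2 == mp && $3 == "cgroup" { found = 1 }
        END { exit !found }' "${2:-/proc/mounts}"
}

# Append the fstab line from this article only if it is not already there.
# $1 = mount point, $2 = fstab file (defaults to /etc/fstab)
ensure_fstab_entry() {
    has_cgroup_entry "$1" "${2:-/etc/fstab}" ||
        echo "none $1 cgroup defaults 0 0" >> "${2:-/etc/fstab}"
}

# Usage, as root:
#   has_cgroup_entry /cgroup || { mkdir -p /cgroup && mount none -t cgroup /cgroup; }
#   ensure_fstab_entry /cgroup
```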

Next, you need to configure bridge networking. This can be done as root with the brctl command, part of the bridge-utils package (install this package if it is not already installed):
# brctl addbr br0
# brctl setfd br0 0
# ifconfig br0 192.168.250.52 promisc up
# brctl addif br0 eth0
# ifconfig eth0 0.0.0.0 up
# route add -net default gw 192.168.250.1 br0

This creates the bridge interface, br0, and assigns it the existing host IP address (in this case, 192.168.250.52).

You will need to do this from the local console rather than over the network, because once you bring br0 up, the network will go down until the rest of the reconfiguration is complete.

The next commands attach eth0 to the bridge and reset its IP address to 0.0.0.0. Because eth0 is now bound to the bridge interface, the host still responds on the previous IP address.

Finally, the default route is added via br0, which containers will use to reach the network.
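
Because the network is down between the first and last of these commands, it can be convenient to run them back to back from one script. The following sketch wraps the same six commands in a function. The setup_bridge name and the LXC_BRIDGE_RUN variable are assumptions made for this example; by default the function only prints the commands so you can review them first.

```shell
#!/bin/sh
# Print (or, with LXC_BRIDGE_RUN=1, execute) the bridge setup commands.
# $1 = bridge name, $2 = physical interface, $3 = host IP, $4 = gateway
setup_bridge() {
    bridge=$1 nic=$2 addr=$3 gateway=$4

    # Echo the command in dry-run mode, execute it otherwise.
    run() {
        if [ "${LXC_BRIDGE_RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
    }

    # Stop at the first failing step instead of continuing half-configured.
    run brctl addbr "$bridge" &&
    run brctl setfd "$bridge" 0 &&
    run ifconfig "$bridge" "$addr" promisc up &&
    run brctl addif "$bridge" "$nic" &&
    run ifconfig "$nic" 0.0.0.0 up &&
    run route add -net default gw "$gateway" "$bridge"
}

# Dry run with the values used in this article:
#   setup_bridge br0 eth0 192.168.250.52 192.168.250.1
# Real run, as root on the local console:
#   LXC_BRIDGE_RUN=1 setup_bridge br0 eth0 192.168.250.52 192.168.250.1
```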

Once this is done, we must create a configuration file for a new container. This is a very basic example, so create the configuration file with the following contents:
lxc.utsname = test
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 4a:49:43:49:79:bd
lxc.network.ipv4 = 192.168.250.150
lxc.network.ipv6 = 2003:db8:1:0:214:1234:fe0b:3596

Save it as /etc/lxc/lxc-test.conf or something similar. The next command will start a confined shell process:
# /usr/bin/lxc-execute -n test -f /etc/lxc/lxc-test.conf /bin/bash
[root@test lxc]# ps ax
  PID TTY      STAT   TIME COMMAND
    1 pts/1    S      0:00 /usr/libexec/lxc-init -- /bin/bash
    2 pts/1    S      0:00 /bin/bash
   20 pts/1    R+     0:00 ps ax

At this point, the confined shell can ping a remote host and can also be pinged by a remote host. It shares the host filesystem, so /etc in this container is the same as /etc on the host. As the ps output shows, however, the container has its own process table and cannot see the host's processes.
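
If you expect to create several application containers, the configuration file shown earlier can be produced by a small shell function. This is only a sketch: lxc_app_config is a hypothetical name, it emits the same keys used in this article except for the IPv6 line, which is left out for brevity, and the web1 name and addresses in the usage comment are made up.

```shell
#!/bin/sh
# Print an application container configuration on stdout.
# $1 = container name, $2 = IPv4 address, $3 = MAC address,
# $4 = bridge interface (defaults to br0)
lxc_app_config() {
    cat <<EOF
lxc.utsname = $1
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = ${4:-br0}
lxc.network.hwaddr = $3
lxc.network.ipv4 = $2
EOF
}

# Usage (hypothetical second container):
#   lxc_app_config web1 192.168.250.151 4a:49:43:49:79:be > /etc/lxc/lxc-web1.conf
```

Each container on the same bridge needs its own IP address and MAC address.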

On the host, you can use LXC tools to view the state of the container:
# lxc-info -n test
'test' is RUNNING
# lxc-ps
CONTAINER    PID TTY          TIME CMD
           13095 pts/2    00:00:00 su
           13099 pts/2    00:00:00 bash
           13134 pts/2    00:00:00 lxc-ps
           13135 pts/2    00:00:00 ps
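
To use the container state in a script, the lxc-info line can be reduced to the bare state word. The sketch below assumes the output format shown above ('name' is STATE); other LXC versions may print the state differently, so check the format on your system first. Both function names are hypothetical.

```shell
#!/bin/sh
# Read lxc-info output on stdin and print only the state word.
# Assumption: the line looks like  'test' is RUNNING
lxc_state_from_line() {
    sed -n "s/^'.*' is \([A-Z]*\)\$/\1/p"
}

# Succeed if the named container is reported as RUNNING.
container_is_running() {
    [ "$(lxc-info -n "$1" | lxc_state_from_line)" = "RUNNING" ]
}

# Usage, on the host:
#   container_is_running test && echo "test is up"
```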

The above is an example of an LXC application container. This example had fully separate networking support. You can also isolate a single application that uses the existing host network, which does not require a configuration file:
# lxc-execute -n test /bin/bash

You can also create LXC system containers that are more similar to OpenVZ containers. These mimic an entire operating system with its own file system and network address, fully separate from the host operating system.

The simplest way to create these containers is to use OpenVZ templates. Next week, we will create an LXC-based system container.

LXC is powerful, and finally Linux users have something similar to the jail feature that BSD has enjoyed for years.

While OpenVZ works great, having something immediately available from your Linux vendor makes the system easier to maintain, as all the bits are already there. LXC is not as mature as OpenVZ, but it is quite capable and under active development.

1 comment:

  1. Nice article, Sameh. Finally we have something like BSD jails and Solaris zones. I have read about jails, zones, and even LXC before, but I can't say whether LXC has all the features of the others.

    As far as I remember, I read an article about LXC on IBM developerWorks some time ago. I will try to find it ... yes, Firefox remembers it ;).
    http://www.ibm.com/developerworks/linux/library/l-lxc-containers/index.html?S_TACT=105AGY83&S_CMP=TWDW

    Thanks & Best Regards
