Monday, September 7, 2015

Concerning Containers' Connections: on Docker Networking by Federico Kereki

http://www.linuxjournal.com/content/concerning-containers-connections-docker-networking

Containers can be considered the third wave in service provision after physical boxes (the first wave) and virtual machines (the second wave). Instead of working with complete servers (hardware or virtual), you have virtual operating systems, which are far more lightweight. Instead of carrying around complete environments, you just move applications, with their configuration, from one server to another, where they consume the host's resources directly, without any virtualization layer in between. Shipping projects from development to operations also is simplified, which is another boon. Of course, you'll face new and different challenges, as with any technology, but the possible risks and problems don't seem to be insurmountable, and the final rewards appear to be great.
Docker is an open-source project based on Linux containers that is showing high rates of adoption. Docker's first release was only a couple years ago, so the technology isn't yet considered mature, but it shows much promise. The combination of lower costs, simpler deployment and faster start times certainly helps.
In this article, I go over some details of setting up a system based on several independent containers, each providing a distinct, separate role, and I explain some aspects of the underlying network configuration. You can't think about production deployment without being aware of how connections are made, how ports are used and how bridges and routing are set up, so I examine those points as well, while putting a simple Web database query application in place.

Basic Container Networking

Let's start by considering how Docker configures network aspects. When the Docker service dæmon starts, it configures a virtual bridge, docker0, on the host system (Figure 1). Docker picks a subnet not in use on the host and assigns a free IP address to the bridge. The first try is 172.17.42.1/16, but that could be different if there are conflicts. This virtual bridge handles all communication between the host and its containers.
When Docker starts a container, by default, it creates a virtual interface on the host with a unique name, such as veth220960a, and an address within the same subnet. This new interface will be connected to the eth0 interface on the container itself. In order to allow connections, iptables rules are added, using a DOCKER-named chain. Network address translation (NAT) is used to forward traffic to external hosts, and the host machine must be set up to forward IP packets.
Figure 1. Docker uses a bridge to connect all containers on the same host to the local network.
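To make the addressing concrete, here is a minimal shell sketch that checks whether an address falls inside the bridge's subnet. The helpers ip_to_int and in_subnet are hypothetical names made up for this illustration (they are not part of Docker), and the example assumes the default 172.17.42.1/16 bridge address mentioned above and plain decimal octets without leading zeros.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of Docker.

# Convert a dotted-quad IPv4 address to an integer.
# The subshell body "( ... )" keeps the IFS change local to the function.
ip_to_int() (
    IFS=.
    set -- $1
    echo "$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))"
)

# Succeed if address $1 lies inside the CIDR block $2 (for example 172.17.42.1/16).
in_subnet() (
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$net") & $mask )) ]
)

# A container address handed out by Docker should sit on the bridge's subnet.
in_subnet 172.17.0.2 172.17.42.1/16 && echo "172.17.0.2 is on the docker0 subnet"
```

On a real host, the subnet to test against is whatever address docker0 actually received, which may differ from the default if there was a conflict.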
The standard way to connect a container is in "bridged" mode, as described previously. However, for special cases, there are more ways to do this, which depend on the --net option for the docker run command. Here's a list of all available modes:
  • --net=bridge — The new container uses a bridge to connect to the rest of the network. Only its exported public ports will be accessible from the outside.
  • --net=container:ANOTHER.ONE — The new container will use the network stack of a previously defined container. It will share its IP address and port numbers.
  • --net=host — This is a dangerous option. Docker won't separate the container's network from the host's. The new container will have full access to the host's network stack. This can cause problems and security risks!
  • --net=none — Docker won't configure the container network at all. If you want, you can set up your own iptables rules (see Resources if you're interested in this). Even without the network, the container could contact the world by shared directories, for example.
Docker also sets up each container so it will have DNS resolution information. Run findmnt inside a container to produce something along the lines of Listing 1. By default, Docker uses the host's /etc/resolv.conf data for DNS resolution. You can use different nameservers and search lists with the --dns and --dns-search options.

Listing 1. The last three lines show Docker's special mount trick, so containers get information from Docker-managed host files.


root@4de393bdbd36:/var/www/html# findmnt -o TARGET,SOURCE
TARGET                  SOURCE
/                       /dev/mapper/docker-8:2-25824189-4de...822[/rootfs]
|-/proc                 proc
| |-/proc/sys           proc[/sys]
| |-/proc/sysrq-trigger proc[/sysrq-trigger]
| |-/proc/irq           proc[/irq]
| |-/proc/bus           proc[/bus]
| `-/proc/kcore         tmpfs[/null]
|-/dev                  tmpfs
| |-/dev/shm            shm
| |-/dev/mqueue         mqueue
| |-/dev/pts            devpts
| `-/dev/console        devpts[/2]
|-/sys                  sysfs
|-/etc/resolv.conf      /dev/sda2[/var/lib/docker/containers/4de...822/resolv.conf]
|-/etc/hostname         /dev/sda2[/var/lib/docker/containers/4de...822/hostname]
`-/etc/hosts            /dev/sda2[/var/lib/docker/containers/4de...822/hosts]
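Because /etc/resolv.conf inside the container is one of those Docker-managed files, one way to confirm which DNS settings a container received is simply to read the file back. The sketch below is only an illustration: list_nameservers and list_search_domains are hypothetical helper names, and the file path defaults to /etc/resolv.conf.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the nameserver addresses found in a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Print the DNS search domains, one per line.
list_search_domains() {
    awk '$1 == "search" { for (i = 2; i <= NF; i++) print $i }' "${1:-/etc/resolv.conf}"
}
```

Run inside a container started with, say, --dns 8.8.8.8 --dns-search example.com (illustrative values), list_nameservers should print the address passed to --dns rather than the host's own resolvers.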
Now that you have an idea about how Docker sets up networking for individual containers, let's develop a small system that will be deployed via containers and then finish by working out how to connect all the pieces together.

Designing Your Application: the World Database

Let's say you need an application that will let you search for cities that include a given text string in their names. (Figure 2 shows a sample run.) For this example, I used the geographical information at GeoNames (see Resources) to create an appropriate database. Basically, you work with countries (identified by their ISO 3166-1 two-letter codes, such as "UY" for "Uruguay") and cities (with a name, a pair of coordinates and the country to which they belong). Users will be able to enter part of the city name and get all the matching cities (not very complex).
Figure 2. This sample application finds these cities with DARWIN in their names.
How should you design your mini-system? Docker is meant to package single applications, so in order to take advantage of containers, you'll run separate containers for each required role. (This doesn't necessarily imply that only a single process may run on a container. A container should fulfill a single, definite role, and if that implies running two or more programs, that's fine. With this very simple example, you'll have a single process per container, but that need not be the general case.)
You'll need a Web server, which will run in a container, and a database server, in a separate container. The Web server will access the database server, and end users will need connections to the Web server, so you'll have to set up those network connections.
Start by creating the database container; there's no need to start from scratch. You can work with the official MySQL Docker image (see Resources) and save a bit of time. The Dockerfile that produces the image can specify how to download the required geographical data. The RUN commands download that data and set up a loaddata.sh script that loads it into the database. (For purists: a single longer RUN command would have sufficed, but I used three here for clarity.) See Listing 2 for the complete Dockerfile; it should reside in an otherwise empty directory. Building the worlddb image itself can be done from that directory with the sudo docker build -t worlddb . command.

Listing 2. The Dockerfile to create the database server also pulls down the needed geographical data.


FROM mysql:latest
MAINTAINER Federico Kereki fkereki@gmail.com

RUN     apt-get update && \
        apt-get -q -y install wget unzip && \
        wget 'http://download.geonames.org/export/dump/countryInfo.txt' && \
        grep -v '^#' countryInfo.txt >countries.txt && \
        rm countryInfo.txt && \
        wget 'http://download.geonames.org/export/dump/cities1000.zip' && \
        unzip cities1000.zip && \
        rm cities1000.zip

RUN     echo "\
        CREATE DATABASE IF NOT EXISTS world;    \
        USE world;                              \
        DROP TABLE IF EXISTS countries;         \
        CREATE TABLE countries (                \
                id CHAR(2),                     \
                ignore1 CHAR(3),                \
                ignore2 CHAR(3),                \
                ignore3 CHAR(2),                \
                name VARCHAR(50),               \
                capital VARCHAR(50),            \
                PRIMARY KEY (id));              \
        LOAD DATA LOCAL INFILE 'countries.txt'  \
                INTO TABLE countries            \
                FIELDS TERMINATED BY '\t';      \
        DROP TABLE IF EXISTS cities;            \
        CREATE TABLE cities (                   \
                id NUMERIC(8),                  \
                name VARCHAR(200),              \
                asciiname VARCHAR(200),         \
                alternatenames TEXT,            \
                latitude NUMERIC(10,5),         \
                longitude NUMERIC(10,5),        \
                ignore1 CHAR(1),                \
                ignore2 VARCHAR(10),            \
                country CHAR(2));               \
        LOAD DATA LOCAL INFILE 'cities1000.txt' \
                INTO TABLE cities               \
                FIELDS TERMINATED BY '\t';      \
        " > mydbcommands.sql

RUN     echo "#!/bin/bash \n                    \
        mysql -h localhost -u root -p\$MYSQL_ROOT_PASSWORD <mydbcommands.sql" >loaddata.sh && \
        chmod +x loaddata.sh
The sudo docker images command verifies that the image was created. After you create a container based on it, you'll be able to initialize the database with the ./loaddata.sh command.

Searching for Data: Your Web Site

Now let's work on the other part of the system. You can take advantage of the official PHP Docker image, which also includes Apache. All you need is to add the php5-mysql extension to be able to connect to the database server. The Dockerfile (Listing 3) should be in a new directory, along with search.php, which holds the complete code for this "system". Building this image, which you'll name "worldweb", requires the sudo docker build -t worldweb . command.

Listing 3. The Dockerfile to create the Apache Web server is even simpler than the database one.


FROM php:5.6-apache
MAINTAINER Federico Kereki fkereki@gmail.com

COPY search.php /var/www/html/

RUN     apt-get update && \
        apt-get -q -y install php5-mysql && \
        docker-php-ext-install mysqli
The search application search.php is simple (Listing 4). It draws a basic form with a single text box at the top, plus a "Go!" button to run a search. The results of the search are shown just below that in a table. The process is easy too—you access the database server to run a search and output a table with a row for each found city.

Listing 4. The whole system consists of only a single search.php file.




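<!-- Reconstructed listing: the HTML tags, the mysqli_connect call and the SQL
     query were lost when this page was copied. Parts marked "assumed" are
     inferred from the article text and may differ from the author's code. -->
<html>
<head><title>Cities Search</title></head>
<body>
<h3>Cities Search</h3>
<form action="search.php">
Search for: <input type="text" name="searchFor"
  value="<?php echo isset($_REQUEST["searchFor"]) ? htmlspecialchars($_REQUEST["searchFor"]) : ""; ?>">
<input type="submit" value="Go!">
</form>
<?php
if (!empty($_REQUEST["searchFor"])) {
  try {
    // Assumed: the MYDB alias and root password come from Listing 5,
    // and the "world" database from Listing 2.
    $conn = mysqli_connect("MYDB", "root", "ljdocker", "world");
    // Assumed query, chosen to match the Country/City/Lat/Long header.
    $query = "SELECT countries.name, cities.name, cities.latitude, cities.longitude ".
             "FROM cities JOIN countries ON cities.country = countries.id ".
             "WHERE cities.name LIKE ? ORDER BY 1, 2";
    $stmt = $conn->prepare($query);
    $searchFor = "%".$_REQUEST["searchFor"]."%";
    $stmt->bind_param("s", $searchFor);
    $stmt->execute();
    $result = $stmt->get_result();
    echo "<table><tr><th>Country</th><th>City</th><th>Lat</th><th>Long</th></tr>";
    foreach ($result->fetch_all(MYSQLI_NUM) as $row) {
      echo "<tr>";
      foreach($row as $data) {
        echo "<td>".$data."</td>";
      }
      echo "</tr>";
    }
    echo "</table>";
  } catch (Exception $e) {
    echo "Exception " . $e->getMessage();
  }
}
?>
</body>
</html>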
Both images are ready, so let's get your complete "system" running.

Linking Containers

Given the images that you built for this example, creating both containers is simple, but you want the Web server to be able to reach the database server. The easiest way is by linking the containers together. First, you start and initialize the database container (Listing 5).

Listing 5. The database container must be started first and then initialized.


# su -
# docker run -it -d -e MYSQL_ROOT_PASSWORD=ljdocker 
 ↪--name MYDB worlddb
fbd930169f26fce189a9d6020861eb136643fdc9ee73a4e1f114e0bfd0fe6a5c
# docker exec -it MYDB bash
root@fbd930169f26:/# dir
bin   cities1000.txt  dev    etc   lib    
 ↪loaddata.sh  mnt   opt   root  sbin     
 ↪srv  tmp  var
boot  countries.txt   entrypoint.sh  home  lib64  media 
 ↪mydbcommands.sql  proc  run   selinux  sys  usr
root@fbd930169f26:/# ./loaddata.sh
Warning: Using a password on the command line interface 
 ↪can be insecure.
root@fbd930169f26:/# exit
Now, start the Web container, with docker run -it -d -p 80:80 --link MYDB:MYDB --name MYWEB worldweb. This command has a couple interesting options:
  • -p 80:80 — This means that port 80 (the standard HTTP port) from the container will be published as port 80 on the host machine itself.
  • --link MYDB:MYDB — This means that the MYDB container (which you started earlier) will be accessible from the MYWEB container, also under the alias MYDB. (Using the database container name as the alias is logical, but not mandatory.) The MYDB container won't be visible from the network, just from MYWEB.
In the MYWEB container, /etc/hosts includes an entry for each linked container (Listing 6). Now you can see how search.php connects to the database. It refers to it by the name given when linking containers (see the mysqli_connect call in Listing 4). In this example, MYDB is running at IP 172.17.0.2, and MYWEB is at 172.17.0.3.

Listing 6. Linking containers in the same server is done via /etc/hosts entries.


# su -
# docker exec -it MYWEB bash
root@fbff94177fc7:/var/www/html# cat /etc/hosts
172.17.0.3     fbff94177fc7
127.0.0.1      localhost
...
172.17.0.2     MYDB

root@fbff94177fc7:/var/www/html# export
declare -x MYDB_PORT="tcp://172.17.0.2:3306"
declare -x MYDB_PORT_3306_TCP="tcp://172.17.0.2:3306"
declare -x MYDB_PORT_3306_TCP_ADDR="172.17.0.2"
declare -x MYDB_PORT_3306_TCP_PORT="3306"
declare -x MYDB_PORT_3306_TCP_PROTO="tcp"
...
The environment variables basically provide all the connection data for each linkage: what container it links to, using which port and protocol, and how to access each exported port from the destination container. In this case, the MySQL container just exports the standard 3306 port and uses TCP to connect. There's just a single problem with some of these variables. Should you happen to restart the MYDB container, Docker won't update them (although it would update the /etc/hosts information), so you must be careful if you use them!
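The two sources of connection data described above can be read with a few lines of shell. This is a minimal sketch under stated assumptions: hosts_lookup and parse_link_url are hypothetical helper names invented for the example, and the sample values are the ones shown in Listing 6. Because the environment variables can go stale after a restart, the hosts-file lookup is the safer of the two.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not provided by Docker.

# Look up the IP address for a host name or alias in a hosts-format file.
# Prints nothing if the name is not found.
hosts_lookup() {
    awk -v name="$1" '
        $1 ~ /^#/ { next }   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "${2:-/etc/hosts}"
}

# Split a link URL such as tcp://172.17.0.2:3306 into "proto host port".
# The subshell body keeps the temporary variables local.
parse_link_url() (
    proto=${1%%://*}
    rest=${1#*://}
    echo "$proto ${rest%:*} ${rest##*:}"
)

# Example with the value from Listing 6 (may be stale after a MYDB restart).
parse_link_url "tcp://172.17.0.2:3306"
```

Inside MYWEB, hosts_lookup MYDB reads the current /etc/hosts entry, whereas parse_link_url "$MYDB_PORT" only reflects the state at the time the container was created.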
Examining the iptables configuration, you'll find a new DOCKER chain (Listing 7). Port 80 on the host machine is connected to port 80 (http) in the MYWEB container, and there's a connection for port 3306 (mysql) linking MYWEB to MYDB.

Listing 7. Docker adds iptables rules to link containers' ports.


# sudo iptables --list DOCKER
Chain DOCKER (1 references)
target     prot opt source       destination
ACCEPT     tcp  --  anywhere     172.17.0.3   tcp dpt:http
ACCEPT     tcp  --  172.17.0.3   172.17.0.2   tcp dpt:mysql
ACCEPT     tcp  --  172.17.0.2   172.17.0.3   tcp spt:mysql
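When a host runs many containers, the DOCKER chain grows quickly, so it can help to filter it by container address. The following is only a sketch: rules_for_ip is a hypothetical helper name, and it assumes the five-column layout shown in Listing 7 (target, prot, opt, source, destination) with numeric addresses.

```shell
#!/bin/sh
# Hypothetical helper for illustration only.

# Read "iptables --list" output on stdin and print only the rules whose
# source (column 4) or destination (column 5) is the given address.
rules_for_ip() {
    awk -v ip="$1" '$4 == ip || $5 == ip'
}
```

As root, it could be used as iptables --list DOCKER -n | rules_for_ip 172.17.0.2, where -n asks iptables for numeric addresses and ports instead of resolved names.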
If you need to have circular links (container A links to container B, and vice versa), you are out of luck with standard Docker links, because you can't link to a non-running container! You might want to look into docker-dns (see Resources), which can create DNS records dynamically based upon running containers. (And in fact, you'll be using DNS later in this example when you set up containers in separate hosts.) Another possibility would imply creating a third container, C, to which both A and B would link, and through which they would be interconnected. You also could look into orchestration packages and service registration/discovery packages. Docker is still evolving in these areas, and new solutions may be available at any time.
You just saw how to link containers together, but there's a catch with this. It works only with containers on the same host, not on separate hosts. People are working on fixing this restriction, but there's an appropriate solution that can be used for now.

Weaving Remote Containers Together

If you had containers running on different servers, both local and remote ones, you could set up everything so the containers eventually could connect with each other, but it would be a lot of work and a complex configuration as well. Weave (currently on version 0.9.0, but quickly evolving; see Resources to get the latest version) lets you define a virtual network, so that containers can connect to each other transparently (optionally using encryption for added security), as if they were all on the same server. Weave behaves as a sort of giant switch, with all your containers connected in the same virtual network. An instance must run on each host to do the routing work.
Locally, on the server where it runs, a Weave router establishes a network bridge, prosaically named weave. It also adds virtual Ethernet connections from each container and from the Weave router itself to the bridge. Every time a local container needs to contact a remote one, packets are forwarded (possibly with "multi-hop" routing) to other Weave routers, until they are delivered by the (remote) Weave router to the remote container. Local traffic isn't affected; this forwarding applies only to remote containers (Figure 3).
Figure 3. Weave adds several virtual devices to redirect some of the traffic eventually to other servers.
Building a network out of containers is a matter of launching Weave on each server and then starting the containers. (Okay, there is a missing step here; I'll get to that soon.) First, launch Weave on each server with sudo weave launch. If you plan to connect containers across untrusted networks, add a password (obviously, the same for all Weave instances) by adding the -password some.secret.password option. If all your servers are within a secure network, you can do without that. See the sidebar for a list of all the available weave command-line options.

weave Command-Line Options

  • weave attach — Attach a previously started running Docker container to a Weave instance.
  • weave connect — Connect the local Weave instance to another one to add it into its network.
  • weave detach — Detach a Docker container from a Weave instance.
  • weave expose — Integrate the Weave network with a host's network.
  • weave hide — Revert a previous expose command.
  • weave launch — Start a local Weave router instance; you may specify a password to encrypt communications.
  • weave launch-dns — Start a local DNS server to connect Weave instances on distinct servers.
  • weave ps — List all running Docker containers attached to a Weave instance.
  • weave reset — Stop the running Weave instance and remove all of its network-related stuff.
  • weave run — Launch a Docker container.
  • weave setup — Download everything Weave needs to run.
  • weave start — Start a stopped Weave instance, re-engaging it to the Weave topology.
  • weave status — Provide data on the running Weave instance, including encryption, peers, routes and more.
  • weave stop — Stop a running Weave instance, disengaging it from the Weave topology.
  • weave stop-dns — Stop a running Weave DNS service.
  • weave version — List the versions of the running Weave components; today (April 2015) it would be 0.9.0.
When you connect two Weave routers, they exchange topology information to "learn" about the rest of the network. The gathered data is used for routing decisions to avoid unnecessary packet broadcasts. To detect possible changes and to work around any network problems that might pop up, Weave routers routinely monitor connections. To connect two routers, on a server, type the weave connect the.ip.of.another.server command. (To drop a Weave router, do weave forget ip.of.the.dropped.host.) Whenever you add a new Weave router to an existing network, you don't need to connect it to every previous router. All you need to do is provide it with the address of a single existing Weave instance in the same network, and from that point on, it will gather all topology information on its own. The rest of the routers similarly will update their own information in the process.
Let's start Docker containers, attached to Weave routers. The containers themselves run as before; the only difference is they are started through Weave. Local network connections work as before, but connections to remote containers are managed by Weave, which encapsulates (and encrypts) traffic and sends it to a remote Weave instance. (This uses port 6783, which must be open and accessible on all servers running Weave.) Although I won't go into this here, for more complex applications, you could have several independent subnets, so containers for the same application would be able to talk among themselves, but not with containers for other applications.
First, decide which (unused) subnet you'll use, and assign a different IP on it to each container. Then, you can weave run each container to launch it through Docker, setting up all needed network connections. However, here you'll hit a snag, which has to do with the missing step I mentioned earlier. How will containers on different hosts connect to each other? Docker's --link option works only within a host, and it won't work if you try to link to containers on other hosts. Of course, you might work with IPs, but maintenance for that setup would be a chore. The best solution is using DNS, and Weave already includes an appropriate package, WeaveDNS.
WeaveDNS (a Docker container on its own) runs over a Weave network. A WeaveDNS instance must run on each server on the network, with the weave launch-dns command. You must use a different, unused subnet for WeaveDNS and assign a distinct IP within it to each instance. Then, when starting a Docker container, add a --with-dns option, so DNS information will be available. You should give containers a hostname in the .weave.local domain, which will be entered automatically into the WeaveDNS registers. A complete network will look like Figure 4.
Figure 4. Using Weave, containers in local and remote networks connect to each other transparently; access is simplified with Weave DNS.
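The bookkeeping described above (one distinct address per container on the chosen subnet, plus a hostname in the .weave.local domain) is easy to get wrong by hand. The sketch below only prints the commands instead of running them; plan_weave_run is a hypothetical helper name, the /24 prefix length and numbering from 1 are assumptions, and the image name and any other docker options still have to be appended to each line.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; it prints commands, it does not run them.

# Usage: plan_weave_run <first three octets> <container name>...
# Assigns consecutive addresses in a /24 and a matching .weave.local hostname.
plan_weave_run() (
    prefix=$1
    shift
    n=1
    for name in "$@"; do
        echo "weave run --with-dns $prefix.$n/24 -h $name.weave.local --name $name"
        n=$(($n + 1))
    done
)

# Example: the two containers used in this article.
plan_weave_run 10.22.9 MYDB MYWEB
```

Printing the plan first makes it easy to check that no two containers share an address before anything is launched on either host.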
Now, let's get your mini-system to run. I'm going to cheat a little, and instead of a remote server, I'll use a virtual machine for this example. My main box (at 192.168.1.200) runs OpenSUSE 13.2, while the virtual machine (at 192.168.1.108) runs Linux Mint 17, just for variety. Despite the different distributions, Docker containers will work just the same, which shows how portable they are (Listing 8).

Listing 8. Getting the Weave network to run on two servers.


> # At 192.168.1.200 (OpenSUSE 13.2 server)
> su -
$ weave launch
$ weave launch-dns 10.10.10.1/24
$ C=$(weave run --with-dns 10.22.9.1/24 -it -d -e 
 ↪MYSQL_ROOT_PASSWORD=ljdocker -h MYDB.weave.local --name MYDB worlddb)
$ # You can now enter MYDB with "docker exec -it $C bash"

> # At 192.168.1.108 (Linux Mint virtual machine)
> su -
$ weave launch
$ weave launch-dns 10.10.10.2/24
$ weave connect 192.168.1.200
$ D=$(weave run --with-dns 10.22.9.2/24 -it -d -p 80:80 -h 
 ↪MYWEB.weave.local --name MYWEB worldweb)
The resulting configuration is shown in Figure 5. There are two hosts, on 192.168.1.200 and 192.168.1.108. Although it's not shown, both have port 6783 open for Weave to work. In the first host, you'll find the MYDB MySQL container (at 10.22.9.1/24 with port 3306 open, but just on that subnet) and a WeaveDNS server at 10.10.10.1/24. In the second host, you'll find the MYWEB Apache+PHP container (at 10.22.9.2/24 with port 80 open, exported to the server) and a WeaveDNS server at 10.10.10.2/24. From the outside, only port 80 of the MYWEB container is accessible.
Figure 5. The final Docker container-based system, running on separate systems, connected by Weave.
Because port 80 on the 192.168.1.108 server is mapped directly to port 80 on the MYWEB container, you can access http://192.168.1.108/search.php and get the Web page you saw earlier (in Figure 2). Now you have a multi-host Weave network, with DNS services and remote Docker containers running as if they resided on the same host. Success!

Conclusion

Now you know how to develop a multi-container system (okay, it's not very large, but still), and you've learned some details on the internals of Docker (and Weave) networking. Docker is still maturing, and surely even better tools will appear to simplify configuration, distribution and deployment of larger and more complex applications. The current availability of networking solutions for containers shows you already can begin to invest in these technologies, although be sure to keep up with new developments to simplify your job even further.

Resources

Get Docker itself from http://www.docker.com. The actual code is at https://github.com/docker/docker.
For more detailed documentation on Docker network configuration, see https://docs.docker.com/articles/networking.
The docker-dns site is at https://www.npmjs.com/package/docker-dns, and its source code is at https://github.com/bnfinet/docker-dns.
The official MySQL Docker image is at https://registry.hub.docker.com/_/mysql. If you prefer, there also is an official repository for MariaDB (https://registry.hub.docker.com/_/mariadb). Getting it to work shouldn't be a stretch.
The Apache+PHP official Docker image is at https://registry.hub.docker.com/_/php.
Weave is at http://weave.works, and the code itself is on GitHub at https://github.com/weaveworks/weave. For more detailed information on its features, go to https://zettio.github.io/weave/features.html.
WeaveDNS is on GitHub at https://github.com/weaveworks/weave/tree/master/weavedns.
The geographical data I used for the example in this article comes from GeoNames http://www.geonames.org. In particular, I used the countries table (http://download.geonames.org/export/dump/countryInfo.txt) and the cities (with more than 1,000 inhabitants) table (http://download.geonames.org/export/dump/cities1000.zip), but there are larger and smaller sets.


6 comments:

  1. I, Federico Kereki, am the author of the original article -- I think Sameh Attia should have included my name as writer, and not publishing it under his own name, without any reference to me.

  2. Sure and I did include your original URL at the beginning of the article. If this is not enough please tell me.

  3. Thanks for your attention - I'd rather you added my name in the title: "Concerning Containers' ... by Federico Kereki", so if a reader doesn't follow the link, he would know the author; I'd appreciate that.

  4. during copy paste you missed something. The dockerfile for creating a database is not working
