Monday, July 28, 2014

Building a better Internet, one Tor relay at a time

Everybody’s talking about privacy and anonymity on the Internet these days, and many people are concerned with their apparent demise. Understandably so, considering the torrent of revelations we’ve been getting for over a year now, all about the beliefs and practices of (in)famous three-letter agencies.
We’re not about to reiterate the valid concerns of privacy- and anonymity-minded people. Instead, we are going to demonstrate how one can make a small but significant contribution towards an Internet where anonymity is an everyday practical option and not an elusive goal. (If you’re also interested in privacy, then maybe it’s time to set up your very own OpenVPN server.)
You’ve probably heard about Tor. Technically speaking, it is a global mesh of nodes, also known as relays, which encrypt and bounce traffic between client computers and servers on the Internet. That encryption and bouncing of traffic is done in such a way that it is practically impossible to know who visited a web site or used a network service in general. To put it simply, anytime I choose to surf the web using Tor, it’s impossible for the administrators of the sites I visit to know my real IP address. Even if they get subpoenaed, they are simply unable to provide the real addresses of the clients who reached them through Tor.
If you care about your anonymity, or you’re just curious about Tor, you can easily experience it by downloading the official, freely available Tor Browser Bundle. The effectiveness of Tor relies on the network of those aforementioned relays: the more active relays participate in the Tor network, the stronger the anonymity Tor clients enjoy.
It is relatively easy to contribute to the strengthening of the Tor network. All you really need is an active Internet connection and a box/VM/VPS — or even a cheap computer like the Raspberry Pi. In the remainder of this post we demonstrate how one can set up a Tor relay on a VPS running Ubuntu Server 14.04 LTS (Trusty Tahr). You may follow our example to the letter, or install Tor on a different kind of host, possibly running some other Linux distribution, a flavor of BSD, OS X or even Windows.


We SSH into our Ubuntu VPS, gain access to the root account, and add the always up-to-date Tor project repository to the system:
$ sudo su
# echo "deb http://deb.torproject.org/torproject.org trusty main" >> /etc/apt/sources.list
If you’re not running the latest version of Ubuntu Server (14.04 LTS, at the time of this writing), then you should replace “trusty” with the corresponding codename. One way to find the codename of your particular Ubuntu version is with the help of the lsb_release utility:
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 14.04 LTS
Release:        14.04
Codename:       trusty
Now, let’s refresh all the local repositories:
# apt-get update
W: GPG error: trusty InRelease:
The following signatures couldn't be verified because the
public key is not available: NO_PUBKEY
The Tor repository signature cannot be verified, so naturally APT complains: the verification fails because the public key is missing. We could manually download that key and let APT know about it, but it’s better to install the repository’s keyring package instead. That way, whenever the signing key changes, we won’t have to re-download the corresponding public key.
# apt-get install deb.torproject.org-keyring
Reading package lists... Done
Building dependency tree      
Reading state information... Done
The following NEW packages will be installed:
  deb.torproject.org-keyring
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 4138 B of archives.
After this operation, 20.5 kB of additional disk space will be used.
WARNING: The following packages cannot be authenticated!
Install these packages without verification? [y/N] y
We confirm the installation of the keyring package and then refresh the local repositories:
# apt-get update
This time around there should be no errors. To install Tor itself, we just have to type…
# apt-get install tor
Reading package lists... Done
Building dependency tree      
Reading state information... Done
The following extra packages will be installed:
  tor-geoipdb torsocks
Suggested packages:
  mixmaster xul-ext-torbutton socat tor-arm polipo privoxy apparmor-utils
The following NEW packages will be installed:
  tor tor-geoipdb torsocks
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 1317 kB of archives.
After this operation, 5868 kB of additional disk space will be used.
Do you want to continue? [Y/n]y
That’s all great! By now, Tor should be up and running:
# netstat -antp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0*               LISTEN      808/sshd       
tcp        0      0*               LISTEN      973/tor        
tcp        0      0      ESTABLISHED 2095/sshd: sub0
tcp6       0      0 :::22                   :::*                    LISTEN      808/sshd
But before we add a new relay to the Tor network, we should properly configure it first.


The Tor configuration file is named torrc and it resides within the /etc/tor directory. Before we make any changes to it, it’s a good idea to keep a backup. Then we open up torrc with a text editor, e.g., nano:
# cp /etc/tor/torrc /etc/tor/torrc.original
# nano /etc/tor/torrc
We locate the following lines and modify them to fit our setup. Please take a closer look at our modifications:
SocksPort 0
Log notice file /var/log/tor/notices.log
Nickname parabing
ORPort 9001
DirPort 9030
Address # this is optional
ContactInfo cvarelas AT gmail DOT com
RelayBandwidthRate 128 KB
RelayBandwidthBurst 192 KB
ExitPolicy reject *:*
Some explaining is in order.
  • SocksPort 0
    We want Tor to act as a relay only and ignore connections from local applications.
  • Log notice file /var/log/tor/notices.log
    All messages of level “notice” and above should go to /var/log/tor/notices.log. The torrc man page describes the five available message levels.
  • Nickname parabing
    A name for our Tor relay. Feel free to name yours any way you like. The relay will be searchable in the various public relay databases by that name.
  • ORPort 9001
    This is the standard Tor port for incoming network connections.
  • DirPort 9030
    This is the standard port for distributing information about the public Tor directory.
  • Address
    This is optional but in some cases useful. If your relay has trouble participating in the Tor network during startup, then try providing here the fully qualified domain name or the public IP address of the host computer/VM/VPS.
  • ContactInfo cvarelas AT gmail DOT com
    You may type a real email address here, and you don’t have to worry about syntactic correctness: the address merely needs to be intelligible, so that anyone who wishes to contact you for any reason can find an email address of yours by looking up your relay in a public directory.
  • RelayBandwidthRate 128 KB
    The allowed bandwidth for incoming traffic. In this example it’s 128 kilobytes per second, that is 8 x 128 = 1024Kbps or 1Mbps. Please note that RelayBandwidthRate must be at least 20 kilobytes per second.
  • RelayBandwidthBurst 192 KB
    This is the allowed bandwidth burst for incoming traffic. In our example it’s 50% more than the allowed RelayBandwidthRate.
  • ExitPolicy reject *:*
    This relay does not allow exits to the “normal” Internet — it’s just a member of the Tor network. If you’re hosting the relay yourself at home, then it’s highly recommended to disallow exits. This is true even if you’re running Tor on a VPS. See the Closing comments section for more on when it is indeed safe to allow exits to the Internet.
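The bandwidth arithmetic from the RelayBandwidthRate and RelayBandwidthBurst bullets is easy to double-check in the shell:

```shell
# Kilobytes per second to kilobits per second: multiply by 8.
echo "$((128 * 8)) Kbps"   # rate:  128 KB/s = 1024 Kbps, i.e. 1 Mbps
echo "$((192 * 8)) Kbps"   # burst: 192 KB/s = 1536 Kbps
```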
After all those modifications in /etc/tor/torrc we’re ready to restart Tor (it was automatically started right after the installation). Before we do that, though, there are a couple of things we may need to take care of.

Port forwarding and firewalls

It’s likely that the box/VM/VPS our Tor relay is hosted on is protected by some sort of firewall. If this is indeed the case, then we should make sure that the TCP ports for ORPort and DirPort are open. For example, one of our Tor relays lives on a GreenQloud instance, and that particular IaaS provider places a firewall in front of every VPS (instance). That’s why we had to manually open ports 9001/TCP and 9030/TCP on the firewall of that instance. There’s also the case of the ubiquitous residential NAT router. In this extremely common scenario we have to add two port forwarding rules to the ruleset of the router, like the following:
  • redirect all incoming TCP packets for port 9001 to port 9001 on the host with IP a.b.c.d
  • redirect all incoming TCP packets for port 9030 to port 9030 on the host with IP a.b.c.d
where a.b.c.d is the IP address of the relay host’s Internet-facing network adapter.
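If the relay host runs its own iptables firewall, opening the two ports might look like the following sketch. This is a minimal example which assumes filtering happens in the INPUT chain; the DNAT rules apply only if a Linux box is itself the NAT router, with a.b.c.d standing for the relay host's LAN address as above. Adapt chain names and interfaces to your setup.

```
# On the relay host: accept incoming connections to ORPort and DirPort.
iptables -A INPUT -p tcp --dport 9001 -j ACCEPT
iptables -A INPUT -p tcp --dport 9030 -j ACCEPT

# On a Linux NAT router: forward both ports to the relay at a.b.c.d.
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 9001 -j DNAT --to-destination a.b.c.d:9001
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 9030 -j DNAT --to-destination a.b.c.d:9030
```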

First-time startup and checks

To let Tor know about the fresh modifications in /etc/tor/torrc, we simply restart it:
# service tor restart
* Stopping tor daemon...
* ...
* Starting tor daemon...        [ OK ]
To see what’s going on during Tor startup, we take a look at the log file:
# tail -f /var/log/tor/notices.log
Jul 18 09:30:07.000 [notice] Bootstrapped 80%: Connecting to the Tor network.
Jul 18 09:30:07.000 [notice] Guessed our IP address as 37.b.c.d.
Jul 18 09:30:08.000 [notice] Bootstrapped 85%: Finishing handshake with first hop.
Jul 18 09:30:09.000 [notice] Bootstrapped 90%: Establishing a Tor circuit.
Jul 18 09:30:09.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Jul 18 09:30:09.000 [notice] Bootstrapped 100%: Done.
Jul 18 09:30:09.000 [notice] Now checking whether ORPort 37.b.c.d:9001 and DirPort 37.b.c.d:9030 are reachable... (this may take up to 20 minutes -- look for log messages indicating success)
Jul 18 09:30:10.000 [notice] Self-testing indicates your DirPort is reachable from the outside. Excellent.
Jul 18 09:30:11.000 [notice] Self-testing indicates your ORPort is reachable from the outside. Excellent. Publishing server descriptor.
Jul 18 09:31:43.000 [notice] Performing bandwidth self-test...done.
(Press [CTRL+C] to stop viewing the log file.) Notice that DirPort and ORPort are reachable — and that’s good. If either of those ports is not reachable, check the firewall/port forwarding rules. You may also have to activate the Address directive in /etc/tor/torrc and restart the tor service.
You can look up any Tor relay in the Atlas directory. The relay shown in the screenshot is one of our own; it lives in a datacenter in Iceland, a country with strong pro-privacy laws.

Relay monitoring

One way to find out if your new, shiny Tor relay is actually active is to look it up on Atlas. You may also monitor its operation in real time with arm (anonymizing relay monitor). Before we install arm, let’s make a couple of modifications to /etc/tor/torrc. First, we locate the following two lines and uncomment them (i.e., delete the # character on the left):
ControlPort 9051
HashedControlPassword ...
We then move to the end of torrc and add this line:
DisableDebuggerAttachment 0
We make sure the modifications to torrc are saved and then restart tor:
# service tor restart
To install arm we type:
# apt-get install tor-arm
Reading package lists... Done
Building dependency tree      
Reading state information... Done
The following extra packages will be installed:
  python-geoip python-socksipy python-support python-torctl
The following NEW packages will be installed:
  python-geoip python-socksipy python-support python-torctl tor-arm
0 upgraded, 5 newly installed, 0 to remove and 0 not upgraded.
Need to get 406 kB of archives.
After this operation, 1,735 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
There’s no need to work from the root account anymore, so let’s exit to our non-privileged user account:
# exit
Right after the installation of the tor-arm package, a new account is created. The username of that account is debian-tor and for security reasons we run arm from the confines of said account:
$ sudo -u debian-tor arm

Closing comments

If you have more than one relay running, then regardless of whether they reside in the same local network, you may want to put them in the same family, so that clients avoid using more than one of your relays in a single circuit. To do that, open /etc/tor/torrc on each node, locate and uncomment the MyFamily directive, and list the fingerprints of all your relays. One way to find the fingerprint of a relay is to look it up in Atlas: search for the relay by name, click on the name, and take a look at the Properties column. Another way is to simply run arm and check the information on the fourth line from the top of the terminal.
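As a sketch, a MyFamily line is just a comma-separated list of fingerprints; the two fingerprints below are made-up placeholders, not real relays:

```
MyFamily $0123456789ABCDEF0123456789ABCDEF01234567,$89ABCDEF0123456789ABCDEF0123456789ABCDEF
```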
Thanks to arm (anonymizing relay monitor) we can monitor our Tor relay’s operation from our beloved terminal. The relay shown is hosted on a Raspberry Pi running Raspbian.
Tor relays can be configured to allow a predefined amount of traffic per time period and then hibernate until the next period begins. Bandwidth isn’t free with all VPS providers and/or ISPs, so you may want to define the AccountingMax and AccountingStart directives in your relay’s torrc file.
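For example, a hypothetical monthly traffic cap might look like this in torrc (the 50 GBytes figure is an arbitrary placeholder; pick a value that fits your provider's quota):

```
AccountingStart month 1 00:00
AccountingMax 50 GBytes
```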
Now, in this post we set up a relay which is indeed a member of the global Tor network but is not an exit node. In other words, no website or service on the Internet will see traffic coming from the public IP of our relay. This arrangement keeps us out of trouble. (Think about it: we can never know the true intentions of Tor clients, nor can we be held responsible for their actions.) Having said that, we can’t stress enough that there’s always high demand for Tor exit nodes. So if you want your contribution to the Tor network to have the highest possible positive impact, you might want to configure your relay to act as an exit node. To do that, open /etc/tor/torrc, comment out the old ExitPolicy line and add this one:
ExitPolicy accept *:*
The above directive allows all kinds of exits, i.e., traffic destined for any TCP port, but you may selectively disallow exits to certain ports (services). See the following example:
ExitPolicy reject *:25, reject *:80, accept *:*
This means that all exits are allowed except those to SMTP (port 25) or web (port 80) servers. In general, exit policies are considered from first to last, and the first match wins. You may split your policy across several lines, all beginning with ExitPolicy. See, for example, the default policy of any Tor relay:
ExitPolicy reject *:25
ExitPolicy reject *:119
ExitPolicy reject *:135-139
ExitPolicy reject *:445
ExitPolicy reject *:563
ExitPolicy reject *:1214
ExitPolicy reject *:4661-4666
ExitPolicy reject *:6346-6429
ExitPolicy reject *:6699
ExitPolicy reject *:6881-6999
ExitPolicy accept *:*
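The first-match-wins behavior can be illustrated with a toy shell sketch. This only mimics the matching logic for a few single-port rules from the policy above; it is not how Tor evaluates policies internally:

```shell
# Toy sketch of first-match-wins exit-policy evaluation (a subset of the rules).
policy_for_port() {
  case "$1" in
    25)  echo reject ;;   # ExitPolicy reject *:25
    119) echo reject ;;   # ExitPolicy reject *:119
    445) echo reject ;;   # ExitPolicy reject *:445
    *)   echo accept ;;   # ExitPolicy accept *:* catches everything else
  esac
}
policy_for_port 25    # prints: reject (first matching rule decides)
policy_for_port 443   # prints: accept (falls through to the final rule)
```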
We recommend that you read the man page of torrc for more details on exit policies.
Judging from my personal experience, if you allow all exits on your relay then it’s almost certain that sooner rather than later you’ll get an email from your VPS provider or your ISP. This has happened to me more than four times already. In one of those cases there were complaints about downloads of infringing content (movies and TV shows) via BitTorrent. Another time, an email from my ISP mentioned excessive malware activity originating from my public IP at home. Each time I received such an email, I immediately modified the exit policy of the corresponding Tor instance and continued using it as a non-exit relay. After the change of policy, I had no further (justified) complaints from the ISP/VPS provider.
My understanding is that even if your relay allows exits, it’s still highly improbable that you’ll get yourself into any sort of legal trouble. It is *not impossible*, though, and ultimately it depends on the law in your country and/or other legal precedents. So my recommendation is to always disallow exits or use the default exit policy. If your relay is hosted in a university, then you can probably get away with allowing all kinds of exits. In any case, always be cooperative and comply immediately with any requests from your ISP or VPS provider.
Congratulations on your new Tor relay — and have fun!

Thursday, July 24, 2014

Linux / Unix logtop: Realtime Log Line Rate Analyser

How can I analyze the line rate of a log file on a Linux system? How do I find the IP flooding my Apache/Nginx/Lighttpd web server on a Debian or Ubuntu Linux system?

Tutorial details
Difficulty: Easy
Root privileges: Yes
Estimated completion time: N/A
You need to use a tool called logtop. It is a system administrator’s tool for analyzing the line rate of a log file. It reads from stdin and prints a constantly updated result in columns of the following format: line number, count, frequency, and the actual line.

How do I install logtop on a Debian or Ubuntu based system?

Simply type the following apt-get command:
$ sudo apt-get install logtop
Sample outputs:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  logtop
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 15.7 kB of archives.
After this operation, 81.9 kB of additional disk space will be used.
Get:1 precise/universe logtop amd64 0.3-1 [15.7 kB]
Fetched 15.7 kB in 0s (0 B/s)
Selecting previously unselected package logtop.
(Reading database ... 114954 files and directories currently installed.)
Unpacking logtop (from .../logtop_0.3-1_amd64.deb) ...
Processing triggers for man-db ...
Setting up logtop (0.3-1) ...


The syntax is as follows:
logtop [OPTIONS] [FILE]
command | logtop
command1 | filter | logtop
command1 | filter | logtop [options] [file]


Here are some common examples of logtop.

Show the IP address flooding your LAMP server

Type the following command:
tail -f www.cyberciti.biz_access.log | cut -d' ' -f1 | logtop
Sample outputs:
Fig.01: logtop command in action

See squid cache HIT and MISS log

tail -f cache.log | grep -o "HIT\|MISS" | logtop
To see realtime hit / miss ratio on some caching software log file, enter:
tail -f access.log | cut -d' ' -f1 | logtop -s 20000
The -s option sets the maximum number of lines logtop keeps track of (20000 here) instead of the default 10000.

Monday, July 21, 2014

How to set up a highly available Apache cluster using Heartbeat

A highly available cluster uses redundant servers to ensure maximum uptime. Redundant nodes mitigate risks related to single points of failure. Here's how you can set up a highly available Apache server cluster on CentOS.
Heartbeat provides cluster infrastructure services such as inter-cluster messaging, node membership, IP allocation and migration, and the starting and stopping of services. Heartbeat can be used to build almost any kind of highly available cluster for enterprise applications such as Apache, Samba, and Squid. Moreover, it can be coupled with load-balancing software so that incoming requests are shared by all cluster nodes.
Our example cluster will consist of three servers that run Heartbeat. We'll test failover by taking down servers manually and checking whether the website they serve is still available. Here's our testing topology:
Topology
The IP address against which the services are mapped needs to be reachable at all times. Normally Heartbeat assigns the designated IP address to a virtual network interface card (NIC) on the primary server for you. If the primary server goes down, the cluster automatically shifts the IP address to a virtual NIC on another of its available servers. When the primary server comes back online, it shifts the IP address back to the primary server. This IP address is called "floating" because of its migratory properties.

Install packages on all servers

To set up the cluster, first install the prerequisites on each node using yum:
yum install PyXML cluster-glue cluster-glue-libs resource-agents
Next, download and install two Heartbeat RPM files that are not available in the official CentOS repository.
rpm -ivh heartbeat-*
Alternatively, you can add the EPEL repository to your sources and use yum for the installs.
Heartbeat will manage starting up and stopping Apache's httpd service, so stop Apache and disable it from being automatically started:
service httpd stop
chkconfig httpd off

Set up hostnames

Now set the server hostnames by editing /etc/sysconfig/network on each system and changing the HOSTNAME line.
The new hostname will be activated at the next server boot-up. You can use the hostname command to activate it immediately without restarting the server.
You can verify that the hostname has been properly set by running uname -n on each server.
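For example, setting a hostname on the fly might look like this (server1.example.com is a hypothetical placeholder; run the command as root):

```
hostname server1.example.com
uname -n    # should now report server1.example.com
```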

Configure Heartbeat

To configure Heartbeat, first copy its default configuration files from /usr to /etc/ha.d/:
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/
You must then modify all three files on all of your cluster nodes to match your requirements.
The authkeys file contains the pre-shared password to be used by the cluster nodes while communicating with each other. Each Heartbeat message within the cluster contains the password, and nodes process only those messages that have the correct password. Heartbeat supports SHA1 and MD5 passwords. In authkeys, the following directives set the authentication method as SHA1 and define the password to be used:
auth 2
2 sha1 pre-shared-password
Save the file, then restrict it to owner-only access (rw-------) with the command chmod 600 /etc/ha.d/authkeys.
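If you'd rather not invent the pre-shared password yourself, a random SHA1 key can be generated as in the sketch below. Note the assumptions: it writes ./authkeys in the current directory, so you still have to copy the result to /etc/ha.d/ on every node yourself.

```shell
# Generate an authkeys file with a random 40-hex-digit SHA1 key.
KEY=$(head -c 512 /dev/urandom | sha1sum | cut -d' ' -f1)
printf 'auth 2\n2 sha1 %s\n' "$KEY" > authkeys
chmod 600 authkeys    # Heartbeat refuses to use a world-readable authkeys
cat authkeys
```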
Next, in ha.cf, define timers, cluster nodes, messaging mechanisms, layer-4 ports, and other settings:
## logging ##
logfile        /var/log/ha-log
logfacility     local0

## timers ##
## All timers are set in seconds. Use 'ms' if you need to define time in milliseconds. ##

## heartbeat intervals ##
keepalive 2

## node is considered dead after this time ##
deadtime 15

## some servers take longer time to boot. this timer defines additional time to wait before confirming that a server is down ##
##  the recommended time for this timer is at least twice of the dead timer ##
initdead 120

## messaging parameters ##
udpport        694

bcast   eth0
## you can use multicasts or unicasts as well ##

## node definitions ##
## make sure that the hostnames match uname -n ##

The file haresources contains the hostname of the server that Heartbeat considers the primary node, as well as the floating IP address. It is vital that this file be identical across all servers. As long as the primary node is up, it serves all requests; Heartbeat stops the highly available service on all other nodes. When Heartbeat detects that the primary node is down, it automatically starts the service on the next available node in the cluster. When the primary node comes back online, Heartbeat sets it to take over again and serve all requests. Finally, this file contains the name of the script that is responsible for the highly available service: httpd in this case. Other possible values might be squid, smb, nmb, or postfix, mapping to the name of the service startup script typically located in the /etc/init.d/ directory.
In haresources, define the primary server’s hostname, the floating IP address, and httpd as the highly available service, all on a single line. You do not need to create any interface or manually assign the floating IP address to any interface; Heartbeat takes care of that for you.
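As a sketch, a haresources line might look like the following; the hostname and IP address are hypothetical placeholders, not values from this article's setup:

```
server1.example.com httpd
```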
After the configuration files are ready on each of the servers, start the Heartbeat service and add it to system startup:
service heartbeat start
chkconfig heartbeat on
You can keep an eye on the Heartbeat log with the command tailf /var/log/ha-log.
Heartbeat can be used for multiple services. For example, the following directive in haresources would make Heartbeat manage both the Apache and Samba services: httpd smb nmb
However, unless you're also running a cluster resource manager (CRM) such as Pacemaker, I do not recommend using Heartbeat to provide multiple services in a single cluster. Without Pacemaker, Heartbeat monitors cluster nodes at layer 3, using IP addresses. As long as an IP address is reachable, Heartbeat is oblivious to any crashes or difficulties that services may be facing on a server node.


Once Heartbeat is up and running, test it out. Create separate index.html files on all three servers so you can see which server is serving the page. Browse to the floating IP address or, if you have DNS set up, its domain name equivalent. The page should be loaded from the primary server, and you can verify this by looking at the Apache log file on server1. Try refreshing the page and verify that it is loaded from the same server each time.
If this goes well, test failover by stopping the Heartbeat service on server1. The floating IP address should migrate to server2, and the page should be loaded from there; a quick look at server2's Apache log should confirm it. If you stop the service on server2 as well, the web pages will be loaded from the third server, the only remaining node in the cluster. When you restart the services on server1 and server2, the floating IP address should migrate back to server1, per the setup in haresources.
As you can see, it's easy to set up a highly available Apache cluster on CentOS using Heartbeat. While we used three servers, Heartbeat works with more or fewer nodes as well. It places no constraint on the number of nodes, so you can scale the setup as you need.