Thursday, April 19, 2012

Web Filtering On Squid Proxy


This HOWTO describes how to protect the users of your home or small enterprise network from objectionable internet content with the help of an HTTP proxy. Our goal is to set up a free Linux-based server running Squid and deploy a web filtering application on it, saving bandwidth, speeding up web access and blocking offensive, potentially illegal and malicious web files.
In this tutorial I will assume that the network environment consists of a SOHO-level router that provides Wi-Fi, several desktop and laptop computers, iPads and some smartphones, as shown in the following network diagram.
Network Diagram


Set Up CentOS 6.2 Linux On Proxy Server

Our proxy server will be built using the free CentOS Linux 6.2. It is also possible to use Red Hat Enterprise Linux 6.2 with a paid subscription if you need a guaranteed level of support for your servers.
In order to install CentOS Linux, go to http://mirror.centos.org/centos/6/isos/i386/ and download the CentOS-6.2-i386-minimal.iso image file. Burn it to a spare CD, insert it into your server's CD drive and power the server on.
Follow the installation steps accepting the defaults or customizing the required parts of the install according to your needs. Configure machine hostname as "proxy" and root password as "P@ssw0rd" (without quotation marks). Wait until the installation is complete and then reboot the system.
The installed version of CentOS usually does not have network connectivity enabled by default. In order to enable network access we need to perform the following.
  1. Assign a static IP address of 192.168.1.2 with network mask 255.255.255.0 to our proxy server by modifying startup script /etc/sysconfig/network-scripts/ifcfg-eth0. Open it and add these lines:
    BOOTPROTO=static
    NETMASK=255.255.255.0
    IPADDR=192.168.1.2
    ONBOOT=yes
  2. Set default gateway settings in /etc/sysconfig/network configuration file by adding this line:
    GATEWAY=192.168.1.1
  3. Adjust the DNS resolver settings in /etc/resolv.conf by adding the IP address of the DNS server that runs on the router:
    nameserver 192.168.1.1
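On a freshly installed system ifcfg-eth0 usually already contains BOOTPROTO and ONBOOT lines, so blindly appending the settings above may leave duplicate keys in the file. The following is a minimal sketch of a helper that replaces a key if it is present and appends it otherwise. The name set_kv is hypothetical (it is not part of CentOS), and the sketch assumes GNU sed and values that do not contain the "|" character.

```shell
# set_kv FILE KEY VALUE
# Replace an existing KEY=... line in FILE, or append KEY=VALUE if absent.
# Hypothetical helper; assumes GNU sed and values without '|'.
set_kv() {
    file=$1; key=$2; value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Try it on a scratch file first rather than on the live configuration.
demo=$(mktemp)
printf 'DEVICE=eth0\nBOOTPROTO=dhcp\nONBOOT=no\n' > "$demo"
set_kv "$demo" BOOTPROTO static
set_kv "$demo" NETMASK 255.255.255.0
set_kv "$demo" IPADDR 192.168.1.2
set_kv "$demo" ONBOOT yes
cat "$demo"
```

Once the result looks right, the same calls can be pointed at /etc/sysconfig/network-scripts/ifcfg-eth0 from the root terminal.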
Restart your network subsystem by typing
/etc/init.d/network restart
in the root terminal or by just restarting the server. After the restart, confirm that the network functions correctly by typing the following in the terminal (there should not be any errors in the output of these commands):
$ ping -c 3 192.168.1.1
$ nslookup google.com
Before we do any further installation it is recommended to update the freshly installed system with the latest security patches that may have come out since the ISO was released. So type
yum update
in the root terminal and reboot the server after update completes.

Setup Squid On Proxy Server

We will use Squid as the caching and filtering proxy that runs on our Proxy Server. In order to install the version of Squid that comes with the CentOS 6.2 distribution, type
yum install squid
in the root terminal. Squid and all related packages and dependencies are downloaded from the Internet and installed automatically.
Make Squid proxy service start on system boot automatically by typing
chkconfig squid on
Reboot your server or just start Squid for the first time manually with
service squid start
The only thing left to do is to let the other machines on our home network access Squid. Open the configuration file /etc/squid/squid.conf and add the following line:
visible_hostname proxy
Also check that the two directives "acl localnet src 192.168.0.0/16" and "http_access allow localnet" are present in the configuration file.
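The check above can be sketched as a small script. The function name check_squid_conf is hypothetical, and the patterns only look for uncommented lines that start with the three directives mentioned in this HOWTO; they do not validate the rest of the file.

```shell
# check_squid_conf FILE
# Report which of the directives used in this HOWTO are missing from FILE.
# Returns 0 when all are present, 1 otherwise. Hypothetical helper.
check_squid_conf() {
    conf=$1
    missing=0
    for pattern in \
        '^acl[[:space:]]+localnet[[:space:]]+src[[:space:]]+192\.168\.0\.0/16' \
        '^http_access[[:space:]]+allow[[:space:]]+localnet' \
        '^visible_hostname[[:space:]]+proxy'
    do
        if ! grep -Eq "$pattern" "$conf"; then
            echo "no line matching: $pattern"
            missing=1
        fi
    done
    return $missing
}
```

On the Proxy Server it would be called as check_squid_conf /etc/squid/squid.conf. Squid can also check the syntax of its own configuration with squid -k parse before the service is restarted.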
Restart Squid by typing
service squid restart
Verify that Squid runs correctly by configuring the browser on one of the user machines to use the Proxy Server (192.168.1.2, port 3128) as its HTTP proxy and surfing to some of your favorite websites.
NOTE: you may need to adjust firewall settings in CentOS in order to let proxy users connect to port 3128 on the Proxy Server. Use system-config-firewall-tui or iptables commands to do that. A good idea would be to allow access also to port 80 as we will use this port for managing QuintoLabs Content Security through Web UI as described later.
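For those who prefer plain iptables commands, here is a minimal sketch of the rules mentioned in the note. The helper names run and open_tcp_port are hypothetical, and the LAN range 192.168.1.0/24 is an assumption derived from the addresses used in this HOWTO. By default the script only prints the commands, so they can be reviewed before anything is changed.

```shell
# Print (default) or execute the given command, depending on DRY_RUN.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# LAN range is an assumption based on the 192.168.1.x addresses in this HOWTO.
LAN=${LAN:-192.168.1.0/24}

# open_tcp_port PORT
# Accept TCP connections from the LAN to PORT on this machine.
open_tcp_port() {
    run iptables -I INPUT -p tcp -s "$LAN" --dport "$1" -j ACCEPT
}

open_tcp_port 3128   # Squid
open_tcp_port 80     # Web UI served by Apache
```

Running the script with DRY_RUN=0 in the root terminal applies the rules, and service iptables save keeps them across reboots on CentOS 6.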

Setup QuintoLabs Content Security

The next step is to install Content Security for Squid from QuintoLabs (I will refer to it as qlproxy further in the text). For those who do not know, QuintoLabs Content Security is an ICAP daemon/URL rewriter that integrates with an existing Squid proxy server and provides rich content filtering functionality to sanitize web traffic passing into the internal home / enterprise network. It may be used to block illegal or potentially malicious file downloads, remove annoying advertisements, prevent access to various categories of web sites and block resources with explicit and adult content.
NOTE: there are other tools besides qlproxy that offer almost the same functionality. Some of the well-known ones are SquidGuard (SG) and DansGuardian (DG). While these tools are fine in theory, you need to install both of them to get the same functionality as qlproxy. SG runs as a URL rewriter, while DG runs as a separate proxy of its own. DG also does not support SMP processing and relies on a resource-hungry process-per-connection server model, which inflates the requirements for, e.g., the URL block database. It is also a problem to tie SG and DG together, as they use different configuration directives and are largely independent of each other, forcing the admin to look in two different places when only one filtering policy needs adjusting.
We will use version 2.0 of qlproxy that was released this month. The most prominent feature of that release is policy-based web filtering, in which users of the proxy are organized into several groups with different levels of strictness.
By default qlproxy comes with three policies preinstalled. The Strict policy has its web filter settings at the maximum level and is meant to protect minors and K12 students from inappropriate content on the Internet. The Relaxed policy blocks only excessive advertisements and is meant for network administrators, teachers and all those who do not need filtered access to the web but would like to avoid most ads. The last policy is the default one and contains less restrictive web filtering settings suitable for normal web browsing without explicitly adult content shown.
The good thing about this is that you are free to design the policies yourself if you find the predefined policies not suitable for your network environment.
Anyway, in order to install Content Security 2.0 we have to get the CentOS / RedHat RPM package manually from QuintoLabs web site at http://www.quintolabs.com/qlicap_download.php and upload the package to the Proxy Server using scp. Another way is to type the following commands in the root terminal of the Proxy Server directly (as one line):
# curl http://quintolabs.com/qlproxy/binaries/2.0.0/qlproxy-2.0.0-bb01d.i386.rpm>qlproxy-2.0.0-bb01d.i386.rpm
After the download completes (approx. 21 MB), run the following command to install the downloaded package and all its dependencies (note that the package comes in the i386 flavor, but yum takes care of correct installation on x86_64 architectures):
# yum localinstall qlproxy-2.0.0-bb01d.i386.rpm
The yum installation manager will run for a while and the program will be installed into /opt/quintolabs/qlproxy (binaries), /var/opt/quintolabs/qlproxy (various logs and content filtering databases) and /etc/opt/quintolabs/qlproxy (configuration).
NOTE: this howto assumes you have SELinux disabled on your machine. For specific notes concerning an SELinux-based installation of qlproxy see their web site and the sample SELinux policy installed in /opt/quintolabs/qlproxy/usr/share/selinux. In order to disable SELinux set SELINUX=disabled in /etc/selinux/config and reboot.
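The SELinux change from the note can also be made without opening an editor. This is a minimal sketch assuming GNU sed; the function name disable_selinux is hypothetical. The pattern only touches the SELINUX= line and leaves SELINUXTYPE= alone.

```shell
# disable_selinux FILE
# Rewrite the SELINUX= line in FILE to SELINUX=disabled and print the result.
# Hypothetical helper; assumes GNU sed.
disable_selinux() {
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
    grep '^SELINUX=' "$1"
}
```

In the root terminal it would be called as disable_selinux /etc/selinux/config. After the reboot, getenforce should report Disabled.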

Integrate Squid And Content Security

QuintoLabs Content Security may be integrated with Squid in two different ways - as ICAP server and as URL rewriter. It is recommended to use ICAP integration as it gives access to all HTTP traffic passing through Squid and allows qlproxy to perform full request and response filtering (ICAP is supported in Squid version 3 and up).
The README file in /etc/opt/quintolabs/qlproxy folder contains detailed instructions on how to perform integration with Squid on different platforms (Debian, Ubuntu, RedHat and even Windows). To integrate it with Squid running on CentOS we need to add the following lines to /etc/squid/squid.conf configuration file:
icap_enable on
icap_preview_enable on
icap_preview_size 4096
icap_persistent_connections on
icap_send_client_ip on
icap_send_client_username on
icap_service qlproxy1 reqmod_precache bypass=0 icap://127.0.0.1:1344/reqmod
icap_service qlproxy2 respmod_precache bypass=0 icap://127.0.0.1:1344/respmod
adaptation_access qlproxy1 allow all
adaptation_access qlproxy2 allow all
Restart Squid by typing
service squid restart
and try surfing your favorite web sites to see how many ads are blocked. Another useful test is to go to the eicar.com web site and try to download the sample artificial eicar.com virus to see that *.com files are blocked by the download filter.
Default installation of Content Security is quite usable out of the box but in order to adjust it for our network requirements described earlier we will have to perform some configuration changes as described below (all paths are relative to /etc/opt/quintolabs/qlproxy/policies):
  1. Put all normal users into Strict filtering policy by adding their IP addresses (or user names if your Squid performs authentication) to the strict/members.conf file.
  2. Put all power users into Relaxed filtering policy by adding their IP addresses or user names to the relaxed/members.conf file.
  3. Enable extended AdBlock subscriptions for blocking English, German and Russian ads in blocks_ads.conf configuration file for both policies. Also block common web tracking engines by uncommenting EasyPrivacy subscription in the same files.
  4. Increase the level of adult blocking heuristics to "high" in the strict/block_adult_sites.conf file. Although it may result in excessive false blocking, there is always the possibility to add an incorrectly blocked site to the exception list.
  5. The UrlBlock module, which uses a community-developed database of categorized domains, incorrectly puts blogspot.com into an adult category... so we will add it to the exception list of the Relaxed policy in relaxed/exceptions.conf to be able to read the blogs.
  6. Knowing that worms, trojans and other malware related software often connect to the world by numeric IP addresses instead of normal hostnames, we will put a magic regexp url = http://\d+\.\d+\.\d+\.\d+/.* into strict/block_sites_by_name.conf file to block access to web sites by IP.
Now issue a restart command to make qlproxyd daemon reload the configuration
/etc/init.d/qlproxy restart
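The regular expression from step 6 can be tried out on a few sample URLs before it goes into the policy. This is only a sketch with grep: [0-9] stands in for \d (which POSIX extended regular expressions lack), the pattern is assumed to be matched from the start of the URL, and I have not verified that qlproxy's own regexp engine behaves identically. The function name is_ip_url and the sample URLs are made up for illustration.

```shell
# is_ip_url URL
# Succeeds when URL has the form http://<digits>.<digits>.<digits>.<digits>/...
# [0-9] stands in for \d, which POSIX extended regular expressions lack.
is_ip_url() {
    printf '%s\n' "$1" | grep -Eq '^http://[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/.*'
}

for url in \
    'http://93.184.216.34/index.html' \
    'http://www.example.com/' \
    'http://93.184.216.34:8080/index.html'
do
    if is_ip_url "$url"; then
        echo "blocked: $url"
    else
        echo "allowed: $url"
    fi
done
```

Note that the last URL is reported as allowed: the pattern expects a slash right after the fourth number, so a numeric address followed by an explicit port does not match it.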

Setup Web UI Of Content Security With Apache

QuintoLabs Content Security contains a minimal Web UI that lets you see the current program configuration, view reports of usage activity and program logs from a remote host using your favorite browser. The Web UI is written using the Django Python framework and integrates with Apache using mod_wsgi deployed in a virtualized Python environment (to minimize package dependencies).
To install Apache type the following in the root terminal
yum install httpd
Make Apache service autostart on system boot by typing
chkconfig httpd on
Reboot your machine or just start Apache for the first time manually by typing
service httpd start
Then install additional Apache and Python modules by typing in the root terminal:
# yum install mod_wsgi python-setuptools
# easy_install virtualenv
# cd /var/opt/quintolabs/qlproxy/www
# virtualenv --no-site-packages qlproxy_django
# ./qlproxy_django/bin/easy_install django==1.3.1
Integrate the Web UI with Apache by adding the following lines to the configuration file /etc/httpd/conf/httpd.conf:
<VirtualHost *:80>
    ServerName proxy.lan
    ServerAdmin webmaster@proxy.lan

    LogLevel info
    ErrorLog /var/log/httpd/proxy.lan-error.log
    CustomLog /var/log/httpd/proxy.lan-access.log combined

    # aliases to static files (must come before the mod_wsgi settings)
    Alias /static/ /var/opt/quintolabs/qlproxy/www/qlproxy/static/
    Alias /redirect/ /var/opt/quintolabs/qlproxy/www/qlproxy/redirect/

    # mod_wsgi settings
    WSGIDaemonProcess proxy.lan display-name=%{GROUP}
    WSGIProcessGroup proxy.lan
    WSGIScriptAlias / /var/opt/quintolabs/qlproxy/www/qlproxy/qlproxy.wsgi
    <Directory /var/opt/quintolabs/qlproxy/www/qlproxy/>
        Order deny,allow
        Allow from all
    </Directory>
</VirtualHost>
Add the following line to /etc/httpd/conf.d/wsgi.conf to let mod_wsgi run in daemon mode:
WSGISocketPrefix /var/run/wsgi
NOTE: if you get "Access denied" error page trying to access http://localhost then check if SELinux permissions might be preventing access to /var/opt/quintolabs/qlproxy/www/qlproxy/ directory for httpd process.
After restart of Apache navigate to http://192.168.1.2/qlproxy to see program configuration, logs and generated reports.

Summary

The only thing left is to point network users to the Proxy Server. There are several ways to do that automatically (think WPAD), but for testing purposes manual proxy configuration should be more than enough. So point the browser to the proxy at 192.168.1.2 port 3128, surf to some favorite web sites and see the difference: IP addresses in URLs are blocked and explicitly adult sites are forbidden. RAM and CPU usage on the server is minimal, and the surfing experience is acceptable. The system is automatically updated once a day with the latest URL block list and AdBlock subscriptions and requires minimal additional maintenance.
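Before touching any browser settings, the proxy can also be exercised from the command line of a client machine. The sketch below relies on the http_proxy environment variable, which curl and wget honor for http:// URLs. The helper name via_proxy is hypothetical, and the default address is the one assigned to the Proxy Server in this HOWTO.

```shell
# Proxy address from this HOWTO; override via the environment if yours differs.
PROXY_HOST=${PROXY_HOST:-192.168.1.2}
PROXY_PORT=${PROXY_PORT:-3128}

# via_proxy COMMAND [ARGS...]
# Run COMMAND with http_proxy pointing at the Proxy Server.
via_proxy() {
    http_proxy="http://${PROXY_HOST}:${PROXY_PORT}" "$@"
}

# Show the value a child process would see (no network access needed).
via_proxy sh -c 'echo "http_proxy=$http_proxy"'
```

For a real request, something like via_proxy curl -sI http://www.example.com/ fetches only the response headers through Squid. A response that went through the proxy typically carries a Via header naming the proxy host.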
