Saturday, August 30, 2014

How to secure a LAMP server on CentOS or RHEL

http://xmodulo.com/2014/08/secure-lamp-server-centos-rhel.html

LAMP is a software stack composed of Linux (an operating system as a base layer), Apache (a web server that "sits on top" of the OS), MySQL (or MariaDB, as a relational database management system), and finally PHP (a server-side scripting language that is used to process and display information stored in the database).
In this article we will assume that each component of the stack is already up and running, and will focus exclusively on securing the LAMP server(s). We must note, however, that server-side security is a vast subject, and therefore cannot be addressed adequately and completely in a single article.
In this post, we will cover the essential must-do's to secure each part of the stack.

Securing Linux

Since you may want to manage your CentOS server via ssh, you need to consider the following tips to secure remote access to the server by editing the /etc/ssh/sshd_config file.
1) Use key-based authentication, whenever possible, instead of password authentication (username + password) to log in to your server remotely. We assume that you have already created a key pair with your user name on your client machine and copied it to your server (see the tutorial).
PasswordAuthentication no
RSAAuthentication yes
PubkeyAuthentication yes
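If you have not generated and copied the key pair yet, a minimal sequence on the client machine might look like this (the user name and server address below are just placeholders):
$ ssh-keygen -t rsa
$ ssh-copy-id gacanepa@your-server-ip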
2) Change the port that sshd listens on. A good choice is a port number higher than 1024:
Port XXXX
3) Allow only protocol 2:
Protocol 2
4) Configure the authentication timeout, disallow root logins, and restrict which users may log in via ssh:
LoginGraceTime 2m
PermitRootLogin no
AllowUsers gacanepa
5) Allow only specific hosts (and/or networks) to login via ssh:
In the /etc/hosts.deny file:
sshd: ALL
In the /etc/hosts.allow file:
sshd: XXX.YYY.ZZZ. AAA.BBB.CCC.DDD
where XXX.YYY.ZZZ. represents the first 3 octets of an IPv4 network address and AAA.BBB.CCC.DDD is an IPv4 address. With that setting, only hosts from network XXX.YYY.ZZZ.0/24 and host AAA.BBB.CCC.DDD will be allowed to connect via ssh. All other hosts will be disconnected before they even get to the login prompt, and will receive an error like this:
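The exact message depends on the ssh client, but it will typically look something like this:
ssh_exchange_identification: Connection closed by remote host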

(Do not forget to restart the sshd daemon to apply these changes: service sshd restart).
We must note that this approach is a quick and easy, but somewhat rudimentary, way of blocking incoming connections to your server. For further customization, scalability, and flexibility, you should consider using plain iptables and/or fail2ban.
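For reference, a roughly equivalent iptables-based restriction might look like the sketch below; the network is the same placeholder as above, and 22 should be replaced with your custom ssh port if you changed it. Adapt the rule order to your existing firewall:
# iptables -A INPUT -p tcp --dport 22 -s XXX.YYY.ZZZ.0/24 -j ACCEPT
# iptables -A INPUT -p tcp --dport 22 -j DROP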

Securing Apache

1) Make sure that the system user that is running Apache web server does not have access to a shell:
# grep -i apache /etc/passwd
If user apache has a default shell (such as /bin/sh), we must change it to /bin/false or /sbin/nologin:
# usermod -s /sbin/nologin apache

The following suggestions (2 through 5) refer to the /etc/httpd/conf/httpd.conf file:
2) Disable directory listing: this will prevent the browser from displaying the contents of a directory if there is no index.html present in that directory.
Remove the word Indexes from the Options directive (the example below goes further and sets Options None):
# The Options directive is both complicated and important.  Please see
# http://httpd.apache.org/docs/2.2/mod/core.html#options
# for more information.
#
Options Indexes FollowSymLinks
Should read:
Options None

In addition, you need to make sure that the settings for directories and virtual hosts do not override this global configuration.
Following the example above, if we examine the settings for the /var/www/icons directory, we see that "Indexes MultiViews FollowSymLinks" should be changed to "None".

Before:

<Directory "/var/www/icons">
    Options Indexes MultiViews FollowSymLinks
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

After:

<Directory "/var/www/icons">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>
3) Hide the Apache version, as well as module/OS information, in error pages (e.g. Not Found and Forbidden).
# Return just "Apache" in the Server response header, without the version number
ServerTokens Prod
# Do not append version/OS information to server-generated pages
ServerSignature Off

4) Disable unneeded modules by commenting out the lines where those modules are declared:

TIP: Disabling autoindex_module is another way to hide directory listings when there is no index.html file present.
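For example, the corresponding LoadModule line in httpd.conf can be commented out (the exact module file name may vary with your Apache version and installation):
#LoadModule autoindex_module modules/mod_autoindex.so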
5) Limit HTTP request size (body and headers) and set connection timeout:
Directive: LimitRequestBody
Context: server config, virtual host, directory, .htaccess
Example and meaning: limit file uploads to 100 KiB max. for the uploads directory:

<Directory "/var/www/test/uploads">
   LimitRequestBody 102400
</Directory>

This directive specifies the number of bytes, from 0 (meaning unlimited) to 2147483647 (2 GB), that are allowed in a request body.

Directive: LimitRequestFieldSize
Context: server config, virtual host
Example and meaning: change the allowed HTTP request header size to 4 KiB (the default is 8 KiB), server-wide:

LimitRequestFieldSize 4094

This directive specifies the number of bytes that will be allowed in an HTTP request header and gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks.

Directive: TimeOut
Context: server config, virtual host
Example and meaning: change the timeout from 300 seconds (the default) to 120:

TimeOut 120

This is the amount of time, in seconds, the server will wait for certain events before failing a request.
For more directives and instructions on how to set them up, refer to the Apache docs.

Securing MySQL Server

We will begin by running the mysql_secure_installation script, which comes with the mysql-server package.
1) If we have not set a root password for the MySQL server during installation, now is the time to do so. Remember: this is essential in a production environment.


2) Remove the anonymous user:

3) Only allow root to connect from localhost:

4) Remove the default database named test:

5) Apply changes:

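Steps 1 through 5 above correspond to the script's interactive prompts. A typical session looks roughly like the following (the exact wording may vary between MySQL versions):
# mysql_secure_installation
Set root password? [Y/n] Y
Remove anonymous users? [Y/n] Y
Disallow root login remotely? [Y/n] Y
Remove test database and access to it? [Y/n] Y
Reload privilege tables now? [Y/n] Y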
6) Next, we will edit some variables in the /etc/my.cnf file:
[mysqld]
bind-address=127.0.0.1 # MySQL will only accept connections from localhost
local-infile=0 # Disable direct filesystem access
log=/var/log/mysqld.log # Enable log file to watch out for malicious activities
Don't forget to restart MySQL server with 'service mysqld restart'.
Now, when it comes to day-to-day database administration, you'll find the following suggestions useful:
  • If for some reason we need to manage our database remotely, we can do so by connecting via ssh to our server first to perform the necessary querying and administration tasks locally.
  • We may want to enable direct access to the filesystem later if, for example, we need to perform a bulk import of a file into the database.
  • Keeping logs is not as critical as the two things mentioned earlier, but may come in handy to troubleshoot our database and/or be aware of unfamiliar activities.
  • DO NOT, EVER, store sensitive information (such as passwords, credit card numbers, bank PINs, to name a few examples) in plain text format. Consider using hash functions to obfuscate this information.
  • Make sure that application-specific databases can be accessed only by the corresponding user that was created by the application for that purpose:
To review and adjust the access permissions of MySQL users, proceed as follows.
First, retrieve the list of users from the user table:
gacanepa@centos:~$ mysql -u root -p
Enter password: [Your root password here]
mysql> SELECT User,Host FROM mysql.user;

Make sure that each user only has access (and the minimum permissions) to the databases it needs. In the following example, we will check the permissions of user db_usuario:
mysql> SHOW GRANTS FOR 'db_usuario'@'localhost';

You can then revoke permissions and access as needed.
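As a hypothetical example, the following statements grant a dedicated user only the privileges it needs on its own database (app_db and the privilege list are placeholders), revoke one that turns out to be unnecessary, and reload the grant tables:
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON app_db.* TO 'db_usuario'@'localhost';
mysql> REVOKE DELETE ON app_db.* FROM 'db_usuario'@'localhost';
mysql> FLUSH PRIVILEGES;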

Securing PHP

Since this article is oriented at securing the components of the LAMP stack, we will not go into detail as far as the programming side of things is concerned. We will assume that our web applications are secure in the sense that the developers have gone out of their way to make sure that there are no vulnerabilities that can give rise to common attacks such as XSS or SQL injection.
1) Disable unnecessary modules:
We can display the list of currently compiled-in modules with the following command:
# php -m

And disable those that are not needed by either removing or renaming the corresponding file in the /etc/php.d directory.
For example, since the mysql extension has been deprecated as of PHP v5.5.0 (and will be removed in the future), we may want to disable it:
# php -m | grep mysql
# mv /etc/php.d/mysql.ini /etc/php.d/mysql.ini.disabled

2) Hide PHP version information:
# echo "expose_php=off" >> /etc/php.d/security.ini [or modify the security.ini file if it already exists]

3) Set open_basedir to a few specific directories (in php.ini) in order to restrict access to the underlying file system:
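For example, a site whose code lives under /var/www/html could be confined to that tree plus a temporary directory (these paths are placeholders; adjust them to your layout):
open_basedir = "/var/www/html:/tmp"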

4) Disable remote code/command execution along with easy exploitable functions such as exec(), system(), passthru(), eval(), and so on (in php.ini):
allow_url_fopen = Off
allow_url_include = Off
disable_functions = "exec, system, passthru, eval"


Summing Up

1) Keep packages updated to their most recent version (compare the output of the following commands with the output of 'yum info [package]'):
The following commands return the current versions of Apache, MySQL and PHP:
# httpd -v
# mysql -V (capital V)
# php -v

Then 'yum update [package]' can be used to update the package in order to have the latest security patches.
2) Make sure that configuration files can only be written by the root account:
# ls -l /etc/httpd/conf/httpd.conf
# ls -l /etc/my.cnf
# ls -l /etc/php.ini /etc/php.d/security.ini
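If any of these files turns out to be writable by someone other than root, ownership and permissions can be tightened, for example:
# chown root:root /etc/httpd/conf/httpd.conf /etc/my.cnf /etc/php.ini
# chmod 644 /etc/httpd/conf/httpd.conf /etc/my.cnf /etc/php.ini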

3) Finally, if you have the chance, run these services (web server, database server, and application server) in separate physical or virtual machines (and protect communications between them via a firewall), so that in case one of them becomes compromised, the attacker will not have immediate access to the others. If that is the case, you may have to tweak some of the configurations discussed in this article. Note that this is just one of the setups that could be used to increase security in your LAMP server.

An introduction to Apache Hadoop for big data

http://opensource.com/life/14/8/intro-apache-hadoop-big-data


Apache Hadoop is an open source software framework for storage and large scale processing of data-sets on clusters of commodity hardware. Hadoop is an Apache top-level project being built and used by a global community of contributors and users. It is licensed under the Apache License 2.0.
Doug Cutting with his son's stuffed elephant, Hadoop
Hadoop was created by Doug Cutting and Mike Cafarella in 2005. It was originally developed to support distribution for the Nutch search engine project. Doug, who was working at Yahoo! at the time and is now Chief Architect of Cloudera, named the project after his son's toy elephant. Cutting's son was 2 years old at the time and just beginning to talk. He called his beloved stuffed yellow elephant "Hadoop" (with the stress on the first syllable). Now 12, Doug's son often exclaims, "Why don't you say my name, and why don't I get royalties? I deserve to be famous for this!"

The Apache Hadoop framework is composed of the following modules:

  1. Hadoop Common: contains libraries and utilities needed by other Hadoop modules
  2. Hadoop Distributed File System (HDFS): a distributed file-system that stores data on the commodity machines, providing very high aggregate bandwidth across the cluster
  3. Hadoop YARN: a resource-management platform responsible for managing compute resources in clusters and using them for scheduling of users' applications
  4. Hadoop MapReduce: a programming model for large scale data processing
All the modules in Hadoop are designed with a fundamental assumption that hardware failures (of individual machines, or racks of machines) are common and thus should be automatically handled in software by the framework. Apache Hadoop's MapReduce and HDFS components originally derived respectively from Google's MapReduce and Google File System (GFS) papers.
Beyond HDFS, YARN and MapReduce, the entire Apache Hadoop "platform" is now commonly considered to consist of a number of related projects as well: Apache Pig, Apache Hive, Apache HBase, and others.
An illustration of the Apache Hadoop ecosystem
For end users, though MapReduce Java code is common, any programming language can be used with "Hadoop Streaming" to implement the "map" and "reduce" parts of the user's program. Apache Pig and Apache Hive, among other related projects, expose higher-level user interfaces like Pig Latin and a SQL variant respectively. The Hadoop framework itself is mostly written in the Java programming language, with some native code in C and command line utilities written as shell scripts.
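As a rough illustration of Hadoop Streaming, the sketch below runs a simple job that uses standard Unix tools as the mapper and reducer; the jar path and HDFS directories are placeholders and vary between distributions:
# hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    -input /user/hadoop/input \
    -output /user/hadoop/output \
    -mapper /bin/cat \
    -reducer /usr/bin/wc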

HDFS and MapReduce

There are two primary components at the core of Apache Hadoop 1.x: the Hadoop Distributed File System (HDFS) and the MapReduce parallel processing framework. These are both open source projects, inspired by technologies created inside Google.

Hadoop distributed file system

The Hadoop distributed file system (HDFS) is a distributed, scalable, and portable file system written in Java for the Hadoop framework. A Hadoop cluster nominally has a single namenode plus a cluster of datanodes, which together form the HDFS cluster; not every node in the cluster needs to run a datanode. Each datanode serves up blocks of data over the network using a block protocol specific to HDFS. The file system uses the TCP/IP layer for communication, and clients use remote procedure calls (RPC) to communicate with each other.

HDFS stores large files (typically in the range of gigabytes to terabytes) across multiple machines. It achieves reliability by replicating the data across multiple hosts, and hence does not require RAID storage on hosts. With the default replication value, 3, data is stored on three nodes: two on the same rack, and one on a different rack. Data nodes can talk to each other to rebalance data, to move copies around, and to keep the replication of data high. HDFS is not fully POSIX-compliant, because the requirements for a POSIX file-system differ from the target goals for a Hadoop application. The tradeoff of not having a fully POSIX-compliant file-system is increased performance for data throughput and support for non-POSIX operations such as Append.
HDFS added high-availability capabilities in release 2.x, allowing the main metadata server (the NameNode) to be failed over manually to a backup in the event of failure; automatic fail-over is also being developed.
The HDFS file system includes a so-called secondary namenode, which misleads some people into thinking that when the primary namenode goes offline, the secondary namenode takes over. In fact, the secondary namenode regularly connects with the primary namenode and builds snapshots of the primary namenode's directory information, which the system then saves to local or remote directories. These checkpointed images can be used to restart a failed primary namenode without having to replay the entire journal of file-system actions and then edit the log to create an up-to-date directory structure. Because the namenode is the single point for storage and management of metadata, it can become a bottleneck for supporting a huge number of files, especially a large number of small files. HDFS Federation, a new addition, aims to tackle this problem to a certain extent by allowing multiple namespaces served by separate namenodes.
An advantage of using HDFS is data awareness between the job tracker and task tracker. The job tracker schedules map or reduce jobs to task trackers with an awareness of the data location. For example, if node A contains data (x, y, z) and node B contains data (a, b, c), the job tracker schedules node B to perform map or reduce tasks on (a,b,c) and node A would be scheduled to perform map or reduce tasks on (x,y,z). This reduces the amount of traffic that goes over the network and prevents unnecessary data transfer. When Hadoop is used with other file systems, this advantage is not always available. This can have a significant impact on job-completion times, which has been demonstrated when running data-intensive jobs. HDFS was designed for mostly immutable files and may not be suitable for systems requiring concurrent write-operations.
Another limitation of HDFS is that it cannot be mounted directly by an existing operating system. Getting data into and out of the HDFS file system, an action that often needs to be performed before and after executing a job, can be inconvenient. A filesystem in Userspace (FUSE) virtual file system has been developed to address this problem, at least for Linux and some other Unix systems.
File access can be achieved through the native Java API; the Thrift API, which can generate a client in the language of the user's choosing (C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk, or OCaml); the command-line interface; or the HDFS-UI web app over HTTP.
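As a quick example of the command-line interface, copying a local file into HDFS, listing it, and copying it back might look like this (all paths are hypothetical):
# hdfs dfs -put /tmp/localfile.txt /user/hadoop/
# hdfs dfs -ls /user/hadoop/
# hdfs dfs -get /user/hadoop/localfile.txt /tmp/copy.txt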

JobTracker and TaskTracker: The MapReduce engine

Jobs and tasks in Hadoop
Above the file systems comes the MapReduce engine, which consists of one JobTracker, to which client applications submit MapReduce jobs. The JobTracker pushes work out to available TaskTracker nodes in the cluster, striving to keep the work as close to the data as possible.
With a rack-aware file system, the JobTracker knows which node contains the data, and which other machines are nearby. If the work cannot be hosted on the actual node where the data resides, priority is given to nodes in the same rack. This reduces network traffic on the main backbone network.
If a TaskTracker fails or times out, that part of the job is rescheduled. The TaskTracker on each node spawns off a separate Java Virtual Machine process to prevent the TaskTracker itself from failing if the running job crashes the JVM. A heartbeat is sent from the TaskTracker to the JobTracker every few minutes to check its status. The Job Tracker and TaskTracker status and information is exposed by Jetty and can be viewed from a web browser.
 Hadoop 1.x MapReduce System is composed of the JobTracker, which is the master, and the per-node slaves, TaskTrackers
If the JobTracker failed on Hadoop 0.20 or earlier, all ongoing work was lost. Hadoop version 0.21 added some checkpointing to this process. The JobTracker records what it is up to in the file system. When a JobTracker starts up, it looks for any such data, so that it can restart work from where it left off.

Known limitations of this approach in Hadoop 1.x

The allocation of work to TaskTrackers is very simple. Every TaskTracker has a number of available slots (such as "4 slots"). Every active map or reduce task takes up one slot. The Job Tracker allocates work to the tracker nearest to the data with an available slot. There is no consideration of the current system load of the allocated machine, and hence its actual availability. If one TaskTracker is very slow, it can delay the entire MapReduce job—especially towards the end of a job, where everything can end up waiting for the slowest task. With speculative execution enabled, however, a single task can be executed on multiple slave nodes.

Apache Hadoop NextGen MapReduce (YARN)

MapReduce has undergone a complete overhaul in hadoop-0.23, and we now have what we call MapReduce 2.0 (MRv2), or YARN.
Apache™ Hadoop® YARN is a sub-project of Hadoop at the Apache Software Foundation introduced in Hadoop 2.0 that separates the resource management and processing components. YARN was born of a need to enable a broader array of interaction patterns for data stored in HDFS beyond MapReduce. The YARN-based architecture of Hadoop 2.0 provides a more general processing platform that is not constrained to MapReduce.
Architectural view of YARN
The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs.
The ResourceManager and per-node slave, the NodeManager (NM), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system.
The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.
Overview of Hadoop 1.0 and Hadoop 2.0
As part of Hadoop 2.0, YARN takes the resource management capabilities that were in MapReduce and packages them so they can be used by new engines. This also streamlines MapReduce to do what it does best: process data. With YARN, you can now run multiple applications in Hadoop, all sharing common resource management. Many organizations are already building applications on YARN in order to bring them into Hadoop.
A next-generation framework for Hadoop data processing
When enterprise data is made available in HDFS, it is important to have multiple ways to process that data. With Hadoop 2.0 and YARN, organizations can use Hadoop for streaming, interactive, and a world of other Hadoop-based applications.

What YARN does

YARN enhances the power of a Hadoop compute cluster in the following ways:
  • Scalability: The processing power in data centers continues to grow quickly. Because the YARN ResourceManager focuses exclusively on scheduling, it can manage those larger clusters much more easily.
  • Compatibility with MapReduce: Existing MapReduce applications and users can run on top of YARN without disruption to their existing processes.
  • Improved cluster utilization: The ResourceManager is a pure scheduler that optimizes cluster utilization according to criteria such as capacity guarantees, fairness, and SLAs. Also, unlike before, there are no named map and reduce slots, which helps to better utilize cluster resources.
  • Support for workloads other than MapReduce: Additional programming models such as graph processing and iterative modeling are now possible for data processing. These added models allow enterprises to realize near real-time processing and increased ROI on their Hadoop investments.
  • Agility: With MapReduce becoming a user-land library, it can evolve independently of the underlying resource manager layer and in a much more agile manner.

How YARN works

The fundamental idea of YARN is to split up the two major responsibilities of the JobTracker/TaskTracker into separate entities:
  • a global ResourceManager
  • a per-application ApplicationMaster
  • a per-node slave NodeManager and
  • a per-application container running on a NodeManager
The ResourceManager and the NodeManager form the new, and generic, system for managing applications in a distributed manner. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system. The per-application ApplicationMaster is a framework-specific entity and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the component tasks.

The ResourceManager has a scheduler, which is responsible for allocating resources to the various running applications, according to constraints such as queue capacities, user limits, etc. The scheduler performs its scheduling function based on the resource requirements of the applications.

The NodeManager is the per-machine slave, which is responsible for launching the applications' containers, monitoring their resource usage (cpu, memory, disk, network) and reporting the same to the ResourceManager.

Each ApplicationMaster has the responsibility of negotiating appropriate resource containers from the scheduler, tracking their status, and monitoring their progress. From the system perspective, the ApplicationMaster runs as a normal container.

Thursday, August 28, 2014

How to create a site-to-site IPsec VPN tunnel using Openswan in Linux

http://xmodulo.com/2014/08/create-site-to-site-ipsec-vpn-tunnel-openswan-linux.html

A virtual private network (VPN) tunnel is used to securely interconnect two physically separate networks through a tunnel over the Internet. Tunneling is needed when the separate networks are private LAN subnets with globally non-routable private IP addresses, which are not reachable to each other via traditional routing over the Internet. For example, VPN tunnels are often deployed to connect different NATed branch office networks belonging to the same institution.
Sometimes VPN tunneling may be used simply for its security benefit as well. Service providers or private companies may design their networks in such a way that vital servers (e.g., database, VoIP, banking servers) are placed in a subnet that is accessible to trusted personnel through a VPN tunnel only. When a secure VPN tunnel is required, IPsec is often a preferred choice because an IPsec VPN tunnel is secured with multiple layers of security.
This tutorial will show how we can easily create a site-to-site VPN tunnel using Openswan in Linux.

Topology

This tutorial will focus on the following topologies for creating an IPsec tunnel.



Installing Packages and Preparing VPN Servers

Usually, you will be managing site-A only, but based on the requirements, you could be managing both site-A and site-B. We start the process by installing Openswan.
On Red Hat based Systems (CentOS, Fedora or RHEL):
# yum install openswan lsof
On Debian based Systems (Debian, Ubuntu or Linux Mint):
# apt-get install openswan
Now we disable VPN redirects, if any, on the server with the following command:
# for vpn in /proc/sys/net/ipv4/conf/*; do echo 0 > $vpn/accept_redirects; echo 0 > $vpn/send_redirects; done
Next, we modify the kernel parameters to allow IP forwarding and disable redirects permanently.
# vim /etc/sysctl.conf
net.ipv4.ip_forward = 1
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.send_redirects = 0
Reload /etc/sysctl.conf:
# sysctl -p
We allow necessary ports in the firewall. Please make sure that the rules are not conflicting with existing firewall rules.
# iptables -A INPUT -p udp --dport 500 -j ACCEPT
# iptables -A INPUT -p tcp --dport 4500 -j ACCEPT
# iptables -A INPUT -p udp --dport 4500 -j ACCEPT
Finally, we create firewall rules for NAT.
# iptables -t nat -A POSTROUTING -s site-A-private-subnet -d site-B-private-subnet -j SNAT --to site-A-Public-IP
Please make sure that the firewall rules are persistent.
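On Red Hat based systems, for example, the running rules can be saved so that they survive a reboot (assuming the classic iptables service is in use):
# service iptables save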
Note:
  • You could use MASQUERADE instead of SNAT. Logically it should work, but it caused me to have issues with virtual private servers (VPS) in the past. So I would use SNAT if I were you.
  • If you are managing site-B as well, create similar rules in site-B server.
  • Direct routing does not need SNAT.

Preparing Configuration Files

The first configuration file that we will work with is ipsec.conf. Regardless of which server you are configuring, always consider your site as 'left' and remote site as 'right'. The following configuration is done in siteA's VPN server.
# vim /etc/ipsec.conf
## general configuration parameters ##
 
config setup
        plutodebug=all
        plutostderrlog=/var/log/pluto.log
        protostack=netkey
        nat_traversal=yes
        virtual_private=%v4:10.0.0.0/8,%v4:192.168.0.0/16,%v4:172.16.0.0/16
        ## disable opportunistic encryption in Red Hat ##
        oe=off
 
## disable opportunistic encryption in Debian ##
## Note: this is a separate declaration statement ##
include /etc/ipsec.d/examples/no_oe.conf
 
## connection definition in Red Hat ##
conn demo-connection-redhat
        authby=secret
        auto=start
        ike=3des-md5
        ## phase 1 ##
        keyexchange=ike
        ## phase 2 ##
        phase2=esp
        phase2alg=3des-md5
        compress=no
        pfs=yes
        type=tunnel
        left=siteA-public-IP
        leftsourceip=siteA-public-IP
        leftsubnet=siteA-private-subnet/netmask
        ## for direct routing ##
        leftsubnet=siteA-public-IP/32
        leftnexthop=%defaultroute
        right=siteB-public-IP
        rightsubnet=siteB-private-subnet/netmask
 
## connection definition in Debian ##
conn demo-connection-debian
        authby=secret
        auto=start
        ## phase 1 ##
        keyexchange=ike
        ## phase 2 ##
        esp=3des-md5
        pfs=yes
        type=tunnel
        left=siteA-public-IP
        leftsourceip=siteA-public-IP
        leftsubnet=siteA-private-subnet/netmask
        ## for direct routing ##
        leftsubnet=siteA-public-IP/32
        leftnexthop=%defaultroute
        right=siteB-public-IP
        rightsubnet=siteB-private-subnet/netmask
Authentication can be done in several different ways. This tutorial will cover the use of pre-shared key, which is added to the file /etc/ipsec.secrets.
# vim /etc/ipsec.secrets
siteA-public-IP  siteB-public-IP:  PSK  "pre-shared-key"
## in case of multiple sites ##
siteA-public-IP  siteC-public-IP:  PSK  "corresponding-pre-shared-key"

Starting the Service and Troubleshooting

The server should now be ready to create a site-to-site VPN tunnel. If you are managing siteB as well, please make sure that you have configured the siteB server with necessary parameters. For Red Hat based systems, please make sure that you add the service into startup using chkconfig command.
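For example:
# chkconfig ipsec on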
# /etc/init.d/ipsec restart
If there are no errors on either end server, the tunnel should be up now. Taking the following into consideration, you can test the tunnel with the ping command.
  1. The siteB-private subnet should not be reachable from site A, i.e., ping should not work if the tunnel is not up.
  2. After the tunnel is up, try ping to siteB-private-subnet from siteA. This should work.
Also, the routes to the destination's private subnet should appear in the server's routing table.
# ip route
[siteB-private-subnet] via [siteA-gateway] dev eth0 src [siteA-public-IP]
default via [siteA-gateway] dev eth0
Additionally, we can check the status of the tunnel using the following useful commands.
# service ipsec status
IPsec running  - pluto pid: 20754
pluto pid 20754
1 tunnels up
some eroutes exist
# ipsec auto --status
## output truncated ##
000 "demo-connection-debian":     myip=; hisip=unset;
000 "demo-connection-debian":   ike_life: 3600s; ipsec_life: 28800s; rekey_margin: 540s; rekey_fuzz: 100%; keyingtries: 0; nat_keepalive: yes
000 "demo-connection-debian":   policy: PSK+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW+SAREFTRACK+lKOD+rKOD; prio: 32,28; interface: eth0;

## output truncated ##
000 #184: "demo-connection-debian":500 STATE_QUICK_R2 (IPsec SA established); EVENT_SA_REPLACE in 1653s; newest IPSEC; eroute owner; isakmp#183; idle; import:not set

## output truncated ##
000 #183: "demo-connection-debian":500 STATE_MAIN_I4 (ISAKMP SA established); EVENT_SA_REPLACE in 1093s; newest ISAKMP; lastdpd=-1s(seq in:0 out:0); idle; import:not set
The log file /var/log/pluto.log should also contain useful information regarding authentication, key exchanges and information on different phases of the tunnel. If your tunnel doesn't come up, you could check there as well.
If you are sure that all the configuration is correct, and if your tunnel is still not coming up, you should check the following things.
  1. Many ISPs filter IPsec ports. Make sure that UDP 500 and TCP/UDP 4500 are allowed by your ISP. You could try connecting to your server's IPsec ports from a remote location, for example with telnet for the TCP port, or with a tool such as nmap for the UDP ports.
  2. Make sure that necessary ports are allowed in the firewall of the server/s.
  3. Make sure that the pre-shared keys are identical in both end servers.
  4. The left and right parameters should be properly configured on both end servers.
  5. If you are facing problems with NAT, try using SNAT instead of MASQUERADING.
To sum up, this tutorial focused on the procedure for creating a site-to-site IPsec VPN tunnel in Linux using Openswan. VPN tunnels are very useful in enhancing security, as they allow admins to make critical resources available only through the tunnels. VPN tunnels also ensure that data in transit is secured from eavesdropping or interception.
Hope this helps. Let me know what you think.