Wednesday, April 22, 2015

What are good open-source log monitoring tools on Linux

http://xmodulo.com/open-source-log-monitoring-tools-linux.html

In an operating system, logs are all about keeping track of events, be it critical system errors, resource usage warnings, transaction history, application status, or user activities. These logs, which are stored as (text or binary) files in the system, are useful for system auditing, debugging and maintenance. However, with so many different system entities generating log files at an ever-growing rate, the challenge for a system admin is how to "consume" these log files effectively.
That's where log monitoring tools come into the picture. They streamline the often laborious process of collecting, parsing and analyzing log files, and alert system admins to any interesting events. These tools are designed from the ground up for log monitoring, so they offer many attractive features, such as scalable log aggregation and filtering, human-readable display, event correlation, visual or email notification, flexible log retention policy, and so on.
In this post, I am going to introduce a list of popular open-source log monitoring software for Linux, ranging from simple log file viewers to full-blown log monitoring frameworks.

Log Aggregation and Filtering

Log monitoring would not be possible without efficient and scalable mechanisms to collect and pre-process log files. Tools in this category focus on shipping, collecting, filtering, indexing and storing log files, so that they can be further analyzed and visualized in subsequent monitoring pipelines.
1. rsyslog: an open-source log collector server which can filter and consolidate log data (based on the syslog protocol) from different hosts and devices in the network. rsyslog can be configured as a server or a client, where the former plays the role of a log collector and the latter runs as a log sender.
2. syslog-ng: another open-source implementation of the syslog protocol with more advanced and user-friendly features such as content-based filtering, easier-to-understand config format, and real-time event correlation.
3. systemd journal: systemd journal can be configured for remote journal logging, where locally logged events are forwarded to a remote server over HTTP. In this setup, systemd-journal-upload on a client host serializes and forwards journal messages to systemd-journal-remote running on a remote collector server.
4. logstash: an open-source tool that collects, parses, and stores log files for offline search and analysis. logstash can run in various pipelines thanks to many plugins supporting different input/output interfaces, decoding/encoding, and filtering rules. Input plugins allow logstash to gather log files from different sources and protocols (e.g., files, S3, RabbitMQ, syslog, collectd, TCP/UDP sockets). Filter/codec plugins are used to parse, convert, modify and add metadata to log files. Output plugins pass processed log files to various targets (e.g., file, Google cloud storage, Nagios, S3, Zabbix).
5. collectd: a daemon service which gathers various system-level statistics, and stores them for historical analysis or real-time graphing. Similar to logstash, collectd has an extensible architecture, where you can enable various input/output plugins to change its collection behavior. For log collection, collectd can leverage LogFile and Network plugins to aggregate remote log files.
6. Logster: an open-source utility for parsing log files for any interesting data, and aggregating extracted data into metrics for subsequent reporting and graphing pipelines.
7. Fluentd: a unified log aggregation layer which allows in-stream processing for a variety of streaming data and log files. It comes with a huge plugin ecosystem with more than 300 plugins to support various input sources and output interfaces.
8. Nxlog: a unified log collector and forwarder which supports a variety of log sources, formats and protocols. Advanced features include multi-threaded log collection and processing, message buffering and prioritization, built-in log rotation, and TLS/SSL transport.
9. Scribe: a scalable log collector server developed by Facebook. Scribe can aggregate log data which is streamed in real time from a large number of clients. It uses Apache Thrift for protocol encoding, so its interface is compatible with pretty much any language. While a proven solution, Scribe is not something you can deploy quickly as a turnkey system. Also, note that Scribe is no longer updated or maintained.
10. Flume: a highly scalable and reliable service to transport and collect large volumes of streaming log data from any clients, and store them in backend storage such as Apache Hadoop's HDFS.
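
To make the input/filter/output idea behind tools like logstash more concrete, here is a minimal Python sketch of a three-stage pipeline. It is not how logstash itself is implemented or configured. The line format, the function names (read_input, parse_and_filter, write_output) and the sample data are all assumptions made up for illustration.

```python
import json
import re
from typing import Iterable, Iterator, Optional

# Assumed sample format: "<timestamp> <host> <program>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) (?P<program>[^:]+): (?P<message>.*)$"
)


def read_input(lines: Iterable[str]) -> Iterator[str]:
    """Input stage: yield non-empty raw lines from any iterable (file, list, ...)."""
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield line


def parse_and_filter(
    lines: Iterable[str], keep_program: Optional[str] = None
) -> Iterator[dict]:
    """Filter stage: parse lines into fields and drop unparsable or unwanted events."""
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        event = match.groupdict()
        if keep_program is not None and event["program"] != keep_program:
            continue
        event["length"] = len(event["message"])  # example of added metadata
        yield event


def write_output(events: Iterable[dict]) -> list:
    """Output stage: serialize events as JSON lines (a stand-in for a real sink)."""
    return [json.dumps(event, sort_keys=True) for event in events]


# Hypothetical sample data
sample = [
    "2015-04-22T10:00:01 web1 sshd: Failed password for root",
    "garbage line",
    "2015-04-22T10:00:02 web1 cron: job started",
]
output = write_output(parse_and_filter(read_input(sample), keep_program="sshd"))
```

Because each stage is a generator, events flow through one at a time, which is roughly why this style of pipeline can keep up with files that are still growing.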

Log Browsing

Tools in this category are frontends for log monitoring, which allow system admins to view (raw or processed) log files in a human-friendly interface.
11. multitail: a command-line tool that colorizes and shows (growing) logfiles in multiple ncurses windows. Optionally, multitail can filter log lines using regular expressions, and show merged output of more than one log file in a single window.

12. KLogView: a GUI-based logfile viewer for KDE desktop, which can show any number of log files in one or more dockable panels. Optionally, filters and alerts can be set up, and log files are visualized in custom colors and fonts.

13. lnav: a console-based log-file viewer which shows a merged and filtered view of multiple log files via intuitive interfaces. Visual guides include syntax highlighting, structured views (e.g., XML, JSON), and interactive SQL queries. lnav supports various log formats such as the common web access log format, syslog, VMware ESXi, and strace.

14. log.io: a web browser based log monitoring tool powered by node.js and socket.io. Log data on remote hosts are delivered in real time by co-located Harvesters to a centralized node.js server, and the central log server then streams aggregated log data to client web browser. By design log.io is stateless with no persistent storage.
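
The "merged and filtered view" that multitail and lnav offer can be approximated in a few lines of Python. The sketch below is only an illustration of the idea, not the way either tool works internally. It assumes every line starts with a sortable timestamp (such as ISO 8601) followed by a space, and that each input is already in time order. The function name merge_logs and the sample lines are hypothetical.

```python
import heapq
import re
from typing import Iterable, Iterator, Optional


def merge_logs(*sources: Iterable[str], pattern: Optional[str] = None) -> Iterator[str]:
    """Merge time-ordered log streams by leading timestamp, optionally filtering by regex."""
    regex = re.compile(pattern) if pattern else None
    # heapq.merge lazily interleaves already-sorted inputs using the given key
    merged = heapq.merge(*sources, key=lambda line: line.split(" ", 1)[0])
    for line in merged:
        if regex is None or regex.search(line):
            yield line


# Hypothetical sample data standing in for two log files
auth = [
    "2015-04-22T10:00:01 sshd: Failed password",
    "2015-04-22T10:00:05 sshd: Accepted password",
]
web = [
    "2015-04-22T10:00:03 nginx: GET /index.html 200",
    "2015-04-22T10:00:04 nginx: GET /missing 404",
]

merged = list(merge_logs(auth, web))
filtered = list(merge_logs(auth, web, pattern=r"Failed| 404$"))
```

Here merged interleaves both sources in timestamp order, while filtered keeps only the failed login and the 404 request.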

Log Inspection and Analysis

Once log data has been collected in a centralized location, the question is how to analyze log data to obtain any meaningful results. Tools in this category help with log data analysis.
15. sysdig: originally developed as a system monitoring and troubleshooting tool, sysdig can become a good log analyzer since log files are essential input for system monitoring. In fact, sysdig's strength lies in powerful Lua scripts (called "chisels"), and it comes with several chisels that can analyze system and application logs, and correlate log data with other system-level events.
16. Graphite: an open-source real-time graphing tool which can collect and visualize time-series data. Integrated with other log collectors such as logster, it can enable graphing/trending analysis of log data, with automatic data retention and threshold alarming.
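
As a rough illustration of how a logster-style parser could feed Graphite, the following Python sketch counts HTTP status classes in access log lines and formats them as "path value timestamp" lines, which is the shape of Graphite's plaintext protocol. The metric prefix, function names, timestamp and sample lines are assumptions of mine, and actually sending the lines to a Carbon listener is left out.

```python
import re
from collections import Counter
from typing import Iterable, List

# Matches the 3-digit status code that follows the quoted request line
STATUS_RE = re.compile(r'" (?P<status>\d{3}) ')


def count_status_classes(lines: Iterable[str]) -> Counter:
    """Aggregate access log lines into counts per status class (2xx, 4xx, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group("status")[0] + "xx"] += 1
    return counts


def to_graphite_lines(counts: Counter, prefix: str, timestamp: int) -> List[str]:
    """Format counts as '<metric path> <value> <timestamp>' lines, sorted by name."""
    return [
        f"{prefix}.http_{name} {value} {timestamp}"
        for name, value in sorted(counts.items())
    ]


# Hypothetical sample data
access_log = [
    '10.0.0.1 - - [22/Apr/2015:10:00:01 +0000] "GET / HTTP/1.1" 200 512',
    '10.0.0.2 - - [22/Apr/2015:10:00:02 +0000] "GET /missing HTTP/1.1" 404 128',
    '10.0.0.1 - - [22/Apr/2015:10:00:03 +0000] "GET /about HTTP/1.1" 200 256',
]
metrics = to_graphite_lines(count_status_classes(access_log), "web1", 1429696800)
```

Each resulting line is one data point. Emitting such lines at a fixed interval is what makes trending of log-derived metrics possible.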

Application-Specific Log Monitoring

Some log monitoring tools are developed for specific applications (e.g., web server or proxy). These tools process locally-generated log files of a target application to enable visual reports, structured queries or automatic actions off of the log files. They can come in handy when you don't want to set up a full-blown remote log monitoring system.
17. GoAccess: a real-time HTTP server log file analyzer which lets you view various HTTP server statistics interactively. GoAccess supports Apache, Nginx, Amazon CloudFront, and Microsoft IIS log formats.

18. ngxtop: a console-based program which parses nginx access logs, and shows nginx web-server statistics in real time via top-like interfaces.

19. fail2ban: an open-source intrusion prevention tool that inspects log files of well-known services such as SSH or Apache web server. Based on log monitoring, fail2ban protects these services from unauthorized access or brute-force attacks by setting up automatic iptables blocking rules.
20. asql: a command-line tool which converts unstructured plain-text Apache web server logs into a structured SQLite database, so that system admins can query various web server statistics from the database.

21. Sarg: a dedicated Squid log analyzer that reads Squid log files and generates human-readable HTML reports about the network traffic proxied by Squid.
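
The idea behind asql, turning plain-text web server logs into something you can query with SQL, can be sketched with Python's built-in sqlite3 module. This is not asql's own schema or code. The table name hits, its columns, the regular expression (which assumes the Common Log Format) and the sample lines are all my assumptions.

```python
import re
import sqlite3
from typing import Iterable

# Assumes Apache Common Log Format
CLF_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def load_access_log(lines: Iterable[str]) -> sqlite3.Connection:
    """Parse access log lines into an in-memory SQLite table named 'hits'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits "
        "(host TEXT, time TEXT, method TEXT, path TEXT, status INTEGER, size INTEGER)"
    )
    for line in lines:
        match = CLF_RE.match(line)
        if match is None:
            continue  # ignore lines that are not in the assumed format
        size = 0 if match.group("size") == "-" else int(match.group("size"))
        conn.execute(
            "INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?)",
            (
                match.group("host"),
                match.group("time"),
                match.group("method"),
                match.group("path"),
                int(match.group("status")),
                size,
            ),
        )
    conn.commit()
    return conn


# Hypothetical sample data
access_log = [
    '10.0.0.1 - - [22/Apr/2015:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [22/Apr/2015:10:00:02 +0000] "GET /missing HTTP/1.1" 404 -',
    '10.0.0.1 - - [22/Apr/2015:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 512',
]
conn = load_access_log(access_log)
top_paths = conn.execute(
    "SELECT path, COUNT(*) FROM hits GROUP BY path ORDER BY COUNT(*) DESC"
).fetchall()
```

Once the data is in a table, questions such as "which paths return the most 404s" become one-line queries instead of ad hoc grep and awk pipelines.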

All-in-One Log Monitoring

Tools in this category tout comprehensive log monitoring features, such as log collection, system-wide monitoring, customizable monitoring targets, detailed reports, notifications, etc. Pretty much everything is packed into a single system.
22. logwatch: an open-source log parser and analyzer which can interpret a wide range of common service and application logs, and generate customizable HTML reports ready for email delivery.
23. Nagios: an enterprise-class network and infrastructure monitoring system which comes with extensible monitoring and alerting capabilities. Nagios plugins can turn Nagios into a centralized log monitoring server, where you can view the status of custom log checks and get notified of any threshold breaches.
24. Graylog: a fully-integrated log management platform which is capable of collecting, indexing, storing and analyzing virtually any kind of data (both structured and unstructured) from remote servers. Written in Java, Graylog is easy to set up, and requires little maintenance. Optionally, its input interface can be integrated with different log collectors such as rsyslog, syslog-ng and logstash. It also features a great and easy-to-understand web-based dashboard with pre-defined views for quick access.

25. OSSEC: a host-based intrusion detection system which can perform detailed log analysis based on pre-built rules. While OSSEC is highly customizable, out of the box it can analyze system log files to detect many well-known network-based attacks, system errors, software misuse and policy violations, and then alert you by email or trigger an active response. It also allows a configurable log retention policy, and can integrate with syslog infrastructure for remote monitoring.
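
Rule-based log analysis of the kind that OSSEC and fail2ban perform boils down to matching log lines against patterns and acting once a threshold is crossed. The Python sketch below shows that logic only. It is not the rule format or implementation of either tool, it takes no blocking action, and the pattern, the threshold and the sample lines are assumptions for illustration.

```python
import re
from collections import defaultdict
from typing import Iterable, List

# Hypothetical rule: an sshd failed-login line that names the source IPv4 address
FAILED_LOGIN_RE = re.compile(
    r"Failed password for .* from (?P<ip>\d+\.\d+\.\d+\.\d+)"
)


def find_offenders(lines: Iterable[str], max_failures: int = 3) -> List[str]:
    """Return source IPs with at least max_failures failed logins, sorted."""
    failures = defaultdict(int)
    for line in lines:
        match = FAILED_LOGIN_RE.search(line)
        if match:
            failures[match.group("ip")] += 1
    return sorted(ip for ip, count in failures.items() if count >= max_failures)


# Hypothetical sample data
auth_log = [
    "Apr 22 10:00:01 host sshd[123]: Failed password for root from 203.0.113.5 port 22 ssh2",
    "Apr 22 10:00:02 host sshd[123]: Failed password for root from 203.0.113.5 port 22 ssh2",
    "Apr 22 10:00:03 host sshd[124]: Failed password for admin from 198.51.100.7 port 22 ssh2",
    "Apr 22 10:00:04 host sshd[123]: Failed password for root from 203.0.113.5 port 22 ssh2",
    "Apr 22 10:00:05 host sshd[125]: Accepted password for alice from 192.0.2.10 port 22 ssh2",
]
offenders = find_offenders(auth_log)
```

A real tool would also limit the count to a time window and expire bans after a while, which this sketch leaves out.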

Conclusion

In this post, I review popular Linux log management tools and monitoring software. This is by no means an exhaustive list, and can only grow with your input. If you have any good (or bad) experience with any existing log monitoring tool, share your experience. I'll be happy to incorporate your comments.

