The difference between mediocre products and great products is logging. Learn why it’s so, and how to tie it all together.
Just like security, logging is another key component of web applications (or applications in general) that gets sidelined because of old habits and the inability to see ahead. What many see as useless reams of digital tape are powerful tools to look inside your applications, correct errors, improve weak areas, and delight customers.
Before we get on to centralized logging, let’s first look into why logging is such a big deal.
Two types (levels) of logging
Computers are deterministic systems, except when they’re not.
As a professional developer, I’ve come across many cases where the observed behavior of the app baffled everyone for days on end, but the key was always in the logs. Every piece of software we run produces (or at least should generate) logs, which tell us what it was going through when the problematic situation occurred.
Now, logging, as I see it, is of two types: auto-generated logs and programmer-generated logs. Please note that this isn’t any textbook differentiation, and quoting me on this terminology will land you in trouble. 😉
The image above shows what can be termed as an auto-generated log.
In this specific case, it’s a WordPress system logging an unexpected condition (a Notice) when running some PHP code. Logs like these are being generated all the time tirelessly — by database tools like MySQL, web servers like Apache, programming languages and environments, mobile devices, and even operating systems.
These rarely contain much value, and programmers don’t even bother to look into them, except when something goes wrong. At such moments, they dig deep into the logs, trying to understand what went wrong.
But auto-generated logs can help only so much. If several people have admin access to a site, for example, and one of them happens to delete an essential piece of information, it’s impossible to detect the culprit with the use of auto-generated logs. From the perspective of the systems tied together as the application, it was just another day on the job: someone had the needed authority to execute a task, and so the system carried it out.
What’s needed here is an additional layer of explicit, extensive logging that creates trails for the human side of things. These are what I term as programmer-generated logs, and they form the backbone of sensitive industries like banking. Here’s an example of what such a logging scheme might look like:
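As a rough sketch of the idea in Python, an audit trail could be written as one JSON record per human action. The `audit` helper, the logger name, and the field names (`actor`, `action`, `target`, `ip`) are all assumptions made for illustration, not a standard scheme:

```python
import io
import json
import logging
from datetime import datetime, timezone


def build_audit_logger(stream):
    """Return a logger that writes one JSON audit record per line to `stream`."""
    logger = logging.getLogger("audit")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def audit(logger, actor, action, target, ip):
    """Record who did what, to which object, from where, and when."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "target": target,
        "ip": ip,
    }
    logger.info(json.dumps(record, sort_keys=True))
    return record


# Usage: an admin deletes a record, and the trail names them explicitly.
buffer = io.StringIO()  # stands in for a real log file
audit_log = build_audit_logger(buffer)
audit(audit_log, actor="admin_2", action="delete", target="invoice:1043", ip="203.0.113.7")
print(buffer.getvalue().strip())
```

In a real system the stream would be an append-only file or a log collector rather than an in-memory buffer, but the point stands: the record answers "who did this?" in a way the web server's own logs cannot.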
Logging is power
So, given these two types of logs in a system, here’s how you can leverage them and ramp up the impact.
Staying ahead of the customer
“Customer delight” has come to be known as a useless marketing gimmick, but thanks to logging, it can be made very real. I know of digital products that monitor their logs like a hawk, and as soon as a customer breaks something on the page, they can call the customer and offer to help.
Just think about it: within seconds of getting an ugly error, you get a call from the company that says, “Hey, I understand you were trying to add this item to the cart, but it kept dying. Is it okay for me to add this item and complete the order for you?”
Delighted customer? You bet!
Team morale and productivity
Like I said before, when bugs go untracked for a long time, the developers in your team get frustrated and lose more and more time chasing their tails. And here’s the thing with debugging — it requires a fresh, curious mind from the start. If a WTF thought so much as enters your brain, the whole process goes for a toss.
And what makes debugging hard? In my experience, lack of logging, or the lack of knowledge of logging. For starters, you may not realize that your favorite database is also just another piece of software that generates logs, or you may not be logging extensively in your application (see programmer-generated logs above).
I particularly remember a case where the application was going unresponsive, and no one knew why. A few days later, the culprit turned out to be the disk I/O limit, reached due to excessive traffic. Because no one had bothered to look in the relevant logs, it took that long to find out.
What if two years down the line your customer says that all those orders weren’t placed by them but some hacker?
What argument would you have to entertain or reject their request? If you have extensive logging (IP address, date and time, credit card, etc.), then you’ll be able to analyze all that and reach a decision. Good or bad, it will at least have some objective basis, rather than resembling a shot in the dark.
The same is true if you come under some regulatory lens or are required to undergo a third-party audit as part of a new, important project. Not having a robust logging system will show you in a bad light.
Improving existing systems
How do you go about improving the current system?
Should you merely throw more RAM and CPU threads at it? What if your app is slow despite enough resources? Where is the bottleneck? More often than not, logging is the answer.
For instance, all major database systems have a feature for logging slow queries.
If you visit the slow query log regularly, you’ll get to know which operations are taking the most time, and hence uncover small but important areas that need work. Often, a small change like this works better than doubling the hardware capacity.
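To make that concrete, here is a minimal Python sketch that ranks statements by the total time they consumed. The log layout is an assumption: a simplified format in which a `# Query_time: <seconds>` comment line is followed by the SQL statement, loosely modeled on what MySQL writes. Real slow query logs carry more fields, and the sample data and the `slowest_queries` helper are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed, simplified layout: a "# Query_time: <seconds>" comment line
# followed by the SQL statement on the next line. Sample data is invented.
SAMPLE_LOG = """\
# Query_time: 4.20
SELECT * FROM orders WHERE customer_id = 42;
# Query_time: 0.35
SELECT name FROM products WHERE id = 7;
# Query_time: 6.10
SELECT * FROM orders WHERE customer_id = 42;
"""

QUERY_TIME = re.compile(r"^# Query_time:\s*([0-9.]+)")


def slowest_queries(log_text, top=3):
    """Sum the time spent per statement and return the most expensive ones."""
    totals = defaultdict(float)
    pending = None  # a Query_time value waiting for its statement line
    for line in log_text.splitlines():
        match = QUERY_TIME.match(line)
        if match:
            pending = float(match.group(1))
        elif pending is not None and line.strip() and not line.startswith("#"):
            totals[line.strip()] += pending
            pending = None
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


for statement, seconds in slowest_queries(SAMPLE_LOG):
    print(f"{seconds:6.2f}s  {statement}")
```

Here the `orders` lookup floats to the top because it runs twice and is slow both times, which is exactly the kind of statement that might benefit from an index before anyone buys more hardware.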
There’s no counting how many ways a good logging system helps you. Perhaps the best argument is that it’s an automated activity that, once set up, doesn’t need any monitoring, and will save you from ruin someday.
With that out of the way, let’s look at some of the amazing Open Source Log Collectors (unified logging tools) out there. Just in case you’re wondering, we did cover commercial cloud-based logging tools in an earlier post.
Graylog is one of the leading names in the industry when it comes to industry-grade logging and visualization capabilities. It’s also unique in that it scans your collected logs for signs of security vulnerabilities and notifies you instantly.
While Graylog is a centralized logging system, it has the flexibility you need, letting you customize alerts, dashboards, and more.
Graylog is open-source, but there’s an enterprise plan if your needs are complex.
With clients like SAP, Cisco, and LinkedIn on its roster, Graylog is a tool you can trust with your eyes closed.
If you’re a fan or user of the Elastic stack, Logstash is worth checking out (the ELK stack is already a thing, in case you didn’t know). Like other logging tools on this list, Logstash is fully open-source, allowing you the freedom to deploy and use it as you wish.
But don’t be misled: Logstash is a mothership with capabilities far outweighing any humble logging tool. It can collect vast amounts of data from multiple platforms, lets you define and execute your own data pipelines, makes sense of unstructured log dumps, and more.
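“Making sense of unstructured log dumps” mostly means turning raw lines into named fields. The sketch below shows the general idea in plain Python for a web server access line in Common Log Format. It is not Logstash configuration or its API (Logstash has its own filter plugins for this), and the `parse_line` helper and field names are assumptions chosen for illustration.

```python
import re

# Common Log Format, as written by web servers such as Apache.
LINE = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Turn one raw access-log line into a dict, or None if it does not match."""
    match = LINE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["status"] = int(event["status"])
    # A "-" in the size column means no body was sent.
    event["size"] = 0 if event["size"] == "-" else int(event["size"])
    return event


raw = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
print(parse_line(raw))
```

Once every line is a dictionary like this, questions such as “which paths return the most 500s?” become simple filters instead of ad hoc text searches.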
Of course, the main limitation is that it’s designed first and foremost to work with the Elastic suite of products, but if you’re starting and looking to scale soon, Logstash is the way to go!
Among centralized logging tools that work as a middle layer for data ingestion, Fluentd is a first among equals. With an excellent library of plugins, Fluentd is able to capture data from virtually any production system, knead it into the desired structure, build a custom pipeline, and feed it to your favorite analytics platform, be it MongoDB or Elasticsearch.
Fluentd is built on Ruby, is entirely open source, and is extensively popular because of its flexibility and modularity.
With major companies like Microsoft, Atlassian, and Twilio using the platform, Fluentd has nothing to prove. 🙂
If really, really large data sets are your challenge, and you eventually want to feed everything into something like Hadoop, Flume is one of the best choices around. It’s a “pure” open source project, in the sense that it’s maintained by our beloved Apache Software Foundation, which means there is no enterprise plan.
This may or may not be what you’re exactly looking for. 🙂
Written in Java (which continues to astonish me when it comes to groundbreaking tech), Flume’s source code is entirely open. Flume is best for you if you’re looking for a distributed, fault-tolerant data ingestion platform for heavy-duty stuff.
I give it zero out of ten for product naming, but Octopussy can be a good choice if your needs are simple, and you’re wondering what all the fuss about pipelines, ingestion, aggregation, etc., is.
In my opinion, Octopussy covers the needs of most of the products out there (estimated stats are useless, but if I had to guess, I’d say it takes care of 80% of use cases in the real world).
Octopussy doesn’t have a great UI at all, but it makes up for it in speed and lack of bloat. The source is available on GitHub, as expected, and I do think it’s worth a serious look.
Rsyslog stands for a rocket-fast system for log processing.
It is a utility for Unix-like operating systems. In technical terms, it is a message router with dynamically loadable inputs and outputs and is highly configurable.
It can take input from multiple data sources, transform it, and send the output to several destinations. With Rsyslog, you can deliver 1 million messages per second over local destinations.
Rsyslog also provides a Windows agent that works closely with the Rsyslog Linux agent and integrates the two environments. The Windows agent forwards Windows event logs and can set up a file monitor service.
Below are other features offered by Rsyslog:
- Flexible configurations
- Provides multi-threading capabilities
- Log file manipulation protection using log signatures and encryption.
- Supports Big Data platforms
- Provides content-based filtering capabilities
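To give a feel for what “log signatures” buy you, here is a minimal Python sketch of a chained signature: each line’s HMAC also covers the previous signature, so editing or dropping an earlier line breaks everything after it. This illustrates the general technique only. It is not Rsyslog’s actual signing mechanism, and the key and the helper names (`sign_lines`, `verify_lines`) are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical key, for illustration only


def sign_lines(lines, key=SECRET_KEY):
    """Return (line, signature) pairs where each signature covers the previous one."""
    signed = []
    previous = b""
    for line in lines:
        digest = hmac.new(key, previous + line.encode("utf-8"), hashlib.sha256).hexdigest()
        signed.append((line, digest))
        previous = digest.encode("ascii")
    return signed


def verify_lines(signed, key=SECRET_KEY):
    """Recompute the chain and report whether every signature still matches."""
    previous = b""
    for line, digest in signed:
        expected = hmac.new(key, previous + line.encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, digest):
            return False
        previous = digest.encode("ascii")
    return True


log = sign_lines(["user=alice action=login", "user=alice action=delete id=9"])
print(verify_lines(log))  # True: the chain is intact

# Rewrite the first line but keep its old signature, as a tamperer might.
tampered = [("user=bob action=login", log[0][1])] + log[1:]
print(verify_lines(tampered))  # False: the edited line no longer matches
```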
LOGalyze was a commercial product that was recently made open source. Though I couldn’t find the project on GitHub, they do make a Windows installer and all source code downloadable.
If you’re intent on a community, you can find details of a mailing list here.
LOGalyze is a relatively flexible and powerful offering that will work nicely for single-system deployments that seek to combine logging from known sources like Postfix, Apache, etc., and produce the output in CSV, PDF, HTML or similar formats. Yes, it doesn’t do everything, but since it was a commercial product at one time, what it does, it does rather well.
When it comes to choosing a tool for the job, I have two criteria: it has to be focused, and it has to be backed by an active business model. The problem with open-source software, in general, is that a few months/years down the road, chances of stagnation or death are high. There’s no count of how many logging tools were launched with gusto, only to be found now in the GitHub graveyard.
Measured by this yardstick, LogPacker is a favorite for me.
As you can tell from the screenshot, LogPacker is all about logs, and nothing else. Their push is definitely towards their cloud offerings, but you are more than welcome to download and install it on your servers (GitHub page here).
Clustering and aggregation are available for those wanting to use it on a non-trivial scale, and enterprise plans are available for those who want to work with the API or need larger deployments. A refreshingly minimalistic (focused, though not feature-poor) take on logging management, in my opinion!
I’m sure there are those among us who don’t want all the ceremony associated with a “unified,” “centralized” logging system. Their business comes from single servers, and they’re looking for something quick and efficient for watching their log files. Well, say hello to Logwatch.
Once installed, Logwatch can scan your system logs and create a report of the type you want. It’s a somewhat dated piece of software (read “reliable”), though, and was written in Perl, so you’ll need Perl 5.6+ on your server to run it. I don’t have any screenshots to share as it’s a purely command-line, daemonized process.
If you’re a CLI junkie and have a love for the old-school way of doing things, you’ll love Logwatch!
The syslog-ng tool was developed as a way to process Syslog (an established client-server protocol for system logging) data files in real time. Over time, though, it has come to support other data formats: unstructured, SQL, and NoSQL. How the Syslog protocol works is pretty much summed up in the following illustration.
syslog-ng is a production-grade, reliable log collection and classification tool that was written in C and has been an established name in the industry for a long time. The best part is its extensibility, allowing you to write plugins in C, Python, Java, Lua, or Perl.
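To make the Syslog message format a little more concrete: every message starts with a priority value in angle brackets, which encodes the facility and the severity as `facility * 8 + severity`. The Python sketch below decodes it. The `decode_pri` helper is hypothetical and has nothing to do with syslog-ng’s own code; the sample line follows the classic BSD syslog style.

```python
import re

# Severity names in numeric order, 0 (most severe) through 7.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

PRI = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)


def decode_pri(message):
    """Split a syslog line into (facility number, severity name, rest of message)."""
    match = PRI.match(message)
    if match is None:
        raise ValueError("no <PRI> header found")
    value = int(match.group(1))
    facility, severity = divmod(value, 8)  # PRI = facility * 8 + severity
    return facility, SEVERITIES[severity], match.group(2)


line = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
print(decode_pri(line))
```

For the sample line, 34 splits into facility 4 (security and authorization messages) and severity 2 (critical), which is how a collector can route or filter a message before it even looks at the text.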
Short for Log Navigator, lnav is a pure-terminal tool that works on a single machine, single directory. It’s for those who have their logging unified into a single directory or want to filter and display real-time logs from a single source.
If you thought lnav was nothing more than glorified grep, you’d be wrong. There are several features that will make you fall in love with it: time-series view, pretty-printing (for JSON and other formats), color-coded log sources, powerful filters, ability to understand several logging protocols, and more.
It’s just that sometimes you want a zero-hassle, zero setup, maybe-temporary logging layer, and lnav fits the bill perfectly!
And there you have it!
It was a hard list to compile, to be frank, as logging isn’t as popular as, say, content management, and all mindshare seems to have been grabbed by three or four tools. Still, everyone’s needs are different, and I’ve tried to cover them extensively.
From silly command-line, no-setup tools to full-blown data juggernauts, it’s all here! Did I miss something? Of course, I did!