  • Monitor and analyze web server logs with open source real-time log analyzer – GoAccess

    Web troubleshooting can be fun, but it gets frustrating if you are not equipped with the right tools.

    If you support a heavy-traffic website, you often need to analyze and monitor web server logs for performance and capacity planning. This is an essential skill for a web engineer.

    Checking a small log file manually is OK, but with a large file it is no fun to go through millions of lines to find the metrics.
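For a small log, a manual check can be a few lines of shell. The sketch below assumes the common Apache/Nginx combined log format, where the status code is the ninth whitespace-separated field. The sample entries and the file name sample_access.log are made up for illustration.

```shell
#!/bin/sh
# Create a tiny sample log (hypothetical entries in combined log format).
cat > sample_access.log <<'EOF'
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.9 - - [10/Oct/2023:13:55:40 +0000] "GET /missing.html HTTP/1.1" 404 153 "-" "Mozilla/5.0"
203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "GET /style.css HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# Count requests per HTTP status code; field 9 is the status in this format.
status_counts=$(awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' sample_access.log | sort)
echo "$status_counts"
```

This works for one metric on a short file, but every additional metric (top visitors, bandwidth, 404 paths) needs another one-off command, which is where a dedicated analyzer pays off.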

    That’s why you need tools that make the administrator’s job easier and more productive.

    GoAccess is a lightweight open-source log analyzer that supports multiple log formats and can be used with logs from any of the following.

    • Nginx
    • Apache HTTP
    • AWS ELB, S3, CloudFront
    • Google Cloud Storage

    What metrics can you analyze with GoAccess?

    Nearly everything you capture in the logs. To give you an idea:

    • Time taken to serve the request
    • Visitor IP, DNS, host
    • Visitor’s browser & Operating System details
    • 404 not found details
    • Top requests/visitor
    • Bandwidth
    • Static files
    • Geo Location
    • Status Code
    • and more..

    Want to monitor these metrics for your site?

    Good!

    On which OS can you install it?

    GoAccess has only one required dependency: ncurses. If you can install ncurses, you can use GoAccess on any OS.

    It’s available as a distribution package for:

    • Ubuntu
    • Debian
    • Fedora
    • CentOS
    • FreeBSD/OpenBSD
    • Slackware
    • Arch Linux
    • Gentoo
    • macOS
    • Windows through Cygwin

    You can also build it from source or run it with Docker.

    If you are new to Docker, I would recommend taking this Docker Mastery course.
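As a rough sketch of the Docker route, the function below pipes a log into a container and writes a static HTML report. It assumes Docker is installed, that the image is published as allinurl/goaccess, and that the log uses the combined format. The function name goaccess_docker_report is hypothetical.

```shell
#!/bin/sh
# Sketch: build a static HTML report using a GoAccess Docker image.
# Assumptions: Docker is installed, the image name is allinurl/goaccess,
# and the log is in the combined log format.
goaccess_docker_report() {
    log_file=$1
    report_file=$2
    # "-" tells GoAccess to read the log from stdin; the report goes to stdout.
    docker run --rm -i allinurl/goaccess -a -o html --log-format COMBINED - \
        < "$log_file" > "$report_file"
}

# Usage (not run here): goaccess_docker_report access.log report.html
```

Check the image's own documentation for the exact options supported by the version you pull.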

    Installing GoAccess on Ubuntu

    • Log in to the Ubuntu server with root privileges
    • Use apt-get to install it as below
    apt-get install goaccess

    Easy.

    Installing on CentOS

    Log in to the server and execute the yum command

    yum install goaccess

    Installing using Source on CentOS/Ubuntu

    Love compiling from source?

    Here are the steps.

    • Install the following dependencies if using CentOS
    yum install gcc ncurses-devel glib2-devel geoip-devel tokyocabinet-devel
    • If using Ubuntu
    apt-get install libncursesw5-dev libgeoip-dev make
    • Download the latest package using wget
    wget http://tar.goaccess.io/goaccess-1.2.tar.gz
    • Extract the downloaded file
    gunzip -c goaccess-1.2.tar.gz | tar xvf -
    • Go to the newly created folder from the extraction
    cd goaccess-1.2
    • Compile with the below command
    ./configure --enable-geoip=legacy --enable-utf8
    make
    make install

    Well done, you have installed GoAccess and are all set to analyze the logs.

     

    Verify Installation

    Once installed, run goaccess at the command prompt without arguments; it should print the usage as shown below.

    # goaccess
    GoAccess - 1.2
    Usage: goaccess [filename] [ options ... ] [-c][-M][-H][-q][-d][...]
    The following options can also be supplied to the command:
    Log & Date Format Options
      --date-format=<dateformat>      - Specify log date format. e.g., %d/%b/%Y
      --log-format=<logformat>        - Specify log format. Inner quotes need to be
                                        escaped, or use single quotes.
      --time-format=<timeformat>      - Specify log time format. e.g., %H:%M:%S
    User Interface Options
      -c --config-dialog              - Prompt log/date/time configuration window.
      -i --hl-header                  - Color highlight active panel.
      -m --with-mouse                 - Enable mouse support on main dashboard.
      --color=<fg:bg[attrs, PANEL]>   - Specify custom colors. See manpage for more
                                        details and options.
      --color-scheme=<1|2|3>          - Schemes: 1 => Grey, 2 => Green, 3 => Monokai.
      --html-custom-css=<path.css>    - Specify a custom CSS file in the HTML report.
      --html-custom-js=<path.js>      - Specify a custom JS file in the HTML report.
      --html-prefs=<json_obj>         - Set default HTML report preferences.
      --html-report-title=<title>     - Set HTML report page title and header.
      --json-pretty-print             - Format JSON output w/ tabs & newlines.
      --max-items                     - Maximum number of items to show per panel.
                                        See man page for limits.
      --no-color                      - Disable colored output.
      --no-column-names               - Don't write column names in term output.
      --no-csv-summary                - Disable summary metrics on the CSV output.
      --no-progress                   - Disable progress metrics.
      --no-tab-scroll                 - Disable scrolling through panels on TAB.
      --no-html-last-updated          - Hide HTML last updated field.
    Server Options
      --addr=<addr>                   - Specify IP address to bind server to.
      --daemonize                     - Run as daemon (if --real-time-html enabled).
      --fifo-in=<path>                - Path to read named pipe (FIFO).
      --fifo-out=<path>               - Path to write named pipe (FIFO).
      --origin=<addr>                 - Ensure clients send the specified origin header
                                        upon the WebSocket handshake.
      --port=<port>                   - Specify the port to use.
      --real-time-html                - Enable real-time HTML output.
      --ssl-cert=<cert.crt>           - Path to TLS/SSL certificate.
      --ssl-key=<priv.key>            - Path to TLS/SSL private key.
      --ws-url=<url>                  - URL to which the WebSocket server responds.
    File Options
      -                               - The log file to parse is read from stdin.
      -f --log-file=<filename>        - Path to input log file.
      -l --debug-file=<filename>      - Send all debug messages to the specified
                                        file.
      -p --config-file=<filename>     - Custom configuration file.
      --invalid-requests=<filename>   - Log invalid requests to the specified file.
      --no-global-config              - Don't load global configuration file.
    Parse Options
      -a --agent-list                 - Enable a list of user-agents by host.
      -d --with-output-resolver       - Enable IP resolver on HTML|JSON output.
      -e --exclude-ip=<IP>            - Exclude one or multiple IPv4/6. Allows IP
                                        ranges e.g. 192.168.0.1-192.168.0.10
      -H --http-protocol=<yes|no>     - Set/unset HTTP request protocol if found.
      -M --http-method=<yes|no>       - Set/unser HTTP request method if found.
      -o --output=file.html|json|csv  - Output either an HTML, JSON or a CSV file.
      -q --no-query-string            - Ignore request's query string. Removing the
                                        query string can greatly decrease memory
                                        consumption.
      -r --no-term-resolver           - Disable IP resolver on terminal output.
      --444-as-404                    - Treat non-standard status code 444 as 404.
      --4xx-to-unique-count           - Add 4xx client errors to the unique visitors
                                        count.
      --all-static-files              - Include static files with a query string.
      --crawlers-only                 - Parse and display only crawlers.
      --date-spec=<date|hr>           - Date specificity. Possible values: `date`
                                        (default), or `hr`.
      --double-decode                 - Decode double-encoded values.
      --enable-panel=<PANEL>          - Enable parsing/displaying the given panel.
      --hour-spec=<hr|min>            - Hour specificity. Possible values: `hr`
                                        (default), or `min` (tenth of a min).
      --ignore-crawlers               - Ignore crawlers.
      --ignore-panel=<PANEL>          - Ignore parsing/displaying the given panel.
      --ignore-referer=<NEEDLE>       - Ignore a referer from being counted. Wild cards
                                        are allowed. i.e., *.bing.com
      --ignore-status=<CODE>          - Ignore parsing the given status code.
      --num-tests=<number>            - Number of lines to test. >= 0 (10 default)
      --process-and-exit              - Parse log and exit without outputting data.
      --real-os                       - Display real OS names. e.g, Windows XP, Snow
                                        Leopard.
      --sort-panel=PANEL,METRIC,ORDER - Sort panel on initial load. For example:
                                        --sort-panel=VISITORS,BY_HITS,ASC. See
                                        manpage for a list of panels/fields.
      --static-file=<extension>       - Add static file extension. e.g.: .mp3.
                                        Extensions are case sensitive.
    GeoIP Options
      -g --std-geoip                  - Standard GeoIP database for less memory
                                       consumption.
      --geoip-database=<path>         - Specify path to GeoIP database file. i.e.,
                                        GeoLiteCity.dat, GeoIPv6.dat ...
    Other Options
      -h --help                       - This help.
      -V --version                    - Display version information and exit.
      -s --storage                    - Display current storage method. e.g., B+
                                        Tree, Hash.
      --dcf                           - Display the path of the default config
                                        file when `-p` is not used.
    Examples can be found by running `man goaccess`.
    For more details visit: http://goaccess.io
    GoAccess Copyright (C) 2009-2016 by Gerardo Orellana
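If you want to verify the installation from a script rather than by eye, a minimal check could look like the following. It relies only on the --version option listed in the usage output above. The function name check_goaccess is made up for this sketch.

```shell
#!/bin/sh
# Report whether the goaccess binary is on PATH; print its version if so.
check_goaccess() {
    if command -v goaccess >/dev/null 2>&1; then
        echo "goaccess found at $(command -v goaccess)"
        goaccess --version
    else
        echo "goaccess not found on PATH"
        return 1
    fi
}

check_goaccess || echo "Install it first, then re-run this check."
```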

    Analyzing Nginx & Apache with GoAccess

    One of the quickest ways to analyze access.log is by using the -f parameter.

    Ex:

    goaccess -f access.log

    Here, -f tells GoAccess to open the file access.log. This shows the overall dashboard and 15 sections, including the following.

    • Unique visitors per day
    • Requested files
    • Static requests (fonts, images, PDFs, etc.)
    • Not found (404) requests
    • Visitor’s IP/host details
    • Visitor’s OS
    • Browser details
    • Time distribution
    • Referrer
    • HTTP status code
    • Geo location

    If the chosen file is being updated in real time, you will see the metrics update on the terminal. From here, you can go through the metrics you need to analyze.
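The usage output also lists "-" for reading the log from stdin, which helps when older logs have been rotated and compressed. The sketch below streams the live log and its rotated copies as one input. It assumes the files follow the access.log* naming that logrotate commonly produces and that GNU gzip is available. The helper name merge_logs is hypothetical.

```shell
#!/bin/sh
# Sketch: stream the live log plus rotated, gzip-compressed logs as one input.
# Assumptions: files are named access.log* and GNU gzip is installed.
merge_logs() {
    log_dir=$1
    # With -c -d -f, GNU gzip decompresses .gz files and copies plain files unchanged.
    gzip -cdf "$log_dir"/access.log*
}

# Usage (not run here, needs goaccess installed); "-" reads from stdin:
#   merge_logs /var/log/nginx | goaccess --log-format=COMBINED -o report.html -
```

Keeping the merge step separate from the goaccess call makes it easy to inspect the combined stream (for example with head) before generating a report.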

    Real-time Monitoring over HTTP(s)

    GoAccess lets you write the output to an HTML file, which you can use as a real-time monitoring page. This is handy when you don’t want to log in to the server each time you need to verify some metrics.

    goaccess /var/log/nginx/access.log -o /var/www/geekflare.com/htdocs/real-time.html --log-format=COMBINED --real-time-html

    Above, I am writing the output to the real-time.html file under htdocs. Since htdocs is the web root, I can open https://geekflare.com/real-time.html whenever I need to see the metrics.

    A beautiful dashboard!

    However, I don’t recommend doing it this way in production. You don’t want anyone to be able to read your web server logs, so consider applying the following restrictions.

    • Protect the file with a username and password
    • Allow access only from your IP
    • Use a different URL with a custom port and put it behind a firewall so only allowed IPs/users can access it
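As one possible way to apply the first two restrictions, the script below generates an Nginx location block that requires both an allowed IP and a login. This is a sketch under several assumptions: Nginx serves the report at /real-time.html, 203.0.113.10 stands in for your own IP, and /etc/nginx/.htpasswd is a password file you create separately. The output file name goaccess-protect.conf is arbitrary.

```shell
#!/bin/sh
# Sketch: generate an Nginx location block that protects the report page.
# Assumptions: Nginx serves /real-time.html, 203.0.113.10 is a placeholder
# for your own IP, and /etc/nginx/.htpasswd is created separately, e.g. with:
#   htpasswd -c /etc/nginx/.htpasswd admin
snippet_file=goaccess-protect.conf

cat > "$snippet_file" <<'EOF'
location = /real-time.html {
    # Require both an allowed IP and a valid login.
    satisfy all;
    allow 203.0.113.10;
    deny all;
    auth_basic "GoAccess report";
    auth_basic_user_file /etc/nginx/.htpasswd;
}
EOF

echo "Wrote $snippet_file; include it inside the relevant server block."
```

After including the snippet, test the configuration with nginx -t before reloading.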

    GoAccess is a powerful open-source log analyzer. It’s lightweight and free, so go ahead and give it a try.

    You may also be interested in checking out a cloud-based log analyzer.