Troubleshooting performance-related issues in the IT world is always challenging, and if you are not aware of the right tools, it can be frustrating.

If you work in support in a production environment, you will most likely need to deal with performance-related issues on Linux.

Are you in a support function and working on Linux servers?

Let’s go through some of the most used Linux command line utilities to diagnose performance-related issues.

Note: Some of the commands listed below may not be installed by default, so you may have to install them manually.

lsof

lsof stands for “list open files”. It helps you find all open files along with the processes and users that opened them. The lsof utility is convenient in scenarios such as the following.

To list all the files opened by a particular PID

# lsof -p PID

Count the number of files opened by a process

# lsof -p 4271 | wc -l
34
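Keep in mind that lsof prints a header row, so piping it straight into wc -l counts one line too many. The following is a minimal sketch that skips the header; the function name count_open_entries is hypothetical and not part of lsof.

```shell
#!/bin/sh
# Hypothetical helper: count the data rows of lsof-style output on stdin,
# skipping the header row (COMMAND PID USER ...).
count_open_entries() {
    awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Example usage (assumes lsof is installed and PID 4271 exists):
#   lsof -p 4271 | count_open_entries
```

Reading from stdin keeps the helper independent of how lsof was invoked, so the same function works with -p, -u, or -i filters.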

Check the currently opened log files of a process

# lsof -p PID | grep log

Find out the port numbers used by a daemon

# lsof -i -P | grep 4271
nginx     4271   root   6u IPv4 51306     0t0 TCP *:80 (LISTEN)
nginx     4271   root   7u IPv4 51307     0t0 TCP *:443 (LISTEN)

pidstat

pidstat can be used to monitor tasks managed by the Linux kernel. It makes troubleshooting I/O-related issues easier.

List I/O statistics for all PIDs

# pidstat -d

To display I/O stats for a particular PID

# pidstat -p 4271 -d

If you are doing real-time troubleshooting for a process, you can monitor its I/O at a fixed interval. The example below samples every 5 seconds.

# pidstat -p 4362 -d 5
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain)          08/13/2016             _x86_64_         (2 CPU)
07:01:30 PM   UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s Command
07:01:35 PM     0     4362     0.00     0.00     0.00 nginx
07:01:40 PM     0     4362     0.00     0.00     0.00 nginx
07:01:45 PM     0     4362     0.00     0.00     0.00 nginx
07:01:50 PM     0     4362     0.00     0.00     0.00 nginx
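When a process is mostly idle, the interesting samples are easy to miss among rows of zeros. As a rough sketch, the hypothetical helper below (busy_io_rows is not a pidstat feature) keeps only the rows whose read or write rate exceeds a threshold. It assumes the column layout shown above and locates kB_rd/s and kB_wr/s from the header row.

```shell
#!/bin/sh
# Hypothetical helper: print pidstat -d rows on stdin whose read or write
# rate exceeds a threshold in kB/s (default 0). Column positions are taken
# from the header row, so an AM/PM field does not shift the result.
busy_io_rows() {
    awk -v limit="${1:-0}" '
        /kB_rd\/s/ {                      # header row: locate the columns
            for (i = 1; i <= NF; i++) {
                if ($i == "kB_rd/s") rd = i
                if ($i == "kB_wr/s") wr = i
            }
            next
        }
        rd && NF >= wr && ($rd + 0 > limit || $wr + 0 > limit)
    '
}

# Example usage (assumes pidstat is installed and PID 4362 exists):
#   pidstat -p 4362 -d 5 | busy_io_rows 100
```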

top

top is probably one of the most used commands on Linux. It displays a system summary and current utilization.

Running top on its own shows CPU utilization, process details, the number of tasks, memory utilization, the number of zombie processes, and more.

# top

To display process details for a specific user

# top -u username

To kill a process, run top and press k. It will prompt you for the PID to kill.


ps

ps stands for process status and is widely used to get a snapshot of running processes. It is very useful for finding out whether a process is running and, if so, its PID.

To find the PID and process details by some word

# ps -ef | grep word

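One quirk of piping ps into grep is that the grep process itself usually shows up in the result, because its own command line contains the search word. A common workaround is to wrap the first character of the word in brackets. The sketch below builds such a pattern; bracket_pattern is a hypothetical helper name, and it assumes the word starts with a plain letter or digit.

```shell
#!/bin/sh
# Hypothetical helper: turn "nginx" into "[n]ginx". The regex still matches
# "nginx", but no longer matches grep's own command line.
bracket_pattern() {
    rest=${1#?}              # everything after the first character
    first=${1%"$rest"}       # the first character
    printf '[%s]%s\n' "$first" "$rest"
}

# Example usage (assumes ps is available):
#   ps -ef | grep "$(bracket_pattern nginx)"
```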

tcpdump

Troubleshooting network issues is always challenging, and one of the essential commands to use is tcpdump.

You can use tcpdump to capture the network packets on a network interface.

To capture the packets on a particular network interface

# tcpdump -i $interface -w /tmp/capture

For example, with $interface set to “eno16777736”, this captures the traffic on that interface and writes it to /tmp/capture.

To capture network traffic between a source and a destination IP

# tcpdump src host $SRC_IP and dst host $DST_IP

Capture network traffic for destination port 443

# tcpdump dst port 443
tcpdump: data link type PKTAP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on pktap, link-type PKTAP (Packet Tap), capture size 262144 bytes
12:02:30.833845 IP 192.168.1.2.49950 > ec2-107-22-185-206.compute-1.amazonaws.com.https: Flags [.], ack 421458229, win 4096, length 0
12:02:32.076893 IP 192.168.1.2.49953 > 104.25.133.107.https: Flags [S], seq 21510813, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 353259990 ecr 0,sackOK,eol], length 0
12:02:32.090389 IP 192.168.1.2.49953 > 104.25.133.107.https: Flags [.], ack 790725431, win 8192, length 0
12:02:32.090630 IP 192.168.1.2.49953 > 104.25.133.107.https: Flags [P.], seq 0:517, ack 1, win 8192, length 517
12:02:32.109903 IP 192.168.1.2.49953 > 104.25.133.107.https: Flags [.], ack 147, win 8187, length 0

Read a captured file

# tcpdump -r filename

For example, to read the file captured above

# tcpdump -r /tmp/capture
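Reading a capture line by line gets tedious quickly, so it can help to summarize it first. The following sketch counts packets per destination from tcpdump's default text output, as shown in the port 443 example above. The function name top_destinations is hypothetical, and the parsing assumes IPv4 lines of the form "time IP src > dst: ...".

```shell
#!/bin/sh
# Hypothetical helper: count packets per destination in tcpdump text output
# on stdin and list the busiest destinations first.
# The destination keeps its port suffix (for example ".https").
top_destinations() {
    awk '$2 == "IP" {
            dst = $5
            sub(/:$/, "", dst)        # strip the trailing colon
            count[dst]++
         }
         END { for (d in count) print count[d], d }' | sort -rn
}

# Example usage (assumes tcpdump is installed and /tmp/capture exists):
#   tcpdump -nn -r /tmp/capture | top_destinations
```

The -nn option in the usage line turns off host and port name resolution, which keeps the destinations numeric and easier to group.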

iostat

iostat stands for input-output statistics and is often used to diagnose performance issues with storage devices. You can monitor CPU, device, and network file system utilization reports with iostat.

Display disk I/O statistics

# iostat -d
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain)          08/13/2016             _x86_64_         (2 CPU)
Device:           tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda               1.82       55.81       12.63     687405     155546

Display CPU statistics

# iostat -c
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain)          08/13/2016             _x86_64_         (2 CPU)
avg-cpu: %user   %nice %system %iowait %steal   %idle
           0.59   0.02   0.33   0.54   0.00   98.52
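For storage problems, the %iowait column is often the first number worth checking. As a small sketch, the hypothetical helper below (iowait_pct is not part of iostat) pulls that single value out of the first CPU report. It assumes the avg-cpu layout shown above, where the values sit on the line after the header.

```shell
#!/bin/sh
# Hypothetical helper: print the %iowait value from the first avg-cpu
# report in iostat -c output on stdin.
iowait_pct() {
    awk '/^avg-cpu:/ {
            # The header has a leading "avg-cpu:" label, so the value
            # column is one position to the left of the header column.
            for (i = 2; i <= NF; i++) if ($i == "%iowait") col = i - 1
            if ((getline) > 0 && col) print $col
            exit
         }'
}

# Example usage (assumes iostat is installed):
#   iostat -c | iowait_pct
```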

ldd

ldd stands for list dynamic dependencies. It shows the shared libraries needed by a program or shared library. The ldd command can be handy to diagnose application startup problems.

If a program is not starting because its dependencies are not available, you can use ldd to find out which shared libraries it is looking for.

# ldd httpd
            linux-vdso.so.1 => (0x00007ffe7ebb2000)
            libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fa4d451e000)
            libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fa4d42f9000)
            libaprutil-1.so.0 => /lib64/libaprutil-1.so.0 (0x00007fa4d40cf000)
            libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fa4d3e98000)
            libexpat.so.1 => /lib64/libexpat.so.1 (0x00007fa4d3c6e000)
            libdb-5.3.so => /lib64/libdb-5.3.so (0x00007fa4d38af000)
            libapr-1.so.0 => /lib64/libapr-1.so.0 (0x00007fa4d3680000)
            libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa4d3464000)
            libdl.so.2 => /lib64/libdl.so.2 (0x00007fa4d325f000)
            libc.so.6 => /lib64/libc.so.6 (0x00007fa4d2e9e000)
            liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fa4d2c79000)
            /lib64/ld-linux-x86-64.so.2 (0x00007fa4d4a10000)
            libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fa4d2a73000)
            libfreebl3.so => /lib64/libfreebl3.so (0x00007fa4d2870000)
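When a dependency cannot be resolved, ldd marks it as "not found" instead of printing a path. A minimal sketch for listing only those entries follows; missing_libs is a hypothetical helper name, and libfoo.so.1 in the comment is a made-up library used purely for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the names of libraries that ldd reports as
# unresolved, for example a line like "libfoo.so.1 => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage (assumes you are in the directory containing httpd):
#   ldd httpd | missing_libs
```

An empty result means every listed library was resolved.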

netstat

netstat (Network Statistics) is a popular command to print network connections and interface statistics, and to troubleshoot various network-related issues.

To show stats for all protocols

# netstat -s

You can use grep to find out if there are any errors

# netstat -s | grep error
   0 packet receive errors
   0 receive buffer errors
   0 send buffer errors
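A plain grep also prints counters that are zero, which is noise on a healthy system. The sketch below keeps only error counters with a non-zero value. The helper name nonzero_errors is hypothetical, and it only handles lines that start with the count, like the ones shown above; lines in a "Name: value" layout are ignored.

```shell
#!/bin/sh
# Hypothetical helper: from netstat -s output on stdin, print only the
# lines that mention errors and whose leading counter is greater than zero.
nonzero_errors() {
    awk 'tolower($0) ~ /error/ && $1 + 0 > 0'
}

# Example usage (assumes netstat is installed):
#   netstat -s | nonzero_errors
```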

To show the kernel routing table

# netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window irtt Iface
default         gateway         0.0.0.0         UG       0 0         0 eno16777736
172.16.179.0   0.0.0.0         255.255.255.0   U         0 0         0 eno16777736
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0         0 virbr0

free

If your Linux server is running out of memory, or you just want to find out how much of the total memory is still available, the free command will help you.

# free -g
             total       used       free     shared buff/cache   available
Mem:             5           0           3           0           1           4
Swap:             5           0           5

-g shows the values in gigabytes. As you can see, total memory is 5 GB, of which 3 GB is free and 4 GB is available.
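Whole gigabytes are coarse (note the 0 under "used" above), so for alerting it can be more useful to express available memory as a percentage of the total. The following is a sketch only: mem_available_pct is a hypothetical helper, and it assumes a version of free that prints an "available" column, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print available memory as a whole-number percentage
# of total memory, based on free output on stdin.
mem_available_pct() {
    awk 'NR == 1 {
            # The data rows have a leading "Mem:" label, so the value
            # column is one position to the right of the header column.
            for (i = 1; i <= NF; i++) if ($i == "available") col = i + 1
         }
         /^Mem:/ && col && $2 > 0 { printf "%d\n", $col * 100 / $2 }'
}

# Example usage (assumes free is installed; -m gives finer values than -g):
#   free -m | mem_available_pct
```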

sar

sar (System Activity Report) is helpful for collecting a number of reports, including CPU, memory, and device load.

Running sar on its own shows system utilization for the current day.


By default, it stores a utilization report every 10 minutes. If you need shorter, real-time intervals, you can use it as below.

Show the CPU report 2 times, every 3 seconds

# sar 3 2
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain)          08/13/2016             _x86_64_         (2 CPU)
11:14:02 PM     CPU     %user     %nice   %system   %iowait   %steal     %idle
11:14:05 PM     all     1.83     0.00     0.50     0.17     0.00     97.51
11:14:08 PM     all     1.50     0.00      0.17     0.00     0.00     98.33
Average:       all     1.67     0.00     0.33     0.08     0.00     97.92

Show the memory usage report

# sar -r

Show the network report

# sar -n ALL

ipcs

ipcs reports on interprocess communication (IPC) facilities: semaphores, shared memory, and message queues.

To list the message queues

# ipcs -q

To list the semaphores

# ipcs -s

To list the shared memory

# ipcs -m

To display the current usage status of IPC

# ipcs -u

------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

------ Shared Memory Status --------
segments allocated 5
pages allocated 2784
pages resident 359
pages swapped   0
Swap performance: 0 attempts       0 successes

------ Semaphore Status --------
used arrays = 0
allocated semaphores = 0

I hope the above commands help you in various situations in your system administration job.

These are just to give you an idea of the commands; if you are interested, you may check out this Linux performance monitoring and troubleshooting course.