Troubleshooting performance-related issues in the IT world is always challenging, and without the right tools it can be frustrating.
If you work in support in a production environment, you will most probably have to deal with performance-related issues on Linux.
Are you in a support function and working on Linux servers?
Let’s go through some of the most used Linux command line utilities to diagnose performance-related issues.
Note: Some of the commands listed below may not be installed by default, so you may have to install them manually.
lsof stands for “list open files”. It shows open files together with the processes and users that opened them. The lsof utility can be convenient in some scenarios.
To list all the files opened by a particular PID
# lsof -p PID
Count the number of files opened by a process (the count includes the header line)
# lsof -p 4271 | wc -l
34
Check the log files currently opened by a process
# lsof -p PID | grep log
Find out the port numbers used by a daemon
# lsof -i -P | grep 4271
nginx 4271 root 6u IPv4 51306 0t0 TCP *:80 (LISTEN)
nginx 4271 root 7u IPv4 51307 0t0 TCP *:443 (LISTEN)
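A plain grep for the PID can also match other columns, such as a port or device number that happens to contain the same digits. The sketch below matches the PID column exactly and prints only the listening ports. The function name listening_ports is a hypothetical helper, and the column positions are assumed to match the lsof output shown above.

```shell
# listening_ports PID
# Reads `lsof -i -P` output on stdin and prints the TCP ports on which
# the given PID is listening, one per line.
# Assumption: column 2 is the PID and the last column is the TCP state,
# as in the sample output above.
listening_ports() {
    awk -v pid="$1" '
        $2 == pid && $NF == "(LISTEN)" {
            # The address looks like "*:80"; the port follows the last colon.
            n = split($(NF - 1), parts, ":")
            print parts[n]
        }'
}

# Example usage (needs lsof, usually run as root):
#   lsof -i -P | listening_ports 4271
```

With the sample output above as input, this should print 80 and 443 on separate lines.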
pidstat can be used to monitor tasks managed by the Linux kernel. It makes troubleshooting I/O-related issues easier.
List I/O statistics for all PIDs
# pidstat -d
To display I/O stats for a particular PID
# pidstat -p 4271 -d
If you are troubleshooting a process in real time, you can monitor its I/O at a fixed interval. The example below reports every 5 seconds.
# pidstat -p 4362 -d 5
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain) 08/13/2016 _x86_64_ (2 CPU)

07:01:30 PM   UID   PID   kB_rd/s   kB_wr/s kB_ccwr/s  Command
07:01:35 PM     0  4362      0.00      0.00      0.00  nginx
07:01:40 PM     0  4362      0.00      0.00      0.00  nginx
07:01:45 PM     0  4362      0.00      0.00      0.00  nginx
07:01:50 PM     0  4362      0.00      0.00      0.00  nginx
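When watching many samples, it can help to print only the ones where a process is actually writing. The following is a minimal sketch rather than a definitive tool. busy_writers is a hypothetical helper name, and it assumes the header line contains the PID and kB_wr/s column names as in the output above. It looks the columns up by name because newer sysstat releases may add columns.

```shell
# busy_writers THRESHOLD
# Reads `pidstat -d` output on stdin and prints "PID kB_wr/s Command"
# for every sample whose write rate is above THRESHOLD (kB/s).
busy_writers() {
    awk -v limit="$1" '
        # Header line: remember which columns hold the PID and write rate.
        /kB_wr\/s/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")     pid_col = i
                if ($i == "kB_wr/s") wr_col  = i
            }
            next
        }
        # Data line: compare the write rate numerically with the limit.
        wr_col && NF >= wr_col && $wr_col + 0 > limit + 0 {
            print $pid_col, $wr_col, $NF
        }'
}

# Example usage (needs the sysstat package):
#   pidstat -d 5 | busy_writers 100
```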
top is probably one of the most used commands on Linux. It displays system summary information and current utilization.
Just executing top shows CPU utilization, process details, the number of tasks, memory utilization, the number of zombie processes, and more.
To display process details for a specific user
# top -u username
To kill a process, you can run top and press k. It will prompt you for the PID to kill.
ps stands for process status and is widely used to get a snapshot of the running processes. It is very useful for finding out whether a process is running and, if so, its PID.
To find the PID and process details by a keyword
# ps -ef | grep word
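One quirk of the ps and grep pipeline is that the grep process itself usually shows up in the results, because its own command line contains the search word. A common workaround is to wrap the first character of the word in brackets. The sketch below illustrates this. bracket_pattern and psgrep are hypothetical helper names, and the word is assumed to start with a plain letter or digit.

```shell
# bracket_pattern WORD
# Turns "nginx" into "[n]ginx". As a regular expression it still matches
# "nginx", but the literal text "[n]ginx" on grep's own command line
# does not match it.
bracket_pattern() {
    rest=${1#?}            # everything after the first character
    first=${1%"$rest"}     # the first character itself
    printf '[%s]%s\n' "$first" "$rest"
}

# psgrep WORD
# Lists processes whose ps line contains WORD, without the grep process.
psgrep() {
    ps -ef | grep -- "$(bracket_pattern "$1")"
}

# Example usage:
#   psgrep nginx
```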
Troubleshooting network issues is always challenging, and one of the essential commands for it is tcpdump.
You can use tcpdump to capture the network packets on a network interface.
To capture the packets on a particular network interface
# tcpdump -i $interface -w /tmp/capture
In this example, the traffic on the “eno16777736” interface was captured and written to /tmp/capture.
To capture network traffic between a source and a destination IP
# tcpdump src host $SRC_IP and dst host $DST_IP
Capture network traffic for destination port 443
# tcpdump dst port 443
tcpdump: data link type PKTAP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on pktap, link-type PKTAP (Packet Tap), capture size 262144 bytes
12:02:30.833845 IP 192.168.1.2.49950 > ec2-107-22-185-206.compute-1.amazonaws.com.https: Flags [.], ack 421458229, win 4096, length 0
12:02:32.076893 IP 192.168.1.2.49953 > 184.108.40.206.https: Flags [S], seq 21510813, win 65535, options [mss 1460,nop,wscale 5,nop,nop,TS val 353259990 ecr 0,sackOK,eol], length 0
12:02:32.090389 IP 192.168.1.2.49953 > 220.127.116.11.https: Flags [.], ack 790725431, win 8192, length 0
12:02:32.090630 IP 192.168.1.2.49953 > 18.104.22.168.https: Flags [P.], seq 0:517, ack 1, win 8192, length 517
12:02:32.109903 IP 192.168.1.2.49953 > 22.214.171.124.https: Flags [.], ack 147, win 8187, length 0
Read a captured file
# tcpdump -r filename
For example, to read the file captured above
# tcpdump -r /tmp/capture
iostat stands for input-output statistics and is often used to diagnose performance issues with storage devices. You can use iostat to report CPU, device, and network file system utilization.
Display disk I/O statistics
# iostat -d
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain) 08/13/2016 _x86_64_ (2 CPU)

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda               1.82        55.81        12.63     687405     155546
Display CPU statistics
# iostat -c
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain) 08/13/2016 _x86_64_ (2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.59    0.02    0.33    0.54    0.00   98.52
ldd stands for list dynamic dependencies. It shows the shared libraries needed by a program or library. The ldd command can be handy for diagnosing application startup problems.
If a program is not starting because of unavailable dependencies, you can use ldd to find out which shared libraries it is looking for.
# ldd httpd
linux-vdso.so.1 => (0x00007ffe7ebb2000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007fa4d451e000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fa4d42f9000)
libaprutil-1.so.0 => /lib64/libaprutil-1.so.0 (0x00007fa4d40cf000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007fa4d3e98000)
libexpat.so.1 => /lib64/libexpat.so.1 (0x00007fa4d3c6e000)
libdb-5.3.so => /lib64/libdb-5.3.so (0x00007fa4d38af000)
libapr-1.so.0 => /lib64/libapr-1.so.0 (0x00007fa4d3680000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa4d3464000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fa4d325f000)
libc.so.6 => /lib64/libc.so.6 (0x00007fa4d2e9e000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fa4d2c79000)
/lib64/ld-linux-x86-64.so.2 (0x00007fa4d4a10000)
libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fa4d2a73000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007fa4d2870000)
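When a dependency cannot be resolved, ldd prints "not found" in place of the library path. A small filter, sketched below, can pull out just those entries. missing_libs is a hypothetical helper name, and the library name in the usage note is only an illustration.

```shell
# missing_libs
# Reads `ldd` output on stdin and prints the names of the shared
# libraries that could not be resolved, one per line.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage:
#   ldd /usr/sbin/httpd | missing_libs
```

An empty result suggests that all dynamic dependencies were resolved. A line such as libfoo.so.2 (an illustrative name) would point to the library to install or add to the library path.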
netstat (Network Statistics) is a popular command for printing network connections and interface statistics, and for troubleshooting various network-related issues.
To show statistics for all protocols
# netstat -s
You can use grep to find out if there are any errors
# netstat -s | grep error
0 packet receive errors
0 receive buffer errors
0 send buffer errors
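On a busy server the grep above returns every error counter, including those that are zero. As a rough sketch, the filter below keeps only the counters with a non-zero value. nonzero_errors is a hypothetical helper name, and it assumes the count is the first field on the line, as in the output above. Counters printed in "Name: value" form are not covered.

```shell
# nonzero_errors
# Reads `netstat -s` output on stdin and prints only the lines that
# mention errors and start with a count greater than zero.
nonzero_errors() {
    awk 'tolower($0) ~ /error/ && $1 + 0 > 0'
}

# Example usage:
#   netstat -s | nonzero_errors
```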
To show the kernel routing table
# netstat -r
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
default         gateway         0.0.0.0         UG        0 0          0 eno16777736
172.16.179.0    0.0.0.0         255.255.255.0   U         0 0          0 eno16777736
192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 virbr0
If your Linux server is running out of memory, or you just want to find out how much of the total memory is still available, the free command will help you.
# free -g
              total        used        free      shared  buff/cache   available
Mem:              5           0           3           0           1           4
Swap:             5           0           5
The -g option shows the values in GB. As you can see, total memory is 5 GB, of which 3 GB is free and 4 GB is available.
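For a quick check in a script, the share of memory that is still available can be computed from the same output. The sketch below is one possible approach. mem_available_pct is a hypothetical helper name, and it assumes a version of free that prints an "available" column, as in the output above.

```shell
# mem_available_pct
# Reads `free` output on stdin and prints the available memory as a
# whole-number percentage of total memory.
# Assumption: the header contains an "available" column. Do not use
# `free -h`, because its values carry unit suffixes.
mem_available_pct() {
    awk '
        NR == 1 {
            # The header has no label column, so data columns are shifted by one.
            for (i = 1; i <= NF; i++)
                if ($i == "available") col = i + 1
            next
        }
        $1 == "Mem:" && col { printf "%.0f\n", $col * 100 / $2 }'
}

# Example usage:
#   free -m | mem_available_pct
```

For the output shown above (5 GB total, 4 GB available) this prints 80.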
sar (System Activity Report) is helpful for collecting a number of reports, including CPU, memory, and device load.
Just executing the sar command shows system utilization for the current day.
By default, it records utilization every 10 minutes. If you need shorter intervals in real time, you can use it as shown below.
Show the CPU report 2 times, every 3 seconds
# sar 3 2
Linux 3.10.0-327.13.1.el7.x86_64 (localhost.localdomain) 08/13/2016 _x86_64_ (2 CPU)

11:14:02 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
11:14:05 PM     all      1.83      0.00      0.50      0.17      0.00     97.51
11:14:08 PM     all      1.50      0.00      0.17      0.00      0.00     98.33
Average:        all      1.67      0.00      0.33      0.08      0.00     97.92
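The Average line is often the figure you want to log or alert on. The following sketch turns it into a single "CPU busy" percentage (100 minus %idle). avg_cpu_busy is a hypothetical helper name, and it assumes English output in which %idle is the last column, as in the report above.

```shell
# avg_cpu_busy
# Reads the default `sar` CPU report on stdin and prints the average
# CPU busy percentage, calculated as 100 - %idle.
avg_cpu_busy() {
    awk '$1 == "Average:" && $2 == "all" { printf "%.2f\n", 100 - $NF }'
}

# Example usage (LC_ALL=C keeps the "Average:" label in English):
#   LC_ALL=C sar 3 2 | avg_cpu_busy
```

For the sample report above, the average idle value of 97.92 gives 2.08.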
Show the memory usage report
# sar -r
Show the network report
# sar -n ALL
ipcs reports on System V interprocess communication (IPC) facilities: semaphores, shared memory, and message queues.
To list the message queues
# ipcs -q
To list the semaphores
# ipcs -s
To list the shared memory segments
# ipcs -m
To display the current usage status of IPC
# ipcs -u

------ Messages Status --------
allocated queues = 0
used headers = 0
used space = 0 bytes

------ Shared Memory Status --------
segments allocated 5
pages allocated 2784
pages resident 359
pages swapped 0
Swap performance: 0 attempts 0 successes

------ Semaphore Status --------
used arrays = 0
allocated semaphores = 0
I hope the commands above help you in various situations in your system administration job.
They are only meant to give you an idea of what is available. If you are interested, you may check out this Linux performance monitoring and troubleshooting course.