20 Linux System Monitoring Tools Every SysAdmin Should Know

by Vivek Gite · 98 comments

Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions come with plenty of monitoring tools. They provide metrics about system activity, which you can use to find the possible causes of a performance problem. The commands discussed below are some of the most basic ones for system analysis and for debugging server issues such as:

  1. Finding out bottlenecks.
  2. Disk (storage) bottlenecks.
  3. CPU and memory bottlenecks.
  4. Network bottlenecks.


#1: top - Process Activity Command

The top program provides a dynamic real-time view of a running system, i.e. actual process activity. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds (three seconds in the procps version of top; use the -d option or the d hot key to change the delay).

Fig.01: Linux top command

Commonly Used Hot Keys

The top command provides several useful hot keys:

Hot Key  Usage
t        Displays summary information off and on.
m        Displays memory information off and on.
A        Sorts the display by top consumers of various system resources. Useful for quick identification of performance-hungry tasks on a system.
f        Enters an interactive configuration screen for top. Helpful for setting up top for a specific task.
o        Enables you to interactively select the ordering within top.
r        Issues renice command.
k        Issues kill command.
z        Turns color/mono display on or off.


=> Related: How do I Find Out Linux CPU Utilization?

#2: vmstat - System Activity, Hardware and System Information

The vmstat command reports information about processes, memory, paging, block I/O, traps, and CPU activity.
# vmstat 3
Sample Outputs:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0     0     0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0     0    18 1253 1123  5  1 94  0  0
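The sample output above can be condensed into a single line. The following is a minimal sketch, not part of the original article: the helper name vmstat_cpu_summary is hypothetical, and it assumes the 17-column layout shown above, where id is column 15 and wa is column 16. Other vmstat versions may print different columns. Note that the first data row reports averages since boot.

```shell
# Hypothetical helper: average the idle and I/O-wait columns of vmstat output.
# Assumed usage: vmstat 3 5 | vmstat_cpu_summary
# Assumes the column layout shown above (id = field 15, wa = field 16).
vmstat_cpu_summary() {
    awk '
        # Data rows start with a number; the two header rows do not.
        $1 ~ /^[0-9]+$/ && NF >= 16 {
            idle += $15; wait += $16; n++
        }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "samples=%d avg_idle=%.1f avg_iowait=%.1f\n", n, idle / n, wait / n
        }
    '
}
```

A consistently high I/O-wait average together with a non-empty b column usually points at storage rather than CPU as the bottleneck.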

Display Memory Utilization Slabinfo

# vmstat -m

Get Information About Active / Inactive Memory Pages

# vmstat -a
=> Related: How do I find out Linux Resource utilization to detect system bottlenecks?

#3: w - Find Out Who Is Logged on And What They Are Doing

The w command displays information about the users currently on the machine, and their processes.
# w username
# w vivek

Sample Outputs:

 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.1.3.145       14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1    10.1.3.145       17:43    0.00s  0.03s  0.00s w

#4: uptime - Tell How Long The System Has Been Running

The uptime command can be used to see how long the server has been running. It prints the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.
# uptime
Output:

 18:02:41 up 41 days, 23:42,  1 user,  load average: 0.00, 0.00, 0.00

A load value of 1 can be considered optimal for a single-CPU system. The acceptable load changes from system to system: for a single-CPU system a load of 1 - 3, and for SMP systems a load of 6 - 10, might be acceptable.
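Because the acceptable load depends on the number of CPUs, it can help to look at the load per CPU. The following is a minimal sketch, not part of the original article: the helper name load_per_cpu is hypothetical, and the usage line assumes /proc/loadavg and the coreutils nproc command are available.

```shell
# Hypothetical helper: divide a load average by the number of CPUs.
# Assumed usage: load_per_cpu "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        if (cpus <= 0) { print "invalid CPU count"; exit 1 }
        # A result near or above 1.00 means every CPU has work queued.
        printf "%.2f\n", load / cpus
    }'
}
```

For example, a 1-minute load of 6.00 on an 8-CPU server gives 0.75 per CPU.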

#5: ps - Displays The Processes

The ps command reports a snapshot of the current processes. To select all processes use the -A or -e option:
# ps -A
Sample Outputs:

  PID TTY          TIME CMD
    1 ?        00:00:02 init
    2 ?        00:00:02 migration/0
    3 ?        00:00:01 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 migration/1
    6 ?        00:00:15 ksoftirqd/1
....
.....
 4881 ?        00:53:28 java
 4885 tty1     00:00:00 mingetty
 4886 tty2     00:00:00 mingetty
 4887 tty3     00:00:00 mingetty
 4888 tty4     00:00:00 mingetty
 4891 tty5     00:00:00 mingetty
 4892 tty6     00:00:00 mingetty
 4893 ttyS1    00:00:00 agetty
12853 ?        00:00:00 cifsoplockd
12854 ?        00:00:00 cifsdnotifyd
14231 ?        00:10:34 lighttpd
14232 ?        00:00:00 php-cgi
54981 pts/0    00:00:00 vim
55465 ?        00:00:00 php-cgi
55546 ?        00:00:00 bind9-snmp-stat
55704 pts/1    00:00:00 ps

ps is like top, but it prints a one-time snapshot instead of a continuously updated display, and it can show more information per process.

Show Long Format Output

# ps -Al
To turn on extra full mode (it will show command line arguments passed to process):
# ps -AlF

To See Threads (LWP and NLWP)

# ps -eLf

To See Threads After Processes

# ps -AlLm

Print All Processes On The Server

# ps ax
# ps axu

Print A Process Tree

# ps -ejH
# ps axjf
# pstree

Print Security Information

# ps -eo euser,ruser,suser,fuser,f,comm,label
# ps axZ
# ps -eM

See Every Process Running As User Vivek

# ps -U vivek -u vivek u

Set Output In a User-Defined Format

# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
# ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
# ps -eopid,tt,user,fname,tmout,f,wchan

Display Only The Process IDs of Lighttpd

# ps -C lighttpd -o pid=
OR
# pgrep lighttpd
OR
# pgrep -u vivek php-cgi

Display The Name of PID 55977

# ps -p 55977 -o comm=

Find Out The Top 10 Memory Consuming Processes

# ps auxf | sort -nr -k 4 | head -10

Find Out The Top 10 CPU Consuming Processes

# ps auxf | sort -nr -k 3 | head -10
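The two pipelines above sort the ps header line along with the data, so the column titles are lost. The following is a minimal sketch of one way to keep them, not part of the original article: the helper name top_by_column is hypothetical, and it assumes the ps aux layout, where %CPU is column 3 and %MEM is column 4.

```shell
# Hypothetical helper: keep the header line, then print the top rows by one column.
# Assumed usage: ps aux | top_by_column 4 10    (top 10 by %MEM)
#                ps aux | top_by_column 3 10    (top 10 by %CPU)
top_by_column() {
    # $1 = 1-based column number to sort on, $2 = number of rows to keep
    IFS= read -r header || return 1
    printf '%s\n' "$header"
    # Sort the remaining lines numerically, largest first, on that column only.
    sort -nr -k "$1,$1" | head -n "$2"
}
```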

#6: free - Memory Usage

The command free displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.
# free
Sample Output:

            total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
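The "-/+ buffers/cache" row is derived from the Mem: row: buffers and cached memory are subtracted from used and added to free, because the kernel can reclaim them when applications need memory. The following is a minimal sketch of that arithmetic, not part of the original article: the helper name free_real_usage is hypothetical, and it assumes the older free layout shown above with separate buffers and cached columns. Newer versions of free print different columns.

```shell
# Hypothetical helper: recompute the "-/+ buffers/cache" figures from the Mem: row.
# Assumed usage: free | free_real_usage
# Assumes columns: Mem: total used free shared buffers cached
free_real_usage() {
    awk '$1 == "Mem:" && NF >= 7 {
        used  = $3 - $6 - $7    # used minus buffers and cached
        avail = $4 + $6 + $7    # free plus buffers and cached
        printf "used_by_apps=%d kB, available_to_apps=%d kB\n", used, avail
    }'
}
```

With the sample output above this gives 4061800 kB used and 8241096 kB available, matching the second row that free prints.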

=> Related:

  1. Linux Find Out Virtual Memory PAGESIZE
  2. Linux Limit CPU Usage Per Process
  3. How much RAM does my Ubuntu / Fedora Linux desktop PC have?

#7: iostat - Average CPU Load, Disk Activity

The iostat command reports Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions and network filesystems (NFS).
# iostat
Sample Outputs:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in) 	06/26/2009

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.50    0.09    0.51    0.03    0.00   95.86

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22.04        31.88       512.03   16193351  260102868
sda1              0.00         0.00         0.00       2166        180
sda2             22.04        31.87       512.03   16189010  260102688
sda3              0.00         0.00         0.00       1615          0

=> Related: Linux Track NFS Directory / Disk I/O Stats

#8: sar - Collect and Report System Activity

The sar command is used to collect, report, and save system activity information. To see network counters, enter:
# sar -n DEV | more
To display the network counters from the 24th:
# sar -n DEV -f /var/log/sa/sa24 | more
You can also display real time usage using sar:
# sar 4 5
Sample Outputs:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in) 		06/26/2009

06:45:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
06:45:16 PM       all      2.00      0.00      0.22      0.00      0.00     97.78
06:45:20 PM       all      2.07      0.00      0.38      0.03      0.00     97.52
06:45:24 PM       all      0.94      0.00      0.28      0.00      0.00     98.78
06:45:28 PM       all      1.56      0.00      0.22      0.00      0.00     98.22
06:45:32 PM       all      3.53      0.00      0.25      0.03      0.00     96.19
Average:          all      2.02      0.00      0.27      0.01      0.00     97.70

=> Related: How to collect Linux system utilization data into a file

#9: mpstat - Multiprocessor Usage

The mpstat command displays activities for each available processor, processor 0 being the first one. Use mpstat -P ALL to display average CPU utilization per processor:
# mpstat -P ALL
Sample Output:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)	 	06/26/2009

06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00   96.52     25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00   96.23     14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00   95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00   95.69     76.89
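On a server with many processors it can be quicker to let a script pick out the busiest one. The following is a minimal sketch, not part of the original article: the helper name busiest_cpu is hypothetical, and it assumes a single mpstat -P ALL report with a header row that contains the CPU and %idle column titles, as in the sample above.

```shell
# Hypothetical helper: report the CPU with the lowest %idle in mpstat -P ALL output.
# Assumed usage: mpstat -P ALL | busiest_cpu
busiest_cpu() {
    awk '
        # Until the header row is seen, look for the CPU and %idle column positions.
        !idle_col {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") cpu_col = i
                if ($i == "%idle") idle_col = i
            }
            next
        }
        # Per-CPU rows have an integer in the CPU column; the "all" row does not.
        $cpu_col ~ /^[0-9]+$/ {
            idle = $idle_col + 0
            if (!seen || idle < min) { min = idle; cpu = $cpu_col; seen = 1 }
        }
        END { if (seen) printf "cpu=%s idle=%.2f\n", cpu, min }
    '
}
```

Locating the columns by title rather than by fixed position keeps the sketch usable when the timestamp has no AM/PM field.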

=> Related: Linux display each multiple SMP CPU processors utilization individually.

#10: pmap - Process Memory Usage

The pmap command reports the memory map of a process. Use this command to find out the causes of memory bottlenecks.
# pmap -d PID
To display process memory information for pid # 47394, enter:
# pmap -d 47394
Sample Outputs:

47394:   /usr/bin/php-cgi
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000    2584 r-x-- 0000000000000000 008:00002 php-cgi
0000000000886000     140 rw--- 0000000000286000 008:00002 php-cgi
00000000008a9000      52 rw--- 00000000008a9000 000:00000   [ anon ]
0000000000aa8000      76 rw--- 00000000002a8000 008:00002 php-cgi
000000000f678000    1980 rw--- 000000000f678000 000:00000   [ anon ]
000000314a600000     112 r-x-- 0000000000000000 008:00002 ld-2.5.so
000000314a81b000       4 r---- 000000000001b000 008:00002 ld-2.5.so
000000314a81c000       4 rw--- 000000000001c000 008:00002 ld-2.5.so
000000314aa00000    1328 r-x-- 0000000000000000 008:00002 libc-2.5.so
000000314ab4c000    2048 ----- 000000000014c000 008:00002 libc-2.5.so
.....
......
..
00002af8d48fd000       4 rw--- 0000000000006000 008:00002 xsl.so
00002af8d490c000      40 r-x-- 0000000000000000 008:00002 libnss_files-2.5.so
00002af8d4916000    2044 ----- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b15000       4 r---- 0000000000009000 008:00002 libnss_files-2.5.so
00002af8d4b16000       4 rw--- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b17000  768000 rw-s- 0000000000000000 000:00009 zero (deleted)
00007fffc95fe000      84 rw--- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 933712K    writeable/private: 4304K    shared: 768000K

The last line is very important:

  • mapped: 933712K total amount of memory mapped to files
  • writeable/private: 4304K the amount of private address space
  • shared: 768000K the amount of address space this process is sharing with others

=> Related: Linux find the memory used by a program / process using pmap command

#11 and #12: netstat and ss - Network Statistics

The netstat command displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. The ss command dumps socket statistics and shows information similar to netstat.
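A quick way to use socket statistics is to count TCP connections per state. The following is a minimal sketch, not part of the original article: the helper name tcp_state_counts is hypothetical, and it assumes the output of ss -tan, which prints one header line followed by one socket per line with the state in the first column. The netstat output uses a different layout and would need a different column.

```shell
# Hypothetical helper: count sockets per TCP state in `ss -tan` output.
# Assumed usage: ss -tan | tcp_state_counts
tcp_state_counts() {
    # Skip the header line, tally the first field, then sort by count (largest first).
    awk 'NR > 1 && NF { count[$1]++ }
         END { for (s in count) print count[s], s }' | sort -k1,1nr -k2,2
}
```

A large number of sockets in a single state, such as TIME-WAIT, is often a first hint of where to look next.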

#13: iptraf - Real-time Network Statistics

The iptraf command is an interactive, colorful, ncurses-based IP LAN monitor that generates various network statistics including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following info in an easy to read format:

  • Network traffic statistics by TCP connection
  • IP traffic statistics by network interface
  • Network traffic statistics by protocol
  • Network traffic statistics by TCP/UDP port and by packet size
  • Network traffic statistics by Layer2 address

Fig.02: General interface statistics: IP traffic statistics by network interface

Fig.03 Network traffic statistics by TCP connection

#14: tcpdump - Detailed Network Traffic Analysis

tcpdump is a simple command that dumps traffic on a network. However, you need a good understanding of the TCP/IP protocols to use this tool. For example, to display traffic info about DNS, enter:
# tcpdump -i eth1 'udp port 53'
To display all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets, enter:
# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
To display all FTP sessions to 202.54.1.5, enter:
# tcpdump -i eth1 'dst 202.54.1.5 and (port 21 or 20)'
To display all HTTP sessions to 192.168.1.5:
# tcpdump -ni eth0 'dst 192.168.1.5 and tcp and port http'
To write the packets to a capture file that you can open in wireshark for detailed analysis, enter:
# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80

#15: strace - System Calls

Trace system calls and signals. This is useful for debugging web server and other server problems. Use strace to trace a running process and see what it is doing.

#16: /proc file system - Various Kernel Statistics

The /proc file system provides detailed information about various hardware devices and other Linux kernel information. See the Linux kernel /proc documentation for further details. Common /proc examples:
# cat /proc/cpuinfo
# cat /proc/meminfo
# cat /proc/zoneinfo
# cat /proc/mounts
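Since the files under /proc are plain text, scripts can read single values from them directly. The following is a minimal sketch, not part of the original article: the helper name meminfo_kb is hypothetical, and it assumes the "Key: value kB" line format used by /proc/meminfo.

```shell
# Hypothetical helper: print the value (in kB) of one /proc/meminfo field.
# Assumed usage: meminfo_kb MemFree
# The optional second argument names another file in the same format.
meminfo_kb() {
    awk -v key="$1:" '
        $1 == key { print $2; found = 1 }
        END { exit !found }    # non-zero exit status when the field is missing
    ' "${2:-/proc/meminfo}"
}
```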

#17: Nagios - Server And Network Monitoring

Nagios is a popular open source computer system and network monitoring application. You can easily monitor all your hosts, network equipment and services. It can send alerts when things go wrong and again when they get better. FAN is "Fully Automated Nagios". FAN's goal is to provide a Nagios installation that includes most of the tools provided by the Nagios community. FAN provides a CD-ROM image in the standard ISO format, making it easy to install a Nagios server. In addition, a wide range of tools is included in the distribution to improve the user experience around Nagios.

#18: Cacti - Web-based Monitoring Tool

Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged in users, Apache, DNS servers and much more. See how to install and configure Cacti network graphing tool under CentOS / RHEL.

#19: KDE System Guard - Real-time Systems Reporting and Graphing

KSysguard is a network-enabled task and system monitor application for the KDE desktop. This tool can be run over an ssh session. It provides lots of features such as a client/server architecture that enables monitoring of local and remote hosts. The graphical front end uses so-called sensors to retrieve the information it displays. A sensor can return simple values or more complex information like tables. For each type of information, one or more displays are provided. Displays are organized in worksheets that can be saved and loaded independently from each other. So, KSysguard is not only a simple task manager but also a very powerful tool to control large server farms.

Fig.05 KDE System Guard {Image credit: Wikipedia}

See the KSysguard handbook for detailed usage.

#20: Gnome System Monitor - Real-time Systems Reporting and Graphing

The System Monitor application enables you to display basic system information and monitor system processes, usage of system resources, and file systems. You can also use System Monitor to modify the behavior of your system. Although not as powerful as the KDE System Guard, it provides the basic information which may be useful for new users:

  • Displays various basic information about the computer's hardware and software.
  • Linux Kernel version
  • GNOME version
  • Hardware
  • Installed memory
  • Processors and speeds
  • System Status
  • Currently available disk space
  • Processes
  • Memory and swap space
  • Network usage
  • File Systems
  • Lists all mounted filesystems along with basic information about each.

Fig.06 The Gnome System Monitor application

Bonus: Additional Tools

A few more tools:

  • nmap - scan your server for open ports.
  • lsof - list open files, network connections and much more.
  • ntop web based tool - ntop is the best tool to see network usage in a way similar to what top command does for processes i.e. it is network traffic monitoring software. You can see network status, protocol wise distribution of traffic for UDP, TCP, DNS, HTTP and other protocols.
  • Conky - Another good monitoring tool for the X Window System. It is highly configurable and is able to monitor many system variables including the status of the CPU, memory, swap space, disk storage, temperatures, processes, network interfaces, battery power, system messages, e-mail inboxes etc.
  • GKrellM - It can be used to monitor the status of CPUs, main memory, hard disks, network interfaces, local and remote mailboxes, and many other things.
  • vnstat - vnStat is a console-based network traffic monitor. It keeps a log of hourly, daily and monthly network traffic for the selected interface(s).
  • htop - htop is an enhanced version of top, the interactive process viewer, which can display the list of processes in a tree form.
  • mtr - mtr combines the functionality of the traceroute and ping programs in a single network diagnostic tool.

Did I miss something? Please add your favorite system monitoring tool in the comments.

{ 98 comments }

1 VonSkippy 06.27.09 at 5:10 am

Pretty much common knowledge (or should be) but handy to have listed all in one place.

2 robb 06.27.09 at 8:29 am

yeap most of them are must-have tools.
good job of collecting them in a post.

3 Chris 06.27.09 at 8:37 am

Nice list. For systems with just a few nodes I recommend Munin. It’s easy to install and configure. My favorite tool for monitoring a linux cluster is Ganglia.

P.S. I think you should change this “#2: vmstat – Network traffic statistics by TCP connection …”

4 ftaurino 06.27.09 at 9:09 am

another useful tool is dstat , which combines vmstat, iostat, ifstat, netstat information and more. but this is a very useful list with some interesting examples!

5 James 06.27.09 at 9:23 am

pocess or process. haha, i love typos

6 Artur 06.27.09 at 9:40 am

What about Munin ? Lots easier and lighter than Cacti.

7 Raj 06.27.09 at 10:13 am

Nice list, worth bookmarking!

8 rkarim 06.27.09 at 10:22 am

I have a step-by-step nagios implementation howto, some one may try that. please visit http://www.linux-bd.com/
and I always thanks vivek, to run such a nice site http://www.cyberciti.biz/

9 kaosmonk 06.27.09 at 10:53 am

Once again, great article!!

10 Amr El-Sharnoby 06.27.09 at 11:07 am

I can see that the best tool to monitor processes, CPU, memory and disk bottlenecks at once is atop …

But the tool itself can cause a lot of trouble in heavily loaded servers and it enables process accounting and has a service running all the time …

To use it efficiently on RHEL , CentOS;
1- install rpmforge repo
2- # yum install atop
3- # killall atop
4- # chkconfig atop off
5- # rm -rf /tmp/atop.d/ /var/log/atop/
6- then don’t directly run “atop” command , but instead run it as follows;
# ATOPACCT='' atop

This tool has saved me hundreds of hours really! and helped me to diagnose bottlenecks and solve them that couldn’t otherwise be easily detected and would need many different tools

11 Vivek Gite 06.27.09 at 1:01 pm

@Chris / James

Thanks for the heads-up!

12 Solaris 06.27.09 at 1:26 pm

Great post, also great reference.

13 quba 06.27.09 at 1:46 pm

Hi,

We have just added your latest post “20 Linux System Monitoring Tools Every SysAdmin Should Know” to our Directory of Technology. You can check the inclusion of the post here. We are delighted to invite you to submit all your future posts to the directory and get a huge base of visitors to your website.

Warm Regards

Techtrove.info Team

http://www.techtrove.info

14 Cristiano 06.27.09 at 1:57 pm

You probably wanna add the iftop tool; it's really simple and light, very useful when you need last-moment remote access to a server to see how the traffic is going.

15 Peko 06.27.09 at 3:40 pm

Yeah, well why a so good admin (I dig(g) your site) won’t you use spelling checkers?
Typo #2 Web-based __Monitioring__ Tool

16 paul tergeist 06.27.09 at 4:17 pm

maybe it’s a typo too, but the title should be :
“.. Tools Every SysAdmin MUST Know”
and still, this is advanced user knowledge, at most. I would not trust a sysadmin that knows so few. And..

17 harrywwc 06.27.09 at 10:56 pm

Hi guys,

good list – and some great submitted pointers to other useful tools.

to those carp-ing on about typo’s – give us all a break. you’ve never made a typo? ever?

Idea: How ’bout those who have never *ever* made an error in typing text be the first one(s) to give people grief about making a typo?

I _used_ to be a real PITA about this; then I grew up.

The purpose of this blog, and other forms of communication, is to *communicate* concepts and ideas. *If* you have received those clearly – in spite of the typos – then the purpose has been fulfilled.

/me gets down off his soapbox

.h

18 Pádraig Brady 06.27.09 at 11:37 pm

A script I use often to show the real memory usage of programs on linux, is ps_mem.py

I also summarised a few linux monitoring tools here

I’d also mention the powertop utility

19 Saad 06.27.09 at 11:54 pm

This blog is more impressive and more useful than ever. I need more help regarding proper installation document on “php-network weathermap” on Cacti as plugins

20 Jack 06.28.09 at 2:18 am

No love for whowatch? Real time info on who’s logged in, how they’re connected (SSH, TTY, etc) and what processes they have running.

http://www.pttk.ae.krakow.pl/~mike/#whowatch

21 Ponzu 06.28.09 at 2:28 am

vi — tool used to examine and modify almost any configuration file.

22 Eric schulman 06.28.09 at 5:38 am

dtrace is a notable mention for the picky hackers that wish to know more about the behavior of the operating system and it’s programs internals.

23 Ashok kumar 06.28.09 at 5:48 am

hi gud information , keep it up

ash

24 Enzo 06.28.09 at 6:09 am

You missed: iftop & nethogs

25 Adrian Fita 06.28.09 at 7:09 am

Excellent list. Like Amr El-Sharnoby above, I also find atop indispensable and think it must be installed on every system.

In addition I would like to add iotop to monitor disk usage per process and jnettop to very easily monitor bandwidth allocation between connections on a Linux system.

26 Knightsream 06.28.09 at 8:53 am

Well, the one i use right now is Pandora FMS 3.0 and its making my work easy.

27 praveen k 06.28.09 at 12:56 pm

I would like to add
whoami ,who am i, finger, pinky , id commands

28 create own website 06.28.09 at 3:32 pm

i always love linux, great article

29 Mathieu Desnoyers 06.28.09 at 9:14 pm

One tool which seems to be missing from this list is LTTng. It is a system-wide tracing tool which helps understanding complex performance problems in multithreaded, multiprocess applications involving many userspace-kernel interactions.

The project is available at http://www.lttng.org. Recent SuSE distributions, WindRiver, Monta Vista and STLinux offer the tracer as distribution packages. The standard way to use it is to install a patched kernel though. It comes with a trace analyzer, LTTV, which provides nice view of the overall system behavior.

Mathieu

30 Andy Leo 06.29.09 at 1:02 am

Very useful, well done. Thanks!

31 Aveek Sen 06.29.09 at 1:29 am

Very informative.

32 The Hulk 06.29.09 at 2:11 am

I love this website.

33 kburger 06.29.09 at 3:08 am

If we’re talking about a web server, apachetop is a nice tool to see Apache’s activity.

34 Ram 06.29.09 at 4:07 am

Dude you forgot the most important of ALL!

net-snmpd

With it you can collect vast amounts of information. Then with snmpwalk and scripts you can create your own web NMS to collect simple information like ping, disk space, services down.

35 Kartik Mistry 06.29.09 at 5:15 am

`iotop` is nice one to be include in list. I used `vnstat` very much for keeping track of my download when I was on limited connection :)

36 Vivek Gite 06.29.09 at 7:03 am

@Everyone

Thanks for sharing all your tools with us.

37 feilong 06.29.09 at 10:01 am

Very useful, thanks for sharing.

Take a look at a great tool called nmon. I use it on IBM AIX systems, but it now works on all GNU/Linux systems too.

38 boz 06.29.09 at 10:21 am

mtr

39 Scyldinga 06.29.09 at 10:21 am

I’m with @paul tergeist, tools every linux user should know. The ps samples are nice, thanks.

No reference to configuration management tools ?

cfengine/puppet/chef?

40 Ken McDonell 06.29.09 at 9:19 pm

Nice summary article.

If your “system” is large and/or distributed, and the performance issues you’re tackling are complex, you may wish to explore Performance Co-Pilot (PCP). It unifies all of the performance data from the tools you’ve mentioned (and more), can be extended to include new applications and service layers, works across the network and for clusters and provides both real-time and retrospective analysis.

See http://www.oss.sgi.com/projects/pcp

PCP is included in the Debian-based and SUSE distributions and is likely to appear in the RH distributions in the future.

As a bonus, PCP also works for monitoring non-Linux platforms (Windows and some of the Unix derivatives).

41 Lance 06.30.09 at 2:37 am

I love your collection.

I use about 25% of those regularly, and another 25% semi-regularly. I’ll have to add another 25% of those to my list of regulars.

Thanks for compiling this list.

42 bogo 06.30.09 at 6:01 am

Very nice collection of linux applications. I work with linux but I can’t say that i know them all.

43 MEHTA GHANSHYAM 06.30.09 at 9:28 am

REALLY ITS VERY GOOD N USEFULL FOR ALL ADMIN.
THANKS ONCE AGAIN

44 fasil 06.30.09 at 12:06 pm

Good post…already bookmarked… cheers

45 Aleksey Tsalolikhin 06.30.09 at 7:30 pm

I’ll just mention “ngrep” – network grep.

Great list, thanks!!

Aleksey

46 Abdul Kayyum 07.01.09 at 3:40 pm

Thanks for sharing this information..

47 Aurelio 07.01.09 at 8:20 pm

feilong, I agree. I use nmon on my linux boxes from years. It’s worth a look.

48 komradebob 07.01.09 at 10:36 pm

Great article, many great suggestions.

Was surprised not to see these among the suggestions:

bmon – graphs/tracks network activity/bandwidth real time.
etherape – great visual indicator of what traffic is going where on the network
wireshark – tcpdump on steroids.
multitail – tail multiple files in a single terminal window
swatch – track your log files and fire off alerts

49 pradeep 07.02.09 at 11:14 am

how the hell i missed this site this many days… :P thank god i found it… :) i love it…

50 Jay 07.04.09 at 5:23 pm

I personally much prefer htop to top. Displays everything very nicely.

phpsysinfo is another nice light web-based monitoring tool. Very easy to setup and use.

51 Manuel Fraga 07.05.09 at 4:55 pm

Osmius: The Open Source Monitoring Tool is C++ and Java. Monitor “everything” connected to a network with incredible performance. Create and integrate Business Services, SLAs and ITIL processes such as availability management and capacity planning.

52 aR 07.06.09 at 4:17 pm

thanks for sharing all the helpful tools.

53 Shailesh Mishra 07.07.09 at 7:13 pm

Nice compilation. As usual, always very useful.

It would be nice if some of you knowledgeable guys can shed some light on java heap monitoring thing, thread lock detection and analysis, heap analysis etc.

54 Bjarne Rasmussen 07.07.09 at 8:00 pm

nmon is a nice tool… try google for it, it rocks

55 Balaji 07.12.09 at 5:50 pm

Very much Useful Information’s,
trafmon is one more useful tool

56 Stefan 07.15.09 at 8:18 pm

And for those which like lightweight and concise graphical metering:
xosview +disk -ints -bat

57 Raja 07.19.09 at 3:03 am

Awesome. Especially love the ps tips. Very interesting

58 Rajat 07.24.09 at 4:04 am

Thanks very good info!!!

59 nima0102 07.27.09 at 7:39 am

It’s really nice :)

60 David Thomas 08.12.09 at 9:49 am

Excellent list!

61 Vinidog 08.29.09 at 4:53 am

Nice… very nice guy!!!! ;-)

62 Bob Marcan 09.04.09 at 11:00 am

From the guy who wrote the collect utility for Tru64:

Name        : collectl                     Relocations: (not relocatable)
Version     : 3.3.5                        Vendor: Fedora Project
Release     : 1.fc10                       Build Date: Fri Aug 21 13:22:42 2009
Install Date: Tue Sep 1 18:10:34 2009      Build Host: x86-5.fedora.phx.redhat.com
Group       : Applications/System          Source RPM: collectl-3.3.5-1.fc10.src.rpm
Size        : 1138212                      License: GPLv2+ or Artistic
Signature   : DSA/SHA1, Mon Aug 31 14:42:40 2009, Key ID bf226fcc4ebfc273
Packager    : Fedora Project
URL         : http://collectl.sourceforge.net
Summary     : A utility to collect various linux performance data
Description :
A utility to collect linux performance data

Best regards, Bob

63 Tman 09.05.09 at 8:48 pm

For professional network monitoring use Zenoss:
Zenoss Core (open source): http://www.zenoss.com/product/network-monitoring

64 Somnath Pal 09.14.09 at 9:02 am

Hi,

Thanks for the nice collection with useful samples. Consider adding tools to monitor SAN storage, multipath etc. also.

Best Regards,
Somnath

65 Eddy 09.17.09 at 8:41 am

I did not see ifconfig or iwconfig on the list

66 Kestev 09.17.09 at 1:57 pm

openNMS

67 Sergiy 09.25.09 at 12:39 pm

Thanks for the article. I am not admin myself, but tools are very useful for me too.

Thanks for the comments also :)

68 Mark Seger 09.28.09 at 6:02 pm

When I wrote collectl my goal was to replace as many utilities as possible for several reasons including:
- not all write to log files
- different output formats make correlation VERY difficult
- sar is close but still too many things it doesn’t collect
- I wanted option to generate data that can be easily plotted or loaded into spreadsheet
- I wanted sub-second monitoring
- I want an API and I want to be able to send data over sockets to other tools
- and a whole lot more

I think I succeeded on many fronts, in particular not having to worry if the right data is being collected. Just install rpm and type “/etc/init.d/collectl start” and you’re collecting everything such as slabs and processes every 60 seconds and everything else every 10 seconds AND using <0.1% of the CPU to do so. I personally believe if you're collecting performance counters at a minute or coarser you're not really seeing what your system is doing.

As for the API, I worked with some folks at PNNL to monitor their 2300 node cluster, pass the data to ganglia and from there they pass it to their own real-time plotting tool that can display counters for the entire cluster in 3D. They also collectl counters from individual CPUs and pass that data to collectl as well.

I put together a very simple mapping of 'standard' utilities like sar to the equivalent collectl commands just to get a feel for how they compare. But also keep in mind there are a lot of things collectl does for which there is no equivalent system command, such as Infiniband or Lustre monitoring. How about buddyinfo? And more…

http://collectl.sourceforge.net/Matrix.html

-mark

69 PeteG 09.29.09 at 5:33 am

Darn,
I’ve been using Linux since Windows 98 was the current MicroSnot FOPA.
I know all this stuff. I do not make typoous.
Why do you post this stuff?
We all know it.
Sure we do!
But do we remember it? I just read through it and found stuff that I used long ago and it was like I just learned it. I found stuff I didn’t know either.
Hummmm…… Imagine that!
Thanks, particularly for the PDF.
Saved me making one.
Hey, where’s the HTML to PDF howto?

Thanks again.

70 Denilson 10.26.09 at 11:55 pm

Use:
free -m
To show memory usage in megabytes, which is much more useful.
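
To see where those megabyte figures come from, here is a minimal sketch (not how free itself is implemented) that reads /proc/meminfo and converts the kB values to whole MiB. The function name meminfo_mib and the choice of fields to print are assumptions made for illustration.

```python
import os


def meminfo_mib(text):
    """Parse /proc/meminfo-style text and return {field: MiB} for fields given in kB."""
    result = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Only fields of the form "Name:   12345 kB" are converted; counters
        # without a unit (for example HugePages_Total) are skipped.
        if sep and len(parts) == 2 and parts[1] == "kB":
            result[name.strip()] = int(parts[0]) // 1024
    return result


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        mem = meminfo_mib(fh.read())
    # Hypothetical selection of fields, roughly the ones free reports on
    for key in ("MemTotal", "MemFree", "Buffers", "Cached", "SwapTotal", "SwapFree"):
        print(f"{key:10s} {mem.get(key, 0):8d} MiB")
```

The numbers may differ slightly from free -m, since free derives some of its columns from several fields and rounds in its own way.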

71 AndrewW 11.05.09 at 11:48 pm

Is it possible to display hard drive temps from hddtemp in KSysGuard? They are available in Ksensors and GKrellM, without any configuration required. However I prefer the interface and flexibility of KSysGuard. Is there a way of configuring it?

Andrew

72 Abhijit 11.10.09 at 1:46 pm

Zabbix open source monitoring tool

http://www.zabbix.com

73 Kevin 11.15.09 at 10:55 pm

Thanks, good work

74 Stefano 11.22.09 at 4:09 pm

Just thanks! :)

75 GBonev 11.25.09 at 2:13 pm

Good job on assembling the list.
If I may, I suggest trafshow as an alternative to iptraf when you need to see more detailed info on source/destination, proto and ports at once.

76 Gokul 12.07.09 at 4:43 am

How to install the Kickstart method in linux

77 Bilal Ahmad 12.08.09 at 4:01 pm

Very nice collection.. Worth a bookmark…Bravo…

78 Jalal Hajigholamali 12.09.09 at 5:07 am

Thanks a lot…

79 mancai 12.11.09 at 6:40 pm

Nice sharing, this is what I was looking for a few days ago… thank you

80 aruinanjan 12.14.09 at 7:41 am

This is a nice document for new users, thanks to the owner of this document.

arun

81 myghty 12.16.09 at 7:57 am

Great post!! Thanks.

82 Rakib Hasan 12.16.09 at 2:09 pm

Very helpful. Thanks a lot!

83 PRR 12.22.09 at 9:25 pm

After so many thanks. Add one more……..

thank you. It’s very handy.

84 Yusuf 12.25.09 at 7:35 pm

Mark,

I am in technology myself and this tutorial page is very well organized.
Thanks for taking the time to create this awesome page,
a great help for Linux newbies like myself.

85 Yusuf 12.25.09 at 7:40 pm

I meant to thank Vivek Gite
once again awesome job

86 Shrik 12.31.09 at 9:58 am

Thank you very much VERY GOOD WEBSITE

87 sekar 01.01.10 at 4:16 pm

it is cool

88 Giriraaj 01.05.10 at 7:38 am

Thanks for sharing most resourceful information.

89 Bhagyesh Dhamecha 01.06.10 at 11:58 am

Dear all Members,

Thanks for sharing all your knowledge about Linux. I am really thankful for the Linux tips you share!

Thanks, and continue this journey as well.

thank you..

90 Ganesan AS 01.10.10 at 1:53 pm

Good info. Thanks for sharing.
May GOD bless you to do more.

91 Mark Seger 01.10.10 at 2:38 pm

This is indeed an impressive collection of tools but I still have to ask if people are really happy with having to know so many names, so many switches and so many formats. If you run one command and see something weird doesn’t it bother you if you have to run a different tool but the anomaly already passed and you can no longer see it with a different tool? For example if you see a drop in network performance and wonder if there was a memory or cpu problem, it’s too late to go back and see what else was going on. I know it bothers me. Again, by running collectl I never have to worry about that because it collects everything (when run as a daemon) or you can just tell it to report lots of things when running interactively and by default it shows cpu, disk and network. If you want to add memory, you can always include it but you will need a wider screen to see the output.

As a curiosity for those who run sar – I never do – what do you use for a monitoring interval? The default is to take 10 minute samples which I find quite worthless – remember sar has been around forever dating back to when cpus were much slower and monitoring much more expensive. I’d recommend running sar with a 10 second sampling interval like collectl and you’ll get far more out of it. The number of situations in which this would be too much of a load on your system would be extremely rare. Anyone care to comment?

-mark
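
The point about sampling intervals can be shown with a little arithmetic. This is a sketch on made-up data, not output from sar or collectl: the workload (an idle hour with one 20 second burst at 100% CPU) and the helper name window_averages are assumptions chosen for illustration.

```python
def window_averages(samples, width):
    """Average consecutive runs of `width` one-second samples."""
    averages = []
    for start in range(0, len(samples), width):
        chunk = samples[start:start + width]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical workload: one hour of per-second CPU%, idle except for
# a 20 second burst at 100% starting five minutes in.
cpu = [0.0] * 3600
cpu[300:320] = [100.0] * 20

for width in (600, 10):
    peak = max(window_averages(cpu, width))
    print(f"{width:3d}s samples: highest reading {peak:.1f}%")
```

With 10 minute samples the burst is averaged down to about 3.3%, which looks like an idle machine, while 10 second samples report it at 100%.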

92 miles 01.12.10 at 4:58 am

Amr El-Sharnoby:
atop is awesome, thanks for the tip.

93 Serg 01.12.10 at 6:09 am

hi Mark

absolutely agreed with you mate! if you are the sysadmin something – you will do it for yourself and do it right!
Tools like ps, top and others are commonly used by users who administer non-production or desktop systems, or by users who come to the system temporarily and need a little bit of information about the box, and they are good enough for them. )

94 met00 01.12.10 at 6:15 pm

If you are running a web server and you have multiple clients writing code, you will one day see CPU slow to a crawl. “Why?”, you will ask. ps -ef and top will show that mysql is eating up resources…

HMM?

If only there was a tool which showed me what command was being issued against the database…

mytop

Once you find the select statement that has mysql running at 99% of the CPU, you can kill the query and then go chase down the client and kill them too (or in my case bill them at $250/hr for fixing their code).

95 Mark Seger 01.12.10 at 6:36 pm

re mysql – it’s not necessarily that straightforward. I was working with someone who had a system with mysql that was crawling. it was taking multiple seconds for vi to echo a single character! we ran collectl on it and could see low cpu, low network and low disk i/o. Lots of available memory, so what gives? A close look showed me that even though the I/O rates were low, the average request sizes were also real low – probably due to small db requests.

digging even deeper with collectl I saw the i/o request service times were multiple seconds! in other words when you requested an I/O operation no matter how fast the disk is, it took over 2 seconds to complete and that’s why vi was so slow, it was trying to write to its backing store.

bottom line – running a single tool and only looking at one thing does not tell the whole story. you need to see multiple things AND see them at the same time.

-mark
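
The three numbers in that diagnosis (I/O rate, average request size, time per request) can be approximated from two snapshots of /proc/diskstats. This is a rough sketch and not how collectl computes its figures: the function names are hypothetical, sectors are assumed to be 512 bytes, and the time per request includes queueing, which is closer to what iostat calls await than to a pure service time.

```python
import os
import time


def parse_diskstats(text):
    """Return {device: (ios, sectors, ms)} from /proc/diskstats-style text."""
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 11:
            continue
        # Fields after major, minor, name: reads completed, reads merged,
        # sectors read, ms reading, writes completed, writes merged,
        # sectors written, ms writing, ...
        reads, rd_sectors, rd_ms = int(f[3]), int(f[5]), int(f[6])
        writes, wr_sectors, wr_ms = int(f[7]), int(f[9]), int(f[10])
        stats[f[2]] = (reads + writes, rd_sectors + wr_sectors, rd_ms + wr_ms)
    return stats


def io_profile(before, after, interval):
    """Per-device IOPS, average request size (KiB) and average ms per request."""
    report = {}
    for dev, (ios1, sec1, ms1) in after.items():
        ios0, sec0, ms0 = before.get(dev, (ios1, sec1, ms1))
        ios = ios1 - ios0
        if ios <= 0:
            continue  # no completed requests in this interval
        report[dev] = {
            "iops": ios / interval,
            "avg_kib": (sec1 - sec0) * 512 / 1024 / ios,
            "avg_ms": (ms1 - ms0) / ios,
        }
    return report


if __name__ == "__main__" and os.path.exists("/proc/diskstats"):
    interval = 1  # seconds between snapshots; an arbitrary choice
    with open("/proc/diskstats") as fh:
        first = parse_diskstats(fh.read())
    time.sleep(interval)
    with open("/proc/diskstats") as fh:
        second = parse_diskstats(fh.read())
    for dev, row in sorted(io_profile(first, second, interval).items()):
        print(f"{dev:10s} {row['iops']:8.1f} io/s {row['avg_kib']:8.1f} KiB/req {row['avg_ms']:8.1f} ms/req")
```

A device showing few requests per second, a small KiB/req value and a ms/req value in the thousands would match the pattern described above: the disk looks idle by throughput, yet every request waits seconds to complete.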

96 mtituh Alu 01.19.10 at 2:09 pm

I have a postfix mail server, and recently through tcpdump I see a lot of traffic to dc.mx.aol.com, fedExservices.com, wi.rr.com, mx1.dixie-net.com. I believe my mail server is spamming. How do I find out whether it is spamming, and how do I stop it? Please help.

97 Vivek Gite 01.19.10 at 3:01 pm

Only allow authenticated email users to send an email. There are other things too such as anti-spam, ssl keys, domain keys and much more.

98 kirankumarl 02.03.10 at 9:26 am

Dear sir, please send me a Linux PDF file from which I can learn how to install and maintain it.
