Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions ship with plenty of monitoring tools. These tools provide metrics about system activity, which you can use to find the possible causes of a performance problem. The commands discussed below are some of the most basic ones for system analysis and for debugging server issues such as:
- Finding out bottlenecks.
- Disk (storage) bottlenecks.
- CPU and memory bottlenecks.
- Network bottlenecks.
#1: top - Process Activity Command
The top program provides a dynamic real-time view of a running system, i.e. actual process activity. By default, it displays the most CPU-intensive tasks running on the server and updates the list every few seconds (three by default in current procps versions; change the delay with the -d option).
Commonly Used Hot Keys
The top command provides several useful hot keys:
| Hot Key | Usage |
|---|---|
| t | Toggles the summary information display. |
| m | Toggles the memory information display. |
| A | Sorts the display by top consumers of various system resources. Useful for quick identification of performance-hungry tasks on a system. |
| f | Enters an interactive configuration screen for top. Helpful for setting up top for a specific task. |
| o | Enables you to interactively select the ordering within top. |
| r | Issues renice command. |
| k | Issues kill command. |
| z | Toggles color/mono display. |
=> Related: How do I Find Out Linux CPU Utilization?
#2: vmstat - System Activity, Hardware and System Information
The command vmstat reports information about processes, memory, paging, block IO, traps, and CPU activity.
# vmstat 3
Sample Outputs:
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0     0     0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0     0    18 1253 1123  5  1 94  0  0
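When vmstat runs for a while, it helps to filter out the quiet samples. The following is a minimal sketch, not part of vmstat itself: flag_busy_samples is a hypothetical helper name, and the default limit of 95 is an arbitrary value picked for illustration. It assumes the 17-column layout shown in the sample output, where the "id" (idle) value is column 15.

```shell
#!/bin/sh
# flag_busy_samples [LIMIT]: read `vmstat N` output on stdin and print only
# the samples whose CPU idle percentage ("id", column 15) is below LIMIT.
# Hypothetical helper; the default LIMIT of 95 is an arbitrary example value.
flag_busy_samples() {
    fbs_limit="${1:-95}"
    awk -v limit="$fbs_limit" '
        # Header lines do not start with a number, so they are skipped.
        $1 ~ /^[0-9]+$/ && NF >= 15 {
            if (($15 + 0) < (limit + 0))
                print "idle=" $15 "% r=" $1 " bo=" $10
        }'
}

# Usage on a live system (assumes vmstat from procps is installed):
#   vmstat 3 5 | flag_busy_samples 95
```

Run against the sample output above with a limit of 95, only the last sample (94% idle) would be printed.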
Display Memory Utilization Slabinfo
# vmstat -m
Get Information About Active / Inactive Memory Pages
# vmstat -a
=> Related: How do I find out Linux Resource utilization to detect system bottlenecks?
#3: w - Find Out Who Is Logged on And What They Are Doing
The w command displays information about the users currently on the machine, and their processes.
# w username
# w vivek
Sample Outputs:
 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM         LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.1.3.145   14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1    10.1.3.145   17:43    0.00s  0.03s  0.00s w
#4: uptime - Tell How Long The System Has Been Running
The uptime command can be used to see how long the server has been running. It shows the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.
# uptime
Output:
18:02:41 up 41 days, 23:42, 1 user, load average: 0.00, 0.00, 0.00
A load of 1 can be considered the optimal value for a single CPU. The acceptable load varies from system to system, so read it relative to the number of CPUs: for a single CPU system a load of 1 - 3, and for SMP systems a load of 6 - 10, might be acceptable.
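Because the acceptable load depends on how many CPUs the box has, it can help to normalize the load average by the CPU count. This is a small sketch under stated assumptions: load_per_cpu is a hypothetical helper name, and the usage line assumes a Linux system with /proc/loadavg and the nproc command from coreutils.

```shell
#!/bin/sh
# load_per_cpu LOAD NCPU: divide a load average by the number of CPUs.
# Hypothetical helper for illustration; prints the ratio with two decimals.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN {
        # Guard against a zero or missing CPU count.
        if (ncpu + 0 > 0) printf "%.2f\n", load / ncpu
    }'
}

# Usage on a live Linux system (1 minute load average per CPU):
#   load_per_cpu "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
```

A result well above 1.00 suggests that, on average, more tasks wanted to run than there were CPUs to run them.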
#5: ps - Displays The Processes
The ps command reports a snapshot of the current processes. To select all processes use the -A or -e option:
# ps -A
Sample Outputs:
PID TTY TIME CMD
1 ? 00:00:02 init
2 ? 00:00:02 migration/0
3 ? 00:00:01 ksoftirqd/0
4 ? 00:00:00 watchdog/0
5 ? 00:00:00 migration/1
6 ? 00:00:15 ksoftirqd/1
....
.....
4881 ? 00:53:28 java
4885 tty1 00:00:00 mingetty
4886 tty2 00:00:00 mingetty
4887 tty3 00:00:00 mingetty
4888 tty4 00:00:00 mingetty
4891 tty5 00:00:00 mingetty
4892 tty6 00:00:00 mingetty
4893 ttyS1 00:00:00 agetty
12853 ? 00:00:00 cifsoplockd
12854 ? 00:00:00 cifsdnotifyd
14231 ? 00:10:34 lighttpd
14232 ? 00:00:00 php-cgi
54981 pts/0 00:00:00 vim
55465 ? 00:00:00 php-cgi
55546 ? 00:00:00 bind9-snmp-stat
55704 pts/1 00:00:00 ps
ps is similar to top, but it prints a one-time snapshot instead of a continuously updating display, and it offers many more output options.
Show Long Format Output
# ps -Al
To turn on extra full mode (it will show command line arguments passed to process):
# ps -AlF
To See Threads (LWP and NLWP)
# ps -AlFL
To See Threads After Processes
# ps -AlLm
Print All Processes On The Server
# ps ax
# ps axu
Print A Process Tree
# ps -ejH
# ps axjf
# pstree
Print Security Information
# ps -eo euser,ruser,suser,fuser,f,comm,label
# ps axZ
# ps -eM
See Every Process Running As User Vivek
# ps -U vivek -u vivek u
Set Output In a User-Defined Format
# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
# ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
# ps -eopid,tt,user,fname,tmout,f,wchan
Display Only The Process IDs of Lighttpd
# ps -C lighttpd -o pid=
OR
# pgrep lighttpd
OR
# pgrep -u vivek php-cgi
Display The Name of PID 55977
# ps -p 55977 -o comm=
Find Out The Top 10 Memory Consuming Processes
# ps aux | sort -nr -k 4 | head -10
Find Out The Top 10 CPU Consuming Processes
# ps aux | sort -nr -k 3 | head -10
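The sort pipelines above lose the ps header line, so you have to remember which column is which. Here is a minimal sketch of a filter that keeps the header and sorts the rest. The name sort_by_column is hypothetical, and the column numbers in the usage lines assume the usual `ps aux` layout (%CPU in column 3, %MEM in column 4).

```shell
#!/bin/sh
# sort_by_column COL [N]: read a table on stdin, print its header line
# unchanged, then the N rows (default 10) with the largest numeric value
# in column COL. Hypothetical helper for illustration.
sort_by_column() {
    sbc_col="$1"
    sbc_n="${2:-10}"
    {
        # Pass the first line (the header) through as-is.
        IFS= read -r sbc_header && printf '%s\n' "$sbc_header"
        # Sort the remaining rows numerically, largest first.
        sort -nr -k "$sbc_col,$sbc_col" | head -n "$sbc_n"
    }
}

# Usage (column numbers assume the `ps aux` layout):
#   ps aux | sort_by_column 4 10    # top 10 by %MEM
#   ps aux | sort_by_column 3 10    # top 10 by %CPU
```

If your ps comes from procps, its own --sort option (for example `ps aux --sort=-%mem`) is another way to get a similar result.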
#6: free - Memory Usage
The command free displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.
# free
Sample Output:
             total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
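The numbers free prints come from /proc/meminfo, so a script can read that file directly. The following is a rough sketch, assuming a kernel that exposes the MemAvailable field (3.14 or later); mem_used_percent is a hypothetical helper name, not a standard command.

```shell
#!/bin/sh
# mem_used_percent: read /proc/meminfo-style text on stdin and print the
# percentage of RAM that is not available for new workloads.
# Hypothetical helper; assumes the MemTotal and MemAvailable fields exist.
mem_used_percent() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END {
            if (total > 0) printf "%.1f\n", (total - avail) * 100 / total
        }'
}

# Usage on a live Linux system:
#   mem_used_percent < /proc/meminfo
```

This counts buffers and cache as available memory, which is the same idea as the "-/+ buffers/cache" line in the sample output.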
=> Related:
- Linux Find Out Virtual Memory PAGESIZE
- Linux Limit CPU Usage Per Process
- How much RAM does my Ubuntu / Fedora Linux desktop PC have?
#7: iostat - Average CPU Load, Disk Activity
The command iostat reports Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions and network filesystems (NFS).
# iostat
Sample Outputs:
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in) 06/26/2009
avg-cpu: %user %nice %system %iowait %steal %idle
3.50 0.09 0.51 0.03 0.00 95.86
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 22.04 31.88 512.03 16193351 260102868
sda1 0.00 0.00 0.00 2166 180
sda2 22.04 31.87 512.03 16189010 260102688
sda3 0.00 0.00 0.00 1615 0
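On servers with many disks and partitions, most rows in the device table are idle. A minimal sketch of a filter that keeps only the active devices follows. busy_devices is a hypothetical helper name, the threshold of 1 transfer per second is an arbitrary default, and the parsing assumes the device table layout shown above (device name in column 1, tps in column 2).

```shell
#!/bin/sh
# busy_devices [MIN_TPS]: read iostat output on stdin and print the name
# and tps of every device at or above MIN_TPS (default 1).
# Hypothetical helper for illustration.
busy_devices() {
    bd_min="${1:-1}"
    awk -v min="$bd_min" '
        # The device table starts after the "Device:" header line
        # (newer sysstat versions print "Device" without the colon).
        $1 == "Device:" || $1 == "Device" { in_table = 1; next }
        # A blank line ends the table.
        NF == 0 { in_table = 0 }
        in_table && NF >= 2 && ($2 + 0) >= (min + 0) { print $1, $2 }'
}

# Usage (assumes iostat from the sysstat package is installed):
#   iostat | busy_devices 1
```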
=> Related: Linux Track NFS Directory / Disk I/O Stats
#8: sar - Collect and Report System Activity
The sar command is used to collect, report, and save system activity information. To see the network counters, enter:
# sar -n DEV | more
To display the network counters from the 24th:
# sar -n DEV -f /var/log/sa/sa24 | more
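The daily data files are named after the day of the month, so the file name can be built in a script. This is only a sketch: sa_file is a hypothetical helper, SA_DIR is a made-up variable name for overriding the directory, and /var/log/sa is the location used in the example above (other distributions may keep these files elsewhere).

```shell
#!/bin/sh
# sa_file DAY: print the path of the sar data file for a day of the month.
# Hypothetical helper; SA_DIR is an assumed override, default /var/log/sa.
sa_file() {
    # Drop one leading zero so that "08" is not parsed as an octal number.
    sf_day="${1#0}"
    printf '%s/sa%02d\n' "${SA_DIR:-/var/log/sa}" "$sf_day"
}

# Usage:
#   sar -n DEV -f "$(sa_file 24)" | more
#   sar -u -f "$(sa_file "$(date +%d)")"     # today's CPU data
```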
You can also display real time usage using sar:
# sar 4 5
Sample Outputs:
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in) 06/26/2009
06:45:12 PM  CPU   %user   %nice %system %iowait  %steal   %idle
06:45:16 PM  all    2.00    0.00    0.22    0.00    0.00   97.78
06:45:20 PM  all    2.07    0.00    0.38    0.03    0.00   97.52
06:45:24 PM  all    0.94    0.00    0.28    0.00    0.00   98.78
06:45:28 PM  all    1.56    0.00    0.22    0.00    0.00   98.22
06:45:32 PM  all    3.53    0.00    0.25    0.03    0.00   96.19
Average:     all    2.02    0.00    0.27    0.01    0.00   97.70
=> Related: How to collect Linux system utilization data into a file
#9: mpstat - Multiprocessor Usage
The mpstat command displays activities for each available processor, processor 0 being the first one. Use mpstat -P ALL to display average CPU utilization per processor:
# mpstat -P ALL
Sample Output:
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in) 06/26/2009
06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00   96.52     25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00   96.23     14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00   95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00   95.69     76.89
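With many processors it is easy to miss the one that is doing all the work. The sketch below picks the CPU with the lowest %idle from mpstat -P ALL output. busiest_cpu is a hypothetical helper name. Because column positions differ between sysstat versions, it looks up the "CPU" and "%idle" columns from the header line instead of hard-coding them, and it skips the "Average:" lines, whose columns are shifted.

```shell
#!/bin/sh
# busiest_cpu: read `mpstat -P ALL` output on stdin and print the CPU with
# the lowest %idle value. Hypothetical helper for illustration.
busiest_cpu() {
    awk '
        # Average lines have a different number of leading fields; skip them.
        $1 == "Average:" { next }
        {
            # On a header line, remember where the CPU and %idle columns are.
            is_header = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   { cpu_col = i; is_header = 1 }
                if ($i == "%idle") { idle_col = i }
            }
            if (is_header) next
        }
        # Data rows for single CPUs have a numeric CPU id ("all" is skipped).
        cpu_col && idle_col && $cpu_col ~ /^[0-9]+$/ {
            if (min == "" || ($idle_col + 0) < min) {
                min = $idle_col + 0
                cpu = $cpu_col
            }
        }
        END { if (cpu != "") print "cpu" cpu, "idle=" min "%" }'
}

# Usage (assumes mpstat from the sysstat package is installed):
#   mpstat -P ALL | busiest_cpu
```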
=> Related: Linux display each multiple SMP CPU processors utilization individually.
#10: pmap - Process Memory Usage
The command pmap reports the memory map of a process. Use this command to find out causes of memory bottlenecks.
# pmap -d PID
To display process memory information for pid # 47394, enter:
# pmap -d 47394
Sample Outputs:
47394: /usr/bin/php-cgi
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000    2584 r-x-- 0000000000000000 008:00002 php-cgi
0000000000886000     140 rw--- 0000000000286000 008:00002 php-cgi
00000000008a9000      52 rw--- 00000000008a9000 000:00000 [ anon ]
0000000000aa8000      76 rw--- 00000000002a8000 008:00002 php-cgi
000000000f678000    1980 rw--- 000000000f678000 000:00000 [ anon ]
000000314a600000     112 r-x-- 0000000000000000 008:00002 ld-2.5.so
000000314a81b000       4 r---- 000000000001b000 008:00002 ld-2.5.so
000000314a81c000       4 rw--- 000000000001c000 008:00002 ld-2.5.so
000000314aa00000    1328 r-x-- 0000000000000000 008:00002 libc-2.5.so
000000314ab4c000    2048 ----- 000000000014c000 008:00002 libc-2.5.so
.....
......
..
00002af8d48fd000       4 rw--- 0000000000006000 008:00002 xsl.so
00002af8d490c000      40 r-x-- 0000000000000000 008:00002 libnss_files-2.5.so
00002af8d4916000    2044 ----- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b15000       4 r---- 0000000000009000 008:00002 libnss_files-2.5.so
00002af8d4b16000       4 rw--- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b17000  768000 rw-s- 0000000000000000 000:00009 zero (deleted)
00007fffc95fe000      84 rw--- 00007ffffffea000 000:00000 [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000 [ anon ]
mapped: 933712K    writeable/private: 4304K    shared: 768000K
The last line is very important:
- mapped: 933712K total amount of memory mapped to files
- writeable/private: 4304K the amount of private address space
- shared: 768000K the amount of address space this process is sharing with others
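Since the summary line carries the interesting number, a script can pull it out instead of reading the whole map. A minimal sketch, assuming the `pmap -d` summary format shown above; private_kb is a hypothetical helper name.

```shell
#!/bin/sh
# private_kb: read `pmap -d PID` output on stdin and print the
# writeable/private size in kilobytes, without the trailing "K".
# Hypothetical helper for illustration.
private_kb() {
    awk '$1 == "mapped:" {
        for (i = 1; i < NF; i++)
            if ($i == "writeable/private:") {
                value = $(i + 1)
                sub(/K$/, "", value)
                print value
            }
    }'
}

# Usage (47394 is the example PID from above):
#   pmap -d 47394 | private_kb
```

Sampling this value for the same PID over time is one rough way to spot a process whose private memory keeps growing.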
=> Related: Linux find the memory used by a program / process using pmap command
#11 and #12: netstat and ss - Network Statistics
The command netstat displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. The ss command is used to dump socket statistics. It shows information similar to netstat. See the following resources about ss and netstat commands:
- ss: Display Linux TCP / UDP Network and Socket Information
- Get Detailed Information About Particular IP address Connections Using netstat Command
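A quick way to get a feel for the socket table is to count connections per TCP state. The following is a small sketch, assuming the output layout of `ss -tan`, where the first line is a header and the first column is the state; count_states is a hypothetical helper name.

```shell
#!/bin/sh
# count_states: read `ss -tan` output on stdin and print how many sockets
# are in each TCP state, most frequent first.
# Hypothetical helper for illustration.
count_states() {
    awk '
        # Line 1 is the column header; every other line starts with a state.
        NR > 1 && NF > 0 { count[$1]++ }
        END { for (state in count) print count[state], state }' |
        sort -k1,1nr -k2,2
}

# Usage (assumes ss from the iproute2 package is installed):
#   ss -tan | count_states
```

A large number of sockets stuck in one state (for example TIME-WAIT or SYN-RECV) is often a useful hint when chasing network bottlenecks.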
#13: iptraf - Real-time Network Statistics
The iptraf command is an interactive, colorful IP LAN monitor. It is an ncurses-based tool that generates various network statistics including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following info in an easy to read format:
- Network traffic statistics by TCP connection
- IP traffic statistics by network interface
- Network traffic statistics by protocol
- Network traffic statistics by TCP/UDP port and by packet size
- Network traffic statistics by Layer2 address
#14: tcpdump - Detailed Network Traffic Analysis
tcpdump is a simple command that dumps traffic on a network. However, you need a good understanding of the TCP/IP protocols to use this tool. For example, to display traffic info about DNS, enter:
# tcpdump -i eth1 'udp port 53'
To display all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets, enter:
# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
To display all FTP sessions to 202.54.1.5, enter:
# tcpdump -i eth1 'dst 202.54.1.5 and (port 21 or 20)'
To display all HTTP sessions to 192.168.1.5:
# tcpdump -ni eth0 'dst 192.168.1.5 and tcp and port http'
To save all port 80 traffic to a capture file that you can open in wireshark for detailed analysis, enter:
# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80
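Filter expressions with several ports are easy to get wrong, especially the parentheses. One option is to build the expression in a small function and pass the result to tcpdump as a single quoted argument. This is only a sketch: dst_ports_filter is a hypothetical helper name, and it only covers the "destination host plus a list of ports" case used in the FTP example above.

```shell
#!/bin/sh
# dst_ports_filter HOST PORT...: print a tcpdump filter expression that
# matches traffic to HOST on any of the given ports.
# Hypothetical helper for illustration.
dst_ports_filter() {
    dpf_host="$1"
    shift
    dpf_ports=""
    for dpf_port in "$@"; do
        # Join the ports with "or", e.g. "port 21 or port 20".
        dpf_ports="${dpf_ports:+$dpf_ports or }port $dpf_port"
    done
    printf 'dst host %s and (%s)\n' "$dpf_host" "$dpf_ports"
}

# Usage (needs root; eth1 is the interface name from the examples above):
#   tcpdump -ni eth1 "$(dst_ports_filter 202.54.1.5 21 20)"
```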
#15: strace - System Calls
Trace system calls and signals. This is useful for debugging webserver and other server problems. See how to use strace to trace a process and see what it is doing.
#16: /proc file system - Various Kernel Statistics
The /proc file system provides detailed information about various hardware devices and other Linux kernel information. See the Linux kernel /proc documentation for further details. Common /proc examples:
# cat /proc/cpuinfo
# cat /proc/meminfo
# cat /proc/zoneinfo
# cat /proc/mounts
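Many /proc files, including cpuinfo and meminfo, use a simple "name : value" layout, so single fields are easy to extract in scripts. A minimal sketch follows; proc_field is a hypothetical helper name, and it assumes the value itself contains no colon (anything after a second colon would be cut off).

```shell
#!/bin/sh
# proc_field NAME: read "NAME : value" lines on stdin and print the value
# of every line whose name matches NAME exactly.
# Hypothetical helper; values that contain ":" are truncated.
proc_field() {
    awk -F '[ \t]*:[ \t]*' -v key="$1" '$1 == key { print $2 }'
}

# Usage on a live Linux system:
#   proc_field MemTotal < /proc/meminfo
#   proc_field "model name" < /proc/cpuinfo | sort | uniq -c
```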
#17: Nagios - Server And Network Monitoring
Nagios is a popular open source computer system and network monitoring application. You can easily monitor all your hosts, network equipment and services. It can send alerts when things go wrong and again when they get better. FAN is "Fully Automated Nagios". FAN's goal is to provide a Nagios installation that includes most tools provided by the Nagios community. FAN provides a CD-ROM image in the standard ISO format, making it easy to install a Nagios server. In addition, a wide range of tools is included in the distribution to improve the user experience around Nagios.
#18: Cacti - Web-based Monitoring Tool
Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged in users, Apache, DNS servers and much more. See how to install and configure Cacti network graphing tool under CentOS / RHEL.
#19: KDE System Guard - Real-time Systems Reporting and Graphing
KSysguard is a network enabled task and system monitor application for KDE desktop. This tool can be run over ssh session. It provides lots of features such as a client/server architecture that enables monitoring of local and remote hosts. The graphical front end uses so-called sensors to retrieve the information it displays. A sensor can return simple values or more complex information like tables. For each type of information, one or more displays are provided. Displays are organized in worksheets that can be saved and loaded independently from each other. So, KSysguard is not only a simple task manager but also a very powerful tool to control large server farms.
See the KSysguard handbook for detailed usage.
#20: Gnome System Monitor - Real-time Systems Reporting and Graphing
The System Monitor application enables you to display basic system information and monitor system processes, usage of system resources, and file systems. You can also use System Monitor to modify the behavior of your system. Although not as powerful as the KDE System Guard, it provides the basic information which may be useful for new users:
- Displays various basic information about the computer's hardware and software.
  - Linux Kernel version
  - GNOME version
  - Hardware
    - Installed memory
    - Processors and speeds
  - System Status
    - Currently available disk space
- Processes
- Memory and swap space
- Network usage
- File Systems
  - Lists all mounted filesystems along with basic information about each.
Bonus: Additional Tools
A few more tools:
- nmap - scan your server for open ports.
- lsof - list open files, network connections and much more.
- ntop web based tool - ntop is the best tool to see network usage in a way similar to what top command does for processes i.e. it is network traffic monitoring software. You can see network status, protocol wise distribution of traffic for UDP, TCP, DNS, HTTP and other protocols.
- Conky - Another good monitoring tool for the X Window System. It is highly configurable and is able to monitor many system variables including the status of the CPU, memory, swap space, disk storage, temperatures, processes, network interfaces, battery power, system messages, e-mail inboxes etc.
- GKrellM - It can be used to monitor the status of CPUs, main memory, hard disks, network interfaces, local and remote mailboxes, and many other things.
- vnstat - vnStat is a console-based network traffic monitor. It keeps a log of hourly, daily and monthly network traffic for the selected interface(s).
- htop - htop is an enhanced version of top, the interactive process viewer, which can display the list of processes in a tree form.
- mtr - mtr combines the functionality of the traceroute and ping programs in a single network diagnostic tool.
Did I miss something? Please add your favorite system monitoring tool in the comments.
- Last Updated: Oct/23/2009
{ 98 comments… read them below or add one }
Pretty much common knowledge (or should be) but handy to have listed all in one place.
yeap most of them are must-have tools.
good job of collecting them in a post.
Nice list. For systems with just a few nodes I recommend Munin. It’s easy to install and configure. My favorite tool for monitoring a linux cluster is Ganglia.
P.S. I think you should change this “#2: vmstat – Network traffic statistics by TCP connection …”
another useful tool is dstat , which combines vmstat, iostat, ifstat, netstat information and more. but this is a very useful list with some interesting examples!
pocess or process. haha, i love typos
What about Munin ? Lots easier and lighter than Cacti.
Nice list, worth bookmarking!
I have a step-by-step nagios implementation howto, some one may try that. please visit http://www.linux-bd.com/
and I always thanks vivek, to run such a nice site http://www.cyberciti.biz/
Once again, great article!!
I can see that the best tool to monitor processes , CPU, memeory and disk bottleneck at once is atop …
But the tool itself can cause a lot of trouble in heavily loaded servers and it enables process accounting and has a service running all the time …
To use it efficiently on RHEL , CentOS;
1- install rpmforge repo
2- # yum install atop
3- # killall atop
4- # chkconfig atop off
5- # rm -rf /tmp/atop.d/ /var/log/atop/
6- then don’t directly run “atop” command , but instead run it as follows;
# ATOPACCT='' atop
This tool has saved me hundreds of hours really! and helped me to diagnose bottlenecks and solve them that couldn’t otherwise be easily detected and would need many different tools
@Chris / James
Thanks for the heads-up!
Great post, also great reference.
Hi,
We have just added your latest post “20 Linux System Monitoring Tools
Every SysAdmin Should Know” to our Directory of Technology . You
can check the inclusion of the post here . We are delighted
to invite you to submit all your future posts to the directory and get a huge base of
visitors to your website.
Warm Regards
Techtrove.info Team
http://www.techtrove.info
You probably wanna add IFTOP tool, its really simple and light, very useful when u need to have a last moment remote access to a server to see hows the trific going.
Yeah, well why a so good admin (I dig(g) your site) won’t you use spelling checkers?
Typo #2 Web-based __Monitioring__ Tool
maybe it’s a typo too, but the title should be :
“.. Tools Every SysAdmin MUST Know”
and still, this is advanced user knowledge, at most. I would not trust a sysadmin that knows so few. And..
Hi guys,
good list – and some great submitted pointers to other useful tools.
to those carp-ing on about typo’s – give us all a break. you’ve never made a typo? ever?
Idea: How ’bout those who have never *ever* made an error in typing text be the first one(s) to give people grief about making a typo?
I _used_ to be a real PITA about this; then I grew up.
The purpose of this blog, and other forms of communication, is to *communicate* concepts and ideas. *If* you have received those clearly – in spite of the typos – then the purpose has been fulfilled.
/me gets down off his soapbox
.h
A script I use often to show the real memory usage of programs on linux, is ps_mem.py
I also summarised a few linux monitoring tools here
I’d also mention the powertop utility
This blog is more impressive and more useful than ever. I need more help regarding proper installation document on “php-network weathermap” on Cacti as plugins
No love for whowatch ? Real time info on who’s logged in, how their connected (SSH, TTY, etc) and what process thay have running.
http://www.pttk.ae.krakow.pl/~mike/#whowatch
vi — tool used to examine and modify almost any configuration file.
dtrace is a notable mention for the picky hackers that wish to know more about the behavior of the operating system and it’s programs internals.
hi gud information , keep it up
ash
You missed: iftop & nethogs
Excellent list. Like Amr El-Sharnoby above, I also find atop indispensable and think it must be installed on every system.
In addition I would like to add iotop to monitor disk usage per process and jnettop to very easily monitor bandwidth allocation between connections on a Linux system.
Well, the one i use right now is Pandora FMS 3.0 and its making my work easy.
I would like to add
whoami ,who am i, finger, pinky , id commands
i always love linux, great article
One tool which seems to be missing from this list is LTTng. It is a system-wide tracing tool which helps understanding complex performance problems in multithreaded, multiprocess applications involving many userspace-kernel interactions.
The project is available at http://www.lttng.org. Recent SuSE distributions, WindRiver, Monta Vista and STLinux offer the tracer as distribution packages. The standard way to use it is to install a patched kernel though. It comes with a trace analyzer, LTTV, which provides nice view of the overall system behavior.
Mathieu
Very useful, well done. Thanks!
Very informative.
I love this website.
If we’re talking about a web server, apachetop is a nice tool to see Apache’s activity.
Dude you forgot the most important of ALL!
net-snmpd
With it you can collect vast amounts of information. Then with snmpwalk and scripts you can create your own web NMS to collect simple information like ping, disk space, services down.
`iotop` is nice one to be include in list. I used `vnstat` very much for keeping track of my download when I was on limited connection :)
@Everyone
Thanks for sharing all your tools with us.
Very useful, thinks for sharing.
Take a look to a great tools called nmon. I use it on AIX IBM system but works now on all GNU/linux system now.
mtr
I’m with @paul tergeist, tools every linux user should know. The ps samples are nice, thanks.
No reference to configuration management tools ?
cfengine/puppet/chef?
Nice summary article.
If your “system” is large and/or distributed, and the performance issues you’re tackling are complex, you may wish to explore Performance Co-Pilot (PCP). It unifies all of the performance data from the tools you’ve mentioned (and more), can be extended to include new applications and service layers, works across the network and for clusters and provides both real-time and retrospective analysis.
See http://www.oss.sgi.com/projects/pcp
PCP is included in the Debian-based and SUSE distributions and is likely to appear in the RH distributions in the future.
As a bonus, PCP also works for monitoring non-Linux platforms (Windows and some of the Unix derivatives).
I love your collection.
I use about 25% of those regularly, and another 25% semi-regularly. I’ll have to add another 25% of those to my list of regulars.
Thanks for compiling this list.
Very nice collection of linux applications. I work with linux but I can’t say that i know them all.
REALLY ITS VERY GOOD N USEFULL FOR ALL ADMIN.
THANKS ONCE AGAIN
Good post…already bookmarked… cheers
I’ll just mention “ngrep” – network grep.
Great list, thanks!!
Aleksey
Thanks for sharing this information..
feilong, I agree. I use nmon on my linux boxes from years. It’s worth a look.
Great article, many great suggestions.
Was surprised not to see these among the suggestions:
bmon – graphs/tracks network activity/bandwidth real time.
etherape – great visual indicator of what traffic is going where on the network
wireshark – tcpdump on steroids.
multitail – tail multiple files in a single terminal window
swatch – track your log files and fire off alerts
how the hell i missed this site this many days… :P thank god i found it… :) i love it…
O personally much prefer htop to top. Displays everything very nicely.
phpsysinfo is another nice light web-based monitoring tool. Very easy to setup and use.
Osmius: The Open Source Monitoring Tool is C++ and Java. Monitor “everything” connected to a network with incredible performance. Create and integrate Business Services, SLAs and ITIL processes such as availability management and capacity planning.
thanks for sharing all the helpful tools.
Nice compilation. As usual, always very useful.
It would be nice if some of you knowledgeable guys can shed some light on java heap monitoring thing, thread lock detection and analysis, heap analysis etc.
nmon is a nice tool… try google for it, it rocks
Very much Useful Information’s,
trafmon is one more useful tool
And for those which like lightweight and concise graphical metering:
xosview +disk -ints -bat
Awesome. Especially love the ps tips. Very interesting
Thanks very good info!!!
It’s really nice :)
Excellent list!
Nice… very nice guy!!!! ;-)
From the guy who wrote the collect utility for Tru64:
Name : collectl Relocations: (not relocatable)
Version : 3.3.5 Vendor: Fedora Project
Release : 1.fc10 Build Date: Fri Aug 21 13:22:42 2009
Install Date: Tue Sep 1 18:10:34 2009 Build Host: x86-5.fedora.phx.redhat.com
Group : Applications/System Source RPM: collectl-3.3.5-1.fc10.src.rpm
Size : 1138212 License: GPLv2+ or Artistic
Signature : DSA/SHA1, Mon Aug 31 14:42:40 2009, Key ID bf226fcc4ebfc273
Packager : Fedora Project
URL : http://collectl.sourceforge.net
Summary : A utility to collect various linux performance data
Description :
A utility to collect linux performance data
Best regards, Bob
For professional network monitoring use Zenoss:
Zenoss Core (open source): http://www.zenoss.com/product/network-monitoring
Hi,
Thanks for the nice collection with useful samples. Consider adding tools to monitor SAN storage, multipath etc. also.
Best Regards,
Somnath
I did not see ifconfig or iwconfig on the list
openNMS
Thanks for the article. I am not admin myself, but tools are very useful for me too.
Thanks for the comments also :)
When I wrote collectl my goal was to replace as many utilities as possible for several reasons including:
- not all write to log files
- different output formats make correlation VERY difficult
- sar is close but still too many things it doesn’t collect
- I wanted option to generate data that can be easily plotted or loaded into spreadsheet
- I wanted sub-second monitoring
- I want an API and I want to be able to send data over sockets to other tools
- and a whole lot more
I think I succeeded on many fronts, in particular not having to worry if the right data is being collected. Just install rpm and type “/etc/init.d/collectl start” and you’re collecting everything such as slabs and processes every 60 seconds and everything else every 10 seconds AND using <0.1% of the CPU to do so. I personally believe if you're collecting performance counters at a minute or coarser you're not really seeing what your system is doing.
As for the API, I worked with some folks at PNNL to monitor their 2300 node cluster, pass the data to ganglia and from there they pass it to their own real-time plotting tool that can display counters for the entire cluster in 3D. They also collectl counters from individual CPUs and pass that data to collectl as well.
I put together a very simple mapping of 'standard' utilities like sar to the equivalent collectl commands just to get a feel for how they compare. But also keep in mind there are a lot of things collectl does for which there is no equivalent system command, such as Infiniband or Lustre monitoring. How about buddyinfo? And more…
http://collectl.sourceforge.net/Matrix.html
-mark
Darn,
I’ve been using Linux since Windows 98 was the current MicroSnot FOPA.
I know all this stuff. I do not make typoous.
Why do you post this stuff?
We all know it.
Sure we do!
But do we remember it? I just read through it and found stuff that I used long ago and it was like I just learned it. I found stuff I didn’t know either.
Hummmm…… Imagine that!
Thanks, particularly for the PDF.
Saved me making one.
Hey, where’s the HTML to PDF howto?
Thanks again.
Use:
free -m
To show memory usage in megabytes, which is much more useful.
Is it possible to display hard drive temps from hddtemp in KSysGuard? They are available in Ksensors and GKrellM, without any configuration required. However I prefer the interface and flexibility of KSysGuard. Is there a way of configuring it?
Andrew
Zabbix open source monitoring tool
http://www.zabbix.com
Thanks, good work
Just thanks! :)
Good Job on assembling the list
If I may suggest trafshow as an alternative to iptraf when you need to see more detailed info on source/destination , proto and ports at once.
How to install the Kickstart method in linux
Very nice collection.. Worth a bookmark…Bravo…
Thanks a lot…
nice sharing, this is what i want looking for few day ago… tq
This is a nice document for new user, thaks to owner of this document.
arun
Great post!! Thanks.
Very helpful. Thanks a lot!
After so many thanks. Add one more……..
thank you. It’s very handy.
Mark,
I am in technology myself and this tutorial page is very well organized
Thanks for taking the time to create this awesome page
great help for Linux new bees like myself.
I meant to thank Vivek Gita
once again awesome job
Thank you very much VERY GOOD WEBSITE
it is cool
Thanks for sharing most resourceful information.
Dear all Members,
Thanks for sharing all your knowledge about Linux.. i really thankful for your share linux tips..!!
thanks and continue this jurny…as well
thank you..
Good info. Thanks for sharing.
May GOD bless you to do more.
This is indeed an impressive collection of tools but I still have to ask if people are really happy with having to know so many names, so many switches and so many formats. If you run one command and see something weird doesn’t it bother you if you have to run a different tool but the anomaly already passed and you can no longer see it with a different tool? For example if you see a drop in network performance and wonder if there was a memory or cpu problem, it’s too late to go back and see what else was going on. I know it bothers me. Again, by running collectl I never have to worry about that because it collects everything (when run as a deamon) or you can just tell it to report lots of things when running interactively and by default is shows cpu, disk and network. If you want to add memory, you can always include it but you will need a wider screen to see the output.
As a curiosity for those who run sar – I never do – what do you use for a monitoring interval? The default is to take 10 minute samples which I find quite worthless – remember sar has been around forever dating back to when cpus were much slower and monitoring much more expensive. I’d recommend to run sar with a 10 second sampling level like collectl and you’ll get far more out of it. The number of situations which this would be too much of a load on your system would be extremely rare. Anyone care to comment?
-mark
Amr El-Sharnoby:
atop is awesome, thanks for the tip.
hi Mark
absolutely agreed with you mate! if you are the sysadmin something – you will do it for yourself and do it right!
These tools like ps,top and other is commonly used by users who administrated a non-productive or desktop systems or for some users who’s temporary came to the system and who needed to get a little bit of information about the box – and its pretty good enough for them. )
If you are running a web server and you have multiple clients writing code, you will one day see CPU slow to a crawl. “Why?”, you will ask. ps -ef and top will show that mysql is eating up resources…
HMM?
If only there was a tool which showed me what command was being issued against the database…
mytop
Once you find the select statement that has mysql running at 99% of the CPU, you can kill the query and then go chase down the client and kill them too (or in my case bill them at $250/hr for fixing their code).
re mysql – it’s not necessarily that straight forward. I was working with someone who had a system with mysql that was crawling. it was taking multiple seconds for vi to echo a single character! we ran collectl on it and could see low cpu, low network and low disk i/o. Lots of available memory, so what gives? A close look showed me that even though the I/O rates were low, the average request sizes were also real low – probably due to small db requests.
digging even deeper with collectl I saw the i/o request service times were multiple seconds! in other words when you requested an I/O operation no matter how fast the disk is, it took over 2 seconds to complete and that’s why vi was so slow, it was trying to write to its backing store.
bottom line – running a single tool and only looking at one thing does not tell the whole story. you need to see multiple things AND see them at the same time.
-mark
I have a postfix mail server, recently through tcpdump I see alot of traffic to dc.mx.aol.com, fedExservices.com, wi.rr.com, mx1.dixie-net.com. I believe my mail server is spamming. How do I find out it is spamming? and how do I stop it. Please help.
Only allow authenticated email users to send an email. There are other things too such as anti-spam, ssl keys, domain keys and much more.
Dear sir pls send me some linex pdf file by wich i can learn how to install & maintanes