mod_compress: Lighttpd Gzip Compression To Improve Download and Browsing Speed

Posted in categories Apache, High performance computing, Howto, lighttpd, Linux, News, php, UNIX; last updated December 14, 2008

Gzip compression reduces response times by reducing the size of the HTTP response. This document describes gzipping HTTP traffic, which can reduce the response size by about 70%. Approximately 90% of today’s Internet traffic travels through browsers that claim to support compression.
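To get a rough feel for the savings, you can compress a text file locally with gzip and compare the sizes. This is only a minimal sketch: the sample file is generated on the fly and is far more repetitive than a real page, so it will compress better than typical HTML. The file contents and the percent_saved helper are made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: percentage saved going from $1 bytes down to $2 bytes.
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# Build a throwaway HTML-like sample file (illustrative content only).
sample=$(mktemp)
i=0
while [ "$i" -lt 500 ]; do
    echo "<li class=\"item\">Entry number $i in a sample list</li>" >> "$sample"
    i=$((i + 1))
done

# Compress to a separate file so the original is kept for comparison.
gzip -c "$sample" > "$sample.gz"

orig_size=$(wc -c < "$sample")
gz_size=$(wc -c < "$sample.gz")

echo "original:   $orig_size bytes"
echo "compressed: $gz_size bytes"
echo "saved:      $(percent_saved "$orig_size" "$gz_size")%"

rm -f "$sample" "$sample.gz"
```

Running the same comparison against a saved copy of one of your own pages gives a more realistic number.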

Install Squid Proxy Server on CentOS / Redhat enterprise Linux 5

Posted in categories CentOS, Linux, RedHat/Fedora Linux, Squid caching server, Suse Linux, Sys admin, Tips; last updated February 24, 2008

I’ve already written about setting up a Linux transparent Squid proxy system. However, I’m getting lots of questions about basic Squid installation and configuration:

How do I install Squid Proxy server on a CentOS 5 Linux server?

Sure. Squid is a popular open source, GPL-licensed proxy and web cache. It has a variety of uses, from speeding up a web server by caching repeated requests, to caching web, name server, and other network lookups for a group of people sharing network resources. It is primarily designed to run on Linux / Unix-like systems. Squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, Squid handles all requests in a single, non-blocking, I/O-driven process. Squid keeps metadata and especially hot objects cached in RAM, caches DNS lookups, supports non-blocking DNS lookups, and implements negative caching of failed requests. Squid consists of a main server program (squid), a Domain Name System lookup program (dnsserver), a program for retrieving FTP data (ftpget), and some management and client tools.

Install Squid on CentOS / RHEL 5

Use the yum command as follows:
# yum install squid

Loading "installonlyn" plugin
Setting up Install Process
Setting up repositories
Reading repository metadata in from local files
Parsing package install arguments
Resolving Dependencies
--> Populating transaction set with selected packages. Please wait.
---> Package squid.i386 7:2.6.STABLE6-4.el5 set to be updated
--> Running transaction check

Dependencies Resolved

 Package                 Arch       Version          Repository        Size 
 squid                   i386       7:2.6.STABLE6-4.el5  updates           1.2 M

Transaction Summary
Install      1 Package(s)         
Update       0 Package(s)         
Remove       0 Package(s)         

Total download size: 1.2 M
Is this ok [y/N]: y
Downloading Packages:
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing: squid                        ######################### [1/1] 

Installed: squid.i386 7:2.6.STABLE6-4.el5

Squid Basic Configuration

The Squid configuration file is located at /etc/squid/squid.conf. Open the file using a text editor:
# vi /etc/squid/squid.conf
At a minimum, you need to define an ACL (access control list) to work with Squid. The default port is TCP 3128. The following example ACL allows access from your local networks. Make sure you adapt the list to the internal IP networks from which browsing should be allowed:
acl our_networks src
http_access allow our_networks

Save and close the file. Turn on Squid at boot and start the proxy server:
# chkconfig squid on
# /etc/init.d/squid start


init_cache_dir /var/spool/squid... Starting squid: .       [  OK  ]

Verify port 3128 is open:
# netstat -tulpn | grep 3128

tcp        0      0      *                   LISTEN      20653/(squid)
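If you want to script this check, the netstat output can be filtered with awk instead of grep, so that only a LISTEN socket bound to the port counts as a match. The following is a sketch rather than a definitive check; the function name port_is_listening is made up for this example, and it assumes the usual netstat column layout (local address in column 4, state in column 6).

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -tln` style output on stdin and
# succeeds if some TCP socket in LISTEN state is bound to the given port.
port_is_listening() {
    awk -v port="$1" '
        # Column 4 is the local address (e.g. 0.0.0.0:3128 or :::3128),
        # column 6 is the socket state on TCP lines.
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use on a live system (skipped if netstat is not installed).
if command -v netstat >/dev/null 2>&1; then
    if netstat -tln | port_is_listening 3128; then
        echo "port 3128 is listening"
    else
        echo "port 3128 is NOT listening"
    fi
fi
```

Unlike a plain grep for 3128, this does not match a PID or a remote port that merely contains the same digits.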

Open TCP port 3128

Finally, make sure iptables allows access to the Squid proxy server. Open the /etc/sysconfig/iptables file:
# vi /etc/sysconfig/iptables
Add the following rule above the chain's final REJECT line:
-A RH-Firewall-1-INPUT -m state --state NEW,ESTABLISHED,RELATED -m tcp -p tcp --dport 3128 -j ACCEPT
Restart the iptables-based firewall:
# /etc/init.d/iptables restart

Flushing firewall rules:                                   [  OK  ]
Setting chains to policy ACCEPT: filter                    [  OK  ]
Unloading iptables modules:                                [  OK  ]
Applying iptables firewall rules:                          [  OK  ]
Loading additional iptables modules: ip_conntrack_netbios_n[  OK  ]
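In the stock RHEL / CentOS rules file, the RH-Firewall-1-INPUT chain ends with a REJECT rule, so the new line has to sit above it to have any effect. One way to script that placement is sketched below with awk. The function name insert_before_reject and the small demo file are assumptions for illustration, and the example works on a throwaway copy rather than the live /etc/sysconfig/iptables.

```shell
#!/bin/sh
# Hypothetical helper: print rules file $1 with rule $2 inserted just before
# the first REJECT rule (or before COMMIT if there is no REJECT rule).
insert_before_reject() {
    awk -v rule="$2" '
        !inserted && (/-j REJECT/ || /^COMMIT/) { print rule; inserted = 1 }
        { print }
    ' "$1"
}

RULE='-A RH-Firewall-1-INPUT -m state --state NEW,ESTABLISHED,RELATED -m tcp -p tcp --dport 3128 -j ACCEPT'

# Demonstrate on a small throwaway file, not on /etc/sysconfig/iptables.
demo=$(mktemp)
cat > "$demo" <<'EOF'
*filter
:RH-Firewall-1-INPUT - [0:0]
-A RH-Firewall-1-INPUT -i lo -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT
EOF

insert_before_reject "$demo" "$RULE"
rm -f "$demo"
```

Review the printed result before copying it over the real file, and keep a backup of the original.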

Client configuration

Open a web browser, go to Tools > Internet Options > Network settings, and set the proxy to the Squid server's IP address and port 3128.
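Many command-line clients, curl and wget among them, read the proxy address from the http_proxy environment variable instead of a settings dialog. The sketch below builds that value; the proxy_url helper is made up for this example, and 192.0.2.10 is a placeholder documentation address, not a real Squid server.

```shell
#!/bin/sh
# Hypothetical helper: build a proxy URL from a host and an optional port
# (defaults to 3128, the Squid default port).
proxy_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-3128}"
}

# 192.0.2.10 is a placeholder; use your Squid server's IP address instead.
http_proxy=$(proxy_url 192.0.2.10)
export http_proxy

echo "http_proxy=$http_proxy"
# Clients that honor http_proxy will now go through Squid, e.g.:
#   curl -I http://example.com/
```

The setting lasts only for the current shell session unless you add it to a shell startup file.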

See also

You may find our previous squid tips useful:

Squid Security and blocking content Related Tips

Squid Authentication Related Tips

Squid Other Tips

How to optimize a web page for faster and better experience

Posted in categories Apache, High performance computing, Howto, lighttpd, Linux, Tips, Tuning, UNIX; last updated August 4, 2007

You may have noticed that most of my web pages are loading a bit faster. Here is what I did:

a) Moved CSS code to its own file and included it at the top of the page

b) Removed unnecessary (read as fancy web 2.0 stupid stuff) external JavaScript snippets

c) Moved external JavaScript to the bottom of the page template. For example, the Google Analytics JS code now sits at the bottom of the web page.

d) Turned on Apache gzip/mod_deflate compression

e) Turned on WordPress caching

f) Turned on PHP script caching (I’m using eAccelerator)

g) Tweaked MySQL for optimization: turned on the query cache and adjusted other settings.

h) If possible, switch to lighttpd, or use squid / lighttpd as a caching server for good old Apache.

If you have tons of cash to burn (assuming that your web app demands performance):

  • Consider using a CDN (Content Delivery Network) such as Akamai or SAVVIS.
  • Use server load balancing.

However, there are some external JS snippets, such as Google Adsense, which slow down the loading of a web page. In a few months I may roll out a new template, and I will try to fix this issue :)

I’m interested to hear about other people’s experiences with web page optimization. Feel free to share your tips.

How do I find out Linux Resource utilization to detect system bottlenecks?

Posted in categories Linux, Troubleshooting; last updated November 23, 2007

Q. How can I find out Linux Resource utilization using vmstat command? How do I get information about high disk I/O and memory usage?

A. The vmstat command reports information about processes, memory, paging, block I/O, traps, and CPU activity. A real advantage of vmstat output is that it is concise, to the point, and easy to read. The output is used to help identify system bottlenecks. Please note that Linux vmstat does not count itself as a running process.

Here is the output of the vmstat command from my enterprise-grade system:
$ vmstat -S M

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1963    607   2359    0    0     0     0    0     1 32  0 68  0


  • The first line is nothing but six different categories. The second line breaks each category down into fields, and the values under those fields are all the data you need.
  • -S M: vmstat lets you choose the unit (k, K, m, or M); the default is K (1024 bytes). I am using M since this system has over 4 GB of memory. Without the -S M option, vmstat uses K as the unit:

$ vmstat

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0 2485120 621952 2415368  0    0     0     0    0     1 32  0 68  0
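To relate the two outputs, a value in K divides by 1024 to give the value in M. A quick sketch using shell integer arithmetic follows; the k_to_m helper is made up for this example, and integer division drops the fraction, so a result can differ by one from what vmstat prints.

```shell
#!/bin/sh
# Hypothetical helper: convert a vmstat value in K (1024 bytes)
# to M (1048576 bytes) using integer division.
k_to_m() {
    echo $(( $1 / 1024 ))
}

k_to_m 2485120   # the "free" value from the default output
k_to_m 621952    # the "buff" value
k_to_m 2415368   # the "cache" value
```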

Field Description For Vm Mode

(a) procs: the process-related fields are:

  • r: The number of processes waiting for run time.
  • b: The number of processes in uninterruptible sleep.

(b) memory: the memory-related fields are:

  • swpd: the amount of virtual memory used.
  • free: the amount of idle memory.
  • buff: the amount of memory used as buffers.
  • cache: the amount of memory used as cache.

(c) swap: the swap-related fields are:

  • si: Amount of memory swapped in from disk (/s).
  • so: Amount of memory swapped to disk (/s).

(d) io: the I/O-related fields are:

  • bi: Blocks received from a block device (blocks/s).
  • bo: Blocks sent to a block device (blocks/s).

(e) system: the system-related fields are:

  • in: The number of interrupts per second, including the clock.
  • cs: The number of context switches per second.

(f) cpu: the CPU-related fields are:

These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.
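The field list above can be turned into a few rule-of-thumb checks. The sketch below reads vmstat output and prints a note whenever a sample looks suspicious. The function name vmstat_hints, the CPU count passed to it, and the thresholds (run queue larger than the CPU count, any swap activity, I/O wait above 20%) are all assumptions for illustration, not values from vmstat itself; tune them for your system.

```shell
#!/bin/sh
# Hypothetical helper: read vmstat output on stdin and flag samples that
# trip one of these rule-of-thumb checks (thresholds are assumptions):
#   r  > number of CPUs  -> processes are queuing for CPU
#   si or so > 0         -> the system is swapping
#   wa > 20              -> CPU time is going to I/O wait
# $1 is the number of CPUs (defaults to 1).
vmstat_hints() {
    awk -v ncpu="${1:-1}" '
        # Data rows start with a number; the two header rows do not.
        $1 ~ /^[0-9]+$/ {
            n++
            if ($1 + 0 > ncpu + 0) printf "sample %d: run queue %d exceeds %d CPU(s)\n", n, $1, ncpu
            if ($7 + $8 > 0)       printf "sample %d: swapping (si=%d so=%d)\n", n, $7, $8
            if ($16 + 0 > 20)      printf "sample %d: high I/O wait (wa=%d%%)\n", n, $16
        }
    '
}

# Typical use: vmstat 2 5 | vmstat_hints 4
# Here it is fed the single data row shown earlier, assuming a 2-CPU machine.
vmstat_hints 2 <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1963    607   2359    0    0     0     0    0     1 32  0 68  0
EOF
```

A single flagged sample means little on its own; look for a condition that persists across several intervals.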

As you can see, the first line of output gives averages since the last reboot. Additional reports give information on a sampling period of length delay. You need to sample data using a delay, i.e. collect data at set intervals. For example, collect data every 2 seconds (or every 2 seconds, 5 times only):
$ vmstat -S M 2
$ vmstat -S M 2 5

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1756    607   2359    0    0     0     0    0     1 32  0 68  0
3  0      0   1756    607   2359    0    0     0     0 1018    65 38  0 62  0
3  0      0   1756    607   2359    0    0     0     0 1011    64 37  0 63  0
3  0      0   1756    607   2359    0    0     0    20 1018    72 37  0 63  0
3  0      0   1756    607   2359    0    0     0     0 1012    64 37  0 62  0
3  0      0   1756    607   2359    0    0     0     0 1011    65 38  0 63  0
3  0      0   1995    607   2359    0    0     0     0 1012    62 35  2 63  0
3  0      0   1731    607   2359    0    0     0     0 1012    64 34  3 62  0
3  0      0   1731    607   2359    0    0     0     0 1013    72 38  0 62  0
3  0      0   1731    607   2359    0    0     0     0 1013    63 37  0 63  0
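Because the first row is the since-boot average, it is usually left out when summarizing a sampling run. As a sketch, the helper below averages the CPU columns over the remaining rows. The name vmstat_cpu_avg is made up for this example, and it assumes the 16-column layout shown above (us, sy, id, wa in columns 13 to 16).

```shell
#!/bin/sh
# Hypothetical helper: average the us, sy, id and wa columns of vmstat output
# read from stdin, ignoring the first data row (the since-boot average).
vmstat_cpu_avg() {
    awk '
        $1 ~ /^[0-9]+$/ {
            if (++row == 1) next        # skip the since-boot line
            us += $13; sy += $14; id += $15; wa += $16; n++
        }
        END {
            if (n == 0) { print "no interval samples"; exit 1 }
            printf "samples=%d us=%.1f sy=%.1f id=%.1f wa=%.1f\n", n, us/n, sy/n, id/n, wa/n
        }
    '
}

# Typical use: vmstat -S M 2 5 | vmstat_cpu_avg
# Here it is fed the first three rows of the sample output.
vmstat_cpu_avg <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1756    607   2359    0    0     0     0    0     1 32  0 68  0
3  0      0   1756    607   2359    0    0     0     0 1018    65 38  0 62  0
3  0      0   1756    607   2359    0    0     0     0 1011    64 37  0 63  0
EOF
```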

This is what most system administrators do to identify system bottlenecks. I hope all of you find vmstat data concise and easy to read.

See also: