cache

Gzip is the most popular and effective compression method. Most modern web browsers support and accept compressed data transfer. With gzip, response time can be reduced by 60-70% compared to an uncompressed web page. The end result is a faster web site experience for both dial-up users (they're not dead yet - I keep a dial-up account for backup purposes) and broadband users. I've already written about speeding up Apache 2.x web access or downloads with mod_deflate.

mod_compress for Lighttpd 1.4.xx

Lighttpd 1.4.xx supports gzip compression using mod_compress. This module can reduce the network load and improve the overall throughput of the web server. All major HTTP clients support compression and announce it in the Accept-Encoding header as follows:

Accept-Encoding: gzip, deflate

If lighttpd sees this header in the request, it can compress the response using one of the methods listed by the client. The web server notifies the web client of this via the Content-Encoding header in the response:

Content-Encoding: gzip

This exchange is used to negotiate the most suitable compression method. Lighttpd supports deflate, gzip and bzip2.
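The negotiation described above can be sketched in a few lines of shell. This is only an illustration, not how lighttpd implements it: the function name pick_encoding is made up, the list of methods is taken from the text above, and q-values in the header are ignored.

```shell
#!/bin/sh
# Simplified sketch: print the first encoding listed by the client that the
# server also supports. Prints nothing when there is no common method.
# pick_encoding is a hypothetical helper name, not part of lighttpd.
pick_encoding() {
    # $1 is the value of the Accept-Encoding request header.
    # Split on commas, then drop ";q=..." parameters and whitespace.
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/;.*//; s/[[:space:]]//g' |
    while IFS= read -r enc; do
        case $enc in
            gzip|deflate|bzip2)
                printf '%s\n' "$enc"
                break
                ;;
        esac
    done
}

pick_encoding "gzip, deflate"
```

A real server also weighs the q-values sent by the client, so treat this as a reading aid for the two headers rather than a complete algorithm.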

Configure mod_compress

Open your lighttpd.conf file:
# vi /etc/lighttpd/lighttpd.conf
Append mod_compress to the server.modules directive:
server.modules += ( "mod_compress" )
Set compress.cache-dir to the directory that will store the cached compressed files:
compress.cache-dir = "/tmp/lighttpdcompress/"
Finally, define the MIME types to compress. The following compresses JavaScript, plain text, CSS and XML files:

compress.filetype           = ("text/plain","text/css", "text/xml", "text/javascript" )

Save and close the file. Create the /tmp/lighttpdcompress/ directory:
# mkdir -p /tmp/lighttpdcompress/
# chown lighttpd:lighttpd /tmp/lighttpdcompress/

Restart lighttpd:
# /etc/init.d/lighttpd restart
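
After the restart you may want to confirm that responses really come back compressed. The sketch below assumes curl is installed; is_gzip_response is a made-up helper name and www.example.com/style.css is a placeholder URL. The helper only inspects response headers fed to it on standard input.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the HTTP response headers read from stdin
# contain "Content-Encoding: gzip". Hypothetical helper for illustration.
is_gzip_response() {
    # Strip carriage returns first, since HTTP header lines end in CRLF.
    tr -d '\r' | grep -qi '^content-encoding:[[:space:]]*gzip'
}

# Typical use against a live server (placeholder host and file):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip, deflate' \
#        http://www.example.com/style.css | is_gzip_response && echo compressed
```

Request a file whose MIME type is listed in compress.filetype; other types are served uncompressed.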

How do I enable mod_compress per virtual host?

Use the conditional $HTTP["host"] directive. For example, to turn on compression for theos.in:

$HTTP["host"] =~ "theos\.in" {
  compress.cache-dir = "/var/www/cache/theos.in/"
}

PHP dynamic compression

Open the php.ini file:
# vi /etc/php.ini
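To compress dynamic content with PHP, enable zlib.output_compression. Leave zlib.output_handler unset: it takes the name of an output handler rather than On/Off, and additional handlers cannot be specified while zlib.output_compression is active:
zlib.output_compression = On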

Save and close the file. Restart lighttpd:
# service lighttpd restart

Cleaning cache directory

You need to run a shell script periodically to clean out the cache directory.
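
A minimal sketch of such a cleanup script follows. The function name clean_compress_cache and the 10-day default are assumptions chosen for illustration; adjust the age and the directory to match your compress.cache-dir setting.

```shell
#!/bin/sh
# Delete regular files under a cache directory that have not been modified
# for more than N days (default: 10). Hypothetical helper for illustration.
clean_compress_cache() {
    dir=$1
    days=${2:-10}
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir" >&2
        return 1
    fi
    # -type f leaves the directory tree itself in place.
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Example call, e.g. from a daily cron job:
#   clean_compress_cache /tmp/lighttpdcompress/ 10
```

Lighttpd recreates a cached file the next time the corresponding resource is requested, so removing old entries is safe.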


I've already written about setting up a Linux transparent Squid proxy system. However, I'm getting lots of questions about basic Squid installation and configuration:

How do I install Squid proxy server on a CentOS 5 Linux server?

Sure. Squid is a popular open source GPL'd proxy and web cache. It has a variety of uses, from speeding up a web server by caching repeated requests, to caching web, name server queries, and other network lookups for a group of people sharing network resources. It is primarily designed to run on Linux / Unix-like systems. Squid is a high-performance proxy caching server for web clients, supporting FTP, gopher, and HTTP data objects. Unlike traditional caching software, Squid handles all requests in a single, non-blocking, I/O-driven process. Squid keeps metadata and especially hot objects cached in RAM, caches DNS lookups, supports non-blocking DNS lookups, and implements negative caching of failed requests. Squid consists of a main server program (squid), a Domain Name System lookup program (dnsserver), a program for retrieving FTP data (ftpget), and some management and client tools.

Install Squid on CentOS / RHEL 5

Use yum command as follows:
# yum install squid
Output:

Loading "installonlyn" plugin
Setting up Install Process
Setting up repositories
Reading repository metadata in from local files
Parsing package install arguments
Resolving Dependencies
--> Populating transaction set with selected packages. Please wait.
---> Package squid.i386 7:2.6.STABLE6-4.el5 set to be updated
--> Running transaction check
Dependencies Resolved
=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing:
 squid                   i386       7:2.6.STABLE6-4.el5  updates           1.2 M
Transaction Summary
=============================================================================
Install      1 Package(s)
Update       0 Package(s)
Remove       0 Package(s)
Total download size: 1.2 M
Is this ok [y/N]: y
Downloading Packages:
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing: squid                        ######################### [1/1]
Installed: squid.i386 7:2.6.STABLE6-4.el5
Complete!

Squid Basic Configuration

The Squid configuration file is located at /etc/squid/squid.conf. Open the file using a text editor:
# vi /etc/squid/squid.conf
At a minimum, you need to define an ACL (access control list) to work with Squid. The default port is TCP 3128. The following example ACL allows access from the local networks 192.168.1.0/24 and 192.168.2.0/24. Make sure you adapt it to list the internal IP networks from which browsing should be allowed:
acl our_networks src 192.168.1.0/24 192.168.2.0/24
http_access allow our_networks
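
To sanity-check which client addresses a src ACL like the one above would cover, the CIDR match can be reproduced in plain shell arithmetic. This is a sketch for checking your own network list, not something Squid runs; ip_to_int and ip_in_cidr are made-up helper names, and only well-formed IPv4 addresses without leading zeros are handled.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# Runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeeds if address $1 lies inside network $2 (given as a.b.c.d/bits).
ip_in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Netmask with the top $bits bits set (4294967295 is 0xFFFFFFFF).
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Check a sample client address against the two example networks.
for net in 192.168.1.0/24 192.168.2.0/24; do
    if ip_in_cidr 192.168.2.15 "$net"; then
        echo "192.168.2.15 matches $net"
    fi
done
```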

Save and close the file. Enable Squid at boot and start the proxy server:
# chkconfig squid on
# /etc/init.d/squid start

Output:

init_cache_dir /var/spool/squid... Starting squid: .       [  OK  ]

Verify port 3128 is open:
# netstat -tulpn | grep 3128
Output:

tcp        0      0 0.0.0.0:3128                0.0.0.0:*                   LISTEN      20653/(squid)

Open TCP port 3128

Finally, make sure iptables allows access to the Squid proxy server. Open the /etc/sysconfig/iptables file:
# vi /etc/sysconfig/iptables
Append configuration:
-A RH-Firewall-1-INPUT -m state --state NEW,ESTABLISHED,RELATED -m tcp -p tcp --dport 3128 -j ACCEPT
Restart the iptables-based firewall:
# /etc/init.d/iptables restart
Output:

Flushing firewall rules:                                   [  OK  ]
Setting chains to policy ACCEPT: filter                    [  OK  ]
Unloading iptables modules:                                [  OK  ]
Applying iptables firewall rules:                          [  OK  ]
Loading additional iptables modules: ip_conntrack_netbios_n[  OK  ]

Client configuration

Open a web browser > Tools > Internet Options > Network settings, then enter the Squid server IP address and port 3128.
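
Command-line clients can be pointed at the proxy as well. Many tools, curl and wget among them, read the http_proxy environment variable. In the sketch below, use_squid_proxy is a made-up helper name and 192.168.1.1 is a placeholder for your Squid server's address.

```shell
#!/bin/sh
# Export http_proxy for the current shell session.
# $1 = proxy host (placeholder below), $2 = port (defaults to 3128).
use_squid_proxy() {
    http_proxy="http://$1:${2:-3128}/"
    export http_proxy
}

use_squid_proxy 192.168.1.1
echo "$http_proxy"

# A request such as the following would then go through Squid:
#   curl -I http://www.example.com/
```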


You may have noticed that most of my web pages are loading a bit faster. Here is what I did:

a) Moved CSS code to its own file and included the CSS at the top

b) Removed unnecessary (read as fancy web 2.0 stupid stuff) external JavaScript snippets

c) Moved external JavaScript to the bottom of the page/template. For example, the Google Analytics JS code now sits at the bottom of the web page.

d) Turned on Apache gzip/mod_deflate compression

e) Turned on WordPress caching

f) Turned on PHP script caching (I’m using eAccelerator)

g) Tweaked MySQL for optimization: turned on the query cache and adjusted other settings.

h) If possible, switch to lighttpd, or use squid / lighttpd as a caching server in front of good old Apache.

If you have tons of cash to burn (assuming that your web app demands performance):

  • Consider using a CDN (Content Delivery Network) such as Akamai or SAVVIS.
  • Set up server load balancing

However, some external JS snippets, such as Google Adsense, still slow down the loading of a web page. In a few months I may roll out a new template and I will try to fix this issue :)

I'm interested to know about other people's experiences with web page optimization. Feel free to share your tips.

Q. How can I find out Linux Resource utilization using vmstat command? How do I get information about high disk I/O and memory usage?

A. The vmstat command reports information about processes, memory, paging, block IO, traps, and CPU activity. Its real advantage is that the output is concise and easy to read. The output of vmstat helps identify system bottlenecks. Please note that Linux vmstat does not count itself as a running process.

Here is the output of the vmstat command from my enterprise grade system:
$ vmstat -S M
Output:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1963    607   2359    0    0     0     0    0     1 32  0 68  0

Where,

  • The first line lists six categories. The second line lists the fields within each category; these fields carry all the data you need.
  • -S M: the -S option lets you choose the unit: k (1000), K (1024), m (1000000) or M (1048576) bytes. The default is K (1024 bytes). I am using M since this system has over 4 GB of memory. Without the -S option, vmstat uses K as the unit:

$ vmstat
Output:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0 2485120 621952 2415368  0    0     0     0    0     1 32  0 68  0

Field Description For Vm Mode

(a) procs: the process-related fields are:

  • r: The number of processes waiting for run time.
  • b: The number of processes in uninterruptible sleep.

(b) memory: the memory-related fields are:

  • swpd: the amount of virtual memory used.
  • free: the amount of idle memory.
  • buff: the amount of memory used as buffers.
  • cache: the amount of memory used as cache.

(c) swap: the swap-related fields are:

  • si: Amount of memory swapped in from disk (/s).
  • so: Amount of memory swapped to disk (/s).

(d) io: the I/O-related fields are:

  • bi: Blocks received from a block device (blocks/s).
  • bo: Blocks sent to a block device (blocks/s).

(e) system: the system-related fields are:

  • in: The number of interrupts per second, including the clock.
  • cs: The number of context switches per second.

(f) cpu: the CPU-related fields are:

These are percentages of total CPU time.

  • us: Time spent running non-kernel code. (user time, including nice time)
  • sy: Time spent running kernel code. (system time)
  • id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
  • wa: Time spent waiting for IO. Prior to Linux 2.5.41, shown as zero.

As you can see, the first line of output gives averages since the last reboot. Additional reports give information over a sampling period of length delay. To sample data, pass a delay so that vmstat collects data at set intervals. For example, collect data every 2 seconds (or every 2 seconds, 5 times only):
$ vmstat -S M 2
OR
$ vmstat -S M 2 5
Output:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
3  0      0   1756    607   2359    0    0     0     0    0     1 32  0 68  0
3  0      0   1756    607   2359    0    0     0     0 1018    65 38  0 62  0
3  0      0   1756    607   2359    0    0     0     0 1011    64 37  0 63  0
3  0      0   1756    607   2359    0    0     0    20 1018    72 37  0 63  0
3  0      0   1756    607   2359    0    0     0     0 1012    64 37  0 62  0
3  0      0   1756    607   2359    0    0     0     0 1011    65 38  0 63  0
3  0      0   1995    607   2359    0    0     0     0 1012    62 35  2 63  0
3  0      0   1731    607   2359    0    0     0     0 1012    64 34  3 62  0
3  0      0   1731    607   2359    0    0     0     0 1013    72 38  0 62  0
3  0      0   1731    607   2359    0    0     0     0 1013    63 37  0 63  0

This is what most system administrators do to identify system bottlenecks. I hope you find the vmstat data concise and easy to read.
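
When the sampling run is long, averaging the CPU columns can be easier than reading every row. The following is a sketch under the assumption that the column layout matches the output shown above (us, sy, id and wa in fields 13 to 16); vmstat_cpu_avg is a made-up helper name. It skips the two header lines and the first sample, since that one reports averages since boot.

```shell
#!/bin/sh
# Average the us/sy/id/wa columns of vmstat output read from stdin.
# Lines 1-2 are headers and line 3 is the since-boot average, so
# only lines after the third are counted.
vmstat_cpu_avg() {
    awk 'NR > 3 && NF >= 16 {
             us += $13; sy += $14; id += $15; wa += $16; n++
         }
         END {
             if (n) printf "samples=%d us=%.1f sy=%.1f id=%.1f wa=%.1f\n",
                           n, us / n, sy / n, id / n, wa / n
         }'
}

# Example:
#   vmstat -S M 2 5 | vmstat_cpu_avg
```

A consistently high wa average points at disk I/O, while high us or sy with a long run queue (r) points at CPU pressure.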
