Compressing large Apache log files on a shared hosting server

Posted in Categories News, last updated December 31, 2005

If your site is quite popular, your Apache log file can quickly become large, anywhere between 200 MB and 1 GB per month. Downloading such a large log file takes not just a lot of time but a lot of bandwidth too. We have many shared hosting customers on Apache web servers. If the log files are small, you can download them easily using an FTP program.

However, when a log file grows larger, downloading it becomes a big problem. For security reasons, shared hosting providers generally do not offer ssh access, and system(), exec(), and similar PHP functions are disabled.

So how does a customer get the raw Apache log file? With the help of a PHP script, you can compress the file first and then download it. PHP provides a function called gzwrite(), which is a binary-safe gz-file writer.

Syntax:

int gzwrite ( resource zp, string string [, int length] )

Where,

  • zp : the gz-file pointer returned by gzopen()
  • string : the string to write
  • length : the number of uncompressed bytes to write

PHP script:

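<?php
// How to use:
// http://mydomain.com/private/gzip.php?i=access.log&o=access.log.gz

// Compress file $in into gzip file $out; $param is the compression level (1-9).
function gzip ($in, $out, $param="1")
{
// The input file must exist and be readable.
if (!file_exists ($in) || !is_readable ($in))
 return false;
// The output must be a writable file, or a new file in a writable directory.
if ((!file_exists ($out) && !is_writable (dirname ($out))) ||
(file_exists ($out) && !is_writable ($out)))
 return false;

if (!$in_file = fopen ($in, "rb"))
 return false;
if (!$out_file = gzopen ($out, "wb".$param)) {
 fclose ($in_file);
 return false;
}

// Copy the file in 4096-byte chunks; fread() is binary safe.
while (!feof ($in_file)) {
 $buffer = fread ($in_file, 4096);
 if ($buffer === false)
  break;
 gzwrite ($out_file, $buffer);
}

fclose ($in_file);
gzclose ($out_file);

return true;
}

// Read the input and output file names from the URL.
$in = isset ($_GET['i']) ? $_GET['i'] : "";
$out = isset ($_GET['o']) ? $_GET['o'] : "";
echo "<html><head><title>Php Gzip program</title></head><body>\n";
echo "Status:";
if ($in == "" || $out == "" || gzip ($in, $out, "9") == false) {
echo "<h1>Failed</h1>\n";
}
else {
echo "<h1>Done</h1>\n";
}
echo "</body></html>";
?>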

Upload the script to a password-protected directory on your web server. Then run the PHP script by typing its URL:

http://mydomain.com/private/gzip.php?i=access.log&o=access.log.gz

Where,

  • i=access.log : Raw Apache log file (the input file)
  • o=access.log.gz : Output file, i.e. the compressed file

Result

  • Raw access.log file size: 267730KB before running gzip.php
  • Compressed access.log.gz file size: 16301KB after running gzip.php
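
To put the two sizes above in perspective, the saving can be worked out with a few lines of shell arithmetic. This is only an illustrative sketch using the figures from the list; the variable names are made up for the example, and because the shell does integer arithmetic the results are rounded down.

```shell
#!/bin/sh
# Sizes in KB, taken from the result list above.
raw_kb=267730
gz_kb=16301

# Integer arithmetic only, so every result is rounded down.
percent_of_original=$(( gz_kb * 100 / raw_kb ))
saved_kb=$(( raw_kb - gz_kb ))
ratio=$(( raw_kb / gz_kb ))

echo "Compressed file is about ${percent_of_original}% of the original size"
echo "Saved ${saved_kb}KB, a ratio of roughly ${ratio}:1"
```

In other words, the download shrinks to a small fraction of the raw log, which is typical for repetitive text such as access logs.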

Make sure the private directory is password protected by Apache; otherwise anyone can use your script. Please note that I am not a PHP expert or a regular programmer. If you have a better solution or script, please feel free to post it.

It is now New Year's Eve. The year 2005 was a good one for all of us at nixCraft. We are proud of our small community and the shared learning that we do here. Thanks to everyone who has helped increase our knowledge of Linux and open source by leaving a comment or a link.

Happy New Year [Naye Varsha Ki Shubhkamanyen (Hindi)]
We do appreciate you and hope that you will have a great New Year celebration.

FreeBSD and Linux changing Desktop Environments/login manager

Posted in Categories News, last updated December 30, 2005

Asked by Christopher

Q. I have both FreeBSD and Red Hat Linux desktop computers. I would like to change from KDE to GNOME or vice versa. Under FreeBSD, X Windows is working but without a KDE or GNOME desktop. I am using an Intel Celeron computer; can I install and run KDE?

A.

Changing Desktop Environments under Red Hat
Use the utility called switchdesk. Open a terminal and type:

# switchdesk

Select the desktop you would like to use and click OK.
OR
From System Settings, select the Switch Desktop tool
OR
Modify the /etc/sysconfig/desktop file:

# vi /etc/sysconfig/desktop

Set or modify the DISPLAYMANAGER variable:

  • If you want the GNOME desktop, set DISPLAYMANAGER="GNOME"
  • If you want the KDE desktop, set DISPLAYMANAGER="KDE"

Note: Debian users need to modify the /etc/X11/default-display-manager file and put the full path of the display manager in it. If you are using GNOME, use the path /usr/bin/gdm; for KDE, use the path /usr/bin/kdm.

FreeBSD Desktop system
Since you are using a low-end Celeron, I recommend the Xfce desktop (you can use GNOME too, but it will be a little slow). Xfce is a desktop environment based on the GTK+ toolkit used by GNOME, but it is much more lightweight and simple.

To install Xfce, type the following command as the root user:

# pkg_add -r xfce4

Open the .xinitrc configuration file in your home directory:

# cd
# vi .xinitrc

Append the following line to it:

/usr/X11R6/bin/startxfce4

Save the file and exit to the shell prompt. The line above tells the X server to launch Xfce every time X is started.

GTK+ fundamentals, Part 1: Why use GTK+?

Posted in Categories Book Review, News, last updated July 22, 2006

IBM developerWorks has published an article on GTK+ fundamentals. It is a very nice, easy-to-understand tutorial on GTK+. This article, the first in a three-part series, introduces you to the world of GTK+. It explains what GTK+ is, why you should consider using it, and the benefits it provides. Together with the rest of the series, this installment provides enough introductory information that, if you decide to use GTK+ in your own projects, you'll know where to look for further materials. GTK+ is a graphical user interface (GUI) toolkit. That is, it's a library (or, in fact, a collection of several closely related libraries) that allows you to create GUI-based applications. Think of GTK+ as a toolbox in which you can find many ready-made building blocks for creating GUIs. The full article is available on IBM developerWorks.

Iptables MAC Address Filtering

Posted in Categories Iptables, Linux, last updated December 15, 2010

LAN or wireless access can be filtered by using the MAC addresses of the devices transmitting within your network. A MAC (media access control) address is a unique address assigned to almost all networking hardware such as Ethernet cards, routers, mobile phones, wireless cards and so on (see the MAC address article on Wikipedia for more information). This quick tutorial explains how to block or deny access by MAC address using iptables, the Linux administration tool for IPv4 packet filtering and NAT.
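
A rule of the kind described above could look like the following sketch. It relies on the iptables mac match (-m mac --mac-source); the MAC address and the block_mac_rule helper name are made up for illustration. The function only prints the command, so you can review it before running it yourself as root.

```shell
#!/bin/sh
# Hypothetical helper: print an iptables rule that drops all incoming
# traffic from a single MAC address. Nothing is executed here.
block_mac_rule() {
    mac="$1"
    echo "iptables -A INPUT -m mac --mac-source ${mac} -j DROP"
}

# Example MAC address (an assumption; replace it with the real one).
block_mac_rule "00:0F:EA:91:04:08"
```

Keep in mind that a MAC address is only visible for devices on the same network segment, so this kind of rule is useful on a LAN or wireless gateway rather than for traffic arriving from the Internet.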

Book review: Linux Troubleshooting (PDF version)

Posted in Categories Book Review, last updated July 22, 2006

Slashdot has published a review of a Linux troubleshooting book from the Bruce Perens Open Source Series. The book is reviewed by Mary Norbury-Glaser. The Bruce Perens Open Source Series of books published by Prentice Hall PTR is a strong collection of nearly 20 volumes focusing on Linux and open source technology. Edited by Linux guru and former Debian GNU/Linux Project Leader Bruce Perens, the books are aimed at developers, sysadmins and power users. Several months after the release of a new print volume, a free electronic version is made available on Prentice Hall PTR's web site. The series includes some excellent editions, including Official Samba-3 HOWTO and Reference Guide (2nd ed.), Linux Quick Fix Notebook and PHP 5 Power Programming. The newest book by Mark Wilding and Dan Behman, Self-Service Linux: Determining Problems and Finding Solutions, is another well-written and worthy companion to this series. The entire review is available online at Slashdot.

Summary

  • Book title: Linux Troubleshooting
  • Author: Mark Wilding and Dan Behman
  • Author web/blog: —
  • Publisher: Prentice Hall, PTR
  • Pub Date: November 2005
  • ISBN: 013147751X
  • Pages: 456
  • Level of experience needed: Linux newbie (noobs)
  • Who will find useful: Linux System administrators
  • Additional goodies included (such as CDROM) : —
  • Sample chapters: — (entire book will be available in ebook (PDF) format)
  • Our rating: none (Slashdot rating 8)

Tentakel to execute commands on multiple Linux or UNIX Servers

Posted in Categories Automation, CentOS, Debian Linux, Download of the day, FreeBSD, Gentoo Linux, Howto, Linux, Monitoring, Networking, OpenBSD, RedHat/Fedora Linux, Sys admin, Tips, Tuning, Ubuntu Linux, UNIX, last updated August 17, 2007

This is Part II in a series on Execute Commands on Multiple Linux or UNIX Servers Simultaneously. The full series is Part I, Part II, and Part III. Often you want to execute a command not on one server but on several servers. For example, you may want to:

  • Find out the kernel version
  • Find out the Apache web server version
  • Update static HTML or image files on all web servers via rsync
  • Find out user information, server information, memory usage, etc.
  • Check security updates and patches

tentakel

I have already covered how to execute commands on multiple Linux or UNIX servers via a shell script. The disadvantage of the script is that the commands do not run in parallel on all servers. However, several tools exist to automate this procedure in parallel. One of them is tentakel, a tool for distributed command execution. It is a program for executing the same command on many hosts in parallel using ssh (it supports other methods too). The main advantage is that you can create several sets of servers according to your requirements, for example a web server group, a mail server group, a home servers group, and so on. The command is executed in parallel on all servers in the selected group, which saves time. By default, every result is printed to stdout (the screen). The output format can be defined for each group.

How does it work?
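
To make the serial-versus-parallel point concrete, here is a rough sketch of running one command on several hosts in parallel from plain shell. It is not how tentakel is implemented; it only illustrates the idea. The RUNNER variable and the run_parallel function are names invented for this sketch. By default RUNNER is echo, so the script just prints what it would run on each host; set RUNNER=ssh to actually connect.

```shell
#!/bin/sh
# RUNNER defaults to echo (dry run). Set RUNNER=ssh for real use.
RUNNER="${RUNNER:-echo}"

# Usage: run_parallel COMMAND HOST...
run_parallel() {
    cmd="$1"
    shift
    for host in "$@"; do
        # Start each job in the background so hosts do not wait on each other.
        $RUNNER "$host" "$cmd" &
    done
    # Block until every background job has finished.
    wait
}

run_parallel "uptime" 192.168.1.12 192.168.1.15
```

Because the jobs run at the same time, the order of the output lines is not guaranteed. Labelled, per-group output formatting is one of the things a dedicated tool such as tentakel adds on top of this.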

Consider the following sample setup:

admin workstation   Group                  Hosts
|----------------> www-servers        host1, host2,host3
|----------------> homeservers        192.168.1.12,192.168.1.15
IP: 192.168.1.1

You need to install tentakel on the admin workstation (192.168.1.1). We have two server groups: the first is www-servers, a group of web servers with three hosts, and the second is homeservers, with two hosts.

Each remote host in a group needs a running sshd server. You also need to set up ssh-key based login between the admin workstation and all group hosts to take full advantage of tentakel's distributed command execution.

System requirement

Tentakel requires a working Python installation. It is known to work with Python 2.3. Python 2.2 and Python 2.1 are not supported. If you are using an older version of Python, please upgrade it.

Configuration

Let us see how to install and configure tentakel.

Step # 1: Download tentakel

Visit the SourceForge project page to download tentakel, or download RPM files from the tentakel home page.

Step # 2: Install tentakel

Untar the source code by entering:

# tar -zxvf tentakel-2.2.tgz

You should be the root user for the install step. Change into the extracted source directory, then type:

# make
# make install

Step # 3: Configure groups

For demonstration purposes, we will use the following setup:

   admin pc                    Group           hosts
Running Debian Linux       homeservers     192.168.1.12 192.168.1.15
User: jadmin

Copy the sample tentakel configuration file tentakel.conf.example to the /etc directory:

# cp tentakel.conf.example /etc/tentakel.conf

Modify /etc/tentakel.conf according to the above setup. At the end, your file should look as follows:

# first section: global parameters
set ssh_path="/usr/bin/ssh"
set method="ssh"  # ssh method
set user="jadmin"   # ssh username for remote servers
#set format="%d %o\n" # output format see man page
#set maxparallel="3"  # run at most 3 commands in parallel

# our home servers with two hosts
group homeservers ()
+192.168.1.12 +192.168.1.15

# localhost
group local ()
+127.0.0.1

Save the file and exit to the shell prompt. Where,
group homeservers () : Group name
+192.168.1.12 +192.168.1.15 : Host inclusion. Each name prefixed with + is included in the group and can be an IP address or a hostname.

Step # 4: Configure SSH passwordless login

Configure ssh-key based login for the jadmin user to avoid password prompts between the admin workstation and the group servers.
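
The key setup for this step might look like the sketch below. It assumes the OpenSSH tools ssh-keygen and ssh-copy-id are installed on the admin workstation, which may not be true on every system. The user and host values come from the demonstration setup above, while the key_setup_commands function name is invented for this example. The script is a dry run that only prints the commands, so you can run them one by one as jadmin.

```shell
#!/bin/sh
# Values from the demonstration setup above.
user="jadmin"
hosts="192.168.1.12 192.168.1.15"

# Dry run: print the commands instead of executing them.
key_setup_commands() {
    # Generate a key pair once on the admin workstation.
    echo "ssh-keygen -t rsa"
    # Copy the public key to every host in the group.
    for h in $hosts; do
        echo "ssh-copy-id ${user}@${h}"
    done
}

key_setup_commands
```

After the keys are copied, a command such as ssh jadmin@192.168.1.12 uptime should run without asking for the account password.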

Step # 5: Test tentakel

Log in as jadmin and type the following command:

$ tentakel -g homeservers

interactive mode
tentakel(homeservers)>

Where,
-g groupname: Select the group groupname. The group must be defined in the configuration file (here it is homeservers). If no group is specified, tentakel implicitly assumes the default group.

At the tentakel(homeservers)> prompt, run the uname and uptime commands as follows:

exec "uname -mrs"
exec "uptime"

A few more examples
Find out who is logged in on all homeservers and what they are doing (type at the shell prompt):

$ tentakel -g homeservers "w"

Execute the uptime command on all hosts defined in the group homeservers:

$ tentakel -g homeservers uptime

As you can see, tentakel is a very powerful and easy-to-use tool. It also supports the concept of plugins. A plugin is a single Python module and must appear in the $HOME/.tentakel/plugins/ directory. The main advantage of plugins is customization according to your needs. For example, an entire web server or MySQL server farm can be controlled according to your requirements.
However, tentakel is not the only utility for this kind of work. There are programs that do similar things or are related to tentakel in some way; the complete list can be found online. tentakel should work on almost all variants of UNIX/BSD and Linux distributions.

Time is a precious commodity, especially if you're a system administrator. No other job pulls people in so many directions at once. Users interrupt you constantly with requests, preventing you from getting anything done and putting lots of pressure on you. What do you do? The answer is time management. Read our book review of Time Management for System Administrators. Continue reading with Part III of this series, Execute commands on multiple hosts using expect tool.

Reference:

  • Read tentakel man page for tentakel configuration options
  • tentakel home page

Update: Damon confirmed that it works on Windows too with a little modification.