Hardware

A redundant array of inexpensive disks (RAID) allows high levels of storage reliability. RAID is not a backup solution. It is used to improve disk I/O (performance) and reliability of your server or workstation. A RAID can be deployed using both software and hardware. But the real question is whether you should use a hardware RAID solution or a software RAID solution.

{ 33 comments }

I’ve Windows Vista installed as a guest under Ubuntu Linux using VMWARE Workstation 6.0. This is done for testing purpose and browsing a few site that only works with Internet Explorer. Since I only use it for testing I made 16GB for Vista and 5GB for CentOS and 5GB in size for FreeBSD guest operating systems. However, after some time I realized I’m running out of disk space under both CentOS and Vista. Adding a second hard drive under CentOS solved my problem as LVM was already in use. Unfortunately, I needed to double 32GB space without creating a new D: drive under Windows Vista. Here is a simple procedure to increase your Virtual machine’s disk capacity by resizing vmware vmdk file.

{ 35 comments }

From my mailbag:

How do I find out if a given PCI hardware is supported of by the current CentOS / Debian / RHEL / Fedora Linux kernel?

You can easily find out find out if a given piece of PCI hardware such as RAID, network, sound, graphics card is supported or not by the current Linux kernel using the following utilities under any Linux distributions.

{ 10 comments }

The Blue Screen of Death (BSoD) is used for the error screen displayed by Microsoft Windows, after encountering a critical system. Linux / UNIX like operating system may get a kernel panic. It is just like BSoD. The BSoD and a kernel panic generated using a Machine Check Exception (MCE). MCE is nothing but feature of AMD / Intel 64 bit systems which is used to detect an unrecoverable hardware problem.

Program such mcelog decodes machine check events (hardware errors) on x86-64 machines running a 64-bit Linux kernel. It should be run regularly as a cron job on any x86-64 Linux system. This is useful for predicting server hardware failure before actual server crash.

{ 9 comments }

Unplanned downtime may be the result of a software bug, human error, equipment failure, power failure, and much more. Last week was a bad one. We faced three different downtime:

  • First, there was a fiber cut for one of our data center resulting into routing anomalies due BGP reroute. Traffic was rerouted but updating those BGP tables took some time to update.
  • Someone from networking team failed to follow proper maintenance procedures for network device resulted into 55 minutes downtime.
  • One of our SAN hardware failure – Many internal UNIX / Linux web applications use SAN to store data including file server, tracking apps, R&D apps, IT help desk, LAN and WAN servers failed. This one lasted for 12 hrs. It was stared around midnight. The vendor replaced entire SAN hardware. Now we have dual stacked SAN as a backup device for internal usage.

Note: There is a poll embedded within this post, please visit the site to participate in this post’s poll.

{ 6 comments }

The round-robin database tool aims to handle time-series data like network bandwidth, temperatures, CPU load etc. The data gets stored in round-robin database so that system storage footprint remains constant over time. Lighttpd comes with mod_rrdtool to monitor the server load and other details. This is useful for debugging and tuning lighttpd / fastcgi server performance.

{ 12 comments }

Applications that perform a lot of memory accesses (several GBs) may obtain performance improvements by using large pages due to reduced Translation Lookaside Buffer (TLB) misses. HugeTLBfs is memory management feature offered in Linux kernel, which is valuable for applications that use a large virtual address space. It is especially useful for database applications such as MySQL, Oracle and others. Other server software(s) that uses the prefork or similar (e.g. Apache web server) model will also benefit.

The CPU’s Translation Lookaside Buffer (TLB) is a small cache used for storing virtual-to-physical mapping information. By using the TLB, a translation can be performed without referencing the in-memory page table entry that maps the virtual address. However, to keep translations as fast as possible, the TLB is usually small. It is not uncommon for large memory applications to exceed the mapping capacity of the TLB. Users can use the huge page support in Linux kernel by either using the mmap system call or standard SYSv shared memory system calls (shmget, shmat).

{ 11 comments }