Today I upgraded a total of 8 servers from 4GiB to 8GiB of RAM by inserting additional memory modules to improve system performance. We started each server and checked the memory count at the console. All servers booted normally after the upgrade, and services such as SMTP, NFS, CIFS, and HTTP started as expected. Shortly afterwards, I got a call from the help desk about slow performance on the POP3 server.
The POP3 server node was giving timeout errors, and download speed was very slow for all MUAs. I tried to ssh into the box and it bounced back with a "22: Connection refused" error. I wasn't ready to pull the server from the rack, so I fired up the KVM over IP Java client. Eventually, I found that the server was reporting only 2GiB of RAM instead of the doubled total. This was bad. The worst problem was that the POP3 server node did not fail over to the backup node. Our LVS (Linux Virtual Server) based cluster failed to detect the problem, so I made a few changes to pick up the working POP3 node.
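The post does not say how the LVS health check was set up or which changes were made, so the following is only a rough sketch of the kind of check that could have caught a slow or unresponsive POP3 node. It assumes plain POP3 on port 110 and treats a node as healthy only if it sends a "+OK" greeting within a short timeout. The function names, the host name, and the timeout value are hypothetical and not taken from the original setup.

```python
import socket


def is_pop3_greeting_ok(greeting: bytes) -> bool:
    """A healthy POP3 server greets new connections with a line starting '+OK'."""
    return greeting.startswith(b"+OK")


def check_pop3(host: str, port: int = 110, timeout: float = 5.0) -> bool:
    """Return True only if the server sends a '+OK' greeting within `timeout` seconds.

    A refused connection, a timeout, or any other socket error counts as a failure,
    so a node that is up but too slow to answer is reported as unhealthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            greeting = sock.recv(512)
    except OSError:  # includes timeouts and "Connection refused"
        return False
    return is_pop3_greeting_ok(greeting)


# Hypothetical usage (host name is an assumption):
# healthy = check_pop3("pop3-node1.example.com", timeout=3.0)
```

A check like this only looks at the greeting; a stricter version might also log in with a test account, but that goes beyond what the post describes.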
My investigation revealed that the memory problem occurred because the new RAM was incompatible with the server motherboard. I had verified the available RAM on the first five nodes and then went back to the office for something else. Another person hooked the rest back up and told me that he had verified the available RAM. Whenever I perform a memory upgrade, I always verify the amount of memory reported by the system when it is rebooted; I never assume the memory is there.
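One way to make that verification hard to skip is to script it. The sketch below is an illustration, not the procedure used in this upgrade: it reads `MemTotal` from `/proc/meminfo` on Linux and compares it with the amount you expect after the upgrade. The function names and the 10% tolerance are assumptions; the tolerance is there because the kernel reserves some memory, so `MemTotal` is always a little below the installed amount.

```python
def parse_mem_total_kib(meminfo_text: str) -> int:
    """Extract the MemTotal value (in KiB) from the contents of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:        8167848 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found in meminfo text")


def memory_looks_right(mem_total_kib: int, expected_gib: float,
                       tolerance: float = 0.10) -> bool:
    """True if the reported memory is within `tolerance` below the expected size."""
    expected_kib = expected_gib * 1024 * 1024
    return mem_total_kib >= expected_kib * (1 - tolerance)


def check_local_memory(expected_gib: float, path: str = "/proc/meminfo") -> bool:
    """Read the local meminfo file and compare it against the expected size."""
    with open(path) as f:
        return memory_looks_right(parse_mem_total_kib(f.read()), expected_gib)


# Hypothetical usage after an upgrade to 8GiB:
# if not check_local_memory(8):
#     print("WARNING: system reports less memory than expected")
```

Run on each node right after the first boot, a check like this would have flagged the node that reported only 2GiB.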
Another lesson learned: never ever trust a third person. If he had verified the available RAM immediately after installing the new modules, we would have noticed the problem right away instead of waiting for users to complain. Another reason not to perform upgrades on Fridays.
{ 8 comments }
Thanks for the information
“Another reason not to perform upgrades on Fridays” HAHAHAHAHA,
I love overtime bonus on my payment, but you not :p
as I am often reminded when I delegate to another… “trust, but always verify”
.h
The 11th Commandment: Thou shalt make no hardware changes, neither shalt thou release software updates on the 5th day. :-)
This mailing list seemed pretty interesting at first glance but has time after time proven itself rather low-tech. When upgrading servers it could be useful to make sure you stick the right kind of memory in the machines, how insightful.
We always run memtest (until test #4) on all our new servers; you'd be amazed how many memory modules actually fail. From my experience, maybe 1 in 1000, including single-bit errors and non-"critical" errors.
I’m sure you could run many of these systems without noticing this problem, but better safe than sorry! :D
Always buy the memory from the same vendor you buy the server from, or at least try to.
On Red Hat, I've seen several issues where PAE kernels were not installed. Someone called their data center, had the RAM upgraded, but failed to install the appropriate kernel. At one very large server provider, I saw them go through 4 sticks of RAM before realizing it was the OS.