Why does the ext3 filesystem on my Linux server go read-only?

by Vivek Gite on August 28, 2007 · 25 comments

From my mailbag:

We have 5 colocated Dell servers running the CentOS 4.x and 5.x server operating systems. Sometimes my file system (ext3) goes read-only. I’d like to know what could be causing such a problem?

My guess:
a) Hardware problem / hard disk problem; check the hard disk for errors.

b) High disk I/O: under heavy load, a busy I/O retry error can cause a low-level disk call to be marked as failed. This forces ext3 into read-only mode.

c) High disk I/O on the SAN.

d) The SAN is not configured properly for path failover.

In all of these cases, ext3 goes read-only to protect the filesystem from further damage. If you are using VMware, check the official website for SCSI patches or workarounds for VMware problems.
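
To confirm which filesystems have actually gone read-only, the mount table can be inspected. The following is a minimal sketch, not an official tool: the function name list_ro_mounts is made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, type, options).

```shell
#!/bin/sh
# List filesystems that are currently mounted read-only.
# Reads a mount table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# list_ro_mounts is a hypothetical helper name, not a standard command.
list_ro_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '{
        # Split the comma-separated option list and look for an exact "ro",
        # so that options such as errors=remount-ro do not match.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2 " (" $3 ") on " $1; break }
    }' "$mounts_file"
}
```

Calling list_ro_mounts with no argument reads /proc/mounts; passing a file name lets you check a saved copy of the mount table instead.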

So what else could cause a file system on Linux to go read-only?

Apart from the generic problems above, other errors can also trigger a read-only remount on Linux. I hope our readers and seasoned Linux admins can help answer this question. Please share your experiences and advice in the comments.
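
One way to narrow down the cause is to look in the kernel log for the messages that usually accompany a forced read-only remount. This is a rough sketch: find_ro_clues is a hypothetical helper name, and the patterns are only the common ext3 and block-layer messages; the exact wording can differ between kernel versions.

```shell
#!/bin/sh
# Filter kernel log text on stdin for lines that commonly appear
# when ext3 is forced read-only. find_ro_clues is a made-up name;
# the pattern list is an assumption and is not exhaustive.
find_ro_clues() {
    grep -i -E 'EXT3-fs error|Remounting filesystem read-only|I/O error|Aborting journal'
}
```

Typical use would be dmesg | find_ro_clues, or feeding it a saved log such as /var/log/messages.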

{ 25 comments… read them below or add one }

1 Chmouel Boudjnah August 29, 2007

When a RAID array goes into degraded mode, the fs may detect that and the server can end up read-only.

2 Abe Cheng March 7, 2008

I have similar issues with RHEL 4ES U3 or U4. Did you get this issue resolved yet? Thanks.

3 Greg March 20, 2008

This is a bug in the Linux kernel. It is apparently fixed in kernels as of 2.6.22. Also, I *think* the fix has been back-ported to some earlier kernels. For example, CentOS 5.1 (and RHEL 5.1 for that matter) supposedly have it fixed as well. I was commonly seeing it in a CentOS 5 virtual machine running on VMware. Here is their Knowledge Base article for more info.

Although that article is targeted at virtual machines, the kernel bug apparently affected Linux running on real hardware as well.
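
A quick way to see where a machine stands relative to 2.6.22 is to compare the running kernel version. This is only a sketch: version_ge and kernel_base_version are made-up helper names, it relies on GNU sort -V, and because distributions back-port fixes, an older version number does not prove the bug is present.

```shell
#!/bin/sh
# Return success if version $1 is >= version $2 (needs GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the distribution suffix, e.g. "2.6.18-53.el5" -> "2.6.18".
kernel_base_version() {
    printf '%s\n' "${1%%-*}"
}

running="$(kernel_base_version "$(uname -r)")"
if version_ge "$running" "2.6.22"; then
    echo "kernel $running is at or above 2.6.22"
else
    echo "kernel $running is older than 2.6.22; check whether your distribution back-ported the fix"
fi
```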

4 Santhosh October 1, 2010

Hi Greg,
You said, “This is a bug in the Linux kernel and it is fixed in kernels as of 2.6.22”. I would like to know what exactly the problem is, i.e. which scenario would cause the file system to become read-only.
We are currently using kernel version 2.6.10 and we are facing the same problem. I would like to know exactly what the fix in 2.6.22 was (which files were modified), so that I can go through those files and incorporate the same changes in our kernel.

5 Erik July 14, 2008

OK, so I am having a similar problem. CentOS 4.2. Don’t ask but the following commands were executed:
chown -R 0 etc opt var
chgrp -R 0 etc opt var
chmod -R 0500 opt
chmod -R 0600 etc var

Several things stopped working… after the server was rebooted, I can log in still: root@(none) and I can even see all the etc, opt, and var data. However, when I try:
chmod -R 0755 etc

I get all sorts of changing file blah; Read-only file system. I cannot alter the permissions. Is there a way to fix this?

6 Erik July 14, 2008

I ran:
mount -o remount /

That fixed it – no longer in read-only mode. Now I can go correct the damage done.
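
A slightly more explicit variant of the same idea is to check the mount state first and ask for read-write by name. This is a sketch only: is_mounted_ro, remount_rw and the DRY_RUN variable are invented for illustration. If the kernel forced the filesystem read-only because of an error, a remount does not remove the underlying cause.

```shell
#!/bin/sh
# Return success if mount point $1 is currently mounted read-only.
# $2 optionally names a mount table to read (default: /proc/mounts).
# When a mount point is listed more than once, the last entry wins.
is_mounted_ro() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }
        END { exit ((("," opts ",") ~ /,ro,/) ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Remount read-write only when needed (must be run as root).
# Set DRY_RUN=1 to print the mount command instead of running it.
remount_rw() {
    if is_mounted_ro "$1" "$2"; then
        ${DRY_RUN:+echo} mount -o remount,rw "$1"
    else
        echo "$1 is not mounted read-only"
    fi
}
```

For example, DRY_RUN=1 remount_rw / shows what would be run without touching the system.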

7 Travis July 2, 2011

Dude….that mount -o remount / saved my butt….

Truly….you are the man.

Where did you learn that? Really….

8 Rajagopal January 15, 2009

This saved my day. I spent the last four hours trying to figure out why I always ended up in a readonly mode when boot failed. Thanks for sharing the info.

9 Fahdi February 6, 2009

@Erik
Dude, you are great; you saved me from reinstalling the OS. Great, I fixed my OS.

Thanks a lot.

Regards,

10 joseph September 27, 2009

Going read-only is much better than damaging the filesystem. What you need to do then is back up the partition and follow that with an fsck.
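
The backup-then-check order could look roughly like the sketch below. It is an illustration under assumptions: backup_then_check is a made-up name, the device and image paths are whatever you pass in, and the filesystem should be unmounted (or mounted read-only) while it is imaged and checked.

```shell
#!/bin/sh
# Image a partition, then run a read-only filesystem check on it.
# backup_then_check is a hypothetical helper; run it as root.
#   $1 = block device to image (for example a partition)
#   $2 = path of the image file to write (needs enough free space elsewhere)
backup_then_check() {
    dev="$1"
    img="$2"
    [ -b "$dev" ] || { echo "not a block device: $dev" >&2; return 1; }
    # conv=noerror,sync keeps going past unreadable blocks and pads them with zeros.
    dd if="$dev" of="$img" bs=64k conv=noerror,sync || return 1
    # -f forces a check even if the filesystem is marked clean;
    # -n opens it read-only and answers "no" to every repair prompt.
    e2fsck -f -n "$dev"
}
```

Once the image is safely stored, the check can be repeated without -n to let e2fsck actually repair what it finds.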

11 Sam Jas November 26, 2009

Hi All,
I am frequently getting a read-only file system error on my server.
We are using Postgres with a grid database. The database is very large.
CentOS 5.3 64 bit Areca high point rocket raid 3520 8 port
32 GB RAM
Assembled hardware.
We process millions of rows daily and load them into the database. We have noticed that when we create a new database, it works fine for up to 20 or 25 days. After that we get errors like “read only file system” and the data is corrupted. We therefore run fsck to remove bad blocks from the disk. However, even after running fsck we get the same error.

I would appreciate it if somebody could help me get rid of this issue.

12 Hari February 13, 2010

Please give me step by step instructions…

Help me dudes……………….

13 Touhid April 2, 2010

I corrected the above problem with fsck / e2fsck.
Boot with “linux rescue” and repair the sectors.

14 sathishkumar November 25, 2010

After an improper shutdown or a hang, sectors may become corrupted. When the system later uses a corrupted sector, the machine goes read-only.

15 Ram Mohan January 17, 2011

Edit fstab and try ext3 with the mount options errors=continue,barrier=0 for /,
then reboot and see…
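
Before changing this, it may help to see what error behavior the filesystem is currently set to. Note that errors=continue tells ext3 to keep going after it detects an error instead of remounting read-only, so it trades the protection described in the article for availability. The sketch below assumes the "Errors behavior:" line printed by tune2fs -l; errors_behavior is a made-up helper name and /dev/sda1 in the usage note is a placeholder.

```shell
#!/bin/sh
# Extract the "Errors behavior" value from `tune2fs -l` output on stdin.
# errors_behavior is a hypothetical helper name.
errors_behavior() {
    sed -n 's/^Errors behavior:[[:space:]]*//p'
}
```

As root, tune2fs -l /dev/sda1 | errors_behavior prints the default stored in the superblock; an errors= option in fstab overrides that default at mount time.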

16 Ggunzelman April 19, 2011

I cannot figure this out. I am relatively new to CentOS. My AsteriskNow system came with CentOS and after a few days it went read-only. I fixed it the first time with mount -o remount / only to have it happen again 2 days later. Now that doesn’t work. I have been down for two days. Linux Rescue??? How exactly? I tried to do this with a LiveCD and it complains about ext4 missing. I am so frustrated I want to give up, but it is the phones and I have a year-long SIP contract. Any help would be appreciated.
CentOS, kernel 2.6.18
AsteriskNow 1.8

Hardware: Intel ITX 1.2GHz Celeron, 1GB memory, 4GB SSD drive, TDM800P (4 FXS, 4 FXO)

17 Ggunzelman April 19, 2011

So just on a whim and for no good reason (I figured the system was hosed by now) I rebooted and did another mount -o remount /. I still could not log in via ssh. I noticed several daemons failing, including auditd. I still could not log in to the Asterisk web manager interface. I couldn’t even run the Asterisk CLI. So I ran yum update kernel followed by yum update. It found a kernel update and installed it. It found 209 other updates and installed them. Mind you, when I set this up last week I did all the updates it could find. So would I be correct in assuming that these updates were the result of data corruption? Because now another reboot produces a good working system.

18 Emil@start a blog May 25, 2011

Thanks for the mount -o remount / … I will have to try that. I rebooted a system that had the issue. I’ll have to remember to try this when it happens again. Still don’t know the root cause yet.

19 ARAO December 6, 2011

Really Really it was of great use (mount -o remount /) ..Thanks :-)

20 coyote June 13, 2011

hi thanks man, this saved my day…… may the force be with u, Erik dude! \\//,

21 Sachin June 24, 2011

We have a virtual server running RHEL 3. There are four disks: sda, sdb, sdc and sdd.
Recently the server hung twice, and the console went blank.
We had to hard-reboot the machine both times. When the server was back up and running, we found in the system logs that 3 disks were fine, but the 4th disk, sdd, had gone into read-only mode and needed recovery to get back into write mode. We recorded the logs, presented them to the client and suggested that they run fsck.

But some time later the recovery completed and the partition was enabled in rw mode.

Now the strange thing is the timestamp on the message saying that the disk went into read-only mode.
So my question is: does the system do recovery in a log buffer to reduce I/O contention? And when it takes the partition for recovery, does it record that time, and then, after recovery completes, push the message to disk with the same timestamp?

22 Enzo August 22, 2011

mount -o remount / did not work for me. All I get is this:

[root@starlight ~]# mount -o remount /
mount: block device /dev/VolGroup00/LogVol00 is write-protected, mounting read-only
[root@starlight ~]# touch shit
touch: cannot touch `shit’: Read-only file system
[root@starlight ~]#
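
When mount reports the block device itself as write-protected, one thing worth checking is whether the kernel has the device flagged read-only; blockdev --getro prints 1 for read-only and 0 for read-write. The helper below is a sketch that only turns that flag into words: report_ro_flag is a made-up name, and the device path in the usage note is taken from the output above.

```shell
#!/bin/sh
# Translate the flag printed by `blockdev --getro DEVICE` into words.
# report_ro_flag is a hypothetical helper name.
report_ro_flag() {
    case "$1" in
        1) echo "read-only" ;;
        0) echo "read-write" ;;
        *) echo "unknown" ;;
    esac
}
```

As root: report_ro_flag "$(blockdev --getro /dev/VolGroup00/LogVol00)". If the device is flagged read-only, a remount alone will not make the filesystem writable.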

23 Ringo September 15, 2011

# mount -o remount /
did not work. The following error came up:

-bash: /bin/mount: Input/output error

24 KP October 15, 2011

Thank you for this article; I’ve encountered the same problem. My server was not accessible anymore, and there was nothing to find in the logs after reboot.
Just now I was connected over ssh to update the system and it suddenly went read-only. I guess my USB drive is defective somehow.

25 Arun October 28, 2011

Try the command below; it worked for me:
echo j > /proc/sysrq-trigger

j - Forcibly “Just thaw it” - filesystems frozen by the FIFREEZE ioctl
