Why does my Linux server's ext3 filesystem go read-only?

Posted in: Ask nixCraft, CentOS, File system, Linux, RedHat/Fedora Linux, Sys admin, Tips, Troubleshooting, Ubuntu Linux. Last updated August 28, 2007.

From my mailbag:

We have five colocated Dell servers running CentOS 4.x and 5.x. Sometimes the file system (ext3) goes read-only. I’d like to know what could be causing this problem.

My guesses:
a) A hardware problem, most often the hard disk. Check the disk for errors.

b) High disk I/O: a busy-I/O retry error can cause a low-level disk call to be marked as failed, which forces ext3 into read-only mode.

c) High disk I/O on the SAN.

d) The SAN is not configured properly for path failover.

In all of these cases, ext3 goes read-only to protect the filesystem from further damage. If you are using VMware, check the official VMware website for SCSI patches or workarounds for VMware-specific problems.
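To confirm which filesystems are currently mounted read-only, the mount table can be inspected. The following is a minimal sketch, assuming a Linux system where `/proc/mounts` is available; `list_ro_mounts` is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# list_ro_mounts [MOUNTS_FILE]
# Print "mountpoint fstype" for every entry mounted read-only.
# MOUNTS_FILE must be in /proc/mounts format and defaults to /proc/mounts.
list_ro_mounts() {
    mounts_file="${1:-/proc/mounts}"
    # Fields: device, mount point, fs type, options, dump, pass.
    # Options are comma-separated; match "ro" only as a whole option so that
    # something like "errors=remount-ro" is not mistaken for read-only.
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "ro") { print $2, $3; break }
        }
    }' "$mounts_file"
}
```

Calling `list_ro_mounts` with no argument checks the live system. Pseudo-filesystems that are read-only by design may also appear in the output.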

So what else could cause a file system on Linux to go read-only?

Apart from the generic problems above, any other error can trigger a Linux filesystem to go read-only. I hope our readers and seasoned Linux admins can help answer this question. Please share your experiences and advice in the comments.
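Whatever the trigger, the kernel normally logs an error just before it remounts the filesystem read-only, so the log is a good first place to look. The sketch below searches a log for messages that typically accompany this event. `find_ro_events` is a hypothetical helper, the default path `/var/log/messages` is an assumption based on the CentOS/RHEL syslog layout, and the exact message wording varies between kernel versions.

```shell
#!/bin/sh
# find_ro_events [LOGFILE]
# Print log lines that commonly appear when ext2/3/4 hits an error and
# remounts read-only. LOGFILE defaults to /var/log/messages (assumed path);
# pass "-" to read from standard input instead.
find_ro_events() {
    log="${1:-/var/log/messages}"
    pattern='EXT[234]-fs error|Remounting filesystem read-only|Aborting journal|I/O error'
    if [ "$log" = "-" ]; then
        grep -E "$pattern"
    else
        grep -E "$pattern" "$log"
    fi
}
```

For the kernel ring buffer rather than a file, `dmesg | find_ro_events -` applies the same filter. Low-level I/O errors that appear shortly before the ext3 messages usually point at the disk, controller, or SAN path rather than at the filesystem itself.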

45 comments

  1. This is a bug in the Linux kernel. It is apparently fixed as of kernel 2.6.22. Also, I *think* the fix has been back-ported to some earlier kernels; for example, CentOS 5.1 (and RHEL 5.1, for that matter) supposedly have it fixed as well. I commonly saw it in CentOS 5 virtual machines running on VMware. VMware has a Knowledge Base article with more information.

    Although that article is targeted for virtual machines, apparently the kernel bug affected Linux running on real hardware as well.

    1. Hi Greg,
      You said that this is a bug in the Linux kernel and that it is fixed as of kernel 2.6.22. I would like to know what exactly the problem is, i.e. which scenario causes the file system to become read-only.
      We are currently using kernel version 2.6.10 and are facing the same problem. I would like to know exactly what fix was made in 2.6.22 (which files were modified) so that I can go through those files and incorporate the same change in our kernel.

  2. OK, so I am having a similar problem. CentOS 4.2. Don’t ask but the following commands were executed:
    chown -R 0 etc opt var
    chgrp -R 0 etc opt var
    chmod -R 0500 opt
    chmod -R 0600 etc var

    Several things stopped working… after the server was rebooted, I can log in still: root@(none) and I can even see all the etc, opt, and var data. However, when I try:
    chmod -R 0755 etc

    I get all sorts of changing file blah; Read-only file system. I cannot alter the permissions. Is there a way to fix this?

    1. Dude….that mount -o remount / saved my butt….

      Truly….you are the man.

      Where did you learn that? Really….

      1. Pinoy,
        If it says that, what I found worked was to umount the FS first and then mount it again before running the command below.

        Also, if you have trouble unmounting, there is probably a user or process still active in the filesystem.
        Check with:
        fuser -cu /opt

        Once you have unmounted and remounted, run this command to make it read/write:

        mount -n -o remount,rw /opt
        (Obviously changing /opt to whatever filesystem you have the issue with.)
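The two commands from this reply can be wrapped in a small dry-run helper so they can be reviewed before being run as root. This is only a sketch: `remount_rw` and the `RUN` variable are hypothetical names, and the helper assumes a mount point without spaces in its path.

```shell
#!/bin/sh
# remount_rw MOUNTPOINT
# Dry-run by default: prints each command instead of executing it.
# Set RUN=1 in the environment to actually execute (requires root).
remount_rw() {
    mnt="$1"
    if [ -z "$mnt" ]; then
        echo "usage: remount_rw MOUNTPOINT" >&2
        return 2
    fi
    # First show who is using the filesystem, then remount it read/write.
    for cmd in "fuser -cu $mnt" "mount -n -o remount,rw $mnt"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Word splitting is intentional: $cmd holds a command plus arguments.
            $cmd
        else
            echo "$cmd"
        fi
    done
}
```

Running `remount_rw /opt` prints the two commands; `RUN=1 remount_rw /opt` executes them. If the kernel remounted the filesystem read-only because of an error, a remount alone does not address the underlying fault.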

  3. Hi All,
    I am frequently getting a read-only file system error on my server.
    We are using Postgres with a Grid database. The database is very large.
    CentOS 5.3 64 bit Areca high point rocket raid 3520 8 port
    32 GB RAM
    Assembled hardware.
    We process millions of rows daily and load them into the database. We have noticed that when we create a new database it works fine for 20 to 25 days. After that we
    get errors like “read only file system” and the data is corrupted. We therefore run fsck to remove bad blocks from the disk. However, even after running fsck we get the same error.

    I would appreciate it if somebody could help me get rid of this issue.

  4. An improper shutdown or a hang can corrupt sectors. When the system later uses a corrupted sector, the file system goes read-only.

    1. Hi sathishkumar,

      Can you give me more details about this failure? Why does using a corrupted sector put the FS into a read-only state?

      I have many servers running openSUSE with kernel 2.6.27.23-0.1, and after each improper shutdown or power failure the system comes back in the read-only state.

      Thanks in advance!

    1. Thanks.

      That’s what I came here looking for — the option in fstab to replace.
      I’m under no illusion that going RO on error is a bug, but I didn’t know what option to replace it with. I’d rather monitor for errors than have a Xen/XCP server go read-only and take down all guests.

      Much appreciated.
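The fstab option discussed in this sub-thread is presumably the ext3 `errors=` mount option, which accepts `continue`, `remount-ro`, or `panic`. When it is absent from fstab, the default stored in the superblock applies (it can be changed with `tune2fs -e`). As a rough sketch, the hypothetical helper `show_errors_option` below reports what each ext2/ext3/ext4 entry in an fstab-format file requests.

```shell
#!/bin/sh
# show_errors_option [FSTAB]
# For each ext2/ext3/ext4 entry, print the mount point and the value of its
# errors= option, or a note that the superblock default applies.
# FSTAB defaults to /etc/fstab.
show_errors_option() {
    fstab="${1:-/etc/fstab}"
    awk '$1 !~ /^#/ && $3 ~ /^ext[234]$/ {
        behaviour = "(default from superblock)"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            # "errors=" is 7 characters long, so the value starts at position 8.
            if (opts[i] ~ /^errors=/) behaviour = substr(opts[i], 8)
        }
        print $2, behaviour
    }' "$fstab"
}
```

Choosing `errors=continue` keeps the filesystem writable after an error, at the risk of compounding any corruption, so it only makes sense together with the monitoring mentioned above.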

  5. I cannot figure this out. I am relatively new to CentOS. My AsteriskNow system came with CentOS, and after a few days it went read-only. I fixed it the first time with mount -o remount / only to have it happen again 2 days later. Now that doesn’t work. I have been down for two days. Linux Rescue? How exactly? I tried to do this with a LiveCD and it complains about ext4 missing. I am so frustrated I want to give up, but it runs the phones and I have a year-long SIP contract. Any help would be appreciated.
    CentOS, kernel 2.6.18
    AsteriskNow 1.8

    Hardware: Intel ITX 1.2 GHz Celeron, 1 GB memory, 4 GB SSD drive, TDM800P (4 FXS, 4 FXO)

  6. So just on a whim and for no good reason (I figured the system was hosed by now) I rebooted and did another mount -o remount /. I still could not log in via ssh. I noticed several daemons failing, including auditd. I still could not log in to the Asterisk web manager interface. I couldn’t even run the Asterisk CLI. So I ran yum update kernel followed by yum update. It found a kernel update and installed it. It found 209 other updates and installed them. Mind you, when I set this up last week I installed all the updates it could find. So would I be correct in assuming that these updates were needed because of data corruption? Because now another reboot produces a good working system.

  7. Thanks for the mount -o remount / … I will have to try that. I rebooted a system that had the issue. I’ll have to remember to try this when it happens again. Still don’t know the root cause yet.

  8. We have a virtual server running RHEL 3. There are four disks: sda, sdb, sdc and sdd.
    Recently the server hung twice, and the console went blank.
    We had to hard-reboot the machine both times. When the server was up and running again, we found in the system logs that three disks were fine, but the fourth disk, sdd, had gone into read-only mode and needed recovery to become writable again. We recorded the logs, presented them to the client, and suggested running fsck.

    Later, however, the recovery completed on its own and the partition was mounted rw again.

    The strange thing is that the message saying the disk went read-only carried a timestamp.
    So my question is: does the system perform the recovery in a log buffer to reduce I/O contention? And when it takes the partition for recovery, does it record that time and then, once recovery is complete, push the messages to disk with the same timestamp?

  9. mount -o remount / did not work for me. All I get is this:

    # mount -o remount /
    mount: block device /dev/VolGroup00/LogVol00 is write-protected, mounting read-only
    # touch shit
    touch: cannot touch `shit': Read-only file system
    #

    1. Can you run fsck to check for errors (if you are unsure about fsck, please google it before proceeding) and then mount the file system rw?
      mount -o remount,rw /

  10. Thank you for this article. I’ve encountered the same problem. My server was no longer accessible, and there was nothing in the logs after the reboot.
    Just now I was connected over ssh to update the system and it suddenly went read-only. I guess my USB drive is defective somehow.

  11. Posted it in the wrong discussion – so here we go again:
    I think my drive may have crapped out. I saw a “Read only file system” error while trying to log some things. Logged in as root and still no luck writing to the disk. Then I forced an fsck with a reboot, still no luck. A ‘df’ does not even show that drive now. Any thoughts/ideas?

      1. Well, mine is a standalone server: CentOS 6.2, 64 cores, 512 GB RAM. No external drives or USB sticks are connected to it.

  12. I ran into the same problem on my company’s development web server.

    I lost a lot of work… backups were on the same server…

    This Ubuntu server was running as a virtual machine on an ESXi server.

    I set up a new VM, and a few hours later I am now having the same problem on a totally new Linux installation.

    So I think I’ll go with a hard drive problem…

    1. I hope we have learned not to store backups on a development server.

      And this issue is almost always related to a failing hard drive.

  13. Hi,
    We have a server running Red Hat (Red Hat Enterprise Linux Server release 5.7 (Tikanga), kernel 2.6.18-274.3.1.el5) with RAID 10.
    When we try to copy files to one mounted partition, it automatically changes to read-only. There were some bad sectors on one of the hard disks, and it has been replaced with a new HDD.
    We would appreciate your advice on what could cause this issue even after the HDD was replaced.

    Thanks !

  14. In every OS I have seen this occur in, the RO mode is a protective measure. There is an inconsistency in the volume metadata, which may be cached. So, when this happens on Linux, I just reboot the machine. fsck will typically be called at boot time, and this approach has worked every time.

  15. Hi All,
    My server runs RHEL 6.4, and my root (/) file system goes into read-only mode every 3 to 4 days.
    This has been happening for the last month.
    Every time, I reboot the server and it comes back in write mode.
    Can anyone tell me how I can fix it permanently? Please help me.
