NFS Stale File Handle error and solution

Posted in Categories Linux, Tips, Troubleshooting; last updated October 9, 2006

Sometimes NFS can run into weird problems. For example, NFS-mounted directories sometimes contain stale file handles. If you run a command such as ls or vi, you will see an error:
$ ls
.: Stale File Handle

First, let us try to understand the concept of a stale file handle. The book Managing NFS and NIS, 2nd Edition (a good book if you would like to master NFS and NIS) defines file handles as follows:
A filehandle becomes stale whenever the file or directory referenced by the handle is removed by another host, while your client still holds an active reference to the object. A typical example occurs when the current directory of a process, running on your client, is removed on the server (either by a process running on the server or on another client).

So this can occur if the directory is modified on the NFS server, but the directory's modification time is not updated.

How do I fix this problem?

(a) The best solution is to remount the directory from the NFS client using the mount command:
# umount -f /mnt/local
# mount -t nfs nfsserver:/path/to/share /mnt/local

The first command (umount -f) forcefully unmounts the NFS-mounted directory /mnt/local; the second mounts it again from the server.
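
Once the share is remounted, a quick check against the mount point should come back clean instead of returning the stale handle error; a minimal sketch, using the same /mnt/local mount point as above:
# ls /mnt/local
# df -h /mnt/local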

(b) Alternatively, try to mount the NFS directory with the noac option. However, I don't recommend the noac option because of its performance impact, and because checking files on an NFS filesystem referenced by file descriptors (i.e. the fcntl and ioctl families of functions) may lead to inconsistent results due to the lack of a consistency check in the kernel, even when noac is used.
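
If you do want to experiment with noac, it is passed as a regular mount option; a minimal sketch, assuming the same nfsserver:/path/to/share export and /mnt/local mount point used above:
# mount -t nfs -o noac nfsserver:/path/to/share /mnt/local
This disables attribute caching on the client, which makes stale attribute data less likely to linger, at the cost of extra attribute lookups against the server.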

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

44 comments

  1. I encounter these errors on my Ubuntu client when I reboot my Fedora server with the NFS shares. I find I need to first umount the shares on the client, then restart nfs on the server, then remount the shares on the client. I don't claim to understand the logic behind this, but it works.

  2. I had this error on my home laptop, so it had nothing to do with servers. It happened when I tried to install Songbird on my computer; something went wrong, and when I wanted to reinstall Songbird my computer gave this error. I tried to reboot and to delete the /usr/share/songbird directory, but nothing worked. Finally I left it as it was and just ran the scripts while they weren't in place (right from my home folder), and now, a few weeks later, the problem has resolved itself and I could reinstall Songbird without problems. If you can say how this happened, please let me know.

  3. When a client mounts a server's exported NFS mount point to the client's specific mount point, the client and server negotiate a unique ID for that event (i.e. the ID is unique for the mount, mount point, server, server exported file system, etc.), so there is a new unique ID for every successful mount request. All communications between the client and the server include this unique ID. When a server reboots, the server (intentionally) will have no record of the unique ID, hence the client will get an error when it tries to access the remote file system. The stale file handle error is telling you that this has occurred.

    Unmounting (force) and remounting from the client will resolve this problem IF there are no other references within the client that are retaining the handle/ID. I.e. this is a problem for processes that are running and have open file handles when you do the force unmount. When you force unmount/remount and still get the error, you have some program that is probably (?) not following proper file handle semantics about closing files.

    However, watch out for a related problem: autofs has timeout options, and these can cause a stale NFS handle in some combinations of NFS (version 2 as well) client-to-server communications and processes that (correctly) hold open file handles.

    So, a stale NFS handle occurring on a client after a server reboot, resolved on the client by an un/remount of the client’s file system is proper behaviour.
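
    Before forcing the unmount, it is worth checking which processes on the client still hold references to the mount; something like this usually shows them (using /mnt/local from the article purely as a placeholder):

    # fuser -vm /mnt/local
    # lsof /mnt/local

    Any process listed there has to be stopped, or made to close its files, before the umount -f / mount cycle will stick.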

  4. I just bought a laptop with Linpus Linux on it, and the first time I started it, it gave me this notice:
    NFS Stale File Handle error. I don't have much experience with Linux; how can I solve the problem, or install Windows XP (when I tried, it gave me a similar error and stopped so as not to damage the system)?

  5. Usually I find that some operation which forces the inode tables on the client and server to update will de-zombify stale NFS data. Something as simple as remaking the mount point directory on the client (I have seen that solve some really weird stuff), or renaming a directory on the server, forcing the client to try to use it, and then renaming it back. This is all assuming a simple umount/mount and/or kill/restart doesn't work.

    *cough*network failure system*cough*

  6. I have this problem on my system running on Windows. I am running an embedded program loaded on a chip and not connected to a server at all. When I do ls it shows the error “Stale NFS file handle”. I am using PuTTY / HyperTerminal. The solution mentioned above is not working for me. Please help me out.

    Thanking you in advance.

  7. I have my file system stored on the hda2 partition, which is of type ext2, and I am getting the following error:
    mount /dev/hda2 /mnt/hda2
    EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
    #
    # cd /mnt/hda2
    #
    # ls
    ls: ./^J#: Stale NFS file handle
    ls: ./AppInfo: Stale NFS file handle
    ls: ./C: Stale NFS file handle

    Settings selinux
    alsa-mae-init launchup sys
    b# lib test_snd
    bin linuxrc tmp
    config_init lost+found tmp_bac
    data mnt
    dev opt usr
    etc proc var
    i root
    init sbin

  8. The umount/mount fix is a *really* big hammer. Besides, it only fixes the symptoms, it doesn't fix the disease. Hasn't anyone written a fix so that when the Stale condition is detected, the filehandle/dirhandle in question is refreshed (made non-Stale)? Seems like a simple enough fix.

    We get the Stale handles between RedHat Ent. and Tru64, sometimes within a matter of seconds of making the mount. Neither file system has been restarted. A umount/mount will fix it for a short time, but it returns quickly. Using the NOAC option significantly reduces the frequency of occurrence, but also significantly reduces write speed to the mounted files; even within one open/write/close session, the write is greatly slowed. This is especially curious, since the file attributes should be immaterial during the writes, IMO.

    Our biggest problem is that we can't pay a trained monkey to sit around watching for the Stale handle incidents and do the umount/mount 24/7. Besides, the umount will affect other running processes. It's hardly a win. Does anyone know of better solutions? Has anyone else had success with the `ls -alh' fix?

  9. Perhaps I am missing something, but I have never ONCE been able to get the above to work.

    I always get stuck with:
    # umount -f /mnt/home-ext
    umount2: Device or resource busy
    umount.nfs: /mnt/home-ext: device is busy

    Linux can be so stupid sometimes. I mean, how can it be busy if the mount is stale and so, by its nature, NOTHING is able to access it?

    1. Hi Jozef,

      I really don't know, but I'm having a problem with a server whose directories are mounted on a different server. My guess is, like you said, a problem with the network, but so far I haven't found a way of testing it.

      Did you have a similar problem or did you get any answer to your question?

      Cheers.

  10. If you are running the Nautilus file manager, you'll probably find the problem is Nautilus, not NFS at all. Try “killall nautilus” from any shell prompt. Works for me, so far.

  11. This problem is haunting me on newer Fedora 17 installs. Autofs works fine, but it is not timing out the mounts after the resource is no longer used, so they seem to stay alive until something like the remote host being rebooted happens.
    Then the stale mount thing…
    But after forcing the umount (which seems to fix things partially; no more stale mount alerts), now I get:
    Too many levels of symbolic links

    Any hint on this?
    Thanks

  12. I got the same issue (mount.nfs: Stale NFS file handle) the first time I attempted to mount a shared folder.
    I don't really have anything to umount or anything in a busy state.

    Any ideas appreciated.

    thx
    Nicola

  13. Try:

    exportfs -f

    on the server first; if that does not work, then on the client try:

    mount -o remount  /path

    if that fails with device is busy/in-use, find the offending processes with:

    fuser -fvm /path

    and retry remount

    1. I’ve been using NFS for the better part of 20 years and have run into this problem off and on but never found a solution until I came across this post.

      exportfs -f on the server did the trick for me.

      Thanks! This just got added into my toolbox of sysadmin tricks.

    2. Have this error from time to time. Tried all solutions mentioned up to here.

      Thanks Paul Freeman, exportfs -f was new to me and solved the problem without restarting nfs or the server.

  14. I found that I could not umount -f /path; I was always getting the stale message.
    When I looked at my server, I could see in the exportfs -av output that the IP address listed was not the one I was connected on anymore. Looking at my router, I found that I had a dynamic DHCP address. I added a reservation for my MAC address on my old IP address, reconnected my wireless, then mounted, and everything worked as before.

  15. I had this problem and I was sure that the files/directory were still available on the host.
    So I just stepped back up one directory and came back to the current directory, and everything was working then.

  16. Thank you. My yum was locking up, and when I did a trace it turned out to be a stale NFS mount.

    strace yum -y update {package}

    I unmounted and remounted and it worked again. That explains the yum and server issues.

  17. I am facing an issue on my NFS filesystem. When I try to access directories it displays an unknown error:

    STGDPMWEB1:/shareddata # cd STP
    -bash: cd: STP: Unknown error 521
    I am using the command below to mount… can anyone help please?

    mount -t nfs4 NSDLSTAG:/HFS/shareddata /shareddata/

    STGDPMWEB1:/shareddata # ls -lart
    ls: cannot access uploadfiles: Unknown error 521
    ls: cannot access downloadfiles: Unknown error 521
    ls: cannot access mail: Unknown error 521
    ls: cannot access MessagesExportedFromProjectWeb: Unknown error 521
    ls: cannot access STP: Unknown error 521
    ls: cannot access abcd: Unknown error 521
    total 5
    d????????? ? ? ? ? ? uploadfiles
    d????????? ? ? ? ? ? mail
    d????????? ? ? ? ? ? downloadfiles
    -????????? ? ? ? ? ? abcd
    d????????? ? ? ? ? ? STP
    d????????? ? ? ? ? ? MessagesExportedFromProjectWeb
    drwxr-xr-x 28 root root 4096 Feb 3 11:35 ..
    drwxrwxrwx 7 root bin 512 Feb 3 13:21 .

  18. So much misinformation in this thread! It is not true at all that rebooting the NFS server should lead to stale file handles. Indeed, NFS was designed to be a stateless protocol where the server could be rebooted, or indeed migrated to a different host, without interruption to clients (besides a delay in response to NFS requests while the server is down).

    If this is not happening then your NFS configuration is broken. There are numerous ways this breakage can happen on the Linux NFS server. One way is if you do not specify the fsid option in your /etc/exports, and your NFS server decides to automatically assign a different fsid portion of the file handle after a reboot. Whether this will be a problem depends on your configuration.

    Another way things can break is if you start up the NFS daemon on the NFS server and make it available on the network before having exported all filesystems. It's crucial when using NFS in an HA environment that the virtual HA IP address is only added to a server after an NFS failover once all exports have been loaded on the server.

    About the only situation in a correctly configured NFS environment where you will get stale NFS file handle and have to remount filesystems on the client is if the server was restored from a filesystem-level (not block-level) backup, leading to files having different inode numbers and therefore different NFS file handles. There is no simple way around this issue.

    This just touches the surface of some of the possible issues that could lead to the type of problems OP describes. Having to unmount then remount filesystems is *not* the expected behaviour of a correctly configured NFS environment.
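
    For example, an /etc/exports entry that pins the fsid might look something like this, with the path, client network, and fsid value being nothing more than placeholders for whatever your own setup uses:

    /srv/nfs/share 192.168.1.0/24(rw,sync,fsid=1001)

    followed by exportfs -ra on the server to reload the export table. With an explicit fsid, that part of the file handle stays the same across reboots instead of depending on whatever the kernel derives from the underlying device.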

  19. I had the same problem.

    When using df:

    “df: `/mnt/nfs_share_name’: Stale NFS file handle”

    But it was not possible for me to reboot the server.

    The solution is unmounting all shares with stale NFS file handles and then mounting them again.

    But “mount -t nfs nfsserver:/path/to/nfs/share /mnt/destination” was giving me this error:
    “mount.nfs: Stale NFS file handle”

    After this, I mounted the NFS share with the following command:
    “mount -t nfs -o ro,soft,rsize=32768,wsize=32768 NFS_HOST:/path_to_nfs_share /mnt/mount_destination”

    In my case, I use mounts with read-only permissions and don't want them to mount on boot, so there is no reference to them in /etc/fstab.

    If you need to read and write, you have to use “rw” instead of “ro”.

    Hope it helps.

    Regards.
