5 Tips To Speed Up Linux Software Raid Rebuilding And Re-syncing

Published June 15, 2010 · Last updated June 14, 2013

It is no secret that I am a pretty big fan of the excellent Linux software RAID. Creating, assembling and rebuilding a small array is fine. But things start to get nasty when you try to rebuild or re-sync a large array. You may get frustrated when you see that it is going to take 22 hours to rebuild it. You can always increase the speed of Linux software RAID 0/1/5/6 reconstruction using the following five tips.

Recently, I built a small NAS server running Linux for one of my clients, with 5 x 2TB disks in a RAID 6 configuration, as an all-in-one backup server for Mac OS X and Windows XP/Vista client computers. A quick cat /proc/mdstat reported that md0 was active and that recovery was in progress. The recovery speed was around 4000K/sec and would complete in approximately 22 hours. I wanted to finish earlier than that.

Tip #1: /proc/sys/dev/raid/{speed_limit_max,speed_limit_min} kernel variables

The /proc/sys/dev/raid/speed_limit_min is a config file that reflects the current "goal" rebuild speed for times when non-rebuild activity is current on an array. The speed is in Kibibytes per second (1 kibibyte = 2^10 bytes = 1024 bytes), and is a per-device rate, not a per-array rate. The default is 1000.

The /proc/sys/dev/raid/speed_limit_max is a config file that reflects the current "goal" rebuild speed for times when no non-rebuild activity is current on an array. The default is 100,000.

To see current limits, enter:
# sysctl dev.raid.speed_limit_min
# sysctl dev.raid.speed_limit_max
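
You can also read the same values directly from /proc if you prefer (the defaults mentioned above are 1000 and 100,000 respectively):
# cat /proc/sys/dev/raid/speed_limit_min
# cat /proc/sys/dev/raid/speed_limit_max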

NOTE: The following hacks are used for recovering Linux software RAID and for increasing the speed of RAID rebuilds. These options are good for tweaking the rebuild process, but they may increase overall system load, CPU and memory usage.

To increase speed, enter:

 
echo value > /proc/sys/dev/raid/speed_limit_min
 

OR

 
sysctl -w dev.raid.speed_limit_min=value
 

In this example, set it to 50000 K/sec by entering:
# echo 50000 > /proc/sys/dev/raid/speed_limit_min
OR
# sysctl -w dev.raid.speed_limit_min=50000
If you want to override the defaults, you could add these lines to /etc/sysctl.conf (keep only one of the dev.raid.speed_limit_max lines, depending on the size of your array):

#################NOTE ################
##  You are limited by CPU and memory too #
###########################################
dev.raid.speed_limit_min = 50000
## good for 4-5 disks based array ##
dev.raid.speed_limit_max = 2000000
## good for large 6-12 disks based array ###
dev.raid.speed_limit_max = 5000000
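
To make the new limits take effect without a reboot, reload the file with sysctl -p. If you would rather not touch the global limits at all, most modern kernels also expose per-array overrides under /sys/block/mdX/md/. The lines below are a minimal sketch that assumes your array is md0; writing the word "system" reverts to the global settings:
# sysctl -p
# echo 50000 > /sys/block/md0/md/sync_speed_min
# echo 2000000 > /sys/block/md0/md/sync_speed_max
## revert to the system-wide limits ##
# echo system > /sys/block/md0/md/sync_speed_min
# echo system > /sys/block/md0/md/sync_speed_max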

Tip #2: Set read-ahead option

Set the read-ahead (in 512-byte sectors) per RAID device. The syntax is:
# blockdev --setra 65536 /dev/mdX
## Set read-ahead to 32 MiB ##
# blockdev --setra 65536 /dev/md0
# blockdev --setra 65536 /dev/md1
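
To double-check the result, read the value back with --getra; it is reported in 512-byte sectors, so 65536 means 32 MiB:
# blockdev --getra /dev/md0
65536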

Tip #3: Set stripe_cache_size for RAID 5 or RAID 6

This is only available on RAID 5 and RAID 6, and it can boost sync performance by 3-6 times. It records the size (in pages per device) of the stripe cache, which is used for synchronising all write operations to the array and all read operations if the array is degraded. The default is 256. Valid values are 17 to 32768. Increasing this number can increase performance in some situations, at some cost in system memory. Note that setting this value too high can result in an "out of memory" condition for the system. Use the following formula:

memory_consumed = system_page_size * nr_disks * stripe_cache_size
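
For example, on a system with 4 KiB pages, 5 disks and stripe_cache_size set to 16384, the cache consumes 4096 * 5 * 16384 bytes = 320 MiB of RAM. You can estimate the cost for your own box like this (a quick sketch; adjust the disk count and cache size to match your array):
# echo "$(( $(getconf PAGESIZE) * 5 * 16384 / 1024 / 1024 )) MiB"
320 MiB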

To set stripe_cache_size to 16384 pages (64 MiB per disk with 4 KiB pages) for /dev/md0, type:
# echo 16384 > /sys/block/md0/md/stripe_cache_size
To set stripe_cache_size to 32768 pages (128 MiB per disk with 4 KiB pages) for /dev/md3, type:
# echo 32768 > /sys/block/md3/md/stripe_cache_size
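
You can confirm the current value at any time by reading the same sysfs file:
# cat /sys/block/md0/md/stripe_cache_size
16384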

Tip #4: Disable NCQ on all disks

The following will disable NCQ on /dev/sda, /dev/sdb, ..., /dev/sde using a bash for loop:

 
for i in sda sdb sdc sdd sde
do
  echo 1 > /sys/block/$i/device/queue_depth
done
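
Disabling NCQ is a rebuild-time tweak. Once the array is back in sync you will probably want to restore the original queue depth (31 is a common default for SATA NCQ drives, as one of the comments below notes, but check what your disks reported before you changed it):
for i in sda sdb sdc sdd sde
do
  echo 31 > /sys/block/$i/device/queue_depth
done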
 

Tip #5: Bitmap Option

A write-intent bitmap optimizes rebuild time after a crash, or after removing and re-adding a device. Turn it on by typing the following command:
# mdadm --grow --bitmap=internal /dev/md0
Once the array is rebuilt or fully synced, disable the bitmap:
# mdadm --grow --bitmap=none /dev/md0
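
As several commenters point out below, a bitmap cannot be added while a resync is already in progress, and leaving it enabled afterwards only costs a little write performance. To confirm the bitmap is active, look for a bitmap line in /proc/mdstat:
# grep bitmap /proc/mdstat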

Results

My speed went from about 4000K/sec to over 51000K/sec:
# cat /proc/mdstat
Sample outputs:

Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
md5 : active raid1 sde2[2](S) sdd2[3](S) sdc2[4](S) sdb2[1] sda2[0]
      530048 blocks [2/2] [UU]
md0 : active raid6 sde3[4] sdd3[3] sdc3[2] sdb3[1] sda3[0]
      5855836800 blocks level 6, 64k chunk, algorithm 2 [5/5] [UUUUU]
      [============>........]  resync = 61.7% (1205475036/1951945600) finish=242.9min speed=51204K/sec

Monitoring the raid rebuilding/recovery process like a pro

You can cat the /proc/mdstat file. This read-only file contains information about the status of currently running arrays and shows the rebuilding speed:
# cat /proc/mdstat
Alternatively, use the watch command to display the /proc/mdstat output on screen repeatedly:
# watch -n1 cat /proc/mdstat
Sample outputs:

Fig.01: Performance optimization for Linux raid6 for /dev/md2
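
If the box hosts several arrays, you can watch just the one that is rebuilding (this example assumes the array is md0):
# watch -n1 "grep -A 3 '^md0' /proc/mdstat"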


The following command provides details about the /dev/md2 RAID array, including its status and a health report:
# mdadm --detail /dev/md2
Sample outputs:
Fig.02: Finding out information about md2 raid array


Another option is to watch what is actually happening by typing one of the following iostat commands to see disk utilization:

 
watch iostat -k 1 2
watch -n1 iostat -k 1 2
 

Sample outputs:

Fig.03: Find out CPU statistics and input/output statistics for devices and partitions


Comments

1 zosky June 16, 2010 at 2:05 am

i love your blog, but now i hate you … i added a new drive last week and watched mdadm reshape for ~80hrs … guess it’ll be easier next time. thanks for the tip :P

2 nixCraft June 16, 2010 at 10:33 am

Heh.. no problem. I’ve updated the post with bitmap option which will increase speed further.

3 ManvsTech June 16, 2010 at 3:56 pm

That’s an awesome tip. From my rough calculations I assume it finished in about 1hr 45min. May not be a good thing if you bill your clients per hour :)

4 Jord June 17, 2010 at 5:53 pm

Hi there,

this for sure will help a lot when I install our budget backup SAN :-). Do you by any chance also know whether such limits exist for drbd syncing?

Cheers!

5 nixCraft June 17, 2010 at 6:04 pm

Yes, drbd also suffers from the same issue, and it can be fixed by editing /etc/drbd.conf. Look out for the rate directive:

 rate 40M;

Another note: LVM really slows down performance with drbd, so make sure you avoid it. See this help page [drbd.org].

6 Thank You June 18, 2010 at 2:57 pm

I always faced this speed problem and I thought it was due to a software issue. With this tip I reduced my build time from 9.2 hours to less than 1 hour.

7 Cr0t June 18, 2010 at 8:11 pm

One of my machines has 5x 2TB in a RAID5. I got even more speed out of my volume after tweaking the drives and the md0 volume itself.

—BEGIN—

#!/bin/bash
blockdev --setra 16384 /dev/sd[abcdefg]
echo 1024 > /sys/block/sda/queue/read_ahead_kb
echo 1024 > /sys/block/sdb/queue/read_ahead_kb
echo 1024 > /sys/block/sdc/queue/read_ahead_kb
echo 1024 > /sys/block/sdd/queue/read_ahead_kb
echo 1024 > /sys/block/sde/queue/read_ahead_kb
echo 256 > /sys/block/sda/queue/nr_requests
echo 256 > /sys/block/sdb/queue/nr_requests
echo 256 > /sys/block/sdc/queue/nr_requests
echo 256 > /sys/block/sdd/queue/nr_requests
echo 256 > /sys/block/sde/queue/nr_requests
# Set read-ahead.
echo "Setting read-ahead to 64 MiB for /dev/md0"
blockdev --setra 65536 /dev/md0
# Set stripe-cache_size for RAID5.
echo "Setting stripe_cache_size to 16 MiB for /dev/md0"
echo 16384 > /sys/block/md0/md/stripe_cache_size
echo 8192 > /sys/block/md0/md/stripe_cache_active
# Disable NCQ on all disks.
echo "Disabling NCQ on all disks..."
echo 1 > /sys/block/sda/device/queue_depth
echo 1 > /sys/block/sdb/device/queue_depth
echo 1 > /sys/block/sdc/device/queue_depth
echo 1 > /sys/block/sdd/device/queue_depth
echo 1 > /sys/block/sde/device/queue_depth

—END—

8 nixCraft June 18, 2010 at 8:23 pm

Thanks for sharing your script.

9 Jord July 16, 2010 at 11:20 am

One more question on the software raid possibilities:
with a hardware raid, when one disk dies, you can still boot the machine. I had achieved the same for RAID-1 with info found on the net, but how does one go about achieving the same for a 6 disk RAID-5 setup? I can't find that info…. Any suggestions or links?

One possible solution I came up with (but I’m not too keen on using it) would be to install the system to an USB drive/stick (of which one can easily make backups, or even use 2 usb devices in RAID-1) and keep the 6 drives as pure data-disks (no OS) in RAID 5.

Thanks in advance

10 David Clymer August 10, 2011 at 12:32 pm

You could split the disks into two partitions, a small 100M partition, and a data partition which uses the rest of the disk. Set up two raid volumes. The small partitions would be raid1, and the rest raid5/6 whatever. install everything on the raid6 volume and mount the raid1 as /boot. Setup grub for each disk:

root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)
…etc.

In this way, any drive could be removed and you could still boot the machine.

11 franta February 17, 2011 at 11:33 pm

I suffer from renaming interfaces on reboot … sometimes system disk is sda sometimes sde …it’s a little bit weird … but it made me make your script a little bit more dynamic :)

#!/bin/bash
#Parses output of mdadm --detail for specified array (here md0) for devices used in raid
disks=$(sudo mdadm --detail /dev/md0 | grep "/dev/sd" | awk '{print $7}' | cut -c6-8)
for i in $disks
do
echo "Setting read-ahead to 16MiB for disk "$i" in RAID (probably)"
blockdev --setra 16384 /dev/$i
echo "Setting read_ahead_kb to 1024 for disk "$i" in RAID..."
echo 1024 > /sys/block/$i/queue/read_ahead_kb
echo "Setting nr_requests to 256 for disk "$i" in RAID..."
echo 256 > /sys/block/$i/queue/nr_requests
echo "Disabling NCQ for disk "$i" in RAID..."
echo 1 > /sys/block/$i/device/queue_depth
done
# Set read-ahead.
echo "Setting read-ahead to 64 MiB for /dev/md0"
blockdev --setra 65536 /dev/md0
# Set stripe-cache_size for RAID5.
echo "Setting stripe_cache_size to 16 MiB for /dev/md0"
echo 16384 > /sys/block/md0/md/stripe_cache_size
#echo 8192 > /sys/block/md0/md/stripe_cache_active  << it's readonly

It’s my first post in here, so I hope it will be nicely formated :)

12 upsite February 28, 2011 at 1:04 am

Hi Cr0t,

which hardware specs do your setra and read_ahead_kb values rely on?
i got WD20EARS drives with 64 MB cache.
so is it safe/useful to increase your values?

Best practice values ? limits?
thanx ;)

13 Jord July 29, 2010 at 1:07 pm

The easiest solution I found was to combine raid 1 (for booting the system) & raid 5 (for data) on the same six disks. Agreed, I now have a 6 disk raid 1, but it is only 2GB. The rest is raid 5. Hopes this helps someone facing the same problems sometimes.

Cheers, Jord

14 Andy Lee Robinson August 9, 2010 at 6:00 am

A suggestion, boot from a pendrive.
I just leave a pendrive in a usb port and boot from that as there isn’t any space for more drives.
Had some problems installing Fedora 12 some months ago using the onboard fakeraid and locking up, so reinstalled as standalone drives and mdadm.
I wanted to make a kickass colocated web/mysql server and I’m still soaking it, but I’m worried that resyncing may cause performance issues at critical times, as I have seen while developing.
(it’s now part of a 3-node multiple master mysql cluster+slave experiment, with one node at the ISP linked to here at home over 30mbps / ssh tunnels) :-)

My setup:
RAID10, persistent superblock (host i7/920 + 12Gb RAM)
|=========md0========|====md1====| /dev/sda 1Tb
|=========md0========|====md1====| /dev/sdb 1Tb
|=========md0========|====md1====| /dev/sdc 1Tb
|=========md0========|====md1====| /dev/sdd 1Tb
|=========md0========| /dev/sde 750Gb
|=========md0========| /dev/sdf 750Gb
|======| /dev/sdg 2x250Gb (HW raid1)
|=| /dev/sdh 1Gb pendrive /boot

15 Rory August 20, 2010 at 4:29 am

So can you add the intent bitmap during a RAID 5 reshape? That would really help me out…

Definitely should have looked at this before. Great blog post, man.

16 dalobo February 21, 2011 at 2:32 am

When I run sudo mdadm --grow --bitmap=internal /dev/md1

I get:

mdadm: failed to set internal bitmap.

The RAID is rebuilding when I ran this. Do I need to do this before I rebuild? If so, how??

md1 : active raid5 sde[4] sdf[2] sdg[3] sdd[1] sdb[0]
5860543488 blocks super 0.91 level 5, 4k chunk, algorithm 2 [5/5] [UUUUU]
[=>...................] reshape = 8.2% (161204840/1953514496) finish=1231.1min speed=24262K/sec

No change in speed.

17 Israel September 20, 2012 at 10:08 pm

You can use --bitmap=internal with create/build/assemble.

18 jmuellers April 28, 2011 at 8:58 am

Setting the speed limits didn’t help in my case.
I have raid6 with 6xwd20ears, with 0xfb partitions (4kb physical 4kb logical, aligned).
Resync speed after setting up the array is around 20000kb/s, I assume it will get even slower towards the end of the process.

Maybe it would be faster with AAM set to 255 (now set to 128); unfortunately I cannot set it through my raid controller (it only works if the drive is plugged into the mainboard sata controller).

I’m also not able to set the bitmap, but i think it would not affect the initial resync after having just built the array, am I right?

I would also be interested in further information about setting read ahead etc.

19 jmuellers April 28, 2011 at 9:33 am

edit:
Sometimes the speed is rising from 20000k to 100000k for a few seconds or minutes and then falling back to 20. I have no explanation for that behaviour.
Except for the resync the system is idle.
md0_raid6 is using about 5% while the speed is 20000k, 25% while the speed is 100000k.
md0_resync is using a few percent at 20000k, 20% at 100000k.
Just in this moment the speed went up to 100000k for only 10 seconds… no clue.

20 indulis May 15, 2011 at 6:35 am

Sounds like the RAID setup can do 100 MB/sec, but you have changed the /proc/sys/dev/raid/speed_limit_max setting only; this is used if there is no other activity on the disks at all. So the rise to 100MB/sec is when no other I/O activity happens. Try changing /proc/sys/dev/raid/speed_limit_min:

echo "50000" > /proc/sys/dev/raid/speed_limit_min

then see what happens.

21 cd June 8, 2011 at 5:20 pm

just for sharing: replacing a failed drive, a 2TB Caviar Black (WD2002FAEX), 3 good drives + 1 new on a production server (still gets 100Mbs of traffic out)

[>....................] recovery = 2.8% (56272128/1942250240) finish=888.0min speed=35394K/sec

The drives are at 92% busy
DSK | sda | busy 92%

—-total-cpu-usage—- -dsk/total- -net/total- —paging– —system–
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
6 6 83 3 0 2| 107M 51M| 0 0 | 5.6B 44B|8900 14k
1 10 72 14 0 3| 363M 150M|8421k 23M| 0 0 |9330 11k
7 12 63 14 1 4| 416M 150M|8009k 25M| 0 0 |9804 12k

won’t push the speed limit to keep the server responsive.

ps: careful running "magic" scripts with "magic" settings. got my server (colocated) frozen while changing some read-ahead values "on the fly". Cost me $30 extra for the colo guys to reboot it (it was on a weekend :)).

Can be fun tweaking, but really there's no silver bullet that comes without some risks. For oldies: there were times when computers came with a Turbo button. If you pressed it, the cpu frequency would get higher. Of course 50% of the time Duke Nukem would freeze if you pressed the turbo button while playing :)

22 rtcbad October 24, 2011 at 10:37 pm

Thanks to this article and the comments on tweaking the drive parameters from Cr0t, rebuild performance on RAID1 went from 32K to over 110K during system idle. The combination of increasing the dev.raid.speed_limit_min to 50000 and disabling NCQ on both drives gave a phenomenal increase in performance. After seeing how much disabling NCQ helped, I further increased dev.raid.speed_limit_min to 110000 to further speed up the rebuild. The rebuild went from an initial estimate of 6 hours, to only taking about 1.5 hours after these tweaks.

23 M1TH November 21, 2011 at 5:02 pm

Also, thanks to Cr0t!! sped up my raid build from about 28-30K/sec to 40-45K/sec, topping out over 50K occasionally. recognize what you are doing issuing those commands, which is tuning the hard drive parameters that deal with pretty low-level communication options of the device. options tied to the drivers, firmware and kernel. so, use them with care, and don’t be surprised if extreme circumstances cause a lock-up or other failure case causing you to have to start your raid rebuild over. point being, use with caution and in experimental environment as mentioned. still, this did not cause any issues with my setup even doing it during the build, and in most cases on properly working hardware it should not. but be warned!

anywho, the original problem for me turned out to be one bad hard drive causing the array to be built at <1K/sec. sad thing is that nothing about the system or even smartctl really indicated errors. it was really diagnosed by noticing the smartctl 'remap' value skyrocketing on one device where it was not on any of the others, and physically looking at the box showed 10x more LED activity on the problematic disk as it tried every read and write vs the other 19 in the array. so, don't rely on your os to inform you of hard drive issues and you can't rely on SMART at face value either. i think any experienced sys-admin would validate that statement.

happy tuning everyone =) thanks for the info.

24 eddie samaz March 22, 2012 at 9:31 pm

Thank you. You saved me tonnes of hours.

25 Manuel Schmitt April 2, 2012 at 7:40 am

For a howto in German go here

26 Tom in MN April 14, 2012 at 6:26 pm

There is no reason to turn bitmaps off (unless you need to grow the array), in fact, the whole point to bitmaps is to leave them on so only the unsynced parts of the device (as indicated by the bitmap) need to be rebuilt when a crash or other unclean shutdown occurs. Turn them on when you first make the raid and leave them on.

Also you can’t turn them on during a resync anyways.

27 schnitzel April 16, 2012 at 7:02 pm

Thanks for the tips, and thanks to Cr0t for sharing his script.
Speed went from 26M/sec to 140M/sec

28 Flávio April 17, 2012 at 2:40 pm

I did
But it didn't work :/

The speed_min is 50000 and the speed of the process stays at 1000kb/s

29 fuzzy May 19, 2012 at 10:34 am

Further to Cr0t’s script…
A little clean up + added lines to set speed max/min
NOTE: only edit the lines at the beginning

— begin script —

#!/bin/bash
### EDIT THESE LINES ONLY
RAID_NAME="md0"
# devices seperated by spaces i.e. "a b c d..." between "(" and ")"
RAID_DRIVES=(b c d e f g)
# this should be changed to match above line
blockdev --setra 16384 /dev/sd[bcdefg]
SPEED_MIN=50000
SPEED_MAX=200000
### DO NOT EDIT THE LINES BELOW
echo $SPEED_MIN > /proc/sys/dev/raid/speed_limit_min
echo $SPEED_MAX > /proc/sys/dev/raid/speed_limit_max
# looping though drives that make up raid -> /dev/sda,/dev/sdb...
for index in "${RAID_DRIVES[@]}"
do
        echo 1024 > /sys/block/sd${index}/queue/read_ahead_kb
        echo 256 > /sys/block/sd${index}/queue/nr_requests
        # Disabling NCQ on all disks...
        echo 1 > /sys/block/sd${index}/device/queue_depth
done
# Set read-ahead.
echo "Setting read-ahead to 64 MiB for /dev/${RAID_NAME}"
blockdev --setra 65536 /dev/${RAID_NAME}
# Set stripe-cache_size
echo "Setting stripe_cache_size to 16 MiB for /dev/${RAID_NAME}"
echo 16384 > /sys/block/${RAID_NAME}/md/stripe_cache_size
echo 8192 > /sys/block/${RAID_NAME}/md/stripe_cache_active

— end script —

30 nixCraft May 19, 2012 at 5:05 pm

@fuzzy,

Nice update. One suggestion, you can accept md0 (or md1) and SPEED_MIN and
SPEED_MAX using command line args:

#!/bin/bash
### EDIT THESE LINES ONLY
RAID_NAME="${1:-md0}"
# devices seperated by spaces i.e. "a b c d..." between "(" and ")"
RAID_DRIVES=(b c d e f g)
# this should be changed to match above line
blockdev --setra 16384 /dev/sd[bcdefg]
SPEED_MIN=${2:-50000}
SPEED_MAX=${3:-200000}
# rest is the same as posted by fuzzy

One can run it as:

./script md1 60000 100000

31 nor500 May 22, 2012 at 4:55 am

I have raid5 with 5 wd20ears. Before this script:
echo 50000 > /proc/sys/dev/raid/speed_limit_min
I had rebuilding speeds of around 15000K/sec. After running that script I got 55000-70000K/sec, but somehow after the rebuild had been running for a couple of hours it slowed down again to the original speed of 15000K/sec…
When I query the speed with this command: "sysctl dev.raid.speed_limit_min", it still shows "dev.raid.speed_limit_min = 50000", but it just slowed down to an average speed of 15000K/sec.
What happened? What can I do? Please help me in this issue.

32 pfarthing6 August 1, 2012 at 11:20 pm

Just wanted to add that if the sync is already running when you change these values, it won't (or at least didn't for me) make a difference. So, you can bounce the machine OR do this, which causes the resync to restart:

echo "idle" > /sys/block/md0/md/sync_action

33 xyxproceed October 18, 2013 at 7:19 am

Thanks for your suggestions, but it doesn't work for me: if I run the command echo "idle" > /sys/block/md0/md/sync_action, the md is still recovering instead of idle.
How can we set the stripe_cache_size and make it work? We can set it in the linux softraid kernel module, but I wouldn't like to set it there.

34 pfarthing6 August 1, 2012 at 11:24 pm

And changing this value really kicks the resync into overdrive…

echo [size val] > /sys/block/md1/md/stripe_cache_size

35 Schnitzel August 4, 2012 at 12:55 am

Let me add as well: I've been running a raid6 for a few months, and done quite a few rebuilds, and often it fails during rebuild, but having it rebuild at slower speeds gives me more data security… just my 2 cents

36 davet August 7, 2012 at 5:44 am

Great post – thanks
I have 2 x 16GB USB thumb drives with CentOS software RAID level 1 operating 5 x 2TB software RAID level 5 with hot spare which offers 6TB iSCSI target to the network. My poor man’s SAN :)

37 Brian Candler February 4, 2013 at 4:50 pm

If your RAID rebuild is only running at the min speed, it could be because you have just created an ext4 filesystem and it is doing background initialisation (look for a process called “ext4lazyinit”). When that completes, as long as there’s no other disk I/O going on, the RAID rebuild should crank up to the max speed.

38 Dion R April 5, 2013 at 5:53 pm

Once the rebuild is complete, do you recommend changing any of these values back to something, or leaving them as is? If you recommend changing them, what would you recommend they be changed to?

Examples of changed values:
echo 1024 > /sys/block/sd${index}/queue/read_ahead_kb
echo 256 > /sys/block/sd${index}/queue/nr_requests
echo 1 > /sys/block/sd${index}/device/queue_depth

Thoughts?

39 BitProcessor May 28, 2013 at 6:04 pm

setting the queue_depth of all disks to 1 sped up the resync of a raid5 array of 5 disks (2TB, and a partitioned 3TB) by a factor of 1.5x

thanks for that Cr0t

(raising the min_speed limit also helped a bit ;) )

40 Sicofante June 20, 2013 at 2:41 am

I'd like to ask a naive question: how do these changes affect the RAID array speed in ordinary use, i.e. when not rebuilding the array? Should they be put back to the previous values after the rebuild is finished?

41 Jinn Ko August 12, 2013 at 11:05 am

These steps helped a great deal, improving resync performance of my RAID-5 array with 4x WD20EARX drives from 15MB/s to 45MB/s on a dual-core hyperthreaded atom-d525. Didn’t realize the NCQ settings had such a big impact, improving performance by 200%, while the stripe-cache-size improved performance by around 25%.

I have made a script that saves current values so you can restore them later, before setting the high performance values. It will also detect the underlying disks to set the NCQ for you, along with the read-ahead and sync_speed values. I left out the setting of the bitmap as this can only be done while not syncing, so wasn’t useful to me.

The script can be found at https://gist.github.com/jinnko/6209899.

42 Peter Buechel September 19, 2013 at 2:23 pm

On a Synology Rackstation, I increased an ongoing RAID5 rebuild speed from 10 MB/s to 20-30 MB/s by increasing

sysctl -w dev.raid.speed_limit_min=100000
(default was 10,000)

without significant CPU load increase.

I wonder if I could set the stripe-cache_size during the rebuild or if I should/must wait till it’s over?

43 pdwalker January 31, 2014 at 7:19 pm

Thanks.

I'm not sure why these tips work, but I got resync performance that went from 20MB/s to about 140MB/s. That's an improvement.

44 eMBee March 18, 2014 at 4:23 pm

changing dev.raid.speed_limit_min and dev.raid.speed_limit_max alone gave me a more than 10fold increase in speed. instead of taking almost a week, i expect the rebuild to be done overnight. the other settings i was not able to change because the resync was already in progress, but
echo "idle" > /sys/block/md0/md/sync_action
was also useful, while not stopping the resync it allowed me to control which device was synced first. instead of the slow huge one i could get it to sync the smaller / and /var devices which (thanks to the speedup) were done in a few minutes. with those out of the way, now i can sleep better while the last huge data disk is syncing.

45 Janus May 4, 2014 at 4:31 pm

Thanks a lot for the tips! I went with #1 and #2 and it sure works like a charm!

Should the settings be reverted after (re)syncing the RAID???

Cheers!

46 Bill McGonigle June 25, 2014 at 4:09 pm

Thanks for this post. I tried all the settings in the main post and the scripts in comments, and all slowed down my rebuild by a small degree (RAID 1). Only increasing nr_requests made a really bad change (-25%); the rest were barely noticeable. I take this to mean that my distro's (Fedora 20) defaults are as good as this hardware (SuperMicro/Hitachi) can get. So, for others:

queue_depth: 31
drive/array ra: 256
read_ahead_kb: 128
nr_requests: 128
speed_limit_min/max: 200000

47 Valentin-Erich July 16, 2014 at 8:16 pm

Great walk-through and great team assembled here. Thanks NIX and all!
I migrated a 5 drive RAID from ZFS with 2 extra drives for Swing Space.
This required about 3 rounds of pool expansions and reshapes, and with these settings I managed to cut the reshape times to half.

I am on the last reshape, and get about 33MBps on reshapes
One of the disks is a bit on the slower side, and constantly maxes out its IOPS.

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid5 sdd1[1] sde1[2] sdg1[4] sdf1[3] sdc1[0]
7813770240 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
[=>...................] reshape = 6.7% (265179648/3906885120) finish=1798.8min speed=33740K/sec
bitmap: 0/30 pages [0KB], 65536KB chunk

I had to redo my /etc/mdadm/mdadm.conf by hand a few times as the pool did not activate upon restarts, because somehow my discs reordered (yes… I was bold enough to restart during a reshape… learned my lesson)

48 Ian Tester August 10, 2014 at 2:08 pm

Isn’t NCQ a good thing? So why does disabling it give you a speed boost?

49 David Goodwin September 16, 2014 at 3:47 pm

Short summary – the bitmap stuff is wrong in the above.

It will only help if you need to re-add a device that had been in the array and became disconnected due to a system crash (or similar). So an event that would normally trigger a full resync between the devices can be partially avoided – as the bitmap can be consulted to determine which parts need synchronisation.

You cannot add a bitmap to an array that is syncing.

If you remove the bitmap, you may get slightly better write performance (as writing to the disk will cause the bitmap to be updated, if one is present) – at the cost of losing a potential faster resync in the future after some sort of crash.

See : https://raid.wiki.kernel.org/index.php/Write-intent_bitmap
