Adaptec RAID controllers typically present one (logical) disk to the OS for each array of (physical) disks, so you need the following syntax to check the SATA or SAS disks behind them. The /dev/sgX devices can be used for pass-through I/O controls, providing direct access to each physical disk behind an Adaptec RAID controller.
Is my Adaptec RAID card detected by Linux?
Type the following command:
# lspci | egrep -i 'raid|adaptec'
Sample outputs:
81:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
Download and install Adaptec Storage Manager
You need to install the Adaptec Storage Manager build that matches your Linux distribution and your installed RAID card. Visit this site to grab the software.
SATA Disk Health Check Syntax
To scan for disks, enter:
# smartctl --scan
Sample outputs:
/dev/sda -d scsi # /dev/sda, SCSI device
So /dev/sda is one device, reported as a SCSI device. This RAID device is made up of 4 disks, exposed as /dev/sg{1,2,3,4}. Type the following smartctl command to check a disk behind the /dev/sda RAID device:
# smartctl -d sat --all /dev/sgX
# smartctl -d sat --all /dev/sg1
To ask the device to report its SMART health status or any pending TapeAlert message, run:
# smartctl -d sat --all /dev/sg1 -H
For SAS disks, use the following syntax:
# smartctl -d scsi --all /dev/sgX
# smartctl -d scsi --all /dev/sg1
### Ask the device to report its SMART health status or pending TapeAlert message ###
# smartctl -d scsi --all /dev/sg1 -H
Sample outputs:
smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: SEAGATE ST3146855SS Version: 0002
Serial number: xxxxxxxxxxxxxxx
Device type: disk
Transport protocol: SAS
Local Time is: Wed Jul 7 04:34:30 2010 CDT
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
Current Drive Temperature: 24 C
Drive Trip Temperature: 68 C
Elements in grown defect list: 0
Vendor (Seagate) cache information
  Blocks sent to initiator = 1857385803
  Blocks received from initiator = 1967221471
  Blocks read from cache and sent to initiator = 804439119
  Number of read and write commands whose size <= segment size = 312098925
  Number of read and write commands whose size > segment size = 45998
Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 13224.42
  number of minutes until next internal SMART test = 42
Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   58984049        1         0  58984050   58984050        3151.730           0
write:         0        0         0         0          0  9921230881.600           0
verify:     1308        0         0      1308       1308           0.000           0
Non-medium error count: 0
No self-tests have been logged
Long (extended) Self Test duration: 1367 seconds [22.8 minutes]
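The two fields most worth watching in output like the above are the SMART Health Status line and the last column of the error counter log (total uncorrected errors). Here is a minimal sketch that pulls just those values out of saved `smartctl -d scsi --all` output. The function name smart_summary is made up for this example, and it assumes the SAS-style layout shown above (SATA disks print a different report).

```shell
#!/bin/sh
# smart_summary: hypothetical helper, not part of smartmontools.
# Reads SAS-style `smartctl -d scsi --all` output on stdin and prints
# the health status plus the uncorrected error total for each row of
# the error counter log (the last field of the read/write/verify rows).
smart_summary() {
    awk '
        /^SMART Health Status:/ {
            sub(/^SMART Health Status:[ \t]*/, "")
            print "health=" $0
        }
        $1 == "read:" || $1 == "write:" || $1 == "verify:" {
            name = $1
            sub(/:$/, "", name)
            print name "_uncorrected=" $NF
        }
    '
}
```

You could feed it live output with something like `smartctl -d scsi --all /dev/sg1 | smart_summary`. Anything other than health=OK, or a non-zero uncorrected count, deserves a closer look.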
Here is another output, from a SAS-based disk called /dev/sg2:
# smartctl -d scsi --all /dev/sg2 -H
Sample outputs:
Replace /dev/sg1 with your disk number. If you have a RAID 10 array with 4 disks, then:
- /dev/sg0 – RAID 10 controller (you will not get any info for /dev/sg0).
- /dev/sg1 – First disk in RAID 10 array.
- /dev/sg2 – Second disk in RAID 10 array.
- /dev/sg3 – Third disk in RAID 10 array.
- /dev/sg4 – Fourth disk in RAID 10 array.
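With a layout like the one listed above, you may want a one-line health summary per physical disk instead of running smartctl by hand four times. The following is only a sketch: check_disks and the SMARTCTL variable are names invented for this example, and it assumes the health line reads either "SMART Health Status:" (SAS) or "SMART overall-health self-assessment test result:" (SATA).

```shell
#!/bin/sh
# SMARTCTL is a hypothetical override so a different binary can be used.
SMARTCTL="${SMARTCTL:-smartctl}"

# check_disks TYPE DEVICE...
# TYPE is the smartctl -d value (scsi for SAS disks, sat for SATA disks).
check_disks() {
    dtype="$1"
    shift
    for dev in "$@"; do
        # smartctl's exit status is a bit mask, so read the text instead.
        status=$("$SMARTCTL" -d "$dtype" -H "$dev" 2>/dev/null |
            sed -n -e 's/^SMART Health Status: *//p' \
                   -e 's/^SMART overall-health self-assessment test result: *//p')
        printf '%s %s\n' "$dev" "${status:-UNKNOWN}"
    done
}
```

For the 4-disk RAID 10 example, the call would look like `check_disks scsi /dev/sg1 /dev/sg2 /dev/sg3 /dev/sg4`. A disk that prints UNKNOWN returned no health line at all, which usually means the wrong -d type or a device that is not a physical disk.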
How do I run hard disk check?
Type the following command:
# smartctl -t short -d scsi /dev/sg2
# smartctl -t long -d scsi /dev/sg2
Where,
- -t short : Run short test.
- -t long : Run long test.
- -d scsi : Specify scsi as device type.
- --all : Show all SMART information for device.
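A self-test runs in the background on the drive, so the result only shows up later in the self-test log (`smartctl -l selftest`). As a rough sketch, the start, wait, and read steps can be wrapped in one function. The name run_selftest, the SMARTCTL variable, and the 120-second default wait are assumptions made for this example; a long test needs a much longer wait (the sample output earlier reports about 22.8 minutes for that drive).

```shell
#!/bin/sh
# SMARTCTL is a hypothetical override so a different binary can be used.
SMARTCTL="${SMARTCTL:-smartctl}"

# run_selftest TEST DEVICE [WAIT_SECONDS]
# TEST is short or long; WAIT_SECONDS defaults to 120 (an assumption).
run_selftest() {
    ttype="$1"
    dev="$2"
    wait_secs="${3:-120}"

    rc=0
    "$SMARTCTL" -t "$ttype" -d scsi "$dev" || rc=$?
    # Bits 0 and 1 of smartctl's exit status signal a command line
    # error or a device that could not be opened; give up in that case.
    [ $(( rc & 3 )) -eq 0 ] || return 1

    sleep "$wait_secs"
    # Print the self-test log, which holds the result once the test ends.
    "$SMARTCTL" -l selftest -d scsi "$dev"
}
```

For example, `run_selftest short /dev/sg2` starts the short test on the second disk and prints the log two minutes later.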
How do I use Adaptec Storage Manager?
Another simple command to just check basic status is as follows:
# /usr/StorMan/arcconf getconfig 1 | more
# /usr/StorMan/arcconf getconfig 1 | grep State
# /usr/StorMan/arcconf getconfig 1 | grep -B 3 State
Sample outputs:
----------------------------------------------------------------------
Device #0
Device is a Hard drive
State : Online
--
S.M.A.R.T. : No
Device #1
Device is a Hard drive
State : Online
--
S.M.A.R.T. : No
Device #2
Device is a Hard drive
State : Online
--
S.M.A.R.T. : No
Device #3
Device is a Hard drive
State : Online
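If you only care about drives that are not healthy, the State lines can be filtered further. This is a small sketch, assuming the "Device #N" and "State : ..." layout shown in the sample output and treating any state other than Online as worth reporting. The function name arcconf_bad_devices is invented for this example.

```shell
#!/bin/sh
# arcconf_bad_devices: hypothetical helper, not part of arcconf.
# Reads `arcconf getconfig` output on stdin and prints the device
# number and state of every physical device whose State is not Online.
arcconf_bad_devices() {
    awk '
        /^[ \t]*Device #[0-9]+/ { dev = $2 }
        /^[ \t]*State[ \t]*:/ {
            state = $0
            sub(/^[^:]*:[ \t]*/, "", state)   # drop the "State :" label
            sub(/[ \t]+$/, "", state)         # drop trailing whitespace
            if (dev != "" && state != "Online") print dev " " state
        }
    '
}
```

Usage would be along the lines of `/usr/StorMan/arcconf getconfig 1 | arcconf_bad_devices`. No output means every listed device reported Online.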
Please note that newer versions of arcconf are located in the /usr/Adaptec_Event_Monitor directory, so the full path is as follows:
# /usr/Adaptec_Event_Monitor/arcconf getconfig
Where,
getconfig prints controller configuration information. It accepts the following options:
- AD : Adapter information only
- LD : Logical device information only
- LD# : Optionally display information about the specified logical device
- PD : Physical device information only
- MC : Maxcache 3.0 information only
- AL : All information (optional)
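Since the arcconf binary lives in different directories depending on the version, a script can probe a list of candidate paths rather than hard-coding one. Below is a minimal sketch; find_arcconf is a made-up name, and the candidate list in the usage comment is just the locations mentioned on this page, so adjust it for your install.

```shell
#!/bin/sh
# find_arcconf PATH...
# Hypothetical helper: prints the first candidate path that exists and
# is executable, or returns non-zero if none of them is.
find_arcconf() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Example usage (paths as mentioned on this page):
# ARCCONF=$(find_arcconf /usr/Adaptec_Event_Monitor/arcconf \
#                        /usr/StorMan/arcconf \
#                        /usr/Arcconf/arcconf) && "$ARCCONF" getconfig 1 PD
```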
How do I check the health of my Adaptec RAID array itself on Linux?
Simply use the following command:
# /usr/Adaptec_Event_Monitor/arcconf getconfig 1
OR (older version)
# /usr/StorMan/arcconf getconfig 1
Sample outputs:
I have written a script based on arcconf.
You can find it here … http://fir3net.com/Programming/Bourne-/-BASH/adaptec-storage-manager-script-for-esx4.html
If you don’t know the drive numbers you can guess, or use sg_scan to determine them.
You may need to install the sg3_utils package, which provides sg_scan.
List all generic scsi devices:
# sg_scan -i
Also, unless Adaptec has updated some of the older drivers this will only work with their series 2 and series 5 controllers, not series 3. Some additional information in the comments for this blog post: http://linux.adaptec.com/2008/09/26/smartmontools-and-adaptec-raid-controllers/
Actually, the above output came from a series 3 controller (Adaptec 3405) with the latest firmware. However, we don’t have any series 2 controllers left here, so I can’t verify series 2. Series 5 also works.
HTH.
I have series 2 and 5 and have not had issues with either of them. Perhaps Adaptec updated the drivers for series 3.
On my CentOS 5.5 x64, smartctl doesn’t have a -d sata option, but it does have a -d sat option. Maybe you misspelled it.
Thanks for the heads up!
That is a different one. You can use “-d ata” to send ATA commands or the newer “-d sat” to use SAT (SCSI/ATA Translation). The latter is usually used for USB drives.
@ Vikas I seriously feel that the quality of the posts on this blog has been going down of late. The reason is that in the above post you have pasted the sample output but failed to mention what the output actually means, which is of the utmost importance.
What can I make of the above copy-pasted stuff? I feel that explaining the outcome of a command is of greater importance than anything else. I personally don’t mean to be scornful or anything like that, but seriously, the admin of the blog needs to take a second look at his posts to see if they make any sense at all to his readers. It’s just a suggestion, since I really like your blog.
Seriously? You don’t understand the above output? The command finds out whether there are any errors reading from or writing to the hard disk. The output tells you in advance whether the hard disk is going to fail (hint #1: SMART Health Status: OK; hint #2: the total uncorrected error count in the error counter log is 0). Did you go through the two articles I linked at the bottom of the page (See also:), which explain the smartctl command and what to look for when a disk fails? If you still don’t get it, then I suggest you join our forum to post a specific question, or go through the smartctl man page.
HTH,
PS: My name is vivek.
Note that some servers may not load the drivers needed to expose /dev/sg* interfaces. If you don’t see any of those devices on a system with Adaptec aacraid driver, try:
modprobe sg
To load the “scsi generic” driver that creates them.
Also, it’s useful to know that every unit in the array will respond to “-d scsi” style requests. Things like the logical drives will only give data that way. However, if there is actually a SATA drive connected to that device, then you can also request its SATA specific data via “-d sat” style commands.
Thanks for the hint on getting this to work with Smartctl.
Keep in mind that arcconf may also be installed as a standalone binary, and may be installed to /usr/Arcconf/arcconf .