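RAID Basic Concepts
As data-center workloads keep growing, the amount of data a single server must handle grows with them. When one physical disk can no longer provide the capacity or reliability the business needs, several disks have to be combined in a specific way and presented to the host as a single visible disk. A disk group is a set of physical disks combined and presented as a whole; it is the basis of a virtual disk.
A virtual disk is a contiguous unit of data storage carved out of a disk group. It behaves like an independent disk and, with the appropriate configuration, offers more capacity than a single physical disk along with better reliability and data redundancy.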
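A virtual disk can be:
● One complete disk group.
● Several complete disk groups.
● Part of one disk group.
● Parts of several disk groups (one portion taken from each group, together forming the virtual disk).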
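The Two Camps of RAID Controllers
Broadcom (Avago)
On December 2, 2005, the semiconductor division of Agilent (an HP spin-off) announced that it was being renamed Avago Technologies.
On December 17, 2013, Avago spent USD 6.6 billion to acquire the storage chip maker LSI.
In May 2015, Avago went further and acquired Broadcom for USD 37 billion, then took the Broadcom name. This is where "Broadcom (Avago)" comes from.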
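Microsemi
On May 10, 2010, PMC-Sierra acquired Adaptec's channel storage business for USD 34 million.
From 2014 to 2015, Microsemi acquired several companies focused on communications and data centers, including Mingoa, Centellax, Vitesse, and PMC-Sierra.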
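Common RAID Controller Models
Taking Huawei servers as an example: LSI 2208/3108, PM8060/8068, Avago MegaRAID SAS 9440-8i/9440-16i, and MSCC SmartRAID 3152-8i/SmartHBA 2100-8i.
Two RAID controller management tools are used from within the operating system: MegaRAID StorCLI for Avago controllers and ARCCONF for Microsemi controllers.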
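Tool Usage
1. StorCLI
StorCLI is installed in /opt/MegaRAID/storcli/. Change into that directory before running RAID controller commands.
(1) Querying RAID controller, RAID group, and physical drive information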
storcli64 /ccontroller_id show
storcli64 /ccontroller_id/eenclosure_id/sslot_id show all
storcli64 /ccontroller_id/vvd_id show all
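controller_id  ID of the RAID controller the drive is attached to. Set it to all to query every controller the tool can manage.
enclosure_id   ID of the enclosure (drive backplane) the drive is in. Set it to all to query the enclosures attached to every controller the tool can manage.
slot_id        Slot number of the physical drive. Set it to all to query every drive.
vd_id          ID of the virtual disk. Set it to all to query every virtual disk.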
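The object paths in the syntax above are built by prefixing each ID with a letter (c, e, s, v). The following is a minimal sketch of that naming scheme in Python. The helper names (controller_path, drive_path, vd_path, show_command) are hypothetical and exist only for illustration; the sketch builds command strings and does not run StorCLI.

```python
# Hypothetical helpers: these names are illustrative and are not part of StorCLI.

def controller_path(controller_id="all"):
    """Return the controller object path, e.g. /c0 or /call."""
    return f"/c{controller_id}"


def drive_path(controller_id, enclosure_id="all", slot_id="all"):
    """Return the physical drive object path, e.g. /c0/e252/s0."""
    return f"/c{controller_id}/e{enclosure_id}/s{slot_id}"


def vd_path(controller_id, vd_id="all"):
    """Return the virtual disk object path, e.g. /c0/v0."""
    return f"/c{controller_id}/v{vd_id}"


def show_command(path, detailed=False, binary="storcli64"):
    """Build a 'show' or 'show all' command line for the given object path."""
    suffix = " show all" if detailed else " show"
    return f"{binary} {path}{suffix}"


print(show_command(controller_path(0)))
print(show_command(drive_path(0, 252, 0), detailed=True))
print(show_command(vd_path(0, 0), detailed=True))
```

The three printed commands correspond to the three query forms listed above.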
Example:
Query RAID controller information.
[root@localhost]# ./storcli64 /c0 show
Generating detailed summary of the adapter, it may take a while to complete.
CLI Version = 007.0504.0000.0000 Nov 22, 2017
Operating system = Linux 4.19.36-vhulk1907.1.0.h453.eulerosv2r8.aarch64
Controller = 0
Status = Success
Description = None
Product Name = AVAGO MegaRAID SAS 9460-8i
Serial Number = SP01022668
SAS Address = 500062b206129f00
PCI Address = 00:01:00:00
System Time = 11/13/2019 04:33:58
Mfg. Date = 03/12/20
Controller Time = 11/13/2019 09:33:56
FW Package Build = 51.13.0-3223
BIOS Version = 7.13.00.0_070D0300
FW Version = 5.130.00-3059
Driver Name = megaraid_sas
Driver Version = 07.713.02.00
Current Personality = RAID-Mode
……
Query information about the drive in slot 0.
[root@localhost]# ./storcli64 /c0/e252/s0 show all
CLI Version = 007.0409.0000.0000 Nov 06, 2017
Operating system = Linux 3.10.0-514.el7.x86_64
Controller = 0
Status = Success
Description = Show Drive Information Succeeded.
Drive /c0/e252/s0 :
EID:Slt DID State DG Size       Intf Med SED PI SeSz Model               Sp Type
252:0   48  Onln  0  744.125 GB SATA SSD N   N  512B INTEL SSDSC2BB800G4 U  -
Query information about virtual disk 0.
[root@localhost]# ./storcli64 /c0/v0 show all
Controller = 0
Status = Success
Description = None
/c0/v0 :
DG/VD TYPE  State Access Consist Cache sCC Size     Name
1/0   RAID1 Optl  RW     Yes     RWTD  -   1.089 TB
Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
Optl=Optimal|RO=Read Only|RW=Read Write|B=Blocked|Consist=Consistent|
R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled
Check Consistency
PDs for VD 0 :
EID:Slt DID State DG Size     Intf Med SED PI SeSz Model        Sp
25:22   14  Onln  1  1.089 TB SAS  HDD N   N  512B ST1200MM0007 U
25:23   26  Onln  1  1.089 TB SAS  HDD N   N  512B ST1200MM0007 U
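The header of each StorCLI response in the examples above is a series of "Key = Value" lines (Controller, Status, Description, and so on). As a rough sketch, a script could collect those lines into a dictionary before deciding whether a command succeeded. The function name parse_key_value is hypothetical, and the sample text is copied from the output shown above; table rows and legend lines are deliberately ignored.

```python
def parse_key_value(output):
    """Collect 'Key = Value' lines from StorCLI text output into a dict.

    Lines without ' = ' (tables, legends such as 'Cac=CacheCade|...') are skipped.
    """
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# Sample lines taken from the controller query example above.
SAMPLE_OUTPUT = """\
Controller = 0
Status = Success
Description = None
Product Name = AVAGO MegaRAID SAS 9460-8i
Driver Name = megaraid_sas
Cac=CacheCade|Rec=Recovery|OfLn=OffLine
"""

info = parse_key_value(SAMPLE_OUTPUT)
if info.get("Status") != "Success":
    raise RuntimeError("StorCLI command did not report success")
print(info["Product Name"])
```

Checking the Status field first is a simple way to avoid acting on the output of a failed command.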
(2) Querying and controlling physical drive initialization
storcli64 /ccontroller_id/eenclosure_id/sslot_id action initialization
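enclosure_id is the ID of the drive enclosure (backplane) and slot_id is the drive slot ID; both can be read from the output of the RAID controller configuration query.
action is the operation to perform: show, start, or stop.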
Example:
Initialize the drive in slot 3.
domino:~# ./storcli64 /c0/e252/s3 start initialization
Controller = 0
Status = Success
Description = Start Drive Initialization Succeeded.
Query the initialization progress.
domino:~# ./storcli64 /c0/e252/s3 show initialization
Controller = 0
Status = Success
Description = Show Drive Initialization Status Succeeded.
Drive-ID    Progress% Status      Estimated Time Left
/c0/e252/s3 0         In progress 0 Seconds
(3) Creating and deleting RAID groups
storcli64 /ccontroller_id add vd rlevel size=capacity drives=enclosure_id:slot_id|enclosure_id:startid-endid,enclosure_id:slot_id|enclosure_id:startid-endid [pdperarray=pdperarray]
storcli64 /ccontroller_id/vraid_id del
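level          RAID level to configure: 0, 1, 5, 6, 10, 50, or 60. Each number denotes the corresponding RAID level.
capacity       Capacity of the RAID group to create.
startid-endid  First and last slot IDs of a run of consecutive drives to add to the RAID group.
slot_id        Slot ID of a single drive to add to the RAID group.
pdperarray     Number of drives in each sub-group. Required when creating RAID 10, RAID 50, or RAID 60; not needed for other RAID levels.
raid_id        ID of the RAID group to delete.
When several drives are added to a RAID group, separate them with commas. A single slot is written as enclosure_id:slot_id, and a run of consecutive slots as enclosure_id:startid-endid.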
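To illustrate the drives= notation described above, here is a small sketch that turns a list of slot numbers into the comma-separated form, collapsing consecutive slots into startid-endid ranges, and then assembles an add vd command. The function names drives_arg and add_vd_command are hypothetical, and the sketch only builds strings; it assumes all drives sit in a single enclosure.

```python
def drives_arg(enclosure_id, slots):
    """Build the drives= value, collapsing consecutive slots into start-end runs."""
    ordered = sorted(set(slots))
    if not ordered:
        raise ValueError("at least one slot is required")
    runs = []
    start = prev = ordered[0]
    for slot in ordered[1:]:
        if slot != prev + 1:          # gap found: close the current run
            runs.append((start, prev))
            start = slot
        prev = slot
    runs.append((start, prev))
    parts = []
    for first, last in runs:
        if first == last:
            parts.append(f"{enclosure_id}:{first}")
        else:
            parts.append(f"{enclosure_id}:{first}-{last}")
    return ",".join(parts)


def add_vd_command(controller_id, level, size, enclosure_id, slots, pdperarray=None):
    """Build an 'add vd' command; pdperarray is required for RAID 10/50/60."""
    nested = level in (10, 50, 60)
    if nested and pdperarray is None:
        raise ValueError("pdperarray is required for RAID 10, 50, and 60")
    cmd = (f"storcli64 /c{controller_id} add vd r{level} size={size} "
           f"drives={drives_arg(enclosure_id, slots)}")
    if nested:
        cmd += f" pdperarray={pdperarray}"
    return cmd


print(add_vd_command(0, 0, "100GB", 252, [0, 1, 2, 3]))
print(add_vd_command(0, 10, "100GB", 252, [0, 1, 2, 3], pdperarray=2))
```

Rejecting a nested RAID level without pdperarray in the helper catches the omission before the command reaches the controller.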
Example:
Create a RAID 0 group.
domino:~# ./storcli64 /c0 add vd r0 size=100GB drives=252:0-3
Controller = 0
Status = Success
Description = Add VD Succeeded
Delete a RAID group.
domino:~# ./storcli64 /c0/v0 del
Controller = 0
Status = Success
Description = Delete VD Succeeded
(4) Setting the cache read/write policy of a RAID group
storcli64 /ccontroller_id/vraid_id set wrcache=mode
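mode  Cache write mode:
● wt: write-through. The controller signals completion to the host only after the disk subsystem has received all of the transferred data.
● wb: write-back. The controller signals completion to the host as soon as the controller cache has received all of the transferred data.
● awb: always write-back. Forces "wb" behavior even when the RAID controller has no supercapacitor or the supercapacitor is damaged.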
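The three modes above can be captured in a small lookup table so that a script rejects anything else before building the command. This is only a sketch: WRCACHE_MODES and set_wrcache_command are hypothetical names, and the function returns a command string rather than executing it.

```python
# Mode names as accepted by 'set wrcache=', with a short description of each.
WRCACHE_MODES = {
    "wt": "write-through: complete after the disk subsystem has the data",
    "wb": "write-back: complete once the controller cache has the data",
    "awb": "always write-back: write-back even without a healthy supercapacitor",
}


def set_wrcache_command(controller_id, vd_id, mode):
    """Build the command that sets the write cache mode of one virtual disk."""
    mode = mode.lower()
    if mode not in WRCACHE_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(WRCACHE_MODES)}")
    return f"storcli64 /c{controller_id}/v{vd_id} set wrcache={mode}"


print(set_wrcache_command(0, 0, "wt"))
```

Because awb keeps acknowledging writes from cache with no working supercapacitor behind it, a power loss can discard acknowledged data, so a script may want to require explicit confirmation before using that mode.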
Example:
Set the cache write mode to "wt".
domino:~# ./storcli64 /c0/v0 set wrcache=wt
Controller = 0
Status = Success
Description = None
Details Status :
VD Property Value Status  ErrCd ErrMsg
0  wrCache  WT    Success 0     -
(5) Manually rebuilding a RAID group
storcli64 /ccontroller_id/eenclosure_id/sslot_id insert dg=DG array=Arr row=Row
storcli64 /ccontroller_id/eenclosure_id/sslot_id start rebuild
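DG   ID of the disk group (DG) that contains the failed drive.
Arr  ID of the array that contains the failed drive.
Row  Row number of the failed drive within the array.
1. Run ./storcli64 /c0 show to find the DG, Arr, and Row values of the failed drive.
2. Run storcli64 /ccontroller_id/eenclosure_id/sslot_id insert dg=DG array=Arr row=Row to add the drive to the RAID group.
3. Run storcli64 /ccontroller_id/eenclosure_id/sslot_id start rebuild to rebuild the RAID group manually.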
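Steps 2 and 3 above always target the same drive and must run in that order, which a short sketch can make explicit. The function name manual_rebuild_commands is hypothetical, and the DG, Arr, and Row values are assumed to have been read from the controller query in step 1.

```python
def manual_rebuild_commands(controller_id, enclosure_id, slot_id, dg, arr, row):
    """Return the insert and rebuild commands, in the order they must be run."""
    drive = f"/c{controller_id}/e{enclosure_id}/s{slot_id}"
    return [
        f"storcli64 {drive} insert dg={dg} array={arr} row={row}",
        f"storcli64 {drive} start rebuild",
    ]


for command in manual_rebuild_commands(0, 252, 1, dg=0, arr=0, row=0):
    print(command)
```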
Example:
Add the drive to the RAID group.
[root@localhost ~]# storcli64 /c0/e252/s1 insert dg=0 array=0 row=0
CLI Version = 007.0504.0000.0000 Nov 22, 2017
Operating system = Linux 3.10.0-693.el7.x86_64
Controller = 0
Status = Success
Description = Insert Drive Succeeded.
Rebuild the RAID group manually.
[root@localhost ~]# storcli64 /c0/e252/s1 start rebuild
CLI Version = 007.0504.0000.0000 Nov 22, 2017
Operating system = Linux 3.10.0-693.el7.x86_64
Controller = 0
Status = Success
Description = Start Drive Rebuild Succeeded.
(6) Querying supercapacitor information
storcli64 /ccontroller_id/cv show all
Example:
[root@localhost ~]# ./storcli64 /c0/cv show all
CLI Version = 007.0409.0000.0000 Nov 06, 2017
Operating system = Linux 3.10.0-514.el7.x86_64
Controller = 0
Status = Success
Description = None
If "State" in the output shows "FAILED", the supercapacitor must be replaced.
(7) Configuring drive pass-through (JBOD)
storcli64 /ccontroller_id set jbod=state
storcli64 /ccontroller_id/eenclosure_id/sslot_id set JBOD
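state  Whether the JBOD feature of the RAID controller is enabled: on or off.
Example:
Enable drive pass-through on the RAID controller and set the drive in slot 7 as a pass-through drive.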
domino:~# ./storcli64 /c0 set jbod=on
Controller = 0
Status = Success
Description = None
Controller Properties :
Ctrl_Prop Value
JBOD      ON
domino:~# ./storcli64 /c0/e252/s7 set JBOD
Controller = 0
Status = Success
Description = Set Drive JBOD Succeeded
(8) Viewing, importing, and clearing foreign configurations
storcli64 /ccontroller_id/fall import preview
storcli64 /ccontroller_id/fall import
storcli64 /ccontroller_id/fall delete
Example:
View the foreign configuration of the RAID controller.
[root@localhost ~]# ./storcli64 /c0/fall import preview
CLI Version = 007.0504.0000.0000 Nov 22, 2017
Operating system = Linux 3.10.0-957.el7.x86_64
Controller = 0
Status = Success
Description = Operation on foreign configuration Succeeded
FOREIGN PREVIEW :
DG=Disk Group Index|Arr=Array Index|Row=Row Index|EID=Enclosure Device ID
DID=Device ID|Type=Drive Type|Onln=Online|Rbld=Rebuild|Dgrd=Degraded
Pdgd=Partially degraded|Offln=Offline|BT=Background Task Active
PDC=PD Cache|PI=Protection Info|SED=Self Encrypting Drive|Frgn=Foreign
DS3=Dimmer Switch 3|dflt=Default|Msng=Missing|FSpace=Free Space Present
TR=Transport Ready
Total foreign drive groups = 0
Delete the foreign configuration of the RAID controller.
[root@localhost ~]# ./storcli64 /c0/fall delete
CLI Version = 007.0504.0000.0000 Nov 22, 2017
Operating system = Linux 3.10.0-957.el7.x86_64
Controller = 0
Status = Success
Description = Successfully deleted foreign configuration
2. ARCCONF
(1) Querying basic information about the RAID controller, physical drives, arrays, LDs, and maxCache
arcconf getconfig controller_id AD              Query adapter (controller) information
arcconf getconfig controller_id LD              Query all LDs
arcconf getconfig controller_id LD LD_id        Query a specific LD
arcconf getconfig controller_id PD              Query all physical drives
arcconf getconfig controller_id PD physical_id  Query a specific physical drive
arcconf getconfig controller_id AR              Query all arrays
arcconf getconfig controller_id AR AR_id        Query a specific array
arcconf getconfig controller_id MC              Query maxCache information
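controller_id  ID of the RAID controller the drive is attached to.
LD_id          ID of the LD (logical drive).
physical_id    Physical ID of the drive.
slot_id        Slot number of the physical drive.
AR_id          ID of the array.
Example:
Query information about all arrays.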
[root@localhost ~]# ./arcconf getconfig 1 ar
Controllers found: 1
Array Information
Array Number 0
Name : A
Status : Ok
Interface : SATA SSD
Total Size : 305152 MB
Unused Size : 204800 MB
Block Size : 512 Bytes
Array Utilization : 32.77% Used, 67.23% Unused
Type : Data
Transformation Status : Not Applicable
Spare Rebuild Mode : Dedicated
SSD I/O Bypass : Disabled
Array Logical Device Information
Logical 0 : Optimal (1, Data, 49999 MB) LogicalDrv 0
Array Physical Device Information
Query basic information about all LDs.
[root@localhost ~]# ./arcconf getconfig 1 ld
Controllers found: 1
Logical device information
Logical Device number 0
Logical Device name : LogicalDrv 0
Disk Name : /dev/sdd
Block Size of member drives : 512 Bytes
Array : 0
RAID level : 1
Status of Logical Device : Optimal
Size : 49999 MB
Stripe-unit size : 256 KB
Full Stripe Size : 256 KB
Interface Type : SATA SSD
Device Type : Data
Boot Type : None
Heads : 255
Sectors Per Track : 32
Cylinders : 12549
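ARCCONF prints most properties as "Name : Value" lines, as in the LD output above. A rough sketch of turning such output into a dictionary follows. The function name parse_arcconf_fields is hypothetical, the sample text is copied from the example above, and the sketch assumes a single logical device: with several devices, repeated keys would overwrite one another.

```python
def parse_arcconf_fields(output):
    """Collect 'Name : Value' lines into a dict; heading lines without a value are skipped."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value:
            fields[key] = value
    return fields


# Sample lines taken from the 'arcconf getconfig 1 ld' example above.
SAMPLE_LD_OUTPUT = """\
Controllers found: 1
Logical device information
Logical Device number 0
Logical Device name : LogicalDrv 0
Disk Name : /dev/sdd
RAID level : 1
Status of Logical Device : Optimal
Size : 49999 MB
"""

ld = parse_arcconf_fields(SAMPLE_LD_OUTPUT)
if ld["Status of Logical Device"] != "Optimal":
    print("logical device needs attention:", ld["Disk Name"])
print(ld["RAID level"], ld["Size"])
```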
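(2) Setting the RAID controller operating mode
The 3152-8i RAID controller supports three operating modes:
RAID mode: logical drives under the controller are reported to the OS, but pass-through drives are not.
HBA mode: all RAID functions of the controller are disabled, and every drive under the controller is treated as a pass-through drive.
Mixed mode: both RAID logical drives and pass-through drives are reported to the OS.
The default is "Mixed" mode.
mode  2: HBA mode. 3: RAID mode. 5: Mixed mode.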
arcconf setcontrollermode controller_id mode
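Since the mode argument is a bare number, a name-to-number table can make scripts easier to read. The sketch below assumes only the three values listed in this section (2, 3, 5); CONTROLLER_MODES and set_controller_mode_command are hypothetical names, and the function only builds the command string.

```python
# Mode numbers as listed in this section: 2 = HBA, 3 = RAID, 5 = Mixed.
CONTROLLER_MODES = {"hba": 2, "raid": 3, "mixed": 5}


def set_controller_mode_command(controller_id, mode_name):
    """Build the setcontrollermode command from a readable mode name."""
    try:
        code = CONTROLLER_MODES[mode_name.lower()]
    except KeyError:
        raise ValueError(f"unknown mode {mode_name!r}; expected one of {sorted(CONTROLLER_MODES)}")
    return f"arcconf setcontrollermode {controller_id} {code}"


print(set_controller_mode_command(1, "Mixed"))
```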
Example:
Set the RAID controller operating mode to "Mixed".
[root@localhost ~]# ./arcconf setcontrollermode 1 5
Controllers found: 1
Command completed successfully
Query the RAID controller operating mode.
[root@localhost ~]# arcconf getconfig 1
Controllers found: 1
Controller information
Controller Status : Optimal
Controller Mode : Mixed
Channel description : SCSI
Controller Model : MSCC Adaptec SmartRAID 3152-8i
Controller Serial Number : 7A45F30016A
(3) Enabling and disabling the LD cache
arcconf setcache controller_id logicaldrive LD_id con
arcconf setcache controller_id logicaldrive LD_id coff
Example:
Enable the cache of LD 0.
[root@localhost ~]# arcconf setcache 1 logicaldrive 0 con
Controllers found: 1
Cache mode is already set to Enabled.
Command aborted.
Disable the cache of LD 0.
[root@localhost ~]# arcconf setcache 1 logicaldrive 0 coff
Controllers found: 1
Command completed successfully.
(4) Creating and deleting RAID groups
arcconf create controller_id logicaldrive [options] <size> level <channel# ID#> [channel# ID#]
arcconf delete controller_id logicaldrive ld_id noprompt
Example:
Create a RAID 1 group; quick initialization is the default.
[root@localhost ~]# arcconf create 1 logicaldrive max 1 0 8 0 9
Controllers found: 1
Space will be wasted as devices specified are of different sizes.
Do you want to add a logical device to the configuration?
Press y, then ENTER to continue or press ENTER to abort: y
Creating logical device: vd1
Command completed successfully.
Delete the virtual disk with ID 0.
[root@localhost ~]# arcconf delete 1 logicaldrive 0 noprompt
Controllers found: 1
WARNING: Deleting this logical device will automatically delete array 0 because it is the only logical device
present on that array.
All data in logical device 0 will be lost.
Deleting: logical device 0 ("Logical Drive 1")
Command completed successfully.
(5) Querying and setting the drive write cache policy
arcconf getconfig controller_id ad
arcconf setcache controller_id drivewritecachepolicy drivetype cachepolicy [drivetype cachepolicy ...]
drivetype    Type of drive whose write cache policy is being set:
● Configured: member drives of RAID groups in RAID/Mixed mode.
● Unconfigured: drives that are not members of a RAID group in RAID/Mixed mode.
● HBA: drives in HBA mode.
cachepolicy  Drive write cache policy:
● 0: Default (leave the drive's write cache in its default state)
● 1: Enabled (turn the drive's write cache on)
● 2: Disabled (turn the drive's write cache off)
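The drivetype and cachepolicy arguments are given as pairs, and more than one pair can appear in a single command. Here is a minimal sketch of assembling that argument list from a mapping. DRIVE_TYPES, CACHE_POLICIES, and drive_write_cache_command are hypothetical names, and the function returns a command string without running ARCCONF.

```python
DRIVE_TYPES = ("Configured", "Unconfigured", "HBA")
CACHE_POLICIES = {"default": 0, "enabled": 1, "disabled": 2}


def drive_write_cache_command(controller_id, policies):
    """Build a setcache command from a {drivetype: policy name} mapping."""
    if not policies:
        raise ValueError("at least one drivetype/cachepolicy pair is required")
    args = []
    for drive_type, policy in policies.items():
        if drive_type not in DRIVE_TYPES:
            raise ValueError(f"unknown drive type {drive_type!r}")
        if policy not in CACHE_POLICIES:
            raise ValueError(f"unknown cache policy {policy!r}")
        # Each pair is emitted as 'drivetype number', in insertion order.
        args += [drive_type, str(CACHE_POLICIES[policy])]
    return f"arcconf setcache {controller_id} drivewritecachepolicy " + " ".join(args)


print(drive_write_cache_command(1, {"Configured": "enabled", "Unconfigured": "disabled"}))
```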
Example:
Set the write cache policy of Configured drives to Enabled and that of Unconfigured drives to Disabled.
[root@localhost ~]# ./arcconf setcache 1 drivewritecachepolicy Configured 1 Unconfigured 2
Controllers found: 1
Enabling controller drive write cache can increase write performance but risks losing the data in the cache
on sudden loss.
Command completed successfully.
Query the current drive write cache policy.
[root@localhost ~]# ./arcconf getconfig 1 ad
Controllers found: 1
Controller information
Controller Status : Optimal
Controller Mode : Mixed
Channel description : SCSI
Physical Drive Write Cache Policy Information
Configured Drives : Enabled
Unconfigured Drives : Disabled
HBA Drives : Default
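3. Common problems:
(1) On an LSI RAID controller, a drive that is hot-swapped by mistake raises an alarm and is marked Foreign. The foreign configuration must then be cleared or imported manually.
For a single-drive RAID 0, the foreign configuration can be imported. For other RAID levels, it is recommended to clear the foreign configuration and then rebuild manually.
(2) On an LSI RAID controller, when two drives fail in a RAID 1/5 group and the RAID group becomes invalid, you can try manually setting the last failed drive to the Ugood state, then replace the drive and attempt a rebuild.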
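The recommendation in item (1) amounts to a simple decision rule, sketched here with the foreign-configuration commands from the StorCLI section. The function name foreign_config_commands and its parameters are hypothetical; the rule encoded is only the one stated in item (1), and the manual rebuild that follows a delete is not included.

```python
def foreign_config_commands(controller_id, raid_level, drive_count):
    """Suggest foreign-configuration commands following the rule in item (1).

    Single-drive RAID 0: preview, then import the foreign configuration.
    Any other layout: delete the foreign configuration (manual rebuild follows).
    """
    base = f"storcli64 /c{controller_id}/fall"
    if raid_level == 0 and drive_count == 1:
        return [f"{base} import preview", f"{base} import"]
    return [f"{base} delete"]


print(foreign_config_commands(0, raid_level=0, drive_count=1))
print(foreign_config_commands(0, raid_level=5, drive_count=3))
```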