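A record of the server I dealt with today:
Situation: a Linux system that suddenly became unusable for no known reason (reportedly there was a power outage a while ago, but the machine room has an emergency power supply).
System environment:
Server: Huawei RH2288H V3
Operating system: Linux, Anolis OS 8.4
Disks: two 300 GB hard disks in RAID 1
The fault LEDs on both disks are lit.
None of the options in here can be selected:
The logs follow:
Experts, please help analyze: what actually caused this?
This is the app_debug_log log: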
2023-11-17 21:10:37 Payload ERROR: payload_hs.c(1005): hse_activate_completed:hse_fru_activate_policy
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(1019): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_DEACTIVATED)
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:10:38 Payload ERROR: payload_pwr.c(767): Detect fru270868 payload power dropped.hotswap:M767 m_pwr_state:0 Hardware:1
2023-11-17 21:10:38 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=00,tmp=01
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(177): move M1 to M2
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(636): send activate event at M1
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(948): hse_fru_activate:sending active event
2023-11-17 21:10:38 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(213): move M2 to M3
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(676): call pp_fru_pwr_ctrl(fru_id:0, POWER_ON)
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(273): move M3 to M4
2023-11-17 21:10:38 Payload ERROR: payload_hop.c(1364): hop_on:already power on.
2023-11-17 21:10:38 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:10:38 CpuMem ERROR: cpu.c(2880): Get cpu architecture failed!
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(868): Cpu1:get processor_sn failed !
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(990): Cpu2:get manufacturer failed !
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(957): Cpu2:get processor_family failed !
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(897): Cpu2:get processor_version failed !
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(868): Cpu2:get processor_sn failed !
2023-11-17 21:12:15 CpuMem ERROR: cpu.c(837): Cpu2:get processor_assettag failed !
2023-11-17 21:12:18 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:13:01 sensor_alarm ERROR: sel.c(626): NO matching Sel Filter, sensor_type is 0x1f, reading_type is 0x6f, event_data_1 is 0x7
2023-11-17 21:14:12 sensor_alarm ERROR: sel.c(626): NO matching Sel Filter, sensor_type is 0x1f, reading_type is 0x6f, event_data_1 is 0xa
2023-11-17 21:14:38 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:14:38 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(868): Cpu1:get processor_sn failed !
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(990): Cpu2:get manufacturer failed !
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(957): Cpu2:get processor_family failed !
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(897): Cpu2:get processor_version failed !
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(868): Cpu2:get processor_sn failed !
2023-11-17 21:15:59 CpuMem ERROR: cpu.c(837): Cpu2:get processor_assettag failed !
2023-11-17 21:16:17 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:16:43 sensor_alarm ERROR: sel.c(626): NO matching Sel Filter, sensor_type is 0x1f, reading_type is 0x6f, event_data_1 is 0x7
2023-11-17 21:28:10 sensor_alarm ERROR: sel.c(626): NO matching Sel Filter, sensor_type is 0x1f, reading_type is 0x6f, event_data_1 is 0x8
2023-11-17 21:28:42 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:28:42 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(868): Cpu1:get processor_sn failed !
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(990): Cpu2:get manufacturer failed !
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(957): Cpu2:get processor_family failed !
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(897): Cpu2:get processor_version failed !
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(868): Cpu2:get processor_sn failed !
2023-11-17 21:29:57 CpuMem ERROR: cpu.c(837): Cpu2:get processor_assettag failed !
2023-11-17 21:30:22 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:31:08 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=01,tmp=00
2023-11-17 21:31:08 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M4 m_pwr_state:1 Hardware:0
2023-11-17 21:31:08 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=01,tmp=00
2023-11-17 21:31:08 Payload ERROR: payload_hs.c(394): move M4 to M6
2023-11-17 21:31:09 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0
2023-11-17 21:31:09 Payload ERROR: payload_hs.c(548): move M6 to M1
2023-11-17 21:31:09 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M1 m_pwr_state:0 Hardware:0
2023-11-17 21:37:34 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=00,tmp=01
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(1005): hse_activate_completed:hse_fru_activate_policy
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(1019): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_DEACTIVATED)
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:37:34 Payload ERROR: payload_pwr.c(767): Detect fru270868 payload power dropped.hotswap:M767 m_pwr_state:0 Hardware:1
2023-11-17 21:37:34 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=00,tmp=01
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(177): move M1 to M2
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(636): send activate event at M1
2023-11-17 21:37:34 Payload ERROR: payload_hs.c(948): hse_fru_activate:sending active event
2023-11-17 21:37:34 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:37:35 Payload ERROR: payload_hs.c(213): move M2 to M3
2023-11-17 21:37:35 Payload ERROR: payload_hs.c(676): call pp_fru_pwr_ctrl(fru_id:0, POWER_ON)
2023-11-17 21:37:35 Payload ERROR: payload_hs.c(273): move M3 to M4
2023-11-17 21:37:35 Payload ERROR: payload_hop.c(1364): hop_on:already power on.
2023-11-17 21:37:35 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:39:13 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:46:16 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=01,tmp=00
2023-11-17 21:46:16 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M4 m_pwr_state:1 Hardware:0
2023-11-17 21:46:16 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=01,tmp=00
2023-11-17 21:46:16 Payload ERROR: payload_hs.c(394): move M4 to M6
2023-11-17 21:46:16 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0
2023-11-17 21:46:16 Payload ERROR: payload_hs.c(548): move M6 to M1
2023-11-17 21:46:17 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M1 m_pwr_state:0 Hardware:0
2023-11-17 21:46:25 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=00,tmp=01
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(1005): hse_activate_completed:hse_fru_activate_policy
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(1019): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_DEACTIVATED)
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:46:25 Payload ERROR: payload_pwr.c(767): Detect fru270868 payload power dropped.hotswap:M767 m_pwr_state:0 Hardware:1
2023-11-17 21:46:25 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=00,tmp=01
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(177): move M1 to M2
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(636): send activate event at M1
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(948): hse_fru_activate:sending active event
2023-11-17 21:46:25 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(213): move M2 to M3
2023-11-17 21:46:25 Payload ERROR: payload_hs.c(676): call pp_fru_pwr_ctrl(fru_id:0, POWER_ON)
2023-11-17 21:46:25 Payload ERROR: payload_hop.c(1364): hop_on:already power on.
2023-11-17 21:46:26 Payload ERROR: payload_hs.c(273): move M3 to M4
2023-11-17 21:46:26 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-17 21:48:06 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:49:15 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:49:15 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:50:55 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:52:28 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:52:28 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:54:07 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:55:19 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:55:19 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 21:57:00 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 21:58:23 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 21:58:23 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 22:00:02 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-17 22:08:07 Payload ERROR: payload_pwr.c(5909): .... restart cause=0
2023-11-17 22:08:07 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-17 22:09:48 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-19 09:14:17 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=01,tmp=00
2023-11-19 09:14:17 Payload ERROR: payload_hs.c(394): move M4 to M6
2023-11-19 09:14:17 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:1 Hardware:0
2023-11-19 09:14:17 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=01,tmp=00
2023-11-19 09:14:18 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0
2023-11-19 09:14:18 Payload ERROR: payload_hs.c(548): move M6 to M1
2023-11-19 09:14:18 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M1 m_pwr_state:0 Hardware:0
2023-11-19 09:15:41 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=00,tmp=01
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(1005): hse_activate_completed:hse_fru_activate_policy
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(1019): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_DEACTIVATED)
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-19 09:15:41 Payload ERROR: payload_pwr.c(767): Detect fru270868 payload power dropped.hotswap:M767 m_pwr_state:0 Hardware:1
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(177): move M1 to M2
2023-11-19 09:15:41 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=00,tmp=01
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(636): send activate event at M1
2023-11-19 09:15:41 Payload ERROR: payload_hs.c(948): hse_fru_activate:sending active event
2023-11-19 09:15:41 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...
2023-11-19 09:15:42 Payload ERROR: payload_hs.c(213): move M2 to M3
2023-11-19 09:15:42 Payload ERROR: payload_hs.c(676): call pp_fru_pwr_ctrl(fru_id:0, POWER_ON)
2023-11-19 09:15:42 Payload ERROR: payload_hop.c(1364): hop_on:already power on.
2023-11-19 09:15:42 Payload ERROR: payload_hs.c(273): move M3 to M4
2023-11-19 09:15:42 Payload ERROR: payload_hs.c(1030): hse_activate_completed:hs_send_evt(FRU_ACTIVATED_COMPLETED)
2023-11-19 09:17:21 Payload ERROR: payload_pwr.c(1191): host start successfully, host checker exit.
2023-11-19 09:19:46 Payload ERROR: payload_hop.c(208): fru0 acpi_status:old_tmp=01,tmp=00
2023-11-19 09:19:46 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M4 m_pwr_state:1 Hardware:0
2023-11-19 09:19:46 Payload ERROR: payload_hop.c(261): pwrpg_status:old_tmp=01,tmp=00
2023-11-19 09:19:47 Payload ERROR: payload_hs.c(394): move M4 to M6
2023-11-19 09:19:47 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0
2023-11-19 09:19:47 Payload ERROR: payload_hs.c(548): move M6 to M1
2023-11-19 09:19:47 Payload : ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0 (repeated 2 times)
2023-11-19 09:19:47 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M1 m_pwr_state:0 Hardware:0
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru1 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru2 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru3 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru4 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru5 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru6 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru7 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru8 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru9 object fail!(result=-2009)
1970-01-01 00:00:35 Payload ERROR: payload_hop.c(191): get fru10 object fail!(result=-2009)
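To get an overview of a log like the one above, it can help to count how often the host was reset or lost power on each day. The following is a minimal Python sketch. It assumes the line layout seen in this excerpt (date, time, module, `ERROR:`, source file and line number, message); the names `LINE_RE` and `summarize` are made up for illustration and are not part of any Huawei tool.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: date, time, module, "ERROR:", file(line): message.
# One line in the excerpt reads "Payload : ERROR:", so an optional " :" is allowed.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<module>\w+)(?: :)? ERROR: "
    r"(?P<src>[\w.]+)\((?P<lineno>\d+)\): (?P<msg>.*)$"
)

def summarize(lines):
    """Count host resets and fru0 power drops per calendar date."""
    resets, drops = Counter(), Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        if "detect a host reset" in m["msg"]:
            resets[m["date"]] += 1
        elif "Detect fru0 payload power dropped" in m["msg"]:
            drops[m["date"]] += 1
    return resets, drops

# A few lines copied from the log above.
sample = [
    "2023-11-17 21:14:38 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...",
    "2023-11-17 21:28:42 Payload ERROR: payload_pwr.c(1110): detect a host reset occured, start the host checker...",
    "2023-11-17 21:31:08 Payload ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M4 m_pwr_state:1 Hardware:0",
    "2023-11-19 09:19:47 Payload : ERROR: payload_pwr.c(706): pp_check_pwr_mutation 706:Detect fru0 payload power dropped.hotswap:M6 m_pwr_state:0 Hardware:0 (repeated 2 times)",
]
resets, drops = summarize(sample)
print("host resets per day:", dict(resets))
print("fru0 power drops per day:", dict(drops))
```

Run over the full file, this only shows how often the host restarted or lost power on each day. It does not say why.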
This is the fdm_log log:
[Hardware Error Log]:NO.1 SMI Serial NO.0
collect:bios(smi) time: 2023-11-17 21:49:35 GMT flag:0x00
CPU:0 (socket:CPU1) LogType: IIO AER module:PCIe ADDITIONAL
DEV:(0x00:0x01.0x00)
First Error type: Non-Fatal ERROR Error Code: Received PCIe completion with UR (80)
Error type: corrected errors Error Code: PCIe link bandwidth changed (76)
------iio pcie additional reg dump:------
errpin_ctl: 0x00000000
errpin_stat: 0x00000000
g_sys_ctl: 0x00000000
g_sys_stat: 0x00000000
sys_map: 0x00000120
g_err_ctl: 0x00000000
g_ferr_stat: 0x00000000
g_nerr_stat: 0x00901140
g_cerr_stat: 0x00101140
g_f_ferr_stat: 0x00000000
g_n_ferr_stat: 0x00000000
g_f_nferr_stat: 0x00100000
g_n_nferr_stat: 0x00801140
g_f_cerr_stat: 0x00100000
g_n_cerr_stat: 0x00001140
pcie_uncorrectable_err_detect_mask: 0x00000000
pcie_correctable_err_detect_mask: 0x00000000
pcie_uncorrectable_err_stat: 0x00000040
pcie_correctable_err_stat: 0x00000001
pcie_uncorrectable_err_mask: 0x00000000
pcie_correctable_err_mask: 0x00000000
pcie_uncorrectable_err_ptr: 0x00000006
pcie_uncorrectable_err_sv: 0x00000002
pcie_global_err_stat: 0x0000
pcie_global_f_err_ptr: 0x0000
[Hardware Error Log]:NO.2 SMI Serial NO.0
collect:bios(smi) time: 2023-11-17 21:49:35 GMT flag:0x00
CPU:0 (socket:CPU1) LogType: IIO AER module:PCIe ADDITIONAL
DEV:(0x00:0x02.0x00)
First Error type: Non-Fatal ERROR Error Code: Received PCIe completion with UR (80)
Error type: corrected errors Error Code: PCIe link bandwidth changed (76)
------iio pcie additional reg dump:------
errpin_ctl: 0x00000000
errpin_stat: 0x00000000
g_sys_ctl: 0x00000000
g_sys_stat: 0x00000000
sys_map: 0x00000120
g_err_ctl: 0x00000000
g_ferr_stat: 0x00000000
g_nerr_stat: 0x00901140
g_cerr_stat: 0x00101140
g_f_ferr_stat: 0x00000000
g_n_ferr_stat: 0x00000000
g_f_nferr_stat: 0x00100000
g_n_nferr_stat: 0x00801140
g_f_cerr_stat: 0x00100000
g_n_cerr_stat: 0x00001140
pcie_uncorrectable_err_detect_mask: 0x00000000
pcie_correctable_err_detect_mask: 0x00000000
pcie_uncorrectable_err_stat: 0x00000040
pcie_correctable_err_stat: 0x00000001
pcie_uncorrectable_err_mask: 0x00000000
pcie_correctable_err_mask: 0x00000000
pcie_uncorrectable_err_ptr: 0x00000006
pcie_uncorrectable_err_sv: 0x00000002
pcie_global_err_stat: 0x0000
pcie_global_f_err_ptr: 0x0000
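The two records above differ only in their `DEV:` field. Assuming that field is written as bus:device.function (an assumption, since the log format is not documented here), a small Python sketch can turn it into the address form that `lspci -s` accepts, which makes it easier to look up the affected device from a running Linux system. The helper name `to_bdf` is hypothetical.

```python
import re

# Assumed layout of the field: DEV:(0x<bus>:0x<device>.0x<function>)
DEV_RE = re.compile(
    r"DEV:\(0x([0-9a-fA-F]+):0x([0-9a-fA-F]+)\.0x([0-9a-fA-F]+)\)"
)

def to_bdf(line):
    """Return an lspci-style 'bb:dd.f' address, or None if the line has no DEV field."""
    m = DEV_RE.search(line)
    if m is None:
        return None
    bus, dev, func = (int(part, 16) for part in m.groups())
    return f"{bus:02x}:{dev:02x}.{func:x}"

for record in ("DEV:(0x00:0x01.0x00)", "DEV:(0x00:0x02.0x00)"):
    print(record, "->", to_bdf(record))
```

If the operating system can be booted again, the resulting address could then be passed to something like `lspci -vv -s 00:01.0` to see which device sits there.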
This is the mass_operate_log log:
2022-09-10 05:06:27 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2022-09-10 05:26:57 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 05:16:37 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 05:19:06 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 05:21:36 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 06:24:38 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 06:26:25 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 06:35:17 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 07:21:40 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-18 07:24:42 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 18:46:05 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 18:46:28 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 18:51:20 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 19:28:14 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 19:45:00 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 19:48:57 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 19:50:23 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 20:39:49 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 20:45:42 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:12:15 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:15:59 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:29:57 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:51:40 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:54:50 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-17 21:57:51 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 09:40:44 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 10:18:33 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 10:52:12 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 10:56:25 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 11:02:53 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 11:34:47 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 11:41:43 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 11:44:44 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
2023-11-19 12:08:41 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]
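The entries above are not in chronological order: a 2023-11-18 entry is followed by 2023-11-17 entries. One possible explanation, which is only a guess, is that the BMC clock was changed or reset at some point. The Python sketch below looks for such backward jumps, assuming every line starts with a `YYYY-MM-DD HH:MM:SS` timestamp as in this excerpt. The function name `find_backward_jumps` is made up for illustration.

```python
from datetime import datetime

def find_backward_jumps(lines):
    """Return (previous, current) timestamp pairs where time goes backwards."""
    jumps = []
    prev = None
    for line in lines:
        # The first 19 characters are assumed to be the timestamp.
        ts = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")
        if prev is not None and ts < prev:
            jumps.append((prev, ts))
        prev = ts
    return jumps

# Three consecutive lines copied from the log above.
sample = [
    "2023-11-18 07:21:40 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]",
    "2023-11-18 07:24:42 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]",
    "2023-11-17 18:46:05 IPMI,Unknown@Unknown,ipmi_app,Send message(CH-6) [ 2C B8 1C 20 0E D5 57 01 00 00 3C 00 69 ]",
]
jumps = find_backward_jumps(sample)
for before, after in jumps:
    print("clock went backwards:", before, "->", after)
```

Knowing where the clock jumps happen helps when lining these entries up against the other logs, because their timestamps may not be directly comparable.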
Below is the disk log:
161 Normal 2023-11-18 Saturday 05:14:25 ACPI State Power on state 2200FFFF Asserted
162 Normal 2023-11-18 Saturday 05:14:25 DIMM010 Presence detected, dimm is 0/1/0 0C06FFFF Asserted
163 Normal 2023-11-18 Saturday 05:14:31 SysRestart System Restart [Power button][LOCAL] 1D0703FF Asserted
164 Normal 2023-11-18 Saturday 05:16:50 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
165 Normal 2023-11-18 Saturday 05:19:44 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
166 Major 2023-11-18 Saturday 05:21:56 CPU1 Prochot State Asserted 0341FFFF Asserted
167 Normal 2023-11-18 Saturday 06:22:18 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
168 Major 2023-11-18 Saturday 06:22:20 CPU1 Prochot State Asserted 03C1FFFF Deasserted
169 Normal 2023-11-18 Saturday 06:25:00 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
170 Normal 2023-11-18 Saturday 06:34:02 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
171 Normal 2023-11-18 Saturday 07:20:25 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
172 Normal 2023-11-18 Saturday 07:23:28 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
173 Normal 2023-11-17 Friday 18:43:54 Eth1 Link Down Slot is Disabled 2108FFFF Asserted
174 Normal 2023-11-17 Friday 18:44:36 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
175 Normal 2023-11-17 Friday 18:49:42 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
176 Normal 2023-11-17 Friday 19:18:27 DISK0 Hard disk presence 0D80FFFF Deasserted
177 Major 2023-11-17 Friday 19:18:44 DISK0 In Failed Array 0D06FFFF Asserted
178 Normal 2023-11-17 Friday 19:18:52 DISK0 Hard disk presence 0D00FFFF Asserted
179 Major 2023-11-17 Friday 19:19:02 DISK0 Hard disk drive fault 0D01FFFF Asserted
180 Major 2023-11-17 Friday 19:19:02 DISK0 In Failed Array 0D86FFFF Deasserted
181 Normal 2023-11-17 Friday 19:19:24 DISK0 Hard disk presence 0D80FFFF Deasserted
182 Normal 2023-11-17 Friday 19:19:27 DISK0 Hard disk presence 0D00FFFF Asserted
183 Normal 2023-11-17 Friday 19:19:30 DISK0 Hard disk presence 0D80FFFF Deasserted
184 Normal 2023-11-17 Friday 19:19:32 DISK0 Hard disk presence 0D00FFFF Asserted
185 Normal 2023-11-17 Friday 19:19:54 DISK0 Hard disk presence 0D80FFFF Deasserted
186 Normal 2023-11-17 Friday 19:19:57 DISK0 Hard disk presence 0D00FFFF Asserted
187 Normal 2023-11-17 Friday 19:25:01 DISK0 Hard disk presence 0D80FFFF Deasserted
188 Normal 2023-11-17 Friday 19:25:03 DISK0 Hard disk presence 0D00FFFF Asserted
189 Normal 2023-11-17 Friday 19:25:09 DISK0 Hard disk presence 0D80FFFF Deasserted
190 Major 2023-11-17 Friday 19:25:18 DISK0 Hard disk drive fault 0D81FFFF Deasserted
191 Major 2023-11-17 Friday 19:25:19 DISK0 In Failed Array 0D06FFFF Asserted
192 Normal 2023-11-17 Friday 19:25:59 DISK0 Hard disk presence 0D00FFFF Asserted
193 Major 2023-11-17 Friday 19:26:08 DISK0 Hard disk drive fault 0D01FFFF Asserted
194 Major 2023-11-17 Friday 19:26:08 DISK0 In Failed Array 0D86FFFF Deasserted
195 Normal 2023-11-17 Friday 19:26:55 SysRestart System Restart [Unknown][IPMB] 1D0700FF Asserted
196 Major 2023-11-17 Friday 19:27:07 DISK0 Hard disk drive fault 0D81FFFF Deasserted
197 Major 2023-11-17 Friday 19:28:09 DISK0 Hard disk drive fault 0D01FFFF Asserted
198 Normal 2023-11-17 Friday 19:42:20 Power Button Power button pressed 1400FFFF Asserted
199 Normal 2023-11-17 Friday 19:42:24 ACPI State Power off state 2206FFFF Asserted
200 Normal 2023-11-17 Friday 19:42:25 DIMM000 Presence detected, dimm is 0/0/0 0C86FFFF Deasserted
201 Normal 2023-11-17 Friday 19:42:26 DIMM010 Presence detected, dimm is 0/1/0 0C86FFFF Deasserted
202 Major 2023-11-17 Friday 19:42:35 DISK0 Hard disk drive fault 0D81FFFF Deasserted
203 Normal 2023-11-17 Friday 19:42:37 DISK0 Hard disk presence 0D80FFFF Deasserted
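- "Eth1 Link Down" means the link on Ethernet port 1 went down.
- "SysRestart" is a system restart event.
- "Hard disk presence" is the presence state of the hard disk.
- "Hard disk drive fault" means a hard disk drive fault.
- "In Failed Array" means the array (RAID) that the disk belongs to is in a failed state.
In these logs, Normal marks an ordinary event and Major marks a more serious one, such as a disk fault.
According to these logs, the system went through several hardware problems: the Ethernet link going down, system restarts, and changes in the presence and fault state of the hard disk. These events can affect normal operation and need further diagnosis and handling.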
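The event fields described above can also be tallied mechanically. The Python sketch below assumes each event line has the layout seen in the disk log: ID, severity, date, weekday, time, sensor and description text, an 8-digit hex event code, and `Asserted` or `Deasserted`. The sensor name and the description are kept together in one field, because sensor names such as "ACPI State" contain spaces and cannot be split off reliably. The names `SEL_RE` and `tally_events` are hypothetical.

```python
import re
from collections import Counter

# Assumed layout: id severity date weekday time <sensor + description> code status
SEL_RE = re.compile(
    r"^(?P<id>\d+) (?P<severity>\w+) (?P<date>\d{4}-\d{2}-\d{2}) (?P<weekday>\w+) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<text>.+) (?P<code>[0-9A-F]{8}) "
    r"(?P<status>Asserted|Deasserted)$"
)

def tally_events(lines):
    """Count events by (sensor + description text, status)."""
    counts = Counter()
    for raw in lines:
        m = SEL_RE.match(raw.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed layout
        counts[(m["text"], m["status"])] += 1
    return counts

# Entries 176 to 179 and 181 copied from the disk log.
sample = [
    "176 Normal 2023-11-17 Friday 19:18:27 DISK0 Hard disk presence 0D80FFFF Deasserted",
    "177 Major 2023-11-17 Friday 19:18:44 DISK0 In Failed Array 0D06FFFF Asserted",
    "178 Normal 2023-11-17 Friday 19:18:52 DISK0 Hard disk presence 0D00FFFF Asserted",
    "179 Major 2023-11-17 Friday 19:19:02 DISK0 Hard disk drive fault 0D01FFFF Asserted",
    "181 Normal 2023-11-17 Friday 19:19:24 DISK0 Hard disk presence 0D80FFFF Deasserted",
]
counts = tally_events(sample)
for (text, status), n in sorted(counts.items()):
    print(n, text, status)
```

A high count of "Hard disk presence" toggling between Asserted and Deasserted within a few minutes would show how often DISK0 dropped out and came back. The cause could be the drive, the backplane, or the power supply, and the log alone does not settle that.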
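The logs show that from 19:00 on the 17th onward the system was completely down. Experts, what else can be read from these logs?
This is how we handled it: we contacted Huawei after-sales support, who confirmed that the disks had failed. The warranty has already expired, so we have to find a third party ourselves to deal with the disks.
Huawei customer support told us to follow this:
This is the operation document for manually recovering a RAID whose two member disks have both failed: https://support.xfusion.com/support/#/zh/docOnline/EDOC1100080944?path=zh-cn_topic_0000001134131193&relationId=EDOC1100080946&mark=40
This operation carries a risk of data loss, so proceed with caution.
But my situation is different from the one in that document: none of the options can be clicked.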