1.
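Two days after I installed Proxmox on a Chinese-made Xeon server and put it into operation, an incident occurred. The server, which had been running normally, stopped responding and hung.
I rebooted to see what was going on, but it would not boot.
The code I had run into when I first received the motherboard was 'FF'. I had never seen that code before, so it threw me. I tried removing the CPU and pulling the RAM, but nothing helped. When I asked the manufacturer, all they said was to clean the memory thoroughly.
To find a fix, I first checked what the code means. To start, here is part of what MSI provides.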
BIOS Debug Hex Codes Decoded & BIOS Manuals Links
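At the bottom of the image above you can see the Boot Phase. Let's look at it in more detail.
This is the flow diagram posted in Customizing the UEFI boot process.
Going by the process above, in my case the board stopped before handing off to OS Boot. So what does the error code mean? Here is the code description MSI provides. Asus's Q-Code is similar.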
00 – Not used
01 – Power on. Reset type detection (soft/hard)
02 – AP initialization before microcode loading
03 – System Agent initialization before microcode loading
04 – PCH initialization before microcode loading
05 – OEM initialization before microcode loading
06 – Microcode loading
07 – AP initialization after microcode loading
08 – System Agent initialization after microcode loading
09 – PCH initialization after microcode loading
0A – OEM initialization after microcode loading
0B – Cache initialization
SEC Error Codes
0C – Reserved for future AMI SEC error codes
0D – Reserved for future AMI SEC error codes
0E – Microcode not found
0F – Microcode not loaded
PEI Phase
10 – PEI Core is started
11 – Pre-memory CPU initialization is started
12 – Pre-memory CPU initialization (CPU module specific)
13 – Pre-memory CPU initialization (CPU module specific)
14 – Pre-memory CPU initialization (CPU module specific)
15 – Pre-memory System Agent initialization is started
16 – Pre-Memory System Agent initialization (System Agent module specific)
17 – Pre-Memory System Agent initialization (System Agent module specific)
18 – Pre-Memory System Agent initialization (System Agent module specific)
19 – Pre-memory PCH initialization is started
1A – Pre-memory PCH initialization (PCH module specific)
1B – Pre-memory PCH initialization (PCH module specific)
1C – Pre-memory PCH initialization (PCH module specific)
1D – OEM pre-memory initialization codes
1E – OEM pre-memory initialization codes
1F – OEM pre-memory initialization codes
20 – OEM pre-memory initialization codes
21 – OEM pre-memory initialization codes
22 – OEM pre-memory initialization codes
23 – OEM pre-memory initialization codes
24 – OEM pre-memory initialization codes
25 – OEM pre-memory initialization codes
26 – OEM pre-memory initialization codes
27 – OEM pre-memory initialization codes
28 – OEM pre-memory initialization codes
29 – OEM pre-memory initialization codes
2A – OEM pre-memory initialization codes
2B – Memory initialization. Serial Presence Detect (SPD) data reading
2C – Memory initialization. Memory presence detection
2D – Memory initialization. Programming memory timing information
2E – Memory initialization. Configuring memory
2F – Memory initialization (other)
30 – Reserved for ASL (see ASL Status Codes section below)
31 – Memory Installed
32 – CPU post-memory initialization is started
33 – CPU post-memory initialization. Cache initialization
34 – CPU post-memory initialization. Application Processor(s) (AP) initialization
35 – CPU post-memory initialization. Boot Strap Processor (BSP) selection
36 – CPU post-memory initialization. System Management Mode (SMM) initialization
37 – Post-Memory System Agent initialization is started
38 – Post-Memory System Agent initialization (System Agent module specific)
39 – Post-Memory System Agent initialization (System Agent module specific)
3A – Post-Memory System Agent initialization (System Agent module specific)
3B – Post-Memory PCH initialization is started
3C – Post-Memory PCH initialization (PCH module specific)
3D – Post-Memory PCH initialization (PCH module specific)
3E – Post-Memory PCH initialization (PCH module specific)
3F – OEM post memory initialization codes
40 – OEM post memory initialization codes
41 – OEM post memory initialization codes
42 – OEM post memory initialization codes
43 – OEM post memory initialization codes
44 – OEM post memory initialization codes
45 – OEM post memory initialization codes
46 – OEM post memory initialization codes
47 – OEM post memory initialization codes
48 – OEM post memory initialization codes
49 – OEM post memory initialization codes
4A – OEM post memory initialization codes
4B – OEM post memory initialization codes
4C – OEM post memory initialization codes
4D – OEM post memory initialization codes
4E – OEM post memory initialization codes
4F – DXE IPL is started
PEI Error Codes
50 – Memory initialization error. Invalid memory type or incompatible memory speed
51 – Memory initialization error. SPD reading has failed
52 – Memory initialization error. Invalid memory size or memory modules do not match
53 – Memory initialization error. No usable memory detected
54 – Unspecified memory initialization error
55 – Memory not installed
56 – Invalid CPU type or Speed
57 – CPU mismatch
58 – CPU self test failed or possible CPU cache error
59 – CPU micro-code is not found or micro-code update is failed
5A – Internal CPU error
5B – reset PPI is not available
5C – Reserved for future AMI error codes
5D – Reserved for future AMI error codes
5E – Reserved for future AMI error codes
5F – Reserved for future AMI error codes
DXE Phase
60 – DXE Core is started
61 – NVRAM initialization
62 – Installation of the PCH Runtime Services
63 – CPU DXE initialization is started
64 – CPU DXE initialization (CPU module specific)
65 – CPU DXE initialization (CPU module specific)
66 – CPU DXE initialization (CPU module specific)
67 – CPU DXE initialization (CPU module specific)
68 – PCI host bridge initialization
69 – System Agent DXE initialization is started
6A – System Agent DXE SMM initialization is started
6B – System Agent DXE initialization (System Agent module specific)
6C – System Agent DXE initialization (System Agent module specific)
6D – System Agent DXE initialization (System Agent module specific)
6E – System Agent DXE initialization (System Agent module specific)
6F – System Agent DXE initialization (System Agent module specific)
70 – PCH DXE initialization is started
71 – PCH DXE SMM initialization is started
72 – PCH devices initialization
73 – PCH DXE Initialization (PCH module specific)
74 – PCH DXE Initialization (PCH module specific)
75 – PCH DXE Initialization (PCH module specific)
76 – PCH DXE Initialization (PCH module specific)
77 – PCH DXE Initialization (PCH module specific)
78 – ACPI module initialization
79 – CSM initialization
7A – Reserved for future AMI DXE codes
7B – Reserved for future AMI DXE codes
7C – Reserved for future AMI DXE codes
7D – Reserved for future AMI DXE codes
7E – Reserved for future AMI DXE codes
7F – Reserved for future AMI DXE codes
80 – OEM DXE initialization codes
81 – OEM DXE initialization codes
82 – OEM DXE initialization codes
83 – OEM DXE initialization codes
84 – OEM DXE initialization codes
85 – OEM DXE initialization codes
86 – OEM DXE initialization codes
87 – OEM DXE initialization codes
88 – OEM DXE initialization codes
89 – OEM DXE initialization codes
8A – OEM DXE initialization codes
8B – OEM DXE initialization codes
8C – OEM DXE initialization codes
8D – OEM DXE initialization codes
8E – OEM DXE initialization codes
8F – OEM DXE initialization codes
90 – Boot Device Selection (BDS) phase is started
91 – Driver connecting is started
92 – PCI Bus initialization is started
93 – PCI Bus Hot Plug Controller Initialization
94 – PCI Bus Enumeration
95 – PCI Bus Request Resources
96 – PCI Bus Assign Resources
97 – Console Output devices connect
98 – Console input devices connect
99 – Super IO Initialization
9A – USB initialization is started
9B – USB Reset
9C – USB Detect
9D – USB Enable
9E – Reserved for future AMI codes
9F – Reserved for future AMI codes
A0 – IDE initialization is started
A1 – IDE Reset
A2 – IDE Detect
A3 – IDE Enable
A4 – SCSI initialization is started
A5 – SCSI Reset
A6 – SCSI Detect
A7 – SCSI Enable
A8 – Setup Verifying Password
A9 – Start of Setup
AA – Reserved for ASL (see ASL Status Codes section below)
AB – Setup Input Wait
AC – Reserved for ASL (see ASL Status Codes section below)
AD – Ready To Boot event
AE – Legacy Boot event
AF – Exit Boot Services event
B0 – Runtime Set Virtual Address MAP Begin
B1 – Runtime Set Virtual Address MAP End
B2 – Legacy Option ROM Initialization
B3 – System Reset
B4 – USB hot plug
B5 – PCI bus hot plug
B6 – Clean-up of NVRAM
B7 – Configuration Reset (reset of NVRAM settings)
B8 – Reserved for future AMI codes
B9 – Reserved for future AMI codes
BA – Reserved for future AMI codes
BB – Reserved for future AMI codes
BC – Reserved for future AMI codes
BD – Reserved for future AMI codes
BE – Reserved for future AMI codes
BF – Reserved for future AMI codes
C0 – OEM BDS initialization codes
C1 – OEM BDS initialization codes
C2 – OEM BDS initialization codes
C3 – OEM BDS initialization codes
C4 – OEM BDS initialization codes
C5 – OEM BDS initialization codes
C6 – OEM BDS initialization codes
C7 – OEM BDS initialization codes
C8 – OEM BDS initialization codes
C9 – OEM BDS initialization codes
CA – OEM BDS initialization codes
CB – OEM BDS initialization codes
CC – OEM BDS initialization codes
CD – OEM BDS initialization codes
CE – OEM BDS initialization codes
CF – OEM BDS initialization codes
DXE Error Codes
D0 – CPU initialization error
D1 – System Agent initialization error
D2 – PCH initialization error
D3 – Some of the Architectural Protocols are not available
D4 – PCI resource allocation error. Out of Resources
D5 – No Space for Legacy Option ROM
D6 – No Console Output Devices are found
D7 – No Console Input Devices are found
D8 – Invalid password
D9 – Error loading Boot Option (LoadImage returned error)
DA – Boot Option is failed (StartImage returned error)
DB – Flash update is failed
DC – Reset protocol is not available
S3 Resume Progress Codes
E0 – S3 Resume is started (S3 Resume PPI is called by the DXE IPL)
E1 – S3 Boot Script execution
E2 – Video repost
E3 – OS S3 wake vector call
E4 – Reserved for future AMI progress codes
E5 – Reserved for future AMI progress codes
E6 – Reserved for future AMI progress codes
E7 – Reserved for future AMI progress codes
S3 Resume Error Codes
E8 – S3 Resume Failed
E9 – S3 Resume PPI not Found
EA – S3 Resume Boot Script Error
EB – S3 OS Wake Error
EC – Reserved for future AMI error codes
ED – Reserved for future AMI error codes
EE – Reserved for future AMI error codes
EF – Reserved for future AMI error codes
Recovery Progress Codes
F0 – Recovery condition triggered by firmware (Auto recovery)
F1 – Recovery condition triggered by user (Forced recovery)
F2 – Recovery process started
F3 – Recovery firmware image is found
F4 – Recovery firmware image is loaded
F5 – Reserved for future AMI progress codes
F6 – Reserved for future AMI progress codes
F7 – Reserved for future AMI progress codes
Recovery Error Codes
F8 – Recovery PPI is not available
F9 – Recovery capsule is not found
FA – Invalid recovery capsule
FB – Reserved for future AMI error codes
FC – Reserved for future AMI error codes
FD – Reserved for future AMI error codes
FE – Reserved for future AMI error codes
FF – Reserved for future AMI error codes
ACPI/ASL Checkpoints
01 – System is entering S1 sleep state
02 – System is entering S2 sleep state
03 – System is entering S3 sleep state
04 – System is entering S4 sleep state
05 – System is entering S5 sleep state
10 – System is waking up from the S1 sleep state
20 – System is waking up from the S2 sleep state
30 – System is waking up from the S3 sleep state
40 – System is waking up from the S4 sleep state
AC – System has transitioned into ACPI mode. Interrupt controller is in PIC mode
AA – System has transitioned into ACPI mode. Interrupt controller is in APIC mode
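With a table this long, it can help to reduce a two-digit code to its phase before reading the individual entry. The following is a minimal sketch, assuming the ranges implied by the section headings in the list above. The name `classify_post_code` and the labels are my own, not part of any vendor tool. The label for 01 to 0B is inferred from the "SEC Error Codes" heading that follows it, and 90 to CF is labelled as the BDS part of the DXE phase because the list keeps those codes under the DXE Phase heading. Vendors can reassign OEM and reserved codes, so treat the result as a hint only.

```python
# Ranges follow the section headings of the AMI-style list above.
# Codes 30, AA and AC are also reused as ACPI/ASL checkpoints, which this
# sketch does not try to distinguish.
PHASE_RANGES = [
    (0x01, 0x0B, "SEC progress"),
    (0x0C, 0x0F, "SEC error"),
    (0x10, 0x4F, "PEI progress"),
    (0x50, 0x5F, "PEI error"),
    (0x60, 0x8F, "DXE progress"),
    (0x90, 0xCF, "DXE progress (BDS and later)"),
    (0xD0, 0xDC, "DXE error"),
    (0xE0, 0xE7, "S3 resume progress"),
    (0xE8, 0xEF, "S3 resume error"),
    (0xF0, 0xF7, "Recovery progress"),
    (0xF8, 0xFF, "Recovery error"),
]


def classify_post_code(code: str) -> str:
    """Map a two-digit hex POST code such as 'FF' to a phase label."""
    value = int(code, 16)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"POST code out of range: {code!r}")
    for low, high, label in PHASE_RANGES:
        if low <= value <= high:
            return label
    # 00 ("Not used") and DD to DF do not appear under any heading.
    return "not listed"


if __name__ == "__main__":
    for code in ("2B", "55", "92", "D9", "FF"):
        print(code, "->", classify_post_code(code))
```

For example, `classify_post_code("FF")` returns `"Recovery error"`, which matches where FF sits in the list, although the entry itself is only described as reserved.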
2.
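Now the picture comes together. The problem occurred in the DXE stage. DXE stands for Driver Execution Environment. As the name says, it is the stage that sets up an environment in which drivers can run. Below is the detailed explanation posted in PI Boot Flow. I need to study this more.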
Boot Device Selection (BDS) Phase
The Boot Manager in DXE executes after all the DXE drivers whose dependencies have been satisfied have been dispatched by the DXE Dispatcher. At that time, control is handed to the Boot Device Selection (BDS) phase of execution. The BDS phase is responsible for implementing the platform boot policy. System firmware that is compliant with this specification must implement the boot policy specified in the Boot Manager chapter of the UEFI 2.0 specification. This boot policy provides flexibility that allows system vendors to customize the user experience during this phase of execution.
The Boot Manager must also support booting from a short-form device path that starts with the first node being a firmware volume device path. The boot manager must use the GUID in the firmware volume device node to match it to a firmware volume in the system. The GUID in the firmware volume device path is compared with the firmware volume name GUID. If a match is made, then the firmware volume device path can be appended to the device path of the matching firmware volume and normal boot behavior can then be used.
The BDS phase is implemented as part of the BDS Architectural Protocol. The DXE Foundation will hand control to the BDS Architectural Protocol after all of the DXE drivers whose dependencies have been satisfied have been loaded and executed by the DXE Dispatcher.
The BDS phase is responsible for the following:
- Initializing console devices
- Loading device drivers
- Attempting to load and execute boot selections
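The last responsibility, attempting boot selections in order, is where the D9 and DA entries from the code list come in. The following is a toy model only, not firmware code: `BootOption`, `run_bds` and the two callables are hypothetical stand-ins for the LoadImage and StartImage steps named in the list, and real firmware emits many more codes in between.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class BootOption:
    description: str
    load_image: Callable[[], bool]   # stand-in for LoadImage; True on success
    start_image: Callable[[], bool]  # stand-in for StartImage; True on success


def run_bds(options: List[BootOption]) -> Tuple[Optional[str], List[str]]:
    """Try each boot option in order.

    Returns the description of the option that booted (or None) and the
    list of status codes this toy model would have shown along the way.
    """
    codes = ["90"]  # 90: Boot Device Selection (BDS) phase is started
    for option in options:
        if not option.load_image():
            codes.append("D9")  # D9: LoadImage returned error
            continue
        if not option.start_image():
            codes.append("DA")  # DA: StartImage returned error
            continue
        return option.description, codes
    return None, codes


if __name__ == "__main__":
    options = [
        BootOption("empty disk", lambda: False, lambda: True),
        BootOption("broken loader", lambda: True, lambda: False),
        BootOption("working disk", lambda: True, lambda: True),
    ]
    print(run_bds(options))
```

The point of the model is only that a failing option does not stop the phase: the boot manager moves on to the next entry, and the displayed code reflects the most recent step.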
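Now that I knew the cause, the board had to be repaired. I don't have the skills to repair it myself, so I searched for a decent repair shop. There was a place I had dealt with before, but it seems to have closed. So I searched again. This is the post I relied on before visiting.
Meeting the Hua Tuo of Yongsan independent repair: a review of mainboard repair at "신우전자"
I called ahead and visited. The first thing I noticed was the clean workspace. It was unlike my earlier repair experiences in the Yongsan electronics market. Most shops are cluttered, but this one was spotless. When I asked, the owner said it was a habit from long ago, when he designed and manufactured boards. After some chatting the repair was done, and when we tested the board it was back to normal. To be honest, I got quite a lecture during the repair.
"Why do you use Chinese-made boards?"
"Do you know what the problem with Chinese-made boards is?"
So I told the owner that next time I would buy a used server board from him. The repair itself was very simple. I had assumed a memory fault, but the problem was not the memory slot; it was the CPU. More precisely, it had not been cleaned properly. Back at the office, I reassembled the server and powered it on, and it runs fine.
I always feel that there are many unsung masters out there. But one by one, they are disappearing. It is a sad reality.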