Outage #696
Information
Begins at: | 2020-10-22 01:05:00 CEST |
Duration: | 120 minutes |
Type: | maintenance |
State: | resolved |
Impact: | unavailability |
Affected systems: | Node node1.prg, Node node7.prg, Node node12.prg, Node node13.prg |
Summary: | CPU upgrade |
Description: | CPU upgrade to the more powerful E5-2670V2 |
Handled by: | Martin Myška |
Updates
Date | Summary | Reported by |
---|---|---|
2020-10-20 17:38:24 CEST | | Martin Myška |
State: announced | | |
2020-10-22 05:05:51 CEST | node13: hardware problem encountered, upgrade postponed | Pavel Šnajdr |
State: resolved
node13 ran into a motherboard problem: 4 DIMM slots are not working properly. The node is up with 64 GB less RAM, and migrations are being planned to empty it. The node appears to be running stably, so the migrations are mainly a precaution and a preparation step for further hardware service. The rest of the upgrade had to be postponed because node13 took too long. We apologize for the unnecessary reboots on the other nodes; we did not expect the troubleshooting to take so long. |