Outage #833

Information

Begins at: 2021-09-10 21:23:00 CEST
Duration: 40 minutes
Type: outage
State: resolved
Impact: system_reset
Affected systems: Node node3.brq
Summary: ... another vz node is down; it looks like bad SSD firmware
Description:
The current downtime streak has one common denominator: new Intel SSDs in all the affected nodes...

We are still investigating what exactly is happening :(

What a funny day...
Handled by: Pavel Šnajdr, Jakub Skokan

Updates

Date: 2021-09-10 21:30:50 CEST
Reported by: Pavel Šnajdr
State: announced

Date: 2021-09-10 22:42:58 CEST
Summary: It wasn't the firmware after all
Reported by: Pavel Šnajdr
State: resolved

... the NAS got into a weird, half-stuck state that slowed IO down drastically, which took out the nodes with the worst memory management, i.e. the oldest ones (most of the OpenVZ nodes)
