Opened 8 years ago

Closed 7 years ago

#3883 closed Bug/Something is broken (fixed)

theyesmen.org is down

Reported by: vamosi@…
Owned by: Jamie McClelland
Priority: Urgent
Component: Membership/Dues
Keywords: website down
Cc: fluxdepot@…
Sensitive: no

Description

The theyesmen.org website is down, and email is not working. traceroute and ping work, but ssh and http do not. Help! Thanks.
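The symptom described above (host answers ping, but ssh and http do not connect) can be re-checked with a small script. This is only a minimal sketch: the function names `port_open` and `check_services` are made up for illustration, and the port numbers a caller passes in (22 for ssh and 80 for http are the conventional defaults) are an assumption, since the ticket does not say which ports the services listen on.

```python
import socket
from typing import Dict


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host: str, ports: Dict[str, int],
                   timeout: float = 3.0) -> Dict[str, bool]:
    """Map each service name to whether its TCP port accepted a connection."""
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_services("theyesmen.org", {"ssh": 22, "http": 80})` would return a dictionary with `False` for each service that does not accept connections. A host that answers ping but returns `False` for every service is reachable at the network level while its userland is stuck, which matches the report.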

Change History (4)

comment:1 Changed 8 years ago by Jamie McClelland

Hi Vamosi,

I logged in via the console and saw messages along the lines of:

[10649588.307430] INFO: task apache2:830 blocked for more than 120 seconds.
[10649588.315382] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I then attempted a magic SysRq restart (the "skinny elephants" recovery, i.e. the REISUB key sequence).
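For anyone unfamiliar with the mnemonic ("Raising Skinny Elephants Is Utterly Boring"), the sequence is roughly the one sketched below. This is an illustration only, not what was typed on the console: `sysrq_sequence` and `dry_run_writer` are hypothetical names, and the sketch deliberately only prints what it would send. On a live Linux system a real writer could, as root, write each key to /proc/sysrq-trigger, but that is an assumption about the setup and is not done here.

```python
from typing import Callable, List, Tuple

# Keys of the REISUB sequence, in order, with what each one asks the kernel to do.
REISUB: List[Tuple[str, str]] = [
    ("r", "take the keyboard out of raw mode"),
    ("e", "send SIGTERM to all processes except init"),
    ("i", "send SIGKILL to all processes except init"),
    ("s", "sync all mounted filesystems"),
    ("u", "remount all filesystems read-only"),
    ("b", "reboot immediately"),
]


def dry_run_writer(key: str) -> None:
    """Hypothetical writer that only reports the key instead of sending it."""
    print(f"would send SysRq-{key}")


def sysrq_sequence(write: Callable[[str], None] = dry_run_writer,
                   pause: Callable[[], None] = lambda: None) -> List[str]:
    """Send each REISUB key through `write`, calling `pause` between keys."""
    sent = []
    for key, _meaning in REISUB:
        write(key)
        sent.append(key)
        pause()  # in practice a few seconds, so each step can finish
    return sent
```

The sync and remount steps come before the reboot so that pending writes reach the disk first, which is also why a disk that has gone offline can stall the sequence before the final step.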

The console reported the normal responses to each command, but the server failed to reboot on the last command, and the console instead reported:

[10650048.084333] mptscsih: ioc0: host reset: SUCCESS (sc=ffff81011c55a6c0)
[10650048.092331] sd 0:1:0:0: Device offlined - not ready after error recovery
[10650048.100298] sd 0:1:0:0: Device offlined - not ready after error recovery
[10650048.108285] sd 0:1:0:0: Device offlined - not ready after error recovery
[10650048.115010] sd 0:1:0:0: rejecting I/O to offline device
[10650048.121961] Buffer I/O error on device dm-0, logical block 560058
[10650048.121961] lost page write due to I/O error on dm-0
[10650048.137311] sd 0:1:0:0: rejecting I/O to offline device
[10650048.142781] Buffer I/O error on device dm-0, logical block 843633
[10650048.145309] lost page write due to I/O error on dm-0
[10650048.155826] sd 0:1:0:0: rejecting I/O to offline device
[10650048.161299] Buffer I/O error on device dm-0, logical block 843632

This looks like a possible disk failure.
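To make the pattern in the console output easier to see, the messages can be tallied per device. The following is a rough sketch, assuming the log lines have been saved as text; `summarize_io_errors` is a hypothetical helper, and the regular expressions only cover the message shapes quoted in this ticket, not kernel messages in general.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List

# Matches the two "sd H:C:T:L: ..." messages that indicate an offlined device.
OFFLINE_RE = re.compile(
    r"sd (\d+:\d+:\d+:\d+): (?:Device offlined|rejecting I/O to offline device)"
)
# Matches failed buffer writes and captures the device name and block number.
BUFFER_RE = re.compile(r"Buffer I/O error on device (\S+), logical block (\d+)")


def summarize_io_errors(lines: Iterable[str]) -> Dict[str, dict]:
    """Count offline-device events and collect failed logical blocks per device."""
    offline: Counter = Counter()
    failed_blocks: Dict[str, List[int]] = {}
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            offline[match.group(1)] += 1
            continue
        match = BUFFER_RE.search(line)
        if match:
            failed_blocks.setdefault(match.group(1), []).append(int(match.group(2)))
    return {"offline_events": dict(offline), "failed_blocks": failed_blocks}


# A few of the lines quoted in this ticket, used as sample input.
SAMPLE = """\
[10650048.092331] sd 0:1:0:0: Device offlined - not ready after error recovery
[10650048.115010] sd 0:1:0:0: rejecting I/O to offline device
[10650048.121961] Buffer I/O error on device dm-0, logical block 560058
[10650048.137311] sd 0:1:0:0: rejecting I/O to offline device
[10650048.142781] Buffer I/O error on device dm-0, logical block 843633
""".splitlines()

print(summarize_io_errors(SAMPLE))
```

Every offline event here refers to the same SCSI device (0:1:0:0), and the failed writes all land on dm-0, which is consistent with a single underlying disk dropping out rather than scattered filesystem problems.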

Then it stopped responding to magic SysRq commands. I'm still working on it...

jamie

comment:2 Changed 8 years ago by Jamie McClelland

I'm going to try a hard reset.

jamie

comment:3 Changed 8 years ago by Jamie McClelland

The server has restarted. Since goofball/kesey is not an MFPL-administered server, I would suggest sending a link to this ticket to your sysadmin; we can help troubleshoot. I don't see any disk errors in the syslog, but it might be a good idea to run a more comprehensive disk check.

jamie

comment:4 Changed 7 years ago by Jamie McClelland

Resolution: fixed
Status: new → closed

This is an ancient ticket... closing since the problem is resolved...

jamie
