Blogs/200611116 raidcontroller ext3 filesystem error

From Cryptronic

The good old RAID controllers

Aren't they great devices? They take work off the kernel compared with a software RAID, but then they still die quietly and without any warning.

So what do you do as an admin when Apache bails out, you have a load of 40, you search desperately for the cause and then wonder why on earth your root filesystem is mounted read-only? Surely that's yet another disk failure?!? Actually, no: there can't be that many broken disks in the world. So you finally dare to run dmesg, and what do you see?

ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted 
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sda2) in ext3_reserve_inode_write: Journal has aborted 
ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sda2) in ext3_reserve_inode_write: Journal has aborted 
EXT3-fs error (device sda2) in ext3_orphan_del: Journal has aborted 
EXT3-fs error (device sda2) in ext3_truncate: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
__journal_remove_journal_head: freeing b_committed_data 
__journal_remove_journal_head: freeing b_committed_data 
__journal_remove_journal_head: freeing b_committed_data 
__journal_remove_journal_head: freeing b_committed_data 
__journal_remove_journal_head: freeing b_committed_data 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted 
EXT3-fs error (device sda2) in start_transaction: Journal has aborted

Looks kind of nasty :(
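With that many near-identical lines it helps to boil the dmesg output down to "which function complained how often". The following is only a minimal sketch: the function name `summarize_ext3_errors` is made up for this post, the regular expression is fitted to the message format shown above and may not match other kernel versions, and the sample lines are a handful copied from the log above.

```python
import re
from collections import Counter

# Matches the "EXT3-fs error (device X) in FUNCTION: ..." part of a line,
# even when the kernel glued it onto the end of another message.
EXT3_ERROR_RE = re.compile(r"EXT3-fs error \(device (\w+)\) in (\w+):")

def summarize_ext3_errors(lines):
    """Count EXT3-fs errors per (device, kernel function)."""
    counts = Counter()
    for line in lines:
        match = EXT3_ERROR_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts

# A few lines copied from the dmesg output above.
SAMPLE_LINES = [
    "ext3_free_blocks: aborting transaction: Journal has aborted in "
    "__ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in "
    "ext3_free_blocks: Journal has aborted ",
    "EXT3-fs error (device sda2) in ext3_orphan_del: Journal has aborted ",
    "EXT3-fs error (device sda2) in start_transaction: Journal has aborted ",
    "EXT3-fs error (device sda2) in start_transaction: Journal has aborted ",
    "__journal_remove_journal_head: freeing b_committed_data ",
]

if __name__ == "__main__":
    for (device, function), count in summarize_ext3_errors(SAMPLE_LINES).most_common():
        print(f"{device} {function}: {count}")
```

On a real box you would feed it the lines of the actual dmesg output instead of `SAMPLE_LINES`.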

Fortunately there is such a great invention as a rescue system, so I did a nice soft reboot into it.

After an fsck.ext3 /dev/sda2, which did not exactly finish quickly either, my console was flooded once again:

fsck.ext3 /dev/sda2
e2fsck 1.37 (21-Mar-2005)
/dev/sda2: recovering journal
/dev/sda2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix<y>? yes

Inode 3260418 was part of the orphaned inode list.  FIXED.
Inode 6242324 was part of the orphaned inode list.  FIXED.
Inode 6242327 was part of the orphaned inode list.  FIXED.
Inode 6242328 was part of the orphaned inode list.  FIXED.
Inode 6242329 was part of the orphaned inode list.  FIXED.
Inode 6242330 was part of the orphaned inode list.  FIXED.
Inode 6242332 was part of the orphaned inode list.  FIXED.
Inode 6242333 was part of the orphaned inode list.  FIXED.
Inode 6242334 was part of the orphaned inode list.  FIXED.
Inode 6242335 was part of the orphaned inode list.  FIXED.
Inode 6242337 was part of the orphaned inode list.  FIXED.
Inode 6242340 was part of the orphaned inode list.  FIXED.
Inode 6242341 was part of the orphaned inode list.  FIXED.
Inode 6242342 was part of the orphaned inode list.  FIXED.
Inode 6242345 was part of the orphaned inode list.  FIXED.
Inode 6243422 was part of the orphaned inode list.  FIXED.
Inode 6243429 was part of the orphaned inode list.  FIXED.
Inode 6243430 was part of the orphaned inode list.  FIXED.
Inode 6243431 was part of the orphaned inode list.  FIXED.
Inode 6243432 was part of the orphaned inode list.  FIXED.
Inode 6243434 was part of the orphaned inode list.  FIXED.
Inode 6243435 was part of the orphaned inode list.  FIXED.
Inode 6243437 was part of the orphaned inode list.  FIXED.
Inode 6243439 was part of the orphaned inode list.  FIXED.
Inode 6243442 was part of the orphaned inode list.  FIXED.
Inode 6243443 was part of the orphaned inode list.  FIXED.
Inode 8896613 was part of the orphaned inode list.  FIXED.
Inode 8896614 was part of the orphaned inode list.  FIXED.
Inode 8896615 was part of the orphaned inode list.  FIXED.
Inode 8900770 was part of the orphaned inode list.  FIXED.
Inode 8962589 was part of the orphaned inode list.  FIXED.
Inode 8962715 was part of the orphaned inode list.  FIXED.
Inode 8962726 was part of the orphaned inode list.  FIXED.
Inode 8962879 was part of the orphaned inode list.  FIXED.
Inode 8962980 was part of the orphaned inode list.  FIXED.
Inode 8963012 was part of the orphaned inode list.  FIXED.
Inode 8963015 was part of the orphaned inode list.  FIXED.
Inode 8963234 was part of the orphaned inode list.  FIXED.
Deleted inode 8963334 has zero dtime.  Fix<y>? yes

Inode 8963355 was part of the orphaned inode list.  FIXED.
Inode 8963431 was part of the orphaned inode list.  FIXED.
Inode 8963820 was part of the orphaned inode list.  FIXED.
Inode 11256884, i_blocks is 1024, should be 992.  Fix<y>? yes

Inode 11256776, i_size is 7923692, should be 7929856.  Fix<y>? yes

Inode 11256776, i_blocks is 15504, should be 15512.  Fix<y>? yes

Inode 11503941 was part of the orphaned inode list.  FIXED.
Inode 11503941, i_size is 0, should be 8192.  Fix<y>? yes

Inode 12470782, i_blocks is 1488, should be 1480.  Fix<y>? yes


Inode 14893355, i_blocks is 280, should be 272.  Fix<y>? yes

Inode 15304363 was part of the orphaned inode list.  FIXED.
Inode 15304577 was part of the orphaned inode list.  FIXED.
Inode 15304701 was part of the orphaned inode list.  FIXED.
Inode 15304710 was part of the orphaned inode list.  FIXED.
Inode 15304713 was part of the orphaned inode list.  FIXED.
Inode 15304774 was part of the orphaned inode list.  FIXED.
Inode 15304797 was part of the orphaned inode list.  FIXED.
Inode 15304958 was part of the orphaned inode list.  FIXED.
Inode 15305018 was part of the orphaned inode list.  FIXED.
Deleted inode 16285882 has zero dtime.  Fix<y>? yes

Inode 16285906 was part of the orphaned inode list.  FIXED.
Inode 16285914 was part of the orphaned inode list.  FIXED.
Deleted inode 16285930 has zero dtime.  Fix<y>? yes

Inode 16285933 was part of the orphaned inode list.  FIXED.
Inode 16285934 was part of the orphaned inode list.  FIXED.
Inode 16285951 was part of the orphaned inode list.  FIXED.
Inode 16285993 was part of the orphaned inode list.  FIXED.
Inode 16285994 was part of the orphaned inode list.  FIXED.
Inode 16582595 was part of the orphaned inode list.  FIXED.
Inode 17532030, i_blocks is 2152, should be 2128.  Fix<y>? yes

Duplicate blocks found... invoking duplicate block passes.
Pass 1B: Rescan for duplicate/bad blocks
Duplicate/bad block(s) in inode 12485339: 25090021
Duplicate/bad block(s) in inode 12486262: 25090021
Pass 1C: Scan directories for inodes with dup blocks.
Pass 1D: Reconciling duplicate blocks
(There are 2 inodes containing duplicate/bad blocks.)

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 1638515
Connect to /lost+found<y>? yes

Inode 1638515 ref count is 2, should be 1.  Fix<y>? yes

Pass 5: Checking group summary information
Block bitmap differences:  -(133803--133805) -133807 -145638 -(3194544--3194546) -3215761 +4385112 -(6521347--6521606) -(12485197--12485220) -(12485227--12485689) -(12485694--12485738) -(12485744--12485753) -(12485768--12485817) -(12485831--12485833) -(12494858--12494881) -(12494888--12495350) -(12495355--12495384) -(12495389--12495399) -(12495405--12495414) -(12495429--12495471) -(17800032--17800035) -(17926266--17926267) -(17927105--17927108) -(17927124--17927154) -(17927308--17927313) -(17929925--17929933) -(17929983--17929984) -(17929986--17929990) -(17930882--17931183) -(17932844--17932915) -(17934166--17934230) -(17939679--17939681) -(17948679--17948822) -(18014436--18014708) +(23005665--23005671) -(23005697--23005701) -23005704 -(23005706--23005708) +23044918 +23044991 +23044999 +23045108 -23099581 -23101138 -25089913
Fix<y>? yes

Free blocks count wrong for group #4 (18261, counted=18267).
Fix<y>? yes

Free blocks count wrong for group #97 (17964, counted=17967).
Fix<y>? yes

Free blocks count wrong for group #98 (672, counted=673).
Fix<y>? yes

Free blocks count wrong for group #199 (30661, counted=30921).
Fix<y>? yes

Free blocks count wrong for group #381 (22335, counted=23511).
Fix<y>? yes

Free blocks count wrong for group #543 (16218, counted=16222).
Fix<y>? yes

Free blocks count wrong for group #547 (14466, counted=15111).
Fix<y>? yes

Free blocks count wrong for group #549 (28107, counted=28380).
Fix<y>? yes

Free blocks count wrong for group #702 (731, counted=738).
Fix<y>? yes

Free blocks count wrong for group #703 (57, counted=53).
Fix<y>? yes

Free blocks count wrong for group #765 (575, counted=576).
Fix<y>? yes

Free blocks count wrong for group #788 (9887, counted=13447).
Fix<y>? yes

Free blocks count wrong for group #910 (27450, counted=27451).
Fix<y>? yes

Free blocks count wrong (27445041, counted=27450974).
Fix<y>? yes

Inode bitmap differences:  -3260418 -6242324 -(6242327--6242330) -(6242332--6242335) -6242337 -(6242340--6242342) -6242345 -6243422 -(6243429--6243432) -(6243434--6243435) -6243437 -6243439 -(6243442--6243443) -(8896613--8896615) -8900770 -8962589 -8962715 -8962726 -8962879 -8962980 -8963012 -8963015 -8963234 -8963334 -8963355 -8963431 -8963820 -15304363 -15304577 -15304701 -15304710 -15304713 -15304774 -15304797 -15304958 -15305018 -16285882 -16285906 -16285914 -16285930 -(16285933--16285934) -16285951 -(16285993--16285994)
Fix<y>? yes

Free inodes count wrong for group #199 (16377, counted=16378).
Fix<y>? yes

Free inodes count wrong for group #381 (15236, counted=15261).
Fix<y>? yes

Free inodes count wrong for group #543 (11880, counted=11884).
Fix<y>? yes

Free inodes count wrong for group #547 (12170, counted=12182).
Fix<y>? yes

Free inodes count wrong for group #934 (13762, counted=13771).
Fix<y>? yes

Free inodes count wrong for group #994 (12551, counted=12560).
Fix<y>? yes

Free inodes count wrong (18323865, counted=18323925).
Fix<y>? yes


/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks

No, I really did not get bored pressing y.
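For what it's worth, the y marathon can be avoided: according to the e2fsck man page, `-y` answers "yes" to every question, `-n` opens the filesystem read-only and answers "no", and `-f` forces a check. Below is a small sketch that only assembles such a command line. The helper `build_e2fsck_command` is a hypothetical name, not part of any tool, and nothing here actually runs e2fsck.

```python
import shlex

def build_e2fsck_command(device, assume_yes=True):
    """Return an argv list for a forced e2fsck run on `device`.

    assume_yes=True  -> add -y (answer "yes" to every question)
    assume_yes=False -> add -n (open read-only, answer "no" to everything)
    """
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a device path under /dev, got {device!r}")
    answer_flag = "-y" if assume_yes else "-n"
    # -f forces a check even if the filesystem is marked clean.
    return ["e2fsck", "-f", answer_flag, device]

if __name__ == "__main__":
    # Only prints the command; actually running it belongs in the rescue
    # system, on an unmounted filesystem.
    print(shlex.join(build_e2fsck_command("/dev/sda2")))
```

Running with `assume_yes=False` first gives a look at the damage without changing anything on disk.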

So after the check everything looked fine again, or so I thought, and I was wrong.

So I booted back into the normal system, and everything ran really well for a whole 5 minutes. This was getting too much for me. That simply could not be. Since the errors in dmesg looked exactly the same, I had had enough and started thinking about a possible hardware defect.
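To notice the read-only remount before Apache falls over, one could check the mount options directly. A rough sketch, assuming the usual `/proc/mounts` layout (device, mount point, filesystem type, comma-separated options, two numbers); the function name `read_only_mounts` and the sample text are made up for illustration and are not taken from the affected machine.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        if "ro" in options.split(","):
            result.append(mount_point)
    return result

# Illustrative sample in /proc/mounts format (not real data from this server).
SAMPLE_MOUNTS = """\
/dev/sda2 / ext3 ro,data=ordered 0 0
proc /proc proc rw 0 0
/dev/sda1 /boot ext3 rw 0 0
"""

if __name__ == "__main__":
    print(read_only_mounts(SAMPLE_MOUNTS))
```

On a live system the input would come from reading `/proc/mounts` instead of `SAMPLE_MOUNTS`.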

But as readers here know, I am more than familiar with disk failures, and this did not look like one! So back into the rescue system, ran dear old e2fsck over it again, and what do you see?

e2fsck 1.37 (21-Mar-2005)
/dev/sda2 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found.  Fix<y>? yes

Inode 11503553 was part of the orphaned inode list.  FIXED.

Inode 15304926 was part of the orphaned inode list.  FIXED.
Deleted inode 16285914 has zero dtime.  Fix<y>? yes

Inode 16285919 was part of the orphaned inode list.  FIXED.
Inode 16285929 was part of the orphaned inode list.  FIXED.
Inode 16285930 was part of the orphaned inode list.  FIXED.
Inode 16285933 was part of the orphaned inode list.  FIXED.
Inode 16285934 was part of the orphaned inode list.  FIXED.
Inode 16285950 was part of the orphaned inode list.  FIXED.
Inode 16285951 was part of the orphaned inode list.  FIXED.
Duplicate blocks found... invoking duplicate block passes.
Pass 1B: Rescan for duplicate/bad blocks
Duplicate/bad block(s) in inode 11454293: 23044918 23044991 23044999 23045108
Duplicate/bad block(s) in inode 11503780: 23044918 23044991 23044999 23045108
Pass 1C: Scan directories for inodes with dup blocks.
Pass 1D: Reconciling duplicate blocks
(There are 2 inodes containing duplicate/bad blocks.)

Pass 2: Checking directory structure
Entry 'BC537E9898A' in /var/spool/postfix/incoming/B (16007359) has deleted/unused inode 15305098.  Clear<y>? yes

Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 9502745
Connect to /lost+found<y>? yes

Inode 9502745 ref count is 2, should be 1.  Fix<y>? yes

Inode 11454278 ref count is 2, should be 3.  Fix<y>? yes

Pass 5: Checking group summary information

Block bitmap differences:  +(23005662--23005663) +(23005665--23005673) +23005675 +23005687 +(23005689--23005690) +23005693 +(23005696--23005701) +23005704 +(23005706--23005710) +23005712 +(23005714--23005716) +23005722 +(23005724--23005730) +23005732 +23005734 +23005747 -(23005830--23005832) -23005835 -(23006017--23006018) -23006618 -23006708 -23006710 +23010335 +23010346 -(23010384--23010385) -(23010387--23010388) -(23010390--23010392) -23010394 -23010428 +(23010621--23010623) +(23010630--23010631) +(23010633--23010635) +(23010637--23010640) +23010644 +(23010646--23010651) +23010653 +(23010655--23010661) +(23010663--23010666) +(23010668--23010669) +23010681 +(23010683--23010684) +(23010689--23010690) +23010692 +(23010698--23010699) +(23010701--23010706) +23010714 +23010719 +23010722 +23010726 +(23010732--23010733) +23010739 +(23010745--23010746) +(23010749--23010750) +23010753 +(23010758--23010760) +(23010834--23010843) +(23010853--23010854) +(23010879--23010881) +23010883 +(23012018--23012022) +23012133 +23012136 +23012292 +23012358 +23012360 +23012392 +(23012407--23012408) +23012410 +23012413 +(23012598--23012606) +23012753 +23012810 +23012822 +23012826 +23012830 +23012855 +(23013398--23013401) +(23013430--23013433) +23013435 +(23013456--23013457) +23014012 +23014014 +23015408 +23015465 +23015468 +23015470 +23015715 +23015724 +23015727 +23015733 +23015739 +(23015745--23015751) +23015754 +(23015758--23015760) +(23015762--23015763) +(23015775--23015779) +23015781 +(23015784--23015786) +(23015818--23015825) +(23015827--23015828) +(23015830--23015831) +(23015834--23015835) +23015841 +(23015843--23015847) +23015858 +23015860 +(23015877--23015882) +(23015906--23015909) +(23015919--23015922) +(23015937--23015939) +23015941 +(23015944--23015946) +23015948 +(23015950--23015953) +23015961 +23015978 +23015985 +23016067 +(23016071--23016072) +23016083 +23016103 +23016105 +23016115 +(23016121--23016125) +23017808 +23017818 +(23017827--23017828) +23018297 +23018308 
+(23018788--23018793) +23018843 +23018848 +23018871 +(23018883--23018887) +23018889 +(23018897--23018898) +(23018903--23018904) +(23018944--23018947) +(23018949--23018950) +(23018962--23018967) +(23019004--23019006) +(23019011--23019012) +23044179 +23044918 +23044991 +23044999 +23045108 +23075015 +23075024 +23075026 -23099623
Fix<y>? yes

Free blocks count wrong for group #4 (18218, counted=18214).
Fix<y>? yes

Free blocks count wrong for group #702 (506, counted=498).
Fix<y>? yes

Free blocks count wrong for group #703 (0, counted=4).
Fix<y>? yes

Free blocks count wrong for group #704 (931, counted=939).
Fix<y>? yes

Free blocks count wrong for group #784 (28785, counted=28784).
Fix<y>? yes

Free blocks count wrong for group #789 (30068, counted=32254).
Fix<y>? yes

Free blocks count wrong (27582883, counted=27585068).
Fix<y>? yes

yyyyyyyyInode bitmap differences:  -15304926 -16285914 -16285919 -(16285929--16285930) -(16285933--16285934)y -(16285950--16285951)
Fix<y>? yes

yyyyyyyyyFree inodes count wrong for group #784 (13993, counted=13992).
Fix<y>? yes

Directories count wrong for group #784 (1064, counted=1065).
Fix<y>? yes

Free inodes count wrong for group #934 (13805, counted=13806).
Fix<y>? yes

Free inodes count wrong for group #994 (12554, counted=12562).
Fix<y>? yes

Free inodes count wrong (18323908, counted=18323916).
Fix<y>? yes


/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 960052/19283968 files (5.8% non-contiguous), 10962899/38547967 blocks

The same problem again...

But what stands out? First run:

/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks

Second run:

/dev/sda2: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda2: 960052/19283968 files (5.8% non-contiguous), 10962899/38547967 blocks

That looks dangerously similar. By now I had really lost all patience... so I picked up the phone and called the data center. They swapped the controller right away and everything worked again...
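To put numbers on the comparison of the two e2fsck summary lines, here is a small sketch that parses them and prints the differences. The names `FsckSummary` and `parse_summary` are invented for this example, and the regular expression is fitted to the e2fsck 1.37 output shown in this post, so other versions may need adjustments.

```python
import re
from typing import NamedTuple

class FsckSummary(NamedTuple):
    device: str
    files_used: int
    files_total: int
    blocks_used: int
    blocks_total: int

# Fitted to lines such as:
# /dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks
SUMMARY_RE = re.compile(
    r"^(\S+): (\d+)/(\d+) files \([\d.]+% non-contiguous\), (\d+)/(\d+) blocks"
)

def parse_summary(line):
    """Parse the final summary line printed by e2fsck."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an e2fsck summary line: {line!r}")
    device, files_used, files_total, blocks_used, blocks_total = match.groups()
    return FsckSummary(device, int(files_used), int(files_total),
                       int(blocks_used), int(blocks_total))

# The two summary lines from the runs above.
FIRST = "/dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks"
SECOND = "/dev/sda2: 960052/19283968 files (5.8% non-contiguous), 10962899/38547967 blocks"

if __name__ == "__main__":
    first, second = parse_summary(FIRST), parse_summary(SECOND)
    print("files used, second minus first: ", second.files_used - first.files_used)
    print("blocks used, second minus first:", second.blocks_used - first.blocks_used)
```

The totals are identical in both runs and the file count moved by only 9, while the number of used blocks dropped by 134094 between the two checks.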

HAPPY END!!!

The odd thing about the whole affair is that at midnight this morning the controller still believed it was working perfectly...

So why use expensive controllers when you can have a software RAID for free? I would not spend money on one again!!

--crypt 23:27, 16. Nov 2006 (CET)

Comments

I was just wondering the same thing ;-)

Well, that's how it goes :-( The main thing is that it's working again.
