Blogs/200611116 raidcontroller ext3 filesystem error
Those good old RAID controllers
Aren't they wonderful devices? They take work off the kernel compared with a software RAID, but then they go and die on the quiet, without any warning.
So what do you do as an admin when apache bails out, you've got a load of 40, you're desperately hunting for the error and then find yourself wondering what on earth your root filesystem is doing in read-only mode? Surely that's yet another hard disk failure?!? Well, actually not, there can't be that many broken hard disks in the world. So you finally work up the nerve to type dmesg, and lo and behold?
ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: 
aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_free_blocks: aborting transaction: Journal has aborted in __ext3_journal_get_undo_access<2>EXT3-fs error (device sda2) in ext3_free_blocks: Journal has aborted ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sda2) in ext3_reserve_inode_write: Journal has aborted ext3_reserve_inode_write: aborting transaction: Journal has aborted in __ext3_journal_get_write_access<2>EXT3-fs error (device sda2) in ext3_reserve_inode_write: Journal has aborted EXT3-fs error (device sda2) in ext3_orphan_del: Journal has aborted EXT3-fs error (device sda2) in ext3_truncate: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted __journal_remove_journal_head: freeing b_committed_data __journal_remove_journal_head: freeing b_committed_data __journal_remove_journal_head: freeing b_committed_data __journal_remove_journal_head: freeing b_committed_data __journal_remove_journal_head: freeing b_committed_data EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted EXT3-fs error (device sda2) in start_transaction: Journal has aborted
Looks kind of nasty :(
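When the console is flooded like this, a per-function tally of the ext3 errors is easier to read than the raw stream. The following is only a minimal sketch, not something used at the time: the function name `count_ext3_errors` is made up for illustration, and the pattern assumes the kernel messages have exactly the `EXT3-fs error (device ...) in ...:` shape seen in the output above.

```python
import re
import subprocess
from collections import Counter

# Matches the "EXT3-fs error (device sda2) in start_transaction:" shape
# seen in the dmesg output; other kernel versions may format this differently.
EXT3_ERROR = re.compile(r"EXT3-fs error \(device (\w+)\) in (\w+):")


def count_ext3_errors(log_text):
    """Count EXT3-fs errors per (device, kernel function) pair."""
    return Counter(EXT3_ERROR.findall(log_text))


if __name__ == "__main__":
    # Reading the kernel ring buffer may need root on some systems.
    output = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout
    for (device, function), hits in count_ext3_errors(output).most_common():
        print(f"{device}  {function}  {hits}")
```

If every hit names the same device and the reason is always "Journal has aborted", the flood is most likely one underlying failure repeated many times rather than many separate problems.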
But luckily there's such a great invention as a rescue system, so I did a nice soft reboot.
After an fsck.ext3 /dev/sda2, which didn't exactly go quickly either, my console was flooded all over again:
fsck.ext3 /dev/sda2 e2fsck 1.37 (21-Mar-2005) /dev/sda2: recovering journal /dev/sda2 contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes Inode 3260418 was part of the orphaned inode list. FIXED. Inode 6242324 was part of the orphaned inode list. FIXED. Inode 6242327 was part of the orphaned inode list. FIXED. Inode 6242328 was part of the orphaned inode list. FIXED. Inode 6242329 was part of the orphaned inode list. FIXED. Inode 6242330 was part of the orphaned inode list. FIXED. Inode 6242332 was part of the orphaned inode list. FIXED. Inode 6242333 was part of the orphaned inode list. FIXED. Inode 6242334 was part of the orphaned inode list. FIXED. Inode 6242335 was part of the orphaned inode list. FIXED. Inode 6242337 was part of the orphaned inode list. FIXED. Inode 6242340 was part of the orphaned inode list. FIXED. Inode 6242341 was part of the orphaned inode list. FIXED. Inode 6242342 was part of the orphaned inode list. FIXED. Inode 6242345 was part of the orphaned inode list. FIXED. Inode 6243422 was part of the orphaned inode list. FIXED. Inode 6243429 was part of the orphaned inode list. FIXED. Inode 6243430 was part of the orphaned inode list. FIXED. Inode 6243431 was part of the orphaned inode list. FIXED. Inode 6243432 was part of the orphaned inode list. FIXED. Inode 6243434 was part of the orphaned inode list. FIXED. Inode 6243435 was part of the orphaned inode list. FIXED. Inode 6243437 was part of the orphaned inode list. FIXED. Inode 6243439 was part of the orphaned inode list. FIXED. Inode 6243442 was part of the orphaned inode list. FIXED. Inode 6243443 was part of the orphaned inode list. FIXED. Inode 8896613 was part of the orphaned inode list. FIXED. Inode 8896614 was part of the orphaned inode list. FIXED. Inode 8896615 was part of the orphaned inode list. FIXED. Inode 8900770 was part of the orphaned inode list. FIXED. 
Inode 8962589 was part of the orphaned inode list. FIXED. Inode 8962715 was part of the orphaned inode list. FIXED. Inode 8962726 was part of the orphaned inode list. FIXED. Inode 8962879 was part of the orphaned inode list. FIXED. Inode 8962980 was part of the orphaned inode list. FIXED. Inode 8963012 was part of the orphaned inode list. FIXED. Inode 8963015 was part of the orphaned inode list. FIXED. Inode 8963234 was part of the orphaned inode list. FIXED. Deleted inode 8963334 has zero dtime. Fix<y>? yes Inode 8963355 was part of the orphaned inode list. FIXED. Inode 8963431 was part of the orphaned inode list. FIXED. Inode 8963820 was part of the orphaned inode list. FIXED. Inode 11256884, i_blocks is 1024, should be 992. Fix<y>? yes Inode 11256776, i_size is 7923692, should be 7929856. Fix<y>? yes Inode 11256776, i_blocks is 15504, should be 15512. Fix<y>? yes Inode 11503941 was part of the orphaned inode list. FIXED. Inode 11503941, i_size is 0, should be 8192. Fix<y>? yes Inode 12470782, i_blocks is 1488, should be 1480. Fix<y>? yes Inode 14893355, i_blocks is 280, should be 272. Fix<y>? yes Inode 15304363 was part of the orphaned inode list. FIXED. Inode 15304577 was part of the orphaned inode list. FIXED. Inode 15304701 was part of the orphaned inode list. FIXED. Inode 15304710 was part of the orphaned inode list. FIXED. Inode 15304713 was part of the orphaned inode list. FIXED. Inode 15304774 was part of the orphaned inode list. FIXED. Inode 15304797 was part of the orphaned inode list. FIXED. Inode 15304958 was part of the orphaned inode list. FIXED. Inode 15305018 was part of the orphaned inode list. FIXED. Deleted inode 16285882 has zero dtime. Fix<y>? yes Inode 16285906 was part of the orphaned inode list. FIXED. Inode 16285914 was part of the orphaned inode list. FIXED. Deleted inode 16285930 has zero dtime. Fix<y>? yes Inode 16285933 was part of the orphaned inode list. FIXED. Inode 16285934 was part of the orphaned inode list. FIXED. 
Inode 16285951 was part of the orphaned inode list. FIXED. Inode 16285993 was part of the orphaned inode list. FIXED. Inode 16285994 was part of the orphaned inode list. FIXED. Inode 16582595 was part of the orphaned inode list. FIXED. Inode 17532030, i_blocks is 2152, should be 2128. Fix<y>? yes Duplicate blocks found... invoking duplicate block passes. Pass 1B: Rescan for duplicate/bad blocks Duplicate/bad block(s) in inode 12485339: 25090021 Duplicate/bad block(s) in inode 12486262: 25090021 Pass 1C: Scan directories for inodes with dup blocks. Pass 1D: Reconciling duplicate blocks (There are 2 inodes containing duplicate/bad blocks.) Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Unattached inode 1638515 Connect to /lost+found<y>? yes Inode 1638515 ref count is 2, should be 1. Fix<y>? yes Pass 5: Checking group summary information Block bitmap differences: -(133803--133805) -133807 -145638 -(3194544--3194546) -3215761 +4385112 -(6521347--6521606) -(12485197--12485220) -(12485227--12485689) -(12485694--12485738) -(12485744--12485753) -(12485768--12485817) -(12485831--12485833) -(12494858--12494881) -(12494888--12495350) -(12495355--12495384) -(12495389--12495399) -(12495405--12495414) -(12495429--12495471) -(17800032--17800035) -(17926266--17926267) -(17927105--17927108) -(17927124--17927154) -(17927308--17927313) -(17929925--17929933) -(17929983--17929984) -(17929986--17929990) -(17930882--17931183) -(17932844--17932915) -(17934166--17934230) -(17939679--17939681) -(17948679--17948822) -(18014436--18014708) +(23005665--23005671) -(23005697--23005701) -23005704 -(23005706--23005708) +23044918 +23044991 +23044999 +23045108 -23099581 -23101138 -25089913 Fix<y>? yes Free blocks count wrong for group #4 (18261, counted=18267). Fix<y>? yes Free blocks count wrong for group #97 (17964, counted=17967). Fix<y>? yes Free blocks count wrong for group #98 (672, counted=673). Fix<y>? 
yes Free blocks count wrong for group #199 (30661, counted=30921). Fix<y>? yes Free blocks count wrong for group #381 (22335, counted=23511). Fix<y>? yes Free blocks count wrong for group #543 (16218, counted=16222). Fix<y>? yes Free blocks count wrong for group #547 (14466, counted=15111). Fix<y>? yes Free blocks count wrong for group #549 (28107, counted=28380). Fix<y>? yes Free blocks count wrong for group #702 (731, counted=738). Fix<y>? yes Free blocks count wrong for group #703 (57, counted=53). Fix<y>? yes Free blocks count wrong for group #765 (575, counted=576). Fix<y>? yes Free blocks count wrong for group #788 (9887, counted=13447). Fix<y>? yes Free blocks count wrong for group #910 (27450, counted=27451). Fix<y>? yes Free blocks count wrong (27445041, counted=27450974). Fix<y>? yes Inode bitmap differences: -3260418 -6242324 -(6242327--6242330) -(6242332--6242335) -6242337 -(6242340--6242342) -6242345 -6243422 -(6243429--6243432) -(6243434--6243435) -6243437 -6243439 -(6243442--6243443) -(8896613--8896615) -8900770 -8962589 -8962715 -8962726 -8962879 -8962980 -8963012 -8963015 -8963234 -8963334 -8963355 -8963431 -8963820 -15304363 -15304577 -15304701 -15304710 -15304713 -15304774 -15304797 -15304958 -15305018 -16285882 -16285906 -16285914 -16285930 -(16285933--16285934) -16285951 -(16285993--16285994) Fix<y>? yes Free inodes count wrong for group #199 (16377, counted=16378). Fix<y>? yes Free inodes count wrong for group #381 (15236, counted=15261). Fix<y>? yes Free inodes count wrong for group #543 (11880, counted=11884). Fix<y>? yes Free inodes count wrong for group #547 (12170, counted=12182). Fix<y>? yes Free inodes count wrong for group #934 (13762, counted=13771). Fix<y>? yes Free inodes count wrong for group #994 (12551, counted=12560). Fix<y>? yes Free inodes count wrong (18323865, counted=18323925). Fix<y>? yes /dev/sda2: ***** FILE SYSTEM WAS MODIFIED ***** /dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks
No, I truly did not get bored pressing y.
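For what it's worth, e2fsck has a `-y` option that answers "yes" to every question, and its exit status is a bit mask that says what happened. The run above was done interactively; the sketch below is just one way to script it from a rescue system. The helper names (`build_fsck_command`, `describe_exit_status`, `run_fsck`) are hypothetical, and the status texts follow the e2fsck man page.

```python
import subprocess

# Exit-status bits as listed in the e2fsck man page.
EXIT_BITS = {
    1: "file system errors corrected",
    2: "system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared library error",
}


def build_fsck_command(device):
    """Build an e2fsck call: -f forces a check, -y answers yes to everything."""
    return ["e2fsck", "-f", "-y", device]


def describe_exit_status(status):
    """Translate the e2fsck exit-status bit mask into readable text."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in EXIT_BITS.items() if status & bit]


def run_fsck(device):
    """Run e2fsck on an UNMOUNTED device (e.g. from a rescue system)."""
    result = subprocess.run(build_fsck_command(device))
    return describe_exit_status(result.returncode)
```

Note that `-y` also accepts every destructive fix without asking, so it is a trade-off between sore fingers and control over what gets changed.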
So after the check things actually looked fine again, or so I thought, and I was wrong.
So I booted back into the normal system and the whole thing ran really great, for all of 5 minutes. This was really getting too much for me. That simply can't be. Since the errors in dmesg looked exactly the same, I'd had enough and started thinking about a possible hardware defect.
But as readers here know, I'm more than familiar with hard disk errors, and this did not look like a hard disk error! So back into the rescue system once more, let dear old e2fsck run over it, and lo and behold?
e2fsck 1.37 (21-Mar-2005) /dev/sda2 contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes Inode 11503553 was part of the orphaned inode list. FIXED. Inode 15304926 was part of the orphaned inode list. FIXED. Deleted inode 16285914 has zero dtime. Fix<y>? yes Inode 16285919 was part of the orphaned inode list. FIXED. Inode 16285929 was part of the orphaned inode list. FIXED. Inode 16285930 was part of the orphaned inode list. FIXED. Inode 16285933 was part of the orphaned inode list. FIXED. Inode 16285934 was part of the orphaned inode list. FIXED. Inode 16285950 was part of the orphaned inode list. FIXED. Inode 16285951 was part of the orphaned inode list. FIXED. Duplicate blocks found... invoking duplicate block passes. Pass 1B: Rescan for duplicate/bad blocks Duplicate/bad block(s) in inode 11454293: 23044918 23044991 23044999 23045108 Duplicate/bad block(s) in inode 11503780: 23044918 23044991 23044999 23045108 Pass 1C: Scan directories for inodes with dup blocks. Pass 1D: Reconciling duplicate blocks (There are 2 inodes containing duplicate/bad blocks.) Pass 2: Checking directory structure Entry 'BC537E9898A' in /var/spool/postfix/incoming/B (16007359) has deleted/unused inode 15305098. Clear<y>? yes Pass 3: Checking directory connectivity Pass 4: Checking reference counts Unattached inode 9502745 Connect to /lost+found<y>? yes Inode 9502745 ref count is 2, should be 1. Fix<y>? yes Inode 11454278 ref count is 2, should be 3. Fix<y>? 
yes Pass 5: Checking group summary information Block bitmap differences: +(23005662--23005663) +(23005665--23005673) +23005675 +23005687 +(23005689--23005690) +23005693 +(23005696--23005701) +23005704 +(23005706--23005710) +23005712 +(23005714--23005716) +23005722 +(23005724--23005730) +23005732 +23005734 +23005747 -(23005830--23005832) -23005835 -(23006017--23006018) -23006618 -23006708 -23006710 +23010335 +23010346 -(23010384--23010385) -(23010387--23010388) -(23010390--23010392) -23010394 -23010428 +(23010621--23010623) +(23010630--23010631) +(23010633--23010635) +(23010637--23010640) +23010644 +(23010646--23010651) +23010653 +(23010655--23010661) +(23010663--23010666) +(23010668--23010669) +23010681 +(23010683--23010684) +(23010689--23010690) +23010692 +(23010698--23010699) +(23010701--23010706) +23010714 +23010719 +23010722 +23010726 +(23010732--23010733) +23010739 +(23010745--23010746) +(23010749--23010750) +23010753 +(23010758--23010760) +(23010834--23010843) +(23010853--23010854) +(23010879--23010881) +23010883 +(23012018--23012022) +23012133 +23012136 +23012292 +23012358 +23012360 +23012392 +(23012407--23012408) +23012410 +23012413 +(23012598--23012606) +23012753 +23012810 +23012822 +23012826 +23012830 +23012855 +(23013398--23013401) +(23013430--23013433) +23013435 +(23013456--23013457) +23014012 +23014014 +23015408 +23015465 +23015468 +23015470 +23015715 +23015724 +23015727 +23015733 +23015739 +(23015745--23015751) +23015754 +(23015758--23015760) +(23015762--23015763) +(23015775--23015779) +23015781 +(23015784--23015786) +(23015818--23015825) +(23015827--23015828) +(23015830--23015831) +(23015834--23015835) +23015841 +(23015843--23015847) +23015858 +23015860 +(23015877--23015882) +(23015906--23015909) +(23015919--23015922) +(23015937--23015939) +23015941 +(23015944--23015946) +23015948 +(23015950--23015953) +23015961 +23015978 +23015985 +23016067 +(23016071--23016072) +23016083 +23016103 +23016105 +23016115 +(23016121--23016125) +23017808 +23017818 
+(23017827--23017828) +23018297 +23018308 +(23018788--23018793) +23018843 +23018848 +23018871 +(23018883--23018887) +23018889 +(23018897--23018898) +(23018903--23018904) +(23018944--23018947) +(23018949--23018950) +(23018962--23018967) +(23019004--23019006) +(23019011--23019012) +23044179 +23044918 +23044991 +23044999 +23045108 +23075015 +23075024 +23075026 -23099623 Fix<y>? yes Free blocks count wrong for group #4 (18218, counted=18214). Fix<y>? yes Free blocks count wrong for group #702 (506, counted=498). Fix<y>? yes Free blocks count wrong for group #703 (0, counted=4). Fix<y>? yes Free blocks count wrong for group #704 (931, counted=939). Fix<y>? yes Free blocks count wrong for group #784 (28785, counted=28784). Fix<y>? yes Free blocks count wrong for group #789 (30068, counted=32254). Fix<y>? yes Free blocks count wrong (27582883, counted=27585068). Fix<y>? yes yyyyyyyyInode bitmap differences: -15304926 -16285914 -16285919 -(16285929--16285930) -(16285933--16285934)y -(16285950--16285951) Fix<y>? yes yyyyyyyyyFree inodes count wrong for group #784 (13993, counted=13992). Fix<y>? yes Directories count wrong for group #784 (1064, counted=1065). Fix<y>? yes Free inodes count wrong for group #934 (13805, counted=13806). Fix<y>? yes Free inodes count wrong for group #994 (12554, counted=12562). Fix<y>? yes Free inodes count wrong (18323908, counted=18323916). Fix<y>? yes /dev/sda2: ***** FILE SYSTEM WAS MODIFIED ***** /dev/sda2: 960052/19283968 files (5.8% non-contiguous), 10962899/38547967 blocks
The same problem again...
But what stands out? First run:
/dev/sda2: ***** FILE SYSTEM WAS MODIFIED ***** /dev/sda2: 960043/19283968 files (5.8% non-contiguous), 11096993/38547967 blocks
Second run:
/dev/sda2: ***** FILE SYSTEM WAS MODIFIED ***** /dev/sda2: 960052/19283968 files (5.8% non-contiguous), 10962899/38547967 blocks
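The two summary lines can also be compared by number rather than by eye. This is a minimal sketch under the assumption that e2fsck prints its summary in exactly the format shown above; `parse_summary` and `summary_delta` are names invented for this example.

```python
import re

# Format of the last line e2fsck 1.37 printed in the runs above.
SUMMARY = re.compile(
    r"(\d+)/(\d+) files \(([\d.]+)% non-contiguous\), (\d+)/(\d+) blocks"
)


def parse_summary(line):
    """Extract the file and block counters from an e2fsck summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not an e2fsck summary line: " + line)
    files_used, files_total, _pct, blocks_used, blocks_total = match.groups()
    return {
        "files_used": int(files_used),
        "files_total": int(files_total),
        "blocks_used": int(blocks_used),
        "blocks_total": int(blocks_total),
    }


def summary_delta(first, second):
    """Counter changes between two runs (second minus first)."""
    return {key: second[key] - first[key] for key in first}


first = parse_summary(
    "/dev/sda2: 960043/19283968 files (5.8% non-contiguous), "
    "11096993/38547967 blocks"
)
second = parse_summary(
    "/dev/sda2: 960052/19283968 files (5.8% non-contiguous), "
    "10962899/38547967 blocks"
)
print(summary_delta(first, second))
```

For the two runs above this comes out as 9 more files in use and 134094 fewer blocks in use, with identical totals.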
That looks dangerously similar. By now I'd really lost all patience... so I grabbed the phone and called the data center. They swapped the controller right away and everything worked again...
HAPPY END!!!
The odd thing about the whole affair, in any case, is that at midnight this morning the controller was still of the opinion that it was working brilliantly...
So why use expensive controllers when you can have a software RAID for free? I wouldn't see the point of spending money on one any more!!
--crypt 23:27, 16. Nov 2006 (CET)
Comments
I've been asking myself the same thing now ;-)
Oh well, that's how it goes :-( The main thing is that it's working again.