Friday, May 11, 2012

Scratch Pad for Understanding how to enforce fsck.

There are two variables, Mount Count and Max Mount Count:
If Mount Count reaches Max Mount Count, fsck runs on the next bootup.
To change the maximum mount count, use the tune2fs utility:
$ tune2fs -c COUNT /dev/sdaX
Different devices can have different maximum mount count values. A COUNT of 0 or -1 turns the mount-count check off for that device.
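
To see how close a device is to its next count-triggered check, the two counters can be read from the "Mount count" and "Maximum mount count" lines of dumpe2fs -h. A minimal sketch, assuming the check becomes due once the count reaches the maximum; fsck_due is a made-up helper name, not part of e2fsprogs:

```shell
# Reads `dumpe2fs -h` style output on stdin and reports whether the
# mount-count rule alone would make a check due.
# fsck_due is a hypothetical helper name, not an e2fsprogs tool.
fsck_due() {
  awk -F: '
    /^Mount count/         { mc  = $2 + 0 }
    /^Maximum mount count/ { max = $2 + 0 }
    END {
      # A maximum of 0 or -1 means the mount-count check is disabled.
      if (max > 0 && mc >= max) print "due"; else print "not due"
    }'
}

# Usage (as root, read-only):
#   dumpe2fs -h /dev/sdaX 2>/dev/null | fsck_due
```

This ignores the other reasons a check can be triggered, such as a filesystem that was not cleanly unmounted.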

This only happens if the last field of the device's entry in /etc/fstab (the fsck pass number) is non-zero. For the rootfs entry it should be 1, like:

/dev/hda2   /   ext2   defaults   1 1
For all other devices this last field should be 2 if you want fsck to check them at bootup, else 0. Even then, fsck will still wait for the mount count to reach the max mount count before it actually checks a cleanly unmounted device.
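
To see which entries are eligible for a boot-time check, and in what order, the sixth fstab field can be listed directly. A rough sketch, assuming the usual six whitespace-separated fields; fsck_order is a hypothetical helper name:

```shell
# Reads fstab-style lines on stdin and prints "pass device mountpoint"
# for every entry whose sixth field (the fsck pass number) is non-zero,
# sorted by pass number. fsck_order is a hypothetical helper name.
fsck_order() {
  awk '!/^[[:space:]]*#/ && NF >= 6 && $6 > 0 { print $6, $1, $2 }' | sort -n
}

# Usage:
#   fsck_order < /etc/fstab
```

Entries with a 0 in the last field never show up, which matches the fact that fsck skips them at bootup.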

Problem:

Damage the filesystem just enough that:
On the next bootup, fsck runs.
And fixes the damage.

The first problem is to damage the filesystem. Each filesystem partition has a superblock. The superblock contains metadata about the filesystem, such as the block size, the inode and block counts, the number of free blocks, and the size of the block groups. The filesystem is smart enough to keep multiple copies of this superblock, so that if the primary copy gets corrupted it can still recover from a backup.

To find the locations of the superblock copies we use the dumpe2fs utility:
$ dumpe2fs /dev/sdaX | grep superblock


  Primary superblock at 1, Group descriptors at 2-9
  Backup superblock at 8193, Group descriptors at 8194-8201
  Backup superblock at 24577, Group descriptors at 24578-24585
  Backup superblock at 40961, Group descriptors at 40962-40969
  Backup superblock at 57345, Group descriptors at 57346-57353
  Backup superblock at 73729, Group descriptors at 73730-73737
  Backup superblock at 204801, Group descriptors at 204802-204809
  Backup superblock at 221185, Group descriptors at 221186-221193
  Backup superblock at 401409, Group descriptors at 401410-401417
  Backup superblock at 663553, Group descriptors at 663554-663561
  Backup superblock at 1024001, Group descriptors at 1024002-1024009
  Backup superblock at 1990657, Group descriptors at 1990658-1990665
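
If only the backup block numbers are needed, for example to hand one to fsck -b later, they can be pulled out of the listing above. A small sketch; backup_superblocks is a hypothetical helper name:

```shell
# Reads `dumpe2fs` output on stdin and prints one backup superblock
# block number per line. backup_superblocks is a hypothetical helper name.
backup_superblocks() {
  awk '/Backup superblock at/ { sub(/,/, "", $4); print $4 }'
}

# Usage (as root):
#   dumpe2fs /dev/sdaX 2>/dev/null | backup_superblocks
```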

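Now it's easy to corrupt block 1 of the filesystem on /dev/sdaX (with the filesystem unmounted).
Run:
dd if=/dev/zero of=/dev/sdaX bs=1k count=1 seek=1
Essentially, we are overwriting one block of size 1k with zeros, seeking to the location of block 1 on the output device. Note that this has to be seek=, which positions the output; skip= only skips blocks of the input, so dd would start writing at the very beginning of the partition. The 1k block size matches the listing above, where the primary superblock sits at block 1; check your filesystem block size before you do this, because the primary superblock always starts at byte offset 1024 of the partition.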
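
Before pointing dd at a real partition, the same write can be rehearsed on a scratch file. A minimal sketch, assuming dd's seek= (which positions the output) and conv=notrunc (which keeps the rest of the file intact); zero_block is a hypothetical helper name:

```shell
# Overwrites exactly one block of TARGET with zeros.
#   $1 = target file or device, $2 = block size in bytes, $3 = block number
# seek= positions the output; conv=notrunc keeps a regular file from
# being truncated after the written block.
# zero_block is a hypothetical helper name.
zero_block() {
  target=$1 bs=$2 blk=$3
  dd if=/dev/zero of="$target" bs="$bs" count=1 seek="$blk" conv=notrunc 2>/dev/null
}

# Usage on a scratch file:
#   zero_block /tmp/scratch.img 1024 1
```

Only the bytes of the chosen block change; everything before and after it stays as it was.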

Now, if we rebooted the system and tried to mount /dev/sdaX, the mount would fail unless fsck corrected the damage first.
If we had to correct the issue manually, we could replace the superblock with a backup copy. fsck will do it for us:

fsck -b 8193 /dev/sdaX
After that we can mount the filesystem again.

If we instead just ran fsck -f /dev/sdaX, forcing fsck to check /dev/sdaX, it would do the same thing, though it asks a lot of questions along the way. To answer yes to all of them,
run fsck -f -y /dev/sdaX and you should have recovered the filesystem.
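
Whether the filesystem really was recovered can be read from fsck's exit status, which is a sum of flags: 1 means errors were corrected, 2 means the system should be rebooted, 4 means errors were left uncorrected, and 8 means an operational error. A small sketch that decodes it; describe_fsck_status is a hypothetical helper name:

```shell
# Turns an fsck exit status into a readable summary.
# describe_fsck_status is a hypothetical helper name.
describe_fsck_status() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  out=""
  if [ $((code & 1)) -ne 0 ]; then out="$out errors-corrected"; fi
  if [ $((code & 2)) -ne 0 ]; then out="$out reboot-needed"; fi
  if [ $((code & 4)) -ne 0 ]; then out="$out errors-left-uncorrected"; fi
  if [ $((code & 8)) -ne 0 ]; then out="$out operational-error"; fi
  # 16 and above cover usage errors, user cancellation and library errors.
  if [ "$code" -ge 16 ]; then out="$out other"; fi
  echo "${out# }"
}

# Usage:
#   fsck -f -y /dev/sdaX; describe_fsck_status $?
```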

The problem we still face is:
what exactly should be done to force this check during bootup? The check should only happen if the mount failed. fsck should not run at every bootup trying to fix a problem that has not occurred.
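
One possible shape for this, as a sketch only: try the mount first, and run the forced check only when the mount fails. This is not how the stock boot scripts work, and mount_or_repair is a hypothetical helper that would still have to be hooked into the boot sequence somehow:

```shell
# Mounts DEV on MNT; runs a forced, non-interactive fsck only if the
# first mount attempt fails, then retries the mount once.
# mount_or_repair is a hypothetical helper name.
mount_or_repair() {
  dev=$1 mnt=$2
  if mount "$dev" "$mnt"; then
    echo "mounted"
    return 0
  fi
  # The mount failed, so only now do we pay for a full check.
  rc=0
  fsck -f -y "$dev" || rc=$?
  # Exit status 0 or 1 means clean, or errors found and corrected.
  if [ "$rc" -le 1 ] && mount "$dev" "$mnt"; then
    echo "repaired and mounted"
    return 0
  fi
  echo "unrecoverable"
  return 1
}

# Usage (as root):
#   mount_or_repair /dev/sdaX /mnt
```

A healthy filesystem is mounted straight away without any check, and fsck only gets involved after a failure.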
.....


