...making Linux just a little more fun!

next -->

Re-compress your gzipp'ed files to bzip2 using a Bash script (HOWTO)

By Dave Bechtel

If you were incredibly lucky (like me), perhaps you received an external USB hard drive for Christmas. Or perhaps you have one lying around already, with plenty of free space. And perhaps you also read the recent Slashdot article about compression software and have lots of fairly sizable gzipped files laying about.

After reading the comments in that article, I was dismayed to learn that my favorite compression tool of choice (gzip) has no error-correction capabilities. While I deem it to be the best all-around for quick backups with a decent compression ratio, gzip will choke if it gets a data error on restore - and there's something to be said for data integrity.

So, having this nice shiny new USB external drive and some time on my hands, I wrote a Bash utility script to re-compress gzip files to bzip2, using the external drive. It takes an order of magnitude longer to compress, but at least I'll save some space and have a hope of recovering the compressed data if things go wrong... Right??

My particular external drive is a 120-gig that came factory-formatted as a single FAT32 partition. Now, any Linux guru worth their salt knows that this thing practically begs to be customized, since Fat32 has a 2GB(Linux) or 4GB(Windows) filesize limit - depending on who's writing to it.

So, I fired up my Knoppix HD install and repartitioned it. Nothing fancy, just good old fdisk.

Here's how it looks now:

$ fdisk -l /dev/sdb

Disk /dev/sdb: 255 heads, 63 sectors, 14593 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/sdb1   *         1         1      8032   83  Linux
/dev/sdb2             2     14593 117210240    f  Win95 Ext'd (LBA)
/dev/sdb5             2        18    136552   82  Linux swap
/dev/sdb6            19      4999  40009882    c  Win95 FAT32 (LBA)
/dev/sdb7          5000      5622   5004247   83  Linux
/dev/sdb8          5623     14593  72059557   83  Linux

(I did make a note of the fact that the factory-default was one big type "c", in case I needed to go back to that.)

Notice the 40GB Fat32 partition. In my other life (sssshhh!) I run Windows 2000 Professional - and was forcibly reminded that everything after Windows ME has a 32GB partition size limit for formatting Fat32. Note that the limitation is on formatting - not accessing - this is by design, and Microsoft has publically admitted it.

After going through several free Windows tools for formatting and repartitioning (and running into a brick wall), I eventually gave up on Windows 2000 formatting the thing. The vendor has a utility on their website to restore the drive to factory-default partitioning, but that doesn't really help my intended use of the drive. I could have formatted it in Windows 98, but that's no fun - and it would need a separate driver for the OS to recognize the drive.

So, rather than give up a perfectly usable 8GB, good old Linux to the rescue again:

$ mkdosfs -F 32 -v -n wdfat40 /dev/sdb6
and reboot.

Presto! Windows 2000 recognizes the drive just fine now, and it passes all the chkdsk tests. And for all you dual-booters out there, a wonderful utility exists called Ext2IFS ( http://www.fs-driver.org/ ). This allows NT-based systems like Windows 2000 to access ext2/ext3 partitions just like a regular drive - read/write, so no need for NTFS!

The Linux partitions were formatted like so:

mke2fs -j -c -m1 -v /dev/sdbX

Here are the /etc/fstab entries I created for the drive, BTW:

/dev/sdb6  /mnt/wdfat40  auto
defaults,noauto,noatime,user,suid,noexec,uid=dave 0 0
/dev/sdb7  /mnt/wdlinux  ext3 defaults,noauto,noatime,rw 0 0
/dev/sdb8  /mnt/wdvast  ext3 defaults,noauto,noatime,rw 0 0

Note the "uid=dave" in that first line. That's so my non-root user account will have write access to the drive by default.

Now onto the good part - the "rezip" Bash script.

At first, I started out by writing a fairly basic script with a simple function call and manually-entered filenames. Then I sat down and took another look at it - and practically rewrote it from scratch, with some features that occurred to me after several test runs.

rezip Currently Features:

-- KNOWN BUG(s):

Tux