
...making Linux just a little more fun!

compressing sparse file(s) while still maintaining their holes

Mulyadi Santosa [mulyadi.santosa at gmail.com]


Mon, 29 Mar 2010 21:25:23 +0700

For various reasons, we might have a sparse file that we want to compress while keeping it sparse. Can we do it the usual way? Let's say we have this file:

$ dd if=/dev/zero of=./sparse.img bs=1K seek=400 count=0
0+0 records in
0+0 records out
0 bytes (0 B) copied, 2.3956e-05 s, 0.0 kB/s

$ ls -lsh sparse.img
4.0K -rw-rw-r-- 1 mulyadi mulyadi 400K 2010-03-29 21:14 sparse.img

$ gzip sparse.img
$ ls -lsh sparse.img.gz
8.0K -rw-rw-r-- 1 mulyadi mulyadi 443 2010-03-29 21:14 sparse.img.gz

$ gunzip sparse.img.gz
$ ls -lsh sparse.img
408K -rw-rw-r-- 1 mulyadi mulyadi 400K 2010-03-29 21:14 sparse.img

Bad. After decompression, the space occupied by the file has grown from 4 KiB to 408 KiB: gunzip writes every zero byte out as real data instead of recreating the hole.
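To compare the apparent size with the allocated space without reading it off the ls columns, something like the following may help. It is only a sketch and assumes GNU coreutils (truncate and stat -c); the file name demo.img and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Compare a file's apparent size with the space actually allocated for it.
# Assumes GNU coreutils; "demo.img" is a hypothetical name.
workdir=$(mktemp -d)
demo="$workdir/demo.img"

# Create a 400 KiB file that consists of a single hole.
truncate -s 400K "$demo"

apparent=$(stat -c '%s' "$demo")    # size in bytes, as shown by ls -l
blocks=$(stat -c '%b' "$demo")      # number of blocks allocated
blocksize=$(stat -c '%B' "$demo")   # size in bytes of each block counted by %b
allocated=$((blocks * blocksize))

echo "apparent:  $apparent bytes"
echo "allocated: $allocated bytes"
```

On a filesystem that supports holes, the allocated figure should stay well below the apparent one for as long as the file remains sparse.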

The trick is to use tar with the -S (--sparse) option:

$ tar -Sczvf sparse.img.tgz sparse.img

$ ls -lsh sparse.img.tgz
8.0K -rw-rw-r-- 1 mulyadi mulyadi 136 2010-03-29 21:18 sparse.img.tgz

$ tar -xzvf sparse.img.tgz
$ ls -lsh sparse.img
4.0K -rw-rw-r-- 1 mulyadi mulyadi 400K 2010-03-29 21:17 sparse.img

As you can see, the number of blocks occupied by "sparse.img" is correctly restored after extraction.
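The whole round trip can also be checked in a script rather than by eye. The sketch below assumes GNU tar and GNU coreutils, works in a throwaway directory from mktemp, and uses truncate in place of the dd command from the original post; the variable names are illustrative only.

```shell
#!/bin/sh
# Round-trip a sparse file through tar -S and compare block counts.
# Assumes GNU tar and GNU coreutils; runs in a scratch directory.
workdir=$(mktemp -d)
truncate -s 400K "$workdir/sparse.img"

before=$(stat -c '%b' "$workdir/sparse.img")   # blocks allocated before archiving

# -S makes tar record the holes instead of archiving runs of zeros.
tar -C "$workdir" -Sczf "$workdir/sparse.img.tgz" sparse.img
rm "$workdir/sparse.img"

# The hole map is stored in the archive, so extraction needs no extra flag.
tar -C "$workdir" -xzf "$workdir/sparse.img.tgz"

after=$(stat -c '%b' "$workdir/sparse.img")    # blocks allocated after extraction
size=$(stat -c '%s' "$workdir/sparse.img")     # apparent size in bytes

echo "blocks before: $before, after: $after, apparent size: $size bytes"
```

If the two block counts are close to each other, the holes survived; a large jump in the second number would mean the file was expanded on extraction.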

-- 
regards,
Freelance Linux trainer and consultant

blog: the-hydra.blogspot.com training: mulyaditraining.blogspot.com




René Pfeiffer [lynx at luchs.at]


Mon, 29 Mar 2010 16:54:47 +0200

On Mar 29, 2010 at 2125 +0700, Mulyadi Santosa appeared and said:

> Due to certain reasons, we might have sparse file and want to compress
> it. However, we want to maintain its sparseness. Can we do it the
> usual way? [...]
> 
> The trick is by using tar with -S option:
> $ tar -Sczvf sparse.img.tgz sparse.img

The same is true for rsync. It has a -S (or --sparse) flag, too. The man page says "handle sparse files efficiently", so it might even handle sparse files without the option (albeit less efficiently).
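A quick local experiment along these lines might look as follows. This is a sketch, not something from the thread: it assumes rsync and GNU coreutils are installed, copies between two temporary directories instead of between hosts, and simply skips the copy if rsync is missing.

```shell
#!/bin/sh
# Copy a sparse file with rsync -S and show size and block count on both sides.
# Assumes rsync and GNU coreutils; both directories are scratch space.
src=$(mktemp -d)
dst=$(mktemp -d)
truncate -s 400K "$src/sparse.img"

if command -v rsync >/dev/null 2>&1; then
    # -a preserves metadata, -S asks rsync to recreate holes at the destination.
    rsync -aS "$src/" "$dst/"
    stat -c '%n: %s bytes, %b blocks' "$src/sparse.img" "$dst/sparse.img"
else
    echo "rsync not installed; skipping" >&2
fi
```

Running the same copy a second time into a fresh directory without -S and comparing the block counts shows what the option actually buys on a given system.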

Best, René.




Ben Okopnik [ben at linuxgazette.net]


Mon, 29 Mar 2010 15:19:12 -0400

On Mon, Mar 29, 2010 at 09:25:23PM +0700, Mulyadi Santosa wrote:

> Due to certain reasons, we might have sparse file and want to compress
> it. However, we want to maintain its sparseness. Can we do it the
> usual way?

[snip]

> The trick is by using tar with -S option:
> $ tar -Sczvf sparse.img.tgz sparse.img

'tar cvzSpf' is a standard sysadmin mantra for preserving all the file metadata, etc. It's nicely effective when chanted in combination with 'ssh', as in

# Replicate local files or dir structure on remote host
cd /source/dir/here
tar cvzSpf - *|ssh user@remote_host '(cd /target/dir/there && tar xzSpf -)'

Obviously, if you have a really fast network or your send method is already set up to use in-flight compression (i.e., 'Compression yes' in ~/.ssh/config or "ssh -C" on the command line), then the 'z' option will just waste cycles and should be omitted. Also, note that you want to use 'v' only on one side of the pipe; otherwise, things will get ridiculously noisy.
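The pipe can be tried without a remote host by replacing the ssh hop with a local subshell. The sketch below does that, with 'z' dropped and 'v' on the sending side only. It assumes GNU tar and GNU coreutils; the directories and the file notes.txt are made up for the test.

```shell
#!/bin/sh
# Local dry run of the tar-to-tar pipe; the ssh hop is replaced by a subshell.
# Assumes GNU tar and GNU coreutils; file names are hypothetical.
src=$(mktemp -d)
dst=$(mktemp -d)
truncate -s 400K "$src/sparse.img"
echo "hello" > "$src/notes.txt"

# 'v' on the sending side only: the archive occupies stdout, so tar prints
# the file listing on stderr. No 'z', since nothing is gained on a local pipe.
(cd "$src" && tar cvSpf - .) | (cd "$dst" && tar xSpf -)

# Over the network, the receiving half would instead be something like:
#   ... | ssh -C user@remote_host 'cd /target/dir/there && tar xSpf -'

ls -ls "$dst"
```

Archiving "." rather than "*" also picks up dotfiles, which the shell glob would skip.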

-- * Ben Okopnik * Editor-in-Chief, Linux Gazette * https://LinuxGazette.NET *

