Cromfs: Compressed ROM filesystem for Linux (user-space)

0. Contents

This is the documentation of cromfs-1.2.0.
   1. Purpose
   2. News
   3. Overview
   4. Limitations
   5. Comparing to other filesystems
      5.1. Compression tests
      5.2. Speed tests
   6. Getting started
   7. Tips
   8. Copying
   9. Requirements
   10. Downloading

1. Purpose

Cromfs is a compressed read-only filesystem for Linux. Cromfs is intended for permanently archiving gigabytes of big files that have lots of redundancy.

In terms of compression it is very similar to 7-zip files, except that fast random access is provided for the whole archive contents. The user does not need to launch a program to decompress a single file, nor wait while the system decompresses 500 files from a 1000-file archive just to reach the one file he wanted to open.

Note: The primary design goal of cromfs is compression power. It is much slower than its peers, and uses more RAM. If all you care about is "powerful compression" and "random file access", then you will be happy with cromfs.

The creation of cromfs was inspired by Squashfs and Cramfs.

2. News

See the ChangeLog.

3. Overview

See the documentation of the cromfs format for technical details (also included in the source package as doc/FORMAT).

4. Limitations

Development status: Beta. The cromfs project was created very recently and has not yet been tested extensively. There is no warranty against data loss or anything else, so use it at your own risk.
That being said, there are no known bugs.

5. Comparing to other filesystems

This is all very biased, hypothetical, and by no means a scientific study, but here goes:
Filesystems compared: Cromfs, Cramfs, Squashfs (3.0), Cloop

Compression unit
  Cromfs:   adjustable arbitrarily (2 MB default)
  Cramfs:   adjustable, must be power of 2 (4 kB default)
  Squashfs: adjustable, must be power of 2 (64 kB max)
  Cloop:    adjustable in 512-byte units (1 MB max)

Files are compressed (up to block size limit)
  Cromfs:   Together
  Cramfs:   Individually
  Squashfs: Individually, except for fragments
  Cloop:    Together

Maximum file size
  Cromfs:   16 EB (2^44 MB) (theoretical; actual limit depends on settings)
  Cramfs:   16 MB (2^4 MB)
  Squashfs: 16 EB (2^44 MB) (4 GB before v3.0)
  Cloop:    Depends on slave filesystem

Maximum filesystem size
  Cromfs:   16 EB (2^44 MB)
  Cramfs:   272 MB
  Squashfs: 16 EB (2^44 MB) (4 GB before v3.0)
  Cloop:    Unknown

Duplicate whole file detection
  Cromfs:   Yes
  Cramfs:   No
  Squashfs: Yes
  Cloop:    No

Hardlinks detected and saved
  Cromfs:   Yes
  Cramfs:   Yes
  Squashfs: Yes, since v3.0
  Cloop:    Depends on slave filesystem

Near-identical file detection
  Cromfs:   Yes (identical blocks)
  Cramfs:   No
  Squashfs: No
  Cloop:    No

Compression method
  Cromfs:   LZMA
  Cramfs:   gzip (patches exist to use LZMA)
  Squashfs: gzip (patches exist to use LZMA)
  Cloop:    gzip or LZMA

Ownerships
  Cromfs:   uid,gid (since version 1.1.2)
  Cramfs:   uid,gid (but gid truncated to 8 bits)
  Squashfs: uid,gid
  Cloop:    Depends on slave filesystem

Timestamps
  Cromfs:   mtime only
  Cramfs:   None
  Squashfs: mtime only
  Cloop:    Depends on slave filesystem

Endianness safety
  Cromfs:   Works on little-endian only
  Cramfs:   Safe, but not exchangeable
  Squashfs: Safe, but not exchangeable
  Cloop:    Depends on slave filesystem

Kernelspace/userspace
  Cromfs:   User (fuse)
  Cramfs:   Kernel
  Squashfs: Kernel
  Cloop:    Kernel

Appending to a previously created filesystem
  Cromfs:   No
  Cramfs:   No
  Squashfs: Yes
  Cloop:    No (the slave filesystem can be decompressed, modified, and compressed again, but in a sense, so can every other of these.)

Supported inode types
  Cromfs:   all
  Cramfs:   all
  Squashfs: all
  Cloop:    Depends on slave filesystem

Fragmentation (good for compression, bad for access speed)
  Cromfs:   Commonplace
  Cramfs:   None
  Squashfs: File tails only
  Cloop:    Depends on slave filesystem

Holes (aka. sparse files); storage optimization of blocks which consist entirely of nul bytes
  Cromfs:   Optimized, not limited to nul-byte blocks.
  Cramfs:   Supported
  Squashfs: Not supported
  Cloop:    Depends on slave filesystem

Waste space (partially filled sectors)
  Cromfs:   No
  Cramfs:   Unknown
  Squashfs: Mostly not
  Cloop:    Depends on slave filesystem, generally yes

Extended attributes
  Cromfs:   No
  Cramfs:   Unknown
  Squashfs: Unknown
  Cloop:    Unknown, may depend on slave filesystem

Note: cromfs now saves the uid and gid in the filesystem. However, when the uid is 0 (root), the cromfs-driver returns the uid of the user who mounted the filesystem, instead of root. Similarly for gid. This is both for backward compatibility and for security.
If you mount as root, this behavior has no effect.

5.1. Compression tests

Note: I use the -e and -r options in all of these mkcromfs tests to avoid unnecessary decompression+recompression steps, in order to speed up the filesystem generation. This has no effect on the compression ratio.

In this table, k equals 1024 bytes (2^10) and M equals 1048576 bytes (2^20).
Data sets:
  10783 NES ROMs (2523 MB)
  Mozilla source code from CVS (279 MB)
  Damn small Linux liveCD (113 MB; size taken from "du -c" output in the uncompressed filesystem)

Cromfs
  10783 NES ROMs: mkcromfs -s16384 -a… -b… -f…
    With 16M fblocks, 2k blocks: 202,811,971 bytes
    With 16M fblocks, 1k blocks: 198,410,407 bytes
    With 16M fblocks, ¼k blocks: 194,386,381 bytes
  Mozilla source code: mkcromfs
    29,525,376 bytes
  Damn small Linux liveCD: mkcromfs -f1048576
    With 64k blocks (-b65536): 39,778,030 bytes
    With 16k blocks (-b16384): 39,718,882 bytes
    With 1k blocks (-b1024): 40,141,729 bytes

Cramfs
  10783 NES ROMs: mkcramfs -b65536
    dies prematurely, "filesystem too big"
  Mozilla source code: mkcramfs
    with 2M blocks (-b2097152): 58,720,256 bytes
    with 64k blocks (-b65536): 57,344,000 bytes
    with 4k blocks (-b4096): 68,435,968 bytes
  Damn small Linux liveCD: mkcramfs -b65536
    51,445,760 bytes

Squashfs
  10783 NES ROMs: mksquashfs -b65536
    (using an optimized sort file) 1,185,546,240 bytes
  Mozilla source code: mksquashfs
    43,335,680 bytes
  Damn small Linux liveCD: mksquashfs -b65536
    50,028,544 bytes

Cloop
  10783 NES ROMs: create_compressed_fs
    (using an iso9660 image created with mkisofs -R)
    using 7zip, 1M blocks (-B1048576 -t2 -L-1): 1,136,789,006 bytes
  Mozilla source code: create_compressed_fs
    (using an iso9660 image created with mkisofs -RJ)
    using 7zip, 1M blocks (-B1048576 -L-1): 41,201,014 bytes
    (1 MB is maximum block size in cloop)
  Damn small Linux liveCD: create_compressed_fs
    (using an iso9660 image)
    using 7zip, 1M blocks (-B1048576 -L-1): 48,328,580 bytes
    using zlib, 64k blocks (-B65536 -L9): 50,641,093 bytes

7-zip (p7zip) (an archive, not a filesystem)
  10783 NES ROMs: 7za -mx9 -ma=2 a
    with 32M blocks (-md=32m): 235,037,017 bytes
    with 128M blocks (-md=128m): 222,523,590 bytes
    with 256M blocks (-md=256m): 212,533,778 bytes
  Mozilla source code: untested
  Damn small Linux liveCD: 7za -mx9 -ma2 a
    37,205,238 bytes

An explanation of why mkcromfs beats 7-zip in the NES ROM packing test:
7-zip packs all the files together as one stream. The maximum dictionary size is 256 MB. (Note: The default for "maximum compression" is 32 MB.) Once 256 MB of data has been packed and more data comes in, similarities between the first megabytes of data and the latest data are no longer utilized. For example, Mega Man and Rockman are two almost identical versions of the same image, but because more than 400 MB of files lie between them when they are processed in alphabetical order, 7-zip does not see that they are similar, and compresses each one separately.
7-zip's chances could be improved by sorting the files so that it processes similar images sequentially. It already attempts this by sorting the files by filename extension and filename, but that is not always the optimal order, as shown here.
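
The window effect described above can be reproduced at small scale. The sketch below swaps in gzip for 7-zip, because gzip's deflate format can only refer back 32 kB, which makes the effect visible with a few hundred kilobytes of data instead of hundreds of megabytes. The function name window_demo and the sizes are made up for this illustration; it says nothing about 7-zip's internals beyond the general principle.

```shell
# Compress two layouts of the same data with gzip and print "<near> <far>",
# the compressed sizes in bytes.
window_demo() (
    dir=$(mktemp -d) || exit 1
    head -c 20480  /dev/urandom > "$dir/a"        # 20 kB of incompressible data
    head -c 102400 /dev/urandom > "$dir/filler"   # 100 kB of unrelated data

    # near: the second copy of "a" starts 20 kB after the first, inside the window
    near=$(cat "$dir/a" "$dir/a" "$dir/filler" | gzip -9 -c | wc -c)
    # far: 100 kB of filler separates the two copies, so the window has moved on
    far=$(cat "$dir/a" "$dir/filler" "$dir/a" | gzip -9 -c | wc -c)

    rm -rf "$dir"
    echo "$((near)) $((far))"
)

window_demo
```

The second number should come out roughly 20 kB larger than the first: when the copies are far apart, the repeated data is stored twice even though the input bytes are the same.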

mkcromfs, however, keeps track of all blocks it has encoded, and remembers similarities no matter how long ago they were added to the archive. This is why it outperforms 7-zip in this case, even though it only used 16 MB fblocks.

In the liveCD compression test, mkcromfs does not beat 7-zip, because this advantage is too minor to overcome the overhead needed to provide random access to the filesystem. It still beats cloop, squashfs and cramfs though.
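
To get a feeling for how much a given data set could gain from remembering every block, one can count repeated blocks directly. The following is only a rough sketch and not the algorithm mkcromfs uses (that is described in the format documentation): it cuts a file into fixed-size blocks and hashes each one. The function name count_blocks is hypothetical, and sha256sum from GNU coreutils is assumed to be installed.

```shell
# Print "<total> <unique>": how many BLOCK_SIZE-byte blocks FILE splits into,
# and how many of those blocks have distinct contents.
count_blocks() (
    file=$1
    block_size=$2
    dir=$(mktemp -d) || exit 1
    split -b "$block_size" -a 6 "$file" "$dir/blk."
    total=$(ls "$dir" | wc -l)
    unique=$(for f in "$dir"/blk.*; do sha256sum < "$f"; done | sort -u | wc -l)
    rm -rf "$dir"
    echo "$((total)) $((unique))"
)

# Demo: two identical 20 kB regions separated by 100 kB of unrelated data.
demo=$(mktemp -d)
head -c 20480  /dev/urandom > "$demo/a"
head -c 102400 /dev/urandom > "$demo/filler"
cat "$demo/a" "$demo/filler" "$demo/a" > "$demo/input"
count_blocks "$demo/input" 1024    # prints "140 120"
rm -rf "$demo"
```

Here 20 of the 140 blocks are repeats, no matter how much data sits between the two copies. Note that this sketch only finds repeats that happen to fall on the same block boundaries.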

5.2. Speed tests

Speed testing hasn't been done yet. It is difficult to test the speed, because it depends on factors such as cache (with compressed filesystems, decompression consumes CPU power but usually only needs to be done once) and block size (bigger blocks need more time to decompress).

However, in the general case, it is quite safe to assume that mkcromfs is the slowest of all. The same goes for resource testing (RAM).

cromfs-driver requires an amount of RAM proportional to a few factors. It can be approximated with this formula:

Max_RAM_usage = FBLOCK_CACHE_MAX_SIZE × fblock_size + READDIR_CACHE_MAX_SIZE × 60k + 8 × num_blocks

Where

For example, for a 500 MB archive with 16 kB blocks and 1 MB fblocks, the memory usage would be around 10.2 MB.
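
The formula can be evaluated with plain shell arithmetic. In this sketch the function name max_ram_usage is made up, "60k" is read as 60 × 1024 bytes, and num_blocks is taken to be the archive size divided by the block size. The two cache limits (10 and 3) are placeholder assumptions, not values taken from this document, so the result is only in the same range as the figure quoted above.

```shell
# Evaluate the approximation above; the result is in bytes.
# usage: max_ram_usage FBLOCK_CACHE_MAX_SIZE fblock_size READDIR_CACHE_MAX_SIZE num_blocks
max_ram_usage() {
    echo $(( $1 * $2 + $3 * 60 * 1024 + 8 * $4 ))
}

archive_size=$((500 * 1024 * 1024))   # 500 MB archive
block_size=$((16 * 1024))             # 16 kB blocks
fblock_size=$((1024 * 1024))          # 1 MB fblocks
num_blocks=$((archive_size / block_size))

# ASSUMPTION: cache limits of 10 fblocks and 3 directories are placeholders.
max_ram_usage 10 "$fblock_size" 3 "$num_blocks"   # prints 10926080
```

With these inputs the fblock cache term dominates (10 MB of the total), which suggests that the fblock size chosen at creation time matters most for the driver's memory use.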

6. Getting started

  1. Install the development requirements: make, gcc-c++ and fuse
    • Remember that for fuse to work, the kernel must also include fuse support. Run "modprobe fuse", then check that "/dev/fuse" exists and that it works.
      • If an attempt to read from "/dev/fuse" (as root) gives "no such device", it does not work. If it gives "operation not permitted", it might work.
  2. Build "cromfs-driver", "util/mkcromfs", "util/cvcromfs" and "util/unmkcromfs" by running "make":
    $ make
    If you get compilation problems related to hash_map or hash, edit cromfs-defs.hh and remove the line that says #define USE_HASHMAP.
  3. Create a sample filesystem:
    $ util/mkcromfs . sample.cromfs
  4. Mount the sample filesystem:
    $ mkdir sample
    $ ./cromfs-driver sample.cromfs sample &
  5. Observe the sample filesystem:
    $ cd sample
    $ ls -al
  6. Unmount the filesystem:
    $ cd ..
    $ fusermount -u sample
    or, type "fg" and press ctrl-c to terminate the driver.
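
The "/dev/fuse" check from step 1 can be scripted along these lines. This is a sketch under assumptions: the function names are hypothetical, dd is used as the program that attempts the read, and the error text is matched in the C locale. It only encodes the two rules of thumb given above and is not an official test.

```shell
# Map an error message from reading /dev/fuse to a verdict.
classify_fuse_error() {
    msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $msg in
        *"no such device"*)          echo "not working" ;;
        *"operation not permitted"*) echo "might work" ;;
        *)                           echo "unknown" ;;
    esac
}

# Try to read one byte from the device (default /dev/fuse) and classify the result.
probe_fuse() {
    dev=${1:-/dev/fuse}
    if [ ! -e "$dev" ]; then
        echo "missing"
        return
    fi
    classify_fuse_error "$(LC_ALL=C dd if="$dev" of=/dev/null bs=1 count=1 2>&1)"
}
```

Run "probe_fuse" as root after "modprobe fuse". A result of "missing" or "not working" means the kernel side is not set up yet.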

7. Tips

To improve the compression, try these tips:

To improve the filesystem generation speed, try these tips:

To control the memory usage, use these tips:

To control the filesystem speed, use these tips:

8. Copying

cromfs has been written by Joel Yliluoma, a.k.a. Bisqwit,
and is distributed under the terms of the General Public License (GPL).
The LZMA code embedded within is licensed under LGPL.

Patches and other related material can be submitted to the author by e-mail at: Joel Yliluoma <bisqwit@iki.fi>

The author also wishes to hear if you use cromfs, and for what you use it and what you think of it.

9. Requirements

10. Downloading

The official home page of cromfs is at http://iki.fi/bisqwit/source/cromfs.html.
Check there for new versions.

Generated from progdesc.php (last updated: Mon, 05 Jun 2006 17:45:08 +0300)
with docmaker.php (last updated: Sun, 12 Jun 2005 06:08:02 +0300)
at Mon, 05 Jun 2006 17:45:10 +0300