Linux Fu: Mapping Files

If you use C or C++, you have probably learned how to open a file and read data from it. Usually, we read a character or a line at a time. At least, it seems that way. The reality is there are usually quite a number of buffers between you and the hard drive, so your request for a character might trigger a read for 2,048 characters and then your subsequent calls return from the buffer. There may even be layers of buffers feeding buffers.
A modern computer can do so much better than reading with old calls like fgetc. Given that your program has a huge virtual address space and that your computer has a perfectly good memory management unit within it, you can ask the operating system to simply map the file into your memory space. Then you can treat it like any other array of characters and let the OS do the rest.
The operating system doesn’t necessarily read the entire file in at one time; it just reserves address space for you. Any time you touch a page that isn’t in memory, the operating system grabs it for you invisibly. Pages that you don’t use very often may be discarded and reloaded later. Behind the scenes, the OS does a lot so you can work on very large files with no real effort. The call that does it all is mmap.

Of course, there is always a catch. If you have a truly large file, you might have to do some work to map part of it and then map another part later. Also, creating or extending files is a bit more work using mapping. Still, memory mapping is easy to do in most common cases and well worth learning about.
Decisions
The first thing you have to decide is if you want to read the file or both read and write to the file. If you only need read access, you can ask for a private mapping. That means you’ll get the file as it exists, and any changes you make go to your own private copy of the affected pages (copy-on-write) instead of to the file. Typically, you won’t change files that you map like this anyway unless you create a new file to write the changes to yourself.
However, if you want to write to the file just like you write to memory you’ll need a shared mapping. This can be used to share data with other processes, but it also makes sure the file gets updates as you make them — well, sort of. We’ll talk about msync a bit later.
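To make the difference between the two kinds of mapping concrete, here is a minimal sketch. The helper names poke_first_byte and peek_first_byte are made up for this illustration, and the code assumes the file already holds at least one byte (touching a mapped page that lies entirely beyond the end of the file raises SIGBUS).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map the start of path with the given flags
   (MAP_PRIVATE or MAP_SHARED), overwrite byte 0 with c, then unmap.
   Returns 0 on success, -1 on failure. The file must not be empty. */
int poke_first_byte(const char *path, int flags, char c)
{
    assert(flags == MAP_PRIVATE || flags == MAP_SHARED);

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    p[0] = c;                  /* private: copy-on-write; shared: hits the file */
    return munmap(p, 1);
}

/* Hypothetical helper: fetch byte 0 of path with an ordinary read().
   Returns the byte value, or -1 on failure. */
int peek_first_byte(const char *path)
{
    unsigned char c;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    ssize_t n = read(fd, &c, 1);
    close(fd);
    return n == 1 ? c : -1;
}
```

With a file that starts with an h, poking a J through a MAP_PRIVATE mapping should leave the file alone, while the same call with MAP_SHARED should change what peek_first_byte returns.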
Reading is Fundamental
You can find example code online for a simple word count program similar to wc (mmwc). Instead of using standard I/O calls (stdwd), it uses open to open the file for reading and then maps it into the program’s address space. We need to know the file size: a job for stat. Here’s part of the code:

int fd = open(filename, O_RDONLY); // open file
struct stat finfo;
char *b;
if ( fd == -1 )
{
    perror(filename);
    return 2;
}
if ( fstat( fd, &finfo ) == -1 ) // learn size of file
{
    perror(filename);
    return 3;
}
b = mmap( NULL, finfo.st_size, PROT_READ, MAP_PRIVATE, fd, 0 ); // map to memory
if ( b == MAP_FAILED )
{
    perror("mmap");
    return 4;
}

The arguments to mmap are simple. The first is an address. You almost never need to specify the address unless you are doing something exotic. If you do, there are many rules about how to set the address that vary based on platform. If you pass NULL, mmap will pick a spot for you. The next argument is a length, followed by the protection flags. In this case, PROT_READ tells the system we only care to read the file. Then comes MAP_PRIVATE to ask for a private mapping, followed by the file descriptor and an offset from the start of the file, which must be a multiple of the page size.
Once this call succeeds, the b variable has a pointer to the entire file in memory. With the file size in len, printing it out would be as easy as:

while (len--) putchar(*b++);
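If you want the open, fstat, and mmap steps in one place, something like the following sketch would do. The name map_whole_file is hypothetical and not part of the example program. It also shows two details the fragment above does not: a zero-length file cannot be mapped (mmap fails with EINVAL for a length of 0), and you can close the file descriptor as soon as the mapping exists.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map all of path read-only.
   On success, returns the mapping and stores its length in *len.
   Returns NULL on any error, or if the file is empty.
   The caller releases the mapping with munmap(ptr, *len). */
char *map_whole_file(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return NULL;

    struct stat finfo;
    if (fstat(fd, &finfo) == -1 || finfo.st_size == 0) {
        close(fd);
        return NULL;
    }

    char *b = mmap(NULL, (size_t)finfo.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                 /* the mapping does not need the descriptor */
    if (b == MAP_FAILED)
        return NULL;

    *len = (size_t)finfo.st_size;
    return b;
}
```

Returning NULL for an empty file is a design choice for this sketch; a real tool might prefer to treat an empty file as a valid input with nothing to process.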

The following code implements a simple word counting engine. Note the function do_work never uses any call that would relate to the input file. It simply processes data in memory:

while (len--) // process each character
{
    if ( *b == '\n' ) l++; // count lines
    if ( isspace((unsigned char)*b) )
    {
        if (state==1) // whitespace ends the current word
        {
            w++;
            state=0;
        }
    }
    else state=1; // inside a word
    b++;
}
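For reference, here is the same state machine wrapped in a self-contained function you can test on any buffer, mapped or not. The struct and function names are invented for this sketch. It also counts a final word that is not followed by whitespace, which the loop above would leave to the surrounding code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
struct counts {
    unsigned long lines;
    unsigned long words;
};

/* Count newlines and whitespace-separated words in b[0..len). */
struct counts count_buffer(const char *b, size_t len)
{
    struct counts c = { 0, 0 };
    int in_word = 0;

    assert(b != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)b[i];

        if (ch == '\n')
            c.lines++;
        if (isspace(ch)) {
            if (in_word) {     /* whitespace ends the current word */
                c.words++;
                in_word = 0;
            }
        } else {
            in_word = 1;
        }
    }
    if (in_word)               /* last word with no trailing whitespace */
        c.words++;
    return c;
}
```

Because the function only sees a pointer and a length, you can hand it the address returned by mmap and the st_size from fstat without any other changes.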

Writing and the msync Consideration
Writing is a little trickier because the mapping must be shared and the file must already be as large as the region you map; you cannot grow a file by writing past its end through the mapping. You might, for example, use lseek to set a file position and write something at the end so the file is already the size you need.
Since Linux 2.6.19, the kernel tracks dirty pages in a shared mapping and writes them back to the file on its own, so you don’t have to call msync just to get your changes into the file eventually. However, if you need to know the data has reached the disk at a particular point, or you think your code may run on older kernels, it is still wise to call msync when writing to a mapped file.
In the example program mmup.c, we sidestep the size problems by working on the file in place. I also put in a call to msync for good measure. The mmap line is similar to the previous version:
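As an alternative to seeking and writing a byte, ftruncate can set the file size directly before you map it. The sketch below takes that route; create_mapped_file is a hypothetical name, and the 0644 permissions are just an assumption for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create (or truncate) path, size it to len bytes,
   and return a shared read/write mapping of the whole file.
   Returns NULL on failure. Release with munmap(ptr, len). */
char *create_mapped_file(const char *path, size_t len)
{
    assert(path != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return NULL;

    /* Extending with ftruncate fills the new space with zero bytes. */
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }

    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return b == MAP_FAILED ? NULL : b;
}
```

Anything you store through the returned pointer ends up in the file, but the file never grows beyond the len you picked up front.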

b = mmap( NULL, finfo.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0 );

Look at how simple the conversion code is:

int do_work( char *b, unsigned int len )
{
    while (len--)
    {
        *b=toupper((unsigned char)*b); // writes go straight to the mapped file
        b++;
    }
    return 0;
}

Isn’t that easy?
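Putting the pieces together, an in-place uppercase conversion might look like the sketch below. This is not the code from mmup.c; upcase_file is a name made up here, and the choice of MS_SYNC (block until the data is written) is an assumption about what you want from msync.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: convert every byte of path to upper case in place.
   Returns 0 on success, -1 on failure. An empty file is left alone. */
int upcase_file(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    struct stat finfo;
    if (fstat(fd, &finfo) == -1) {
        close(fd);
        return -1;
    }
    if (finfo.st_size == 0) {  /* nothing to map, nothing to do */
        close(fd);
        return 0;
    }

    size_t len = (size_t)finfo.st_size;
    char *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (b == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < len; i++)
        b[i] = (char)toupper((unsigned char)b[i]);

    int rc = msync(b, len, MS_SYNC);   /* wait until the changes are written */
    if (munmap(b, len) == -1)
        rc = -1;
    return rc;
}
```

The loop is the only part that knows anything about the conversion; everything else is the same mapping boilerplate you would reuse for any in-place edit.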
Compilers and libraries are pretty good these days, so I’ll leave performance measurements as an exercise for the interested. A lot will depend on how good your disk I/O system is, too. In theory, a memory-mapped file should be faster than a program that makes a read call for every piece of data. However, libraries may be doing buffering and even mapping behind your back anyway, so the performance difference could be very slight. But the simplification in the code is a big plus regardless of the performance.
Perfect?
This seems great, but you should be aware of some potential issues. We are used to thinking that if we read some data from disk and nothing goes wrong, we can forget about it. But mapping a file isn’t the same as reading it. Suppose you map a file on a network drive and maybe read some pages out of it. Then the network goes down. A new page read will cause a fault because the underlying file is now gone. You can catch the SIGBUS signal that indicates this, but then what will you do?
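If you do decide to catch SIGBUS, one common pattern is to jump out of the faulting code with sigsetjmp and siglongjmp and report an error. The sketch below is only an illustration under several assumptions: the function names are invented, the static jump buffer makes it unsafe for threads, and the demo simulates a vanished file by truncating it on Linux instead of unplugging a network.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf bus_jump;    /* one global buffer: not thread-safe */

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(bus_jump, 1);   /* abandon the faulting access */
}

/* Hypothetical helper: add up len bytes at p. Returns 0 and stores the
   total in *sum, or returns -1 if a SIGBUS arrived while reading. */
int checked_sum(const unsigned char *p, size_t len, unsigned long *sum)
{
    struct sigaction sa, old;
    volatile unsigned long total = 0;
    int rc = 0;

    assert(sum != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) == -1)
        return -1;

    if (sigsetjmp(bus_jump, 1) == 0) {
        for (size_t i = 0; i < len; i++)
            total += p[i];
        *sum = total;
    } else {
        rc = -1;               /* we got here through the handler */
    }
    sigaction(SIGBUS, &old, NULL);   /* restore the previous handler */
    return rc;
}

/* Demo: map a file, shrink it to nothing, then read the mapping.
   Returns what checked_sum returned, or -2 if the setup failed. */
int demo_truncated_mapping(const char *path)
{
    unsigned long sum;
    int rc = -2;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return rc;
    if (ftruncate(fd, 4096) == 0) {
        unsigned char *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            if (ftruncate(fd, 0) == 0)   /* the data behind p is gone */
                rc = checked_sum(p, 4096, &sum);
            munmap(p, 4096);
        }
    }
    close(fd);
    unlink(path);
    return rc;
}
```

This only turns the crash into an error code. The question of what the program should do next, such as retrying, skipping the file, or giving up, is still yours to answer.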
Of course, if you still support 32-bit operating systems, you may find you quickly run out of address space if you process large files. True, you can make a smaller window with an offset in the file, but it is more work.
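The extra work for a window is mostly about alignment: the offset you pass to mmap has to be a multiple of the page size, so you round the offset down and skip the slack at the front. Here is a rough sketch of that idea; struct window, map_window, and read_byte_at are hypothetical names, and the caller is assumed to keep the requested range inside the file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one mapped window. */
struct window {
    void  *base;    /* address returned by mmap (pass to munmap) */
    size_t maplen;  /* length actually mapped (pass to munmap) */
    char  *data;    /* pointer to the byte at the requested offset */
};

/* Map len bytes of fd starting at an arbitrary offset, read-only.
   Returns 0 on success, -1 on failure. */
int map_window(int fd, off_t offset, size_t len, struct window *w)
{
    assert(w != NULL);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || offset < 0 || len == 0)
        return -1;

    off_t aligned = offset - (offset % (off_t)page);  /* round down */
    size_t slack = (size_t)(offset - aligned);        /* bytes to skip */

    void *base = mmap(NULL, len + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (base == MAP_FAILED)
        return -1;

    w->base = base;
    w->maplen = len + slack;
    w->data = (char *)base + slack;
    return 0;
}

/* Usage example: fetch one byte from anywhere in a file.
   Returns the byte value, or -1 on failure. */
int read_byte_at(const char *path, off_t offset)
{
    struct window w;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    int rc = map_window(fd, offset, 1, &w);
    close(fd);
    if (rc == -1)
        return -1;

    int value = (unsigned char)w.data[0];
    munmap(w.base, w.maplen);
    return value;
}
```

A program that walks a huge file would call map_window repeatedly with increasing offsets and unmap each window before moving on, which keeps the address space usage bounded.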
On the other hand, mmap makes many file handling programs simpler and easier to write. It is worth having in your arsenal of Linux programming tricks.