Incremental Access

As described earlier in this chapter, incremental (or fast) save is a highly demanded feature, and the same demand applies to incremental loads. I mentioned before that access to elements within Structured Storage is inherently incremental, in the same way that access to individual directories and files in a typical file system is incremental. In a file system, saving information in a file requires only a change to that file, leaving everything else in the file system untouched. In the same manner, renaming a directory or shuffling files around within it affects only that directory and its contents. Because Structured Storage is a file system within a file, the same ideas apply. Changes to an individual stream don't affect any other element in the storage hierarchy. Renaming a storage element or modifying its contents leaves the rest of the hierarchy untouched.

The real impact of incremental access is on the time it takes to read from or write to the final disk file. Without incremental access, loading a file means loading all of it into memory, and saving a file means writing all of it back to disk. These operations can take an enormous amount of time. With incremental access, that time is not only spread over a longer period but also greatly reduced: it takes zero time to read and write information that you aren't going to use and don't need to modify. "Incremental" really means "as needed."

If the changes you're making to a document amount to, for example, changing a few characters in a block of text on a particular page, you need to change only the contents of the stream that holds that block of text. All other data on that page, and on all other pages in the document, remains unaffected. As a result, saving these changes is very quick: rewrite one stream and you've finished. This chapter's Patron sample uses this idea: only the page being viewed is opened.

The real trick to doing all this in a storage hierarchy is navigating the hierarchy to get to the stream that contains the information you want. This means a sequence of IStorage::OpenStorage (or CreateStorage) calls to reach the storage that contains the stream, followed by an IStorage::OpenStream (or CreateStream) call to open the stream that you need to read or write. Of course, this navigation takes time, and Structured Storage doesn't provide any sort of shortcut. Once you get there, however, you can read and write that stream in isolation without disturbing any other parts of the hierarchy, even if you write new information past the end of that stream. Structured Storage simply finds new space for the new information, requiring no modification to the rest of the file. If you delete information or even whole elements, the storage implementation simply marks that space as free and reuses it to write new information later. In other words, it performs garbage collection as necessary.
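The navigate-then-touch-one-stream pattern can be sketched with a toy, in-memory model in portable C++. This is not the OLE API: ToyStorage, createStorage, openStorage, openPath, and writeStream are hypothetical names invented for illustration, and they only loosely mirror the roles of IStorage::CreateStorage, OpenStorage, and a stream write. The sketch assumes a document laid out as a "Pages" storage holding one storage per page, each with a "Text" stream.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a storage element. NOT the OLE API; all names are hypothetical.
struct ToyStorage {
    std::map<std::string, std::unique_ptr<ToyStorage>> storages; // child storages ("directories")
    std::map<std::string, std::string> streams;                  // child streams ("files")
    int writes = 0; // number of writeStream calls made on streams directly in this storage
};

// Create (or return the existing) child storage with the given name.
ToyStorage& createStorage(ToyStorage& parent, const std::string& name) {
    std::unique_ptr<ToyStorage>& slot = parent.storages[name];
    if (!slot) slot.reset(new ToyStorage());
    return *slot;
}

// Open one child storage by name; returns nullptr if it doesn't exist.
ToyStorage* openStorage(ToyStorage& parent, const std::string& name) {
    auto it = parent.storages.find(name);
    return it == parent.storages.end() ? nullptr : it->second.get();
}

// Navigate down the hierarchy, one openStorage step per path component.
// There is no shortcut: every level on the way is opened in turn.
ToyStorage* openPath(ToyStorage& root, const std::vector<std::string>& path) {
    ToyStorage* cur = &root;
    for (const std::string& name : path) {
        cur = openStorage(*cur, name);
        if (cur == nullptr) return nullptr;
    }
    return cur;
}

// Replace the contents of one stream in one storage; nothing else is modified.
void writeStream(ToyStorage& stg, const std::string& name, const std::string& data) {
    assert(!name.empty());
    stg.streams[name] = data;
    ++stg.writes;
}

// Build a two-page document, edit the text on page 2 only, and report
// whether page 1 was left completely alone.
bool editTouchesOnlyOneStream() {
    ToyStorage doc;
    ToyStorage& pages = createStorage(doc, "Pages");
    ToyStorage& page1 = createStorage(pages, "Page1");
    ToyStorage& page2 = createStorage(pages, "Page2");
    page1.streams["Text"] = "first page";   // initial contents, set directly
    page2.streams["Text"] = "second page";

    ToyStorage* target = openPath(doc, {"Pages", "Page2"});
    if (target == nullptr) return false;
    writeStream(*target, "Text", "second page, edited");

    return page1.writes == 0
        && page1.streams["Text"] == "first page"
        && page2.writes == 1
        && page2.streams["Text"] == "second page, edited";
}
```

In this model the cost of the edit is the walk down to Page2 plus one stream write. The model says nothing about how a real compound file allocates space on disk; it illustrates only the isolation between elements.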

Of course, a file can become fragmented in this manner. To combat this, you can create a new storage and copy the existing file contents into it with IStorage::CopyTo. OLE will eliminate all unused space in the process. This is faster than rewriting the entire file, but it doesn't necessarily defragment stream contents within the file. If you repeat the CopyTo operation or manually rewrite the entire file, you'll defragment the contents as well. We'll see an example of how this works in "Compound File Defragmentation" at the end of this chapter.
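The difference between eliminating unused space and defragmenting stream contents can be illustrated with a toy sector map in portable C++. Everything here is an assumption made for illustration: ToyFile, writeSectors, deleteStream, freeSectors, and compactCopy are hypothetical names, each stream is identified by a single letter, and the toy copy simply skips free sectors while keeping the used ones in their existing order. That simplification was chosen to mirror the behavior described above. It is not a description of how OLE implements CopyTo.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Toy model of a file as a row of sectors. '.' marks a free sector; any
// other character is the one-letter id of the stream that owns the sector.
// NOT the compound file format; all names are hypothetical.
struct ToyFile {
    std::string sectors;
};

// Give `count` sectors to `stream`, reusing free sectors first and growing
// the file only when no free sector is left.
void writeSectors(ToyFile& f, char stream, std::size_t count) {
    assert(stream != '.');
    for (char& s : f.sectors) {
        if (count == 0) return;
        if (s == '.') { s = stream; --count; }
    }
    f.sectors.append(count, stream);
}

// Deleting a stream only marks its sectors as free; the file doesn't shrink.
void deleteStream(ToyFile& f, char stream) {
    std::replace(f.sectors.begin(), f.sectors.end(), stream, '.');
}

// Count the sectors currently marked free.
std::size_t freeSectors(const ToyFile& f) {
    return static_cast<std::size_t>(
        std::count(f.sectors.begin(), f.sectors.end(), '.'));
}

// Copy into a new file, dropping free sectors but keeping the used sectors
// in their existing order.
ToyFile compactCopy(const ToyFile& src) {
    ToyFile dst;
    for (char s : src.sectors) {
        if (s != '.') dst.sectors.push_back(s);
    }
    return dst;
}

// Fragment a file through deletes and rewrites, then compact it by copying.
std::string fragmentThenCompact() {
    ToyFile f;
    writeSectors(f, 'A', 1);
    writeSectors(f, 'B', 1);
    writeSectors(f, 'C', 1);   // layout: "ABC"
    deleteStream(f, 'B');      // layout: "A.C"
    writeSectors(f, 'D', 2);   // layout: "ADCD" (D reuses the hole, then grows the file)
    deleteStream(f, 'A');      // layout: ".DCD"
    return compactCopy(f).sectors;  // layout: "DCD"
}
```

After the copy, the free sector is gone, but stream D still occupies two sectors separated by C. In this toy, only rewriting each stream in full would lay D out contiguously.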