I have moved!

I've moved my blog
CLICK HERE

Friday 13 February 2009

General Theory of Resources

Having blogged the other day about a different way of doing automatic cleanup, I’ve been mulling it over and also answering a question on StackOverflow, and decided that I need to assemble a taxonomy of resources; what kinds there are, how to recognise them by their distinctive markings, guidelines for upkeep, training, feeding, breeding, etc. And because I’m obsessive-compulsive, I also wanted to develop a complete and rigorous theory of resources from the ground up.

I don’t like it when I have to say “In that sort of situation I normally try something like this and it usually seems to work.”

So without further ado, I present the General Theory of Resources.

Resources Modeled by Conceptual Bits

Imagine an array of bits, ones and zeros. Each represents something that has to be rationed between the parts of your program that want to use it. Only one “user” at a time can take ownership of the thing represented by the bit. When the thing is in use, its corresponding bit is set to 1. Otherwise, it is set to 0. If all the bits are 1 when you show up with your begging bowl, you’re out of luck. Note that these are just conceptual bits; real implementations might not actually maintain an array of bits or Booleans specifically for this purpose. But the two-state nature is always there nonetheless. We can imagine each available resource being modeled by a bit indicating whether it is currently reserved for exclusive use.

Are we talking about an infinite number of bits? Of course not. If we were, then we wouldn’t need to go any further in this discussion. We are frequently talking about a single bit. Often we are talking about a number of bits just large enough to fool us into thinking that our program is correct, only to find out later that it isn’t because it eventually runs out of bits.

Enter and Exit

Atop this set of bits, there is a pair of operations, which I’ll call Enter and Exit. When Enter executes, it sets a bit, or possibly a bunch of bits to 1. When Exit executes, it sets the bits back to 0 again. The two operations abstract over the underlying set of bits.

We sometimes say that a resource “exists” whenever we have executed Enter but have not yet executed Exit in the right way to reverse the state changes caused by Enter. A correct program must execute Exit once for every time it executes Enter.

Fungibility

You don’t complain to the bank if they give you “someone else’s cash”; it can be any old cash, as long as it adds up to the required value.

Some resources are fungible, meaning that the caller doesn’t care which particular bunch of bits is set to 1. An incorrect program would run out of such fungible bits because it would forget to Exit and so the number of 1-bits would build up over time, until eventually there would be no 0-bits remaining.

If you came home after leaving your kids with the babysitter, and found that they were different kids, you wouldn’t be satisfied if the baby sitter said, “They look pretty similar to your kids, don’t they?”

Other resources are non-fungible, meaning that the caller requests that a specific bit must be set, so each bit has a meaningful identity. An incorrect program would try to set a non-fungible bit to 1 that it had already set to 1. Sometimes a resource has only one bit to play with, in which case it is surely non-fungible, because we cannot use it without implicitly specifying which bit we want to use.

Orthogonality

Some resource types are orthogonal, meaning that the states of the bits are effectively mutually independent (they may be bunched together by the Enter and Exit operations, but if those bunches are mutually orthogonal then we still say that the resource is orthogonal).

Some are non-orthogonal, meaning that the states of the bits are entangled in some way, even though they are manipulated by separate Enter and Exit operations. Sometimes a resource has only one bit to play with, in which case we may as well call it orthogonal (because there is nothing for the single bit to be entangled with).

The only non-orthogonal kind of resource I consider here is stack-like. There may be other useful kinds, but I can’t think of any off hand.

Some Examples

Here’s a nice table of some contrived examples and their various attributes:

Resource Enter Exit Bits Fungible Orthogonal
Unmanaged memory in C malloc free One per word of memory Yes Yes
A stack of integers A Push method A Pop method One bit per stack depth No No
A static Boolean flag, _entered, protecting a method from reentrance // start of method
if (_entered ==
    true) return;

_entered = true;
// end of method
_entered = false;
One No Yes
A file The CreateFile API The CloseHandle API One per file No Yes
An XML output stream A method such as BeginElement A method such as EndElement One per open element No No
A thread-local integer, _entered, that monitors recursion into a method // start of method
_entered++;
// end of method
_entered--;
One per possible level of recursion No No

Something to consider: I give no example that is both fungible and non-orthogonal. If a resource is non-orthogonal, it means that the user has to be careful about the order in which it takes ownership of or releases ownership instances of that resource. It cannot do this if it is required to treat the resource as fungible, to not care which instance it is holding onto. So non-orthogonal implies non-fungible.

Another thing that should certainly be clear by now is that a "resource" is not something the operating system provides you with, implemented in native code. It can be, in the case of resources encapsulated in kernel handles that can be closed with CloseHandle, but there are other things, which you can write all by yourself, in pure C# or VB.NET or F#, which are also best thought of as resources.

Now let’s pick a few of them apart to make sure it’s all clear.

Unmanaged Memory in C

This may seem like a strange example to pick, because we have managed memory these days. But managed memory is fundamentally the same as unmanaged memory except that the runtime performs lazy cleanup automatically on another thread, and the question we must ask is: what makes this an appropriate solution for managing a resource? With what kinds of resource may we use such a clever solution?

Firstly, consider the C unmanaged memory heap, also known as the “free store”. We’ll assume memory has a resolution of one “word” (which would typically be either 4 or 8 bytes, but that doesn’t matter right now). Each word is either part of the free memory or it is part of some allocated memory. So apart from its actual contents (which we don’t presently care about at all), each word has one of two states associated with it: 0 for free, or 1 for allocated. These state bits are all completely orthogonal - independent from each other.

On top of this simple model, the heap provides a convenient abstraction: you can use Enter to find N contiguous free words for you and it will mark them as allocated (set the bit corresponding to each word). Later on, you can pass to Exit the address of the first word you were allocated and the heap will set those words back to the unallocated state (unset the bits). Each bunch of bits is orthogonal with respect to all other bunches, because the bunches can never overlap.

To reiterate, we are not talking about manipulating the stuff stored in the words (that’s the application’s job). We’re just imagining that the heap keeps track of which words are in use and which are free by recording those facts in a separate array of bits, one bit for each word of memory in the process. Of course in practice the heap can optimize its data structures so it doesn’t actually need to keep an array of bits to track the allocation status of every word separately.

If we rewrite the heap’s operations in C#, using my names for them, then they look like this:

IntPtr Enter(int wordsToAllocate);
void Exit(IntPtr blockToFree);

This is the standard pattern for an orthogonal resource. The parameter(s) to Enter do not give the identity of the bits that need to be set. Instead, they describe what is needed in a more general sense, and the Enter operation finds a group of bits that fit the bill: any group of bits representing a contiguous region of words of the length requested.

We then have this token or receipt being returned by Enter and we have to give it back to Exit when we no longer need the memory. The token can be used to access the block of memory, and it also is the key to freeing it.

A Stack of Integers

Now consider a stack of integers, something like a Stack<int>. The Enter operation takes the thing to put on the stack. This puts the system (the stack) into a state of being taller than it was before. However, the Exit operation does not require any information. It is the nature of a stack to only let you remove the top-most item, so all you need to tell it is that you would like it to remove something. You don’t have to tell it which item to remove (if you could do that, it wouldn’t be a stack). If this isn’t what your program needs, then you shouldn’t be using a stack.

As with any other resource manager, a stack has a conceptual array of bits. It is usually very large, but of course not infinite. Most of the bits are set to 0, most of the time. All the bits set to 1 are at the front of the array, and each 1-bit represents an item on the stack.

1 1 1 1 0 0 0 0 0

When we push something on the stack, we set the next 0-bit to 1:

1 1 1 1 1 0 0 0 0

When we pop something off the stack, we set the last 1-bit to 0:

1 1 1 1 0 0 0 0 0

Obviously there is nothing like orthogonality here. The bits are very much entangled such that no state change can occur that would stop the bits from being properly sorted into two neat groups. You certainly can’t just flip the state of any bit. Nor is there fungibility, because the identity of a specific bit to modify is implied by the operation we execute (non-orthogonality implies non-fungibility, remember?)

If we wanted to we could make our stack resource fit the same pattern as the unmanaged heap functions, with a little creativity and shoehorning. Firstly, the Enter operation might return the new height of the stack, which would serve as a unique resource handle. Exit would then accept that same value. Now we’d have an interface much like the above memory interface. But Exit would have to throw an exception if we passed it a value that was not the current height of the stack. Apart from performing this check, there is no other purpose in passing a value to it. So it is thoroughly misleading to expose the above interface on a stack-like resource. Instead, stack-like resources should have this kind of interface:

void Enter(T item);
T Exit();

This makes it obvious that there is no choice about which state the stack will be restored to next. Bear in mind that the item parameter is quite optional. As with the memory example, we’re not interested here in what the job stack is doing, but merely it how it behaves as a resource.

A Boolean Flag as a Recursion Guard

Sometimes you need to protect a method against re-entrance. For example, suppose you have an event handler that in turn fires another event. There is a danger that someone else will wire up the other event to fire the event your handler is wired up to, and there will be an endless recursion. To protect against this, you can create a static Boolean variable and make your event handler do nothing at all if it is true. Otherwise your event handler is able to run normally. First it sets the flag to true, then it goes about its business (firing the other event), and then sets the flag back to false again. This stops the infinite recursion from happening.

But it also means that your event handler (or whatever method you apply this approach to) is a resource, albeit a very limited one. It can only be used by one client at a time in the whole process. Consequently it is modeled by a single logical state-bit (indeed, it is implemented by one, in the form of the Boolean variable).

This definitely makes it non-fungible (because we cannot help but specify which bit we want to take over) but also orthogonal (because there are no other bits for this single bit to be entangled with).

File Handles

Now for the really complicated example. With file handles, the Enter operation produces a unique identifier that you have to hold on to. You then have to pass the same identifier back to Exit to close the file handle. So far, it looks a lot like memory. But there’s an added wrinkle, which is that there are two different kinds of resource here: files and handles.

The OS maintains a set of handles, and that constitutes a system which does indeed have the same properties as the C memory heap. Each possible handle value has an orthogonal Boolean state (allocated, unallocated) associated with it, just as every word of memory does. Applications are not interested in which handle value they receive, making them also fungible.

But the OS also maintains persistent files with unique names on your hard drive, and these each have an orthogonal state-bit associated with each of them, if we consider write-locking, which is the most problematic case. But these are definitely non-fungible, because they are identified by their names. (Note that if you build an abstraction on top if this system to make up random temporary filenames, then they become fungible again, but that’s another story.)

In reality, the Enter operation presented to applications is actually two operations that together form an atomic transaction. First the specified file is locked, meaning that its associated bit is set to 1. Then an unallocated handle is found (which is entirely the choice of the OS, because like money and memory, handle values are fungible) and its bit is also set to 1.

Consequently, although the “handle system” allows multiple simultaneous handles to exist, the file system forces us to treat files very carefully as resources. It is vital to understand this, because the Enter/Leave API looks treacherously similar to the memory API:

IntPtr Enter(string fileName);
void Exit(IntPtr handleToClose);

It has to look like that because handles work that way. But files do not – they are closer to the example of using a Boolean flag to guard against reentrance.

This frequently tricks people into thinking that they can treat file handles in the same way as memory – that is, let a lazy garbage collector take care of calling Exit automatically some time, hopefully not too long after there are no remaining references to the file handle. But the difference is in the information being passed into Enter. It is not a general description of what might fit the bill, but the specific identity of the one and only thing that is needed.

So the truth is that a program that fails to close its file handles in a well-understood way is probably not going to work very well. On Windows there is in fact no way of opening a file without locking it against at least one other operation (deletion), so even non-exclusive file handles have some problems.

As a digression, consider directories or folders in the file system. Enter creates a directory and Leave destroys it. There is one root directory which cannot be destroyed, and all other directories must have a parent directory. The result is a tree-like structure. But we can conceptually simplify it by thinking of it as a set of interleaved stacks. The chain from any leaf directory to the root is a stack, so we know for sure that tree-like resources are at least as restrictive as stack-like resources.

The other two examples are not appreciably different from what we’ve already examined. A stream of XML being written sequentially – however it is implemented – is effectively a stack. Each 1-bit represents an open element. The elements must be closed in the right order to ensure well-formed XML is produced. And the integer being used to monitor the level of recursion into a method, that too is a stack: one that stores nothing except a record of its current depth.

Internal Resources vs External Resources

One crucial point that should be clear by now is that there is not really a distinction between states “inside” your program and states “outside” of it. There is a tendency among C# programmers to think of the stuff they write themselves as “managed” (because it’s in managed code) and the concepts exposed by the operating system as native or “unmanaged”.

In fact, if your program internally maintains state, and the state needs to obey rules like the ones described here, then it inevitably defines resources of its own. And unless you take steps to provide automatic management of the lifetimes of your custom resources, then they are unmanaged resources, no matter what language they are written in or what runtime technology they are based on.

While an object exists in the CLR, it may have internal mutable state, and there may be rules for the correct use of that state, such that we have to conclude unavoidably that it is a type of resource, and it may or may not be be fungible or orthogonal. So the fact that the whole object will eventually be garbage collected is no comfort. What matters is that the state of it is modified correctly during its lifetime. We need some kind of automatic resource management to deal with the internal state of our own objects, as well as the objects exposed to us the operating system.

So what does all this tell us? It tells us that managed memory is one kind of resource among many, that the fundamental nature of resources means that they require management of some kind, and that the technique (lazy garbage collection) applicable to management of memory is only applicable because of specific properties of that kind of resource (fungibility, orthogonality). Other resources do not have those properties, and yet still need to be managed. So we need other techniques for them.

Resource Management Techniques

How can we manage resources easily in C#? By taking advantage of facilities in the language wherever possible. We have the interface to the resource in the form of the Enter and Exit functions, but we would like to avoid the kind of problems that occur when we forget to call Exit once for each call to Enter. So we want something easier than merely putting in the calls manually.

There are three solutions:

  • Finalizers
  • IDisposable/using
  • Gateways (methods that Enter, forward to a delegate, then Exit)

The first two are often confused and conflated.  The last one is, to my mind, massively superior to the other two, for reasons that should become clear, and yet is rarely described in this context as far as I know. I’ve coined the term “gateway” because I’m not aware of another term for them, so please let me know if there’s a better name.

Finalizers

Finalizers are essentially horrid nasty things. It should rarely be necessary to write one. It is notoriously difficult to write one correctly. I have never bothered.

The basic idea is to hitch a ride with the GC’s ability to lazily dispose of memory resources. The information needed to Exit a resource is stored in an object, and when that object is reclaimed by the GC, the object’s finalizer executes and calls Exit for the resource.

The GC schedules the finalizer to run lazily (that is, some time soon but not immediately or at a predictable time), which means that it is only really useful for resources that are fungible. Consider what this means for file handles: finalization will ensure that your process does not run out of handles, because they are fungible and in plentiful supply, but it does not help at all in avoiding bugs caused by holding onto file locks for longer than necessary, because files are non-fungible.

And don’t even consider trying to manage stack-like resources with finalizers. Imagine the mess, with things being popped when they shouldn’t be.

The information needed to Exit the resource must be stored in value-type fields only; it is not generally safe to access other reference objects (classes) from a finalizer. This means that finalizers are only really useful for resources defined by the operating system as kernel objects (and in fact there is a better way of encapsulating those, called the SafeHandle.)

So the less said about finalizers the better.

IDisposable/using

The other official way of managing resources is to encapsulate the resource in an object (as with the finalizer approach) and to give it a Dispose method, as defined in the interface IDisposable. The construction of the object corresponds to Enter, and Dispose serves as an Exit.

Then the language provides the using statement as a convenient way of ensuring that Dispose will be called when the flow of control leaves a specified scope:

 using ( resource-acquisition ) embedded-statement 

This approach is fully deterministic and non-lazy. It has none of the limitations of finalizers. Because the using statement weaves itself into the thread’s method-call stack, it works very well with stack resources - each object’s Dispose is guaranteed by the C# specification to be called in the reverse order that the objects were constructed, assuming they were constructed in the resource-acquisition.

There are several intrinsic technical drawbacks, however:

  • The fact that Dispose is available as a public method gives the impression of an orthogonal resource. You don’t have to employ the using statement – you can just call Dispose manually whenever you want. This would play havoc with a stack-like resource just as surely as using a finalizer.
  • It is surprisingly easy to forget to put in the using statement. There are ways to mitigate this, but the problem is that you do have to know that the problem exists before you can take the necessary steps. Implementing a resource as an object does not make it difficult to abuse. It makes it easy to abuse.
  • The using statement ensures that Dispose is called even if an exception propagates from the embedded-statement. However, little is said about what happens if Dispose then throws another exception. This is all too likely in many common cases. What actually happens is that the CLR discards the first exception, so all information about the original cause is lost.

So the IDisposable/using technique doesn’t exactly call out to the user to employ it correctly. There are various ways to use it wrongly, and the exception problem is especially irritating. Nevertheless it is possible to use it correctly with a wide range of resources. Because it follows the program stack as it grows and shrinks, it obeys the rules of stack-like resources. Because it is deterministic, it is able to provide the predictable immediate Exit necessary when handling non-fungible resources.

The Cultural Problem with IDisposable

But these mere technical problems are minor compared with the cultural problem that exists around this technique. These are caused by copious amounts of official or semi-official documentation which confuse it with finalizers, in apparent denial of their different areas of applicability. For example, in this article we are shown the “basic IDisposable pattern” and sure enough, it has a finalizer as well. In such articles IDisposable is described as a safety feature to be added to any class that has a finalizer, when in fact it is a general mechanism for hooking into the program stack unwinding process via objects, built on top of the more fundamental try/finally mechanism.

The logical fallacy that results is like Woody Allen’s conclusion that “All men are Socrates”. The advice, if read carefully, is merely that all finalizable classes should also implement IDisposable. But the idea people frequently come away with is that any class that implements IDisposable should also have a finalizer. They get it completely the wrong way around.

I’ve found it almost impossible to shift this two-way link from many people’s minds, so much so that it is often controversial to suggest managing a stack-like resource with IDisposable, because some people will insist that the resource must also have a finalizer “just in case the user forgets to call Dispose”. Of course, for a stack-like resource the finalizer will be no help at all. In fact it will be no use for any resource defined in terms of reference objects, because it is not safe to manipulate them from a finalizer.

This cultural problem doesn’t exist in C++/CLI, because there the existence of IDisposable is hidden behind the C++ language facility called a destructor. A typical C++/CLI program would have lots of classes that implement IDisposable without the programmer giving it a second thought, and none of them would have finalizers. In C++, destructors are quite often used to manage stack-like resources, which in C++/CLI automatically translates into IDisposable being employed for that same purpose. This is due mostly to Herb Sutter, whose annotations here are particularly illuminating.

Gateways

The cultural problems around IDisposable prompted me to look for an alternative, and happily I’ve found one that is applicable in a lot of cases. I already described it in a previous blog post, although there I merely extolled its virtue of being more expressive than the using statement.

But I mostly glossed over its other technical advantages. First and foremost is the way it is extremely difficult to abuse. Providing access to a resource only through the gateway of a function is practically impossible to get wrong. You can’t forget to put in the using statement, because you don’t need one. Nor can you call the Dispose method of nested stack-like resources in the wrong order, because no such method is exposed to you.

For some kinds of resource, it is not necessary or desirable to force them to exit when an exception occurs. The reason for this is that you would expect an exception to abort the whole operation in which the resource is relevant. An example is the TraceWriter class that I included in the source code of this post. It includes this gateway method:

public void Indented(Action whileIndented)
{
    WriteLine("{");
    _indent++;
    whileIndented();
    _indent--;
    WriteLine("}");
}

This allows me to write very readable code:

writer.WriteLine("string comparison:");
writer.Indented(() =>
    {
        writer.WriteLine(leftValue);
        writer.WriteLine(rightValue);
        writer.WriteLine(new String(' ', index)
                         + "^ difference at index " + index);
    });

It is absolutely impossible for me to forget to un-indent after I’ve indented, thanks to the use of a gateway function to control the use of the indenting “resource”. However, I’ve made the Indented method completely ignore exceptions, because in this program, if an exception occurs during the logging process, I expect the whole log file to be abandoned anyway. Had this not been the case, I would have use try/finally within Indented.

Only by writing our own gateway functions can we make the right choice for each case.

This works for all kinds of resources, of course. It is not only free from the cultural/educational baggage surrounding IDisposable, but it also does a better job technically, giving us control over the handling of exceptions and protecting us from accidental abuses.

So in summary: where resource-like patterns of state occur in your own programs, I recommend where possible making little gateway functions to provide access to those states, instead encapsulating the resource in an IDisposable class.

No comments: