Thursday 26 February 2009

Making Lambdas and Iterators Play Nicely Together

The major problem with this idea is in my favourite feature of C#. In an iterator (a function containing the yield keyword), the code is transformed into a state machine represented by a class. It has to be chopped up into sections, so the function can be “paused” after each yield return. Those pieces must all be in the same function, so sadly lambdas don’t play well with iterators.
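To make the clash concrete, here is a sketch of the kind of code the compiler refuses today (this snippet is deliberately non-compiling, for illustration only):

```csharp
// Does not compile: the C# compiler rejects any yield statement
// inside an anonymous method or lambda expression, because only a
// whole method body can be transformed into an iterator state machine.
Func<IEnumerable<int>> numbers = () =>
{
    yield return 1; // error: yield cannot appear in a lambda
    yield return 2;
};
```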

There are a couple of possible future language features which would completely solve this, however.

Firstly, how about yield foreach? This would take an IEnumerable<T> and yield each of its items in turn, removing the need to write an explicit foreach loop to return a nested sequence. This amazing paper shows how the feature could be implemented (the authors even picked the same keyword), although their use case is supporting recursion with good performance.

Secondly, I’d like to be able to write an anonymous iterator. This would be a matter of using yield return in the middle of a lambda, and hey presto, the compiler would build me an iterator instead of a normal anonymous method.

How would these two features help here? Suppose I wanted to use my TraceWriter (see the source download) class in an iterator. Here’s how it would look:

IEnumerable<int> MyIterator(TraceWriter w)
{
    yield return 1;
    w.WriteLine("Not indented yet");

    yield foreach w.Indented(() =>
    {
        yield return 2;
        w.WriteLine("Indented now");
        yield return 3;
    });

    w.WriteLine("Un-indented again");
    yield return 4;
}

So we have the same readability as before. This would require a special implementation of the Indented method, specifically for iterators:

public IEnumerable<T> Indented<T>(IEnumerable<T> sequence)
{
    WriteLine("{");
    _indent++;

    yield foreach sequence;

    _indent--;
    WriteLine("}");
}
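For comparison, here is how Indented has to be written today, without the hypothetical yield foreach: each item of the nested sequence must be forwarded by hand.

```csharp
// Today's equivalent of the iterator above: an explicit foreach
// loop re-yields every item of the inner sequence.
public IEnumerable<T> Indented<T>(IEnumerable<T> sequence)
{
    WriteLine("{");
    _indent++;

    foreach (T item in sequence)
        yield return item;

    _indent--;
    WriteLine("}");
}
```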

It’s handy, that yield foreach thing, isn’t it?

Fatal Exceptions, and Why VB.NET Has a Purpose!

There’s a feature in the CLR that is exposed in VB.NET and not in C#. Apparently this is a deliberate decision made by the C# team, which so far they’ve stuck to. Unfortunately, it’s a pretty important feature.

I was prompted by Andrew Pardoe of the CLR team to look more closely at this when considering try/finally and the problem of “fatal” exceptions. By fatal, I mean an exception that you have no idea how to handle. A good example is NullReferenceException, which almost certainly indicates a bug. If there’s a bug, your program is in an unknown state, and you need to capture that state exactly as it is. Ideally the program would terminate at that point. In the interests of flexibility, the CLR does allow you to catch such exceptions. (Then the CLR team tells you that you shouldn’t – it’s a “fruit of the tree of knowledge” type of situation.)

Very often a method will throw a variety of exceptions to indicate that it was unable to finish what it was supposed to do, but it ain’t the end of the world. These are recoverable exceptions. A simple example is FileInfo.Delete:

string errorMessage = null;

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (SecurityException x)
{
    errorMessage = x.Message;
}
catch (IOException x)
{
    errorMessage = x.Message;
}
catch (UnauthorizedAccessException x)
{
    errorMessage = x.Message;
}

if (errorMessage != null)
    MessageBox.Show("You idiot - " + errorMessage);

Now, there’s a slight problem, which we’ll gloss over – how do you know what exceptions a method will throw? Ultimately, you need accurate, reliable, up-to-date documentation for every method. But instead of waiting for hell to freeze over, let’s move on.

When you have an operation that performs many such calls to various APIs, 3rd party libraries and so on, the ideal place to put the handler is at the outermost layer where you can recover. For example, the user presses the button and you run a lot of code to do what the user has asked. The best place to put the try/catch is around that whole sequence of operations, so that the catch can display the error to the user, and so they can figure out why they’re such an idiot. Inside that operation, you need to use try/finally (and related constructs) to ensure that any half-finished stuff is always properly undone. Nice and simple.

Except for one slight irritation: now the set of exceptions you need to catch is the union of all the recoverable exceptions that might be thrown by all the operations performed within that combined operation. This means that your try/catch block potentially has a ridiculously long list of types. It’s also very easy to miss an exception type here or there, which would mean that your program would crash when it didn’t need to. Whenever someone performs maintenance on the big operation (or on any of the APIs or 3rd party libraries it calls into), the huge list of catch blocks needs to be updated to ensure nothing is missed.

Also this all needs to be done separately for every try/catch statement. So you might think of reducing this hideous boilerplate repetition by writing it once:

public static bool Recover(Action action, Action<Exception> recover)
{
    try
    {
        action();
        return true;
    }
    catch (SecurityException x)
    {
        recover(x);
    }
    catch (IOException x)
    {
        recover(x);
    }
    catch (UnauthorizedAccessException x)
    {
        recover(x);
    }
    //others...

    return false;
}

You can now write nice neat lambdas to provide the code for the action to try and the single recovery handler. But you still have the same major problem: this long list of all the recoverable exceptions has to include every single one that might be thrown anywhere in any operation.

What if your product is extensible, such that 3rd parties can write plug-ins for it? How do they add their own recoverable exceptions to the Recover function? They simply can’t. And even if they could, what if they don’t remember to? How does your application protect itself?

So you have explosive complexity and apparently no way to control it.

At this point, the following quite reasonable thought occurs to you: surely the number of fatal exceptions is smaller than the number of non-fatal ones. Indeed, the number of non-fatal ones is constantly growing as we create new exception classes of our own. Also, if I make any mistakes in constructing my software, I’d prefer my program not to crash.

The natural conclusion, which therefore naturally occurs to everyone eventually, is to do this:

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (Exception x)
{
    MessageBox.Show("Couldn't delete file: " + x.Message);
}

But then you remember that there’s such a thing as a fatal exception. Although you’d generally prefer your software not to crash, in the event of a definite bug, it is better for it to crash – that way, you get an accurate snapshot of the program’s state. So you need to examine the exception object before you try to handle it:

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (Exception x)
{
    if (!ExceptionPolicy.IsExceptionRecoverable(x))
        throw;

    MessageBox.Show("Couldn't delete file: " + x.Message);
}

Of course, that IsExceptionRecoverable method has to be written by you, but only once, and it probably has a relatively short list of exception types to look out for, and that list doesn’t grow so fast as the list of recoverable exceptions.
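As a hedged sketch only (the exact list is your own policy decision, and the ExceptionPolicy class name is just carried over from the example above), such a method might look like this:

```csharp
// Sketch of an "opt out" policy: a short list of exception types
// treated as fatal; anything else is assumed to be recoverable.
public static class ExceptionPolicy
{
    public static bool IsExceptionRecoverable(Exception x)
    {
        return !(x is NullReferenceException
              || x is IndexOutOfRangeException
              || x is InvalidCastException
              || x is OutOfMemoryException
              || x is StackOverflowException
              || x is AccessViolationException);
    }
}
```

Note that this list is opt-out: new recoverable exception types, including those from 3rd party plug-ins, need no changes here.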

The above solution represents the state of the art according to the Microsoft Enterprise Library. Unfortunately, it’s still not right. The problem is to do with the thing I mentioned above about how to correctly write the code within each long and complex operation:

Inside that operation, you need to use try/finally (and related constructs) to ensure that any half-finished stuff is always properly undone.

The problem is, you only want the finally blocks to run for recoverable exceptions, not fatal ones. To repeat what I said right at the start:

If there’s a bug, your program is in an unknown state, and you need to capture that state exactly as it is.

So you definitely don’t want to run any finally blocks before that state is captured. They might modify it (in fact, that’s the whole point of them). They might make it worse, cause further corruption, or even throw another exception and so hide the original bug.

My first thought when this occurred to me was that try/finally presents a serious problem. How can we make it execute the finally block for a recoverable exception but not for a fatal exception?

Well, we can, sort of. If you don’t ever catch an exception, the finally blocks never run. When an exception is thrown, the CLR does not immediately run the finally blocks. First, it checks to see if there is a suitable catch in the stack for the current exception. If there is, and only if there is, it runs the enclosed finally blocks, and then runs the catch block.

This is why the correct advice is to never catch fatal exceptions, because merely catching them will trigger the silent execution of finally blocks when the program is already in an unrecoverable situation.

But this is sadly no good to us, because we’ve already established that explicitly catching only the recoverable exception types leads to hideous and unmanageable complexity, and to programs that crash when they don’t need to. It’s just not practical to work that way.

The problem all stems from the fact that catch blocks can only decide to run based on the type of the exception, and this has to be an “opt in”. It can’t be coded the other way round, as an “opt out”. Yes, exceptions can be arranged in hierarchies so that one catch block can catch all exception types derived from some base class, but this turns out to be useless, because there’s no base class that all recoverable exceptions have in common except the one that is also the base class of all fatal exceptions.

So we cry: if only the CLR allowed us to run some code of our own to decide whether to catch an exception, before the finally blocks run.

The surprising answer (to a C# programmer) is that the CLR does allow us to do that. It just isn’t made available in C#. Only in VB.NET. Devastating, huh?

So to do exception handling in C#, in a way that is actually manageable, but also doesn’t execute code after a fatal exception, you first have to crank up VB.NET and create a class library containing this class:

Imports System

Public Class Exceptions
    Public Shared Sub TryFilterCatch(ByVal tryAction As Action, _
            ByVal filter As Func(Of Exception, Boolean), _
            ByVal handler As Action(Of Exception))
        Try
            tryAction()
        Catch ex As Exception When filter(ex)
            handler(ex)
        End Try
    End Sub
End Class

This effectively turns the magic feature of VB.NET – the When clause that appears next to the Catch – into something we can reuse in any CLR-based language. Having built that, you can now go back to C#… phew!

Now, supposing you still have that IsExceptionRecoverable method handy, you can do this:

Exceptions.TryFilterCatch(() =>
{
    new FileInfo(@"c:\myfile.txt").Delete();
},
ExceptionPolicy.IsExceptionRecoverable, x =>
{
    MessageBox.Show("Couldn't delete file: " + x.Message);
});

Given that we are effectively forced to this conclusion by the above reasoning process, it’s very strange to look at where we’ve ended up. Essentially, if you want a manageable, fault-tolerant, practical approach to exception handling, the above example is what you need to use, instead of using C# try/catch statements.

How does this situation persist? Why isn’t general exception filtering exposed in C#, as it is in VB.NET? These are very good questions. It seems that the C# team doesn’t accept the line of reasoning I gave above. According to them, we should all write long and unmanageable lists of catch handlers, explicitly listing the recoverable exceptions, based on the apparently trustworthy documentation of every software component we ever call into.

The CLR team, on the other hand, is well aware of this problem, and is aware that people are forced to try an alternative, and so inevitably make their way down the chain of reasoning I’ve described here, and the Microsoft Enterprise Library is probably representative of where most people stop in that chain.

The earliest reference I’ve found for all this stuff is here. It covers all the technical details, and mentions the idea of exposing VB.NET’s filtering capability.

This is the blog post that Andrew Pardoe pointed me to. The more I think about it, the more I think it’s a message to the C# team, rather than to users. The chances of many users hearing the message are low, and the chances of them understanding the implications are probably a little lower still. They must be hoping that the C# team will hear it and understand it. I guess C# is a customer of the CLR, so the C# team is “always right” in that relationship. Consequently the message will need to come from customers of the C# team!

In the meantime, the CLR team is doing something about this problem:

In version 4 of the CLR, the product team is making exceptions that indicate a corrupted process state distinct from all other exceptions.

This is only part of the solution, of course, because in some situations our own exception types can indicate that the program is in an invalid state. The “corruption” is not of the low-level, random byte trashing kind, but it is corruption by a bug, nonetheless. So we need the recoverability criterion to be specified in a customizable way for each try/catch situation.

This means that, even after version 4 of the CLR, we will still need that snippet of VB code. Until the C# team catch-es up with the VB.NET team.

Wednesday 25 February 2009

What's Wrong With C# 4.0's dynamic Keyword, and How I Think It Should Be Fixed

Update: Exactly what I want, already implemented, complete with caching of the call site. Also, Microsoft's Mads Torgersen responds to this suggestion here.

C# 4.0 proposes to add the dynamic keyword to the language. There are a few problems with it:

  1. Beginners have no clue what the difference is between object, var and dynamic. This is a relatively minor problem compared to the others, but it's worth considering.
  2. The dependencies of a program on specific features of a dynamic object are not captured in a central definition, defeating the type system. I'll explain this a bit better below.
  3. There's no intellisense support (this is really the same thing as point 2, but it's so important that it's worth highlighting on its own).
  4. New language features are an unnecessary pain if you can meet the same need with a library feature - and I believe you can do a better job with a library feature in this case.

In a nutshell, I sincerely hope that dynamic is removed from C# 4.0. In its place, there should be a static class in a namespace System.Dynamic. It would have one method, an extension on object:

public static TInterface AsDynamic<TInterface>(this object obj)
      where TInterface : class

This would take any object at all and return a dynamic wrapper that implements the specified interface by forwarding calls onto the wrapped object. The wrinkle here is that TInterface really should be an interface but there's no way of constraining to that (now there's a language feature I'd accept quite happily).

Now in order to make use of a dynamic object (that is, an object whose type is unknown to me) I simply declare the features I require of that object, in an ordinary interface. Whenever I receive such an object, I wrap it with that interface as soon as possible, and use that interface to refer to it thereafter.

What's so great about this? Suppose my dynamic object is a frog. So first I declare an interface IFrog:

public interface IFrog
{
   void Ribbit();
}

This states in a single location what features a dynamic frog must have.

Now I can make the majority of my code use this interface to refer to frogs. At the "dynamic boundary" where I receive weird objects from the outside world, I immediately convert them to IFrog so the rest of my code can use them:

IFrog f = something.AsDynamic<IFrog>();

This doesn't change the dynamic nature of what is happening. Exactly the same kind of method resolution now has to happen at runtime, and no capability has been lost compared to the present dynamic keyword design.
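To illustrate the forwarding involved, here is a hand-written sketch of the kind of wrapper AsDynamic<IFrog> would have to generate (a real implementation would emit such a type at runtime with Reflection.Emit rather than write it by hand; the class name here is invented):

```csharp
// Illustration only: what a generated wrapper for IFrog might look
// like. Each interface member is forwarded to the wrapped object via
// late-bound reflection, so method resolution still happens at
// runtime, just as with the dynamic keyword.
public class FrogWrapper : IFrog
{
    private readonly object _target;

    public FrogWrapper(object target)
    {
        _target = target;
    }

    public void Ribbit()
    {
        _target.GetType()
               .GetMethod("Ribbit", Type.EmptyTypes)
               .Invoke(_target, null);
    }
}
```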

But there are big advantages. Firstly, users don't need to understand a new keyword that clashes conceptually with two other things. Secondly, what if the dynamic system that passes me frogs changes the way it works, so now it expects an argument to be passed to Ribbit? No problem for me. I just update my interface:

public interface IFrog
{
   void Ribbit(bool loudly);
}

Now when I recompile, the C# compiler will find all the places in my code where I need to provide the new argument. The lesson here is that all languages (whether statically or dynamically typed) end up having these kinds of contracts or dependencies on the features of objects. The advantage of static languages is that it pays to declare those contracts centrally, so you can update them more easily.

Then there's the intellisense - obviously this works as usual with IFrog. It doesn't work at all with the current C# 4.0 design.

Finally, this would be a library feature. It would be available in a consistent way from C#, VB, F#, Eiffel, and so on. It's also very optional - if you don't want to see the AsDynamic extension appearing on every class, then don't add using System.Dynamic to your source file.

Also I may as well point out that it would be easy to instantly find all uses of dynamic type conversion within a solution, by simply finding all references to the AsDynamic method. So no need for a new IDE feature there.

This solution contains the dynamic stuff rather than spreading it around. It strikes me as being far more in the spirit of C# as a statically typed language designed for large-scale development. Yes, the dynamic keyword makes short examples look cool on the page. But it doesn't make them fun to write in the first place (no intellisense), or to maintain (no central type definition).

How about it, Microsoft?

Friday 13 February 2009

General Theory of Resources

Having blogged the other day about a different way of doing automatic cleanup, I’ve been mulling it over and also answering a question on StackOverflow, and decided that I need to assemble a taxonomy of resources; what kinds there are, how to recognise them by their distinctive markings, guidelines for upkeep, training, feeding, breeding, etc. And because I’m obsessive-compulsive, I also wanted to develop a complete and rigorous theory of resources from the ground up.

I don’t like it when I have to say “In that sort of situation I normally try something like this and it usually seems to work.”

So without further ado, I present the General Theory of Resources.

Resources Modeled by Conceptual Bits

Imagine an array of bits, ones and zeros. Each represents something that has to be rationed between the parts of your program that want to use it. Only one “user” at a time can take ownership of the thing represented by the bit. When the thing is in use, its corresponding bit is set to 1. Otherwise, it is set to 0. If all the bits are 1 when you show up with your begging bowl, you’re out of luck. Note that these are just conceptual bits; real implementations might not actually maintain an array of bits or Booleans specifically for this purpose. But the two-state nature is always there nonetheless. We can imagine each available resource being modeled by a bit indicating whether it is currently reserved for exclusive use.

Are we talking about an infinite number of bits? Of course not. If we were, then we wouldn’t need to go any further in this discussion. We are frequently talking about a single bit. Often we are talking about a number of bits just large enough to fool us into thinking that our program is correct, only to find out later that it isn’t because it eventually runs out of bits.

Enter and Exit

Atop this set of bits, there is a pair of operations, which I’ll call Enter and Exit. When Enter executes, it sets a bit (or possibly a bunch of bits) to 1. When Exit executes, it sets the bits back to 0 again. The two operations abstract over the underlying set of bits.

We sometimes say that a resource “exists” whenever we have executed Enter but have not yet executed Exit in the right way to reverse the state changes caused by Enter. A correct program must execute Exit once for every time it executes Enter.

Fungibility

You don’t complain to the bank if they give you “someone else’s cash”; it can be any old cash, as long as it adds up to the required value.

Some resources are fungible, meaning that the caller doesn’t care which particular bunch of bits is set to 1. An incorrect program would run out of such fungible bits because it would forget to Exit and so the number of 1-bits would build up over time, until eventually there would be no 0-bits remaining.
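Here is a tiny sketch of a fungible, orthogonal resource manager with the conceptual bits made literal (all names are invented for illustration):

```csharp
// Each bit records whether one interchangeable resource is in use.
// Enter claims any free bit and returns its index as a receipt;
// Exit takes the receipt back and clears that bit.
public class BitPool
{
    private readonly bool[] _bits = new bool[8];

    public int Enter()
    {
        for (int i = 0; i < _bits.Length; i++)
        {
            if (!_bits[i])
            {
                _bits[i] = true;
                return i; // any free bit will do: fungible
            }
        }

        throw new InvalidOperationException("All bits are in use");
    }

    public void Exit(int receipt)
    {
        _bits[receipt] = false;
    }
}
```

A program that forgets to call Exit eventually exhausts the pool, exactly as described above.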

If you came home after leaving your kids with the babysitter, and found that they were different kids, you wouldn’t be satisfied if the baby sitter said, “They look pretty similar to your kids, don’t they?”

Other resources are non-fungible, meaning that the caller requests that a specific bit must be set, so each bit has a meaningful identity. An incorrect program would try to set a non-fungible bit to 1 that it had already set to 1. Sometimes a resource has only one bit to play with, in which case it is surely non-fungible, because we cannot use it without implicitly specifying which bit we want to use.

Orthogonality

Some resource types are orthogonal, meaning that the states of the bits are effectively mutually independent (they may be bunched together by the Enter and Exit operations, but if those bunches are mutually orthogonal then we still say that the resource is orthogonal).

Some are non-orthogonal, meaning that the states of the bits are entangled in some way, even though they are manipulated by separate Enter and Exit operations. Sometimes a resource has only one bit to play with, in which case we may as well call it orthogonal (because there is nothing for the single bit to be entangled with).

The only non-orthogonal kind of resource I consider here is stack-like. There may be other useful kinds, but I can’t think of any off hand.

Some Examples

Here’s a nice table of some contrived examples and their various attributes:

Resource: Unmanaged memory in C
    Enter: malloc
    Exit: free
    Bits: one per word of memory
    Fungible: yes; Orthogonal: yes

Resource: A stack of integers
    Enter: a Push method
    Exit: a Pop method
    Bits: one per stack depth
    Fungible: no; Orthogonal: no

Resource: A static Boolean flag, _entered, protecting a method from reentrance
    Enter: at the start of the method — if (_entered) return; _entered = true;
    Exit: at the end of the method — _entered = false;
    Bits: one
    Fungible: no; Orthogonal: yes

Resource: A file
    Enter: the CreateFile API
    Exit: the CloseHandle API
    Bits: one per file
    Fungible: no; Orthogonal: yes

Resource: An XML output stream
    Enter: a method such as BeginElement
    Exit: a method such as EndElement
    Bits: one per open element
    Fungible: no; Orthogonal: no

Resource: A thread-local integer, _entered, that monitors recursion into a method
    Enter: at the start of the method — _entered++;
    Exit: at the end of the method — _entered--;
    Bits: one per possible level of recursion
    Fungible: no; Orthogonal: no

Something to consider: I give no example that is both fungible and non-orthogonal. If a resource is non-orthogonal, it means that the user has to be careful about the order in which it takes ownership of, or releases ownership of, instances of that resource. It cannot do this if it is required to treat the resource as fungible, i.e. to not care which instance it is holding onto. So non-orthogonal implies non-fungible.

Another thing that should certainly be clear by now is that a "resource" is not something the operating system provides you with, implemented in native code. It can be, in the case of resources encapsulated in kernel handles that can be closed with CloseHandle, but there are other things, which you can write all by yourself, in pure C# or VB.NET or F#, which are also best thought of as resources.

Now let’s pick a few of them apart to make sure it’s all clear.

Unmanaged Memory in C

This may seem like a strange example to pick, because we have managed memory these days. But managed memory is fundamentally the same as unmanaged memory, except that the runtime performs lazy cleanup automatically on another thread. The questions we must ask are: what makes this an appropriate solution for managing a resource? With what kinds of resource may we use such a clever solution?

Firstly, consider the C unmanaged memory heap, also known as the “free store”. We’ll assume memory has a resolution of one “word” (which would typically be either 4 or 8 bytes, but that doesn’t matter right now). Each word is either part of the free memory or it is part of some allocated memory. So apart from its actual contents (which we don’t presently care about at all), each word has one of two states associated with it: 0 for free, or 1 for allocated. These state bits are all completely orthogonal - independent from each other.

On top of this simple model, the heap provides a convenient abstraction: you can use Enter to find N contiguous free words for you and it will mark them as allocated (set the bit corresponding to each word). Later on, you can pass to Exit the address of the first word you were allocated and the heap will set those words back to the unallocated state (unset the bits). Each bunch of bits is orthogonal with respect to all other bunches, because the bunches can never overlap.

To reiterate, we are not talking about manipulating the stuff stored in the words (that’s the application’s job). We’re just imagining that the heap keeps track of which words are in use and which are free by recording those facts in a separate array of bits, one bit for each word of memory in the process. Of course in practice the heap can optimize its data structures so it doesn’t actually need to keep an array of bits to track the allocation status of every word separately.

If we rewrite the heap’s operations in C#, using my names for them, then they look like this:

IntPtr Enter(int wordsToAllocate);
void Exit(IntPtr blockToFree);

This is the standard pattern for an orthogonal resource. The parameter(s) to Enter do not give the identity of the bits that need to be set. Instead, they describe what is needed in a more general sense, and the Enter operation finds a group of bits that fit the bill: any group of bits representing a contiguous region of words of the length requested.

We then have this token or receipt being returned by Enter and we have to give it back to Exit when we no longer need the memory. The token can be used to access the block of memory, and it also is the key to freeing it.

A Stack of Integers

Now consider a stack of integers, something like a Stack<int>. The Enter operation takes the thing to put on the stack. This puts the system (the stack) into a state of being taller than it was before. However, the Exit operation does not require any information. It is the nature of a stack to only let you remove the top-most item, so all you need to tell it is that you would like it to remove something. You don’t have to tell it which item to remove (if you could do that, it wouldn’t be a stack). If this isn’t what your program needs, then you shouldn’t be using a stack.

As with any other resource manager, a stack has a conceptual array of bits. It is usually very large, but of course not infinite. Most of the bits are set to 0, most of the time. All the bits set to 1 are at the front of the array, and each 1-bit represents an item on the stack.

1 1 1 1 0 0 0 0 0

When we push something on the stack, we set the next 0-bit to 1:

1 1 1 1 1 0 0 0 0

When we pop something off the stack, we set the last 1-bit to 0:

1 1 1 1 0 0 0 0 0

Obviously there is nothing like orthogonality here. The bits are very much entangled, such that no state change can occur that would stop the bits from being properly sorted into two neat groups. You certainly can’t just flip the state of any bit. Nor is there fungibility, because the identity of the specific bit to modify is implied by the operation we execute (non-orthogonality implies non-fungibility, remember?).

If we wanted to we could make our stack resource fit the same pattern as the unmanaged heap functions, with a little creativity and shoehorning. Firstly, the Enter operation might return the new height of the stack, which would serve as a unique resource handle. Exit would then accept that same value. Now we’d have an interface much like the above memory interface. But Exit would have to throw an exception if we passed it a value that was not the current height of the stack. Apart from performing this check, there is no other purpose in passing a value to it. So it is thoroughly misleading to expose the above interface on a stack-like resource. Instead, stack-like resources should have this kind of interface:

void Enter(T item);
T Exit();

This makes it obvious that there is no choice about which state the stack will be restored to next. Bear in mind that the item parameter is quite optional. As with the memory example, we’re not interested here in what job the stack is doing, but merely in how it behaves as a resource.

A Boolean Flag as a Recursion Guard

Sometimes you need to protect a method against re-entrance. For example, suppose you have an event handler that in turn fires another event. There is a danger that someone else will wire up the other event to fire the event your handler is wired up to, and there will be an endless recursion. To protect against this, you can create a static Boolean variable and make your event handler do nothing at all if it is true. Otherwise your event handler is able to run normally. First it sets the flag to true, then it goes about its business (firing the other event), and then sets the flag back to false again. This stops the infinite recursion from happening.
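A sketch of that guard in code (the event handler and method names are invented for illustration):

```csharp
// The handler is a single-bit resource: _entered is the bit.
private static bool _entered;

private void OnSomethingHappened(object sender, EventArgs e)
{
    if (_entered)
        return;            // the bit is already 1: refuse reentry

    _entered = true;       // Enter: set the bit
    try
    {
        FireOtherEvent();  // may indirectly re-raise our own event
    }
    finally
    {
        _entered = false;  // Exit: clear the bit
    }
}
```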

But it also means that your event handler (or whatever method you apply this approach to) is a resource, albeit a very limited one. It can only be used by one client at a time in the whole process. Consequently it is modeled by a single logical state-bit (indeed, it is implemented by one, in the form of the Boolean variable).

This definitely makes it non-fungible (because we cannot help but specify which bit we want to take over) but also orthogonal (because there are no other bits for this single bit to be entangled with).

File Handles

Now for the really complicated example. With file handles, the Enter operation produces a unique identifier that you have to hold on to. You then have to pass the same identifier back to Exit to close the file handle. So far, it looks a lot like memory. But there’s an added wrinkle, which is that there are two different kinds of resource here: files and handles.

The OS maintains a set of handles, and that constitutes a system which does indeed have the same properties as the C memory heap. Each possible handle value has an orthogonal Boolean state (allocated, unallocated) associated with it, just as every word of memory does. Applications are not interested in which handle value they receive, making handle values also fungible.

But the OS also maintains persistent files with unique names on your hard drive, and each of these has an orthogonal state-bit associated with it, if we consider write-locking, which is the most problematic case. But these are definitely non-fungible, because they are identified by their names. (Note that if you build an abstraction on top of this system to make up random temporary filenames, then they become fungible again, but that’s another story.)

In reality, the Enter operation presented to applications is actually two operations that together form an atomic transaction. First the specified file is locked, meaning that its associated bit is set to 1. Then an unallocated handle is found (which is entirely the choice of the OS, because like money and memory, handle values are fungible) and its bit is also set to 1.

Consequently, although the “handle system” allows multiple simultaneous handles to exist, the file system forces us to treat files very carefully as resources. It is vital to understand this, because the Enter/Exit API looks treacherously similar to the memory API:

IntPtr Enter(string fileName);
void Exit(IntPtr handleToClose);

It has to look like that because handles work that way. But files do not – they are closer to the example of using a Boolean flag to guard against reentrance.

This frequently tricks people into thinking that they can treat file handles in the same way as memory – that is, let a lazy garbage collector take care of calling Exit automatically some time, hopefully not too long after there are no remaining references to the file handle. But the difference is in the information being passed into Enter. It is not a general description of what might fit the bill, but the specific identity of the one and only thing that is needed.

So the truth is that a program that fails to close its file handles in a well-understood way is probably not going to work very well. On Windows there is in fact no way of opening a file without locking it against at least one other operation (deletion), so even non-exclusive file handles have some problems.

As a digression, consider directories or folders in the file system. Enter creates a directory and Exit destroys it. There is one root directory which cannot be destroyed, and all other directories must have a parent directory. The result is a tree-like structure. But we can conceptually simplify it by thinking of it as a set of interleaved stacks. The chain from any leaf directory to the root is a stack, so we know for sure that tree-like resources are at least as restrictive as stack-like resources.

The other two examples are not appreciably different from what we’ve already examined. A stream of XML being written sequentially – however it is implemented – is effectively a stack. Each 1-bit represents an open element. The elements must be closed in the right order to ensure well-formed XML is produced. And the integer being used to monitor the level of recursion into a method, that too is a stack: one that stores nothing except a record of its current depth.
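
The standard XmlWriter makes that stack discipline concrete: every WriteStartElement pushes a 1-bit that must be popped by a matching WriteEndElement, in strictly nested order. A small illustration:

```csharp
using (XmlWriter writer = XmlWriter.Create(Console.Out))
{
    writer.WriteStartElement("outer");  // push
    writer.WriteStartElement("inner");  // push
    writer.WriteString("text");
    writer.WriteEndElement();           // pop "inner" - must come first
    writer.WriteEndElement();           // pop "outer"
}
```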

Internal Resources vs External Resources

One crucial point that should be clear by now is that there is not really a distinction between states “inside” your program and states “outside” of it. There is a tendency among C# programmers to think of the stuff they write themselves as “managed” (because it’s in managed code) and the concepts exposed by the operating system as native or “unmanaged”.

In fact, if your program internally maintains state, and the state needs to obey rules like the ones described here, then it inevitably defines resources of its own. And unless you take steps to provide automatic management of the lifetimes of your custom resources, then they are unmanaged resources, no matter what language they are written in or what runtime technology they are based on.

While an object exists in the CLR, it may have internal mutable state, and there may be rules for the correct use of that state, such that we have to conclude unavoidably that it is a type of resource, and it may or may not be fungible or orthogonal. So the fact that the whole object will eventually be garbage collected is no comfort. What matters is that its state is modified correctly during its lifetime. We need some kind of automatic resource management to deal with the internal state of our own objects, as well as the objects exposed to us by the operating system.

So what does all this tell us? It tells us that managed memory is one kind of resource among many, that the fundamental nature of resources means that they require management of some kind, and that the technique (lazy garbage collection) applicable to management of memory is only applicable because of specific properties of that kind of resource (fungibility, orthogonality). Other resources do not have those properties, and yet still need to be managed. So we need other techniques for them.

Resource Management Techniques

How can we manage resources easily in C#? By taking advantage of facilities in the language wherever possible. We have the interface to the resource in the form of the Enter and Exit functions, but we would like to avoid the kind of problems that occur when we forget to call Exit once for each call to Enter. So we want something easier than merely putting in the calls manually.

There are three solutions:

  • Finalizers
  • IDisposable/using
  • Gateways (methods that Enter, forward to a delegate, then Exit)

The first two are often confused and conflated.  The last one is, to my mind, massively superior to the other two, for reasons that should become clear, and yet is rarely described in this context as far as I know. I’ve coined the term “gateway” because I’m not aware of another term for them, so please let me know if there’s a better name.

Finalizers

Finalizers are essentially horrid nasty things. It should rarely be necessary to write one. It is notoriously difficult to write one correctly. I have never bothered.

The basic idea is to hitch a ride with the GC’s ability to lazily dispose of memory resources. The information needed to Exit a resource is stored in an object, and when that object is reclaimed by the GC, the object’s finalizer executes and calls Exit for the resource.

The GC schedules the finalizer to run lazily (that is, some time soon but not immediately or at a predictable time), which means that it is only really useful for resources that are fungible. Consider what this means for file handles: finalization will ensure that your process does not run out of handles, because they are fungible and in plentiful supply, but it does not help at all in avoiding bugs caused by holding onto file locks for longer than necessary, because files are non-fungible.
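
In terms of the hypothetical Enter/Exit file API from earlier, the pattern looks roughly like this (a sketch only — NativeMethods is a made-up wrapper, and real code should use SafeHandle instead):

```csharp
public class HandleHolder
{
    private IntPtr _handle; // a value-type field: safe to touch in a finalizer

    public HandleHolder(string fileName)
    {
        _handle = NativeMethods.Enter(fileName); // acquire handle + file lock
    }

    ~HandleHolder()
    {
        // Runs lazily, at some unpredictable time after the object becomes
        // unreachable. That reclaims the fungible handle eventually, but the
        // non-fungible file stays locked until then.
        if (_handle != IntPtr.Zero)
            NativeMethods.Exit(_handle);
    }
}
```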

And don’t even consider trying to manage stack-like resources with finalizers. Imagine the mess, with things being popped when they shouldn’t be.

The information needed to Exit the resource must be stored in value-type fields only; it is not generally safe to access other reference objects (classes) from a finalizer. This means that finalizers are only really useful for resources defined by the operating system as kernel objects (and in fact there is a better way of encapsulating those, called SafeHandle).

So the less said about finalizers the better.

IDisposable/using

The other official way of managing resources is to encapsulate the resource in an object (as with the finalizer approach) and to give it a Dispose method, as defined in the interface IDisposable. The construction of the object corresponds to Enter, and Dispose serves as an Exit.

Then the language provides the using statement as a convenient way of ensuring that Dispose will be called when the flow of control leaves a specified scope:

 using ( resource-acquisition ) embedded-statement 

This approach is fully deterministic and non-lazy. It has none of the limitations of finalizers. Because the using statement weaves itself into the thread’s method-call stack, it works very well with stack resources - each object’s Dispose is guaranteed by the C# specification to be called in the reverse order that the objects were constructed, assuming they were constructed in the resource-acquisition.
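
Nested using statements make the ordering concrete. Assuming a small Tracer class that logs its own construction and disposal:

```csharp
class Tracer : IDisposable
{
    private readonly string _name;
    public Tracer(string name) { _name = name; Console.WriteLine("Enter " + _name); }
    public void Dispose()      { Console.WriteLine("Exit " + _name); }
}

using (var outer = new Tracer("outer"))
using (var inner = new Tracer("inner"))
{
    // work done while both are "pushed"
}
// prints: Enter outer, Enter inner, Exit inner, Exit outer
```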

There are several intrinsic technical drawbacks, however:

  • The fact that Dispose is available as a public method gives the impression of an orthogonal resource. You don’t have to employ the using statement – you can just call Dispose manually whenever you want. This would play havoc with a stack-like resource just as surely as using a finalizer.
  • It is surprisingly easy to forget to put in the using statement. There are ways to mitigate this, but the problem is that you do have to know that the problem exists before you can take the necessary steps. Implementing a resource as an object does not make it difficult to abuse. It makes it easy to abuse.
  • The using statement ensures that Dispose is called even if an exception propagates from the embedded-statement. However, little is said about what happens if Dispose then throws another exception. This is all too likely in many common cases. What actually happens is that the CLR discards the first exception, so all information about the original cause is lost.

So the IDisposable/using technique doesn’t exactly call out to the user to employ it correctly. There are various ways to use it wrongly, and the exception problem is especially irritating. Nevertheless it is possible to use it correctly with a wide range of resources. Because it follows the program stack as it grows and shrinks, it obeys the rules of stack-like resources. Because it is deterministic, it is able to provide the predictable immediate Exit necessary when handling non-fungible resources.
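
The exception-swallowing problem in particular is easy to demonstrate with a deliberately misbehaving Dispose (both the class and the exceptions here are invented for the demonstration):

```csharp
class Failing : IDisposable
{
    public void Dispose()
    {
        throw new InvalidOperationException("thrown from Dispose");
    }
}

try
{
    using (var f = new Failing())
    {
        throw new ArgumentException("the original cause");
    }
}
catch (Exception e)
{
    // Only the Dispose exception arrives here; the ArgumentException
    // that started the stack unwinding has been silently discarded.
    Console.WriteLine(e.Message); // prints "thrown from Dispose"
}
```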

The Cultural Problem with IDisposable

But these mere technical problems are minor compared with the cultural problem that exists around this technique. This is caused by copious amounts of official or semi-official documentation which confuses it with finalizers, in apparent denial of their different areas of applicability. For example, in this article we are shown the “basic IDisposable pattern” and sure enough, it has a finalizer as well. In such articles IDisposable is described as a safety feature to be added to any class that has a finalizer, when in fact it is a general mechanism for hooking into the program stack unwinding process via objects, built on top of the more fundamental try/finally mechanism.

The logical fallacy that results is like Woody Allen’s conclusion that “All men are Socrates”. The advice, if read carefully, is merely that all finalizable classes should also implement IDisposable. But the idea people frequently come away with is that any class that implements IDisposable should also have a finalizer. They get it completely the wrong way around.

I’ve found it almost impossible to shift this two-way link from many people’s minds, so much so that it is often controversial to suggest managing a stack-like resource with IDisposable, because some people will insist that the resource must also have a finalizer “just in case the user forgets to call Dispose”. Of course, for a stack-like resource the finalizer will be no help at all. In fact it will be no use for any resource defined in terms of reference objects, because it is not safe to manipulate them from a finalizer.

This cultural problem doesn’t exist in C++/CLI, because there the existence of IDisposable is hidden behind the C++ language facility called a destructor. A typical C++/CLI program would have lots of classes that implement IDisposable without the programmer giving it a second thought, and none of them would have finalizers. In C++, destructors are quite often used to manage stack-like resources, which in C++/CLI automatically translates into IDisposable being employed for that same purpose. This is due mostly to Herb Sutter, whose annotations here are particularly illuminating.

Gateways

The cultural problems around IDisposable prompted me to look for an alternative, and happily I’ve found one that is applicable in a lot of cases. I already described it in a previous blog post, although there I merely extolled its virtue of being more expressive than the using statement.

But I mostly glossed over its other technical advantages. First and foremost is the way it is extremely difficult to abuse. Providing access to a resource only through the gateway of a function is practically impossible to get wrong. You can’t forget to put in the using statement, because you don’t need one. Nor can you call the Dispose method of nested stack-like resources in the wrong order, because no such method is exposed to you.

For some kinds of resource, it is not necessary or desirable to force them to exit when an exception occurs. The reason for this is that you would expect an exception to abort the whole operation in which the resource is relevant. An example is the TraceWriter class that I included in the source code of this post. It includes this gateway method:

public void Indented(Action whileIndented)
{
    WriteLine("{");
    _indent++;
    whileIndented();
    _indent--;
    WriteLine("}");
}

This allows me to write very readable code:

writer.WriteLine("string comparison:");
writer.Indented(() =>
    {
        writer.WriteLine(leftValue);
        writer.WriteLine(rightValue);
        writer.WriteLine(new String(' ', index)
                         + "^ difference at index " + index);
    });

It is absolutely impossible for me to forget to un-indent after I’ve indented, thanks to the use of a gateway function to control the use of the indenting “resource”. However, I’ve made the Indented method completely ignore exceptions, because in this program, if an exception occurs during the logging process, I expect the whole log file to be abandoned anyway. Had this not been the case, I would have used try/finally within Indented.
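
For a program where the log must stay balanced even on failure, the change is mechanical — wrap the callback in try/finally so the un-indent always runs:

```csharp
public void Indented(Action whileIndented)
{
    WriteLine("{");
    _indent++;
    try
    {
        whileIndented();
    }
    finally
    {
        _indent--;       // guaranteed even if the action throws
        WriteLine("}");
    }
}
```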

Only by writing our own gateway functions can we make the right choice for each case.

This works for all kinds of resources, of course. It is not only free from the cultural/educational baggage surrounding IDisposable, but it also does a better job technically, giving us control over the handling of exceptions and protecting us from accidental abuses.

So in summary: where resource-like patterns of state occur in your own programs, I recommend where possible making little gateway functions to provide access to those states, instead of encapsulating the resource in an IDisposable class.

Displaying a nested evaluation tree from Expression<Func<bool>>

Updated: The downloadable source (see link below) is now tidied up a bit, and also displays any embedded string comparisons in a similar way to NUnit.

This might be useful as a way to write Assert statements in tests. Instead of requiring many different forms of Assert to capture values and intents, we could just have one Assert that accepts a lambda:

public static void Assert(Expression<Func<bool>> expr)
{
    if (!(bool)Trace(expr.Body, 0))
    {
        // throw an exception...
    }
}

Now all we need is a Trace function. The challenge is making it include all the relevant information about the test, including the values being used. But this isn't actually that hard, because you just recursively search through the tree of nested expressions, evaluating them all:

public static object Trace(Expression part, int indent)
{
    string indentString = new String(' ', indent*2);

    Console.Write(indentString);
    Console.Write(GetExpressionText(part));
    Console.Write(" == ");
    LambdaExpression lambda = Expression.Lambda(part);
    Delegate callable = lambda.Compile();
    object result = callable.DynamicInvoke(null);

    if (result is string)
        result = "\"" + result + "\"";

    Console.Write(result);

    var nestedExpressions = part
        .GetType()
        .GetProperties()
        .Where(p => typeof(Expression).IsAssignableFrom(p.PropertyType))
        .Select(p => (Expression)p.GetValue(part, null))
        .Where(x => (x != null) && !(x is ConstantExpression));

    if (nestedExpressions.Any())
    {
        Console.WriteLine(" where");
        Console.Write(indentString);
        Console.WriteLine("{");

        foreach (Expression nested in nestedExpressions)
            Trace(nested, indent + 1);

        Console.Write(indentString);
        Console.WriteLine("}");
    }
    else
        Console.WriteLine();

    return result;
}

I just use reflection and a bit of LINQ to hunt for nested expressions, so I don't need to laboriously handle the various kinds of expression.

I don't bother presenting the value of constants, because it would just explain to the user that 5 == 5, which isn't very enlightening.

The GetExpressionText function is a helper to tidy up the output. Variables captured in the lambda have some nasty looking prefix to identify which compiler-generated class they belong to, but that's unlikely to be useful here, so I strip it out:

public static string GetExpressionText(Expression expr)
{
    string text = expr.ToString();

    const string prefix = "value(";
    const string suffix = ").";

    StringBuilder builder = new StringBuilder();

    int start = 0;
    int pos = text.IndexOf(prefix);
    while (pos != -1)
    {
        builder.Append(text.Substring(start, pos - start));
        start = text.IndexOf(suffix, pos) + suffix.Length;
        pos = text.IndexOf(prefix, start);
    }

    if (start < text.Length)
        builder.Append(text.Substring(start, text.Length - start));

    return builder.ToString();
}

And now for a little demo:

int x = 3;
string t = "hi";
Assert(() => 5*x + (2 / t.Length) < 99);

Which produces the remarkably readable output:

(((5 * x) + (2 / t.Length)) < 99) == True where
{
  ((5 * x) + (2 / t.Length)) == 16 where
  {
    (5 * x) == 15 where
    {
      x == 3
    }
    (2 / t.Length) == 1 where
    {
      t.Length == 2 where
      {
        t == "hi"
      }
    }
  }
}

Obviously not a hugely strenuous test, but I suspect any other edge cases could easily be dealt with.

The way I've written Assert is just for demo purposes, because I wanted to see the trace regardless of whether the test passes or fails. In a real framework you'd only print out the trace if the test failed. Apart from making the successful output as short as possible (silence is golden) it would also help your tests run faster.

Download full source here: http://www.earwicker.com/blogfiles/ExpressionTracer.zip

Sunday 8 February 2009

A functional replacement for the using statement

The using-statement is just that: a statement. Why does this bug me?

The FileStream object has a Length property. Assuming I have a function Open that returns a FileStream ready for us, it is awfully tempting to do this:

Console.WriteLine(Open().Length);

But that is wrong, wrong, wrong, because it postpones the closing of the file handle until the finalization thread gets around to it. No good.

To obtain a value from a disposable object after that object is disposed, you have to store it in a variable declared outside the using-statement, like this:

long length;
 
using (FileStream file = Open())
    length = file.Length;
 
Console.WriteLine(length);

Not very nice. But how about this?

Console.WriteLine(Open().Use(file => file.Length));

Only a little uglier than the wrong version, but a whole lot righter. The Use extension method executes the function passed to it, and returns whatever that function returns, but also disposes of the object it was called on:

public static class DisposableExtensions
{
    public static TResult Use<TArg, TResult>(
        this TArg arg, Func<TArg, TResult> usage)
        where TArg : IDisposable
    {
        try
        {
            return usage(arg);
        }
        finally
        {
            arg.Dispose();
        }
    }
}

Of course, this technique allows us to do something very similar even when we aren’t dealing with objects that implement IDisposable. What we are really doing here is putting the system into some temporary state, then computing something, then undoing that temporary state change. That’s all a “resource” really is: some state the system gets into that you’ll need to back out of soon. The definition of “soon” varies; for memory allocation, we can be pretty lax, because we have enough memory to support a very large number of small allocations simultaneously, but we can only support a single exclusive file handle on a given file, so we have to be extremely careful.

So a resource is two bits of code: get-into-state and get-out-of-state. Suppose I have a really simple kind of “lock”. I actually support any number of simultaneous locks, but I want to know how many are held at any given time. So my “lock” is just an integer field in my class. To take out a lock, increment it, and to release the lock, decrement it:

public class Lockable
{
    private int _lockCount;
 
    public int OpenLocks
    {
        get { return _lockCount; }
    }
 
    public void Lock()
    {
        _lockCount++;
    }
 
    public void Unlock()
    {
        _lockCount--;
    }
}

How can I help users of my class to ensure they don’t forget to release the lock in a timely fashion?

If you understood the definition of a “resource” I gave above, you’ll have realised that the resource in this case is not the class itself, but a state in which there has been a call to Lock without a corresponding Unlock call. So if the value of _lockCount happens to be 63, then there are 63 resources alive in our program, and we want them to be cleaned up at some point.

So the object-oriented (which doesn’t necessarily mean “good”) way to expose this is to reify our resource, by giving it a class of its own:

public class Lock : IDisposable
{
    private readonly Lockable _res;
 
    public Lock(Lockable res)
    {
        _res = res;
        _res.Lock();
    }
 
    public void Dispose()
    {
        _res.Unlock();
    }
}

Now the users of Lockable can employ the using-statement to call Dispose for them, which takes care of calling Unlock.

Lockable mr = new Lockable();
 
int locks;
using (new Lock(mr))
    locks = mr.OpenLocks;
 
Console.WriteLine(locks);

Fair enough, but as we saw above, that makes it ugly when we just want to write an expression that computes a value while the lock is held. Yes, we could use my Use extension method:

Console.WriteLine(new Lock(mr).Use(l => mr.OpenLocks));

But why not cut out the middleman altogether?

public class Lockable
{
    private int _lockCount;
 
    public int OpenLocks
    {
        get { return _lockCount; }
    }
 
    public T WithinLock<T>(Func<T> f)
    {
        _lockCount++;
 
        try
        {
            return f();
        }
        finally
        {
            _lockCount--;
        }
    }
}

We get rid of the public Lock/Unlock so there’s no way to break the rules, and instead provide a way to conveniently get a parameterless lambda executed inside a lock, guaranteeing that the lock will be released when the computation finishes:

Console.WriteLine(mr.WithinLock(() => mr.OpenLocks));

This addresses a very common complaint about IDisposable, which is that the user has to remember to call Dispose, or else they have to remember to use a using-statement, and it is very easy to forget. With the above alternative technique, the user is not given the ability to forget.

I call such methods "Gateways", although in Java this is called the Execute Around idiom, and it is also exemplified by Common Lisp macros that start with the prefix with-, such as with-open-file.

Of course, there is a minor downside. C# has the error: “Cannot use void as a type argument”. So you have to write a second version of WithinLock that returns void and accepts an Action as its parameter, of which more in my next post.
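
One possible sketch of that second overload simply adapts the Action to the existing generic version, so the try/finally logic stays in one place (the full treatment is in the next post):

```csharp
public void WithinLock(Action f)
{
    // Wrap the Action in a Func<object> returning a dummy value,
    // then delegate to the generic WithinLock<T> defined above.
    WithinLock<object>(() => { f(); return null; });
}
```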