Sunday 8 February 2009

A functional replacement for the using statement

The using-statement is just that: a statement. Why does this bug me?

The FileStream object has a Length property. Assuming I have a function Open that returns a FileStream ready for us, it is awfully tempting to do this:

Console.WriteLine(Open().Length);

But that is wrong, wrong, wrong, because it postpones the closing of the file handle until the finalization thread gets around to it. No good.

To obtain a value from a disposable object after that object is disposed, you have to store it in a variable declared outside the using-statement, like this:

long length;
 
using (FileStream file = Open())
    length = file.Length;
 
Console.WriteLine(length);

Not very nice. But how about this?

Console.WriteLine(Open().Use(file => file.Length));

Only a little uglier than the wrong version, but a whole lot righter. The Use extension method executes the function passed to it, and returns whatever that function returns, but also disposes of the object it was called on:

public static class DisposableExtensions
{
    public static TResult Use<TArg, TResult>(
        this TArg arg, Func<TArg, TResult> usage)
        where TArg : IDisposable
    {
        try
        {
            return usage(arg);
        }
        finally
        {
            arg.Dispose();
        }
    }
}
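To see the order in which things happen, here is a small sketch. NoisyResource is a made-up class for illustration only (it is not from any library), and the Use method is repeated in compact form so the snippet compiles on its own:

```csharp
using System;
using System.Collections.Generic;

public static class DisposableExtensions
{
    // Same shape as the Use method above, repeated so this sketch is self-contained.
    public static TResult Use<TArg, TResult>(this TArg arg, Func<TArg, TResult> usage)
        where TArg : IDisposable
    {
        try { return usage(arg); }
        finally { arg.Dispose(); }
    }
}

// Hypothetical resource that records what happens to it.
public sealed class NoisyResource : IDisposable
{
    public readonly List<string> Log = new List<string>();

    public int Value
    {
        get { Log.Add("read"); return 42; }
    }

    public void Dispose() { Log.Add("disposed"); }
}

public static class UseDemo
{
    public static void Main()
    {
        NoisyResource res = new NoisyResource();

        // The value is read first, then the resource is disposed,
        // and only then does the result reach the caller.
        int value = res.Use(r => r.Value);
        res.Log.Add("printed " + value);

        Console.WriteLine(string.Join(", ", res.Log));
        // prints: read, disposed, printed 42
    }
}
```

One thing to watch: the function passed to Use should return a finished value. If it returns something lazy that still reads from the resource (an un-enumerated query over the stream, say), that reading would happen after Dispose has already run.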

Of course, this technique allows us to do something very similar even when we aren’t dealing with objects that implement IDisposable. What we are really doing here is putting the system into some temporary state, then computing something, then undoing that temporary state change. That’s all a “resource” really is: some state the system gets into that you’ll need to back out of soon. The definition of “soon” varies. For memory allocation we can be pretty lax, because we have enough memory to support a very large number of small allocations simultaneously. But we can only support a single exclusive file handle on a given file, so there we have to be extremely careful.
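That enter/compute/undo shape can be written down directly. This is only a rough sketch: Scoped.Within is a made-up helper name, not a framework method, and the "state" here is just a local counter.

```csharp
using System;

public static class Scoped
{
    // Hypothetical helper: enter a temporary state, compute a value,
    // then undo the state change whether or not the computation threw.
    public static T Within<T>(Action enter, Action exit, Func<T> compute)
    {
        // enter() sits outside the try block: if we never got into
        // the state, there is nothing to back out of.
        enter();

        try
        {
            return compute();
        }
        finally
        {
            exit();
        }
    }
}

public static class ScopedDemo
{
    public static void Main()
    {
        int depth = 0;

        // depth is 1 while the computation runs and back to 0 afterwards.
        int seen = Scoped.Within(() => depth++, () => depth--, () => depth);

        Console.WriteLine(seen + " " + depth);
        // prints: 1 0
    }
}
```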

So a resource is two bits of code: get-into-state and get-out-of-state. Suppose I have a really simple kind of “lock”. I actually support any number of simultaneous locks, but I want to know how many are held at any given time. So my “lock” is just an integer field in my class. To take out a lock, increment it, and to release the lock, decrement it:

public class Lockable
{
    private int _lockCount;
 
    public int OpenLocks
    {
        get { return _lockCount; }
    }
 
    public void Lock()
    {
        _lockCount++;
    }
 
    public void Unlock()
    {
        _lockCount--;
    }
}

How can I help users of my class to ensure they don’t forget to release the lock in a timely fashion?

If you understood the definition of a “resource” I gave above, you’ll have realised that the resource in this case is not the class itself, but a state in which there has been a call to Lock without a corresponding Unlock call. So if the value of _lockCount happens to be 63, then there are 63 resources alive in our program, and we want them to be cleaned up at some point.

So the object-oriented (which doesn’t necessarily mean “good”) way to expose this is to reify our resource, by giving it a class of its own:

public class Lock : IDisposable
{
    private readonly Lockable _res;
 
    public Lock(Lockable res)
    {
        _res = res;
        _res.Lock();
    }
 
    public void Dispose()
    {
        _res.Unlock();
    }
}

Now the users of Lockable can employ the using-statement to call Dispose for them, which takes care of calling Unlock.

Lockable mr = new Lockable();
 
int locks;
using (new Lock(mr))
    locks = mr.OpenLocks;
 
Console.WriteLine(locks);

Fair enough, but as we saw above, that makes it ugly when we just want to write an expression that computes a value while the lock is held. Yes, we could use my Use extension method:

Console.WriteLine(new Lock(mr).Use(l => mr.OpenLocks));

But why not cut out the middleman altogether?

public class Lockable
{
    private int _lockCount;
 
    public int OpenLocks
    {
        get { return _lockCount; }
    }
 
    public T WithinLock<T>(Func<T> f)
    {
        _lockCount++;
 
        try
        {
            return f();
        }
        finally
        {
            _lockCount--;
        }
    }
}

We get rid of the public Lock/Unlock so there’s no way to break the rules, and instead provide a way to conveniently get a parameterless lambda executed inside a lock, guaranteeing that the lock will be released when the computation finishes:

Console.WriteLine(mr.WithinLock(() => mr.OpenLocks));

This addresses a very common complaint about IDisposable, which is that the user has to remember to call Dispose, or else they have to remember to use a using-statement, and it is very easy to forget. With the above alternative technique, the user is not given the ability to forget.
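As a quick check of that guarantee, the sketch below nests two WithinLock calls and then throws from inside a third. The Lockable class is the same as the one above, repeated so the snippet stands alone; the demo class and the exception message are just for illustration.

```csharp
using System;

public class Lockable
{
    private int _lockCount;

    public int OpenLocks
    {
        get { return _lockCount; }
    }

    public T WithinLock<T>(Func<T> f)
    {
        _lockCount++;

        try
        {
            return f();
        }
        finally
        {
            _lockCount--;
        }
    }
}

public static class WithinLockDemo
{
    public static void Main()
    {
        Lockable mr = new Lockable();

        // Two locks are held while the innermost lambda runs.
        int nested = mr.WithinLock(() => mr.WithinLock(() => mr.OpenLocks));

        // The finally block releases the lock even when the lambda throws.
        try
        {
            mr.WithinLock<int>(() => { throw new InvalidOperationException("boom"); });
        }
        catch (InvalidOperationException)
        {
        }

        Console.WriteLine(nested + " " + mr.OpenLocks);
        // prints: 2 0
    }
}
```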

I call such methods "Gateways", although in Java this is called the Execute Around idiom, and it is also exemplified by Common Lisp macros that start with the prefix with-, such as with-open-file.

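Of course, there is a minor downside. C# doesn’t allow void as a type argument (the compiler complains: “Cannot use void as a type argument”), so Func<T> can’t stand in for a function that returns nothing. That means you have to write a second version of WithinLock that returns void and accepts an Action as its parameter, of which more in my next post.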
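Until then, here is one plausible shape for that second overload. Treat it as a guess at the obvious implementation rather than the definitive version; the demo class is purely illustrative.

```csharp
using System;

public class Lockable
{
    private int _lockCount;

    public int OpenLocks
    {
        get { return _lockCount; }
    }

    // The value-returning version from above.
    public T WithinLock<T>(Func<T> f)
    {
        _lockCount++;
        try { return f(); }
        finally { _lockCount--; }
    }

    // Assumed second overload: same bracketing, nothing returned.
    public void WithinLock(Action a)
    {
        _lockCount++;
        try { a(); }
        finally { _lockCount--; }
    }
}

public static class ActionDemo
{
    public static void Main()
    {
        Lockable mr = new Lockable();
        int inside = -1;

        // A block-bodied lambda with no return value can only match
        // the Action overload.
        mr.WithinLock(() => { inside = mr.OpenLocks; });

        Console.WriteLine(inside + " " + mr.OpenLocks);
        // prints: 1 0
    }
}
```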
