I have moved!

I've moved my blog
CLICK HERE

Tuesday, 9 June 2009

Generic Value Object With Yield Return

Inspired by this and this, here's my version.

Jan Van Ryswyck is right to use static reflection, and to do that he "registers" the properties in a list. On the other hand, assuming you'd have a lot of these objects building up in the GC, it might prove important to reduce the number of ancillary allocations going on, so it may be better to avoid creating a list of the properties in every single instance.

It's not really necessary to have the full information about the properties via reflection; we only need the values. Also it's a pain to have to distinguish between fields and properties. Finally, we are really doing operations on a sequence of values, which can be expressed very neatly with Linq operators, and specified in the derived class with an iterator method.

So here's what I came up with:

public abstract class ValueObject<T> : IEquatable<T>
   where T : ValueObject<T>
{
   protected abstract IEnumerable<object> Reflect();

   public override bool Equals(Object obj)
   {
       if (ReferenceEquals(null, obj)) return false;
       if (obj.GetType() != GetType()) return false;
       return Equals(obj as T);
   }

   public bool Equals(T other)
   {
       if (ReferenceEquals(null, other)) return false;
       if (ReferenceEquals(this, other)) return true;
       return Reflect().SequenceEqual(other.Reflect());
   }

   public override int GetHashCode()
   {
       return Reflect().Aggregate(36, (hashCode, value) => value == null ?
                               hashCode : hashCode ^ value.GetHashCode());
   }

   public override string ToString()
   {
       return "{ " + (string) Reflect().Aggregate((l, r) => l + ", " + r) + " }";
   }
}

First off, it's a lot shorter! Aside from the standard ReferenceEquals checks, every method is a one-liner, a return of a single expression. Check out how SequenceEqual does so much work for us. And Aggregate is designed for turning a sequence into one value, which is exactly what GetHashCode and ToString are all about.

This is all possible because we're treating the properties or fields as just a sequence of values, obtained from the Reflect method, which the derived class has to supply.

(You could easily add operator== and operator!= of course. Also in case you're wondering about the way ToString appears not to check for null, actually it does, because one odd thing about .NET is that string concatenation can cope with null strings.)

Secondly, the way you use it is pretty neat as well:

public class Person : ValueObject<Person>
{
   public int Age { get; set; }
   public string Name { get; set; }

   protected override IEnumerable<object> Reflect()
   {
       yield return Age;
       yield return Name;
   }
}

It wouldn't matter if I yielded the values of fields or properties, and there's no need for expression-lambda trickery to get a PropertyInfo.

If you're unfamiliar with how iterator methods work, you may be wondering, what if I have twenty fields and the first fields of two instances are unequal, isn't this going to waste time comparing the other nineteen fields? No, because SequenceEqual will stop iterating as soon as it finds an unequal element pair, and iterator methods are interruptible.

(Note that if you need this to work on .NET 2.0, you can grab Jared Parsons' BCL Extras library to get the necessary sequence operations. If you're using the C# 2.0 compiler you just need to rejig the method call syntax to avoid using them as extension methods. Iterator methods were already available in 2.0, so nothing here is dependent on 3.0.)

Thursday, 4 June 2009

Lazy Sequence Generation without using Yield Return

C# has the yield return feature, which makes it easy to write state machines as if they were plain old methods:

class Program
{
    static IEnumerable<int> SomeNumbers()
    {
        Console.WriteLine("Started");

        yield return 1;
        
        Console.WriteLine("Yielded 1");

        yield return 2;
            
        Console.WriteLine("Yielded 2");

        yield return 3;

        Console.WriteLine("Finished");
    }

    static void Main(string[] args)
    {
        foreach (int n in SomeNumbers())
        {
            Console.WriteLine(n);
        }
    }
}

The output shows that the sequence of numbers is generated "lazily", with each chunk of code in the method being executed only on demand as the foreach pulls values from sequence.

How close can we get to this using only lamdbas? Pretty close:

class Program
{
    static Lazy.Yielding<int> SomeNumbers()
    {
        Console.WriteLine("Started");

        return Lazy.Yield(1, () => {

        Console.WriteLine("Yielded 1");

        return Lazy.Yield(2, () => {
        
        Console.WriteLine("Yielded 2");

        return Lazy.Yield(3, () => {

        Console.WriteLine("Finished");

        return null; }); }); });
    }

    static void Main(string[] args)
    {
        foreach (int n in Lazy.Enumerate<int>(SomeNumbers))
        {
            Console.WriteLine(n);
        }
    }
}

By messing with the nesting of my curly braces, I've made it look like the original, but really it's made of three nested lambdas. So this version of SomeNumbers is, deep breath... a function that returns a function that returns a function that returns a function.

Each returned function supplies the code to execute for the next step.

The main remaining ingredient is a helper function Lazy.Enumerate that turns our strange contraption into a plain IEnumerable, so we can loop through it conveniently.

public static class Lazy
{
    public class Yielding<T>
    {
        public readonly T Result;
        public readonly Func<Yielding<T>> Next;

        public Yielding(T result)
        {
            Result = result;
            Next = null;
        }

        public Yielding(T result, Func<Yielding<T>> next)
        {
            Result = result;
            Next = next;
        }
    }

    public static Yielding<T> Yield<T>(T value, Func<Yielding<T>> next)
    {
        return new Yielding<T>(value, next);
    }

    public static Yielding<T> Yield<T>(T value)
    {
        return new Yielding<T>(value);
    }

    public class Seq<T> : IEnumerable<T>
    {
        private readonly Func<Yielding<T>> _generator;

        public Seq(Func<Yielding<T>> generator)
        {
            _generator = generator;
        }

        private class Iter : IEnumerator<T>
        {
            private Func<Yielding<T>> _generator;

            public Iter(Func<Yielding<T>> generator)
            {
                _generator = generator;
            }

            public T Current { get; set; }

            public void Dispose() { }

            object System.Collections.IEnumerator.Current
            {
                get { return Current; }
            }

            public bool MoveNext()
            {
                if (_generator == null)
                    return false;

                Yielding<T> yielding = _generator();
                if (yielding == null)
                {
                    _generator = null;
                    return false;
                }

                Current = yielding.Result;
                _generator = yielding.Next;
                return true;
            }

            public void Reset()
            {
                throw new NotImplementedException();
            }
        }

        public IEnumerator<T> GetEnumerator()
        {
            return new Iter(_generator);
        }

        System.Collections.IEnumerator 
            System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    public static IEnumerable<T> Enumerate<T>(
                      Func<Yielding<T>> generator)
    {
        return new Seq<T>(generator);
    }
}

Most of the code is "boilerplate" implementation of IEnumerable/IEnumerator. The important bit is the MoveNext method, which calls the generator method to get the next value and the next generator method.

So what's missing? The major thing (aside from the misery of getting the syntax right, closing all the right brackets, etc.) is the lack of try/finally support, which turns out to be extremely useful. We could add support for that, however. Firstly, we'd add this member to Yielding.

public readonly Action Finally;

The generator code would initialize that field to whatever action it liked, to represent the finally block, before returning the Yielding instance. And Lazy.Seq.Iter would store it so it could execute it before retrieving the next value, and it would also execute it from the Dispose method, so that the Finally action would run even if the loop was abandoned.

Friday, 22 May 2009

VC++16 has decltype in Beta 1 (so Linq to C++0x looks a little nicer)

This post has moved to my new(er) blog and can be found here: http://smellegantcode.wordpress.com/2009/05/22/vc16-has-decltype-in-beta-1-so-linq-to-c0x-looks-a-little-nicer/

A while ago I wrote up a little demo of using lambdas to implement convenient lazy list comprehension in C++0x, but at the time the VC++ compiler I was using lacked decltype, and I found a need for it.

I just tried out the Beta 1 that was released this week, and it has decltype, so here's the updated sample:

#include <iostream>
#include <vector>
#include <sstream>

// Some helpers for working with decltype 
// (maybe these are in std:: somewhere?)
template <class T>
T *make_ptr() { return (T *)0; }

template <class T>
const T &make_ref() { return *(make_ptr<T>()); }

// so you don't need to obtain boost::lexical_cast!
template <class T>
std::string to_string(const T &v)
{
    std::ostringstream s;
    s << v;
    return s.str();
}

// Unary function that always returns true
template <class T>
struct always_true
{
    bool operator() (const T &) const { return true; }
};

// Unary function that returns its argument
template <class T>
struct pass_thru
{
    T operator() (const T &e) const { return e; }
};

template <class TElemFrom, 
          class TElemTo, 
          class TIterFrom, 
          class TSelector,
          class TPredicate>
class filter
{
    TIterFrom from_;
    TIterFrom end_;
    const TSelector &selector_;
    const TPredicate &predicate_;

public:
    filter(TIterFrom from, TIterFrom end, 
             const TSelector &selector, 
             const TPredicate &predicate)
        : from_(from), end_(end), 
          selector_(selector), 
          predicate_(predicate) {}

    class const_iterator
    {
        TIterFrom from_;
        TIterFrom end_;
        const TSelector &selector_;
        const TPredicate &predicate_;

        void locate()
        {
            while (!done())
            {
                if (predicate_(*from_))
                    return;

                ++from_;
            }
        }

        bool done() const { return (from_ == end_); }

    public:
        const_iterator(TIterFrom from, TIterFrom end, 
                       const TSelector &selector,
                       const TPredicate &predicate)
            : from_(from), end_(end), 
              selector_(selector),
              predicate_(predicate) { locate(); }

        TElemTo operator*() const { return selector_(*from_); }

        const_iterator operator++()
        {
            ++from_;
            locate();
            return *this;
        }

        bool operator==(const const_iterator &other) const
            { return done() && other.done(); }

        bool operator!=(const const_iterator &other) const
            { return !done() || !other.done(); }
    };

    typedef TElemFrom value_type;

    const_iterator begin() const 
        { return const_iterator(from_, end_, selector_, predicate_); }

    const_iterator end() const 
        { return const_iterator(end_, end_, selector_, predicate_); }

    template <class TSelector>
    filter<TElemTo, 
             decltype(make_ref<TSelector>()(make_ref<TElemTo>())),
             const_iterator, 
             TSelector,
             always_true<TElemTo> >
        select(const TSelector &selector)
    {
        return filter<TElemTo, 
            decltype(make_ref<TSelector>()(make_ref<TElemTo>())), 
                      const_iterator, 
                      TSelector, 
                      always_true<TElemTo> >
                    (begin(), 
                     end(), 
                     selector, 
                     always_true<TElemTo>());
    }

    template <class TPredicate>
    filter<TElemTo, 
              TElemTo,
             const_iterator, 
             pass_thru<TElemTo>, 
             TPredicate> 
        where(const TPredicate &predicate)
    {
        return filter<TElemTo, 
                        TElemTo,
                        const_iterator, 
                        pass_thru<TElemTo>, 
                        TPredicate>
                    (begin(),
                     end(),
                     pass_thru<TElemTo>(), 
                     predicate);
    }
};

template <class TCollFrom>
filter<typename TCollFrom::value_type, 
         typename TCollFrom::value_type,
         typename TCollFrom::const_iterator, 
         pass_thru<typename TCollFrom::value_type>,
         always_true<typename TCollFrom::value_type> >
    from(const TCollFrom &from)
{
    return filter<typename TCollFrom::value_type, 
                    typename TCollFrom::value_type, 
                    typename TCollFrom::const_iterator, 
                    pass_thru<typename TCollFrom::value_type>, 
                    always_true<typename TCollFrom::value_type> >
                (from.begin(), 
                 from.end(), 
                 pass_thru<typename TCollFrom::value_type>(), 
                 always_true<typename TCollFrom::value_type>());
}

int main(int argc, char* argv[])
{
    std::vector<int> vecInts;
    vecInts.push_back(5);
    vecInts.push_back(2);
    vecInts.push_back(9);

    auto filtered = from(vecInts)
        .where([] (int n) { return n > 3; })
        .select([] (int n) { return "\"" + to_string(n) + "\""; });

    for (auto i = filtered.begin(); i != filtered.end(); ++i)
        std::cout << *i << std::endl;

    return 0;
}

The main function is the pay-off, of course. I have a vector of ints, and I wrap it something that ends up stored in a variable called filtered, which I can then iterate through. The filtering (discarding anything not greater than 3) and transforming (int to quoted string) of the items happens on-the-fly during that iteration.

(And to clarify, this is different from the last time I posted it because select no longer requires any type parameters to be explicitly given - thanks to decltype).

Thursday, 14 May 2009

Recovery

If you want your application to be able to recover from a crash, it would NOT be ideal for it to restore the exact state it was in immediately before the crash, because then it would just crash again.

(Same goes for economies, presumably).

Monday, 11 May 2009

When to use Stored Procedures?

No coding blog would be complete without an overly-simplistic salvo fired into this bloody online battlefield, so here's mine.

What if you're writing an application that will make heavy use of an RDBMS?

If you're writing an packaged app for sale to customers who want to run it on the RDBMS of their choice, don't use stored procedures.

But what if you're writing something bespoke?

Maybe you've never in practice had to change RDBMS in mid-project. But maybe that's because you automatically ruled it out as an impossibility - how do you know what advantages you might have had, what client license bundle offers you could have benefitted from, if you maintained the ability to switch RDBMS vendors on a whim? Competition pressure is what keeps vendors competitive, and competition pressure disappears if customers lose the ability to switch. The vendors know this, which is why they want to lock you in. So if you want to be able to get the best deal out of RDBMS vendors, don't use stored procedures.

But what if you want to have a layered architecture?

Adding more languages doesn't enable more layers. It just adds more development cost. You can write a data access abstraction layer in any language - and in Java and .NET, open source toolkits like [n]Hibernate already make this child's play, largely eliminating the need to generate your own SQL strings or DDL.

If you want to reduce development costs by keeping the number of language skills required to maintain an app lower, by writing all business logic and data layer logic in one language, don't use stored procedures.

But what if you're prepared to take on extra cost because you want to optimise by moving execution into the RDBMS?

If you haven't yet done any comparative profiling to establish where such optimisation is really needed or makes any measurable difference, don't use stored procedures.

... otherwise, knock yourself out!

Sunday, 29 March 2009

General form of C# Object Initializers

C# has a language feature that allows several properties of an object to be assigned to as a suffix to a new expression, namely the object initializer:

var myObj = new MyClass
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
            };

As properties can have hand-coded setters, this is an opportunity to call several methods on the newly constructed object, without having to make each method return the same object.

The limitations on property setters are:

  • They can only accept one argument
  • They cannot be generic

I would like it if we could call methods and enlist in events, as well as assign to properties, inside an object initializer block.

var myObj = new MyClass
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
                DoSomething(),
                Click += (se, ev) => MessageBox.Show("Clicked!"),
            };

And why should such a block of modifications only be applicable immediately after construction? We could have:

myObj with
{
    SomeProperty = 5,
    Another = true,
    Complain = str => MessageBox.Show(str),
    DoSomething(),
    Click += (se, ev) => MessageBox.Show("Clicked!"),
}

The with would be a new keyword that operates on an object of some type and produces the same object and type - note that this would be an expression, not a statement.

So you could use initializer-style syntax regardless of whether you'd got the object from a new expression or from an IOC or factory method, etc.

In fact you could use with after a complete new and it would be equivalent to the current style of object initializer:

var myObj = new MyClass() with
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
                DoSomething(),
                Click += (se, ev) => MessageBox.Show("Clicked!")
            };

I mused about this in a Stack Overflow answer, and Charlie Flowers pointed out something I should have realised immediately - we can implement with as an extension method.

public static T With(this T with, Action<T> action)
{
    if (with != null)
        action(with);
    return with;
}

Equivalent of normal object initializer, but with event enlisting:

var myObj = new MyClass().With(w =>
            {
                w.SomeProperty = 5;
                w.Another = true;
                w.Click += (se, ev) => MessageBox.Show("Clicked!");
            };

And on a factory method instead of a new:

var myObj = Factory.Alloc().With(w =>
            {
                w.SomeProperty = 5;
                w.Another = true;
                w.Click += (se, ev) => MessageBox.Show("Clicked!");
            };

I couldn't resist giving it the "maybe monad"-style check for null as well, so if you have something that might return null, you can still apply With to it and then check it for null-ness.

Friday, 6 March 2009

Readonly Fields and Thread Safety

In C# you can mark the fields of a type as readonly (indeed you generally should if it's a struct). By doing so, you make it illegal to change the fields except in a constructor of the type.

The advantage of this is that readonly data can be shared freely between threads without any thread causing updates that may be seen "in progress" by other threads, simply because there never will be any updates.

But is this strictly speaking true? Nope. It would be true if it were impossible for another thread to gain access to a partially constructed object. This seems true at first because the constructor of an object effectively "returns" a reference to itself via the new operator, which then becomes available to the caller of new, at which point the object is available to be passed around and is also fully constructed.

But a constructor can call methods, and it can pass this to methods. Those methods could pass that information to other threads. So you have to be careful what you do in a constructor. Maybe the compiler will help to check this somehow in a future version of the language.

This ties in nicely with some general advice about constructors that Brad Abrams gives in the Annotated C# Programming Language: do as little as possible in the constructor, and do everything else "lazily" - that is, the first time you need it. If you find yourself requiring non-readonly fields to achieve that, then you potentially have a problem with your design, because your class as a whole is not going to be readonly and so will not be trivially shareable in the way you originally assumed.

Wednesday, 4 March 2009

JavaScript Invaders!

Click in the small text field in the top left corner to give allow the game to receive keyboard events.

  • Z = left
  • X = right
  • M = fire

Compatible with IE, Firefox and Safari, pure JavaScript and HTML

I wrote this during a train journey, just to see if it was practically possible. Only then did I google for other examples, and found quite a few going back years. This version of Pacman has excellent attention to detail.

Sunday, 1 March 2009

The Amazing Speed of the .NET Garbage Collector

Coming from a C++ background, I always balk at designs that involve lots of small heap allocations. It’s something I’m gradually training myself to stop doing, because in managed code, it’s a completely irrational concern.

Given the constant usefulness of lambdas for things like this, it is worth reassuring myself (and others) that the overhead of such techniques is negligible. I already knew it probably was, but I had no idea just how negligible. It really is ridiculously unimportant.

Here’s my simple test program – most of the operations it does are meaningless busywork (like a typical school day). The important thing is the difference between Foo1 and Foo2:

public class Program
{
    private static readonly Queue<string> _q = new Queue<string>();

    public static int _c;

    public static void Push() { _q.Enqueue(_c.ToString()); }
    public static void Pop() { _q.Dequeue(); }
    public static void Count() { _c++; }

    public static int Foo1()
    {
        int x = 0;
        Push();
        Count();
        x++;
        Pop();
        return x;
    }

    public static void Gateway(Action a)
    {
        Push();
        a();
        Pop();
    }

    public static int Foo2()
    {
        int x = 0;
        Gateway(() =>
                       {
                           Count();
                           x++;
                       });
        return x;
    }

    public static double Time(Action a)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        a();
        s.Stop();
        return s.ElapsedMilliseconds;
    }

    public static void Main(string[] args)
    {
        _q.Enqueue("x");
        _q.Enqueue("y");
        _q.Enqueue("z");

        int reps = 100000000;

        double direct = Time(() =>
        {
            for (int n = 0; n < reps; n++)
                Foo1();
        });
        
        Console.WriteLine("Direct, " + reps + " calls: " + direct / 1000 + " s");

        double lambda = Time(() =>
        {
            for (int n = 0; n < reps; n++)
                Foo2();
        });

        Console.WriteLine("Lambda, " + reps + " calls: " + lambda / 1000 + " s");

        double overhead = (lambda - direct);
        double percall = overhead / reps;

        Console.WriteLine("Overhead per call: " + percall * 1000000 + " ns");
    }
}

Foo1 calls Push, Count and Pop and also fools around with a local variable, x. Note that Push and Pop manipulate a queue of strings. The queue stays a constant length during the test, with newly allocated strings being inserted and old strings being removed, so the GC has some work to do. The reason I used a queue was so that the lifetimes of GC objects would not simply coincide with the stack of function calls, forcing the CLR (I vaguely assume) to truly allocate objects in Generation 0 instead of “cheating” and using the stack to store everything.

Foo2 does the same, but it does so via Gateway. Also note that it captures the local variable x in a closure, so it can be read and modified in Foo2 and in the nested lambda.

This means that for every call to Foo2, it is necessary to allocate two GC objects. Firstly, the closure is implemented by a compiler generated class, which would look like this:

[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    public int x;

    public void <Foo2>b__0()
    {
        Program.Count();
        x++;
    }
}

The local variable x has become a field of that class. So every time Foo2 is called, an instance of <>c__DisplayClass1 is allocated to store the variable x. Then there’s the lambda, which has become the method <Foo2>b__0 of the class. In order to pass it to Gateway, an instance of the Action delegate must be created that has an Invoke method that forwards to <Foo2>b__0.

To confirm this, look at the IL for Foo2:

.method public hidebysig static int32 Foo2() cil managed
{
    .maxstack 3
    .locals init (
        [0] class Program/<>c__DisplayClass1 CS$<>8__locals2)
    L_0000: newobj instance void Program/<>c__DisplayClass1::.ctor()
    L_0005: stloc.0 
    L_0006: ldloc.0 
    L_0007: ldc.i4.0 
    L_0008: stfld int32 Program/<>c__DisplayClass1::x
    L_000d: ldloc.0 
    L_000e: ldftn instance void Program/<>c__DisplayClass1::<Foo2>b__0()
    L_0014: newobj instance void [System.Core]System.Action::.ctor(object, native int)
    L_0019: call void Program::Gateway(class [System.Core]System.Action)
    L_001e: ldloc.0 
    L_001f: ldfld int32 Program/<>c__DisplayClass1::x
    L_0024: ret 
}

Sure enough, there are two calls to newobj – the first one allocates the storage for the closure, the second allocates the Action delegate. Compare this with the much cleaner-looking code for Foo1:

.method public hidebysig static int32 Foo1() cil managed
{
    .maxstack 2
    .locals init (
        [0] int32 x)
    L_0000: ldc.i4.0 
    L_0001: stloc.0 
    L_0002: call void Program::Push()
    L_0007: call void Program::Count()
    L_000c: ldloc.0 
    L_000d: ldc.i4.1 
    L_000e: add 
    L_000f: stloc.0 
    L_0010: call void Program::Pop()
    L_0015: ldloc.0 
    L_0016: ret 
}

No allocations, just some method calls. So the overhead of Foo2 must be pretty big, right? Especially when those heap-allocated objects will need to be garbage-collected all the time. What if a lot of them build up? Won’t that put “pressure” on the GC and make everything grind to a halt?

Well, try running the test program. It does a hundred million calls to Foo1, and then the same number of calls to Foo2. It then figures out the difference in the timings for these runs. On my VM of Windows XP running on a MacBook Pro, the output is:

Direct, 100000000 calls: 27.209 s
Lambda, 100000000 calls: 33.515 s
Overhead per call: 63.06 ns

Now, it’s customary with this kind of test to observe that 33 seconds is longer than 27 seconds and try to conclude something from that, like “lambdas make your code about 25% slower”. However, that would be completely stupid. The point is that the code in the test is doing very little, apart from keeping the GC a little busy, so that makes the overhead appear more significant than it would be in a real program. The important point is that it only took 6 seconds longer even though we are doing a hundred million calls, i.e. the overhead per call is about 60 nanoseconds. Which is very, very small. Code that actually did something useful would make such an overhead utterly invisible.

Note that if the GC was not collecting at regular intervals during the test, we would have a build up of 400 million objects (100 million strings in Foo1, another 100 million in Foo2, which also needs 100 million closures and 100 million delegates). The overhead on an object is 8 bytes (two pointers used by the CLR), so that’s approximately 3 GB of memory requirement even without considering the actual data stored in the objects. So the GC clearly is cleaning up old objects during the test, or else it wouldn’t finish successfully.

So the GC heap is so fast that in a real program, even in tight loops, you can use closures and delegates without even giving it a second’s thought (or even a few nanosecond’s thought). As always, work on a clean, safe design, then profile to find out where the overhead is.

Thursday, 26 February 2009

Making Lambdas and Iterators Play Nicely Together

The major problem with this idea is in my favourite feature of C#. In an iterator (a function containing the yield keyword), the code is transformed into a state machine represented by a class. It has to be chopped up into sections, so the function can be “paused” after each yield return. Those pieces must all be in the same function, so sadly lambdas don’t play well with iterators.

There are a couple of possible future language features which would completely solve this, however.

Firstly, how about yield foreach. This would take an IEnumerable<T> and yield all the items of it. It would remove the need to write an explicit foreach loop to return a nested sequence. This amazing paper shows how this feature would be implemented (they even picked the same keyword), although the use-case there is to support recursion with good performance.

Secondly, I’d like to be able to write an anonymous iterator. This would be a matter of using yield return in the middle of a lambda, and hey presto, the compiler would build me an iterator instead of a normal anonymous method.

How would these two features help here? Suppose I wanted to use my TraceWriter (see the source download) class in an iterator. Here’s how it would look:

IEnumerable<int> MyIterator(TraceWriter w)
{
    yield return 1;
    w.WriteLine("Not indented yet");

    yield foreach w.Indented(() =>
    {
        yield return 2;
        w.WriteLine("Indented now");
        yield return 3;
    });

    w.WriteLine("Un-indented again");
    yield return 4;
}

So we have the same readability as before. This would require a special implementation of the Indented method, specifically for iterators:

public IEnumerable<T> Indented<T>(IEnumerable<T> sequence)
{
    WriteLine("{");
    _indent++;

    yield foreach sequence;

    _indent--;
    WriteLine("}");
}

It’s handy, that yield foreach thing, isn’t it?

Fatal Exceptions, and Why VB.NET Has a Purpose!

There’s a feature in the CLR that is exposed in VB.NET and not in C#. Apparently this is a deliberate decision made by the C# team, which so far they’ve stuck to. Unfortunately, it’s a pretty important feature.

I was prompted by Andrew Pardoe of the CLR team to look more closely at this when considering try/finally and the problem of “fatal” exceptions. By fatal, I mean an exception that you have no idea how to handle. A good example is NullReferenceException, which almost certainly indicates a bug. If there’s a bug, your program is in an unknown state, and you need to capture that state exactly as it is. Ideally the program would terminate at that point. In the interests of flexibility, the CLR does allow you to catch such exceptions. (Then the CLR team tells you that you shouldn’t – it’s a “fruit of the tree of knowledge” type of situation.)

Very often a method will throw a variety of exceptions to indicate that it was unable to finish what it was supposed to do, but it ain’t the end of the world. These are recoverable exceptions. A simple example is FileInfo.Delete:

string errorMessage = null;

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (SecurityException x)
{
    errorMessage = x.Message;
}
catch (IOException x)
{
    errorMessage = x.Message;
}
catch (UnauthorizedAccessException x)
{
    errorMessage = x.Message;
}

if (errorMessage != null)
    MessageBox.Show("You idiot - " + errorMessage);

Now, there’s a slight problem, which we’ll gloss over – how do you know what exceptions a method will throw? Ultimately, you need accurate, reliable, up-to-date documentation for every method. But instead of waiting for hell to freeze over, let’s move on.

When you have an operation that performs many such calls to various APIs, 3rd party libraries and so on, the ideal place to put the handler is at the outermost layer where you can recover. For example, the user presses the button and you run a lot of code to do what the user has asked. The best place to put the try/catch is around that whole sequence of operations, so that the catch can display the error to the user, and so they can figure out why they’re such an idiot. Inside that operation, you need to use try/finally (and related constructs) to ensure that any half-finished stuff is always properly undone. Nice and simple.

Except for one slight irritation: now the set of exceptions you need to catch is the union of all the recoverable exceptions that might be thrown by all the operations performed within that combined operation. This means that your try/catch block potentially has a ridiculously long list of types. It’s also very easy to miss an exception type here or there, which would mean that your program would crash when it didn’t need to. Whenever someone performs maintenance on the big operation (or any of the APIs or 3rd party libraries it calls on to), the huge list of catch blocks needs to be updated to ensure nothing is missed.

Also this all needs to be done separately for every try/catch statement. So you might think of reducing this hideous boilerplate repetition by writing it once:

public static bool Recover(Action action, Action<Exception> recover)
{
    try
    {
        action();
        return true;
    }
    catch (SecurityException x)
    {
        recover(x);
    }
    catch (IOException x)
    {
        recover(x);
    }
    catch (UnauthorizedAccessException x)
    {
        recover(x);
    }
    //others...

    return false;
}

You can now write nice neat lambdas to provide the code for the action to try and the single recovery handler. But you still have the same major problem: this long list of all the recoverable exceptions has to include every single one that might be thrown anywhere in any operation.

What if your product is extensible, such that 3rd parties can write plug-ins for it? How do they add their own recoverable exceptions to the Recover function? They simply can’t. And even if they could, what if they don’t remember to? How does your application protect itself?

So you have explosive complexity and apparently no way to control it.

At this point, the following quite reasonable thought occurs to you: surely the number of fatal exceptions is smaller than the number of non-fatal ones. Indeed, the number of non-fatal ones is constantly growing as we create new exception classes of our own. Also, if I make any mistakes in constructing my software, I’d prefer to not crash rather than crash.

The natural conclusion, which therefore naturally occurs to everyone eventually, is to do this:

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (Exception x)
{
    MessageBox.Show("Couldn't delete file: " + x.Message);
}

But then you remember that there’s such a thing as a fatal exception. Although you’d generally prefer your software not to crash, in the event of a definite bug, it is better for it to crash – that way, you get an accurate snapshot of the program’s state. So you need to examine the exception object before you try to handle it:

try
{
    new FileInfo(@"c:\myfile.txt").Delete();
}
catch (Exception x)
{
    if (!ExceptionPolicy.IsExceptionRecoverable(x))
        throw;

    MessageBox.Show("Couldn't delete file: " + x.Message);
}
    

Of course, that IsExceptionRecoverable method has to be written by you, but only once, and it probably has a relatively short list of exception types to look out for, and that list doesn’t grow so fast as the list of recoverable exceptions.

The above solution represents the state of the art according to the Microsoft Enterprise Library. Unfortunately, it’s still not right. The problem is to do with the thing I mentioned above about how to correctly write the code within each long and complex operation:

Inside that operation, you need to use try/finally (and related constructs) to ensure that any half-finished stuff is always properly undone.

The problem is, you only want the finally blocks to run for recoverable exceptions, not fatal ones. To repeat what I said right at the start:

If there’s a bug, your program is in an unknown state, and you need to capture that state exactly as it is.

So you definitely don’t want to run any finally blocks before that state is captured. They might modify it (in fact, that’s the whole point of them). They might make it worse, cause further corruption, or even throw another exception and so hide the original bug.

My first thought when this occurred to me was that try/finally presents a serious problem. How can we make it execute the finally block for a recoverable exception but not for a fatal exception?

Well, we can, sort of. If you don’t ever catch an exception, the finally blocks never run. When an exception is thrown, the CLR does not immediately run the finally blocks. First, it checks to see if there is a suitable catch in the stack for the current exception. If there is, and only if there is, it runs the enclosed finally blocks, and then runs the catch block.

This is why the correct advice is to never catch fatal exceptions, because merely catching them will trigger the silent execution of finally blocks when the program is already in an unrecoverable situation.

But this is sadly no good to us, because we’ve already established that explicitly catching only the recoverable exception types will lead us to a situation of hideously and unmanageable complexity, and programs that crash when they don’t need to. It’s just not practical to work that way.

The problem all stems from the fact that catch blocks can only decide to run based on the type of the exception, and this has to be an “opt in”. It can’t be coded the other way round, as an “opt out”. Yes, exceptions can be arranged in hierarchies and so that one catch block can catch all exception types derived from some base class, but this turns out to be useless because there’s no base class that all recoverable exceptions have in common, except for the class which is also the base class of all fatal exceptions as well.

So we cry: if only the CLR allowed us to run some code of our own to decide whether to catch an exception, before the finally blocks run.

The surprising answer (to a C# programmer) is that the CLR does allow us to do that. It just isn’t made available in C#. Only in VB.NET. Devastating, huh?

So to do exception handling in C#, in a way that is actually manageable, but also doesn’t execute code after a fatal exception, you first have to crank up VB.NET and create a class library containing this class:

Imports System

Public Class Exceptions
    Public Shared Sub TryFilterCatch(ByVal tryAction As Action, _
            ByVal filter As Func(Of Exception, Boolean), _
            ByVal handler As Action(Of Exception))
        Try
            tryAction()
        Catch ex As Exception When filter(ex)
            handler(ex)
        End Try
    End Sub
End Class

This effectively turns the magic feature of VB.NET – the When clause that appears next to the Catch – into something we can reuse in any CLR-based language. Having built that, you can now go back to C#… phew!

Now, supposing you still have that IsExceptionRecoverable method handy, you can do this:

Exceptions.TryFilterCatch(() =>
{
    new FileInfo(@"c:\myfile.txt").Delete();
},
ExceptionPolicy.IsExceptionRecoverable, x =>
{
    MessageBox.Show("Couldn't delete file: " + x.Message);
});

Given that we are effectively forced to this conclusion by the above reasoning process, it’s very strange to look at where we’ve ended up. Essentially, if you want a manageable, fault-tolerant, practical approach to exception handling, the above example is what you need to use, instead of using C# try/catch statements.

How does this situation persist? Why isn’t general exception filtering exposed in C#, as it is in VB.NET? These are very good questions. It seems that the C# team doesn’t accept the line of reasoning I gave above. According to them, we should all write long and unmanageable lists of catch handlers, explicitly listing the recoverable exceptions, based on the apparently trustworthy documentation of every software component we ever call into.

The CLR team, on the other hand, is well aware of this problem, and is aware that people are forced to try an alternative, and so inevitably make their way down the chain of reasoning I’ve described here, and the Microsoft Enterprise Library is probably representative of where most people stop in that chain.

The earliest reference I’ve found for all this stuff is here. It covers all the technical details, and mentions the idea of exposing VB.NET’s filtering capability.

This is the blog post that Andrew Pardoe pointed me to. The more I think about it, the more I think it’s a message to the C# team, rather than to users. The chances of many users hearing the message is low, and the chances of them understanding the implications is probably a little lower still. They must be hoping that the C# team will hear it and understand it. I guess C# is a customer of CLR, so the C# team is “always right” in that relationship. Consequently the message will need to come from customers of the C# team!

In the meantime, the CLR team is doing something about this problem:

In version 4 of the CLR, the product team is making exceptions that indicate a corrupted process state distinct from all other exceptions.

This is only part of the solution, of course, because in some situations our own exception types can indicate that the program is in an invalid state. The “corruption” is not of the low-level, random byte trashing kind, but it is corruption by a bug, nonetheless. So we need the recoverability criterion to be specified in a customizable way for each try/catch situation.

This means that, even after version 4 of the CLR, we will still need that snippet of VB code. Until the C# team catch-es up with the VB.NET team.

Wednesday, 25 February 2009

What's Wrong With C# 4.0's dynamic Keyword, and How I Think It Should Be Fixed

Update: Exactly what I want, already implemented, complete with caching of the call site. Also, Microsoft's Mads Torgersen responds to this suggestion here.

C# 4.0 proposes to add the dynamic keyword to the language. There are a few problems with it:

  1. Beginners have no clue what the difference is between object, var and dynamic. This is a relatively minor problem compared to the others, but it's worth considering.
  2. The dependency of a program on specific features of a dynamic object are not captured in a central definition, defeating the type system. I'll explain this a bit better below.
  3. There's no intellisense support (this is really the same thing as point 2, but it's so important that it's worth highlighting on its own).
  4. New language features are an unnecessary pain if you can meet the same need with a library feature - and I believe you can do a better job with a library feature in this case.

In a nutshell, I sincerely hope that dynamic is removed from C# 4.0. In its place, there should be a static class in a namespace System.Dynamic. It would have one method, an extension on object:

public static TInterface AsDynamic<TInterface>(this object obj)
      where TInterface : class

This would take any object at all and return a dynamic wrapper that implements the specified interface by forwarding calls onto the wrapped object. The wrinkle here is that TInterface really should be an interface but there's no way of constraining to that (now there's a language feature I'd accept quite happily).

Now in order to make use of a dynamic object (that is, an object whose type is unknown to me) I simply declare the features I require of that object, in an ordinary interface. Whenever I receive such an object, I wrap it with that interface as soon as possible, and use that interface to refer to it thereafter.

What's so great about this? Suppose my dynamic object is a frog. So first I declare an interface IFrog:

public interface IFrog
{
   void Ribbit();
}

This states in a single location what features a dynamic frog must have.

Now I can make the majority of my code use this interface to refer to frogs. At the "dynamic boundary" where I receive weird objects from the outside world, I immediately convert them to IFrog so the rest of my code can use them:

IFrog f = something.AsDynamic<IFrog>();

This doesn't change the dynamic nature of what is happening. Exactly the same kind of method resolution now has to happen at runtime, and no capability has been lost compared to the present dynamic keyword design.

But there are big advantages. Firstly, users don't need to understand a new keyword that clashes conceptually with two other things. Secondly, what if the dynamic system that passes me frogs changes the way it works, so now it expects an argument to be passed to Ribbit? No problem for me. I just update my interface:

public interface IFrog
{
   void Ribbit(bool loudly);
}

Now when I recompile, the C# compiler will find all the places in my code where I need to provide the new argument. The lesson here is that all languages (whether statically or dynamically typed) end up having these kinds of contracts or dependencies on the features of objects. The advantage of static languages is that it pays to declare those contracts centrally, so you can update them more easily.

Then there's the intellisense - obviously this works as usual with IFrog. It doesn't work at all with the current C# 4.0 design.

Finally, this would be a library feature. It would be available in a consistent way from C#, VB, F#, Eiffel, and so on. It's also very optional - if you don't want to see the AsDynamic extension appearing on every class, then don't add using System.Dynamic to your source file.

Also I may as well point out that it would be easy to instantly find all uses of dynamic type conversion within a solution, by simply finding all references to the AsDynamic method. So no need for a new IDE feature there.

This solution contains the dynamic stuff rather than spreading it around. It strikes me as being far more in the spirit of C# as a statically typed language designed for large-scale development. Yes, the dynamic keyword makes short examples look cool on the page. But it doesn't make them fun to write in the first place (no intellisense), or to maintain (no central type definition).

How about it, Microsoft?

Friday, 13 February 2009

General Theory of Resources

Having blogged the other day about a different way of doing automatic cleanup, I’ve been mulling it over and also answering a question on StackOverflow, and decided that I need to assemble a taxonomy of resources; what kinds there are, how to recognise them by their distinctive markings, guidelines for upkeep, training, feeding, breeding, etc. And because I’m obsessive-compulsive, I also wanted to develop a complete and rigorous theory of resources from the ground up.

I don’t like it when I have to say “In that sort of situation I normally try something like this and it usually seems to work.”

So without further ado, I present the General Theory of Resources.

Resources Modeled by Conceptual Bits

Imagine an array of bits, ones and zeros. Each represents something that has to be rationed between the parts of your program that want to use it. Only one “user” at a time can take ownership of the thing represented by the bit. When the thing is in use, its corresponding bit is set to 1. Otherwise, it is set to 0. If all the bits are 1 when you show up with your begging bowl, you’re out of luck. Note that these are just conceptual bits; real implementations might not actually maintain an array of bits or Booleans specifically for this purpose. But the two-state nature is always there nonetheless. We can imagine each available resource being modeled by a bit indicating whether it is currently reserved for exclusive use.

Are we talking about an infinite number of bits? Of course not. If we were, then we wouldn’t need to go any further in this discussion. We are frequently talking about a single bit. Often we are talking about a number of bits just large enough to fool us into thinking that our program is correct, only to find out later that it isn’t because it eventually runs out of bits.

Enter and Exit

Atop this set of bits, there is a pair of operations, which I’ll call Enter and Exit. When Enter executes, it sets a bit, or possibly a bunch of bits to 1. When Exit executes, it sets the bits back to 0 again. The two operations abstract over the underlying set of bits.

We sometimes say that a resource “exists” whenever we have executed Enter but have not yet executed Exit in the right way to reverse the state changes caused by Enter. A correct program must execute Exit once for every time it executes Enter.

Fungibility

You don’t complain to the bank if they give you “someone else’s cash”; it can be any old cash, as long as it adds up to the required value.

Some resources are fungible, meaning that the caller doesn’t care which particular bunch of bits is set to 1. An incorrect program would run out of such fungible bits because it would forget to Exit and so the number of 1-bits would build up over time, until eventually there would be no 0-bits remaining.

If you came home after leaving your kids with the babysitter, and found that they were different kids, you wouldn’t be satisfied if the baby sitter said, “They look pretty similar to your kids, don’t they?”

Other resources are non-fungible, meaning that the caller requests that a specific bit must be set, so each bit has a meaningful identity. An incorrect program would try to set a non-fungible bit to 1 that it had already set to 1. Sometimes a resource has only one bit to play with, in which case it is surely non-fungible, because we cannot use it without implicitly specifying which bit we want to use.

Orthogonality

Some resource types are orthogonal, meaning that the states of the bits are effectively mutually independent (they may be bunched together by the Enter and Exit operations, but if those bunches are mutually orthogonal then we still say that the resource is orthogonal).

Some are non-orthogonal, meaning that the states of the bits are entangled in some way, even though they are manipulated by separate Enter and Exit operations. Sometimes a resource has only one bit to play with, in which case we may as well call it orthogonal (because there is nothing for the single bit to be entangled with).

The only non-orthogonal kind of resource I consider here is stack-like. There may be other useful kinds, but I can’t think of any off hand.

Some Examples

Here’s a nice table of some contrived examples and their various attributes:

Resource Enter Exit Bits Fungible Orthogonal
Unmanaged memory in C malloc free One per word of memory Yes Yes
A stack of integers A Push method A Pop method One bit per stack depth No No
A static Boolean flag, _entered, protecting a method from reentrance // start of method
if (_entered ==
    true) return;

_entered = true;
// end of method
_entered = false;
One No Yes
A file The CreateFile API The CloseHandle API One per file No Yes
An XML output stream A method such as BeginElement A method such as EndElement One per open element No No
A thread-local integer, _entered, that monitors recursion into a method // start of method
_entered++;
// end of method
_entered--;
One per possible level of recursion No No

Something to consider: I give no example that is both fungible and non-orthogonal. If a resource is non-orthogonal, it means that the user has to be careful about the order in which it takes ownership of or releases ownership instances of that resource. It cannot do this if it is required to treat the resource as fungible, to not care which instance it is holding onto. So non-orthogonal implies non-fungible.

Another thing that should certainly be clear by now is that a "resource" is not something the operating system provides you with, implemented in native code. It can be, in the case of resources encapsulated in kernel handles that can be closed with CloseHandle, but there are other things, which you can write all by yourself, in pure C# or VB.NET or F#, which are also best thought of as resources.

Now let’s pick a few of them apart to make sure it’s all clear.

Unmanaged Memory in C

This may seem like a strange example to pick, because we have managed memory these days. But managed memory is fundamentally the same as unmanaged memory except that the runtime performs lazy cleanup automatically on another thread, and the question we must ask is: what makes this an appropriate solution for managing a resource? With what kinds of resource may we use such a clever solution?

Firstly, consider the C unmanaged memory heap, also known as the “free store”. We’ll assume memory has a resolution of one “word” (which would typically be either 4 or 8 bytes, but that doesn’t matter right now). Each word is either part of the free memory or it is part of some allocated memory. So apart from its actual contents (which we don’t presently care about at all), each word has one of two states associated with it: 0 for free, or 1 for allocated. These state bits are all completely orthogonal - independent from each other.

On top of this simple model, the heap provides a convenient abstraction: you can use Enter to find N contiguous free words for you and it will mark them as allocated (set the bit corresponding to each word). Later on, you can pass to Exit the address of the first word you were allocated and the heap will set those words back to the unallocated state (unset the bits). Each bunch of bits is orthogonal with respect to all other bunches, because the bunches can never overlap.

To reiterate, we are not talking about manipulating the stuff stored in the words (that’s the application’s job). We’re just imagining that the heap keeps track of which words are in use and which are free by recording those facts in a separate array of bits, one bit for each word of memory in the process. Of course in practice the heap can optimize its data structures so it doesn’t actually need to keep an array of bits to track the allocation status of every word separately.

If we rewrite the heap’s operations in C#, using my names for them, then they look like this:

IntPtr Enter(int wordsToAllocate);
void Exit(IntPtr blockToFree);

This is the standard pattern for an orthogonal resource. The parameter(s) to Enter do not give the identity of the bits that need to be set. Instead, they describe what is needed in a more general sense, and the Enter operation finds a group of bits that fit the bill: any group of bits representing a contiguous region of words of the length requested.

We then have this token or receipt being returned by Enter and we have to give it back to Exit when we no longer need the memory. The token can be used to access the block of memory, and it also is the key to freeing it.

A Stack of Integers

Now consider a stack of integers, something like a Stack<int>. The Enter operation takes the thing to put on the stack. This puts the system (the stack) into a state of being taller than it was before. However, the Exit operation does not require any information. It is the nature of a stack to only let you remove the top-most item, so all you need to tell it is that you would like it to remove something. You don’t have to tell it which item to remove (if you could do that, it wouldn’t be a stack). If this isn’t what your program needs, then you shouldn’t be using a stack.

As with any other resource manager, a stack has a conceptual array of bits. It is usually very large, but of course not infinite. Most of the bits are set to 0, most of the time. All the bits set to 1 are at the front of the array, and each 1-bit represents an item on the stack.

1 1 1 1 0 0 0 0 0

When we push something on the stack, we set the next 0-bit to 1:

1 1 1 1 1 0 0 0 0

When we pop something off the stack, we set the last 1-bit to 0:

1 1 1 1 0 0 0 0 0

Obviously there is nothing like orthogonality here. The bits are very much entangled such that no state change can occur that would stop the bits from being properly sorted into two neat groups. You certainly can’t just flip the state of any bit. Nor is there fungibility, because the identity of a specific bit to modify is implied by the operation we execute (non-orthogonality implies non-fungibility, remember?)

If we wanted to we could make our stack resource fit the same pattern as the unmanaged heap functions, with a little creativity and shoehorning. Firstly, the Enter operation might return the new height of the stack, which would serve as a unique resource handle. Exit would then accept that same value. Now we’d have an interface much like the above memory interface. But Exit would have to throw an exception if we passed it a value that was not the current height of the stack. Apart from performing this check, there is no other purpose in passing a value to it. So it is thoroughly misleading to expose the above interface on a stack-like resource. Instead, stack-like resources should have this kind of interface:

void Enter(T item);
T Exit();

This makes it obvious that there is no choice about which state the stack will be restored to next. Bear in mind that the item parameter is quite optional. As with the memory example, we’re not interested here in what the job stack is doing, but merely it how it behaves as a resource.

A Boolean Flag as a Recursion Guard

Sometimes you need to protect a method against re-entrance. For example, suppose you have an event handler that in turn fires another event. There is a danger that someone else will wire up the other event to fire the event your handler is wired up to, and there will be an endless recursion. To protect against this, you can create a static Boolean variable and make your event handler do nothing at all if it is true. Otherwise your event handler is able to run normally. First it sets the flag to true, then it goes about its business (firing the other event), and then sets the flag back to false again. This stops the infinite recursion from happening.

But it also means that your event handler (or whatever method you apply this approach to) is a resource, albeit a very limited one. It can only be used by one client at a time in the whole process. Consequently it is modeled by a single logical state-bit (indeed, it is implemented by one, in the form of the Boolean variable).

This definitely makes it non-fungible (because we cannot help but specify which bit we want to take over) but also orthogonal (because there are no other bits for this single bit to be entangled with).

File Handles

Now for the really complicated example. With file handles, the Enter operation produces a unique identifier that you have to hold on to. You then have to pass the same identifier back to Exit to close the file handle. So far, it looks a lot like memory. But there’s an added wrinkle, which is that there are two different kinds of resource here: files and handles.

The OS maintains a set of handles, and that constitutes a system which does indeed have the same properties as the C memory heap. Each possible handle value has an orthogonal Boolean state (allocated, unallocated) associated with it, just as every word of memory does. Applications are not interested in which handle value they receive, making them also fungible.

But the OS also maintains persistent files with unique names on your hard drive, and these each have an orthogonal state-bit associated with each of them, if we consider write-locking, which is the most problematic case. But these are definitely non-fungible, because they are identified by their names. (Note that if you build an abstraction on top if this system to make up random temporary filenames, then they become fungible again, but that’s another story.)

In reality, the Enter operation presented to applications is actually two operations that together form an atomic transaction. First the specified file is locked, meaning that its associated bit is set to 1. Then an unallocated handle is found (which is entirely the choice of the OS, because like money and memory, handle values are fungible) and its bit is also set to 1.

Consequently, although the “handle system” allows multiple simultaneous handles to exist, the file system forces us to treat files very carefully as resources. It is vital to understand this, because the Enter/Leave API looks treacherously similar to the memory API:

IntPtr Enter(string fileName);
void Exit(IntPtr handleToClose);

It has to look like that because handles work that way. But files do not – they are closer to the example of using a Boolean flag to guard against reentrance.

This frequently tricks people into thinking that they can treat file handles in the same way as memory – that is, let a lazy garbage collector take care of calling Exit automatically some time, hopefully not too long after there are no remaining references to the file handle. But the difference is in the information being passed into Enter. It is not a general description of what might fit the bill, but the specific identity of the one and only thing that is needed.

So the truth is that a program that fails to close its file handles in a well-understood way is probably not going to work very well. On Windows there is in fact no way of opening a file without locking it against at least one other operation (deletion), so even non-exclusive file handles have some problems.

As a digression, consider directories or folders in the file system. Enter creates a directory and Leave destroys it. There is one root directory which cannot be destroyed, and all other directories must have a parent directory. The result is a tree-like structure. But we can conceptually simplify it by thinking of it as a set of interleaved stacks. The chain from any leaf directory to the root is a stack, so we know for sure that tree-like resources are at least as restrictive as stack-like resources.

The other two examples are not appreciably different from what we’ve already examined. A stream of XML being written sequentially – however it is implemented – is effectively a stack. Each 1-bit represents an open element. The elements must be closed in the right order to ensure well-formed XML is produced. And the integer being used to monitor the level of recursion into a method, that too is a stack: one that stores nothing except a record of its current depth.

Internal Resources vs External Resources

One crucial point that should be clear by now is that there is not really a distinction between states “inside” your program and states “outside” of it. There is a tendency among C# programmers to think of the stuff they write themselves as “managed” (because it’s in managed code) and the concepts exposed by the operating system as native or “unmanaged”.

In fact, if your program internally maintains state, and the state needs to obey rules like the ones described here, then it inevitably defines resources of its own. And unless you take steps to provide automatic management of the lifetimes of your custom resources, then they are unmanaged resources, no matter what language they are written in or what runtime technology they are based on.

While an object exists in the CLR, it may have internal mutable state, and there may be rules for the correct use of that state, such that we have to conclude unavoidably that it is a type of resource, and it may or may not be be fungible or orthogonal. So the fact that the whole object will eventually be garbage collected is no comfort. What matters is that the state of it is modified correctly during its lifetime. We need some kind of automatic resource management to deal with the internal state of our own objects, as well as the objects exposed to us the operating system.

So what does all this tell us? It tells us that managed memory is one kind of resource among many, that the fundamental nature of resources means that they require management of some kind, and that the technique (lazy garbage collection) applicable to management of memory is only applicable because of specific properties of that kind of resource (fungibility, orthogonality). Other resources do not have those properties, and yet still need to be managed. So we need other techniques for them.

Resource Management Techniques

How can we manage resources easily in C#? By taking advantage of facilities in the language wherever possible. We have the interface to the resource in the form of the Enter and Exit functions, but we would like to avoid the kind of problems that occur when we forget to call Exit once for each call to Enter. So we want something easier than merely putting in the calls manually.

There are three solutions:

  • Finalizers
  • IDisposable/using
  • Gateways (methods that Enter, forward to a delegate, then Exit)

The first two are often confused and conflated.  The last one is, to my mind, massively superior to the other two, for reasons that should become clear, and yet is rarely described in this context as far as I know. I’ve coined the term “gateway” because I’m not aware of another term for them, so please let me know if there’s a better name.

Finalizers

Finalizers are essentially horrid nasty things. It should rarely be necessary to write one. It is notoriously difficult to write one correctly. I have never bothered.

The basic idea is to hitch a ride with the GC’s ability to lazily dispose of memory resources. The information needed to Exit a resource is stored in an object, and when that object is reclaimed by the GC, the object’s finalizer executes and calls Exit for the resource.

The GC schedules the finalizer to run lazily (that is, some time soon but not immediately or at a predictable time), which means that it is only really useful for resources that are fungible. Consider what this means for file handles: finalization will ensure that your process does not run out of handles, because they are fungible and in plentiful supply, but it does not help at all in avoiding bugs caused by holding onto file locks for longer than necessary, because files are non-fungible.

And don’t even consider trying to manage stack-like resources with finalizers. Imagine the mess, with things being popped when they shouldn’t be.

The information needed to Exit the resource must be stored in value-type fields only; it is not generally safe to access other reference objects (classes) from a finalizer. This means that finalizers are only really useful for resources defined by the operating system as kernel objects (and in fact there is a better way of encapsulating those, called the SafeHandle.)

So the less said about finalizers the better.

IDisposable/using

The other official way of managing resources is to encapsulate the resource in an object (as with the finalizer approach) and to give it a Dispose method, as defined in the interface IDisposable. The construction of the object corresponds to Enter, and Dispose serves as an Exit.

Then the language provides the using statement as a convenient way of ensuring that Dispose will be called when the flow of control leaves a specified scope:

 using ( resource-acquisition ) embedded-statement 

This approach is fully deterministic and non-lazy. It has none of the limitations of finalizers. Because the using statement weaves itself into the thread’s method-call stack, it works very well with stack resources - each object’s Dispose is guaranteed by the C# specification to be called in the reverse order that the objects were constructed, assuming they were constructed in the resource-acquisition.

There are several intrinsic technical drawbacks, however:

  • The fact that Dispose is available as a public method gives the impression of an orthogonal resource. You don’t have to employ the using statement – you can just call Dispose manually whenever you want. This would play havoc with a stack-like resource just as surely as using a finalizer.
  • It is surprisingly easy to forget to put in the using statement. There are ways to mitigate this, but the problem is that you do have to know that the problem exists before you can take the necessary steps. Implementing a resource as an object does not make it difficult to abuse. It makes it easy to abuse.
  • The using statement ensures that Dispose is called even if an exception propagates from the embedded-statement. However, little is said about what happens if Dispose then throws another exception. This is all too likely in many common cases. What actually happens is that the CLR discards the first exception, so all information about the original cause is lost.

So the IDisposable/using technique doesn’t exactly call out to the user to employ it correctly. There are various ways to use it wrongly, and the exception problem is especially irritating. Nevertheless it is possible to use it correctly with a wide range of resources. Because it follows the program stack as it grows and shrinks, it obeys the rules of stack-like resources. Because it is deterministic, it is able to provide the predictable immediate Exit necessary when handling non-fungible resources.

The Cultural Problem with IDisposable

But these mere technical problems are minor compared with the cultural problem that exists around this technique. These are caused by copious amounts of official or semi-official documentation which confuse it with finalizers, in apparent denial of their different areas of applicability. For example, in this article we are shown the “basic IDisposable pattern” and sure enough, it has a finalizer as well. In such articles IDisposable is described as a safety feature to be added to any class that has a finalizer, when in fact it is a general mechanism for hooking into the program stack unwinding process via objects, built on top of the more fundamental try/finally mechanism.

The logical fallacy that results is like Woody Allen’s conclusion that “All men are Socrates”. The advice, if read carefully, is merely that all finalizable classes should also implement IDisposable. But the idea people frequently come away with is that any class that implements IDisposable should also have a finalizer. They get it completely the wrong way around.

I’ve found it almost impossible to shift this two-way link from many people’s minds, so much so that it is often controversial to suggest managing a stack-like resource with IDisposable, because some people will insist that the resource must also have a finalizer “just in case the user forgets to call Dispose”. Of course, for a stack-like resource the finalizer will be no help at all. In fact it will be no use for any resource defined in terms of reference objects, because it is not safe to manipulate them from a finalizer.

This cultural problem doesn’t exist in C++/CLI, because there the existence of IDisposable is hidden behind the C++ language facility called a destructor. A typical C++/CLI program would have lots of classes that implement IDisposable without the programmer giving it a second thought, and none of them would have finalizers. In C++, destructors are quite often used to manage stack-like resources, which in C++/CLI automatically translates into IDisposable being employed for that same purpose. This is due mostly to Herb Sutter, whose annotations here are particularly illuminating.

Gateways

The cultural problems around IDisposable prompted me to look for an alternative, and happily I’ve found one that is applicable in a lot of cases. I already described it in a previous blog post, although there I merely extolled its virtue of being more expressive than the using statement.

But I mostly glossed over its other technical advantages. First and foremost is the way it is extremely difficult to abuse. Providing access to a resource only through the gateway of a function is practically impossible to get wrong. You can’t forget to put in the using statement, because you don’t need one. Nor can you call the Dispose method of nested stack-like resources in the wrong order, because no such method is exposed to you.

For some kinds of resource, it is not necessary or desirable to force them to exit when an exception occurs. The reason for this is that you would expect an exception to abort the whole operation in which the resource is relevant. An example is the TraceWriter class that I included in the source code of this post. It includes this gateway method:

public void Indented(Action whileIndented)
{
    WriteLine("{");
    _indent++;
    whileIndented();
    _indent--;
    WriteLine("}");
}

This allows me to write very readable code:

writer.WriteLine("string comparison:");
writer.Indented(() =>
    {
        writer.WriteLine(leftValue);
        writer.WriteLine(rightValue);
        writer.WriteLine(new String(' ', index)
                         + "^ difference at index " + index);
    });

It is absolutely impossible for me to forget to un-indent after I’ve indented, thanks to the use of a gateway function to control the use of the indenting “resource”. However, I’ve made the Indented method completely ignore exceptions, because in this program, if an exception occurs during the logging process, I expect the whole log file to be abandoned anyway. Had this not been the case, I would have use try/finally within Indented.

Only by writing our own gateway functions can we make the right choice for each case.

This works for all kinds of resources, of course. It is not only free from the cultural/educational baggage surrounding IDisposable, but it also does a better job technically, giving us control over the handling of exceptions and protecting us from accidental abuses.

So in summary: where resource-like patterns of state occur in your own programs, I recommend where possible making little gateway functions to provide access to those states, instead encapsulating the resource in an IDisposable class.