Sunday, 29 March 2009

General form of C# Object Initializers

C# has a language feature that allows several properties of an object to be assigned to as a suffix to a new expression, namely the object initializer:

var myObj = new MyClass
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
            };

As properties can have hand-coded setters, this is an opportunity to call several methods on the newly constructed object, without having to make each method return the same object.

The limitations on property setters are:

  • They can only accept one argument
  • They cannot be generic
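Within those limits, though, a setter can run arbitrary code, which is what makes initializers more than plain field assignment. A minimal sketch (the Widget class and its members are invented for illustration):

```csharp
using System;

public class Widget
{
    private int _size;

    // The setter runs arbitrary code on every assignment, so an
    // object initializer is really a sequence of method calls.
    public int Size
    {
        get { return _size; }
        set
        {
            Validate(value);
            _size = value;
        }
    }

    private static void Validate(int size)
    {
        if (size < 0)
            throw new ArgumentOutOfRangeException("size");
    }
}
```

So writing new Widget { Size = 10 } calls the setter, which in turn calls Validate - two method calls hiding behind what looks like an assignment.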

I would like it if we could call methods and enlist in events, as well as assign to properties, inside an object initializer block.

var myObj = new MyClass
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
                DoSomething(),
                Click += (se, ev) => MessageBox.Show("Clicked!"),
            };

And why should such a block of modifications only be applicable immediately after construction? We could have:

myObj with
{
    SomeProperty = 5,
    Another = true,
    Complain = str => MessageBox.Show(str),
    DoSomething(),
    Click += (se, ev) => MessageBox.Show("Clicked!"),
}

The with would be a new keyword that operates on an object of some type and produces the same object and type - note that this would be an expression, not a statement.

So you could use initializer-style syntax regardless of whether you'd got the object from a new expression or from an IOC or factory method, etc.

In fact, you could use with after an ordinary new expression, and it would be equivalent to the current style of object initializer:

var myObj = new MyClass() with
            {
                SomeProperty = 5,
                Another = true,
                Complain = str => MessageBox.Show(str),
                DoSomething(),
                Click += (se, ev) => MessageBox.Show("Clicked!")
            };

I mused about this in a Stack Overflow answer, and Charlie Flowers pointed out something I should have realised immediately - we can implement with as an extension method.

public static T With<T>(this T with, Action<T> action)
{
    if (with != null)
        action(with);
    return with;
}

Here's the equivalent of a normal object initializer, but with event enlisting as well:

var myObj = new MyClass().With(w =>
            {
                w.SomeProperty = 5;
                w.Another = true;
                w.Click += (se, ev) => MessageBox.Show("Clicked!");
            });

And on a factory method instead of a new:

var myObj = Factory.Alloc().With(w =>
            {
                w.SomeProperty = 5;
                w.Another = true;
                w.Click += (se, ev) => MessageBox.Show("Clicked!");
            });

I couldn't resist giving it the "maybe monad"-style check for null as well, so if you have something that might return null, you can still apply With to it and then check it for null-ness.
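For instance, given a lookup that may return null, With quietly skips the action and passes the null through, so the caller can still test the result afterwards. A self-contained sketch (Customer and FindCustomer are invented for illustration; the extension method is repeated so it compiles standalone):

```csharp
using System;

public static class ObjectExtensions
{
    // The With extension method from above.
    public static T With<T>(this T with, Action<T> action)
    {
        if (with != null)
            action(with);
        return with;
    }
}

public class Customer { public decimal Discount; }

public class Demo
{
    // Hypothetical lookup that may fail and return null.
    public static Customer FindCustomer(int id)
    {
        return id == 1 ? new Customer() : null;
    }

    public static void Main()
    {
        // The action runs only when the lookup succeeded...
        Customer found = FindCustomer(1).With(w => w.Discount = 0.1m);

        // ...and a null result flows through untouched, ready to be checked.
        Customer missing = FindCustomer(2).With(w => w.Discount = 0.1m);

        Console.WriteLine(found.Discount == 0.1m);  // True
        Console.WriteLine(missing == null);         // True
    }
}
```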

Friday, 6 March 2009

Readonly Fields and Thread Safety

In C# you can mark the fields of a type as readonly (indeed you generally should if it's a struct). By doing so, you make it illegal to change the fields except in a constructor of the type.

The advantage of this is that readonly data can be shared freely between threads without any thread causing updates that may be seen "in progress" by other threads, simply because there never will be any updates.

But is this strictly speaking true? Nope. It would be true if it were impossible for another thread to gain access to a partially constructed object. This seems true at first because the constructor of an object effectively "returns" a reference to itself via the new operator, which then becomes available to the caller of new, at which point the object is available to be passed around and is also fully constructed.

But a constructor can call methods, and it can pass this to them. Those methods could hand the reference to other threads before construction has finished. So you have to be careful what you do in a constructor. Maybe the compiler will help to check this somehow in a future version of the language.
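Here's a sketch of the kind of leak in question (the class is invented for illustration). The constructor publishes this to a shared static field before it has finished assigning its readonly field, so another thread could observe Threshold still at its default value of zero:

```csharp
using System.Threading;

public class Config
{
    // A shared location another thread might be polling.
    public static volatile Config LastCreated;

    public readonly int Threshold;

    public Config(int threshold)
    {
        // DANGER: 'this' escapes before construction is complete.
        LastCreated = this;

        Thread.Sleep(10);        // simulate further initialization work

        Threshold = threshold;   // readonly, but assigned after the leak
    }
}
```

A thread that reads Config.LastCreated during that Sleep sees a Config whose readonly Threshold is still 0 - exactly the "in progress" state that readonly seemed to rule out.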

This ties in nicely with some general advice about constructors that Brad Abrams gives in the Annotated C# Programming Language: do as little as possible in the constructor, and do everything else "lazily" - that is, the first time you need it. If you find yourself requiring non-readonly fields to achieve that, then you potentially have a problem with your design, because your class as a whole is not going to be readonly and so will not be trivially shareable in the way you originally assumed.
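A sketch of that lazy pattern (the Report class is invented): the constructor stores only what it was given, and the expensive work happens on first access. Notice the cache field cannot be readonly, which is exactly the design tension just described:

```csharp
public class Report
{
    private readonly string _path;
    private string _contents;   // cannot be readonly: filled in lazily

    public Report(string path)
    {
        _path = path;           // the constructor does the bare minimum
    }

    public string Contents
    {
        get
        {
            // Expensive work deferred until the first time it's needed.
            if (_contents == null)
                _contents = System.IO.File.ReadAllText(_path);
            return _contents;
        }
    }
}
```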

Wednesday, 4 March 2009

JavaScript Invaders!

Click in the small text field in the top left corner to allow the game to receive keyboard events.

  • Z = left
  • X = right
  • M = fire

Compatible with IE, Firefox and Safari; pure JavaScript and HTML.

I wrote this during a train journey, just to see if it was practically possible. Only then did I google for other examples, and found quite a few going back years. This version of Pacman has excellent attention to detail.

Sunday, 1 March 2009

The Amazing Speed of the .NET Garbage Collector

Coming from a C++ background, I always balk at designs that involve lots of small heap allocations. It’s something I’m gradually training myself to stop doing, because in managed code, it’s a completely irrational concern.

Given the constant usefulness of lambdas for things like this, it is worth reassuring myself (and others) that the overhead of such techniques is negligible. I already knew it probably was, but I had no idea just how negligible. It really is ridiculously unimportant.

Here’s my simple test program – most of the operations it does are meaningless busywork (like a typical school day). The important thing is the difference between Foo1 and Foo2:

public class Program
{
    private static readonly Queue<string> _q = new Queue<string>();

    public static int _c;

    public static void Push() { _q.Enqueue(_c.ToString()); }
    public static void Pop() { _q.Dequeue(); }
    public static void Count() { _c++; }

    public static int Foo1()
    {
        int x = 0;
        Push();
        Count();
        x++;
        Pop();
        return x;
    }

    public static void Gateway(Action a)
    {
        Push();
        a();
        Pop();
    }

    public static int Foo2()
    {
        int x = 0;
        Gateway(() =>
                       {
                           Count();
                           x++;
                       });
        return x;
    }

    public static double Time(Action a)
    {
        Stopwatch s = new Stopwatch();
        s.Start();
        a();
        s.Stop();
        return s.ElapsedMilliseconds;
    }

    public static void Main(string[] args)
    {
        _q.Enqueue("x");
        _q.Enqueue("y");
        _q.Enqueue("z");

        int reps = 100000000;

        double direct = Time(() =>
        {
            for (int n = 0; n < reps; n++)
                Foo1();
        });
        
        Console.WriteLine("Direct, " + reps + " calls: " + direct / 1000 + " s");

        double lambda = Time(() =>
        {
            for (int n = 0; n < reps; n++)
                Foo2();
        });

        Console.WriteLine("Lambda, " + reps + " calls: " + lambda / 1000 + " s");

        double overhead = (lambda - direct);
        double percall = overhead / reps;

        Console.WriteLine("Overhead per call: " + percall * 1000000 + " ns");
    }
}

Foo1 calls Push, Count and Pop and also fools around with a local variable, x. Note that Push and Pop manipulate a queue of strings. The queue stays a constant length during the test, with newly allocated strings being inserted and old strings being removed, so the GC has some work to do. The reason I used a queue was so that the lifetimes of GC objects would not simply coincide with the stack of function calls, forcing the CLR (I vaguely assume) to truly allocate objects in Generation 0 instead of “cheating” and using the stack to store everything.

Foo2 does the same, but it does so via Gateway. Also note that it captures the local variable x in a closure, so it can be read and modified in Foo2 and in the nested lambda.

This means that for every call to Foo2, it is necessary to allocate two GC objects. Firstly, the closure is implemented by a compiler generated class, which would look like this:

[CompilerGenerated]
private sealed class <>c__DisplayClass1
{
    public int x;

    public void <Foo2>b__0()
    {
        Program.Count();
        x++;
    }
}

The local variable x has become a field of that class. So every time Foo2 is called, an instance of <>c__DisplayClass1 is allocated to store the variable x. Then there’s the lambda, which has become the method <Foo2>b__0 of the class. In order to pass it to Gateway, an instance of the Action delegate must be created that has an Invoke method that forwards to <Foo2>b__0.

To confirm this, look at the IL for Foo2:

.method public hidebysig static int32 Foo2() cil managed
{
    .maxstack 3
    .locals init (
        [0] class Program/<>c__DisplayClass1 CS$<>8__locals2)
    L_0000: newobj instance void Program/<>c__DisplayClass1::.ctor()
    L_0005: stloc.0 
    L_0006: ldloc.0 
    L_0007: ldc.i4.0 
    L_0008: stfld int32 Program/<>c__DisplayClass1::x
    L_000d: ldloc.0 
    L_000e: ldftn instance void Program/<>c__DisplayClass1::<Foo2>b__0()
    L_0014: newobj instance void [System.Core]System.Action::.ctor(object, native int)
    L_0019: call void Program::Gateway(class [System.Core]System.Action)
    L_001e: ldloc.0 
    L_001f: ldfld int32 Program/<>c__DisplayClass1::x
    L_0024: ret 
}

Sure enough, there are two calls to newobj – the first one allocates the storage for the closure, the second allocates the Action delegate. Compare this with the much cleaner-looking code for Foo1:

.method public hidebysig static int32 Foo1() cil managed
{
    .maxstack 2
    .locals init (
        [0] int32 x)
    L_0000: ldc.i4.0 
    L_0001: stloc.0 
    L_0002: call void Program::Push()
    L_0007: call void Program::Count()
    L_000c: ldloc.0 
    L_000d: ldc.i4.1 
    L_000e: add 
    L_000f: stloc.0 
    L_0010: call void Program::Pop()
    L_0015: ldloc.0 
    L_0016: ret 
}

No allocations, just some method calls. So the overhead of Foo2 must be pretty big, right? Especially when those heap-allocated objects will need to be garbage-collected all the time. What if a lot of them build up? Won’t that put “pressure” on the GC and make everything grind to a halt?

Well, try running the test program. It does a hundred million calls to Foo1, and then the same number of calls to Foo2. It then figures out the difference in the timings for these runs. On my VM of Windows XP running on a MacBook Pro, the output is:

Direct, 100000000 calls: 27.209 s
Lambda, 100000000 calls: 33.515 s
Overhead per call: 63.06 ns

Now, it’s customary with this kind of test to observe that 33 seconds is longer than 27 seconds and try to conclude something from that, like “lambdas make your code about 25% slower”. However, that would be completely stupid. The point is that the code in the test is doing very little, apart from keeping the GC a little busy, so that makes the overhead appear more significant than it would be in a real program. The important point is that it only took 6 seconds longer even though we are doing a hundred million calls, i.e. the overhead per call is about 60 nanoseconds. Which is very, very small. Code that actually did something useful would make such an overhead utterly invisible.

Note that if the GC was not collecting at regular intervals during the test, we would have a build up of 400 million objects (100 million strings in Foo1, another 100 million in Foo2, which also needs 100 million closures and 100 million delegates). The overhead on an object is 8 bytes (two pointers used by the CLR), so that’s approximately 3 GB of memory requirement even without considering the actual data stored in the objects. So the GC clearly is cleaning up old objects during the test, or else it wouldn’t finish successfully.

So the GC heap is so fast that in a real program, even in tight loops, you can use closures and delegates without giving it a second's thought (or even a few nanoseconds' thought). As always, work on a clean, safe design, then profile to find out where the overhead is.