I have moved!

I've moved my blog
CLICK HERE

Thursday, 4 June 2009

Lazy Sequence Generation without using Yield Return

C# has the yield return feature, which makes it easy to write state machines as if they were plain old methods:

class Program
{
    static IEnumerable<int> SomeNumbers()
    {
        Console.WriteLine("Started");

        yield return 1;
        
        Console.WriteLine("Yielded 1");

        yield return 2;
            
        Console.WriteLine("Yielded 2");

        yield return 3;

        Console.WriteLine("Finished");
    }

    static void Main(string[] args)
    {
        foreach (int n in SomeNumbers())
        {
            Console.WriteLine(n);
        }
    }
}

The output shows that the sequence of numbers is generated "lazily", with each chunk of code in the method being executed only on demand as the foreach pulls values from sequence.

How close can we get to this using only lamdbas? Pretty close:

class Program
{
    static Lazy.Yielding<int> SomeNumbers()
    {
        Console.WriteLine("Started");

        return Lazy.Yield(1, () => {

        Console.WriteLine("Yielded 1");

        return Lazy.Yield(2, () => {
        
        Console.WriteLine("Yielded 2");

        return Lazy.Yield(3, () => {

        Console.WriteLine("Finished");

        return null; }); }); });
    }

    static void Main(string[] args)
    {
        foreach (int n in Lazy.Enumerate<int>(SomeNumbers))
        {
            Console.WriteLine(n);
        }
    }
}

By messing with the nesting of my curly braces, I've made it look like the original, but really it's made of three nested lambdas. So this version of SomeNumbers is, deep breath... a function that returns a function that returns a function that returns a function.

Each returned function supplies the code to execute for the next step.

The main remaining ingredient is a helper function Lazy.Enumerate that turns our strange contraption into a plain IEnumerable, so we can loop through it conveniently.

public static class Lazy
{
    public class Yielding<T>
    {
        public readonly T Result;
        public readonly Func<Yielding<T>> Next;

        public Yielding(T result)
        {
            Result = result;
            Next = null;
        }

        public Yielding(T result, Func<Yielding<T>> next)
        {
            Result = result;
            Next = next;
        }
    }

    public static Yielding<T> Yield<T>(T value, Func<Yielding<T>> next)
    {
        return new Yielding<T>(value, next);
    }

    public static Yielding<T> Yield<T>(T value)
    {
        return new Yielding<T>(value);
    }

    public class Seq<T> : IEnumerable<T>
    {
        private readonly Func<Yielding<T>> _generator;

        public Seq(Func<Yielding<T>> generator)
        {
            _generator = generator;
        }

        private class Iter : IEnumerator<T>
        {
            private Func<Yielding<T>> _generator;

            public Iter(Func<Yielding<T>> generator)
            {
                _generator = generator;
            }

            public T Current { get; set; }

            public void Dispose() { }

            object System.Collections.IEnumerator.Current
            {
                get { return Current; }
            }

            public bool MoveNext()
            {
                if (_generator == null)
                    return false;

                Yielding<T> yielding = _generator();
                if (yielding == null)
                {
                    _generator = null;
                    return false;
                }

                Current = yielding.Result;
                _generator = yielding.Next;
                return true;
            }

            public void Reset()
            {
                throw new NotImplementedException();
            }
        }

        public IEnumerator<T> GetEnumerator()
        {
            return new Iter(_generator);
        }

        System.Collections.IEnumerator 
            System.Collections.IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }

    public static IEnumerable<T> Enumerate<T>(
                      Func<Yielding<T>> generator)
    {
        return new Seq<T>(generator);
    }
}

Most of the code is "boilerplate" implementation of IEnumerable/IEnumerator. The important bit is the MoveNext method, which calls the generator method to get the next value and the next generator method.

So what's missing? The major thing (aside from the misery of getting the syntax right, closing all the right brackets, etc.) is the lack of try/finally support, which turns out to be extremely useful. We could add support for that, however. Firstly, we'd add this member to Yielding.

public readonly Action Finally;

The generator code would initialize that field to whatever action it liked, to represent the finally block, before returning the Yielding instance. And Lazy.Seq.Iter would store it so it could execute it before retrieving the next value, and it would also execute it from the Dispose method, so that the Finally action would run even if the loop was abandoned.

2 comments:

Marcel said...

Er... do you have a reason for this? :)

Daniel Earwicker said...

I'm a bit of a physicist, so I like finding ways of unifying things that appear to be separate!