Friday 7 November 2008

Fun with Internal DSLs in C++

About three years ago I experimented with internal DSLs in C++, although I didn't know to call them that at the time (I'm not sure when the term was coined). An internal DSL is what you get by twisting the features of an existing language to make what feels like a new language.

The purpose of my DSL was to allow C++ programmers to naturally express database queries that would be executed against an RDBMS as standard SQL queries. In other words, it had exactly the same aim as LINQ, although again I wasn't to know that at the time.

The starting point was a couple of template classes called column and table, which serve the purpose of making the names of tables and columns visible within the C++ type system.
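To make that concrete, here is a minimal sketch of what such a pair of templates could look like. It is not the original library code: the members `qualified_name` and `sql_name`, the CRTP layout and the `users_table` struct are all assumptions made up for illustration.

```cpp
#include <cassert>
#include <string>

// Illustrative only: these definitions are assumptions, not the original library.

// CRTP base that marks a type as a table.
template <class Derived>
struct table
{
    typedef Derived table_type;
};

// A column records its owning table, its C++ value type and its SQL name.
template <class Table, class Value>
struct column
{
    typedef Table table_type;
    typedef Value value_type;

    explicit column(const char* n) : name(n) {}

    // Fully qualified SQL name, for example "USERS.EMAIL".
    std::string qualified_name() const
    {
        return std::string(Table::sql_name()) + "." + name;
    }

    const char* name;
};

// A hand-written table definition, roughly what a macro could generate.
struct users_table : table<users_table>
{
    static const char* sql_name() { return "USERS"; }

    column<users_table, int> id;
    column<users_table, std::wstring> email;

    users_table() : id("USERID"), email("EMAIL") {}
};

// Small self-check showing how the names come back out.
inline void check_users_table()
{
    users_table users;
    assert(users.id.qualified_name() == "USERS.USERID");
    assert(users.email.qualified_name() == "USERS.EMAIL");
}
```

The point is only that both the SQL name and the C++ value type end up attached to a distinct type, so the compiler can check them.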

These were wrapped in some convenient macros, so you could declare the structure of your database tables like this:

SQL_BEGIN_NAMED_TABLE(users, "USERS")
    SQL_DECLARE_NAMED_COLUMN(id, "USERID", int)
    SQL_DECLARE_NAMED_COLUMN(username, "USERNAME", std::wstring)
    SQL_DECLARE_NAMED_COLUMN(password, "PASSWORD", std::wstring)
    SQL_DECLARE_NAMED_COLUMN(accesslevel, "ACCESSLEVEL", int)
    SQL_DECLARE_NAMED_COLUMN(usertype, "WINUSERFLAG", int)
    SQL_DECLARE_NAMED_COLUMN(longname, "LONGNAME", std::wstring)
    SQL_DECLARE_NAMED_COLUMN(email, "EMAIL", std::wstring)
    SQL_DECLARE_NAMED_COLUMN(dynamic, "DYNAMIC", int)
    SQL_DECLARE_NAMED_COLUMN(userflags, "USERFLAGS", int)
    SQL_DECLARE_NAMED_COLUMN(pwdexpirytime, "PWDEXPIRYTIME", sql::datetime_type)
SQL_END_TABLE(users)

So there we have a table called USERS with a bunch of columns. The names and data types of the columns are part of the information captured in the resulting type structure.

The macros actually declare a type and also an instance of that type. The above example declares a type called users_t to represent the table, and also a nested type called users_t::password_t to represent that column. Along with these, it declares instances of those types called users and users.password. And the same for the other columns.
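A rough sketch of how macros can declare a type and an instance in one go is below. The `DEMO_*` macros are hypothetical stand-ins (deliberately not named `SQL_*`), and the `sql_name` and `value_type` members are assumptions; the real macros are not shown in this post.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical macros, named DEMO_* to make clear they are not the SQL_* originals.
// Opens a struct called <name>_t that carries the table's SQL name.
#define DEMO_BEGIN_TABLE(name, sql) \
    struct name##_t \
    { \
        static const char* sql_name() { return sql; }

// Declares a nested type <name>_t plus a member instance called <name>.
#define DEMO_COLUMN(name, sql, type) \
        struct name##_t \
        { \
            typedef type value_type; \
            static const char* sql_name() { return sql; } \
        } name;

// Closes the table struct and declares an instance called <name>.
#define DEMO_END_TABLE(name) \
    } name;

DEMO_BEGIN_TABLE(users, "USERS")
    DEMO_COLUMN(id, "USERID", int)
    DEMO_COLUMN(email, "EMAIL", std::wstring)
DEMO_END_TABLE(users)

// The column's C++ type is available to the compiler.
static_assert(std::is_same<users_t::email_t::value_type, std::wstring>::value,
              "email is a wide string column");

// Small self-check: the instances give access to the SQL names.
inline void check_demo_macros()
{
    assert(std::string(users.sql_name()) == "USERS");
    assert(std::string(users.email.sql_name()) == "EMAIL");
}
```

The trick is simply that `struct X_t { ... } X;` declares a type and a variable in the same statement, and the token-pasting operator `##` builds the type name from the instance name.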

We can then write things like this:

record_set<users_t::username_t, users_t::email_t> admins;

db.select(
    into = admins,
    from = users,
    where = users.accesslevel == 2
);

The Boost Parameter library provides the named parameter syntax (I wrote my own equivalent first before realising that Boost Parameter existed, and then retro-fitted it).
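For anyone curious how the `into = admins` syntax can work at all, here is a minimal hand-rolled sketch. It is neither Boost Parameter nor my original equivalent: `keyword`, `named_arg`, `arg_pack`, `find_arg` and `run_select` are names invented for this example, and the "query" is just a string so the example stays small.

```cpp
#include <cassert>
#include <string>

// Illustrative only: a hand-rolled sketch, not Boost.Parameter and not the original code.

// A value tagged with the keyword it was assigned to.
template <class Tag, class T>
struct named_arg
{
    T& ref;
};

// A keyword object; "keyword = value" produces a tagged reference to value.
template <class Tag>
struct keyword
{
    template <class T>
    named_arg<Tag, T> operator=(T& value) const
    {
        return named_arg<Tag, T>{value};
    }

    template <class T>
    named_arg<Tag, const T> operator=(const T& value) const
    {
        return named_arg<Tag, const T>{value};
    }
};

struct into_tag {};
struct from_tag {};
struct where_tag {};

const keyword<into_tag> into = keyword<into_tag>();
const keyword<from_tag> from = keyword<from_tag>();
const keyword<where_tag> where = keyword<where_tag>();

// Bundles the tagged arguments by inheriting from each of them.
template <class... Args>
struct arg_pack : Args...
{
    explicit arg_pack(Args... args) : Args(args)... {}
};

// Looks an argument up by tag; deduction picks the matching base class.
template <class Tag, class T>
T& find_arg(const named_arg<Tag, T>& arg)
{
    return arg.ref;
}

// Accepts the three named arguments in any order.
template <class A, class B, class C>
void run_select(A a, B b, C c)
{
    arg_pack<A, B, C> args(a, b, c);
    find_arg<into_tag>(args) =
        "SELECT * FROM " + find_arg<from_tag>(args) +
        " WHERE " + find_arg<where_tag>(args);
}

// Small self-check: the argument order at the call site does not matter.
inline void check_named_parameters()
{
    std::string sql;
    run_select(where = std::string("ACCESSLEVEL = 2"),
               into = sql,
               from = std::string("USERS"));
    assert(sql == "SELECT * FROM USERS WHERE ACCESSLEVEL = 2");
}
```

Each keyword is an object whose `operator=` returns a tagged reference instead of assigning anything, and the callee finds each argument by its tag rather than by its position.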

The predicate expression, as seen in the where clause, can get quite complicated. It can compare columns with values, or with each other, and it can use the standard && and || operators, amongst others. This all gets captured and turned into SQL, just like in LINQ, but it's done with operator overloading: each overloaded operator returns an object that describes the expression instead of evaluating it. This trick is called expression templates in the C++ world.
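Here is a minimal sketch of the expression template idea, assuming a much simpler setup than the real library: `column_ref`, `int_literal`, `binary_expr` and `to_sql` are hypothetical names, and only `==`, `&&` and `||` are handled.

```cpp
#include <cassert>
#include <string>

// Illustrative only: hypothetical types, far simpler than the real library.

struct column_ref { const char* name; };
struct int_literal { int value; };

// One node of the captured expression tree; its shape is encoded in the type.
template <class L, class R>
struct binary_expr
{
    L lhs;
    const char* op;
    R rhs;
};

// Comparisons build a node instead of returning bool.
inline binary_expr<column_ref, int_literal> operator==(column_ref lhs, int rhs)
{
    return {lhs, "=", int_literal{rhs}};
}

inline binary_expr<column_ref, column_ref> operator==(column_ref lhs, column_ref rhs)
{
    return {lhs, "=", rhs};
}

// Logical operators combine two existing nodes into a bigger one.
template <class L1, class R1, class L2, class R2>
binary_expr<binary_expr<L1, R1>, binary_expr<L2, R2> >
operator&&(binary_expr<L1, R1> a, binary_expr<L2, R2> b)
{
    return {a, "AND", b};
}

template <class L1, class R1, class L2, class R2>
binary_expr<binary_expr<L1, R1>, binary_expr<L2, R2> >
operator||(binary_expr<L1, R1> a, binary_expr<L2, R2> b)
{
    return {a, "OR", b};
}

// Walking the tree afterwards produces the SQL text.
inline std::string to_sql(const column_ref& c) { return c.name; }
inline std::string to_sql(const int_literal& v) { return std::to_string(v.value); }

template <class L, class R>
std::string to_sql(const binary_expr<L, R>& e)
{
    return "(" + to_sql(e.lhs) + " " + e.op + " " + to_sql(e.rhs) + ")";
}

// Small self-check: an ordinary-looking C++ expression becomes SQL text.
inline void check_expression_templates()
{
    column_ref accesslevel = {"ACCESSLEVEL"};
    column_ref usertype = {"WINUSERFLAG"};
    assert(to_sql(accesslevel == 2) == "(ACCESSLEVEL = 2)");
    assert(to_sql(accesslevel == 2 && usertype == 0) ==
           "((ACCESSLEVEL = 2) AND (WINUSERFLAG = 0))");
}
```

Note that overloading `&&` and `||` gives up their usual short-circuit behaviour, which is harmless here because nothing is being evaluated, only recorded.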

The problem with this kind of thing is that although it results in a very neat and simple-to-use programming interface, there aren't many people who feel up to the job of maintaining such a library. If you don't like templates, you wouldn't like looking at this code. I'd guess that 30% of the characters in the source are angle brackets (only partly a joke).

The record_set type is a std::vector of another type called record, which is a little like a custom struct that is declared on-the-fly at the point of use. It's a whole little world of pain all by itself!

Just to give a flavour of the excitement involved in this kind of work, here's some of the record source:

struct none
{
    struct value_type {};
};

// Forward declaration of record
template <
        class T0 = none, class T1 = none,
        class T2 = none, class T3 = none,
        class T4 = none, class T5 = none,
        class T6 = none, class T7 = none,
        class T8 = none, class T9 = none,
        class TA = none, class TB = none,
        class TC = none, class TD = none,
        class TE = none, class TF = none
        >
struct record;

// Specialization for all fields none
template <>
struct record<
            none, none, none, none,
            none, none, none, none,
            none, none, none, none,
            none, none, none, none
            >
{
    // continues...
};

// General case: peel off the first field and inherit from the record of the rest
template <
        class T0, class T1, class T2, class T3,
        class T4, class T5, class T6, class T7,
        class T8, class T9, class TA, class TB,
        class TC, class TD, class TE, class TF
        >
struct record :
    public record<T1, T2, T3, T4, T5, T6, T7, T8,
                T9, TA, TB, TC, TD, TE, TF, none>
{
    // continues...
};

Yes folks, it's a template that derives from a specialisation of itself.
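If that sounds impossible, here is a cut-down version with four slots instead of sixteen that shows why it terminates. The `mini_record` name and the `head`, `tail_type` and `field_count` members are assumptions for this sketch, since the elided parts of the real record aren't shown here; it also assumes the used slots come first, with no gaps.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative only: a four-slot miniature, not the real sixteen-slot record.

struct none {};

template <class T0 = none, class T1 = none, class T2 = none, class T3 = none>
struct mini_record;

// Base case: every slot is none, so the recursion stops here.
template <>
struct mini_record<none, none, none, none>
{
    static const std::size_t field_count = 0;
};

// General case: store the first field, inherit the rest shifted left by one.
// Each step pads with none on the right, so the base case is always reached.
template <class T0, class T1, class T2, class T3>
struct mini_record : mini_record<T1, T2, T3, none>
{
    typedef mini_record<T1, T2, T3, none> tail_type;
    static const std::size_t field_count = tail_type::field_count + 1;

    T0 head;
};

static_assert(mini_record<>::field_count == 0, "empty record has no fields");
static_assert(mini_record<int, std::string>::field_count == 2, "two slots in use");

// Small self-check: each field lives at a different level of the hierarchy.
inline void check_mini_record()
{
    mini_record<int, std::string> r;
    r.head = 7;                          // field 0 lives in the most derived class
    mini_record<std::string>& tail = r;  // field 1 lives in the base class
    tail.head = "bob";
    assert(r.head == 7);
    assert(tail.head == "bob");
}
```

Each instantiation is a different type from its base, which is why the compiler is happy: `mini_record<int, std::string>` derives from `mini_record<std::string>`, which derives from the all-`none` specialisation.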
