Friday 18 November 2011

Obtaining the size of a C++ array using templates and other techniques (from C++11) - Part 1

Recently I was helping somebody debug an issue around the use of swprintf_s.  The issue turned out to be an Obi-Wan (off-by-one) error.  I don't tend to use the likes of printf() very much, preferring instead to use a std::stringstream if I need to format into a string.

I'd assumed that Microsoft's secure versions of these functions, i.e. those with the _s suffix, took a buffer size, so when looking at the help for swprintf_s I was momentarily taken aback by the lack of a buffer size parameter.  However, I then noticed that swprintf_s is not just a regular function but is in fact a function template:

template <size_t size>
int swprintf_s(
    wchar_t (&buffer)[size],
    const wchar_t *format [, argument] ...); // C++ only

One of the most useful properties of a function template is the compiler's ability to deduce its template arguments.  In this case the template parameter is not a type but a value of a fundamental type (size_t, itself a typedef) that specifies the size of the target string buffer (in characters, not bytes). When used as:

wchar_t buf[10];
swprintf_s(buf, L"%d", 10);

It deduces that the size of the buffer (buf) is 10.  This works because the template parameter specifies the size of the wchar_t array that swprintf_s expects.  It could have been specified explicitly, as in swprintf_s<10>(buf, L"%d", 10), but the beauty lies in the compiler being able to deduce it.  This is what function templates do and how they're often used, so there's nothing novel here except the application to finding an array size.  This is a really neat trick and I don't know why I've missed it for so long!
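As a rough, portable sketch of the same idea (this is not Microsoft's implementation), the wrapper below deduces the buffer size and hands it to the standard std::swprintf, which takes the size explicitly. FormatNumber and FormatNumberExample are hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Hypothetical helper, named only for this sketch. The compiler deduces
// `size` from the array that is passed in, so the caller never states it.
template <std::size_t size>
int FormatNumber(wchar_t (&buffer)[size], int value)
{
    // std::swprintf takes the buffer size (in characters) explicitly and
    // returns a negative value if the output, including the terminating
    // null, does not fit.
    return std::swprintf(buffer, size, L"%d", value);
}

inline void FormatNumberExample()
{
    wchar_t buf[10];

    int deduced = FormatNumber(buf, 10);       // size deduced as 10
    int spelled = FormatNumber<10>(buf, 10);   // same call, size spelled out

    assert(deduced == 2 && spelled == 2);      // "10" is two characters
}
```

Because the size travels with the array type, there is no separate number for the caller to get wrong by one.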

An important point here is that the parameter is a reference to an array (note the & before buffer) as opposed to the plain array syntax of wchar_t (buffer)[size]. If the latter were used, the function template would be unable to deduce the parameter (size). This is because the syntax:

template<size_t size> void foo(wchar_t (buffer)[size])

decays to become:

template<size_t size> void foo(wchar_t* buffer);

when compiled, i.e. size no longer appears in the parameter list, so there is nothing to deduce it from, and foo() can accept a wchar_t array of any size. In fact a plain wchar_t* is fine. There is nothing special about this; it's just the standard decay that C (and C++) has always supported.
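The decay can be checked directly by comparing function types. The sketch below uses three hypothetical functions (the names are invented for this illustration) and assumes a compiler with C++11's static_assert, decltype and <type_traits>.

```cpp
#include <type_traits>

// Hypothetical functions, named only for this illustration.
void TakesArrayByValue(wchar_t buffer[10])          { (void)buffer; }
void TakesPointer(wchar_t* buffer)                  { (void)buffer; }
void TakesArrayByReference(wchar_t (&buffer)[10])   { (void)buffer; }

// The "array" parameter is adjusted to a pointer, so these two functions
// have exactly the same type.
static_assert(std::is_same<decltype(TakesArrayByValue),
                           decltype(TakesPointer)>::value,
              "an array parameter decays to a pointer");

// The reference-to-array parameter keeps the size as part of the type.
static_assert(!std::is_same<decltype(TakesArrayByReference),
                            decltype(TakesPointer)>::value,
              "a reference to an array does not decay");

inline void DecayExample()
{
    wchar_t four[4] = {};
    wchar_t sixtyFour[64] = {};

    TakesArrayByValue(four);            // compiles: the [10] is ignored
    TakesArrayByValue(sixtyFour);       // compiles: any size will do
    TakesArrayByValue(&sixtyFour[0]);   // compiles: a plain pointer is fine
    // TakesArrayByReference(four);     // would not compile: not a wchar_t[10]
}
```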

Anyway, after that slight diversion into decay let's return to deducing the size of an array. So why is this useful? In order to iterate over each of an array's elements, e.g.

int a[] = { 0, 1, 2, 3, 4 };
for (size_t i = 0; i < sizeof(a)/sizeof(int); ++i)
    SomeFn(a[i]);

or slightly better:

int a[] = { 0, 1, 2, 3, 4 };
std::for_each(&a[0], &a[sizeof(a) / sizeof(int)], &SomeFn);

The number of elements is required in order to terminate the iteration.

The concept can be generalized to obtain the size of an array of any type, i.e.

template<typename T, size_t sizeOfArray> size_t GetNumberOfElements(T (&)[sizeOfArray])
{
  return sizeOfArray;
}

This can be used to rewrite the previous example as:

int a[] = { 0, 1, 2, 3, 4 };
std::for_each(&a[0], &a[GetNumberOfElements(a)], &SomeFn);
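For completeness, here is a self-contained sketch of that call. AddToTotal is a hypothetical stand-in for SomeFn (which the text never defines), and g_total and SumOfElements are likewise invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

template <typename T, std::size_t sizeOfArray>
std::size_t GetNumberOfElements(T (&)[sizeOfArray])
{
    return sizeOfArray;
}

// Hypothetical stand-in for SomeFn: accumulates each element it is given.
static int g_total = 0;
void AddToTotal(int value) { g_total += value; }

int SumOfElements()
{
    int a[] = { 0, 1, 2, 3, 4 };

    // The element count comes from the initializer list, via deduction.
    assert(GetNumberOfElements(a) == 5);

    g_total = 0;
    std::for_each(&a[0], &a[GetNumberOfElements(a)], &AddToTotal);
    return g_total;
}
```

Note that the element type never has to be repeated, unlike sizeof(a) / sizeof(int), which silently gives the wrong answer if the type of a is later changed.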

Returning to the discussion of decay, it should be noted that this mechanism only works for actual arrays.  The signature used to prevent decay, i.e. using the '&', means that a pointer cannot be passed, e.g.

char *pa = new char[100];
GetNumberOfElements<char, 100>(pa);

Won't compile with VC++ 2010 giving:

Error 3 error C2664: 'GetNumberOfElements' : cannot convert parameter 1 from 'char *' to 'char (&)[100]'

This makes perfect sense as it explicitly requires an array.  Even if for some reason it could accept the pointer it wouldn't be able to deduce the size, because this information isn't part of the pointer's type (though it will most likely be stored alongside the memory block that pa points to, quite possibly a few bytes further back, so that when delete [] is invoked the C++ runtime knows how much memory to free).  Looking at the MSDN help for swprintf_s() it is clear why additional definitions (the non-template overloads) are provided, as these deal with passing pointers.
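The pairing of a template overload for arrays with a non-template overload for pointers can be sketched as below. ClearBuffer is a hypothetical function invented for this illustration; it only mirrors the shape of the overload set and is not taken from the Microsoft headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical pointer overload: the caller must state the size, because
// a char* carries no length information in its type.
std::size_t ClearBuffer(char* buffer, std::size_t size)
{
    std::memset(buffer, 0, size);
    return size;   // number of bytes cleared
}

// Hypothetical array overload: deduces the size and forwards to the above.
template <std::size_t size>
std::size_t ClearBuffer(char (&buffer)[size])
{
    return ClearBuffer(buffer, size);
}

inline void ClearBufferExample()
{
    char a[100];
    std::size_t fromArray = ClearBuffer(a);          // size deduced as 100
    assert(fromArray == 100);

    char* pa = new char[100];
    std::size_t fromPointer = ClearBuffer(pa, 100);  // size supplied by hand
    assert(fromPointer == 100);
    // ClearBuffer(pa);   // would not compile: no array size to deduce
    delete [] pa;
}
```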

Now that this cool feature can be used to easily obtain array sizes, the next thing you tend to want to do is define other arrays using this information, i.e.

char a[100];
char b[GetNumberOfElements(a)];

However this won't compile: despite the fact that the compiler performs the size deduction for GetNumberOfElements() at compile time, the result of the function call is not a constant expression and so is only available at runtime.  To define an array the size must be known at compile time.
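For contrast, the plain sizeof expression used in the earlier loop is a constant expression and so can size another array. The sketch below assumes a compiler with C++11's static_assert, which is used only to check the result.

```cpp
#include <cstddef>

template <typename T, std::size_t sizeOfArray>
std::size_t GetNumberOfElements(T (&)[sizeOfArray])
{
    return sizeOfArray;
}

char a[100];

// Not valid standard C++: a call to an ordinary function is not a
// constant expression, so it cannot be used as an array bound.
// char b[GetNumberOfElements(a)];

// Fine: sizeof is evaluated entirely at compile time.
char b[sizeof(a) / sizeof(a[0])];

static_assert(sizeof(b) == sizeof(a), "b has as many elements as a");
```

The price is that sizeof(p) / sizeof(p[0]) also compiles without complaint when p is a pointer, which is exactly the mistake the reference-to-array signature rules out.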

There is a clever hack to make this available at compile time but it requires the use of a macro, which is unpleasant.  However, at this point it's C++11 to the rescue, but that'll have to wait until part 2, which is available here.