🝰

Abusing the C preprocessor for better variadic function arguments.

I am not a fan of C’s stance on variadic function arguments. They are unsafe and their usage is fairly unorthodox.

I, however, will not claim to solve all of the issues with it — I will provide, at best, some techniques I have used that I think will make variadic functions more usable.

As such, be wary that there is no silver bullet, and each proposition has some drawbacks.

Well then, without further ado, let us first quicky review how variadic arguments work:

Standard variadic arguments

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
#include <stdio.h>
#include <stdarg.h>

void func(int n, ...) {
    va_list args;
    va_start(args, n);

    for (int i = 0; i < n; ++i)
        printf("%s\n", va_arg(args, char *));

    va_end(args);
}

int main(void) {
    func(3, "1", "2", "3");
    return 0;
}

Here, func is a variadic function that prints its parameters. What this simple program does is printing on the standard output the strings “1”, “2” and “3” on separate lines.

The inner workings are straight forward : since variadic parameters can only be last in a parameter list, they are pushed first on the stack when the function is called.

However, there is no way, as-is, to be able to access each variadic parameter without a point of reference, a marker to tell where to look, so we need an additional parameter, a sentinel.

This is what va_start just does, it initializes the variadic parameter list in a va_list, with n as a point of reference. Each parameter can then be pulled in sequence with va_arg, and when we are done, we just call va_end. Simple indeed, but incredibly unsafe.

A sword of Damocles

The first thing that you might have noticed is that the sentinel n indicates the number of variadic parameters: indeed, there is no way as-is to know how many of them there are, so we need a hint. Some functions like printf are blessed with parameters that convey both a meaning and the number of expected variadic arguments (i.e. the format string), but in the general case, that is not always possible.

So, would you guess what calling func(4, "1", "2", "3") does ? That’s right, Undefined Behavior. At best, your process will read whatever has been pushed before “3”, try do dereference it, and will crash, but worse, it could continue to run as if nothing happened – no need to tell how bad this is.

Let’s take another example:

1
2
3
4
5
6
7
8
9
10
void func2(int n, ...) {
    va_list args;
    va_start(args, n);

    if (n > 0) printf("%d\n", va_arg(args, int));
    if (n > 1) printf("%f\n", va_arg(args, double));
    if (n > 2) printf("%s\n", va_arg(args, char *));

    va_end(args);
}

What would happen if I called func2(3, "1", 2, 3) ? Again, Undefined Behavior. va_arg is not typesafe, meaning that you have to care about the order and size of your parameters (this can cause horrible results if, for example, you pass 2 instead of 2.; subtle, but deadly).

Overall, I would say that the worst part is that your compiler will never complain about all that, the only special case being the printf function family, where the parameters are type checked against the format string (and this only happens because said functions have the format function annotation).

Improving the usability with sentinels

Let us ignore for the moment all the type safety concerns, and focus on the first horrible problem variadic parameters have : the length hint.

We are not, in 99% of the time, designing functions à la printf, where one of the mandatory parameters gives us the length hint, so usually we end up with functions like:

void func(param0, param1, int n, ...) {
    // function body
}

If we happen to have all our variadic parameters of the same type, we can use a null sentinel at the end of the parameter list:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
__attribute__ ((sentinel))
void func(param0, param1, ...) {
    va_list args;
    va_start(args, param1);

    for (char *s; s = va_arg(args, char *);)
        printf("%s\n", s);

    va_end(args);
}

int main(void) {
    func(param0, param1, "1", "2", "3", NULL);
    return 0;
}

The attribute here is of course GCC/Clang only, but can help to enforce the length safety by generating a compiler warning when no NULL is provided at the end of the parameter list.

This is better, but it only works when the parameters are of the same integer type and we still have to remember to insert NULL, plus we programmers hate writing redundant things like this. It is time to ask some help from the C preprocessor.

Using the preprocessor to generate the length hints

With sentinels

Wrapping the function to insert a trailing NULL is pretty straight forward:

// ANSI C
#define func(Param0, Param1, ...) func(Param0, Param1, __VA_ARGS__, NULL)

// GNU C
#define func(Param0, Param1, args...) func(Param0, Param1, ## args, NULL)

The only difference between the ANSI and GNU version is that func(a, b) will expand to func(a, b, NULL) in GNU C instead of func(a, b, , NULL) in ANSI C (making one of the variadic parameters mandatory).

With the length directly

Generating a similar macro when the length hint is explicit is a bit trickier. We first have to define a macro to count the variadic parameters passed to it:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#if __STRICT_ANSI__
# define ARG_LENGTH(...) __builtin_choose_expr(sizeof (#__VA_ARGS__) == 1,  \
        0,                                                                  \
        ARG_LENGTH__(__VA_ARGS__))
#else /* !__STRICT_ANSI__ */
# define ARG_LENGTH(...) ARG_LENGTH__(__VA_ARGS__)
#endif /* !__STRICT_ANSI__ */

# define ARG_LENGTH__(...) ARG_LENGTH_(,##__VA_ARGS__,                         \
    63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45,\
    44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26,\
    25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6,\
    5, 4, 3, 2, 1, 0)
# define ARG_LENGTH_(_, _63, _62, _61, _60, _59, _58, _57, _56, _55, _54, _53, \
    _52, _51, _50, _49, _48, _47, _46, _45, _44, _43, _42, _41, _40, _39, _38, \
    _37, _36, _35, _34, _33, _32, _31, _30, _29, _28, _27, _26, _25, _24, _23, \
    _22, _21, _20, _19, _18, _17, _16, _15, _14, _13, _12, _11, _10, _9, _8,   \
    _7, _6, _5, _4, _3, _2, _1, Count, ...) Count

(Snippet taken from libcsptr)

Here the snippet is GNU C only, but can be adapted to work with other compiler – the only difference being that ARG_LENGTH will have to take at least one parameter.

The #if __STRICT_ANSI__ block is here to have consistent results with different gcc/clang compiler options when ARG_LENGTH() is expanded: without this, the macro expands to 1 instead of 0 when compiled with -ansi, -std=c99, or -std=c89.

The macro itself relies on the fact that the expanded varargs will shift all the trailing numbers to the right in ARG_LENGTH_, making Count expand to the number of times the shift happens.

With this macro defined, it becomes trivial to wrap the variadic function:

#define func(Param0, Param1, args...) \
    func(Param0, Param1, ARG_LENGTH(args), ## args)

Tackling the type safety problem

This is a technique I discovered while experimenting with the interfaces of libcsptr and criterion: it all goes down to the fact that designated initializers and compound literals both use commas as value separator, just like parameter lists – the one major difference being that types in those are checked.

Putting the concept into practice:

1
2
3
4
5
6
7
8
9
10
11
12
13
struct func_params {
    int sentinel_;
    int a;
    double b;
    char *c;
};

void func(struct func_params *args) {
    printf("%d, %f, %s\n", args->a, args->b, args->c);
}

#define func(...) \
    func(&(struct func_params) { .sentinel_ = 0, __VA_ARGS__ })

sentinel_ is here to ensure that func called without parameters still produce a valid compound literal.

Edit: professorlava fairly pointed out that for those that use GNU extensions you can, in fact, ignore the sentinel. It will still produce a warning if -Wmissing-field-initializer is provided, so be sure to balance that out (is 4 bytes worth all this?).

What this brings on the table is a type safe interface for optional parameters (that can be extended to variadic parameters when using arrays instead of structs), that is 100% standard C99 (to make this work with C89, you would have to replace the compound literal with a local variable and a designated initializer).

Hence, this compiles:

func();
func(1);
func(1, 2.0);
func(1, 2.0, "3");

… and this does not:

func(1, "3"); // error: a pointer is not a double
func(1, 2.0, 3); // error: an integer is not a pointer

The neat side-effect that this technique spawns is that func has a python-like interface for keyword arguments:

// All of these expressions are equivalent
func(1, .b = 2.0, .c = "3");
func(1, .c = "3", .b = 2.0);
func(.a = 1, .c = "3", .b = 2.0);

Of course, one must also consider that this technique provides default values to unspecified fields: 0 (zero). This effect is sometimes desirable, but on other cases it is not.


TL;DR

Variadic functions are unsafe and weird to use, the following techniques try to provide a better interface:

Technique Usage Pros Cons
Sentinel f(param, "1", "2", NULL) Sentinel presence can be enforced Trailing NULL (easily forgettable)
    No length parameter Unique type parameters
      Not type safe
Sentinel + Macro f(param, "1", "2") Benefits of sentinel without trailing NULL Unique type parameters
      Not type safe
Length parameter + Macro f(1, 2.0, "3") Inferred length parameter Not type safe
    Parameters of different types  
Struct + Macro f(1, 2.0, "3") Type safe optional parameters Default values are always 0
  f(1, .c = "3", .b = 2.0) Python-like keyword arguments  
       

Conclusion

These techniques provide some sugar on top of the C language to provide safer interfaces to variadic parameters – I hope you find inspiration in those techniques.