Abusing the C preprocessor for better variadic function arguments.
I am not a fan of C’s stance on variadic function arguments. They are unsafe and their usage is fairly unorthodox.
I will not, however, claim to solve all of their issues. At best, I will share some techniques I have used that I think make variadic functions more usable.
As such, be wary that there is no silver bullet, and each proposition has some drawbacks.
Well then, without further ado, let us first quickly review how variadic arguments work:
Standard variadic arguments
#include <stdio.h>
#include <stdarg.h>

void func(int n, ...) {
    va_list args;
    va_start(args, n);
    for (int i = 0; i < n; ++i)
        printf("%s\n", va_arg(args, char *));
    va_end(args);
}

int main(void) {
    func(3, "1", "2", "3");
    return 0;
}
Here, func is a variadic function that prints its parameters.
This simple program prints the strings “1”, “2” and “3” to the standard output, on separate lines.
The inner workings are straightforward: since variadic parameters can only come last in a parameter list, a classic stack-based calling convention pushes them first when the function is called.
However, there is no way, as-is, to reach each variadic parameter without a point of reference, a marker telling us where to look, so we need an additional named parameter, a sentinel.
This is exactly what va_start does: it initializes the variadic parameter list in a va_list, with n as a point of reference. Each parameter can then be pulled in sequence with va_arg, and when we are done, we just call va_end. Simple indeed, but incredibly unsafe.
A sword of Damocles
The first thing that you might have noticed is that the sentinel n indicates the number of variadic parameters: indeed, there is no way as-is to know how many of them there are, so we need a hint. Some functions like printf are blessed with parameters that convey both a meaning and the number of expected variadic arguments (i.e. the format string), but in the general case, that is not always possible.
So, would you guess what calling func(4, "1", "2", "3") does? That’s right, Undefined Behavior. At best, your process will read whatever has been pushed before “3”, try to dereference it, and crash; at worst, it could continue to run as if nothing happened. No need to tell how bad this is.
Let’s take another example:
void func2(int n, ...) {
    va_list args;
    va_start(args, n);
    if (n > 0) printf("%d\n", va_arg(args, int));
    if (n > 1) printf("%f\n", va_arg(args, double));
    if (n > 2) printf("%s\n", va_arg(args, char *));
    va_end(args);
}
What would happen if I called func2(3, "1", 2, 3)? Again, Undefined Behavior. va_arg is not type safe, meaning that you have to care about the order and size of your parameters (this can cause horrible results if, for example, you pass 2 instead of 2.0; subtle, but deadly).
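To make the "order and size" point concrete, here is a minimal sketch of the same va_arg sequence as func2. The name describe and its buffer parameters are assumptions made for illustration, so that the result can be inspected instead of printed. It also shows the default argument promotions that apply to variadic arguments: a char or short arrives as an int, and a float arrives as a double.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variant of func2: same va_arg sequence, but the result is
 * written to a caller-supplied buffer instead of the standard output. */
int describe(char *out, size_t size, int n, ...) {
    int a = 0;
    double b = 0.0;
    char *c = "";
    va_list args;

    assert(out != NULL && size > 0);
    va_start(args, n);
    if (n > 0) a = va_arg(args, int);     /* char and short arrive as int */
    if (n > 1) b = va_arg(args, double);  /* float arrives as double */
    if (n > 2) c = va_arg(args, char *);
    va_end(args);
    /* Returns the number of characters the full text needs. */
    return snprintf(out, size, "%d %.1f %s", a, b, c);
}
```

Calling describe(buf, sizeof buf, 3, 1, 2.0, "3") is fine, and so is passing 2.5f as the second variadic argument, since it is promoted to double. Passing the integer 2 there is not: va_arg(args, double) would then read an int as a double, which is exactly the Undefined Behavior described above.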
Overall, I would say that the worst part is that your compiler will never complain about any of this, the only special case being the printf function family, where the parameters are type checked against the format string (and this only happens because said functions carry the format function attribute).
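The same attribute can be put on our own printf-like functions. Below is a minimal sketch, assuming GCC or Clang; the PRINTF_LIKE macro and the format_msg function are hypothetical names introduced for this example. The two numbers are the 1-based positions of the format string and of the first variadic argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
/* GCC/Clang only: ask the compiler to check arguments against the format. */
# define PRINTF_LIKE(fmt_idx, first_arg_idx) \
    __attribute__((format(printf, fmt_idx, first_arg_idx)))
#else
# define PRINTF_LIKE(fmt_idx, first_arg_idx)
#endif

/* Hypothetical helper: formats into `out`, like snprintf. */
PRINTF_LIKE(3, 4)
int format_msg(char *out, size_t size, const char *fmt, ...) {
    va_list args;
    int len;

    assert(out != NULL && fmt != NULL);
    va_start(args, fmt);
    len = vsnprintf(out, size, fmt, args);
    va_end(args);
    return len;
}

/* With the attribute in place, a call such as
 *     format_msg(buf, sizeof buf, "%d", "oops");
 * gets a -Wformat warning instead of silently misbehaving. */
```

This only helps for functions that follow the printf conventions; it does nothing for an arbitrary variadic function.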
Improving the usability with sentinels
Let us ignore for the moment all the type safety concerns, and focus on the first horrible problem variadic parameters have: the length hint.
We are not, 99% of the time, designing functions à la printf, where one of the mandatory parameters gives us the length hint, so usually we end up with functions that take the count as an explicit extra parameter.
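A minimal sketch of such a function, under the assumption that every variadic argument is a string; total_length is a hypothetical name picked for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example: sums the lengths of `count` strings.
 * The caller has to pass `count` by hand and keep it in sync. */
size_t total_length(size_t count, ...) {
    va_list args;
    size_t total = 0;

    va_start(args, count);
    for (size_t i = 0; i < count; ++i) {
        char *s = va_arg(args, char *);
        assert(s != NULL);
        total += strlen(s);
    }
    va_end(args);
    return total;
}
```

A call looks like total_length(3, "a", "bb", "ccc"), and nothing stops the 3 from drifting out of sync with the list that follows it.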
If we happen to have all our variadic parameters of the same type, we can use a null sentinel at the end of the parameter list:
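#include <stdarg.h>
#include <stdio.h>

/* param0 and param1 stand for any mandatory parameters; int is a placeholder. */
__attribute__ ((sentinel))
void func(int param0, int param1, ...) {
    va_list args;
    va_start(args, param1);
    /* Stop at the NULL sentinel that terminates the list. */
    for (char *s; (s = va_arg(args, char *)) != NULL;)
        printf("%s\n", s);
    va_end(args);
}

int main(void) {
    func(0, 0, "1", "2", "3", NULL);
    return 0;
}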
The attribute here is of course GCC/Clang only, but can help to enforce the length safety by generating a compiler warning when no NULL is provided at the end of the parameter list.
This is better, but it only works when all the parameters share one type that has a spare value to act as a terminator (a pointer type, if we want NULL and the attribute), and we still have to remember to insert NULL. Plus, we programmers hate writing redundant things like this. It is time to ask the C preprocessor for some help.
Using the preprocessor to generate the length hints
With sentinels
Wrapping the function in a macro that inserts the trailing NULL is pretty straightforward, in either an ANSI C or a GNU C flavour.
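As a minimal sketch of both flavours (count_strings and its label parameter are hypothetical, chosen so that the result can be checked; the macro shadows the function of the same name, which is fine because a macro is never expanded recursively):

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical example: counts the strings that come before the NULL sentinel. */
#if defined(__GNUC__)
__attribute__((sentinel))
#endif
size_t count_strings(const char *label, ...) {
    va_list args;
    size_t n = 0;

    assert(label != NULL);
    va_start(args, label);
    while (va_arg(args, char *) != NULL)
        ++n;
    va_end(args);
    return n;
}

#if defined(__GNUC__) && !defined(__STRICT_ANSI__)
/* GNU C: `##` drops the preceding comma when no variadic argument is given. */
# define count_strings(label, ...) \
    count_strings(label, ##__VA_ARGS__, (char *) NULL)
#else
/* ISO C99: at least one variadic argument must be given. */
# define count_strings(label, ...) \
    count_strings(label, __VA_ARGS__, (char *) NULL)
#endif
```

The cast on NULL is a precaution: NULL may be defined as a plain 0, which would be passed as an int rather than as a pointer in a variadic call.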
The only difference between the ANSI and GNU versions is that func(a, b) will expand to func(a, b, NULL) in GNU C instead of func(a, b, , NULL) in ANSI C (making one of the variadic parameters mandatory).
With the length directly
Generating a similar macro when the length hint is explicit is a bit trickier. We first have to define a macro to count the variadic parameters passed to it:
#if __STRICT_ANSI__
# define ARG_LENGTH(...) __builtin_choose_expr(sizeof (#__VA_ARGS__) == 1, \
        0, \
        ARG_LENGTH__(__VA_ARGS__))
#else /* !__STRICT_ANSI__ */
# define ARG_LENGTH(...) ARG_LENGTH__(__VA_ARGS__)
#endif /* !__STRICT_ANSI__ */

# define ARG_LENGTH__(...) ARG_LENGTH_(,##__VA_ARGS__, \
    63, 62, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45,\
    44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26,\
    25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6,\
    5, 4, 3, 2, 1, 0)

# define ARG_LENGTH_(_, _63, _62, _61, _60, _59, _58, _57, _56, _55, _54, _53, \
    _52, _51, _50, _49, _48, _47, _46, _45, _44, _43, _42, _41, _40, _39, _38, \
    _37, _36, _35, _34, _33, _32, _31, _30, _29, _28, _27, _26, _25, _24, _23, \
    _22, _21, _20, _19, _18, _17, _16, _15, _14, _13, _12, _11, _10, _9, _8, \
    _7, _6, _5, _4, _3, _2, _1, Count, ...) Count
(Snippet taken from libcsptr)
Here the snippet is GNU C only, but can be adapted to work with other compilers, the only difference being that ARG_LENGTH will have to take at least one parameter.
The #if __STRICT_ANSI__ block is here to give consistent results across gcc/clang compiler options when ARG_LENGTH() is expanded: without it, the macro expands to 1 instead of 0 when compiled with -ansi, -std=c99, or -std=c89.
The macro itself relies on the fact that the expanded varargs shift all the trailing numbers to the right in ARG_LENGTH_, making Count expand to the number of times the shift happens. For instance, ARG_LENGTH(a, b) pushes the list two slots to the right, so Count lands on 2.
With this macro defined, it becomes trivial to wrap the variadic function.
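A minimal sketch of such a wrapper. To keep it short and portable it does not use ARG_LENGTH itself but COUNT_ARGS, a cut-down stand-in written for this example that handles 1 to 8 arguments in plain C99. The wrapped function total_length is hypothetical as well.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for ARG_LENGTH: counts 1 to 8 arguments.
 * COUNT_ARGS(a, b, c) expands to COUNT_ARGS_(a, b, c, 8, 7, 6, 5, 4, 3, 2, 1, 0):
 * the three arguments shift the numbers right, so Count lands on 3. */
#define COUNT_ARGS(...) COUNT_ARGS_(__VA_ARGS__, 8, 7, 6, 5, 4, 3, 2, 1, 0)
#define COUNT_ARGS_(_1, _2, _3, _4, _5, _6, _7, _8, Count, ...) Count

/* Hypothetical example: sums the lengths of `count` strings. */
size_t total_length(size_t count, ...) {
    va_list args;
    size_t total = 0;

    va_start(args, count);
    for (size_t i = 0; i < count; ++i) {
        char *s = va_arg(args, char *);
        assert(s != NULL);
        total += strlen(s);
    }
    va_end(args);
    return total;
}

/* The wrapper computes the count, so callers only pass the strings. */
#define total_length(...) \
    total_length(COUNT_ARGS(__VA_ARGS__), __VA_ARGS__)
```

Since only the count is inferred, the variadic arguments are free to have different types, but nothing checks them: the function still has to know which type to pull at each position.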
Tackling the type safety problem
This is a technique I discovered while experimenting with the interfaces of libcsptr and criterion: it all comes down to the fact that designated initializers and compound literals both use commas as value separators, just like parameter lists, the one major difference being that types in those are checked.
Putting the concept into practice:
struct func_params {
    int sentinel_;
    int a;
    double b;
    char *c;
};

void func(struct func_params *args) {
    printf("%d, %f, %s\n", args->a, args->b, args->c);
}

#define func(...) \
    func(&(struct func_params) { .sentinel_ = 0, __VA_ARGS__ })
sentinel_ is here to ensure that func called without parameters still produces a valid compound literal.
Edit: professorlava rightly pointed out that for those who use GNU extensions you can, in fact, drop the sentinel. It will still produce a warning if -Wmissing-field-initializers is provided, so be sure to balance that out (is 4 bytes worth all this?).
What this brings to the table is a type safe interface for optional parameters (that can be extended to variadic parameters when using arrays instead of structs), that is 100% standard C99. Both compound literals and designated initializers are C99 additions, so strict C89 would need a local variable filled in field by field instead; GNU C accepts both as extensions in C89 mode.
Hence, a call whose arguments match the fields compiles, while one whose arguments do not match is flagged by the compiler.
A neat side effect of this technique is that func gets a Python-like interface for keyword arguments.
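A minimal sketch covering these three cases. One assumption is made for illustration: func returns its text in a static buffer instead of printing it, so that the result can be checked.

```c
#include <assert.h>
#include <stdio.h>

struct func_params {
    int sentinel_;
    int a;
    double b;
    char *c;
};

/* Returns a pointer to a static buffer (an assumption of this sketch). */
const char *func(struct func_params *args) {
    static char buf[64];

    assert(args != NULL);
    snprintf(buf, sizeof buf, "%d, %.1f, %s",
             args->a, args->b, args->c ? args->c : "(null)");
    return buf;
}

#define func(...) \
    func(&(struct func_params) { .sentinel_ = 0, __VA_ARGS__ })

/* Accepted, positional:
 *     func(1, 2.0, "3");            a = 1, b = 2.0, c = "3"
 *     func(1);                      b and c are zeroed
 *     func();                       every field is zeroed
 *
 * Accepted, keyword style, in any order:
 *     func(1, .c = "3", .b = 2.0);
 *     func(.b = 4.5);
 *
 * Flagged by the compiler:
 *     func("1", 2, 3);              "1" is not an int, 3 is not a pointer
 */
```

How loudly the last call is flagged depends on the compiler: mixing up integers and pointers has long been only a warning, and recent compilers turn it into an error. Conversions between arithmetic types, such as passing 2 for the double field, remain silent.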
Of course, one must also consider that this technique gives unspecified fields a default value: 0 (zero). This effect is sometimes desirable, but in other cases it is not.
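As for extending the idea to truly variadic parameters with arrays instead of structs, here is a minimal sketch, assuming all the values share one type. The names sum_ints and sum_ints_impl are hypothetical. The array compound literal type checks each element, and sizeof gives the count without evaluating the arguments a second time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sums `count` integers stored in `values`. */
int sum_ints_impl(const int *values, size_t count) {
    int total = 0;

    assert(values != NULL);
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

/* Packs the arguments into an int array and derives its length.
 * At least one argument is required: an empty initializer list
 * is not valid in C99. */
#define sum_ints(...) \
    sum_ints_impl((const int[]) { __VA_ARGS__ }, \
                  sizeof ((const int[]) { __VA_ARGS__ }) / sizeof (int))
```

With this, sum_ints(1, 2, 3) needs neither a count nor a sentinel, and sum_ints("1") is flagged by the compiler, at the cost of restricting every argument to the same type.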
TL;DR
Variadic functions are unsafe and weird to use; the following techniques try to provide a better interface:
Technique | Usage | Pros | Cons |
---|---|---|---|
Sentinel | f(param, "1", "2", NULL) | Sentinel presence can be enforced; no length parameter | Trailing NULL (easily forgettable); unique type parameters; not type safe |
Sentinel + Macro | f(param, "1", "2") | Benefits of sentinel without trailing NULL | Unique type parameters; not type safe |
Length parameter + Macro | f(1, 2.0, "3") | Inferred length parameter; parameters of different types | Not type safe |
Struct + Macro | f(1, 2.0, "3") or f(1, .c = "3", .b = 2.0) | Type safe optional parameters; Python-like keyword arguments | Default values are always 0 |
Conclusion
These techniques add some sugar on top of the C language to give variadic parameters safer interfaces. I hope you find inspiration in them.