Featuring a few C features
Maybe you are a seasoned C programmer, maybe you are not. Either way, there may be a few features of modern C (and sometimes of good old C, too) that you are not aware of, or that you knew but have forgotten because they are just curiosities.
Let's see some of them, as a July reminder.
What about notation to get an element of an array?
You know this:
char e_of_hello = "hello"[1];
But maybe you don't know that you can get the very same result by the following:
char e_of_hello = 1["hello"];
Exactly. You can think of a[b] as a shorthand notation for *(a+b). Then, you must remember that a + b = b + a
… Wait a moment! — you could say — one is a string and the other is a number! You can't sum apples and oranges!
Indeed, C has no string type! C has arrays of characters, and "abc" is a special notation to write an array made of the three characters a, b, and c, followed by a terminating null character (so four elements in all).
But still — you may argue — it's an array, which isn't an integer like 1!
Indeed, in almost every expression an array like "abc" is converted (it “decays”) to a pointer to its first element (a char *). A pointer is nothing but a number, or at least C handles it as if it were a number (from the CPU's point of view, you can always say that an address is an integer). Since C is such a “low level” language when it comes to pointers, it allows pointer arithmetic: you can add an integer to a pointer, or subtract one from it, obtaining another pointer to a different memory location (the integer counts elements, not bytes). This can be a little bit dangerous, and it breaks strong typing a bit… but C isn't such a strongly typed programming language, if you think about it.
You can write this:
printf("%c\n", *("hello" + 1));
which is exactly like
printf("%c\n", "hello"[1]);
Enough of this: you've got the idea. But please, never write things like 1["hello"] in your code.
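As a quick check of the equivalence, here is a minimal sketch (the function name same_element is made up for this example) that reads the same element in all four spellings and compares the results. It assumes the caller keeps the index inside the string.

```c
#include <stddef.h>

/* Returns 1 when the four spellings of "element i of s" agree, 0 otherwise.
   The caller must keep i within the bounds of the string. */
int same_element(const char *s, size_t i)
{
    char a = s[i];       /* the usual subscript            */
    char b = i[s];       /* operands swapped               */
    char c = *(s + i);   /* explicit pointer arithmetic    */
    char d = *(i + s);   /* addition is commutative        */

    return a == b && b == c && c == d;
}
```

Calling same_element("hello", 1) compares four reads of the same 'e'.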
Designators
Modern C (C99 and later) has designators. That is, you can initialize an array by “designating” its elements. Like this:
int a[10] = {
    [5] = 1,
    [9] = 1
};
The undesignated elements are initialized with 0, while a[5] and a[9] get 1.
Do not forget also that
int a[10];
declares an array of ten integers (indexed from 0 to 9), but it doesn't initialize it. You can do it by writing:
int a[10] = {};
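You need to know that this works “always”, with structs too, and that all “undesignated” elements are initialized to “0” (which doesn't need to be an all-bits-zero pattern for every type, but this is another story). One caveat: the empty braces {} are standard only since C23 (GCC and Clang accept them earlier as an extension); in C99 and C11 the portable spelling is {0}.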
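To see the implicit zeroes at work, here is a small sketch with two made-up helper functions. It uses {0} rather than {} for the fully zeroed array, on the assumption that the code may have to compile as C99 or C11.

```c
/* Sums an array whose elements 5 and 9 are designated; every other
   element is implicitly initialised to 0. */
int designated_sum(void)
{
    int a[10] = { [5] = 1, [9] = 1 };
    int sum = 0;

    for (int i = 0; i < 10; i++)
        sum += a[i];
    return sum;
}

/* Returns 1 if an array initialised with {0} contains only zeroes. */
int all_zero(void)
{
    int z[10] = {0};   /* first element explicitly 0, the rest implicitly */

    for (int i = 0; i < 10; i++)
        if (z[i] != 0)
            return 0;
    return 1;
}
```

Here designated_sum() adds up the two designated ones and eight implicit zeroes.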
Structs also have designators:
struct {
    const char *n;
    int l;
} s = {
    .l = 5
};
printf("%s : %d\n", s.n ? s.n : "NULL", s.l);
This will print NULL : 5. We initialized l with a designator, but we left n out. Hence, it was initialized with the “0” value for a pointer, that is, a null pointer (which is an actual binary 0 on many machines).
Struct copy
We tend to forget that structs are treated like values. Let us suppose we had:
struct h {
    const char *n;
    int l;
} s = {
    .l = 5
};
as before. We can overwrite this s with s1 like this:
struct h s1 = { .n = "hello", .l = sizeof "hello" };
s = s1;
Of course, you must be aware that the const char * member, being a pointer, is copied as a pointer: after the assignment both structs point to the same memory. That memory was created by a string literal, hence the pointer should be const (on many machines, trying to write into the "hello" memory will bring serious trouble: your program would crash).
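The sharing is easy to verify. The following sketch (the function name is invented for the example) performs the assignment and then compares the two pointers.

```c
struct h {
    const char *n;
    int l;
};

/* Returns 1 if, after the assignment, both structs point at the very
   same string and the length member was copied too. */
int copy_shares_pointer(void)
{
    struct h s  = { .l = 5 };
    struct h s1 = { .n = "hello", .l = (int)sizeof "hello" };

    s = s1;   /* member-wise copy: the pointer is copied, not the string */

    return s.n == s1.n && s.l == s1.l;
}
```

Note that sizeof "hello" counts the terminating null character, so l ends up being 6, not 5.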
If you want memory which can be modified, you need something more like this:
struct { char s[100]; int l; } s2 = { .s = "hello" };
Do not forget, anyway, that we are initializing things: a char array can only be initialized “directly”, so to speak, from a string literal or from a brace-enclosed list, not from another array. That is, the following is an error:
const char known[] = "hello";
struct { char s[100]; int l; } s2 = { .s = known };
Unfortunately, even if known is an array (whose size is deduced), once you write known in an expression it is “degraded” (it decays) to a pointer, and it loses its “array nature”; without this, there's no clue of how many array elements are to be copied. Briefly, it can't be done.
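It can't be done in the initializer, that is. One possible workaround, sketched below under the assumption that the source array fits in the destination, is to initialize the rest of the struct and then copy the bytes with memcpy. The tag name buf and the function make_buf are made up for this example.

```c
#include <string.h>

struct buf {
    char s[100];
    int l;
};

static const char known[] = "hello";

/* Builds a struct whose array member holds a private copy of known. */
struct buf make_buf(void)
{
    /* .s is not mentioned, so it starts out filled with zeroes. */
    struct buf b = { .l = (int)(sizeof known - 1) };

    /* As the operand of sizeof, known does not decay, so its full size
       (including the terminating null character) is still available. */
    memcpy(b.s, known, sizeof known);
    return b;
}
```

Since the array lives inside the struct, assigning the returned value to another struct buf copies the characters themselves, and each copy can be modified independently.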
A common and useful idiom is to keep a “null” struct around to initialize other structs; e.g.,
struct whatever z0 = {};
struct whatever running;
// ...
// maybe inside a loop
running = z0; // clean the struct
running.v = 5;
if (something) {
    running.x = other;
}
do_something(running);
First, we ensure that running, which is a struct whatever used again and again, is in an initial “state”, that is, it contains certain default values (in this example, it is “zero”-ed); then we assign only the members that differ, according to the program logic.
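Here is a self-contained sketch of the idiom. The members of struct whatever, the body of do_something and the odd/even condition are all assumptions made up to have something that runs; {0} is used instead of {} so that it also compiles before C23.

```c
struct whatever {
    int v;
    int x;
};

static const struct whatever z0 = {0};   /* the "null" struct */

/* Hypothetical consumer: just combines the two members. */
static int do_something(struct whatever w)
{
    return w.v + w.x;
}

/* Runs the reset-then-fill idiom for the given number of rounds and
   returns the sum of what do_something saw. */
int run(int rounds)
{
    struct whatever running;
    int total = 0;

    for (int i = 0; i < rounds; i++) {
        running = z0;          /* clean the struct */
        running.v = 5;
        if (i % 2 == 1)        /* stand-in for "something" */
            running.x = i;
        total += do_something(running);
    }
    return total;
}
```

With three rounds, do_something sees 5, 6 and 5. Without the reset, the third round would still carry the x set in the second one.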
Overloading!
Overloading is often considered an “OO” feature, but that is wrong. Non-OO languages may have function overloading. Modern C (since C11) has it too, à la Fortran, using the _Generic keyword, usually wrapped in a macro.
À la Fortran means this: in Fortran, you specify a single symbol name (the overloaded function) and then which variant must be used for a certain type of argument(s). Users will use the single symbol name, which is just a front end, and behind the curtains the function actually called will be another one, according to the type of its arguments.
By the way, this approach avoids part of the name mangling chaos: you decide the names of the specific functions. No surprises, no linking problems ahead, at least none related to function overloads. To understand the problem better, see e.g. the Wikipedia article on name mangling.
In modern C you can overload a symbol name using type-generic functions. This is actually used by some math functions, for instance sin, once you include <tgmath.h>: users don't need to bother about using the right sin, because the compiler will figure it out according to the type of the argument.
If you use sin(doubleVar), the function sin will be called. If you use sin(floatVar), the function sinf will be called; if you use sin(longDoubleVar), it will be sinl.
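One way to observe the selection without even calling the functions is to look at the size of the result. This is only a sketch with invented function names; it relies on <tgmath.h> and on the fact that sizeof does not evaluate its operand, so nothing from the math library is actually called.

```c
#include <stddef.h>
#include <tgmath.h>

/* Each function reports the size of what sin(x) would return for its
   argument type; the parameter is only there to carry the type. */
size_t sin_size_float(float x)
{
    return sizeof(sin(x));    /* resolves to sinf, which returns float */
}

size_t sin_size_double(double x)
{
    return sizeof(sin(x));    /* resolves to sin, which returns double */
}

size_t sin_size_long_double(long double x)
{
    return sizeof(sin(x));    /* resolves to sinl, which returns long double */
}
```

Each result matches the size of the argument's own type, which shows that a different function was picked in each case.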
This magic is achieved with _Generic, which looks like this:
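#include <stdio.h>

/* Pick the function matching the type of X, then call it with X. */
#define func(X) _Generic((X), int: funci, \
                              long: funci, \
                              double: funcd, \
                              float: funcf, \
                              default: funcv)(X)

/* Old-style declaration: the parameters are left unspecified. */
void funcv() {
    puts("i don't know...");
}

void funci(long a) {
    puts("long/int version");
}

void funcd(double a) {
    puts("double version");
}

void funcf(float a) {
    puts("float version");
}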
And we can try with this fragment:
func((int)5);
func((long)6);
func((float)1.1);
func((double)5.5);
func("hello");
which outputs:
long/int version
long/int version
float version
double version
i don't know...
Notice the void funcv() function. This is a rather controversial feature of C from ancient times: up to C17, this is not a function that takes no arguments, but a function whose arguments are unspecified, so it can be called with anything. A function that takes no arguments must be written as void funcv(void) (see the void in the ()?). But in our case we can't do that, because we need a function which is able to digest a parameter (of any type). Beware that C23 changed this: there, void funcv() means the same as void funcv(void), so this trick no longer compiles.
As you can imagine, you can play a lot of tricks with _Generic, because it is just an expression that the compiler resolves at compile time, and it can be nested inside macros and other expressions.
#define gunc(X, Y) _Generic((X), int: gunci, \
                                 double: guncd)((X), \
                            (_Generic((Y), int: 1, \
                                           default: -1)))

void gunci(int a, int b) {
    printf("%d (%d)\n", a + b, b);
}

void guncd(double a, int b) {
    printf("%lf (%d)\n", a*(double)(b), b);
}
Now gunc looks like a very odd two-argument function. The actual functions gunci and guncd are selected according to the first argument; then, these functions receive a second argument which is 1 if the type of Y is int, and -1 otherwise. The example code of gunci and guncd is meaningless, as is the choice of behaviour on the second argument… But you can see that you can play with types to select not just functions, but values. _Generic behaves like a macro which selects an expression according to the type of its controlling argument.
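A classic use of selecting values rather than functions is turning a type into its name. The macro type_name and the function show_types below are made up for this sketch, and the list of types is deliberately short.

```c
#include <stdio.h>

/* Maps the type of X to a string; only the matching branch is kept. */
#define type_name(X) _Generic((X), \
    int:     "int",                \
    double:  "double",             \
    default: "something else")

void show_types(void)
{
    printf("%s\n", type_name(42));     /* int */
    printf("%s\n", type_name(4.2));    /* double */
    printf("%s\n", type_name(4.2f));   /* float falls into default */
    printf("%s\n", type_name('a'));    /* in C a character constant is an int */
}
```

The last line is a curiosity of its own: in C, unlike C++, 'a' has type int, so it selects the int branch.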
A C macro can produce text which doesn't compile, but _Generic is a little bit more robust (just a little bit more).
You can write code which fails if the type isn't the one you want to succeed:
#define compil(X) _Generic((X), int: 1, default: abort())
This leaves the expression 1 if X is int; otherwise it inserts a call to abort (declared in <stdlib.h>). Now, you can have
1;
and the compiler will still be happy, but
int c = abort();
Is this OK? No: abort() is declared as void abort(void), so it yields no value, and the compiler will complain:
error: void value not ignored as it ought to be
So,
int c = compil(argc);
is OK, because argc is int, and it will be like
int c = 1;
But
int c = compil(argv);
won't compile. Contrast with
compil(argc);
compil(argv);
which compiles, but aborts when executed, because argv isn't an int.
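If you want the failure at compile time even when the value is not used, a different technique is to feed the result of _Generic to _Static_assert. This is a swapped-in alternative, not the abort trick above; the macros is_int and require_int and the function twice are hypothetical names.

```c
/* 1 if X has type int, 0 for any other type. */
#define is_int(X) _Generic((X), int: 1, default: 0)

/* Stops the compilation with a message when X is not an int. */
#define require_int(X) _Static_assert(is_int(X), #X " must be an int")

int twice(int n)
{
    require_int(n);   /* fine: n is an int */
    return 2 * n;
}
```

Writing require_int(argv); instead would stop the compilation with the message "argv must be an int", whether or not the result is assigned to anything.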
Conclusion
After all these years, C is still a language we couldn't live without. Think about just this: the Linux kernel is written in C.
It's a powerful language, maybe too powerful for the web and Java generations: it's “simple”, yet it has dark, complex corners…