[clug-progsig] Stupid C tricks

Chris Berry chrisb at chrisbtoo.net
Thu Sep 30 20:44:10 PDT 2004


If you were at the meeting tonight, you may have heard William and me 
bickering about an obscure and largely (completely?) pointless feature 
of C. Here's a rather lengthy description of what we were on about. 
Don't let it put you off C programming - to my mind it's one of the many
things that make C the most beautiful programming language :-)

Take a look at the following program:

1:  #include <stdio.h>
2:
3:  int main (void)
4:  {
5:    char buf[7] = {'c','h','r','i','s','b','\0'};
6:    char c = 3[buf];
7:    printf("%c\n", c);
8:    return 0;
9:  }

FWIW, the output of this program is the letter 'i' on its own.

If you've done any programming at all recently, you're likely to have 
come across the construct array[index] - most languages have that sort 
of syntax. Not so many will let you say index[array], though, as line 6 
above does.

The reason this works in C is down to 2 features - the syntactic 
definition of array indexing, and the semantic meaning of it. Let's look 
at the semantic meaning first.

In C, an array is basically just a block of memory, whether it lives on
the stack, on the heap or in static storage (if you don't know what
those are, don't worry, they're just bits of memory). Many of the
standard library functions (e.g. strcpy, strcmp) take as parameters a
const char * (pointer to an unchangeable character) or a char * (pointer
to a changeable character). What's more, though, C lets you treat those
pointers as pointing at the first of 1 or more characters. But that's
pretty much what an array is, so it'll also let you pass the name of an
array to something that's expecting a pointer, and it'll silently
convert the array to a pointer to its first element (no cast needed).

When you index into an array, what the compiler does is to take the name
of your array, convert it to a pointer to element 0, and add the index
value to that pointer to generate a pointer to the indexed array
element. For example:

1:  char array[10] = {1,2,3,4,5,6,7,8,9,10};
2:  char i = array[0];
3:  char j = array[4];
4:  char k = *(array + 4);

In line 2 above, the compiler effectively takes the address of the 0th 
element of the array, and adds 0 to it, then reads the value from the 
calculated address - so i = 1.

In line 3, we take the address of the 0th element again, add 4 to it so
we're pointing at element 4 (counting from 0), and read the value -
so j = 5.

In line 4, we're explicitly doing what the compiler is doing implicitly 
in the previous lines. We're taking the address of the 0th element of 
array, adding 4, then reading the contents of the memory - so k = 5 also.

So what we've learned here is that an index into an array is the same as 
reading the value from the memory pointed at by the sum of a pointer and 
an integer offset.

Now, the syntactic part. The grammar rules for C state that
(effectively) the following is a valid expression:

     expression := expression [ expression ]

That is, some expression followed by a '[', followed by another 
expression, followed by a ']' is also an expression. As it happens, it's 
the expression for an array index lookup. What it *doesn't* say (and 
this is where William and I disagreed) is that the first of the 
expressions has to be a pointer and the 2nd has to be an integer. This 
is why our earlier (remember earlier?) c = 3[buf]; is a syntactically 
valid statement.

We know from our semantic discussion above that an array indexing
operation is the same as the dereference of the pointer generated by
adding a pointer and an integer, so a[b] means *(a + b). In 3[buf] we
have a pointer and an integer, just not in the order one normally
expects to see them. Addition doesn't care about the order, so
*(3 + buf) is the same as *(buf + 3), which is buf[3]. The compiler
doesn't mind, because it fits its rules.


Chris.


