Friday, 9 March 2012

All about pointers

Everyone, myself included, has at some point thought of pointers as a complex, difficult-to-understand feature of C. They are much simpler than you think. You just need to know how the hardware and the C compiler deal with variables.

In C, all data is basically bytes: a char is 1 byte, an int is typically four bytes, a pointer is typically four or eight bytes, and so on. The difference is what those bytes mean.

Also, during compilation variable names are replaced by numbers. So when your program is compiled, the variable 'x' is replaced by (say) 1000, and this number refers to a location in memory.

Let's say your C program is this:

int main(void)
{
    int x, y;
    x = 5;      /* store 5 in x */
    y = x;      /* copy the content of x into y */
    y++;        /* add 1 to y */
    return 0;
}

The compiler will choose an address for each variable (say the address of x is 1000 and the address of y is 1100) and replace all your variable names with these addresses. The machine code generated by the compiler would look something like this, in the form of instructions to the CPU:

put 5 into address 1000
read content of address 1000 & put a copy of this content into address 1100
increment the content of address 1100

But those addresses are invisible to us programmers; we cannot see the 1000, we see only x, right? Well, being a powerful language, C allows us to peek behind the scenes and play with the addresses themselves. This is done via the 'address-of' operator, written &.

If I add this line to the above program:
z = &x;
it will mean the following (assuming z is at address 1200):
put the number 1000 (the address of x) into address 1200

Do you notice a difference? We did not say "read the content of address ....", we directly said "put the address....". This is basically what pointers are: addresses.

The type of z would be "pointer to int", which means "you know those int variables? I want to store their addresses, not their content". Notice that on typical platforms pointers to int, float, char, etc. are all the same size, since all variables live in the same memory space, which uses the same kind of address.

OK, another example:

int main(void)
{
    int x;
    int *z;     /* z can hold the address of an int */
    x = 8;
    z = &x;     /* z now holds the address of x */
    return 0;
}

What can we now do with z? We want to read and write to the address it stores. So we can do this:

int main(void)
{
    int x;
    int *z;
    x = 8;
    z = &x;     /* z now holds the address of x */
    *z = 12;    /* write 12 to the address stored in z */
    return 0;
}

This will generate machine code like this (assume again that x is at address 1000 and z is at address 1200; you can actually run this code in a debugger, set a breakpoint, and look at x, z, etc. in the watch window to see their actual addresses):

put the value 8 in the memory location 1000
put the value 1000 in the memory location 1200
read the content of the memory location 1200, use that value as a memory location, and go there and write 12

Oh! Did you see that? We did not put 12 in the address 1200, no! We took a peek in 1200, found another address, and went there and put 12.

What was the location written in 1200? It was 1000. And we know that 1000 is actually another name for x. If we printed x afterwards, say with printf("%d", x), we would find it to be 12 and not 8.

In summary:
  • &a means "the address assigned to the variable a"
  • *b means "read the contents of b, use it as an address for reading and writing".
  • * has another meaning when declaring variables: type *x means x is a pointer to a variable of the type.
All this looks more confusing than it actually is. If you find it confusing, read it again with a pencil and paper, and try to draw the memory with its locations and contents.

What is the benefit of all this? It's a long story about abstraction over variables, dynamic memory, and more. I discuss it in the second part of this article. Just learn pointers first :)
