Friday, September 18, 2009

C++ : Understanding pointers

1. Introduction

This article is intended for programming enthusiasts of all levels who wish to understand pointers in the C++ language. None of the code presented here is compiler specific, and all examples are written in plain ANSI C++. The debate about pointers can stretch for miles, and you would need to go really far to master it all. If you really want to run that far, this article gives you a clear understanding of the fundamental concepts behind pointers and prepares you for that journey. However, if you are new to C++ programming, make sure that you are able to write and run your own C++ “hello world” program. It is also recommended that you have a basic understanding of C++ functions and classes. If you need to refresh your knowledge of how to compile and run a C++ program or how to use functions and classes, please read the appendix at the end of this document before you continue reading this article.

2. What is a Pointer?

A pointer is a variable that stores a memory address. OK, that is simple! But what is a memory address then? Every variable is stored at a unique location within a computer's memory, and this location has its own unique address, the memory address. Normally, variables hold values such as 5 or “hello”, and these values are stored at a specific location within the computer's memory. A pointer, however, is a different beast, because it holds a memory address as its value and can therefore “point” ( hence pointer ) to a certain value within memory by means of that value's memory address.

3. Retrieving a Variable's Memory Address

OK, enough talking, let's get down to the pointer business. To retrieve a variable's memory address, we need to use the address-of operator &.
#include <iostream>
int main()
{
  using namespace std;
// Declare an integer variable and initialize it with 99
  unsigned short int myInt = 99;
// Print out value of myInt
  cout << myInt << endl;
// Use address-of operator & to print out a memory address of myInt
  cout << &myInt << endl;

return 0;
}
OUTPUT:
99
0xbff26312
The first line of the output contains the integer value 99, and the second line contains the memory address of myInt. Please note that the address printed on your system will most likely be different.

4. Assigning a Variable's Memory Address to a Pointer

Before we can assign a memory address to a pointer, we need to declare one. Declaring a pointer in C++ is as simple as declaring any other variable, with one single difference: an asterisk symbol " * " needs to be added after the variable type and before the variable name. One rule has to be followed when assigning a memory address to a pointer: the pointer type has to match the type of the variable it will point to. One exception is a pointer to void, which can hold the address of a variable of any type. To declare a pointer pPointer of type unsigned short int, the following syntax is used:
#include <iostream>

int main()
{
  using namespace std;

// Declare and initialize a pointer.
  unsigned short int * pPointer = 0;
// Declare an integer variable and initialize it with 35698
  unsigned short int twoInt = 35698;
// Declare an integer variable and initialize it with 77
  unsigned short int oneInt = 77;
// Use address-of operator & to assign a memory address of twoInt to a pointer
  pPointer = &twoInt;
// Pointer pPointer now holds a memory address of twoInt

// Print out associated memory addresses and its values
  cout << "pPointer's memory address:\t\t" << &pPointer << endl;
  cout << "Integer's oneInt memory address:\t" << &oneInt << "\tInteger value:\t" << oneInt << endl;
  cout << "Integer's twoInt memory address:\t" << &twoInt << "\tInteger value:\t" << twoInt << endl;
  cout << "pPointer is pointing to memory address:\t" << pPointer << "\tInteger value:\t" << *pPointer << endl;

return 0;
}
OUTPUT:
pPointer's memory address:              0xbff43314
Integer's oneInt memory address:        0xbff43318      Integer value:  77
Integer's twoInt memory address:        0xbff4331a      Integer value:  35698
pPointer is pointing to memory address: 0xbff4331a      Integer value:  35698
C++ pointer example diagram
The diagram above is a high-level abstraction of how variables are stored within a computer's memory. Pointer pPointer starts at memory address 0xbff43314 and takes 4 bytes. As its value, pPointer holds the memory address of the short int twoInt ( 2 bytes ), which is 0xbff4331a. This address is stored as binary data within the pointer's own memory space. Therefore, dereferencing a pointer that holds the memory address 0xbff4331a will indirectly access the value of twoInt, which in this case is the positive integer 35698.

5. Accessing the Value at the Memory Address held by a Pointer

As you could see in the previous example, the pointer pPointer truly holds the memory address of twoInt as its value. The process of accessing a variable's value through a pointer is called indirection, since the value of the variable is accessed indirectly. The value of twoInt can therefore be accessed indirectly by means of pPointer. To do that, we need to dereference the pointer with the dereference operator “ * “, which needs to be placed before the pointer variable name. The following example uses a pointer named pMark to access the value of myInt:
#include <iostream>

int main()

{
  using namespace std;
// Declare an integer variable and initialize it with 99
  unsigned short int myInt = 99;
// Declare and initialize a pointer
  unsigned short int * pMark = 0;
// Print out a value of myInt
  cout << myInt << endl;
// Use address-of operator & to assign a memory address of myInt to a pointer
  pMark = &myInt;
// Dereference a pMark pointer with dereference operator * to access a value of myInt
  cout << *pMark << endl;

return 0;
}
OUTPUT:
99
99

6. Manipulating Data with Pointers

Just as indirection can be used to access the value at the memory address held by a pointer, it can also be used to manipulate that value. Assigning a value to a dereferenced pointer will indirectly change the value of the variable the pointer is pointing to. The following example illustrates simple manipulation of data with pointers:
#include <iostream>

int main()

{
  using namespace std;
// Declare an integer variable and initialize it with 99
  unsigned short int myInt = 99;
// Declare and initialize a pointer
  unsigned short int * pMark = 0;
// Print out a value of myInt
  cout << myInt << endl;
// Use address-of operator & to assign memory address of myInt to a pointer
  pMark = &myInt;
// Dereference a pMark pointer with dereference operator * and set new value
  *pMark = 11;
// show indirectly a value of pMark and directly the value of myInt
  cout << "*pMark:\t" << *pMark << "\nmyInt:\t" << myInt << endl;

return 0;
}
OUTPUT:
99
*pMark: 11
myInt:  11
Dereferencing the pMark pointer and assigning the new value 11 indirectly changes the value of myInt to 11 as well. It is important to bring up again that a pointer holds only a variable's memory address, not its value. To access the value of the variable a pointer is pointing to, the pointer must be dereferenced with the dereference operator “ * “. Note that “ * “ is not a multiplication here: from the context of your C++ code, the compiler can tell whether you intend to use the multiplication or the dereference operator.

7. Pointers and Arrays in C++ Language

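In C++, an array name used in an expression is converted in most contexts to a pointer to the array's first element, and because an array cannot be reassigned, that pointer behaves much like a constant pointer. C++ pointers and arrays are therefore closely related, and their use is almost interchangeable. Consider the following example:
#include <iostream>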

int main()
{
  using namespace std;
// Declare an array with 10 elements
  int Marks [10]= {1,2,3,4,5,6,7,8,9,0};
// Print out the memory address of an array name 
  cout << Marks << endl;
// Print out the memory address of a first element of an array
  cout << &Marks[0] << endl;
// Print out the value of the first element by dereferencing the array name
  cout << *Marks << endl;
return 0;
}
OUTPUT:
0xbf83d3fc
0xbf83d3fc
1
As you can see, an array name indeed yields a pointer to the array's first element, and therefore it is perfectly legal to access the array elements through a pointer. Hence, dereferencing an array name will access the value of the first element of the given array. The next example demonstrates how to access other elements of an array through a pointer to const int.
#include <iostream>

int main()
{
  using namespace std;

// Declare an array with 10 elements
  int Marks [10]= {1,2,3,4,5,6,7,8,9,0};
// Create a pointer to const int that points to the first element of the Marks array
  const int *pMarks = Marks;
// Access a 6th element of an array by pMarks pointer
  cout << *(pMarks + 5) << endl;
// Access a 6th element by dereferencing array name
  cout << *(Marks + 5) << endl;
// Access a 6th element of an array
  cout << Marks[5] << endl;

return 0;
}
OUTPUT:
6
6
6
The 6th element of an array can be referenced with the pointer expression *(Marks + 5). In addition, it is possible to declare another pointer and assign it the memory address of the first element of the array. Here pMarks is declared as a pointer to const int, so the elements can be read but not modified through it. For reading elements, this pointer then behaves the same way as the originally declared Marks array. The “ + “ sign tells the compiler to move 5 objects of type int from the start of the array. If the object is an integer, as it is in our example ( an integer is usually 4 bytes ), this will cause the pointer to point to a memory address 20 bytes past the address of the first array element, and thus to the 6th element. The following example demonstrates this idea in detail:
#include <iostream>
int main()
{
  using namespace std;
// Declare an array with 10 elements
  int Marks [10]= {1,2,3,4,5,6,7,8,9,0};
// Create a pointer to const int that points to the first element of the Marks array
  const int *pMarks = Marks;

  for (int i=0, bytes=0; i < 10; ++i, bytes+=4)
  {
    cout <<  "Element " << i << ": " << pMarks << " + ";
    cout <<  bytes << " bytes" << " = " << (pMarks + i) << endl;
  }
return 0;
}
OUTPUT:
Element 0: 0xbfa5ce0c + 0 bytes = 0xbfa5ce0c
Element 1: 0xbfa5ce0c + 4 bytes = 0xbfa5ce10
Element 2: 0xbfa5ce0c + 8 bytes = 0xbfa5ce14
Element 3: 0xbfa5ce0c + 12 bytes = 0xbfa5ce18
Element 4: 0xbfa5ce0c + 16 bytes = 0xbfa5ce1c
Element 5: 0xbfa5ce0c + 20 bytes = 0xbfa5ce20
Element 6: 0xbfa5ce0c + 24 bytes = 0xbfa5ce24
Element 7: 0xbfa5ce0c + 28 bytes = 0xbfa5ce28
Element 8: 0xbfa5ce0c + 32 bytes = 0xbfa5ce2c
Element 9: 0xbfa5ce0c + 36 bytes = 0xbfa5ce30
Think over the following pointer arithmetic operation:
Since we know that this particular array contains integers, and integers are 4 bytes long on this system, the memory reserved for the entire array is 40 bytes. Memory address 0xbfa5ce0c is where the first element of the array resides. Adding 20 bytes ( 0x14 ) to the address of the first element returns the address of the 6th element: 0xbfa5ce0c + 0x14 ( 20 bytes ) = 0xbfa5ce20

7.1. Array of pointers

In C++, it is also possible to declare an array of pointers. Such a data structure could be used to create an array of strings ( string array ). In C++, a C-style string is accessed through a pointer to its first character, thus a string array is essentially an array of pointers, where each element points to the first character of a string.

#include <iostream>
int main()
{
  using namespace std;

  const char *linuxDistro[6] =
    { "Debian", "Ubuntu", "OpenSuse", "Fedora", "Linux Mint", "Mandriva"};

  for ( int i=0; i < 6; i++)
    cout << linuxDistro[i] << endl;

return 0;
}
OUTPUT:
Debian
Ubuntu
OpenSuse
Fedora
Linux Mint
Mandriva

8. Why do we need pointers?

So far, you have been exposed to some theory and syntax behind C++ pointers. You learned how to assign the memory address of a variable to a pointer. Assigning the address of an existing variable to a pointer is useful for explaining how pointers work. However, in real life you would rarely do that just to reach a variable that is already accessible by name. The right question at this point is: why do we need pointers in C++ if we can access and manipulate variables simply by using their declared names? The ability to access memory directly through pointers is often cited as an advantage of C++ over languages such as Visual Basic, C# or Java. Working with memory addresses directly, rather than copying the data stored there, can increase the efficiency and flexibility of the written code. However, as can be expected, increased efficiency comes at a cost, because a low-level tool such as a pointer is more difficult to use correctly. The most common uses of pointers include:
  • Data management on the free store
  • Accessing class member data and functions
  • Passing variables by reference to functions

9. Where do we use pointers?

In this section, we will explore a couple different ways of using pointers in C++ language.

9.1. Allocating a memory from the "Free Store"

Variables declared and used locally inside a function definition are destroyed once the function returns to the calling statement. Passing arguments to a function by value has the advantage that the programmer does not have to free the memory allocated for the function's local variables, and that the variables passed to the function cannot be changed by it. The disadvantage shows when variables declared within a function's scope are needed by other functions, because copying them via the function's return value creates overhead.

One solution to this problem is to create global variables. However, this tends to decrease code readability and efficiency and to invite bugs, therefore declaring global variables should be avoided whenever possible. Variables declared in the global namespace hold their allocated memory during the entire program runtime, although very often they are needed only for a short time.

Now imagine that a programmer is able to allocate memory dynamically during program runtime. Memory allocated this way can be used anywhere within the code and freed as soon as it is no longer needed. This is where the so-called “Free Store” comes in. The free store is memory that a programmer can dynamically allocate and de-allocate during program execution.

Allocating memory with the new operator

The new operator is used when memory needs to be allocated on the free store. The return value of the new operator is a pointer, therefore it should be assigned only to a pointer. Here is an example:
#include <iostream>
int main()
{
using namespace std;
// Declare a pointer
unsigned short  * pPointer;
// allocate memory with the new operator on the free store
// and assign its memory address to a pointer named pPointer
pPointer = new unsigned short;
// assign an integer value to the memory pPointer points to
*pPointer = 31;
// Print out the value pPointer points to
cout << *pPointer << endl;

return 0;
}
OUTPUT:
31
The free store memory allocated during the program runtime was released only when the program itself ended. For this short program that is OK. However, not returning allocated memory to the free store is not good programming practice, and more importantly, your code will normally consist of more than 11 lines.

De-allocating memory with the delete operator

To return allocated memory back to the free store, the delete operator is used:
#include <iostream>
int main()
{
using namespace std;
// Declare a pointer
unsigned short * pPointer;
// allocate memory with the new operator on the free store
// and assign its memory address to a pointer named pPointer
pPointer = new unsigned short;
// assign an integer value to the memory pPointer points to
*pPointer = 31;
// Print out the value pPointer points to while the memory is still allocated
cout << *pPointer << endl;
// de-allocate the memory back to the free store
delete pPointer;
// pPointer must not be dereferenced any more, so reset it to the null pointer
pPointer = 0;

return 0;
}
OUTPUT:
31
The delete operator frees the memory allocated by the new operator and returns it to the free store. The declaration of the pointer is still valid, so pPointer can be reassigned. The delete operator does not change the pointer itself: after delete, pPointer still holds the old memory address, but the memory at that address no longer belongs to the program, and dereferencing the pointer at that point results in undefined behavior. This is why the example prints the value before calling delete and resets pPointer to 0 afterwards.
NOTE: It is important to mention that pointers should always be initialized, and reset again after the delete operator has been used. Not doing so creates a so-called stray pointer, which can lead to unpredictable results, because a stray pointer still points to the old position in memory, and that memory can be reused to store other data. The program may appear to function normally. However, a time bomb to crash the program is set, and the clock is ticking.

9.2. Data type class pointers

Declaring a pointer to a class type is no different from declaring a pointer to any other data type. Consider the following example, where a simple class Heater is constructed to demonstrate this concept.
#include <iostream>

using namespace std;
class Heater
{
// Declare public functions for class heater
public: 
// constructor to initialise temperature
    Heater( int itsTemperature);
// destructor, takes no action 
    ~Heater();
// accessor function to retrieve temperature
    int getTemperature() const;
// declare private data members
private:
    int temperature;
};
// definition of constructor
Heater::Heater(int itsTemperature)
{
    temperature = itsTemperature;
}
// definition of destructor
Heater::~Heater()
{
// no action taken here
}
// Definition of an accessor function which returns the value of the private Heater class member temperature
int Heater::getTemperature() const
{
    return temperature;
}

int main()
{
// create a Heater object on the free store, initialize its temperature to 8 by use of the constructor, and store its address in the pointer modelXYZ
    Heater * modelXYZ = new Heater(8);
// Call the accessor function getTemperature() to retrieve the temperature via the pointer modelXYZ
    cout << "Temperature of heater modelXYZ is :" << modelXYZ->getTemperature() << endl;
// Free an allocated memory back to a free store
    delete modelXYZ;

return 0;
}
OUTPUT:
Temperature of heater modelXYZ is :8
The largest part of the code above is devoted to the class definition. The interesting segment from a pointer's perspective is in the main function. Memory from the free store is allocated for a new object of class Heater, and its address is stored in the pointer modelXYZ. On the next line, the accessor function getTemperature() is called through the pointer with the arrow operator. Lastly, the memory set aside from the free store is freed. It is important to mention that the (.) operator, which is generally used to access class members, cannot be applied directly to a pointer. When an object is accessed through a pointer, the arrow ( -> ) operator is used instead.

9.3. Passing to a function by reference using pointers

In addition to passing by value and by reference, arguments can also be passed to a function by using pointers. Consider the following example:
#include <iostream>

using namespace std;

// FUNCTION PROTOTYPE
// passing to a function by reference using pointers
void addone( int * a, int * b);

int main()
{
// define integers a and b
int a = 1;
int b = 4;

  cout << "a = " << a << "\tb = " << b << endl;
// addone() function call
  cout << "addone() function call\n";
  addone(&a,&b);
// after addone() function call value a=2 and b=5 
  cout << "a = " << a << "\tb = " << b << endl;

return 0;
}

// addone() function header
void addone( int * a, int * b)
{
// addone() function definition
// this function adds 1 to each value
  ++*a;
  ++*b;
}
OUTPUT:
a = 1   b = 4
addone() function call
a = 2   b = 5
The function prototype of addone() declares pointers as its parameter types. The addone() function adds 1 to each variable whose memory address was passed during the function call from the main function. It is imperative to know that passing pointers as arguments allows the values they point to to be changed outside of the addone() function scope as well. This is the complete opposite of the situation where a programmer passes arguments to a function by value. Please see the appendix at the end of this article for examples of how to pass arguments to a function by value and by reference.

10. Memory leaks associated with pointers

As was already mentioned above, using pointers and the new operator gives a programmer great power. Great power very often comes with great responsibility. Mishandling free store memory can result in memory leaks. Next, we will illustrate some instances of memory leaks related to C++ pointers.

10.1. Memory leak caused by pointer reassignment

A memory leak takes place when memory allocated from the free store is no longer needed but is not released by the delete operator. One way this can occur is by reassigning a pointer before the delete operator has had an opportunity to do its job:
#include <iostream>
int main()
{
using namespace std;
// Declare a pointer
unsigned short * pPointer;
// allocate memory with the new operator on the free store
// and assign its memory address to a pointer named pPointer
pPointer = new unsigned short;
// assign an integer value to the memory pPointer points to
*pPointer = 31;
// print out the value pPointer points to and its associated memory address
cout << "*pPointer Value: " << *pPointer << "\tMemory Address: " << pPointer << endl;
// reassign the pointer to a new memory address from the free store;
// the address of the first allocation is lost here
pPointer = new unsigned short;
// assign an integer value to the newly allocated memory
*pPointer = 15;
// print out the value pPointer points to and its corresponding memory address
cout << "*pPointer Value: " << *pPointer << "\tMemory Address: " << pPointer << endl;
// de-allocate the memory back to the free store (only the second allocation)
delete pPointer;

return 0;
}
OUTPUT:
*pPointer Value: 31     Memory Address: 0x90e9008
*pPointer Value: 15     Memory Address: 0x90e9018
Analysing the output from the memory leak example:
  1. pPointer holds a memory address ( 0x90e9008 ) which points to an integer of value 31.
  2. pPointer is reassigned a new memory address by allocating new memory from the free store.
  3. The original memory address, which pointed to the value 31, is lost, and therefore that memory cannot be released with delete.
  4. pPointer now holds a memory address ( 0x90e9018 ) which points to an integer of value 15.
  5. The allocated memory at address 0x90e9018 is de-allocated.
  6. The allocated memory at address 0x90e9008 is released only when the program terminates.

10.2. Memory leak caused by misuse of local variables

When a function returns a value or executes its last statement, all variables declared within the function definition are destroyed and are no longer accessible on the stack segment. This can cause a problem when memory from the free store is allocated to a pointer declared locally within a function definition and is not freed with the delete operator before the pointer goes out of scope. The next example sketches how this may occur:
#include <iostream>
// function prototype
void localPointer();

int main()
{
// function call from the main function
localPointer();

return 0;
}

// function header
void localPointer()
{
// function definition
// variable declaration pPointer of type pointer 
unsigned short int * pPointer; 
// allocate a new memory from the free store and assign its memory address to a pointer pPointer 
pPointer = new unsigned short int;
// initialise pPointer
*pPointer = 38;
// print out a value to which pPointer is pointing to
std::cout << "Value of *pPointer : " << *pPointer << std::endl;
// de-allocation statement "delete pPointer;" is missing here

// function end
}
OUTPUT:
Value of *pPointer : 38
When the function localPointer() exits, all variables declared within the function scope cease to exist. pPointer is also a locally declared variable, therefore the memory address of the memory allocated from the free store is lost, and that memory becomes impossible to de-allocate. The program will run normally in this simple example. However, every time the function localPointer() is called, new memory is set aside and never freed, which diminishes the program's efficiency and can eventually cause the program to crash. More examples of memory leaks can be found on codersource.net.

11. Conclusion

As was already mentioned earlier, pointers give great power to any software engineer who masters them. This article tried to introduce pointers in a simple way by using examples, so that the reader is well on the way to achieving that goal. Accessing memory directly through pointers is one of the best features of the C++ language, and for that reason C++ is often a preferred language for programming embedded devices as well. Great caution needs to be taken when working with pointers, as great power can swiftly turn into a disaster.

12. Useful links:

Here are some great links to enhance understanding of C++ pointers.


13. Appendix

13.1. Write, compile and execute C++ “hello world”

#include <iostream>

int main()
{
  using namespace std;
  cout << "Hello world\n";

return 0;
}
COMPILE:
g++ hello-world.cc -o hello-world
EXECUTE:
./hello-world
OUTPUT:
Hello world

13.2. C++ functions basics

Please note that the following program is for revision purposes only. It should not serve as a full guide to C++ functions. Please visit the following links for a better understanding of how to use C++ functions:
#include <iostream>

using namespace std;

// FUNCTION PROTOTYPES

// passing to a function by value
int add( int a, int b);
// passing to a function by reference using pointers
void addone( int * a, int * b);
// passing to a function by reference using reference
void swap( int &a, int &b);

int main()

{
// define integers a and b
int a = 1;
int b = 4;

// add() function call return value will be printed out
  cout << "add() function\n";
  cout << add(a,b) << endl;
  cout << "a = " << a << "\tb = " << b << endl;
// addone() function call
  cout << "addone() function\n";
  addone(&a,&b);
// after addone() function call value a=2 and b=5 
  cout << "a = " << a << "\tb = " << b << endl;
// swap() function call
  cout << "swap() function\n";
  swap(a,b);
// after swap() function call value a=5 and b=2
  cout << "a = " << a << "\tb = " << b << endl;

return 0;
}
// add() function header
int add( int a, int b) 
{
// add() function definition
// this function returns sum of two integers
  return a + b;
}

// addone() function header
void addone( int * a, int * b)
{
// addone() function definition
// this function adds 1 to each value
  ++*a;
  ++*b;
}

// swap() function header
void swap( int &a, int &b)
{
// swap() function definition
// this function swaps values
int tmp;

  tmp = a;
  a = b;
  b = tmp;
}
OUTPUT:
add() function
5
a = 1   b = 4
addone() function
a = 2   b = 5
swap() function
a = 5   b = 2

13.3. C++ classes basics

Please note that the following program is for revision purposes only. It should not serve as a full guide to C++ classes. Please visit the following links for a better understanding of how to use C++ classes:
#include <iostream>

using namespace std;
class Heater
{
// declare public functions for class heater
public: 
// constructor to initialize the private data member temperature
    Heater( int itsTemperature);
// destructor, takes no action 
    ~Heater();
// accessor function to retrieve the private data member temperature
    int getTemperature() const;
// declare private data members
private:
    int temperature;
};
// definition of constructor
Heater::Heater(int itsTemperature)
{
    temperature = itsTemperature;
}
// definition of destructor
Heater::~Heater()
{
// no action taken
}
// definition of the accessor function which returns the value of the private Heater class member temperature
int Heater::getTemperature() const
{
    return temperature;
}

int main()
{
// define a Heater class object and initialise temperature to 8 by calling a constructor
    Heater modelXYZ(8);
// call the accessor function getTemperature() to retrieve the temperature via the object modelXYZ
    cout << "Temperature of heater modelXYZ is :" << modelXYZ.getTemperature() << endl;
    return 0;

}
OUTPUT:
Temperature of heater modelXYZ is :8
