
CtoC++: Upgrading to Object Oriented C

=Introduction=

This tutorial carries on where StartingC left off.

To get the material, cut and paste the contents of the box below onto your command line.

svn co https://svn.ggy.bris.ac.uk/subversion-open/CtoC++/trunk ./CtoC++

In this tutorial we will assume basic linux skills as outlined in Linux1.

=Cutting to the Chase: Classes and Encapsulation=

So, here we are contemplating C++. We've got to grips with most of the C language in StartingC and it looked alright. Definitely serviceable. What's all the fuss about C++? Well, I believe that most of the fuss is about encapsulation. We saw the benefit of collecting together related variables into structures in C, true? Well, C++ goes further and allows us to collect together not only related variables, but also the functions which use those variables. An instance of a class is called an object and it comes preloaded with all the variables and functions (aka methods) that you'll need when considering said object.

What may have seemed like the relatively small enhancement of adding methods to the encapsulation has, in fact, resulted in a sea-change. No longer are we thinking about a program in terms of the variables and the functions, but instead we're thinking about objects (planets, radios, payrolls and the like) and how they interact with other objects. Hence the term object oriented programming (OOP).

Things are typically invented for a reason and C++ is no different. The problem with the traditional procedural programming model, such as standard C, is that as our programs grow we end up with more and more variables which are used by more and more functions. These functions and variables are typically all mixed up in the scope, or namespace, of the top-level function, called 'main'. Modification and maintenance of the program becomes harder and harder since it becomes more difficult to keep track of which variables are used by which functions. Overall, our program begins to resemble spaghetti--not a renowned building material! Instead, we would like to work with something more amenable to our aims. We would like components which are easily combined, modified or even replaced completely. A more modular paradigm suggests itself. We want the programming equivalent of Lego!



We'll see in the following examples that the OOP approach, and in particular the mindset of encapsulation, provides us with the modular building blocks that we are after. Repeat after me, "encapsulation is the best thing since sliced-bread!" :-)

OK, enough of the spiel, let's get our hands dirty with an actual example:

cd CtoC++/examples/example1
make

The first chunk of code to greet you inside class.cc (we'll use .cc to denote C++ source code files) is:

What's new? Well, first up, we see that the comment syntax has changed and that we can use just a leading double forward slash (//) to signal a note from the author. #include is familiar, except that we've dropped the .h suffixes from the names inside the angle brackets.

The next block is a namespace declaration. The concept of a namespace is common to a number of programming languages and here we're setting one up called scientific and using it to store some handy constants. We can enclose anything we like in a namespace. We access the contents of a namespace via the using directive. In this case we're accessing an intrinsic one called std (standard)--we'll be doing that a lot!--and also our scientific one. The idea behind namespaces is to reduce the risk of a clash of names when programs get large. They're handy.

Next up in the source code is the class declaration (and definition, as it happens) itself:

You can see that the class called satellite contains some variables and also some methods. The contents of the class are also separated into two sections by the keywords private and public. We've declared our variables to be private (cannot be seen from outside the class) and our methods to be public (are visible from outside). In doing so, we've set up an interface (i.e. the public methods) through which other parts of the program can interact with this class. In this case, the program at large can call set, providing information about the satellite's orbit as it does so, and also mass_of_attractor in order to discover the mass of whatever the satellite is orbiting.



The existence of an interface simplifies the ways in which the object interacts with the rest of the program and means that any alterations to the program are much easier to make. For example, you can make changes to the internals of a class without fear that you will unwittingly break some aspect of the program outside of the interface. Indeed, we could entirely re-write the contents of a (perhaps complex) class and as long as the interface remains unchanged, the rest of the program need never know! This is quite a boon for scientific software, which has a more rapid schedule of alterations than other kinds of software.



Last up is our glue code, or main function:

in which we declare an instance of our satellite class, the moon object, call set and finally mass_of_attractor, noting the dot (.) operator for accessing members of the class.

The way in which we print to stdout is also different in C++. Here we have used the left shift operator (<<), which the I/O library overloads to insert values into a stream, together with the cout output stream and the end-of-line manipulator (endl).

You can run the program--and weigh the Earth!--by typing:

./class.exe

(The eagle-eyed amongst you will note that we have a small error in our calculation of mass. The intrigued amongst the cohort of eagles may be relieved to see that Kepler's law gives the combined mass of the moon and Earth in this case, and that if we subtract off the mass of the moon, we get closer to the actual mass of the Earth--phew!)

Exercises
 * Try modifying the main program, so that you weigh the Sun, instead of the Earth. The following pages give you details of the orbit of the Earth and the mass of the Sun, to check.
 * Add a new method to the satellite class to compute the mean orbital speed of the satellite, and perhaps another to compute the satellite's speed at various points along its orbit?
 * Add a whole new class to the program. This is just for practice, so it could be a very simple one.  How about a class to represent a 2-d vector (i.e. on the x-y plane), which has a method to report the magnitude of that vector?



=More on Methods=

OK. We've bundled up some methods and variables into a class. This is all to the good. However, we haven't delved too deeply into all the features that C++ provides with regards to methods. Let's rectify that right now. We'll make a start by typing:

cd ../example2
make

In this directory, you'll see that we've split our program over three files:
 * 1) methods.h, containing the declarations (names and types of arguments) for our enhanced satellite class,
 * 2) methods.cc, containing the 'meat' of the methods and,
 * 3) main.cc, containing the main function inside which we put our class through its paces.

Looking inside the header file, you'll see our scientific namespace again, as well as the class declaration:

This time around we have some extra members:
 * We have a character pointer called name, along with an integer to store the length of the character array, once some memory has been allocated.
 * We have a number of constructor methods, which we immediately see are special since their (shared) name matches the name of the class.
 * We have a destructor, whose name also matches the class name, but with a leading twiddle (~).
 * We have a private method called copy,
 * a display method and also
 * an assignment operator (=).

Let's go through these in turn.

Constructors are invoked when a new object is created. The two relevant lines in main.cc are:

Here we've declared two instances of the satellite class and--imaginatively enough--called them moon1 and moon2. We created moon1 using the default constructor (no arguments follow the variable name), the internals of which we can find inside methods.cc:

As its name suggests, this method sets up an object with default values (zero values, null strings etc.) in lieu of any specific information.

moon2 was created using a constructor which takes arguments:

This method accepts the name of the satellite instance, together with values for the period and the semi-major axis. Given these, it merely calls the set method, which is sensible since this method has all the functionality that we desire, and it's a bad idea to duplicate the code.

We can see that these two methods have exactly the same name and differ only in their associated argument lists. This is an example of what's called overloading, which can be highly desirable when designing clear and simple class interfaces. We can overload methods and operators.

You will see that we also have what we've labelled as a copy constructor, which takes another instance of the satellite class as its argument, and creates a new object in its image.

This method makes use of a member initializer and calls the private copy method (not available from outside the class, but callable from other members). Member initializers are carried out before the body of the constructor runs, and always in the order in which the members are declared in the class. In this case, we've set name equal to NULL so that the copy method can tell that no memory has yet been allocated for the name.

C++ will implicitly provide a copy constructor, an assignment operator and a destructor. The implicit copies are what's known as shallow: they copy each member in turn, pointers included. That is fine for classes which do not make use of dynamic memory allocation. However, for more complex classes, we must write our own deep copying methods. For example, our copy method:

The copy needs to be deep because, if we were not careful, we would end up with two objects containing pointers to the same block of memory (holding the 'name' character string) and that would not be at all what we wanted! Instead we allocate some new memory and call a string copying function from the standard C library. Copying the values of the numerical variables is easy. We've made use of the C++ memory allocation operator new, which we can all agree is far simpler than 'malloc'. Correspondingly, delete replaces 'free'.

None of the other methods warrant any comment, except for the assignment operator:

In this case, we've overloaded the = operator and given it particular instructions when faced with instances of the satellite class on either side of it, such as the statement:

Using this method, we've ensured that a deep copy takes place, where the name string is handled appropriately.

Good eh? Now we see the way to create full and convenient interfaces to our classes. To run the program, type:

./methods.exe

Exercises


 * Experiment with the copy constructor. For example, is it legal syntax to add the declaration satellite moon3(moon2); towards the end of the main function?
 * Method arguments can have defaults attached, e.g. satellite(const char *nm, const double prd=0.0, const double sma=0.0). Experiment with the constructor with arguments.  How much more flexibility can you introduce to the interface?  Note that your default values should only be added to the declaration of the class (i.e. inserted in the header file), and your default arguments must be all the rightmost arguments in the list.
 * Can you define other methods/operators for this class? How about 'less than' (<) or 'greater than' (>) operators.  If two satellites were to collide and coalesce, what could a plus (+) operator do?

Hints: My template for the 'less-than' operator is below. The argument '_stllt', will act as the RHS of the comparison. The class through which the method is invoked will be the LHS. (A similar template will hold for the plus operator, except that this method must return a copy of a new instance of the class.)

Is this good enough? Note that since the class of the argument and this are the same, the instance on the LHS of the comparison can access the private data members of that on the RHS.

=Templates and the Standard Template Library=

OK, so things are going swimmingly. We're using classes for encapsulation. We've considered the interface to a class in some detail and seen how we can improve the way that instances of a class interact with the rest of the program. This is all excellent, but... You knew there was a wrinkle on the horizon, eh?

Let's take a moment to think about data structures. The way we store data can make a huge difference to a program. Given the right data structures, solving an involved problem can be a pleasure, if not a cinch. Given the wrong data structures, the whole enterprise can be a chore!

So far, we've hardly stopped to think about data structures. We've seen single variables and arrays of said variables. As an improvement, we've also seen structures and even arrays of structures. There are a great many more possibilities, however. We can have stacks, queues, linked lists, binary trees, sets, strings, vectors, matrices and many, many more. All these data structures are designed to highlight certain properties of some stored data and so make certain operations as easy as possible.

For example, a tree structure is good for representing a search through a state-space. If you wanted to program a computer to play chess, you could represent the state of the board at a node. Different moves from a given state would be the branches. As you can see, by using a tree we can hold a number of different move sequences in memory at the same time. We can pick and advance any stored state by another move. We can also prune away a whole 'subtree' of moves, should it prove ill-advised, according to some criterion.

For our code example, let's consider one of the simpler structures--a stack. To create a stack of boxes we would take a box and set it down. We take another box and place it on top of the first, and so on. In order to get at the first box, we need to take all the other boxes off it. The image below shows such a stack.



Sometimes, this is exactly the way in which we want to store our data. If we were modelling the deposition and erosion of sediments on the sea floor, for example, a stack would be just the ticket.

OK, ok, this is all well and good, but where's the wrinkle? Well, let's say we want a stack of real numbers at one point of a program, and a stack of integers at another. Does that mean that we would need to write two different classes, with all their associated interface gubbins, one for the doubles and one for the integers? That would be a pain!

Fear not! We can write a template class instead. Templates are neat, as we do not need to specify the type of thing that will be found in a stack until the point where we declare an instance of said stack. In order to illustrate this approach, we have a small example of what we will call a LIFO stack. LIFO stands for 'Last In, First Out'.

cd ../example3
make

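Inside lifo.h, you'll see the declaration of our template class, together with its definition. Both live in the header because the compiler generally needs to see the full definition of a template at the point where it is instantiated: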

Note the use of the wildcard name TYPE in the angle brackets (this name could be anything, but TYPE in capitals stands out nicely). The interface to the class contains methods for construction and destruction, as well as the basic modes of operation--pushing and popping items on to and off the stack. We have a method to report what's on the top of the stack and a couple more to report whether the stack is 'full' or 'empty'.

Feel free to browse the details of the implementation, but we'll skip over them here. They are relatively rudimentary and no doubt could tolerate a good deal of improvement. The short piece of glue code is contained in main.cc and you can run the example program by typing:

./lifo.exe

One of the reasons why we haven't laboured too hard over our stack implementation is that C++ provides us with something called the Standard Template Library, or STL, for short. This contains tried and tested implementations of many of the data structures and algorithms that we would like. All there, provided to us for free!

An example of using a stack from the STL is in:

cd ../example4
make

This time, all we need is in main.cc:

and you can run the program in the usual way:

./stack.exe

For good measure, you will also see an example of a list sourced from the STL, and an associated iterator for cycling through the members of said list. Iterators allow us to cycle over members of a data structure without having to know the details of how that particular data structure is implemented.

To learn more about the STL, you can take a look at, e.g. SGI's page or that on Wikipedia. O'Reilly, of course, have a few good books on the topic too.

Other libraries that augment the STL are listed on http://www.boost.org. This collection contains many more useful algorithms and datatypes. With the STL, Boost etc., the sky is the limit!

Exercises


 * Modify the program in example4 to make use of other members of the STL, such as a queue and perhaps a linked list.
 * Those who are really looking for a challenge can get to grips with hash tables (maps) and binary trees!
 * Why not go the whole hog and write your own binary tree and iterator (depth or breadth first search)? You'll learn a lot!

=Inheritance=

The last topic that we will look at is inheritance. This is a mechanism through which you can declare a new class--called the derived class--to be a specialisation of another class--called the base class. In line with the spirit of the pragmatic programming tutorials, we will not linger on this topic as we believe that while it is certainly neat, it may be of limited use for our scientific projects.

In this example, we will consider the simplest, but quite likely the most often used, form of inheritance--public inheritance from a single parent base class.

cd ../example5
make

In inheritance.h, we see a simple base class declared:

Followed by a derived class which builds on the concept of a celestial body and adds in space to store information about its orbit and additional methods:

In inheritance.cc, you will see that we call the 'set' method in the base class from the constructor of the derived class. This highlights that what is private in the base class is hidden from the derived class and so an appropriate interface is required even within a chain of parents and children.

In main.cc, we see that through the process of inheritance, we can call the volume method (declared in the base class) from an instance of the derived class:

Note that we have declared two instances of the class satellite. The constructor for 'moon2' is given all the relevant information, whereas that for 'moon1' relies on default values for the size settings. To run the program type:

./inheritance.exe



Exercises


 * There is a good deal more to discuss on the topic of inheritance, but I will leave researching those details as an exercise to the reader for the moment.

=A Good Read?=


 * References for further reading.