StartingC

startingC: Learning the C Programming Language

=Introduction=

svn co http://source.ggy.bris.ac.uk/subversion-open/startingC/trunk ./startingC

=A Quintessential First Program=

OK, now that we have the example code, let's get cracking and run our first C program. First of all, move into the example directory:

cd startingC/examples/example1

We'll use a Makefile for each example, so as to make the build process painless (hopefully!). All we need do is run make (see the make tutorial if you're interested in this further):

make

Now, we can run the classic program:

./hello.exe

and you should get the friendly response:

hello, world!

Bingo! We've just surmounted (in some ways) our hardest step: running our first C program. Given this quantum leap, everything else will be plain sailing from here, honest!

=Types & Operations=

Buoyed with confidence from our first example, let's march fearlessly onwards into the realm of variable types and basic operations. To do this, move up and over to the directory example2 and type make to build the example programs:

cd ../example2
make

Take a look inside types.c and after the start of the main function, you'll see a block of variable declarations:

char   nucleotide;       /* A, C, G or T for our DNA */
int    numPlanets;       /* eight in our solar system - poor old Pluto! */
float  length;           /* e.g.  1.8288m, for a 6' snooker table */
double accum;            /* an accumulator */

C, like many languages (e.g. Fortran), requires that variables be declared to be of a certain type before they can be used, and here we see examples of four intrinsic types provided by the language. It's a very good habit to comment all your variable declarations, and here the comments pretty much explain what the various types are. double is a double-precision floating point number, typically occupying twice the storage space of a float. The extra space makes a double a good choice for an accumulator, where you want to minimise rounding errors and avoid under- and overflow as best as possible. (The Fortran programmers amongst us will note, with a wince, the absence of an intrinsic type for complex numbers in the original language. Those reeling from this revelation will be comforted by the knowledge that C99 added complex types through complex.h, and that C++ contains a complex class.)

Various types can be given further qualifiers, such as short, long, signed and unsigned:

short int mini;          /* typically two bytes */
long int maxi;           /* typically four or eight bytes, depending on the platform */
signed char cSigned;     /* one byte, values in the range [-128:127] */
unsigned char cUsigned;  /* values in the range [0:255] */

The const keyword is also very useful for, well, declaring constants. An invaluable built-in operator when pondering the amount of memory assigned to a variable is sizeof, which gives the size of a type or variable in bytes.

In addition to single entities of various types, we can also declare arrays of the self-same intrinsics. The syntax for this is along the lines of:

char cStr[20];           /* a character array/string of 20 chars */ int iMat2d[3][3];        /* a 2-dimensional matrix of integers - 3x3 */

You'll see a good deal more of accessing the various elements of an array in later examples, but for now be satisfied with the knowledge that array indices start at 0 in C (yes, that's right Fortraners, that's zero, not 1) and that the syntax for array access is, e.g.:

cStr[0] = 'h'; /* first element set to ascii char code for 'h' */

Enumerated types can be a useful way to map (a list of) symbolic names to integer values.

Now that you've read it through, run the program and satisfy yourself that it all works as you expect it to. To run the program, type:

./types.exe

Shifting our attention to operations.c, let's consider some basic operations that C supports. This is the start of the doing things part.

The first block of code here gives an arithmetic example--how to calculate the volume of an oblate spheroid that happens to be close to all our hearts, our shared home Earth:

val = (4.0/3.0) * pi * pow(equi_rad,2) * pol_rad;

I won't dwell on this as I'm confident that the syntax is self-explanatory, save to mention that the function pow comes from the standard library of maths functions, declared in the header file math.h.

Next up, you'll see the decrement and increment operators:

--numPlanets;
++numPlanets;

also self-explanatory.

C provides the logic operators, == (is equal), != (not equal), && (AND) and || (OR); as well as the relationals, > (greater than), < (less than), >= (greater than or equal) and <= (less than or equal).

An operation that you will become keenly aware of, especially when working in scientific computing, is the ability to temporarily convert a variable from one type to another on-the-fly. This is known as casting. Two examples of this are:

(short int) pi
(float) 42

where, in the first, we convert pi into a (short) integer, discarding the fractional part, and in the second we convert 42 into a floating point number. Note that the cast does not affect the original variable in any way, i.e. the value given to the variable called pi is not changed through using the cast.

One last class of operations for now is the bitwise operators. These give you very low-level control over the individual bits of a variable, should you need that. For example, we can perform a bitwise AND (the & operator) on the two bytes 01001000 and 10111000, yielding 00001000 when all the bit pairs are considered in turn according to the criterion that a result bit is 1 only if both corresponding input bits are 1, and 0 otherwise.

To run the second program, type:

./operations.exe

Now, it's very important that you muck around with these example programs as much as possible! Ideally, so much so that you break them! We never learn as much as when we make a mess of things, and since these are just toy programs, you may as well go for it! If you get in a pickle, you can get the original programs back with a quick waft of the Subversion wand:

svn revert *

Exercises

types.c
 * Declare a character array sufficient to record the state of a game of noughts and crosses, populate it and print it to the screen.
 * How many bytes are used to store a long double?
 * You can give an initial value to a character array when you declare it (e.g. char cStr[20] = "xxxxxxxxxxxxxxxxxxxx";). What happens if we leave '\0' out of character assignments in this case?

operations.c
 * 29.2% of the Earth's surface is land. How much is this in square kilometers?
 * Logic is perilous. Can you think of a time when we say "or", but really mean logical AND?
 * What happens if you cast the character '9' to an int?

=Conditionals & Loops=

OK, we have types and operators under our belts. This C malarkey isn't too bad, eh? Let's take a look at some stalwarts of the procedural family of languages: conditionals and loops. As we will start all our sections, move up and over to the example3 directory and build the program(s) therein:

cd ../example3
make

Looking inside flow.c, our first block shows how we can make multi-way decisions using if tests and the else catch-all:

if ( temperature < 0.0 ) {
    /* the %f in the format string needs a matching argument */
    printf("Water would normally freeze at:\t%f\tdegrees C\n", temperature);
}
else if ( temperature > 100.0 ) {
    printf("Water would normally boil at:\t%f\tdegrees C\n", temperature);
}
else {
    printf("The temperature must be in the range [0.0,100.0]\n");
}

This is all very nice and self-explanatory. Typically you would use the above for a decision point that could follow one of three or fewer branches. If you have more than three branches, and the decision depends on the value of a single integer expression, the switch statement is likely to be more concise and easier for you and your fellow developers to read:

switch (iCount) {
case 0:
    printf("case 0: nada, zip, nowt.\n");
    break;
case 1:
    printf("case 1: uno, sole, unitary.\n");
    break;
...
default:  /* a default catches any value not matched by a case */
    printf("default: mucho, many, lashings.\n");
    break;
}

The default case is much like our else catch-all in the box above and is important to include, as otherwise a value that we did not consider will silently match none of the cases and nothing will be done. You will also notice the break statements in all the cases. Adding these is also a defensive manoeuvre: without a break, execution 'falls through' into the code for the cases that follow, so we could accidentally run the code for two cases. Case 4 and the default, say.

Moving on. The for loop is an oft-used tool on the workbench:

for(ii=0; ii<iMax; ii++) {
    if (ii == 3) {
        printf("Surprise!\n");
        continue; /* jumps to the start of the next iteration */
    }
    printf("Yup, I'm in a for loop, whizz-oh.  Counter is:\t%d\n", ii);
}

It's tidy, succinct and gets the job done. Note that we've nested an if statement inside our loop. The continue statement is a useful way to skip the rest of an iteration, if it's superfluous.

Sometimes, however, we don't know ahead of time how many iterations of a loop will be required. A for loop is an awkward fit in this case and the while loop steps into the breach for us. For example:

while (ii > threshold) {
    printf("%d\t> threshold, continuing..\n", ii);
    ii = rand();                 /* get next random number */
    printf("next random value:\t%d\n", ii);
}

In this case we keep testing to see if ii is greater than the threshold. If it is, then we go around the loop one more time, acquiring a new value for ii along the way. We loop back to the top, re-test against the threshold and so on. The loop will only terminate when ii is less than or equal to the threshold, i.e. when the while test fails, so watch out for those infinite loops!

To run the example program, type:

./flow.exe

Exercises


 * Can you nest an if within another if and what would be the point? Indeed can you have an if within a loop, within an if..?
 * What happens if you remove break statements from the switch construct?
 * Can you write a for loop that counts down rather than up? What about in steps of 2, or 3?
 * Can you increment more than one variable in a for loop?
 * Can you have multiple test conditions in a loop?
 * What's the simplest infinite loop you can write? Do you know how to abort a program?!
 * Can you sabotage the counting in a for loop? Is there a way to protect against such a bug?

=The C Preprocessor=

Up until now, we've been studiously ignoring the lines beginning with # at the start of our programs. The time has come, however, to look these statements square in the eyes!..

cd ../example4
make

So far we've glanced upon constructs such as:


#include <stdio.h>

Lines starting with a # form instructions to the C preprocessor. We can think of the preprocessor as a form of cut & paste. In our example, the preprocessor will replace our #include line with the contents of the system header file, stdio.h. Why are we doing this? Well, we wish to use some of the standard input/output library functions, such as printf, in our program and the header file contains the function prototypes. The compiler needs these prototypes to make sure that we are calling the functions correctly and thus to produce a working executable or a compile-time error, whichever is appropriate.

We'll look at header files in more detail when we come to write our own functions.

We can do a good deal more than just including header files, however. For one, we can use a #define statement to set global constants. Take a look inside macros.c and notice how we have specified the size of our character array, called cStr. Outside of the main program we have:


#define MAXSTR 25

Inside the main program, we then make use of our new symbol in our variable declarations block:

char cStr[MAXSTR]; /* a character array with size set globally */

We can arrange to loop over the contents of that array using:

for(ii=0;ii<MAXSTR;ii++) {
    cStr[ii] = 'c';
}

To run the program, type:

./macros.exe

as per usual.

Arranging conditional compilation is perhaps the most useful aspect of the preprocessor. Further down in the main program we have the conditional code block:

#ifdef DEBUG
printf("DEBUG is ON\n");
printf("I'm going to print out a lot more information\n");
printf("Boy-oh-boy am I going to have a lot to say!\n");
#endif

which we can activate through the use of an appropriate compiler flag. In order to do this, uncomment the line:


#CFLAGS=-DDEBUG

in the Makefile, retype make, and re-run.

The preprocessor gives us yet more possibilities, with constructs such as:


#if SYSTEM == WIN32
#include
#elif SYSTEM == LINUX
#include
#else
#include
#endif

However, a word of caution: it is wise not to overuse the preprocessor. For example:
 * It may be better to use a const variable declaration, rather than a global #define.
 * Conditional compilation can be useful, but if you can use run-time switches in your code instead, you will not have to keep re-compiling your programs when you want to vary a parameter, say.

If you're keen, you can see a good use of the preprocessor for setting function names in mixed Fortran-C programming.

Note that we now have 3 distinct stages en route to producing an executable program:
 * 1) The preprocessor step: cut & paste.
 * 2) Compilation: taking source code and creating object code.
 * 3) Linkage: Linking object files and possibly libraries together to give an executable.

Exercises


 * Vary the size of the character array. Note that you'll have to re-compile your program each time.
 * Invent a new block of conditionally-compiled code and make the appropriate changes to the Makefile to bring it into effect.
 * Experiment with the additional #ifdef and #ifndef preprocessor statements.

=Functions & Header Files=

So, onto functions. What are these and why do we use them?

Well, we can think of a function, in some ways, as a black box--we feed in inputs and it returns outputs. An example would be a trigonometric function, such as the sine function. If we input $$\pi/2$$ radians, we'll get 1 as the output; input $$\pi$$ and we'll get 0 back. We're not limited to just mathematical functions in C, however. We can write pretty much any function we like! We'll see many examples cropping up from here onwards.

OK, so much for a function's general form. Motivation-wise, if you need to do something more than once in a program, you should write a function to do it. That way, you just call your function whenever you need to perform that task. That strategy will give us concise programs as we don't need to duplicate any lines of code. Another benefit is that we avoid duplicated lines of code, which are a bug waiting to happen! Why? Well, there is a good chance that if you modify one of those lines of code, you'll forget to change the other. We're humans, after all. We err. Now, those lines are no longer identical and so will no longer do the same thing. Tada, your bug.

Now repeat after me: never duplicate any code, write a function.

Another reason for writing a function, even if you don't call it more than once, is that breaking down your program into functional units will make it much easier to read and understand. This should be your #1 design criterion for any piece of code that you write.

OK, with the preamble out the way, let's take a look at an example:

cd ../example5
make

Inside funcs1.c, you'll see that we compute the volumes for all the planets in the solar system, rather than just for Earth. Accordingly, we bundle the volume calculation into a function of its own:

double volume(double equitorial_rad, double polar_rad)
{
    /* local variables */
    const double pi = 3.14159265;
    double       retval;

    /* the calcs */
    retval = (4.0/3.0) * pi * pow(equitorial_rad,2) * polar_rad;

    /* functions typically return a value */
    return retval;
}

and call it a number of times as we cycle through the planets:

for(ii=0; ii<NumPlanets; ii++) {
    val = volume(equi_rad[ii], pol_rad[ii]);
    printf("the volume of the planet is:\t%f\tkms cubed\n", val);
}

Note also the presence of a function prototype near the top of the file:

double volume(double equitorial_rad, double polar_rad);

We'll need one of those for each function that we write. To run the program, type:

./funcs1.exe

The eagle-eyed amongst you will have noticed that the const variable pi is no longer needed in the main program unit. Also, the variable val is declared inside the main program unit and the function. Both of these things allude to something called scope, and in particular that variables declared inside a function are only known to that function. This rule also applies to the main function. Thus, if we want to pass values between functions, we must use arguments and return values. (Deliberately ignoring global values, as they are typically considered to be bad news.)

While we're on the topic of C function arguments, they are what's known as passed-by-value. Beware, this contrasts with passed-by-reference, as used in Fortran, for example. What's the significance of this? Well, in C a copy of the value of the argument is passed into a function. That means that you can do anything you like to its value inside the function, but it will all be forgotten upon exiting. Pass-by-reference means that the actual memory address of the argument is passed into the routine and so any changes to its value will stick. We'll look at this topic more closely when we consider memory addresses and pointers later on.

Exercises


 * Modify funcs1.c so that the equatorial radius argument (equitorial_rad) is zeroed inside the function. Write a second loop to investigate the consequences of that inside the main program.
 * Write an additional function to calculate the surface area of a planet and print the results of applying that function too.

=Arrays & Pointers=

address, dereference address arith 2d arrays binary trees and linked lists - just give examples

=Structures=

DAB again

watch out for padding

=The Command Line and I/O=

=Further Reading=

The bible is The C Programming Language by Kernighan & Ritchie.