
From SourceWiki

Revision as of 14:48, 24 August 2009

startingC: Learning the C Programming Language

Introduction

svn co http://source.ggy.bris.ac.uk/subversion-open/startingC/trunk ./startingC

A Quintessential First Program

OK, now that we have the example code, let's get cracking and run our first C program. First of all, move into the example directory:

cd startingC/examples/example1

We'll use a Makefile for each example, so as to make the build process painless (hopefully!). All we need do is run make (see the make tutorial if you're interested in this further):

make

Now, we can run the classic program:

./hello.exe

and you should get the friendly response:

hello, world!

Bingo! We've just surmounted--in some ways--our hardest step: running our first C program. Given this quantum leap, everything else will be plain sailing from here, honest!

Types & Operations

Buoyed with confidence from our first example, let's march fearlessly onwards into the realm of variable types and basic operations. To do this, move up and over to the directory example2 and type make to build the example programs:

cd ../example2
make

Take a look inside types.c and after the start of the main function, you'll see a block of variable declarations:

  char   nucleotide;        /* A, C, G or T for our DNA */
  int    numPlanets;        /* eight in our solar system - poor old Pluto! */
  float  length;            /* e.g.  1.8288m, for a 6' snooker table */
  double accum;             /* an accumulator */

C, like many languages (e.g. Fortran), requires that variables be declared to be of a certain type before they can be used, and here we see examples of four intrinsic types provided by the language. It's a very good habit to comment all your variable declarations, and here the comments pretty much explain what the various types are. double is a double precision floating point number, which typically takes twice the storage space of a float. The extra space makes a double a good choice for an accumulator where you want to minimise rounding errors and avoid under- and overflow as best as possible. (The Fortran programmers amongst us will note, with a wince, the absence of an intrinsic type for complex numbers in traditional C. Those reeling from this revelation will be comforted by the knowledge that C99 adds complex types through the complex.h header and that C++ contains a complex class.)

Various types can be given further qualifiers, such as short, long, signed and unsigned:

  short int mini;           /* typically two bytes */ 
  long  int maxi;           /* typically eight bytes */ 
  signed char cSigned;      /* one byte, values in the range [-128:127] */
  unsigned char cUsigned;   /* values in the range [0:255] */

The const keyword is also very useful for, well, declaring constants. An invaluable built-in operator when pondering the amount of memory assigned to a variable is sizeof.

In addition to single entities of various types, we can also declare arrays of the self-same intrinsics. The syntax for this is along the lines of:

  char cStr[20];            /* a character array/string of 20 chars */
  int  iMat2d[3][3];        /* a 2-dimensional matrix of integers - 3x3 */

You'll see a good deal more of accessing the various elements of an array in later examples, but for now be satisfied with the knowledge that array indices start at 0 in C (yes, that's right Fortraners, that's zero, not 1) and that the syntax for array access is, e.g.:

  cStr[0] = 'h';  /* first element set to the ASCII character code for 'h' */

Enumerated types can be a useful way to map (a list of) symbolic names to integer values.

Now that you've read it through, run the program and satisfy yourself that it all works as you expect it to. To run the program, type:

./types.exe

Shifting our attention to operations.c, let's consider some basic operations that C supports. This is the start of the doing things part.

The first block of code here gives an arithmetic example--how to calculate the volume of an oblate spheroid that happens to be close to all our hearts, our shared home Earth:

val = (4.0/3.0) * pi * pow(equi_rad,2) * pol_rad;

I won't dwell on this as I'm confident that the syntax is self-explanatory, save to mention that the function pow comes from the standard library of math functions, declared in the header file math.h.

Next up, you'll see the decrement and increment operators:

  --numPlanets;
  ++numPlanets;

also self-explanatory.

C provides the equality operators, == (is equal) and != (not equal); the logical operators, && (AND), || (OR) and ! (NOT); as well as the relationals, > (greater than), < (less than), >= (greater than or equal) and <= (less than or equal).

An operation that you will become keenly aware of--especially working in scientific computing--is the ability to temporarily convert a variable from one type to another on-the-fly. This is known as casting. Two examples of this are:

  (short int) pi
  (float) 42

where, in the first, we convert pi into a (short) integer, discarding the fractional part, and, in the second, convert 42 into a floating point number. Note that the cast does not affect the original variable in any way, i.e. the value given to the variable called pi is not changed through using the cast.

One last class of operations for now are the bitwise operators. These give you very low-level control over the individual bits of a variable, should you need that. For example, we can perform a bitwise AND on the two bytes 01001000 and 10111000, yielding 00001000 when all the bit pairs are considered in turn according to the truth table:

  INPUT     OUTPUT
  A   B     A AND B
  0   0        0
  0   1        0
  1   0        0
  1   1        1

To run the second program, type:

./operations.exe

Now, it's very important that you muck around with these example programs as much as possible! Ideally, so much so that you break them! We never learn as much as when we make a mess of things, and since these are just toy programs, you may as well go for it! If you get in a pickle, you can get the original programs back with a quick waft of the Subversion wand:

svn revert *

Exercises

types.c

  • Declare a character array sufficient to record the state of a game of noughts and crosses, populate it and print it to the screen.
  • How many bytes are used to store a long double?
  • You can give an initial value to a character array when you declare it (e.g. char cStr[20] = "xxxxxxxxxxxxxxxxxxxx";). What happens if we leave '\0' out of character assignments in this case?

operations.c

  • 29.2% of the Earth's surface is land. How much is this in square kilometers?
  • Logic is perilous. Can you think of a time when we say "or", but really mean logical AND?
  • What happens if you cast the character '9' to an int?

Conditionals & Loops

OK, we have types and operators under our belts. This C malarkey isn't too bad, eh? Let's take a look at some stalwarts of the procedural family of languages--conditionals and loops. As we will start all our sections, move up and over to the example3 directory and build the program(s) therein:

cd ../example3
make

Looking inside flow.c, our first block shows how we can make multi-way decisions using if tests and the else catch-all:

  if ( temperature < 0.0 ) {
    /* each %f in the format string needs a matching argument */
    printf("Water would normally freeze at:\t%f\tdegrees C\n", temperature);
  }
  else if (temperature > 100.0 ) {
    printf("Water would normally boil at:\t%f\tdegrees C\n", temperature);
  }
  else {
    printf("The temperature must be in the range [0.0,100.0]\n");
  }

This is all very nice and self-explanatory. Typically you would use the above for a decision point that could follow one of three or fewer branches. If you have more than three branches, and the choice depends on the value of a single integer expression, the switch statement is likely to be more concise and easier for you and your fellow developers to read:

  switch (iCount) {
  case 0:
    printf("case 0: nada, zip, nowt.\n");
    break;
  case 1:
    printf("case 1: uno, sole, unitary.\n");
    break;
...
  default:  /* a default catches any value we did not list */
    printf("default: mucho, many, lashings.\n");
    break;
  }

The default case is much like our else catch-all in the box above and is important to include, as otherwise the switch will silently do nothing when none of the cases match, because we did not consider the actual value passed to switch(). You will also notice the break statements in all the cases. These matter too: without a break, execution 'falls through' into the code for the next case, so we could accidentally run two cases. Case 4 and the default, say.


Moving on. The for loop is an oft-used tool on the workbench:

  for(ii=0; ii<iMax; ii++) {
    if (ii == 3) {
      printf("Surprise!\n");
      continue;  /* jumps to the start of the next iteration */
    }  
    printf("Yup, I'm in a for loop, whizz-oh.  Counter is:\t%d\n", ii);
  }

It's tidy, succinct and gets the job done. Note that we've nested an if statement inside our loop. The continue statement is a useful way to skip the rest of an iteration, if it's superfluous.

Sometimes, however, we don't know ahead of time how many iterations of a loop will be required. A counting for loop is a poor fit in this case and the while loop steps into the breach for us. For example:

  while (ii > threshold) {
    /* two %d specifiers to match the two arguments */
    printf("%d\t> threshold (%d), continuing..\n", ii, threshold);
    ii = rand();                    /* get next random number */
    printf("next random value:\t%d\n", ii);
  }

In this case we keep testing to see if ii is greater than the threshold. If it is, then we go around the loop one more time, acquiring a new value for ii along the way. We loop back to the top, re-test against the threshold and so on. The loop will only terminate when ii is less than or equal to the threshold, i.e. when the while test fails, so watch out for those infinite loops!

To run the example program, type:

./flow.exe

Exercises

  • Can you nest an if within another if and what would be the point? Indeed can you have an if within a loop, within an if..?
  • What happens if you remove break statements from the switch construct?
  • Can you write a for loop that counts down rather than up? What about in steps of 2, or 3?
  • Can you increment more than one variable in a for loop?
  • Can you have multiple test conditions in a loop?
  • What's the simplest infinite loop you can write? Do you know how to abort a program?!
  • Can you sabotage the counting in a for loop? Is there a way to protect against such a bug?

The C Preprocessor

Up until now, we've been studiously ignoring the lines beginning with # at the start of our programs. The time has come, however, to look these statements square in the eyes!..

cd ../example4
make

So far we've glanced upon constructs such as:

#include <stdio.h>

Lines starting with a # form instructions to the C preprocessor. We can think of the preprocessor as a form of cut & paste. In our example, the preprocessor will replace our #include line with the contents of the system header file, stdio.h. Why are we doing this? Well, we wish to use some of the standard input/output library functions, such as printf(), in our program and the header file contains the function prototypes. The compiler needs these prototypes to make sure that we are calling the functions correctly and thus to produce a working executable or a compile-time error--whichever is appropriate.

We'll look at header files in more detail when we come to write our own functions.

We can do a good deal more than just including header files, however. For one, we can use a #define statement to set global constants. Take a look inside macros.c and notice how we have specified the size of our character array, called cStr. Outside of the main program we have:

#define MAXSTR 25

Inside the main program, we then make use of our new symbol in our variable declarations block:

char cStr[MAXSTR];  /* a character array with size set globally */

We can arrange to loop over the contents of that array using:

  for(ii=0;ii<MAXSTR;ii++) {
    cStr[ii] = 'c';
  }

To run the program, type:

./macros.exe

as per usual.

Arranging conditional compilation is perhaps the most useful aspect of the preprocessor. Further down in the main program we have the conditional code block:

#ifdef DEBUG
  printf("DEBUG is ON\n");
  printf("I'm going to print out a lot more information\n");
  printf("Boy-oh-boy am I going to have a lot to say!\n");
#endif 

which we can activate through the use of an appropriate compiler flag. In order to do this, uncomment the line:

#CFLAGS=-DDEBUG

in the Makefile, retype make, and re-run.

The preprocessor gives us yet more possibilities, with constructs such as:

#if SYSTEM == WIN32
#include <win.h>
#elif SYSTEM == LINUX
#include <linux.h>
#else
#include <default.h>
#endif

However, a word of caution: it is wise not to overuse the preprocessor. For example:

  • It may be better to use a const variable declaration, rather than a global #define.
  • Conditional compilation can be useful, but if you can use run-time switches in your code instead, you will not have to keep re-compiling your programs when you want to vary a parameter, say.

If you're keen, you can see a good use of the preprocessor for setting function names in mixed Fortran-C programming.


Note that we now have 3 distinct stages en route to producing an executable program:

  1. The preprocessor step: cut & paste.
  2. Compilation: taking source code and creating object code.
  3. Linkage: Linking object files and possibly libraries together to give an executable.

Exercises

  • Vary the size of the character array. Note that you'll have to re-compile your program each time.
  • Invent a new block of conditionally-compiled code and make the appropriate changes to the Makefile to bring it into effect.
  • Experiment with the additional #ifdef and #ifndef preprocessor statements.

Functions & Header Files

So, onto functions. What are these and why do we use them?

Well, we can think of a function, in some ways, as a black box--we feed in inputs and it returns outputs. An example would be a trigonometric function, such as the sine function. If we input $\pi/2$ radians, we'll get 1 as the output; input $\pi$ and we'll get 0 back. We're not limited to just mathematical functions in C, however. We can write pretty much any function we like! We'll see many examples cropping up from here onwards.

OK, so much for a function's general form. Motivation-wise, if you need to do something more than once in a program, you should write a function to do it. That way, you just call your function whenever you need to perform that task. That strategy will give us concise programs as we don't need to duplicate any lines of code. Another benefit is that duplicate lines of code are a bug waiting to happen! Why? Well, there is a good chance that if you modify one of those lines of code, you'll forget to change the other. We're humans, after all. We err. Now, those lines are no longer identical and so will no longer do the same thing--tada, your bug.

Now repeat after me, never duplicate any code--write a function.

Another reason for writing a function, even if you don't call it more than once, is that breaking down your program into functional units will make it much easier to read and understand. This should be your #1 design criterion for any piece of code that you write.

Arrays & Pointers

Topics to cover: addresses and dereferencing, address arithmetic, 2D arrays, binary trees and linked lists (just give examples).

Structures

DAB again

watch out for padding

The Command Line and I/O

Further Reading

The bible is The C Programming Language by Kernighan & Ritchie.