Tensten's Journal

Sunday, April 16, 2006

Building robust software, part 1

A few years ago, I got a call from a former employer looking for tips from the game industry on software quality that might apply to the automotive industry. That original discussion never turned into the imagined series of speaking tours, preaching to the car makers about how they should craft their software. Instead, I've had years to stew over the lessons I've learned on engineering practice, without an outlet. This is the first of a series of articles about the subject. The goal is to codify some ideas about how to build a robust piece of software.

A word on languages: I'm talking about C/C++ here. Since those are the languages I work in most often, they're where I feel the greatest comfort making comments. They are fairly low-level languages that expose more than hide the complexities of software development. Higher level languages like Java, C#, Python and the like are built to hide some of the difficulty. In my experience, though, if you do not understand the steps they are taking to hide complexity, you can get yourself into just as much trouble as in the low-level languages.

By far the most common source of errors and crashes I've seen in C/C++ programs is the unchecked use of invalid pointers. (Are there any non-programmers reading this? A pointer is just the address of a region of memory.) Pointers are the classic double-edged sword. A fundamental element of programming, they are both powerful and dangerous. Pointer problems largely fall into one of three categories.


  • Not checking return values. This one is so common, they actually teach it in school. You ask for a pointer to an address, and then use it without making sure it points where you think it does. Even a read through an invalid pointer can crash you. Is it non-zero? Does it point somewhere inside your program's address space? Is the thing on the other side of it still what you think it is?

  • Uninitialized memory. Suppose you made a structure, I ask for the value of a pointer in it, and I discover that you never initialized that field. What am I getting back? What is the value of uninitialized memory? Sadly, it can look like anything. Now obviously you should initialize your pointers to known values if anyone (even you) can get at them later. But we also want a way to recognize uninitialized values for what they are. That way, when the operating system tells me my program has been shut down because it tried to access (read or write) memory location 0xABABABAB, I know exactly why.

  • Keeping a pointer to memory you no longer own. This is the number one source for bad data in the middle of an object that, I swear, was good just a second ago. Problems here are especially common in systems with freelist-style memory allocators that tend to hand out the address most recently returned to them. Consider this example:


#include <stdlib.h>
#include <stdio.h>

typedef struct _A
{
    int val;
} A;

typedef struct _B
{
    int* val_ptr;
} B;

static void kill_me( A* a );
static B* b = 0; // NULL is a macro for a null pointer constant; plain 0 works too

int main( )
{
    A* a = (A*) malloc( sizeof( A ) );
    if( a )
    {
        kill_me( a );
        a->val = 1; // deliberate bug: a no longer points to valid memory
    }

    if( b && b->val_ptr )
    {
        printf( "B's val: %i\n", *b->val_ptr );
    }
    free( b );
    return( 0 );
}

void kill_me( A* a )
{
    free( a );
    b = (B*) malloc( sizeof( B ) );
    // because A and B are similar in size (identical on 32-bit
    // targets), B's malloc may well return the address A had,
    // even if there's something after A on the heap
    if( b ) // check this return value too
    {
        b->val_ptr = 0;
    }
}

I can't say what this program will do when it runs. If b ends up at the same address as a, the write through a overwrites b's val_ptr, and the program will crash when it dereferences the value 1 as if it were a valid pointer. Then again, the addresses could be different, and it will appear to work, hiding the bug until 4 AM the night before you're ready to ship. This is an admittedly contrived example, but problems like this happen all the time in software development, often in seemingly innocuous cleanup code.
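One common way to make this kind of bug fail the same way every time is to give up the pointer at the moment the memory is given up. What follows is only a minimal sketch of that idea, not a complete answer: release_a and set_val are hypothetical helpers invented for illustration, and they are not part of the program above.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct A
{
    int val;
} A;

/* Hypothetical helper: frees *pp and resets the caller's own pointer,
   so a later use sees a null pointer instead of recycled memory. */
static void release_a( A** pp )
{
    assert( pp != NULL ); /* caller must pass the address of its pointer */
    free( *pp );          /* free( NULL ) is a no-op, so a repeat call is safe */
    *pp = NULL;
}

/* Hypothetical checked accessor: refuses to write through a null pointer.
   Returns 1 if the value was stored, 0 if there was nothing to write to. */
static int set_val( A* a, int val )
{
    if( a == NULL )
    {
        return 0;
    }
    a->val = val;
    return 1;
}
```

Note that the helper takes the address of the pointer. In the example above, kill_me receives a copy of a, so nothing it does could reset the pointer held in main. Note also the limit of the trick: it only clears the one pointer you hand it, and any other copies of the address are still dangling.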

A dream system will provide an easy way of resolving all those problems. In part 2 of this series, we'll try to develop such a system.
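Until then, here is a rough sketch of the idea behind the first two categories: check what the allocator hands back, and fill fresh memory with a recognizable byte so that an uninitialized field announces itself. The names alloc_filled, is_fill_pattern and FILL_BYTE are assumptions made up for this illustration, and 0xAB is chosen only to match the 0xABABABAB address mentioned earlier. Any byte you can recognize on sight would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed fill byte; repeated across four bytes it reads 0xABABABAB. */
#define FILL_BYTE 0xAB

typedef struct B
{
    int* val_ptr;
} B;

/* Hypothetical malloc wrapper: returns NULL on failure, otherwise memory
   pre-filled with FILL_BYTE rather than whatever was left there before. */
static void* alloc_filled( size_t size )
{
    void* p = malloc( size );
    if( p != NULL )
    {
        memset( p, FILL_BYTE, size );
    }
    return p;
}

/* Returns 1 if every byte in [mem, mem + size) still holds FILL_BYTE,
   which suggests the field was never initialized; returns 0 otherwise. */
static int is_fill_pattern( const void* mem, size_t size )
{
    const unsigned char* bytes = (const unsigned char*) mem;
    size_t i;
    for( i = 0; i < size; ++i )
    {
        if( bytes[ i ] != FILL_BYTE )
        {
            return 0;
        }
    }
    return 1;
}
```

A caller would allocate with alloc_filled( sizeof( B ) ), check the result against NULL, and could then ask is_fill_pattern( &b->val_ptr, sizeof( b->val_ptr ) ) before trusting the field. The check is a heuristic: a legitimately stored value could match the pattern by coincidence, which is why the byte should be one that makes an unlikely address.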

1 Comment:

  • PC-Lint catches all of the above. It's a pain in the posterior, but it works...

    By Anonymous, at 2:50 PM
