Heap Analysis - Making Memory Errors a Thing of the Past

Abstract
Dynamic Memory Management
Problems with Heap Corruption
Common Errors
Detecting and Reporting Errors
Manual Checking (Bounds Checking)
Memory Leaks
Compiler Support
Summary

Abstract

The dynamic management of memory enables complex programs to be developed that can cope with memory usage patterns that differ substantially depending on the nature of the task currently being performed. This comes with some additional complexity for the application programmer due to the additional responsibility of keeping track of the memory the program has allocated and releasing it when it's no longer required. In addition, it's possible for the programmer to lose track of memory resulting in "leaks" or to write code that manipulates a part of the dynamically allocated memory that logically belongs to another part of the code base resulting in an error that manifests itself in an unrelated part of the program, prevents memory from being correctly released to the system, or causes the memory allocator itself to fail.

Conventional debugging techniques rarely provide much in the way of assistance to the developer in locating the sources of such problems due in large part to their tendency to manifest themselves in locations other than the source of the error. This is true of both memory leaks and corruption problems. Both problems are compounded by the use of concurrency in the form of threads because of their ability to interleave execution that affects the dynamically managed memory region and alter the behavior of a sequence of code in unpredictable ways.

This document describes techniques for analyzing and detecting problems related to dynamic memory management using special versions of all functions related to memory management that perform additional runtime checks on the application during debugging and testing.

Dynamic Memory Management

Dynamic memory management allows a program to dynamically request memory buffers or blocks of a particular size from the runtime environment -- using the malloc(), realloc() or calloc() functions -- and release them back to the runtime environment when they're no longer required -- using the free() function. The memory allocator is responsible for satisfying these requests. The memory allocator manages a region of the program's memory area -- the heap -- to satisfy all such requests. The runtime environment grows the size of the heap when it no longer has enough memory available to satisfy allocation requests. It may also return memory from the heap to the system when the program releases memory.

The memory allocator must keep track of information about the heap buffers it has given to the program so that it can make the memory available for subsequent allocation requests. At a minimum, the allocator must know the size of the original block to do this. When releasing a block, the allocator normally places it in a list of available blocks, called a free list. The information that the allocator keeps about the block is normally kept in a header that precedes the block itself in memory.
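As a rough illustration of a header that precedes the block, the sketch below layers a size header on top of the standard malloc(). The names toy_header, toy_alloc(), toy_size() and toy_free() are hypothetical and exist only for this example; a real allocator keeps more information and manages its own memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header stored immediately before each block.
 * The union forces the header to the strictest alignment, so the memory
 * that follows it is suitably aligned for any object type. */
typedef union toy_header {
    size_t size;        /* size the caller asked for */
    max_align_t align;  /* never read; present only for alignment */
} toy_header;

/* Allocate room for the header plus the caller's data and return a
 * pointer to the data, which starts just past the header. */
void *toy_alloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(toy_header))
        return NULL;
    toy_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;
}

/* Step back from the user pointer to the header to recover the size. */
size_t toy_size(const void *p)
{
    assert(p != NULL);
    return ((const toy_header *)p - 1)->size;
}

/* Release the block; the allocation actually starts at the header. */
void toy_free(void *p)
{
    if (p != NULL)
        free((toy_header *)p - 1);
}
```

Because the header sits directly in front of the user data, a write just before the start of the block lands on the stored size, which is why underruns tend to damage the allocator's own records.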

Problems with Heap Corruption

Heap corruption occurs when an application damages the allocator's view of the heap. This may occur as a result of incorrect arguments to a memory allocation function, or as a result of writing beyond the bounds of an allocated block, writing through a stale pointer or writing using an uninitialized pointer. The outcome of heap corruption can be relatively benign, such as causing a memory leak where some memory isn't returned to the heap and is inaccessible to the program afterwards, or it may be fatal causing a fault, usually within the allocator itself.

A memory fault usually occurs within the allocator when it's manipulating its free list(s) after heap corruption has occurred. The problem with identifying the source of such errors is frequently one of dislocality. The source of heap corruption is often dislocated from the source of the fault, because the fault occurs when the program attempts to free memory -- or even worse -- on a subsequent allocation attempt after the memory has been freed. When the problem involves the corruption of the block of freed memory, problem determination is difficult enough because the heap corruption may have occurred well before the release of the block, and if the fault occurs on a subsequent block it's commensurately harder to determine which block was responsible for the fault.

The problem is exacerbated when we consider the problems of contiguous memory blocks and multi-threaded execution. Each of these issues can compound the difficulty associated with determining the actual source of the error. In the case of contiguous blocks, the problem that presents itself is that a program writing outside of the bounds can not only corrupt the allocator's information about the block of memory it's using, it may just as frequently corrupt the allocator's view of the part of the heap that's contiguous with that block, either before or after it, which may or may not be allocated. When this happens a fault occurs in the allocator in an unrelated allocation or release attempt. Likewise, multi-threaded execution may cause any fault to occur in a different thread from the thread that actually corrupted the heap, because threads interleave requests to allocate or release memory.

The dislocality property is what makes conventional debugging techniques difficult to apply to heap corruption problems. A conventional backtrace rarely indicates the source of the problem. In this case, debugging techniques usually focus on applying breakpoints to narrow down the offending section of code, often with conditional breakpoints used to halt execution at a particular invocation of the allocator. This can sometimes be applied successfully to narrow the problem down for single-threaded programs but is often intractable for multi-threaded execution because the fault may occur at an unpredictable time and the act of debugging the program may influence the appearance of the fault by altering the way that thread execution is interleaved. Even when the source of the error has been narrowed down, there may be a substantial amount of manipulation performed on the block before it's released, particularly for long-lived heap buffers.

A final item to be considered is that a seemingly benign source of errors may prove to be fatal under even slightly different conditions. For example, a program that works under a particular memory allocation strategy may abort if the allocation strategy is changed in only minor ways. A good example of this is memory overrun conditions. The allocator is free to return blocks that are larger than requested to satisfy allocation requests. Under this circumstance, the program may behave normally in the presence of overrun conditions. A simple change, such as changing the size of the block requested, however, may result in the allocation of a block of the exact size requested, which results in a fatal error for the offending program. This may also occur if the allocator is configured slightly differently or the allocator policy is changed in a subsequent release of the runtime library. This makes it all the more important to detect errors early in the lifecycle of an application, even if it doesn't exhibit fatal errors in the testing phase.
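The arithmetic behind this effect can be sketched as follows, assuming an allocator that rounds every request up to a multiple of a fixed granule. The 16-byte granule and the names toy_rounded_size() and toy_slack() are assumptions made for illustration; real allocators use their own rounding rules.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_GRANULE 16u  /* assumed rounding unit; real allocators differ */

/* Size of the block a granule-rounding allocator would hand out. */
size_t toy_rounded_size(size_t requested)
{
    return (requested + TOY_GRANULE - 1) / TOY_GRANULE * TOY_GRANULE;
}

/* Bytes past the requested size that still fall inside the block. */
size_t toy_slack(size_t requested)
{
    size_t actual = toy_rounded_size(requested);
    assert(actual >= requested);
    return actual - requested;
}
```

Under these assumptions a 13-byte request leaves 3 bytes of slack, so a 3-byte overrun goes unnoticed. Changing the request to 16 bytes leaves no slack, and the same overrun now lands on whatever follows the block.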

Common Errors

Heap corruption problems frequently involve assignments to memory that corrupt the header of an allocated block. They may also involve the incorrect use of a function in the memory allocation routines. An efficient allocator may make use of assumptions to avoid keeping additional memory for validity information and avoid costly runtime checking in some circumstances. Providing invalid information to a request such as free() can cause a fatal error for such an allocator. Even the most robust allocator can fall prey to such problems occasionally.

This section outlines some of the most frequent sources of heap corruption problems.

Overrun/Underrun

Overrun and underrun occur when the program writes outside of the bounds of the allocated block. They are frequently the most difficult types of heap corruption to track down, and the most damaging to program execution.

In the case of overrun, the program writes past the end of the allocated block. This may cause the corruption of the next contiguous block in the heap, whether it's allocated or not. The behavior observed in this case varies depending on whether that block is allocated or free, and whether it's associated with a part of the program related to the source of the error. Corruption of a neighboring allocated block usually manifests itself when that block is released elsewhere in the program. Corruption of an unallocated block usually results in a fatal error on some subsequent allocation request. Although this may well be the next allocation request, it's actually dependent on a complex set of conditions that could result in a fault at a much later point in time, in a completely unrelated section of the program, especially when small blocks of memory are involved.

Underrun occurs when the program writes before the start of the allocated block. This often corrupts the header of the block itself, and may also corrupt the preceding block in memory. Underrun errors usually result in a fault occurring when the program attempts to release the corrupted block.

Requests to free()

Requests to release memory can frequently cause heap corruption because they involve the program keeping track of the pointer for the allocated block and passing that pointer to the free() function. If the pointer is stale, or it doesn't point to the exact start of the allocated block it can create problems.

A duplicate request to free() involves passing a stale pointer to the free() function. A pointer is stale when it refers to a block of memory that's already been released. As such, there's no way to know whether the pointer is referring to unallocated memory or memory that's been used to satisfy an allocation request to another part of the program. Passing a stale pointer to free() may result in a fault in the allocator. Perhaps even worse, passing a stale pointer could release a block that's been used to satisfy another allocation request. The code that made that allocation request could then compete with another section of code that subsequently allocated the same region of heap, resulting in corrupted data for one or both. The most effective way to avoid this error is to NULL out pointers when the block is released, but this is all too uncommon, and is difficult to do when pointers are aliased in any way.
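A minimal sketch of the "NULL out the pointer" habit is shown here. FREE_AND_NULL and release_twice() are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: release the block and clear the caller's pointer
 * so that a later free() of the same variable is a harmless no-op. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)

/* Releases the same variable twice; returns 1 if it ended up cleared. */
int release_twice(void)
{
    char *buf = malloc(32);
    if (buf == NULL)
        return 0;
    FREE_AND_NULL(buf);
    assert(buf == NULL);
    FREE_AND_NULL(buf);  /* free(NULL) is defined to do nothing */
    return buf == NULL;
}
```

The macro clears only the variable it's given. Any other copy of the pointer remains stale, which is the aliasing limitation noted above.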

The second common source of errors is to attempt to release an interior pointer (i.e. one that's somewhere inside the allocated block rather than at the beginning). This isn't a legal operation, but it may occur when the pointer has been used in conjunction with pointer arithmetic. The result of providing an interior pointer is highly dependent on the allocator and is largely unpredictable, but it frequently results in a fault in the free() call.

A rarer source of errors is passing an uninitialized pointer to free(). If the uninitialized pointer is an automatic (stack) variable, it may point to a heap buffer, causing the types of coherency problems described for duplicate free() requests above. If the pointer contains some other non-NULL value, it may cause a fault in the allocator.

Using Uninitialized/Stale Pointers

The use of uninitialized pointers or stale pointers can result in the corruption of data in a heap buffer that's allocated to some other part of the program. It may also result in the same kind of heap corruption associated with memory overrun and memory underrun errors.

Detecting and Reporting Errors

The goal in any attempt to detect heap corruption problems is to correctly identify the source of the error, rather than getting a fault in the allocator at some later point in time. A first step to achieving this goal is to create an allocator that's more robust than the conventional allocator at determining whether the heap has been corrupted on every entry into the allocator, whether for an allocation request or a release request. For example, on a release request, the allocator should be capable of determining whether the pointer given to it is valid, whether the associated block's header has been corrupted, and whether either of the neighboring blocks has been corrupted.

The first part of this goal can be achieved through the use of a replacement library for the allocator that keeps additional block information in the header of every heap buffer. The replacement library can be used during the testing of the application to help isolate any heap corruption problems. When a source of heap corruption is detected by this allocator, it can print an error message indicating:

the point at which the error was detected
the program location that made the request
information about the heap buffer that contained the problem.

The library technique can be refined by helping to detect some of the sources of errors that may still elude detection, such as memory overrun or underrun errors that occur before the corruption is detected by the allocator. This may be done when the standard libraries are the vehicle for the heap corruption, such as an errant call to memcpy(), for example. In this case, the standard memory manipulation functions and string functions can be replaced with versions that make use of the information in the debugging allocator library to determine if their arguments reside in the heap, and whether they would cause the bounds of the heap buffer to be exceeded. Under these conditions, the function can then call the error reporting functions to provide information about the source of the error.
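The idea of a memory function that consults allocation records before copying might be sketched like this. The table toy_blocks and the functions toy_track() and toy_checked_memcpy() are hypothetical stand-ins written for this example; the debugging library uses its own header information rather than a separate table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_BLOCKS 16

/* Hypothetical record of one tracked buffer. */
struct toy_block {
    const char *base;
    size_t size;
};

static struct toy_block toy_blocks[TOY_MAX_BLOCKS];
static size_t toy_block_count;

/* Remember a buffer so that later copies into it can be checked. */
int toy_track(const void *base, size_t size)
{
    assert(base != NULL);
    if (toy_block_count == TOY_MAX_BLOCKS)
        return -1;
    toy_blocks[toy_block_count].base = base;
    toy_blocks[toy_block_count].size = size;
    toy_block_count++;
    return 0;
}

/* Copy n bytes unless dst lies in a tracked buffer that is too small.
 * Returns 0 after copying, or -1 after reporting an overrun. */
int toy_checked_memcpy(void *dst, const void *src, size_t n,
                       const char *file, int line)
{
    uintptr_t d = (uintptr_t)dst;
    for (size_t i = 0; i < toy_block_count; i++) {
        uintptr_t lo = (uintptr_t)toy_blocks[i].base;
        if (d >= lo && d - lo < toy_blocks[i].size) {
            size_t room = toy_blocks[i].size - (size_t)(d - lo);
            if (n > room) {
                fprintf(stderr, "%s:%d: copy of %zu bytes overruns buffer "
                        "with %zu bytes left\n", file, line, n, room);
                return -1;
            }
            break;
        }
    }
    memcpy(dst, src, n);  /* untracked or in-bounds destination */
    return 0;
}
```

Destinations that aren't found in the table are copied without any check, which mirrors the point that only arguments residing in the heap can be validated.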

Using the `malloc_g` Library

The malloc_g library provides the capabilities described above. The malloc_g library can be used by adding -lmalloc_g to the link command when building the program as a replacement for the traditional allocation routines. When used in this way, malloc_g provides a minimal level of checking by default. When an allocation or release request is performed, the library checks only the immediate block under consideration and its neighbors looking for sources of heap corruption.

Additional checking and more informative error reporting can be done by using additional calls provided by the malloc_g library. The mallopt() function has been modified to provide control over the types of checking performed by the library. There are also debug versions of each of the allocation and release routines that can be used to provide both file and line information during error reporting. In addition to reporting the file and line information about the caller when an error is detected, the error reporting mechanism prints out the file and line information that was associated with the allocation of the offending heap buffer.

To control the use of the malloc_g library and obtain correct prototypes for all of its entry points, it's necessary to include the library's own header file, <malloc_g/malloc.h>.

The recommended practice for using the library is to always make use of the library for debug variants in builds. In this case the macro used to identify the debug variant in C code should trigger the inclusion of the malloc_g header file, and malloc_g should always be added to the link. In addition, you may want to follow the practice of always adding an exit handler that provides a dump of leaked memory, and initialization code that turns on a reasonable level of checking for the debug variant of the program.
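One possible shape for this practice is sketched below. It assumes the mallopt() and malloc_dump_unreferenced() signatures shown elsewhere in this document; the DEBUG_HEAP macro and the function names heap_debug_init() and dump_leaks() are hypothetical, and the choice of checks to enable is only an example.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef DEBUG_HEAP                 /* hypothetical debug-variant macro */
#include <malloc_g/malloc.h>

/* Exit handler: report unreferenced buffers on stderr (descriptor 2). */
static void dump_leaks(void)
{
    malloc_dump_unreferenced(2, 0);
}
#endif

/* Call once at the start of main(). Returns 1 if checking was enabled. */
int heap_debug_init(void)
{
#ifdef DEBUG_HEAP
    union malloptarg opt;
    int rc;

    opt.i = 1;
    mallopt(MALLOC_FILLAREA, opt);   /* guard checks on new buffers */
    mallopt(MALLOC_CKACCESS, opt);   /* check memory and string calls */
    rc = atexit(dump_leaks);
    assert(rc == 0);
    return 1;
#else
    return 0;                        /* non-debug build: nothing to do */
#endif
}
```

When the macro isn't defined, the function compiles to a stub, so the same call can stay in the program for every build variant.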

The malloc_g library achieves what it needs to do by keeping additional information in the header of each heap buffer. The header information includes an additional 36 bytes for keeping doubly-linked lists of all allocated blocks, file, line and other debug information, flags and a CRC of the header. The allocation policies and configuration are identical to the normal system memory allocation routines except for the additional internal overhead imposed by malloc_g. This allows the malloc_g library to perform its checks without altering the size of blocks requested by the program. Such manipulation could result in an alteration of the behavior of the program with respect to the allocator, yielding different results when linked against malloc_g.

All allocated blocks are integrated into a number of allocation chains associated with allocated regions of memory kept by the allocator in arenas or blocks. The malloc_g library has intimate knowledge about the internal structures of the allocator, allowing it to use short-cuts to find the correct heap buffer associated with any pointer, resorting to a lookup on the appropriate allocation chain only when necessary. This minimizes the performance penalty associated with validating pointers, but it's still significant.

The time and space overheads imposed by the malloc_g library are too great to make it suitable for use as a production library, but are manageable enough to allow them to be used during the test phase of development and during program maintenance.

What's Checked?

As indicated above, the malloc_g library provides a minimal level of checking by default. This includes a check of the integrity of the allocation chain at the point of the local heap buffer on every allocation request. In addition, the flags and CRC of the header are checked for integrity. When the library can locate the neighboring heap buffers, it also checks their integrity. There are also checks specific to each type of allocation request that are done. Call-specific checks are described according to the type of call below.

Additional checks can be turned on using the mallopt() call. Each of the additional types of checking, and the sources of heap corruption that it's useful for detecting, are described in the next section.

Memory Allocation

When a heap buffer is allocated using any of the heap allocation routines, the heap buffer is allocated and added to the allocation chain for the arena or block within the heap that the heap buffer was allocated from. At this time, any problems detected in the allocation chain for the arena or block are reported. After successfully inserting the allocated buffer in the allocation chain, the previous and next buffers in the chain are also checked for consistency.

Reallocating Memory

When an attempt is made to resize a buffer through a call to the realloc() function, the pointer is checked for validity if it's a non-NULL value. If it's valid, the header of the heap buffer is checked for consistency. If the buffer is large enough to satisfy the request, the buffer header is modified and the call returns. If a new buffer is required to satisfy the request, memory allocation is performed to obtain a new buffer large enough to satisfy the request with the same consistency checks being applied as in the case of memory allocation described above. The original buffer is then released.

If fill area boundary checking is enabled (described in the next section) the guard code checks are also performed on the allocated buffer before it's actually resized, or if a new buffer is used, the guard code checks are done just before releasing the old buffer.

Releasing Memory

When a heap buffer is released, the checks include, but aren't limited to, ensuring that the pointer provided to a free() request is correct and points to an allocated heap buffer. Guard code checks may also be performed on release operations to allow fill area boundary checking.

Controlling Level of Checking

The mallopt() function call allows extra checks to be enabled within the library.

mallopt()

int mallopt stdcargs( ( int cmd, 
                        union malloptarg value ) );

cmd

An integer indicating the parameter (or option) to be affected by the call.

Available options used to enable additional checks in the library:

MALLOC_CKACCESS: Turn on boundary checking for memory and string operations.
MALLOC_FILLAREA: Turn on fill area boundary checking.
MALLOC_CKCHAIN: Enable full chain checking.

For each of the above options, an integer argument value of one indicates that the given type of checking should be enabled from that point onward.

value

A union, of type union malloptarg, that can hold any legal value for malloc options or parameters.

MALLOC_CKACCESS

Turns on boundary checking for memory and string operations. This helps in detecting buffer overruns and underruns that are a result of memory or string operations. When this checking is turned on, each pointer operand to a memory or string operation is checked to see if it's a heap buffer. If it is, the size of the heap buffer is checked and the information is used to ensure that no assignments are made beyond the bounds of the heap buffer. If an attempt is made that would assign past the buffer boundary, a diagnostic warning message is printed.

Here's how this option can be used to find an overrun error:

...
char *p;
union malloptarg opt;
opt.i = 1;
mallopt(MALLOC_CKACCESS, opt);
p = malloc(strlen("hello"));
strcpy(p, "hello, there!");  /* a warning is generated here */
...

The following illustrates how access checking can trap a reference through a stale pointer:

...
char *p;
union malloptarg opt;
opt.i = 1;
mallopt(MALLOC_CKACCESS, opt);
p = malloc(30);
free(p);
strcpy(p, "hello, there!");  /* a warning is generated here */

MALLOC_FILLAREA

Turns on fill area boundary checking. This form of boundary checking validates that the program hasn't overrun the user-requested size of a heap buffer. It does this by applying a guard code check when the buffer is released or when it's resized. The guard code check works by filling any excess space available at the end of the heap buffer with a pattern of bytes. When the buffer is released or resized, the trailing portion is checked to see if the pattern is still present. If not, a diagnostic warning message is printed.

The effect of turning on fill area boundary checking is a little different than enabling other checks. The checking is performed only on memory buffers allocated after the point in time at which the check was enabled. Memory buffers allocated before the change won't have the checking performed.
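The guard code idea can be approximated in a few lines on top of the standard malloc(). In this sketch the guard bytes are added explicitly after the requested size, whereas the library fills whatever excess space the buffer already has. GUARD_LEN, GUARD_BYTE, guard_alloc() and guard_intact() are hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8      /* assumed number of guard bytes */
#define GUARD_BYTE 0xA5   /* assumed fill pattern */

/* Allocate size bytes followed by GUARD_LEN bytes of fill pattern. */
void *guard_alloc(size_t size)
{
    if (size > SIZE_MAX - GUARD_LEN)
        return NULL;
    unsigned char *p = malloc(size + GUARD_LEN);
    if (p == NULL)
        return NULL;
    memset(p + size, GUARD_BYTE, GUARD_LEN);
    return p;
}

/* Return 1 if the guard bytes after the first size bytes are intact. */
int guard_intact(const void *ptr, size_t size)
{
    assert(ptr != NULL);
    const unsigned char *g = (const unsigned char *)ptr + size;
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (g[i] != GUARD_BYTE)
            return 0;
    return 1;
}
```

A caller would run guard_intact() just before free() or before resizing. The check only reveals that an overrun happened at some point since allocation, not where it happened, which matches the behavior described for fill area boundary checking.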

Here's how an overrun can be caught using the fill area boundary checking option:

...
int *foo, *p, i;
union malloptarg opt;
opt.i = 1;
mallopt(MALLOC_FILLAREA, opt);
foo = (int *)malloc(10*4);
for (p = foo, i = 0; i < 12; p++, i++)
    *p = 89;
free(foo);  /* a warning is generated here */

MALLOC_CKCHAIN

Enables full chain checking. This option is expensive and should be considered as a last resort when some code is badly corrupting the heap and otherwise escapes the detection of boundary checking or fill area boundary checking. This can occur under a number of circumstances, particularly when the corruption is caused by direct pointer assignments. In this case, the fault may occur before a check such as fill area boundary checking can be applied. There are also circumstances in which both fill area boundary checking and the normal attempts to check the headers of neighboring buffers fail to detect the source of the problem. This may happen if the buffer that's overrun is the first or last buffer associated with a block or arena. It may also happen when the allocator chooses to satisfy some requests, particularly those for large buffers, with a buffer that exactly fits the program's requested size.

Full chain checking traverses the entire set of allocation chains for all arenas and blocks in the heap every time a memory operation, including allocation requests, is performed. This allows the developer to narrow down the search for a source of corruption to the nearest memory operation.

Forcing verification

It's also possible to force a full allocation chain check at certain points in the execution of the program, without turning chain checking on. This is done with a call to mallopt() with the first parameter set to MALLOC_VERIFY. This causes a chain check to be performed immediately. If any error is found, error handling is performed.

Controlling Error Handling

The normal response to detection of an error by the library is to print a diagnostic message and continue executing. In cases where the allocation chains or another crucial part of the allocator's view is hopelessly corrupted, an error message is printed and program execution is aborted (via the abort() function).

This behavior can be overridden by setting either the malloc warning handler or the malloc fatal handler. In each case, the error handler determines what is done in response to detection of an error that would normally be considered a warning or a fatal condition.

The error handler is set with a call to mallopt() with a first parameter of MALLOC_WARN or MALLOC_FATAL depending on which handler is to be set.

The second parameter, value, is an integer value that indicates one of the standard handlers provided by the library. This must be one of:

M_HANDLE_IGNORE: Ignore the error and continue.
M_HANDLE_EXIT: Exit immediately.
M_HANDLE_ABORT: Terminate execution with a call to abort().

Any of these handlers can be ORed with the value MALLOC_DUMP to cause a complete dump of the heap before taking the handler action.

Here's how a memory overrun error can be caused to abort the program:

...
int *foo, *p, i;
union malloptarg opt;
opt.i = 1;
mallopt(MALLOC_FILLAREA,  opt);
foo = (int *)malloc(10*4);
for (p = foo, i = 0; i < 12; p++, i++)
    *p = 89;
opt.i = M_HANDLE_ABORT;
mallopt(MALLOC_WARN, opt);
free(foo); /* a fatal error is generated here */

Manual Checking (Bounds Checking)

There are times when it may be desirable to obtain information about a particular heap buffer or print a diagnostic or warning message related to that heap buffer. This is particularly true when the program has its own routines providing memory manipulation and the developer wishes to provide bounds checking. This can also be useful for adding additional bounds checking to a program to isolate a problem such as a buffer overrun or underrun that isn't associated with a call to a memory or string function.

In the latter case, rather than keeping a pointer and performing direct manipulations on the pointer, the program may define a pointer type that contains all relevant information about the pointer, including the current value, the base pointer and the extent of the buffer. Access to the pointer can then be controlled through macros or access functions. The accessors can perform the necessary bounds checks and print a warning message in response to attempts to exceed the bounds.

Any attempt to dereference the current pointer value can be checked against the boundaries obtained when the pointer was initialized. If the boundary is exceeded the malloc_warning() function should be called to print a diagnostic message and perform error handling. The arguments are: file, line, message.
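A checked pointer type along these lines might look like the sketch below. All names here (struct checked_ptr, cp_init(), cp_advance(), cp_store(), CP_STORE and report_bounds()) are hypothetical. The report_bounds() function is only a stand-in that prints a message; a program linked against the library would call malloc_warning() at that point instead. The current position is kept as an offset from the base so that an out-of-range position never has to be formed as a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical checked pointer: buffer bounds plus current position. */
struct checked_ptr {
    char *base;        /* start of the buffer */
    size_t extent;     /* number of bytes in the buffer */
    ptrdiff_t offset;  /* current position relative to base */
};

/* Stand-in for the library's warning routine; it only prints. */
static void report_bounds(const char *file, int line, const char *msg)
{
    fprintf(stderr, "%s:%d: %s\n", file, line, msg);
}

void cp_init(struct checked_ptr *cp, void *buf, size_t extent)
{
    assert(cp != NULL && buf != NULL);
    cp->base = buf;
    cp->extent = extent;
    cp->offset = 0;
}

/* Moving the position is unchecked; only accesses are validated. */
void cp_advance(struct checked_ptr *cp, ptrdiff_t delta)
{
    cp->offset += delta;
}

/* Store one byte at the current position if it's inside the buffer.
 * Returns 0 on success, -1 after reporting a bounds violation. */
int cp_store(struct checked_ptr *cp, char value, const char *file, int line)
{
    if (cp->offset < 0) {
        report_bounds(file, line, "write before start of buffer");
        return -1;
    }
    if ((size_t)cp->offset >= cp->extent) {
        report_bounds(file, line, "write past end of buffer");
        return -1;
    }
    cp->base[cp->offset] = value;
    return 0;
}

/* Accessor macro that records the caller's file and line. */
#define CP_STORE(cp, value) cp_store((cp), (value), __FILE__, __LINE__)
```

Routing every access through the accessor costs a comparison or two per store, but it reports the violation at the line that caused it rather than at a later allocator call.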

Getting pointer information

To obtain information about the pointer, two functions are provided:

find_malloc_ptr()
_mptr()

find_malloc_ptr()

void* find_malloc_ptr ( const void* ptr,
                        arena_range_t* range );

Finds information about the heap buffer containing the given C pointer, including the type of allocation structure it's contained in and the pointer to the header structure for the buffer. The function returns a pointer to the Dhead structure associated with this particular heap buffer. The pointer returned can be used in conjunction with the DH_() macros to obtain more information about the heap buffer. If the pointer doesn't point into the range of a valid heap buffer, the function returns NULL.

For example, the result from find_malloc_ptr() can be used as an argument to DH_ULEN() to find out the size that the program requested for the heap buffer in the call to malloc(), calloc() or a subsequent call to realloc().

_mptr()

char* _mptr stdcargs( ( const char* ptr ) );

Returns a pointer to the beginning of the heap buffer containing the given C pointer. Information about the size of the heap buffer can be obtained with a call to _msize() or _musize() with the value returned from this call.

Getting the heap buffer size

To obtain information about the size of a heap buffer, three interfaces are provided:

_msize()
_musize()
DH_ULEN()

_msize()

ssize_t _msize( const char* ptr );

Returns the actual size of the heap buffer given the pointer to the beginning of the heap buffer. The value returned by this function is the actual size of the buffer as opposed to the program-requested size for the buffer. The pointer must point to the beginning of the buffer -- as in the case of the value returned by _mptr() -- in order for this function to work.

_musize()

ssize_t _musize( const char* ptr );

Returns the program-requested size of the heap buffer given the pointer to the beginning of the heap buffer. The value returned by this function is the size argument that was given to the routine that allocated the block, or to a subsequent invocation of realloc() that caused the block to grow.

DH_ULEN()

DH_ULEN( ptr )

Returns the program-requested size of the heap buffer given a pointer to the Dhead structure, as returned by a call to find_malloc_ptr(). This is a macro that performs the appropriate cast on the pointer argument.

Memory Leaks

The ability of the malloc_g library to keep full allocation chains of all the heap memory allocated by the program -- as opposed to just accounting for some heap buffers -- allows heap memory leaks to be detected by the library in response to requests by the program. Leaks can be detected in the program by performing tracing on the entire heap. This is described in the sections that follow.

Tracing

Tracing is an operation that attempts to determine whether a heap object is reachable by the program. In order to be reachable, a heap buffer must be available either directly or indirectly from a pointer in a global variable or on the stack of one of the threads. If this isn't the case, then the heap buffer is no longer visible to the program and can't be accessed without constructing a pointer that refers to the heap buffer -- presumably by obtaining it from a persistent store such as a file or a shared memory object. The set of global variables and stack for all threads is called the root set. Because the root set must be stable for tracing to yield valid results, tracing requires that all threads other than the one performing the trace be suspended while the trace is performed.

Tracing operates by constructing a reachability graph of the entire heap. It begins with a root set scan that determines the root set comprising the initial state of the reachability graph. The roots that can be found by tracing are:

data of the program
uninitialized data of the program
initialized and uninitialized data of any shared objects dynamically linked into the program
used portion of the stacks of all active threads in the program.

Once the root set scan is complete, tracing initiates a mark operation for each element of the root set. The mark operation looks at a node of the reachability graph, scanning the memory space represented by the node, looking for pointers into the heap. Since the program may not actually have a pointer directly to the start of the buffer -- but to some interior location -- and it isn't possible to know which part of the root set or a heap object actually contains a pointer, tracing utilizes specialized techniques for coping with ambiguous roots. The approach taken is described as a conservative pointer estimation since it assumes that any word-sized object on a word-aligned memory cell that could point to a heap buffer or the interior of that heap buffer actually points to the heap buffer itself.

Using conservative pointer estimation for dealing with ambiguous roots, the mark operation finds all children of a node of the reachability graph. For each child in the heap that's found, it checks to see whether the heap buffer has been marked as referenced. If the buffer has been marked, the operation moves on to the next child. Otherwise, the trace marks the buffer, and recursively initiates a mark operation on that heap buffer.

The tracing operation is complete when the reachability graph has been fully traversed. At this time every heap buffer that's reachable will have been marked, as could some buffers that aren't actually reachable, due to the conservative pointer estimation. Any heap buffer that hasn't been marked is definitely unreachable, constituting a memory leak. At the end of the tracing operation, all unmarked nodes can be reported as leaks.
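The mark phase described above can be sketched in a few lines of C++. This is only an illustration under simplifying assumptions, not the library's implementation: the Buffer record and the find_buffer(), mark_region() and trace() functions are hypothetical names, the heap is a plain vector of buffer descriptors searched linearly, and the root set is passed in as an array of words instead of being discovered by scanning the data segments and thread stacks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping record for one heap buffer (not the library's layout).
struct Buffer {
    std::uintptr_t start;   // address of the first byte
    std::size_t    size;    // size in bytes
    bool           marked;  // set once the trace has reached the buffer
};

// Return the index of the buffer that contains addr (its start or its
// interior), or -1 if addr does not fall inside any known buffer.
int find_buffer(const std::vector<Buffer> &heap, std::uintptr_t addr) {
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (addr >= heap[i].start && addr - heap[i].start < heap[i].size)
            return static_cast<int>(i);
    return -1;
}

// Conservative mark: every word in the region that could be a pointer into
// a buffer is treated as a reference to that buffer.
void mark_region(std::vector<Buffer> &heap,
                 const std::uintptr_t *words, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        int idx = find_buffer(heap, words[i]);
        if (idx < 0 || heap[idx].marked)
            continue;                       // not a heap reference, or already visited
        heap[idx].marked = true;
        // This sketch only scans word-aligned buffers.
        assert(heap[idx].start % sizeof(std::uintptr_t) == 0);
        // Recursively scan the newly marked buffer for further references.
        mark_region(heap,
                    reinterpret_cast<const std::uintptr_t *>(heap[idx].start),
                    heap[idx].size / sizeof(std::uintptr_t));
    }
}

// Clear all marks, mark everything reachable from the roots, and return the
// indices of the buffers left unmarked. Those are the reported leaks.
std::vector<std::size_t> trace(std::vector<Buffer> &heap,
                               const std::uintptr_t *roots, std::size_t count) {
    for (std::size_t i = 0; i < heap.size(); ++i)
        heap[i].marked = false;
    mark_region(heap, roots, count);
    std::vector<std::size_t> leaks;
    for (std::size_t i = 0; i < heap.size(); ++i)
        if (!heap[i].marked)
            leaks.push_back(i);
    return leaks;
}
```

Because find_buffer() accepts interior addresses, a stray integer in the root set that happens to equal an address inside a leaked buffer will mark it. That is the reason a conservative trace can miss leaks but never reports a buffer that was actually reached.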

Causing a Trace and Giving Results

A program can cause a trace to be performed and memory leaks to be reported by calling the malloc_dump_unreferenced() function provided by the library.

malloc_dump_unreferenced()

int malloc_dump_unreferenced stdcargs( ( int fd, 
                                         int detail ) );

Suspends all threads, clears the mark information for all heap buffers, performs the trace operation, and prints a report of all memory leaks detected. All items are reported in memory order.

fd

The file descriptor on which the report should be produced.

detail

Indicate how the trace operation should deal with any heap corruption problems it encounters. For a value of:

1: Any problems encountered are treated as fatal errors. After the error is printed, the program is aborted. No report is produced.
0: Print any errors encountered, and a report based on whatever heap information is recoverable.

Analyzing Dumps

The dump of unreferenced buffers prints out one line of information for each unreferenced buffer. The information provided for a buffer includes:

address of the buffer
function that was used to allocate it (malloc(), calloc(), realloc())
file that contained the allocation request, if available
line number or return address of the call to the allocation function
size of the allocated buffer.

File and line information is available if the call to allocate the buffer was made using one of the library's debug interfaces. Otherwise, the return address of the call is reported in place of the line number. In some circumstances, no return address information is available. This usually indicates that the call was made from a function with no frame information, such as the system libraries. In such cases, the entry can usually be ignored and probably isn't a leak.

From the way tracing is performed we can see that some leaks may escape detection and may not be reported in the output. This happens if the root set or a reachable buffer in the heap contains a value that looks like a pointer to the leaked buffer.

Likewise, each reported leak should be checked against the suspected code identified by the line or call return address information. If the code in question keeps interior pointers -- pointers to a location inside the buffer, rather than the start of the buffer -- the trace operation will likely fail to find a reference to the buffer. In this case, the buffer may well not be a leak. In other cases, there is almost certainly a memory leak.

Compiler Support

Manual bounds checking can be avoided in circumstances where the compiler is capable of supporting bounds checking under control of a compile-time option. For C compilers this requires explicit support in the compiler. Patches are available for the GNU C Compiler that allow it to perform bounds checking on pointers in this manner. This will be dealt with later. For C++ compilers, extensive bounds checking can be performed through the use of operator overloading and the information functions described earlier.

C++ Issues

In place of a raw pointer, C++ programs can make use of a CheckedPtr template that acts as a smart pointer. The smart pointer has initializers that obtain complete information about the heap buffer on an assignment operation and initialize the current pointer position. Any attempt to dereference the pointer causes bounds checking to be performed and prints a diagnostic error in response to an attempt to dereference a value beyond the bounds of the buffer. The CheckedPtr template is provided in the <malloc.h> header for C++ programs.

The checked pointer template provided for C++ programs can be modified to suit the needs of the program. The bounds checking performed by the checked pointer is restricted to checking the actual bounds of the heap buffer, rather than the program requested size.
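To illustrate the idea behind a checked pointer, here is a minimal sketch of a bounds-checking smart pointer. It is not the source of the library's CheckedPtr template. The names BoundedPtr and fill_checked() are hypothetical, the element count is passed in explicitly instead of being obtained from the allocator, and an out-of-bounds dereference throws std::out_of_range instead of printing a diagnostic.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a checked pointer; NOT the library's CheckedPtr.
template <typename T>
class BoundedPtr {
public:
    BoundedPtr() : base_(nullptr), count_(0), pos_(0) {}

    // The real template asks the allocator for the bounds of the buffer.
    // This sketch assumes the caller supplies the element count.
    BoundedPtr(T *base, std::size_t count)
        : base_(base), count_(count), pos_(0) {}

    // Moving the pointer is always allowed; only dereferencing is checked.
    BoundedPtr &operator++() { ++pos_; return *this; }
    BoundedPtr operator++(int) { BoundedPtr old(*this); ++pos_; return old; }

    // Every dereference verifies that the current position is in bounds.
    T &operator*() const {
        if (base_ == nullptr || pos_ >= count_)
            throw std::out_of_range("BoundedPtr: dereference outside buffer");
        return base_[pos_];
    }

private:
    T          *base_;   // start of the buffer
    std::size_t count_;  // number of elements in the buffer
    std::size_t pos_;    // current element index
};

// Try to store `value` into `want` consecutive elements of a buffer that
// holds `capacity` elements. Returns how many stores succeeded before the
// bounds check stopped the loop.
inline std::size_t fill_checked(int *buf, std::size_t capacity,
                                std::size_t want, int value) {
    BoundedPtr<int> p(buf, capacity);
    std::size_t written = 0;
    try {
        for (std::size_t i = 0; i < want; ++i, ++p) {
            *p = value;      // checked store
            ++written;
        }
    } catch (const std::out_of_range &) {
        // The overrun is caught at the exact dereference that caused it.
    }
    return written;
}
```

With a 10-element buffer, a call such as fill_checked(buf, 10, 12, 89) stops at the eleventh store, which is the point where an unchecked raw pointer would have silently overrun the buffer.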

For C programs it's possible to compile individual modules that obey certain rules with the C++ compiler to get the behavior of the CheckedPtr template. C modules obeying these rules are written to a dialect of ANSI C that can be referred to as Clean C.

Clean C

The Clean C dialect is that subset of ANSI C that is compatible with the C++ language. Writing Clean C requires imposing coding conventions on the C code that restrict it to features that are acceptable to a C++ compiler. This section provides a summary of some of the more pertinent points to be considered. It is a mostly complete but by no means exhaustive list of the rules that must be applied.

To use the C++ checked pointers, the module, including all header files it includes, must be compatible with the Clean C subset. All the system headers for QNX 6 as well as the <malloc_g/malloc.h> header satisfy this requirement.

The most obvious aspect to Clean C is that it must be strict ANSI C with respect to function prototypes and declarations. The use of K&R prototypes or definitions isn't allowable in Clean C. Similarly, default types for variable and function declarations can't be used.

Another important consideration for declarations is that forward declarations must be provided when referencing an incomplete structure or union. This frequently occurs for linked data structures such as trees or lists. In this case the forward declaration must occur before any declaration of a pointer to the object in the same or another structure or union. For example, a list node may be declared as follows:

  
struct ListNode;
struct ListNode {
   struct ListNode *next;
   void *data;
};

Operations on void pointers are more restrictive in C++. In particular, implicit coercions from void pointers to other types, including both integer types and other pointer types, aren't allowed. Void pointers should be explicitly cast to other types.

The use of const should be consistent with C++ usage. In particular, pointers that are declared as const must always be used in a compatible fashion. Const pointers can't be passed as non-const arguments to functions unless const is cast away.
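The rules above can be seen together in a small list module. This is a sketch only: the list_push(), list_length() and list_free() functions are hypothetical helpers built around the ListNode declaration shown earlier. The code uses full prototypes, an explicit cast on the void pointer returned by malloc(), and const-qualified parameters that are only read, so the same source should be acceptable to both a C compiler and a C++ compiler.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ListNode;                 /* forward declaration comes first */
struct ListNode {
    struct ListNode *next;
    void *data;
};

/* Push a new node in front of head. Returns the new head, or NULL if the
 * allocation fails (the existing list is left untouched in that case). */
struct ListNode *list_push(struct ListNode *head, void *data)
{
    /* malloc() returns void *; C++ requires the explicit cast. */
    struct ListNode *node = (struct ListNode *)malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->next = head;
    node->data = data;
    return node;
}

/* The list is only read here, so the parameter and the cursor are const. */
size_t list_length(const struct ListNode *head)
{
    size_t n = 0;
    const struct ListNode *cur;
    for (cur = head; cur != NULL; cur = cur->next)
        n++;
    return n;
}

/* Release every node. The data pointers are owned by the caller. */
void list_free(struct ListNode *head)
{
    while (head != NULL) {
        struct ListNode *next = head->next;
        free(head);
        head = next;
    }
}
```

A caller retrieves stored values with the same explicit style, for example *(int *)head->data, since an implicit conversion from void * would be rejected by the C++ compiler.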

C++ Example

Here's how our overrun example from earlier could have the exact source of the error pinpointed with checked pointers:

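typedef CheckedPtr<int> intp_t;
...
intp_t foo, p;
int i;
union malloptarg opt;

/* Turn on fill-area checking. */
opt.i = 1;
mallopt(MALLOC_FILLAREA, opt);
foo = (int *)malloc(10 * sizeof(int));   /* room for 10 ints */

/* Abort on warnings while the suspect loop runs. */
opt.i = M_HANDLE_ABORT;
mallopt(MALLOC_WARN, opt);
for (p = foo, i = 0; i < 12; p++, i++)   /* 12 stores into 10 slots */
    *p = 89; /* a fatal error is generated here */

/* Stop aborting on warnings. */
opt.i = M_HANDLE_IGNORE;
mallopt(MALLOC_WARN, opt);
free(foo);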

Bounds Checking GCC

Bounds checking GCC is a variant of GCC that allows individual modules to be compiled with bounds checking enabled. When a heap buffer is allocated within a checked module, information about the buffer is added to the runtime information about the memory space kept on behalf of the compiler. Attempts to dereference or update the pointer in checked modules invoke intrinsic functions that obtain information about the bounds of the object -- it may be a stack object, a heap object, or an object in the data segment -- and check that the reference is in bounds. When an access is out of bounds, the runtime environment generates an error.

The bounds checking variant of GCC hasn't been ported to the QNX 6 environment. In order to check objects that are kept within the data segment of the application, the compiler runtime environment requires some Unix functions that aren't provided by QNX 6. The intrinsics would have to be modified to work in the QNX 6 environment.

The model for obtaining information about heap buffers with this compiler is also slightly different from the model employed by the malloc_g library. Instead, the compiler includes an alternative malloc implementation that registers checked heap buffers with a tree data structure outside of the program's control. This tree is used for searches made by the intrinsics to obtain information about checked objects. This technique may take more time than the malloc_g mechanism for some programs, and is incompatible with the checking and memory leak detection provided by malloc_g. To avoid having to perform multiple test runs, a port that reimplemented the compiler intrinsics to obtain heap buffer information from malloc_g would be desirable.
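The general shape of such a registry can be sketched as follows. This is an assumption-laden illustration, not the code used by bounds checking GCC: the BoundsRegistry class and its add(), remove() and in_bounds() members are hypothetical, and std::map (an ordered tree in common implementations) stands in for whatever tree structure the compiler runtime actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical registry of checked objects, keyed by start address.
class BoundsRegistry {
public:
    // Record an object when it is allocated in a checked module.
    void add(const void *start, std::size_t size) {
        objects_[reinterpret_cast<std::uintptr_t>(start)] = size;
    }

    // Forget an object when it is released.
    void remove(const void *start) {
        objects_.erase(reinterpret_cast<std::uintptr_t>(start));
    }

    // True if the len bytes starting at addr lie entirely inside a single
    // registered object. This is the question an intrinsic would ask
    // before allowing a load or store.
    bool in_bounds(const void *addr, std::size_t len) const {
        std::uintptr_t a = reinterpret_cast<std::uintptr_t>(addr);
        // Find the first object that starts after a, then step back to
        // the last object that starts at or before a.
        std::map<std::uintptr_t, std::size_t>::const_iterator it =
            objects_.upper_bound(a);
        if (it == objects_.begin())
            return false;                 // no object starts at or before a
        --it;
        std::uintptr_t offset = a - it->first;
        return offset <= it->second && len <= it->second - offset;
    }

private:
    std::map<std::uintptr_t, std::size_t> objects_;
};
```

Each checked access costs a tree search, which is consistent with the observation above that this technique may take more time than a mechanism that reads the bounds from the allocator's own bookkeeping.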

Summary

Every program would benefit from being tested as a debug version that incorporates the malloc_g library, both to look for common sources of errors such as overruns and to aid in the detection of memory leaks. The recommended practice is to always use malloc_g for debug variants.

The malloc_g library and the different levels of compiler support can also be particularly useful during unit testing and program maintenance for determining the source of overrun errors, particularly those that may escape routine detection during integration testing. In these cases, more stringent low-level bounds checking of individual pointers may prove useful. The Clean C subset can help here by making it possible to use C++ templates for low-level checking. Beyond this, it would be worthwhile considering ports of the bounds checking variant of GCC to meet individual project needs.