SmartHeap Technical Information (English)

Table of contents

Overview

SmartHeap's architecture

Portability

Debugging and error detection features

© 1993-2001 MicroQuill Software Publishing, Inc.


Overview

Industry leaders at WordPerfect, Lotus, Sybase, Hewlett Packard, Symantec, Microsoft, Informix, Knowledgeware, Intersolv, Xerox, Traveling Software, and many others know memory management can affect an application's performance and reliability more than any other factor. These companies and many others rely on SmartHeap for their memory manager. Why? Because SmartHeap is the fastest, most portable, and most reliable allocator available. In addition, SmartHeap includes complete heap error detection facilities.

Speed

In C and C++ applications running on virtual memory systems, malloc and operator new affect performance more than any other factor. SmartHeap's proprietary algorithms deliver unparalleled malloc/new performance in Windows, UNIX (Sun, HP, etc.), OS/2, NT, Extended DOS, and Macintosh. Benchmarks show SmartHeap is 3X to 100X+ faster. You can achieve even better results if you use SmartHeap's malloc/new to automatically route allocations smaller than a size that you specify to an extremely fast fixed-size allocator. SmartHeap also provides multiple memory pools, which improve locality and further reduce fragmentation.

Error detection

SmartHeap doesn't stop with blazingly fast performance. It also provides the most complete heap error detection available. Memory bugs are typically the most insidious, spurious, and damaging bugs an application faces. Because SmartHeap controls and manages the heap, it can detect bugs other add-on debugging tools miss. In addition to providing better error detection, SmartHeap uses its knowledge of the heap to report unsurpassed detail about the cause of each error. Bugs that SmartHeap detects include leakage, memory overwrites, double-freeing, wild pointers, invalid parameters, out of memory, references to previously freed memory, and so on.

Portability

In addition to outstanding speed and complete error detection, today's memory manager must be easily portable. SmartHeap ships as a binary linkable library for DOS, Windows, NT, Mac, OS/2, UNIX, etc. Each version provides an identical API, but is specifically optimized for that particular environment. Finally, SmartHeap's malloc and new are strictly ANSI compliant, so you don't have to code to a proprietary API to realize the benefits of SmartHeap.

Reliability

Because SmartHeap is a runtime library product, it must deliver absolutely bullet-proof reliability. To ensure the ultimate in error-free operation, we built a certification tester which calls each of the SmartHeap APIs hundreds of thousands of times. This tester actually includes more lines of code than SmartHeap itself and was designed to comprehensively exercise every condition SmartHeap might face.

The bottom line? We guarantee that SmartHeap is faster, more reliable, more portable, and more complete in its heap error detection than ANY memory manager you're using -- or your money back.




SmartHeap's architecture

SmartHeap is a faster malloc/new library because its underlying algorithms are superior to those used in compiler-supplied libraries.

The problem: heap management is harder than you think

At first glance, writing a malloc/new library appears to be a very simple task; all the library has to do is allocate and free a few blocks. However, building a library that can handle a random mix of thousands or millions of allocations and frees of objects from one byte to megabytes in size, while running on a virtual memory, pre-emptively multi-tasking and multi-threading operating system, is not so easy. At least not if you also want the library to be fast and efficient in all conditions.

Applications must manage large numbers of objects

Today's applications, especially those written in C++, tend to be more memory intensive than ever before, often allocating, freeing, and referencing hundreds of thousands, or even millions, of small objects. Their allocation patterns are random, with calls to new interspersed with calls to delete. As a result, the heap quickly evolves into a fragmented, chaotic jumble.

This fragmentation, in turn, causes many commercial allocators to "hit the wall" - the performance of the allocator degrades exponentially as the heap size grows or when the allocator operates in virtual memory, rather than physical memory, conditions.

Large-footprint operating systems compete with your application for precious RAM

32-bit virtual memory operating systems provide the advantage of a huge address space. However, these operating systems themselves are huge, at least relative to the typical RAM configuration. As a result, they compete with your application for precious RAM and force your application's heap to swap far more frequently.

Windows 95, for example, has a footprint of some 14 MB - several times the size of Microsoft's suggested 4 MB minimum memory configuration. Its sheer size relative to typically available physical memory guarantees that your application's heap will always be at least partially non-resident, so each call to allocate, free, or even reference memory is likely to invoke agonizingly slow disk hits. Windows NT and UNIX systems run on machines with more memory, but their respective footprints are also larger, as are the apps that run on them. As a result, the identical competition for memory and associated performance degradation occurs.

Pre-emptive multi-tasking and multi-threading operating systems increase swapping frequency

Pre-emptive multi-tasking in Win95, NT, OS/2, and UNIX further degrades performance. For example, your application (or one of its threads) may be in the middle of traversing a data structure when the operating system turns the processor over to another application or another thread. When your application (or that thread) gets its next slice of processor time, its data will often have been swapped to disk.

Multiple threads exacerbate the problem still further. In multi-threaded environments, heap operations are normally serialized so that only one thread can be active in the heap at a time. This makes the heap a real bottleneck for multi-threaded applications and affects performance:

  • Serialization (semaphore locking) itself is slow. On symmetric multi-processing (SMP) systems, heap serialization can also lead to one or more processors being idle if several threads are trying to access the heap at the same time.
  • Blocking on semaphores results in additional context switching because a thread that blocks waiting for the heap gives up its processor timeslice prematurely. This additional context switching, expensive in its own right, causes still more swapping.



The solution: one allocator, three algorithms

Producing an allocator that's fast and efficient for objects of all sizes is not so easy. The algorithms that work best for allocating and freeing small objects don't work as well on large objects, and vice versa. SmartHeap solves this problem by implementing three distinct algorithms, one for small objects, one for medium-sized objects, and one for large objects. You don't have to change your code at all; you simply call new or malloc, just like before, and SmartHeap automatically uses the appropriate algorithm for the specified object size. Moreover, each SmartHeap algorithm scales well to very large heap sizes and is efficient in both physical and virtual memory conditions. (See the benchmark graphs later in this section.)

Allocating small objects (under 256 bytes)

The speed-space tradeoff of fixed-size allocators

If you've studied memory management, you know that a fixed-size allocator will always be faster than a variable-size allocator (2-10X faster, if not more, depending on heap size). Fixed-size allocators reduce memory management to simple free-list management, which is extremely fast. Rather than searching the heap for the best fit, the fixed-size allocator can simply pick the head off the free list.

You also know that the vast majority of objects allocated by C++ apps are for things like fixed-size structures and classes--objects that tend to be smaller than 256 bytes. If you could only find a way to get fixed-size allocation performance for all of these objects, you could get an immediate performance boost.

The tradeoff has always been that fixed-size allocators waste much more memory unless all objects are the same size. And when you're dealing with an entire heap, you're going to run across a wide spectrum of block sizes. Therefore, if you want to use a fixed-size allocator, you have to tediously analyze your code to find out how many objects you have of each size, create multiple fixed-size "pools" (mini-heaps) which correspond to where the object sizes congregate, and finally change your source to specifically call these specific fixed-size allocators. All this analysis and recoding takes time and precludes you from using an off-the-shelf malloc/new lib.

Alternatively, you could choose a single fixed-size allocator that handles all objects below a certain size and routes all others to a variable-size allocator. But this technique, which SmartHeap used in the 2.x release, isn't optimum either. No matter where you draw the line, any object smaller than the chosen size wastes memory. For example, if you route all objects up to 32 bytes long to a fixed-size allocator, every object smaller than 32 bytes wastes 32 - object size bytes of memory. This waste causes the heap to grow unnecessarily large. So you end up choosing a fixed-size level so small that only a few objects use the much faster fixed-size allocator, and overall performance improvement is negligible.

SmartHeap's fixed size allocator is fast and memory-efficient

To get around the speed versus waste tradeoff common to fixed-size allocators, SmartHeap dynamically establishes a separate fixed-size pool for each object size up to 255 bytes. You get 255 allocators without touching your code! For example, when your code first calls malloc or new to create a 32-byte object, SmartHeap automatically creates a 32-byte fixed-size pool and then allocates this object from it. All subsequent 32-byte objects are also allocated from this pool. The pools SmartHeap uses internally for these small allocations are very low-overhead, and free storage is shared between all the different sizes. So SmartHeap doesn't have the problem common to other fixed-size allocators of wasting reserved memory that is dedicated to each individual pool.

This technique delivers the performance of a fixed-size allocator, and because every object maps perfectly to its own fixed-size allocator, the per-object overhead of the SmartHeap small-object allocator is only a single byte for Win 16, Win 32, and OS/2. (It's five bytes for UNIX and Mac platforms.) In comparison, Visual C++ 4.0 incurs 16 bytes of overhead for every object allocated, an amount that is often larger than the actual objects being allocated. UNIX allocators incur from 8 to 16 bytes per object, depending on the vendor.

Allocating medium-sized objects (256 bytes to 64K)

The fixed-size allocator that SmartHeap uses for objects smaller than 256 bytes isn't appropriate for larger blocks. While small blocks commonly hold thousands of repeat instances of fixed-size structures and classes, larger blocks commonly hold variable-size objects such as arrays and buffers which are rarely reused. Tying up memory to maintain a fixed-size pool for an object size that is rarely repeated causes the heap size to grow (and stay) unnecessarily large. Hence, for objects larger than 256 bytes but smaller than the operating system page size or system allocator granularity, SmartHeap uses a very efficient variable-size allocation algorithm.

The perils of locality of reference when allocating

The problem with conventional variable-size algorithms is that they effectively treat the heap as one large region, maintaining a single free list that spans the entire heap. Over time, as objects are continually allocated and freed, the free list ultimately degenerates into a random path of pages in the heap. This causes the allocator to jump from page to page as it traverses the free list, which it must do on every call to malloc/new and sometimes even on every call to free/delete. The heap in these conventional implementations exhibits poor page locality: data that is referenced consecutively (in this case, the heap free list) isn't stored in the same page of the heap.

The free list's lack of data locality wouldn't be a big problem if each free block were always large enough to satisfy each subsequent allocation request and if the heap were always entirely resident in physical memory. However, the same cycle of allocating and freeing that randomizes the free list also fragments the heap. This causes an ever-shrinking average block size, which, in turn, lessens the likelihood that "the next" free block in the list will be large enough to fulfill the current request. Moreover, as discussed above, most applications don't run purely in physical memory. As a result, a call to malloc or new often touches multiple pages while looking for a free block large enough for the object, and some of these touches invoke performance-killing disk hits. When a heap is fragmented and in tight memory conditions, a single allocation call can take a second or more as the allocator thrashes while traversing the free list.

The malloc and new implementations in all the compiler-supplied runtime libraries in DOS, Win16, OS/2, and the Mac, plus Borland's Win32 allocator, all use the conventional algorithm described above.

Most UNIX allocators and Microsoft's Win32 allocator improve on this by storing the free list and associated header information in a memory area separate from the blocks of data. Because the heap headers are smaller (usually eight bytes each) than the actual data, the free list can be stored more compactly, so data locality improves and swapping is reduced.

However, separating the free list from the data still doesn't eliminate swapping. For large heaps, the free list continues to span a large number of pages, so traversing it can still touch multiple pages. In addition, for heaps with a small median object size (common in C++), very little space is actually saved because the objects themselves take up very little space. So the free list turns out to be only marginally smaller, and substantial swapping still occurs.

Locality of reference also affects deallocation performance

Locality of reference is not just an issue when allocating memory; it is equally important when freeing memory. To minimize fragmentation, most allocators "coalesce," or merge adjacent free blocks to create a single larger space. To determine whether adjacent blocks are free, and thus could be merged, some allocators traverse the entire free list during calls to free. As a result, the same consequences of free list locality apply.

SmartHeap's unique page table algorithm maintains better locality of reference and reduces swapping when allocating memory

For medium size objects, SmartHeap uses a much smarter algorithm that virtually eliminates swapping while traversing the free-list. While other allocators treat the heap effectively as one large region, SmartHeap divides the heap into discrete pages that correspond with (and are perfectly aligned with) the pages of the underlying operating system virtual memory manager. And, also like the operating system, SmartHeap maintains a compact page table that keeps track of all of the pages in the heap.

For each page in the heap, SmartHeap's page table stores the size of the largest free block in that page. This page table is much smaller than the compiler allocator's free list because the page table has just one entry per page, rather than one entry per free block in the heap. Rather than searching one long list of free blocks (and touching many pages in the process), SmartHeap quickly scans its much smaller page table for a page that it knows has space for the current allocation request. SmartHeap's actual free list is contained inside each page -- since the free list doesn't reference any other pages, only a single heap page is referenced during a SmartHeap allocation. With this technique, SmartHeap virtually eliminates swapping during allocation calls.

Allocation speed is one clear benefit of SmartHeap's page-based allocation algorithm, but there is a more subtle benefit that can have an even greater impact on your application's overall performance.

We mentioned earlier how the free list in traditional allocators follows a random path through the heap. A consequence of this is that each object that your application creates will lie on a random page within the heap. SmartHeap, on the other hand, with its page-centric free list, always tries to allocate consecutive objects from the same page. The result is that the data referenced by your application has better locality. Applications often reference (and free) memory in the same pattern in which they allocate it. For example, elements successively inserted into a list will be allocated and referenced in the order of the list links. Therefore, referencing this data will involve accessing fewer pages, which further minimizes swapping.

SmartHeap's coalescing algorithms eliminate free list traversing when deallocating memory

As we mentioned above, many compiler allocators traverse the free list on every free to determine whether or not adjacent blocks are free, and thus can be merged. As a result, deallocation performance degrades more as the heap size (and thus the free list size) grows.

SmartHeap, on the other hand, doesn't rely on the free list at all to determine if adjacent blocks are free. Instead, it maintains special bits in the block headers for each object that indicate whether the adjacent blocks are free. With each free/delete, SmartHeap checks this local header information and immediately coalesces any adjacent free blocks. This technique is constant time; it is not affected by the size of the heap. As a result, on all but the most modestly sized heaps, SmartHeap is often orders of magnitude faster when freeing memory than compiler allocators.

Allocating large objects (over 64K)

On most platforms, very large objects are best managed by the operating system. Heap implementations such as Visual C++ 2.0 that include very large objects in the normal heap end up wasting memory and causing excessive fragmentation. By not returning memory to the operating system, the allocator hoards large amounts of memory in the application's heap. Moreover, when small objects are allocated in the space formerly occupied by a large object, the heap becomes fragmented, making it necessary to obtain yet more memory from the operating system when another large object is allocated.

SmartHeap solves these problems by treating large objects - those larger than the system page size and system allocator granularity - separately from either small or medium-sized objects. On platforms such as NT that provide an efficient large-object allocator, SmartHeap passes large object allocation requests directly to the OS. On other operating systems, such as Unix, that don't provide an efficient heap for large objects, SmartHeap implements its own large object allocator.

You can control the threshold between "medium" and "large" with a SmartHeap API. The default value is different on each platform, but is generally between 4K and 64K.

Other cool features in SmartHeap

Use SmartHeap's multiple pools on a data usage basis to achieve performance gains when referencing data

As we discussed above, SmartHeap automatically and transparently uses multiple memory pools for objects smaller than 256 bytes. In addition, you can explicitly create additional pools to further improve the performance of your application with minimal coding effort. (To allocate from a particular pool, you simply override the definition of new for a given class.)

Multiple pools let you partition your data (regardless of the size range of the objects) into discrete "mini-heaps." This has a number of benefits:

  • All objects allocated by a given class are contained in the fewest possible pages of the heap (for improved locality and, therefore, less swapping).
  • Allocating each additional object is faster because the heap space that the allocator must search (one pool) is a subset of the entire heap.
  • Deleting objects is faster because you can delete all of the objects in one pool with a single call - this is much faster than traversing your entire data structure to free every individual element.
  • Referencing data is dramatically faster because all of the objects you need to reference are on a small number of pages and not spread throughout a much larger heap space (again, because better locality means less swapping).
  • You can assign each thread in your application its own SmartHeap pool. This eliminates heap serialization overhead and allows multiple threads to concurrently access their own heaps. Moreover, on symmetric multi-processing (SMP) systems, multiple pools allow true concurrency of heap operations, which can dramatically improve overall performance by maximizing processor utilization.

Shared memory

Shared Memory in Win32

In Windows NT and Windows 95, each 32-bit process has its own separate address space. To allow sharing of memory between processes, Microsoft's Win32 API provides memory-mapped files. Beginning in version 3.1, SmartHeap supports Windows 95 and Windows NT shared memory. SmartHeap uses memory mapped files to implement shared memory pools. All of the SmartHeap allocation APIs that accept a memory pool parameter support Win32 shared memory.

Because SmartHeap allocation APIs return direct pointers to memory, SmartHeap requires shared memory pools to be mapped to the same address in each process. In Windows 95, this is not a problem since shared memory is always mapped to the same address in each process. NT, however, does not guarantee that shared memory is mapped to the same address in each process. To solve this problem, SmartHeap includes the API MemPoolInitNamedSharedEx. This API lets you specify the address at which a shared memory pool should be mapped and/or an array of process IDs (pids) that will access the shared pool. If you specify a non-NULL value for pids, SmartHeap will search the address space of each of these processes to find a suitable address that is available in all of the processes. If you specify address as NULL, SmartHeap chooses a random address in the upper half of the application address space for NT, or a random address in the shared memory address space for Win95. This minimizes the chance of collisions with other shared pools or VirtualAlloc objects (which are normally allocated from the beginning of the address space).

If all of the processes that will share a memory pool are running at the time you create the memory pool, you can have SmartHeap find an address automatically by specifying pidCount and pids parameters. In this case, the shared pool will be mapped into each process's address space before the MemPoolInitNamedSharedEx call returns. If there is no address space region of suitable size available in every process, MemPoolInitNamedSharedEx will fail (this would be very unusual considering that each process has 2 GB of address space).

Win32's memory mapped files have a granularity of 4K: there's no heap API for allocating smaller blocks of shared memory. SmartHeap includes a malloc-like API to efficiently allocate small blocks (as small as 4 bytes) of memory and a free-like API to free individual small blocks. SmartHeap also provides overloaded operators new and delete for shared memory in C++. These APIs let you create and destroy sharable data structures of any kind, including those that contain pointers.

This ability to allocate very small blocks and to map memory-mapped files at the same address in each process makes SmartHeap extremely useful when porting 16-bit code to Windows 95 or Windows NT.

Note: To guarantee that a set of processes will be able to successfully share a memory pool in NT, you must use the DLL version of SmartHeap.

Shared memory for other platforms

Beginning in version 3.0, SmartHeap supports UNIX shared memory. SmartHeap uses the shared memory and semaphore facilities of the standard InterProcess Communication (IPC) package to implement shared memory pools. You can use all of the SmartHeap allocation and de-allocation APIs with shared memory pools. In debug SmartHeap, shared memory pools fully support all of the same debugging facilities as private memory pools.

SmartHeap supports OS/2 shared memory. In SmartHeap for OS/2, you can allocate either named or unnamed shared memory pools. As with all other SmartHeap platforms with shared memory support, you can use all the SmartHeap allocation and de-allocation APIs, and debugging facilities are fully supported.

SmartHeap also supports shared memory in 16-bit Windows.

See the Getting Started and Programmer's Guide for platform-specific details of the SmartHeap shared memory implementation on your platform.

SmartHeap's handle-based allocator gives you speed and space

If you need a handle-based allocator, SmartHeap implements double-indirection handles. The memory is moveable, so fragmentation is eliminated. However, you can access the memory very efficiently by de-referencing the handle as a pointer, which eliminates a performance-degrading function call. The handle-based allocator gives you both great performance and little waste in a feature that was previously available only on the Macintosh. In addition, for applications that use the Mac memory API, SmartHeap provides an emulation of the Mac memory API with an implementation that is much faster than the native Mac API.




Benchmarks: blazingly fast performance

How important is malloc/free speed?

Consider a typical application, which spends 40% of its total execution time on managing memory and takes 10 minutes to run. The table below shows how a faster memory management library affects this application.

                     then malloc/new    the app          and the 
If malloc/new is     takes this         takes this       entire app is 
this much faster...  much time...       much time...     this much faster
-------------------------------------------------------------------------
no change (1X)         4.00 minutes     10.00 minutes           0%
1.5X                   2.67 minutes      8.67 minutes         13.3%
2X                     2.00 minutes      8.00 minutes          20%
4X                     1.00 minutes      7.00 minutes          30%
10X                    0.40 minutes      6.40 minutes          36%
100X                   0.04 minutes      6.04 minutes          39.6%

Note that even a 4X improvement in malloc can result in a 30% overall application performance improvement -- and remember that SmartHeap is generally a minimum of 4X faster than other commercial allocators and requires just a relink to implement.

Benchmark descriptions

The benchmark graphs on the following pages compare SmartHeap to compiler malloc/new libraries for various versions of Windows.

Note We also have benchmarks for the Macintosh, OS/2, SunOS, and HP 7xx. For more information, please call us at 425-827-7200 or email us at info@microquill.com.

Our benchmark test program randomly calls operators new and delete (in a ratio of 3:1) to create objects that randomly vary in size from 8 to 128 bytes until the heap reaches the specified size. The program then deletes all of the objects.

Note Applications that initially allocate all of their memory and do little or no subsequent allocation will not see substantial performance improvements because traditional new implementations are fast when allocating into a totally empty heap.




Benchmark graphs

Windows NT benchmark test graph

Windows 95 benchmark test graph

Windows for Workgroups version 3.11 benchmark test graph

OS/2 Warp benchmark test graph

Power Mac benchmark test graph




Portability

SmartHeap provides portability to a broad set of platforms from a single, ANSI C compliant source code base. We support compilers from Microsoft, Borland, IBM, Watcom, Metaware, SUN, HP, Symantec/Zortech, and others.

Platform-specific binary versions that are ready to be quickly and easily linked directly into an application are available for DOS, Extended DOS, Macintosh, Windows, NT/Win32s, OS/2, Unixware, SunOS, Sun Solaris, IBM AIX, HP-UX, SGI, and other platforms. Source code licenses are also available for all of these platforms and include the necessary .mak files for specific platforms.

To maximize performance and efficiency, we isolated all platform dependencies into a single module of SmartHeap. This module is carefully tuned for each platform using manifest constants to control such architecture-sensitive variables as alignment, system page size, pointer size, and integer size. The following examples illustrate how SmartHeap is carefully tuned for each platform:

  • The 16-bit X86 version of SmartHeap uses near pointers internally on intra-segment references to minimize segment loads; this difference is abstracted out of 32-bit versions where there is only one pointer size.
  • The RISC version keeps all data strictly 4-byte or 8-byte aligned, while the X86 versions keep infrequently referenced data 2-byte aligned to save space.
  • SmartHeap implements different policies of global heap management depending on the operating system's memory architecture -- for example, segmented (16-bit DOS and Windows), flat sparse (OS/2 and NT), and flat contiguous (UNIX and Phar Lap 386).
  • On operating systems that run in 16-bit X86 protected mode (extended DOS and 16-bit Windows), SmartHeap minimizes selector consumption to ensure that selectors are not exhausted while there is still available memory.
  • For virtual memory operating systems, SmartHeap always allocates from the operating system in an appropriate multiple of the system page size, and sub-allocates on pages that are exactly aligned with the underlying system pages. This ensures that the heap itself requires a minimal working set and reduces page swapping.
  • Native error checking is possible in addition to the portable error checking -- for example, parameter validation is more reliable since memory addresses are validated. For some platforms (currently Windows 3.x, NT, OS/2, and Extended DOS), native I/O is provided so that error messages are displayed graphically or integrated with native debugging tools. Users can integrate SmartHeap with other debuggers on platforms not mentioned above.
  • For operating systems that provide multi-threading (NT, OS/2 2.x, HP-UX, Solaris 2.x), SmartHeap is fully thread-reentrant.

You can also readily compile SmartHeap on platforms and operating systems for which MicroQuill has not yet provided integration. Please contact MicroQuill for pricing and support details.




Debugging and error detection features

In addition to incredible runtime performance, SmartHeap provides the most complete heap error detection available. Because SmartHeap "owns" the heap, it not only detects more errors, but provides greater detail about each error than that provided by other "add-on" memory debuggers.

As SmartHeap's debugging version allocates each block, it keeps track of the following information:

  • The SmartHeap API that created the object (for example, malloc).
  • The filename, line number, and pass count where the object was allocated. This information allows you to set a breakpoint at the nth pass over the line at the exact allocation spot -- both lexically and in time -- where the allocation occurs. It also enables you to track any SmartHeap allocated object from its creation and follow that object's life in the debugger to the point where an error occurs. Many memory error detection facilities will report the file and line where errors are detected and/or where objects were allocated to help track down memory overwrites or leaks, but in many situations, this information is of little use because control passes through the given file/line for many different reasons. This is why SmartHeap records and reports a pass count as well as a file and line.
  • An allocation count that uniquely distinguishes the block from all other allocations in that process.
  • A checkpoint -- an identifier that you can use to categorize or tag allocations in your application.
  • The actual requested size and any flags used in creation of the object -- so that the exact parameter to the allocation call can be reported at any time in the future.
  • A checksum used if the block is marked as read-only.
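The per-block tracking described above can be sketched in portable C. The header layout, field names, and `dbg_alloc` helper below are hypothetical illustrations of the technique -- a header prepended to each block, filled in from `__FILE__`/`__LINE__` and per-call-site counters -- not SmartHeap's actual implementation:

```c
#include <stdlib.h>

/* Hypothetical per-block header -- illustrative only, not
 * SmartHeap's actual layout. */
typedef struct DebugHeader {
    const char   *file;        /* __FILE__ at the allocation call  */
    int           line;        /* __LINE__ at the allocation call  */
    unsigned long pass_count;  /* nth execution of that call site  */
    unsigned long alloc_count; /* unique per-process allocation id */
    size_t        req_size;    /* exact size the caller requested  */
} DebugHeader;

static unsigned long g_alloc_count = 0;

/* Allocate req_size bytes preceded by a tracking header; the caller
 * supplies a per-call-site pass counter. */
void *dbg_alloc(size_t req_size, const char *file, int line,
                unsigned long *pass_counter)
{
    DebugHeader *h = malloc(sizeof *h + req_size);
    if (h == NULL)
        return NULL;
    h->file        = file;
    h->line        = line;
    h->pass_count  = ++*pass_counter;
    h->alloc_count = ++g_alloc_count;
    h->req_size    = req_size;
    return h + 1;               /* user data follows the header */
}

/* Recover the header from a user pointer, e.g. for an error report. */
DebugHeader *dbg_header(void *ptr)
{
    return (DebugHeader *)ptr - 1;
}
```

A wrapping macro such as `#define DBG_MALLOC(n, ctr) dbg_alloc((n), __FILE__, __LINE__, (ctr))` would capture the file and line automatically, which is how debug allocators typically expose this kind of API.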

The SmartHeap debug library provides three levels of error detection, from simple to very exhaustive. dbgMemSetSafetyLevel controls how much error checking SmartHeap performs. The three "safety levels" are:

  • MEM_SAFETY_SOME, the lowest level, performs only constant-time (O(1)) checks, such as validating parameters and checking a block's guards for overwrites when the block is freed.
  • MEM_SAFETY_FULL, the default, performs additional O(n) checks, such as scanning free lists to detect double-freeing.
  • MEM_SAFETY_DEBUG performs all of the checks above, plus a check of the entire heap for overwrites on every nth SmartHeap entry point -- making it O(n²). You can specify the check frequency, n.
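The difference between the constant-time and O(n) checks can be illustrated with two standalone C helpers. The guard size, fill value, and free-list shape below are assumptions made for the sketch, not SmartHeap internals:

```c
#include <stddef.h>

#define GUARD_SIZE  4     /* illustrative guard width  */
#define GUARD_BYTE  0xFC  /* illustrative guard value  */

/* O(1): verify only this block's trailing guard bytes, the kind of
 * check a low safety level performs when a block is freed. */
int check_block_guard(const unsigned char *user, size_t req_size)
{
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (user[req_size + i] != GUARD_BYTE)
            return 0;   /* overwrite past the requested size */
    return 1;
}

/* O(n): scan an entire free list for a pointer -- the kind of
 * double-free check a higher safety level adds. */
struct FreeNode { struct FreeNode *next; };

int on_free_list(const struct FreeNode *head, const void *ptr)
{
    for (; head != NULL; head = head->next)
        if ((const void *)head == ptr)
            return 1;   /* freeing an already-free block */
    return 0;
}
```

The highest level then repeats whole-heap scans like the second helper on every nth entry point, which is why its cost grows quadratically with heap size.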

Information that SmartHeap includes in error reports

When SmartHeap detects an error, the following information is included in the error report:

  • The API/file/line/pass count of the call that detected the error.
  • The creation API/file/line/pass count, checkpoint, and allocation count of the object that was corrupted, if applicable.
  • The API/file/line/pass count where the object was last known to be "OK" (that is, not overwritten).
  • The memory address where the corruption was detected, and if the address is valid, a hex and character memory dump of the contents of the corrupted location.

Where you can send error reports

You can specify that SmartHeap send error reports to any combination of the following locations:

  • A prompt to the user interface (message box or command-line prompt, depending on the platform), where the user can abort, retry, or ignore.
  • Output to the debugging console (secondary monitor or debugger log window).
  • Output to a log file. The log file is closed after each error to ensure no information is lost if the application crashes.
  • User-defined error handler that takes control of I/O and handling.
  • User-defined tracing of entry/exit of all SmartHeap APIs.

Errors detected by SmartHeap

SmartHeap detects the following types of errors:

  • Overwrites before or after an allocated block. SmartHeap detects overwrites as small as one byte beyond the requested size, even if the block's actual size is larger than requested. You can specify the size and value of the "guards" that are placed before and after every allocation, including fixed-size allocations.
  • Overwrites of any internal heap data structure.
  • Leakage. SmartHeap includes a complete leakage reporting facility. You can group, or "checkpoint," allocations and find and report all blocks not freed in one or more groups.
  • Invalid parameters.
  • Out of memory.
  • Double freeing, writes, or references to previously freed memory. You can defer freeing memory so that double freeing is guaranteed to be caught. Normally, free blocks are recycled at the next allocation, so writing to a previously freed block often overwrites data in another part of an application where the block is now in use again.
  • References to non-shared memory owned by a different task.
  • Writes into free memory. Free blocks are always filled with a unique pattern; SmartHeap thus detects a write into any byte of any free block.
  • Exceeding pool size ceiling established by MemPoolSetCeiling. This function allows an application to constrain its own memory resource consumption.
  • Overwrites of the handle table.
  • Invalid flags or buffer parameters.
  • Modifying a block marked as "read-only." Individual blocks can be marked as read-only, and modifying the contents is detected along with other overwrites.
  • Freeing/reallocing a block marked as "no-free" or "no-realloc." A block that should not be freed during some stretch of a program can be marked as such, and SmartHeap will report as an error any attempt to free/realloc the block.
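Deferred freeing and free-fill detection, described above, can be sketched as follows. The fill value and structure names are hypothetical choices for this example (though the sample output later in this section happens to show similar fill bytes):

```c
#include <stdlib.h>
#include <string.h>

#define FREE_FILL 0xEB   /* illustrative fill value, not SmartHeap's */

/* A deferred-free record: the block is filled with a pattern and held
 * rather than recycled, so double frees and writes into freed memory
 * can still be caught later. */
typedef struct {
    unsigned char *block;
    size_t         size;
} DeferredFree;

/* Fill the block with the pattern instead of releasing it. */
void defer_free(DeferredFree *slot, void *p, size_t size)
{
    memset(p, FREE_FILL, size);
    slot->block = p;
    slot->size  = size;
}

/* Later heap check: any byte differing from the pattern means the
 * program wrote into freed memory. */
int freed_block_intact(const DeferredFree *slot)
{
    for (size_t i = 0; i < slot->size; i++)
        if (slot->block[i] != FREE_FILL)
            return 0;
    return 1;
}
```

Because the deferred block is never handed back out, a second free of the same pointer can also be reported deterministically instead of corrupting a block that has since been reallocated.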

For an example of error detection and reporting, see the sample program and output SHTESTD.C and SHTESTD.OUT on the following pages.


A sample application that illustrates Debug SmartHeap error reports

The following pages show the SHTESTD.C sample application, which illustrates Debug SmartHeap error reports. Following the listing of the test program is the output it generates. Note that the line numbers in the sample code correspond with those reported by SmartHeap in the error report.

 1 
 2 #include <stdlib.h>  /* malloc, free, realloc, calloc */
 3 #include <limits.h>  /* ULONG_MAX */
 4 
 5 /* Note: the SmartHeap header file must be included _after_ any 
 6  * files that declare malloc, etc.
 7  */
 8 #include "smrtheap.h"
 9 #include "shmalloc.h"
10 
11 #ifndef MEM_DEBUG
12 #error shtestd.c must be compiled with MEM_DEBUG defined
13 #endif
14 
15 #define TRUE 1
16 #define FALSE 0
17 
18
19 
20 int main()
21 {
22    MEM_POOL pool;
23    unsigned char *buf;
24    int i;
25    unsigned char c;
26 
27    dbgMemSetSafetyLevel(MEM_SAFETY_DEBUG);
28    dbgMemSetDefaultErrorOutput(DBGMEM_OUTPUT_PROMPT 
29       | DBGMEM_OUTPUT_CONSOLE | DBGMEM_OUTPUT_FILE, "shtestd.out");
30 
31    pool = MemPoolInit(0);
32    dbgMemPoolSetCheckFrequency(pool, 1);
33    dbgMemPoolDeferFreeing(pool, TRUE);
34    dbgMemPoolSetCheckpoint(pool, 1);
35 
36    buf = MemAllocPtr(pool, 3, 0);  /* this alloc never freed (leakage) */
37 
38    /* invalid buffer */
39    MemPoolInfo(pool, NULL, NULL);
40 
41    /* invalid pointer parameter */
42    MemFreePtr((void *)ULONG_MAX);
43 
44    /* underwrite */
45    c = buf[-1];
46    buf[-1] = 'x';
47    MemValidatePtr(pool, buf);
48    buf[-1] = c;
49 
50    /* overwrite */
51    buf = MemAllocPtr(pool, 3, 0);  /* more leakage */
52    c = buf[3];
53    buf[3] = 'z';
54    MemValidatePtr(pool, buf);
55    buf[3] = c;
56 
57    dbgMemPoolSetCheckpoint(pool, 2);
58    
59    /* write into read-only block */
60    buf = MemAllocPtr(pool, 10, MEM_ZEROINIT);  /* more leakage */
61    *buf = 'a';
62    MemValidatePtr(pool, buf);
63    dbgMemProtectPtr(buf, DBGMEM_PTR_READONLY);
64    *buf = 'b';
65    MemValidatePtr(pool, buf);
66    *buf = 'a';
67    dbgMemProtectPtr(buf, DBGMEM_PTR_NOFREE | DBGMEM_PTR_NOREALLOC);
68    free(buf);
69    realloc(buf, 44);
70 
71    /* double free */
72    buf = malloc(1);
73    dbgMemPoolDeferFreeing(MemDefaultPool, TRUE);
74    dbgMemPoolSetCheckFrequency(MemDefaultPool, 1);
75    for (i = 0;  i < 3;  i++)
76       MemFreePtr(buf);
77 
78    /* write into free block */
79    c = *buf;
80    *buf = 'a';
81    calloc(1, 3);
82    *buf = c;
83 
84    dbgMemReportLeakage(pool, 1, 2);
85
86    return 1;
87 }

SmartHeap error output from SHTESTD.C

Here's a listing of the SHTESTD.OUT file generated by SHTESTD.C.

MEM_BAD_BUFFER: Invalid buffer parameter.
    Error detected in: MemPoolInfo(01DF:0000, 0000:0000, 0000:0000)
        at line 39 of file shtestd.c, pass #1
    Parameter is NULL pointer.
MEM_BAD_POINTER: Invalid memory pointer parameter.
    Error detected in: MemFreePtr(FFFF:FFFF)
        at line 42 of file shtestd.c, pass #1
    Error at or near address FFFF:FFFF which contains:
        <illegal address>
MEM_UNDERWRITE: Memory before beginning of allocated block overwritten.
    Error detected in: MemValidatePtr(01DF:0000, 01E7:7FF0)
        at line 47 of file shtestd.c, pass #1
    Object created by: MemAllocPtr(01DF:0000, 3, 0x0)
        at line 36 of file shtestd.c, pass #1
        checkpoint 1, alloc #1
    Error at or near address 01E7:7FEF which contains:
        78 EB EB EB FC FC FC FC-FC 03 00 14 00 14 00 00   x...............
    Pool last verified in: MemPoolInfo
        at line 39 of file shtestd.c, pass #1
MEM_OVERWRITE: Memory after end of allocated block overwritten.
    Error detected in: MemValidatePtr(01DF:0000, 01E7:7FC8)
        at line 54 of file shtestd.c, pass #1
    Object created by: MemAllocPtr(01DF:0000, 3, 0x0)
        at line 51 of file shtestd.c, pass #1
        checkpoint 1, alloc #2
    Error at or near address 01E7:7FCB which contains:
        7A FC FC FC FC 2B 00 00-00 01 00 03 00 00 00 01   z....+..........
    Pool last verified in: