Allocators

Memory management for Standard Library entities is encapsulated in a class template called allocator. The allocator abstraction is used throughout the library in string, the container classes, algorithms, and parts of iostreams. This class, and its base classes, are the superset of available free store (heap) management classes.

Requirements

The C++ standard gives only a few directives in this area:

When you add elements to a container, and the container must allocate more memory to hold them, the container makes the request via its Allocator template parameter, which is usually aliased to allocator_type. This includes adding chars to the string class, which acts as a regular STL container in this respect.

The default Allocator argument of every container-of-T is allocator<T>.

The interface of the allocator<T> class is extremely simple. It has about 20 public declarations (nested typedefs, member functions, etc.), but the two which concern us most are:

    T* allocate(size_type n, const void* hint = 0);
    void deallocate(T* p, size_type n);

The n argument in both functions is a count of the number of T's to allocate space for, not their total size. (This is a simplification; the real signatures use nested typedefs.)

The storage is obtained by calling ::operator new, but it is unspecified when or how often this function is called. The use of the hint is unspecified, but it is intended as an aid to locality if an implementation so desires. [20.4.1.1]/6

Complete details can be found in the C++ standard; see [20.4 Memory].
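The count-based meaning of n in allocate and deallocate can be illustrated with a minimal sketch that uses std::allocator<int> directly. The helper name sum_with_allocator is hypothetical and exists only for this illustration; containers normally make these calls for you.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical helper (not part of any library): stores 1..n in storage
// obtained from std::allocator<int> and returns the sum of the values.
int sum_with_allocator(std::size_t n)
{
    std::allocator<int> alloc;

    // n is a count of int objects, not a number of bytes.
    int* p = alloc.allocate(n);

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // allocate returns raw storage, so construct each element in place.
        ::new (static_cast<void*>(p + i)) int(static_cast<int>(i) + 1);
        sum += p[i];
    }

    // deallocate takes the same count that was passed to allocate.
    // (int has a trivial destructor, so no explicit destroy step is needed.)
    alloc.deallocate(p, n);
    return sum;
}
```

Calling sum_with_allocator(4) requests room for four ints, whatever sizeof(int) happens to be on the platform.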

Design Issues

The easiest way of fulfilling the requirements is to call operator new each time a container needs memory, and to call operator delete each time the container releases memory. This method may be slower than caching the allocations and re-using previously-allocated memory, but it has the advantage of working correctly across a wide variety of hardware and operating systems, including large clusters. The __gnu_cxx::new_allocator implements the simple operator new and operator delete semantics, while __gnu_cxx::malloc_allocator implements much the same thing, only with the C language functions std::malloc and std::free.

Another approach is to use intelligence within the allocator class to cache allocations. This extra machinery can take a variety of forms: a bitmap index, an index into exponentially increasing power-of-two-sized buckets, or a simpler fixed-size pooling cache. The cache is shared among all the containers in the program: when your program's std::vector<int> gets cut in half and frees a bunch of its storage, that memory can be reused by the private std::list<WonkyWidget> brought in from a KDE library that you linked against. Operators new and delete are not always called to pass the memory on, either, which is a speed bonus. Examples of allocators that use these techniques are __gnu_cxx::bitmap_allocator, __gnu_cxx::__pool_alloc, and __gnu_cxx::__mt_alloc.

Depending on the implementation techniques used, the underlying operating system, and the compilation environment, scaling caching allocators can be tricky. In particular, order-of-destruction and order-of-creation for memory pools may be difficult to pin down with certainty, which may create problems when used with plugins or when loading and unloading shared objects in memory. As such, using caching allocators on systems that do not support abi::__cxa_atexit is not recommended.
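The caching idea can be sketched with a very small fixed-size block cache. This is an illustration only, assuming a single thread and one block size; the class name block_cache and its members are hypothetical and do not correspond to the implementation of any of the extension allocators named above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical fixed-size block cache (illustration only, not thread-safe).
class block_cache
{
public:
    explicit block_cache(std::size_t block_size)
        : block_size_(block_size), new_calls_(0) {}

    block_cache(const block_cache&) = delete;
    block_cache& operator=(const block_cache&) = delete;

    // Cached blocks are returned to the system only when the cache dies.
    ~block_cache()
    {
        for (std::size_t i = 0; i < free_.size(); ++i)
            ::operator delete(free_[i]);
    }

    // Hand out a cached block if one exists, else fall back to operator new.
    void* get()
    {
        if (!free_.empty()) {
            void* p = free_.back();
            free_.pop_back();
            return p;
        }
        ++new_calls_;
        return ::operator new(block_size_);
    }

    // Keep the block for later reuse instead of calling operator delete.
    void put(void* p) { free_.push_back(p); }

    // Number of times this cache asked operator new for a block.
    std::size_t new_calls() const { return new_calls_; }

private:
    std::size_t block_size_;
    std::size_t new_calls_;
    std::vector<void*> free_;
};
```

A block that is handed back with put and then requested again with get comes from the cache, so new_calls() stays at 1 across that sequence. The order-of-destruction concern mentioned above shows up here too: every block must be returned before the cache object is destroyed.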

Implementation

Interface Design

The only allocator interface that is supported is the standard C++ interface. As such, all STL containers have been adjusted, and all external allocators have been modified to support this change.

The class allocator just has typedef, constructor, and rebind members. It inherits from one of the high-speed extension allocators, covered below. Thus, all allocation and deallocation depends on the base class.

The base class that allocator is derived from may not be user-configurable.

Selecting Default Allocation Policy

It's difficult to pick an allocation strategy that will provide maximum utility without excessively penalizing some behavior. In fact, it's difficult just deciding which typical actions to measure for speed.

Three synthetic benchmarks have been created that provide data used to compare different C++ allocators. These tests are:

Insertion. Over multiple iterations, various STL container objects have elements inserted to some maximum amount. A variety of allocators are tested. Test sources exist for sequence and associative containers.

Insertion and erasure in a multi-threaded environment. This test shows the ability of the allocator to reclaim memory on a per-thread basis, as well as measuring thread contention for memory resources.

A threaded producer/consumer model. Test sources exist for sequence and associative containers.

The current default choice for allocator is __gnu_cxx::new_allocator.

Disabling Memory Caching

In use, allocator may allocate and deallocate using implementation-specified strategies and heuristics. Because of this, not every call to an allocator object's allocate member function necessarily calls the global operator new. The same is true of calls to the deallocate member function.

This can be confusing. In particular, it can make debugging memory errors more difficult, especially when using third party tools like valgrind or debug versions of new.

There are various ways to solve this problem. One would be to use a custom allocator that just calls operators new and delete directly, for every allocation. (See include/ext/new_allocator.h, for instance.) However, that option would involve changing source code to use a non-default allocator. Another option is to force the default allocator to remove caching and pools, and to directly allocate with every call of allocate and directly deallocate with every call of deallocate, regardless of efficiency. As it turns out, this last option is also available.

To globally disable memory caching within the library for the default allocator, merely set GLIBCXX_FORCE_NEW (with any value) in the system's environment before running the program. If your program crashes with GLIBCXX_FORCE_NEW in the environment, it likely means that you linked against objects built against the older library (objects which might still be using the cached allocations).

Using a Specific Allocator

You can specify different memory management schemes on a per-container basis, by overriding the default Allocator template parameter. For example, an easy (but non-portable) method of specifying that only malloc and free should be used instead of the default allocator is:

    std::list<int, __gnu_cxx::malloc_allocator<int> > malloc_list;

Likewise, a debugging form of whichever allocator is currently in use:

    std::deque<int, __gnu_cxx::debug_allocator<std::allocator<int> > > debug_deque;
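A complete translation unit for the malloc-based list might look like the sketch below. It assumes the extension header is <ext/malloc_allocator.h>, which is where libstdc++ provides __gnu_cxx::malloc_allocator; the typedef names and the helper sum_first are hypothetical. The fallback branch is only there so that the sketch still compiles with other standard libraries.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <numeric>

// __GLIBCXX__ is defined once any libstdc++ header has been included.
#if defined(__GLIBCXX__)
#  include <ext/malloc_allocator.h>
typedef __gnu_cxx::malloc_allocator<int> int_allocator;
#else
// Assumption: on other libraries, fall back to the standard allocator.
typedef std::allocator<int> int_allocator;
#endif

// Same shape as the malloc_list declaration above.
typedef std::list<int, int_allocator> malloc_list_type;

// Hypothetical helper: fills the list with 1..n and returns the sum.
int sum_first(int n)
{
    malloc_list_type values;
    for (int i = 1; i <= n; ++i)
        values.push_back(i);   // each node is obtained through int_allocator
    return std::accumulate(values.begin(), values.end(), 0);
}
```

Only the container's type changes; the code that uses the list is the same as for a std::list<int> with the default allocator.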

Custom Allocators

A portable C++ allocator needs an interface that looks much like the one specified for allocator. Additional member functions are permissible, but removing members is not.

Probably the best place to start is to copy one of the extension allocators: say, a simple one like new_allocator.
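As a rough starting point, here is a minimal sketch of a custom allocator that forwards to ::operator new and ::operator delete and keeps a count of outstanding allocations. It assumes a C++11 or later compiler, where std::allocator_traits supplies the nested typedefs and rebind that a C++98 allocator would have to spell out itself. The names counting_allocator, outstanding, and outstanding_after_use are hypothetical and are not taken from new_allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical allocator for illustration; relies on C++11 allocator_traits
// to provide the members that are not written out here.
template <typename T>
struct counting_allocator
{
    typedef T value_type;

    counting_allocator() noexcept {}

    // Converting constructor, used when a container rebinds the allocator.
    template <typename U>
    counting_allocator(const counting_allocator<U>&) noexcept {}

    T* allocate(std::size_t n)
    {
        if (n > static_cast<std::size_t>(-1) / sizeof(T))
            throw std::bad_alloc();            // n * sizeof(T) would overflow
        // Plain operator new: over-aligned types are not handled here.
        T* p = static_cast<T*>(::operator new(n * sizeof(T)));
        ++outstanding();
        return p;
    }

    void deallocate(T* p, std::size_t) noexcept
    {
        --outstanding();
        ::operator delete(p);
    }

    // Live allocation count for this particular T.
    static long& outstanding()
    {
        static long count = 0;
        return count;
    }
};

// The allocator is stateless, so all instances compare equal.
template <typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) noexcept
{ return true; }

template <typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) noexcept
{ return false; }

// Hypothetical helper: uses a vector, then reports what is still allocated.
long outstanding_after_use()
{
    {
        std::vector<int, counting_allocator<int> > v;
        for (int i = 0; i < 100; ++i)
            v.push_back(i);
        // While v is alive it owns exactly one buffer.
        assert(counting_allocator<int>::outstanding() == 1);
    }
    return counting_allocator<int>::outstanding();
}
```

Because every allocation is matched by a deallocation once the vector is destroyed, outstanding_after_use() returns 0. A count that stays above zero would point at a leak in code that uses the allocator directly.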

Extension Allocators

Several other allocators are provided as part of this implementation. The location of the extension allocators and their names have changed, but in all cases, functionality is equivalent. Starting with gcc-3.4, all extension allocators are standard style. Before this point, SGI style was the norm. Because of this, the number of template arguments also changed. Here's a simple chart to track the changes.

More details on each of these extension allocators follow.

new_allocator

Simply wraps ::operator new and ::operator delete.

malloc_allocator

Simply wraps malloc and free. There is also a hook for an out-of-memory handler (for new/delete this is taken care of elsewhere).

array_allocator

Allows allocations of known and fixed sizes using existing global or external storage allocated via construction of std::tr1::array objects. By using this allocator, fixed size containers (including std::string) can be used without instances calling ::operator new and ::operator delete. This capability allows the use of STL abstractions without runtime complications or overhead, even in situations such as program startup. For usage examples, please consult the testsuite.

debug_allocator

A wrapper around an arbitrary allocator A. It passes on slightly increased size requests to A, and uses the extra memory to store size information. When a pointer is passed to deallocate(), the stored size is checked, and assert() is used to guarantee they match.

throw_allocator

Includes memory tracking and marking abilities as well as hooks for throwing exceptions at configurable intervals (including random, all, none).

__pool_alloc

A high-performance, single pool allocator. The reusable memory is shared among identical instantiations of this type. It calls through ::operator new to obtain new memory when its lists run out. If a client container requests a block larger than a certain threshold size, then the pool is bypassed, and the allocate/deallocate request is passed to ::operator new directly.

Older versions of this class take a boolean template parameter, called thr, and an integer template parameter, called inst.

The inst number is used to track additional memory pools. The point of the number is to allow multiple instantiations of the classes without changing the semantics at all. All three of

    typedef __pool_alloc<true,0> normal;
    typedef __pool_alloc<true,1> private_pool;
    typedef __pool_alloc<true,42> also_private_pool;

behave exactly the same way. However, the memory pool for each type (and remember that different instantiations result in different types) remains separate. The library uses 0 in all its instantiations. If you wish to keep separate free lists for a particular purpose, use a different number.

The thr boolean determines whether the pool should be manipulated atomically or not. When thr = true, the allocator is thread-safe; when thr = false, it is slightly faster but unsafe for multiple threads.

For thread-enabled configurations, the pool is locked with a single big lock. In some situations, this implementation detail may result in severe performance degradation. (Note that the GCC thread abstraction layer allows us to provide safe zero-overhead stubs for the threading routines, if threads were disabled at configuration time.)

__mt_alloc

A high-performance fixed-size allocator with exponentially-increasing allocations. It has its own documentation.

bitmap_allocator

A high-performance allocator that uses a bit-map to keep track of the used and unused memory locations. It has its own documentation.
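The size-checking technique described for debug_allocator (request slightly more memory, record the size in the extra space, compare it on deallocation) can be sketched with two free functions layered over ::operator new. This is not the library's implementation: the names header, checked_allocate, and checked_deallocate are hypothetical, and the sketch reports a mismatch through its return value where the description above uses assert().

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical header placed in front of each block. The max_align_t member
// keeps the user's pointer suitably aligned for any fundamental type.
union header
{
    std::size_t bytes;
    std::max_align_t align;
};

// Request slightly more memory than asked for and record the size in it.
void* checked_allocate(std::size_t bytes)
{
    void* raw = ::operator new(sizeof(header) + bytes);
    header* h = ::new (raw) header;
    h->bytes = bytes;
    return h + 1;                 // the caller sees the memory after the header
}

// Release the block and report whether the caller's size matches the one
// recorded at allocation time.
bool checked_deallocate(void* p, std::size_t bytes)
{
    header* h = static_cast<header*>(p) - 1;
    const bool sizes_match = (h->bytes == bytes);
    ::operator delete(h);
    return sizes_match;
}
```

Passing the size that was originally requested makes checked_deallocate return true; passing a different size returns false, which is the kind of caller error that the stored size is meant to expose.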

Bibliography

ISO/IEC 14882:1998 Programming languages - C++. Section 20.4 Memory.

The Standard Librarian: What Are Allocators Good For? Matt Austern. C/C++ Users Journal.

The Hoard Memory Allocator. Emery Berger.

Reconsidering Custom Memory Allocation. Emery Berger, Ben Zorn, Kathryn McKinley. OOPSLA, 2002.

Allocator Types. Klaus Kreft, Angelika Langer. C/C++ Users Journal.

The C++ Programming Language. Bjarne Stroustrup. Addison Wesley, 2000. Section 19.4 Allocators.

Yalloc: A Recycling C++ Allocator. Felix Yen.