Any use of parallel functionality requires additional compiler and runtime support, in particular support for OpenMP. Adding this support is not difficult: just compile your application with the compiler flag `-fopenmp`. This will link in `libgomp`, the GNU OpenMP implementation, whose presence is mandatory.
In addition, hardware that supports atomic operations and a compiler capable of producing atomic operations are mandatory: GCC defaults to no support for atomic operations on some common hardware architectures. Activating atomic operations may require explicit compiler flags on some targets (like sparc and x86), such as `-march=i686`, `-march=native` or `-mcpu=v9`. See the GCC manual for more information.
To use the libstdc++ parallel mode, compile your application with the prerequisite flags as detailed above, and in addition add `-D_GLIBCXX_PARALLEL`. This will convert all uses of the standard (sequential) algorithms to the appropriate parallel equivalents. Please note that this doesn't necessarily mean that everything will end up being executed in a parallel manner, but rather that the heuristics and settings coded into the parallel versions will be used to determine if all, some, or no algorithms will be executed using parallel variants.
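As a minimal sketch of what this means at the source level, the code below calls `std::sort` exactly as sequential code would; only the compiler flags decide whether parallel mode is active. The helper names `parallel_mode_enabled` and `sort_values` are hypothetical and introduced only for illustration. `_GLIBCXX_PARALLEL` is the define described above, and `_OPENMP` is the macro defined by a compiler with OpenMP enabled.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: reports whether this translation unit was compiled
// with both -fopenmp (which defines _OPENMP) and -D_GLIBCXX_PARALLEL.
constexpr bool parallel_mode_enabled()
{
#if defined(_GLIBCXX_PARALLEL) && defined(_OPENMP)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: the call site is identical in both modes. In parallel
// mode the library decides whether a parallel variant is actually used.
inline void sort_values(std::vector<int>& values)
{
  std::sort(values.begin(), values.end());
}
```

Compiled without the extra flags, `parallel_mode_enabled()` returns false and `sort_values` sorts sequentially; the source itself does not change between the two builds.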
Note that the `_GLIBCXX_PARALLEL` define may change the sizes and behavior of standard library templates, including algorithms such as `std::search`, and therefore code compiled with parallel mode can only be linked with code compiled without parallel mode if no instantiation of a container is passed between the two translation units. Parallel mode functionality has distinct linkage, and cannot be confused with normal mode symbols.
When it is not feasible to recompile your entire application, or only specific algorithms need to be parallel-aware, individual parallel algorithms can be made available explicitly. These parallel algorithms are functionally equivalent to the standard drop-in algorithms used in parallel mode, but they are available in a separate namespace as GNU extensions and may be used in programs compiled with either release mode or with parallel mode.
An example of using a parallel version of `std::sort`, but no other parallel algorithms, is:
```cpp
#include <vector>
#include <parallel/algorithm>

int main()
{
  std::vector<int> v(100);

  // ...

  // Explicitly force a call to parallel sort.
  __gnu_parallel::sort(v.begin(), v.end());
  return 0;
}
```
Then compile this code with the prerequisite compiler flags (`-fopenmp` and any necessary architecture-specific flags for atomic operations).
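When the same source must also build without OpenMP, one possible approach is to guard the explicit parallel call and fall back to the sequential algorithm. The sketch below assumes that `_OPENMP` is defined when compiling with `-fopenmp` and that `__GLIBCXX__` identifies libstdc++; the macro `HAVE_GNU_PARALLEL` and the helper `sort_ints` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Assumption: _OPENMP is defined when compiling with -fopenmp, and
// __GLIBCXX__ is defined once any libstdc++ header has been included.
#if defined(_OPENMP) && defined(__GLIBCXX__)
#  include <parallel/algorithm>
#  define HAVE_GNU_PARALLEL 1  // hypothetical macro for this sketch
#endif

// Hypothetical helper: sorts in place, preferring the explicit parallel sort
// when the prerequisites are met and using std::sort otherwise.
inline void sort_ints(std::vector<int>& v)
{
#ifdef HAVE_GNU_PARALLEL
  __gnu_parallel::sort(v.begin(), v.end());
#else
  std::sort(v.begin(), v.end());
#endif
}
```

Either branch leaves `v` sorted in ascending order, so callers do not need to know which one was compiled in.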
The following table provides the names and headers of all the parallel algorithms that can be used in a similar manner:
Table 18.1. Parallel Algorithms
Algorithm | Header | Parallel algorithm | Parallel header |
---|---|---|---|
std::accumulate | numeric | __gnu_parallel::accumulate | parallel/numeric |
std::adjacent_difference | numeric | __gnu_parallel::adjacent_difference | parallel/numeric |
std::inner_product | numeric | __gnu_parallel::inner_product | parallel/numeric |
std::partial_sum | numeric | __gnu_parallel::partial_sum | parallel/numeric |
std::adjacent_find | algorithm | __gnu_parallel::adjacent_find | parallel/algorithm |
std::count | algorithm | __gnu_parallel::count | parallel/algorithm |
std::count_if | algorithm | __gnu_parallel::count_if | parallel/algorithm |
std::equal | algorithm | __gnu_parallel::equal | parallel/algorithm |
std::find | algorithm | __gnu_parallel::find | parallel/algorithm |
std::find_if | algorithm | __gnu_parallel::find_if | parallel/algorithm |
std::find_first_of | algorithm | __gnu_parallel::find_first_of | parallel/algorithm |
std::for_each | algorithm | __gnu_parallel::for_each | parallel/algorithm |
std::generate | algorithm | __gnu_parallel::generate | parallel/algorithm |
std::generate_n | algorithm | __gnu_parallel::generate_n | parallel/algorithm |
std::lexicographical_compare | algorithm | __gnu_parallel::lexicographical_compare | parallel/algorithm |
std::mismatch | algorithm | __gnu_parallel::mismatch | parallel/algorithm |
std::search | algorithm | __gnu_parallel::search | parallel/algorithm |
std::search_n | algorithm | __gnu_parallel::search_n | parallel/algorithm |
std::transform | algorithm | __gnu_parallel::transform | parallel/algorithm |
std::replace | algorithm | __gnu_parallel::replace | parallel/algorithm |
std::replace_if | algorithm | __gnu_parallel::replace_if | parallel/algorithm |
std::max_element | algorithm | __gnu_parallel::max_element | parallel/algorithm |
std::merge | algorithm | __gnu_parallel::merge | parallel/algorithm |
std::min_element | algorithm | __gnu_parallel::min_element | parallel/algorithm |
std::nth_element | algorithm | __gnu_parallel::nth_element | parallel/algorithm |
std::partial_sort | algorithm | __gnu_parallel::partial_sort | parallel/algorithm |
std::partition | algorithm | __gnu_parallel::partition | parallel/algorithm |
std::random_shuffle | algorithm | __gnu_parallel::random_shuffle | parallel/algorithm |
std::set_union | algorithm | __gnu_parallel::set_union | parallel/algorithm |
std::set_intersection | algorithm | __gnu_parallel::set_intersection | parallel/algorithm |
std::set_symmetric_difference | algorithm | __gnu_parallel::set_symmetric_difference | parallel/algorithm |
std::set_difference | algorithm | __gnu_parallel::set_difference | parallel/algorithm |
std::sort | algorithm | __gnu_parallel::sort | parallel/algorithm |
std::stable_sort | algorithm | __gnu_parallel::stable_sort | parallel/algorithm |
std::unique_copy | algorithm | __gnu_parallel::unique_copy | parallel/algorithm |
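The numeric entries in the table are used in the same way as the sort example, only with the `parallel/numeric` header. The following is a minimal sketch for `__gnu_parallel::accumulate`; the helper `sum_values` is a hypothetical name, and the preprocessor guard (an assumption of this sketch, not part of the library interface) falls back to `std::accumulate` when OpenMP is not enabled.

```cpp
#include <numeric>
#include <vector>

// Assumption: only pull in the parallel header when OpenMP is enabled and
// the standard library in use is libstdc++.
#if defined(_OPENMP) && defined(__GLIBCXX__)
#  include <parallel/numeric>
#endif

// Hypothetical helper: sums the elements, using the explicit parallel
// algorithm from the table when it is available.
inline long long sum_values(const std::vector<long long>& values)
{
#if defined(_OPENMP) && defined(__GLIBCXX__)
  return __gnu_parallel::accumulate(values.begin(), values.end(), 0LL);
#else
  return std::accumulate(values.begin(), values.end(), 0LL);
#endif
}
```

Integer addition is used here so that the result does not depend on the order in which partial sums are combined.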