diff options
Diffstat (limited to 'libstdc++-v3/doc/xml/manual/parallel_mode.xml')
-rw-r--r-- | libstdc++-v3/doc/xml/manual/parallel_mode.xml | 878 |
1 files changed, 878 insertions, 0 deletions
diff --git a/libstdc++-v3/doc/xml/manual/parallel_mode.xml b/libstdc++-v3/doc/xml/manual/parallel_mode.xml new file mode 100644 index 000000000..ec0faf9c7 --- /dev/null +++ b/libstdc++-v3/doc/xml/manual/parallel_mode.xml @@ -0,0 +1,878 @@ +<chapter xmlns="http://docbook.org/ns/docbook" version="5.0" + xml:id="manual.ext.parallel_mode" xreflabel="Parallel Mode"> +<?dbhtml filename="parallel_mode.html"?> + +<info><title>Parallel Mode</title> + <keywordset> + <keyword> + C++ + </keyword> + <keyword> + library + </keyword> + <keyword> + parallel + </keyword> + </keywordset> +</info> + + + +<para> The libstdc++ parallel mode is an experimental parallel +implementation of many algorithms the C++ Standard Library. +</para> + +<para> +Several of the standard algorithms, for instance +<function>std::sort</function>, are made parallel using OpenMP +annotations. These parallel mode constructs and can be invoked by +explicit source declaration or by compiling existing sources with a +specific compiler flag. +</para> + + +<section xml:id="manual.ext.parallel_mode.intro" xreflabel="Intro"><info><title>Intro</title></info> + + +<para>The following library components in the include +<filename class="headerfile">numeric</filename> are included in the parallel mode:</para> +<itemizedlist> + <listitem><para><function>std::accumulate</function></para></listitem> + <listitem><para><function>std::adjacent_difference</function></para></listitem> + <listitem><para><function>std::inner_product</function></para></listitem> + <listitem><para><function>std::partial_sum</function></para></listitem> +</itemizedlist> + +<para>The following library components in the include +<filename class="headerfile">algorithm</filename> are included in the parallel mode:</para> +<itemizedlist> + <listitem><para><function>std::adjacent_find</function></para></listitem> + <listitem><para><function>std::count</function></para></listitem> + <listitem><para><function>std::count_if</function></para></listitem> + <listitem><para><function>std::equal</function></para></listitem> + <listitem><para><function>std::find</function></para></listitem> + <listitem><para><function>std::find_if</function></para></listitem> + <listitem><para><function>std::find_first_of</function></para></listitem> + <listitem><para><function>std::for_each</function></para></listitem> + <listitem><para><function>std::generate</function></para></listitem> + <listitem><para><function>std::generate_n</function></para></listitem> + <listitem><para><function>std::lexicographical_compare</function></para></listitem> + <listitem><para><function>std::mismatch</function></para></listitem> + <listitem><para><function>std::search</function></para></listitem> + <listitem><para><function>std::search_n</function></para></listitem> + <listitem><para><function>std::transform</function></para></listitem> + <listitem><para><function>std::replace</function></para></listitem> + <listitem><para><function>std::replace_if</function></para></listitem> + <listitem><para><function>std::max_element</function></para></listitem> + <listitem><para><function>std::merge</function></para></listitem> + <listitem><para><function>std::min_element</function></para></listitem> + <listitem><para><function>std::nth_element</function></para></listitem> + <listitem><para><function>std::partial_sort</function></para></listitem> + <listitem><para><function>std::partition</function></para></listitem> + <listitem><para><function>std::random_shuffle</function></para></listitem> + <listitem><para><function>std::set_union</function></para></listitem> + <listitem><para><function>std::set_intersection</function></para></listitem> + <listitem><para><function>std::set_symmetric_difference</function></para></listitem> + <listitem><para><function>std::set_difference</function></para></listitem> + <listitem><para><function>std::sort</function></para></listitem> + <listitem><para><function>std::stable_sort</function></para></listitem> + <listitem><para><function>std::unique_copy</function></para></listitem> +</itemizedlist> + +</section> + +<section xml:id="manual.ext.parallel_mode.semantics" xreflabel="Semantics"><info><title>Semantics</title></info> + + +<para> The parallel mode STL algorithms are currently not exception-safe, +i.e. user-defined functors must not throw exceptions. +Also, the order of execution is not guaranteed for some functions, of course. +Therefore, user-defined functors should not have any concurrent side effects. +</para> + +<para> Since the current GCC OpenMP implementation does not support +OpenMP parallel regions in concurrent threads, +it is not possible to call parallel STL algorithm in +concurrent threads, either. +It might work with other compilers, though.</para> + +</section> + +<section xml:id="manual.ext.parallel_mode.using" xreflabel="Using"><info><title>Using</title></info> + + +<section xml:id="parallel_mode.using.prereq_flags"><info><title>Prerequisite Compiler Flags</title></info> + + +<para> + Any use of parallel functionality requires additional compiler + and runtime support, in particular support for OpenMP. Adding this support is + not difficult: just compile your application with the compiler + flag <literal>-fopenmp</literal>. This will link + in <code>libgomp</code>, the GNU + OpenMP <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/onlinedocs/libgomp">implementation</link>, + whose presence is mandatory. +</para> + +<para> +In addition, hardware that supports atomic operations and a compiler + capable of producing atomic operations is mandatory: GCC defaults to no + support for atomic operations on some common hardware + architectures. Activating atomic operations may require explicit + compiler flags on some targets (like sparc and x86), such + as <literal>-march=i686</literal>, + <literal>-march=native</literal> or <literal>-mcpu=v9</literal>. See + the GCC manual for more information. +</para> + +</section> + +<section xml:id="parallel_mode.using.parallel_mode"><info><title>Using Parallel Mode</title></info> + + +<para> + To use the libstdc++ parallel mode, compile your application with + the prerequisite flags as detailed above, and in addition + add <constant>-D_GLIBCXX_PARALLEL</constant>. This will convert all + use of the standard (sequential) algorithms to the appropriate parallel + equivalents. Please note that this doesn't necessarily mean that + everything will end up being executed in a parallel manner, but + rather that the heuristics and settings coded into the parallel + versions will be used to determine if all, some, or no algorithms + will be executed using parallel variants. +</para> + +<para>Note that the <constant>_GLIBCXX_PARALLEL</constant> define may change the + sizes and behavior of standard class templates such as + <function>std::search</function>, and therefore one can only link code + compiled with parallel mode and code compiled without parallel mode + if no instantiation of a container is passed between the two + translation units. Parallel mode functionality has distinct linkage, + and cannot be confused with normal mode symbols. +</para> +</section> + +<section xml:id="parallel_mode.using.specific"><info><title>Using Specific Parallel Components</title></info> + + +<para>When it is not feasible to recompile your entire application, or + only specific algorithms need to be parallel-aware, individual + parallel algorithms can be made available explicitly. These + parallel algorithms are functionally equivalent to the standard + drop-in algorithms used in parallel mode, but they are available in + a separate namespace as GNU extensions and may be used in programs + compiled with either release mode or with parallel mode. +</para> + + +<para>An example of using a parallel version +of <function>std::sort</function>, but no other parallel algorithms, is: +</para> + +<programlisting> +#include <vector> +#include <parallel/algorithm> + +int main() +{ + std::vector<int> v(100); + + // ... + + // Explicitly force a call to parallel sort. + __gnu_parallel::sort(v.begin(), v.end()); + return 0; +} +</programlisting> + +<para> +Then compile this code with the prerequisite compiler flags +(<literal>-fopenmp</literal> and any necessary architecture-specific +flags for atomic operations.) +</para> + +<para> The following table provides the names and headers of all the + parallel algorithms that can be used in a similar manner: +</para> + +<table frame="all"> +<title>Parallel Algorithms</title> + +<tgroup cols="4" align="left" colsep="1" rowsep="1"> +<colspec colname="c1"/> +<colspec colname="c2"/> +<colspec colname="c3"/> +<colspec colname="c4"/> + +<thead> + <row> + <entry>Algorithm</entry> + <entry>Header</entry> + <entry>Parallel algorithm</entry> + <entry>Parallel header</entry> + </row> +</thead> + +<tbody> + <row> + <entry><function>std::accumulate</function></entry> + <entry><filename class="headerfile">numeric</filename></entry> + <entry><function>__gnu_parallel::accumulate</function></entry> + <entry><filename class="headerfile">parallel/numeric</filename></entry> + </row> + <row> + <entry><function>std::adjacent_difference</function></entry> + <entry><filename class="headerfile">numeric</filename></entry> + <entry><function>__gnu_parallel::adjacent_difference</function></entry> + <entry><filename class="headerfile">parallel/numeric</filename></entry> + </row> + <row> + <entry><function>std::inner_product</function></entry> + <entry><filename class="headerfile">numeric</filename></entry> + <entry><function>__gnu_parallel::inner_product</function></entry> + <entry><filename class="headerfile">parallel/numeric</filename></entry> + </row> + <row> + <entry><function>std::partial_sum</function></entry> + <entry><filename class="headerfile">numeric</filename></entry> + <entry><function>__gnu_parallel::partial_sum</function></entry> + <entry><filename class="headerfile">parallel/numeric</filename></entry> + </row> + <row> + <entry><function>std::adjacent_find</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::adjacent_find</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::count</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::count</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::count_if</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::count_if</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::equal</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::equal</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::find</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::find</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::find_if</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::find_if</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::find_first_of</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::find_first_of</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::for_each</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::for_each</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::generate</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::generate</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::generate_n</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::generate_n</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::lexicographical_compare</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::lexicographical_compare</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::mismatch</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::mismatch</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::search</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::search</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::search_n</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::search_n</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::transform</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::transform</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::replace</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::replace</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::replace_if</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::replace_if</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::max_element</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::max_element</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::merge</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::merge</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::min_element</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::min_element</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::nth_element</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::nth_element</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::partial_sort</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::partial_sort</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::partition</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::partition</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::random_shuffle</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::random_shuffle</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::set_union</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::set_union</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::set_intersection</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::set_intersection</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::set_symmetric_difference</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::set_symmetric_difference</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::set_difference</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::set_difference</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::sort</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::sort</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::stable_sort</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::stable_sort</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> + + <row> + <entry><function>std::unique_copy</function></entry> + <entry><filename class="headerfile">algorithm</filename></entry> + <entry><function>__gnu_parallel::unique_copy</function></entry> + <entry><filename class="headerfile">parallel/algorithm</filename></entry> + </row> +</tbody> +</tgroup> +</table> + +</section> + +</section> + +<section xml:id="manual.ext.parallel_mode.design" xreflabel="Design"><info><title>Design</title></info> + + <para> + </para> +<section xml:id="parallel_mode.design.intro" xreflabel="Intro"><info><title>Interface Basics</title></info> + + +<para> +All parallel algorithms are intended to have signatures that are +equivalent to the ISO C++ algorithms replaced. For instance, the +<function>std::adjacent_find</function> function is declared as: +</para> +<programlisting> +namespace std +{ + template<typename _FIter> + _FIter + adjacent_find(_FIter, _FIter); +} +</programlisting> + +<para> +Which means that there should be something equivalent for the parallel +version. Indeed, this is the case: +</para> + +<programlisting> +namespace std +{ + namespace __parallel + { + template<typename _FIter> + _FIter + adjacent_find(_FIter, _FIter); + + ... + } +} +</programlisting> + +<para>But.... why the ellipses? +</para> + +<para> The ellipses in the example above represent additional overloads +required for the parallel version of the function. These additional +overloads are used to dispatch calls from the ISO C++ function +signature to the appropriate parallel function (or sequential +function, if no parallel functions are deemed worthy), based on either +compile-time or run-time conditions. +</para> + +<para> The available signature options are specific for the different +algorithms/algorithm classes.</para> + +<para> The general view of overloads for the parallel algorithms look like this: +</para> +<itemizedlist> + <listitem><para>ISO C++ signature</para></listitem> + <listitem><para>ISO C++ signature + sequential_tag argument</para></listitem> + <listitem><para>ISO C++ signature + algorithm-specific tag type + (several signatures)</para></listitem> +</itemizedlist> + +<para> Please note that the implementation may use additional functions +(designated with the <code>_switch</code> suffix) to dispatch from the +ISO C++ signature to the correct parallel version. Also, some of the +algorithms do not have support for run-time conditions, so the last +overload is therefore missing. +</para> + + +</section> + +<section xml:id="parallel_mode.design.tuning" xreflabel="Tuning"><info><title>Configuration and Tuning</title></info> + + + +<section xml:id="parallel_mode.design.tuning.omp" xreflabel="OpenMP Environment"><info><title>Setting up the OpenMP Environment</title></info> + + +<para> +Several aspects of the overall runtime environment can be manipulated +by standard OpenMP function calls. +</para> + +<para> +To specify the number of threads to be used for the algorithms globally, +use the function <function>omp_set_num_threads</function>. An example: +</para> + +<programlisting> +#include <stdlib.h> +#include <omp.h> + +int main() +{ + // Explicitly set number of threads. + const int threads_wanted = 20; + omp_set_dynamic(false); + omp_set_num_threads(threads_wanted); + + // Call parallel mode algorithms. + + return 0; +} +</programlisting> + +<para> + Some algorithms allow the number of threads being set for a particular call, + by augmenting the algorithm variant. + See the next section for further information. +</para> + +<para> +Other parts of the runtime environment able to be manipulated include +nested parallelism (<function>omp_set_nested</function>), schedule kind +(<function>omp_set_schedule</function>), and others. See the OpenMP +documentation for more information. +</para> + +</section> + +<section xml:id="parallel_mode.design.tuning.compile" xreflabel="Compile Switches"><info><title>Compile Time Switches</title></info> + + +<para> +To force an algorithm to execute sequentially, even though parallelism +is switched on in general via the macro <constant>_GLIBCXX_PARALLEL</constant>, +add <classname>__gnu_parallel::sequential_tag()</classname> to the end +of the algorithm's argument list. +</para> + +<para> +Like so: +</para> + +<programlisting> +std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag()); +</programlisting> + +<para> +Some parallel algorithm variants can be excluded from compilation by +preprocessor defines. See the doxygen documentation on +<code>compiletime_settings.h</code> and <code>features.h</code> for details. +</para> + +<para> +For some algorithms, the desired variant can be chosen at compile-time by +appending a tag object. The available options are specific to the particular +algorithm (class). +</para> + +<para> +For the "embarrassingly parallel" algorithms, there is only one "tag object +type", the enum _Parallelism. +It takes one of the following values, +<code>__gnu_parallel::parallel_tag</code>, +<code>__gnu_parallel::balanced_tag</code>, +<code>__gnu_parallel::unbalanced_tag</code>, +<code>__gnu_parallel::omp_loop_tag</code>, +<code>__gnu_parallel::omp_loop_static_tag</code>. +This means that the actual parallelization strategy is chosen at run-time. +(Choosing the variants at compile-time will come soon.) +</para> + +<para> +For the following algorithms in general, we have +<code>__gnu_parallel::parallel_tag</code> and +<code>__gnu_parallel::default_parallel_tag</code>, in addition to +<code>__gnu_parallel::sequential_tag</code>. +<code>__gnu_parallel::default_parallel_tag</code> chooses the default +algorithm at compiletime, as does omitting the tag. +<code>__gnu_parallel::parallel_tag</code> postpones the decision to runtime +(see next section). +For all tags, the number of threads desired for this call can optionally be +passed to the respective tag's constructor. +</para> + +<para> +The <code>multiway_merge</code> algorithm comes with the additional choices, +<code>__gnu_parallel::exact_tag</code> and +<code>__gnu_parallel::sampling_tag</code>. +Exact and sampling are the two available splitting strategies. +</para> + +<para> +For the <code>sort</code> and <code>stable_sort</code> algorithms, there are +several additional choices, namely +<code>__gnu_parallel::multiway_mergesort_tag</code>, +<code>__gnu_parallel::multiway_mergesort_exact_tag</code>, +<code>__gnu_parallel::multiway_mergesort_sampling_tag</code>, +<code>__gnu_parallel::quicksort_tag</code>, and +<code>__gnu_parallel::balanced_quicksort_tag</code>. +Multiway mergesort comes with the two splitting strategies for multi-way +merging. The quicksort options cannot be used for <code>stable_sort</code>. +</para> + +</section> + +<section xml:id="parallel_mode.design.tuning.settings" xreflabel="_Settings"><info><title>Run Time Settings and Defaults</title></info> + + +<para> +The default parallelization strategy, the choice of specific algorithm +strategy, the minimum threshold limits for individual parallel +algorithms, and aspects of the underlying hardware can be specified as +desired via manipulation +of <classname>__gnu_parallel::_Settings</classname> member data. +</para> + +<para> +First off, the choice of parallelization strategy: serial, parallel, +or heuristically deduced. This corresponds +to <code>__gnu_parallel::_Settings::algorithm_strategy</code> and is a +value of enum <type>__gnu_parallel::_AlgorithmStrategy</type> +type. Choices +include: <type>heuristic</type>, <type>force_sequential</type>, +and <type>force_parallel</type>. The default is <type>heuristic</type>. +</para> + + +<para> +Next, the sub-choices for algorithm variant, if not fixed at compile-time. +Specific algorithms like <function>find</function> or <function>sort</function> +can be implemented in multiple ways: when this is the case, +a <classname>__gnu_parallel::_Settings</classname> member exists to +pick the default strategy. For +example, <code>__gnu_parallel::_Settings::sort_algorithm</code> can +have any values of +enum <type>__gnu_parallel::_SortAlgorithm</type>: <type>MWMS</type>, <type>QS</type>, +or <type>QS_BALANCED</type>. +</para> + +<para> +Likewise for setting the minimal threshold for algorithm +parallelization. Parallelism always incurs some overhead. Thus, it is +not helpful to parallelize operations on very small sets of +data. Because of this, measures are taken to avoid parallelizing below +a certain, pre-determined threshold. For each algorithm, a minimum +problem size is encoded as a variable in the +active <classname>__gnu_parallel::_Settings</classname> object. This +threshold variable follows the following naming scheme: +<code>__gnu_parallel::_Settings::[algorithm]_minimal_n</code>. So, +for <function>fill</function>, the threshold variable +is <code>__gnu_parallel::_Settings::fill_minimal_n</code>, +</para> + +<para> +Finally, hardware details like L1/L2 cache size can be hardwired +via <code>__gnu_parallel::_Settings::L1_cache_size</code> and friends. +</para> + +<para> +</para> + +<para> +All these configuration variables can be changed by the user, if +desired. +There exists one global instance of the class <classname>_Settings</classname>, +i. e. it is a singleton. It can be read and written by calling +<code>__gnu_parallel::_Settings::get</code> and +<code>__gnu_parallel::_Settings::set</code>, respectively. +Please note that the first call return a const object, so direct manipulation +is forbidden. +See <link xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/a01005.html"> + <filename class="headerfile">settings.h</filename></link> +for complete details. +</para> + +<para> +A small example of tuning the default: +</para> + +<programlisting> +#include <parallel/algorithm> +#include <parallel/settings.h> + +int main() +{ + __gnu_parallel::_Settings s; + s.algorithm_strategy = __gnu_parallel::force_parallel; + __gnu_parallel::_Settings::set(s); + + // Do work... all algorithms will be parallelized, always. + + return 0; +} +</programlisting> + +</section> + +</section> + +<section xml:id="parallel_mode.design.impl" xreflabel="Impl"><info><title>Implementation Namespaces</title></info> + + +<para> One namespace contain versions of code that are always +explicitly sequential: +<code>__gnu_serial</code>. +</para> + +<para> Two namespaces contain the parallel mode: +<code>std::__parallel</code> and <code>__gnu_parallel</code>. +</para> + +<para> Parallel implementations of standard components, including +template helpers to select parallelism, are defined in <code>namespace +std::__parallel</code>. For instance, <function>std::transform</function> from <filename class="headerfile">algorithm</filename> has a parallel counterpart in +<function>std::__parallel::transform</function> from <filename class="headerfile">parallel/algorithm</filename>. In addition, these parallel +implementations are injected into <code>namespace +__gnu_parallel</code> with using declarations. +</para> + +<para> Support and general infrastructure is in <code>namespace +__gnu_parallel</code>. +</para> + +<para> More information, and an organized index of types and functions +related to the parallel mode on a per-namespace basis, can be found in +the generated source documentation. +</para> + +</section> + +</section> + +<section xml:id="manual.ext.parallel_mode.test" xreflabel="Testing"><info><title>Testing</title></info> + + + <para> + Both the normal conformance and regression tests and the + supplemental performance tests work. + </para> + + <para> + To run the conformance and regression tests with the parallel mode + active, + </para> + + <screen> + <userinput>make check-parallel</userinput> + </screen> + + <para> + The log and summary files for conformance testing are in the + <filename class="directory">testsuite/parallel</filename> directory. + </para> + + <para> + To run the performance tests with the parallel mode active, + </para> + + <screen> + <userinput>make check-performance-parallel</userinput> + </screen> + + <para> + The result file for performance testing are in the + <filename class="directory">testsuite</filename> directory, in the file + <filename>libstdc++_performance.sum</filename>. In addition, the + policy-based containers have their own visualizations, which have + additional software dependencies than the usual bare-boned text + file, and can be generated by using the <code>make + doc-performance</code> rule in the testsuite's Makefile. +</para> +</section> + +<bibliography xml:id="parallel_mode.biblio"><info><title>Bibliography</title></info> + + + <biblioentry> + <citetitle> + Parallelization of Bulk Operations for STL Dictionaries + </citetitle> + + <author><personname><firstname>Johannes</firstname><surname>Singler</surname></personname></author> + <author><personname><firstname>Leonor</firstname><surname>Frias</surname></personname></author> + + <copyright> + <year>2007</year> + <holder/> + </copyright> + + <publisher> + <publishername> + Workshop on Highly Parallel Processing on a Chip (HPPC) 2007. (LNCS) + </publishername> + </publisher> + </biblioentry> + + <biblioentry> + <citetitle> + The Multi-Core Standard Template Library + </citetitle> + + <author><personname><firstname>Johannes</firstname><surname>Singler</surname></personname></author> + <author><personname><firstname>Peter</firstname><surname>Sanders</surname></personname></author> + <author><personname><firstname>Felix</firstname><surname>Putze</surname></personname></author> + + <copyright> + <year>2007</year> + <holder/> + </copyright> + + <publisher> + <publishername> + Euro-Par 2007: Parallel Processing. (LNCS 4641) + </publishername> + </publisher> + </biblioentry> + +</bibliography> + +</chapter> |