From 554fd8c5195424bdbcabf5de30fdc183aba391bd Mon Sep 17 00:00:00 2001
From: upstream source tree
Date: Sun, 15 Mar 2015 20:14:05 -0400
Subject: obtained gcc-4.6.4.tar.bz2 from upstream website; verified
 gcc-4.6.4.tar.bz2.sig; imported gcc-4.6.4 source tree from verified
 upstream tarball.

downloading a git-generated archive based on the 'upstream' tag should
provide you with a source tree that is binary identical to the one
extracted from the above tarball.

if you have obtained the source via the command 'git clone', however, do
note that line-endings of files in your working directory might differ
from line-endings of the respective files in the upstream repository.
---
 libstdc++-v3/doc/html/manual/bk01pt03ch18s04.html | 213 ++++++++++++++++++++++
 1 file changed, 213 insertions(+)
 create mode 100644 libstdc++-v3/doc/html/manual/bk01pt03ch18s04.html

diff --git a/libstdc++-v3/doc/html/manual/bk01pt03ch18s04.html b/libstdc++-v3/doc/html/manual/bk01pt03ch18s04.html
new file mode 100644
index 000000000..91c951d38
--- /dev/null
+++ b/libstdc++-v3/doc/html/manual/bk01pt03ch18s04.html
@@ -0,0 +1,213 @@
+
+
+Design

+

+All parallel algorithms are intended to have signatures that are
+equivalent to those of the ISO C++ algorithms they replace. For
+instance, the std::adjacent_find function is declared as:
+

+namespace std
+{
+  template<typename _FIter>
+    _FIter
+    adjacent_find(_FIter, _FIter);
+}
+

+This means that there should be an equivalent declaration for the
+parallel version. Indeed, this is the case:
+

+namespace std
+{
+  namespace __parallel
+  {
+    template<typename _FIter>
+      _FIter
+      adjacent_find(_FIter, _FIter);
+
+    ...
+  }
+}
+

+But... why the ellipses?
+

+The ellipses in the example above represent additional overloads
+required for the parallel version of the function. These additional
+overloads are used to dispatch calls from the ISO C++ function
+signature to the appropriate parallel function (or sequential
+function, if no parallel functions are deemed worthy), based on either
+compile-time or run-time conditions.
+

+The available signature options are specific to the different
+algorithms/algorithm classes.

+The general view of overloads for the parallel algorithms looks like
+this:
+
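As an illustration, here is a sketch of what that overload set typically looks like, using adjacent_find from above; the exact parameter lists vary from algorithm to algorithm, so treat this as a schematic rather than the literal header contents:

namespace std
{
  namespace __parallel
  {
    // The plain ISO C++ signature; dispatches to a parallel or
    // sequential implementation based on compile-time and run-time
    // conditions.
    template<typename _FIter>
      _FIter
      adjacent_find(_FIter, _FIter);

    // Forces the sequential implementation at compile time.
    template<typename _FIter>
      _FIter
      adjacent_find(_FIter, _FIter, __gnu_parallel::sequential_tag);

    // Postpones the choice of parallelization strategy to run time;
    // as noted below, not every algorithm provides this overload.
    template<typename _FIter>
      _FIter
      adjacent_find(_FIter, _FIter, __gnu_parallel::_Parallelism);
  }
}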

+Please note that the implementation may use additional functions
+(designated with the _switch suffix) to dispatch from the ISO C++
+signature to the correct parallel version. Also, some of the
+algorithms do not support run-time conditions, so the last overload
+is missing for them.
+

+To force an algorithm to execute sequentially, even though parallelism
+is switched on in general via the macro _GLIBCXX_PARALLEL, add
+__gnu_parallel::sequential_tag() to the end of the algorithm's
+argument list.
+

+Like so:
+

+std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());
+
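For context, here is a minimal complete program around this call; the compiler invocation in the comment assumes GCC with OpenMP support and is the usual way to switch parallel mode on:

// Build with parallel mode enabled, for example:
//   g++ -D_GLIBCXX_PARALLEL -fopenmp example.cc
#include <algorithm>
#include <vector>

int main()
{
  std::vector<int> v(1000, 42);

  // Eligible for parallelization (subject to run-time heuristics)
  // because _GLIBCXX_PARALLEL is defined for the translation unit...
  std::sort(v.begin(), v.end());

  // ...whereas this call is forced to run sequentially.
  std::sort(v.begin(), v.end(), __gnu_parallel::sequential_tag());

  return 0;
}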

+Some parallel algorithm variants can be excluded from compilation by
+preprocessor defines. See the Doxygen documentation on
+compiletime_settings.h and features.h for details.
+

+For some algorithms, the desired variant can be chosen at compile time
+by appending a tag object. The available options are specific to the
+particular algorithm (class).
+

+For the "embarrassingly parallel" algorithms, there is only one "tag object +type", the enum _Parallelism. +It takes one of the following values, +__gnu_parallel::parallel_tag, +__gnu_parallel::balanced_tag, +__gnu_parallel::unbalanced_tag, +__gnu_parallel::omp_loop_tag, +__gnu_parallel::omp_loop_static_tag. +This means that the actual parallelization strategy is chosen at run-time. +(Choosing the variants at compile-time will come soon.) +

+For the following algorithms in general, we have
+__gnu_parallel::parallel_tag and
+__gnu_parallel::default_parallel_tag, in addition to
+__gnu_parallel::sequential_tag.
+__gnu_parallel::default_parallel_tag chooses the default algorithm at
+compile time, as does omitting the tag.
+__gnu_parallel::parallel_tag postpones the decision to run time (see
+the next section). For all tags, the number of threads desired for
+this call can optionally be passed to the respective tag's
+constructor.
+
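To illustrate, here is a hedged sketch of such calls for sort, under the same build assumptions as the example above; the optional thread count is the constructor argument described in the preceding paragraph:

#include <algorithm>
#include <vector>

int main()
{
  std::vector<int> v(1 << 20, 7);

  // Choose the default parallel algorithm at compile time
  // (equivalent to omitting the tag under _GLIBCXX_PARALLEL).
  std::sort(v.begin(), v.end(), __gnu_parallel::default_parallel_tag());

  // Postpone the strategy decision to run time, requesting 4 threads.
  std::sort(v.begin(), v.end(), __gnu_parallel::parallel_tag(4));

  return 0;
}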

+The multiway_merge algorithm comes with two additional choices,
+__gnu_parallel::exact_tag and __gnu_parallel::sampling_tag.
+Exact and sampling are the two available splitting strategies.
+

+For the sort and stable_sort algorithms, there are several additional
+choices, namely
+__gnu_parallel::multiway_mergesort_tag,
+__gnu_parallel::multiway_mergesort_exact_tag,
+__gnu_parallel::multiway_mergesort_sampling_tag,
+__gnu_parallel::quicksort_tag, and
+__gnu_parallel::balanced_quicksort_tag.
+Multiway mergesort comes with the two splitting strategies for
+multi-way merging. The quicksort options cannot be used for
+stable_sort.
+
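Again as a hedged sketch under the same assumptions, choosing a specific sorting variant looks like this:

#include <algorithm>
#include <vector>

int main()
{
  std::vector<int> v(1 << 20, 7);

  // Balanced quicksort; the quicksort variants are not available for
  // stable_sort.
  std::sort(v.begin(), v.end(),
            __gnu_parallel::balanced_quicksort_tag());

  // Multiway mergesort with the exact splitting strategy, which is
  // also valid for stable_sort.
  std::stable_sort(v.begin(), v.end(),
                   __gnu_parallel::multiway_mergesort_exact_tag());

  return 0;
}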

+The default parallelization strategy, the choice of specific algorithm
+strategy, the minimum threshold limits for individual parallel
+algorithms, and aspects of the underlying hardware can be specified as
+desired by manipulating __gnu_parallel::_Settings member data.
+

+First off, there is the choice of parallelization strategy: serial,
+parallel, or heuristically deduced. This corresponds to
+__gnu_parallel::_Settings::algorithm_strategy and is a value of the
+enum __gnu_parallel::_AlgorithmStrategy. Choices include heuristic,
+force_sequential, and force_parallel. The default is heuristic.
+

+Next, there are the sub-choices for the algorithm variant, if it is
+not fixed at compile time. Specific algorithms like find or sort can
+be implemented in multiple ways: when this is the case, a
+__gnu_parallel::_Settings member exists to pick the default strategy.
+For example, __gnu_parallel::_Settings::sort_algorithm can take any
+of the values of enum __gnu_parallel::_SortAlgorithm: MWMS, QS, or
+QS_BALANCED (see the extended sketch after the example at the end of
+this section).
+

+Likewise for setting the minimal threshold for algorithm
+parallelization. Parallelism always incurs some overhead. Thus, it is
+not helpful to parallelize operations on very small sets of data.
+Because of this, measures are taken to avoid parallelizing below a
+certain, pre-determined threshold. For each algorithm, a minimum
+problem size is encoded as a variable in the active
+__gnu_parallel::_Settings object. These threshold variables follow
+the naming scheme
+__gnu_parallel::_Settings::[algorithm]_minimal_n. So, for fill, the
+threshold variable is __gnu_parallel::_Settings::fill_minimal_n.
+

+Finally, hardware details like L1/L2 cache size can be hardwired via
+__gnu_parallel::_Settings::L1_cache_size and friends.
+

+

+All these configuration variables can be changed by the user, if
+desired. There exists one global instance of the class _Settings,
+i.e., it is a singleton. It can be read and written by calling
+__gnu_parallel::_Settings::get and __gnu_parallel::_Settings::set,
+respectively. Please note that the former returns a const object, so
+direct manipulation is forbidden. See settings.h for complete
+details.
+

+A small example of tuning the default strategy:
+

+#include <parallel/algorithm>
+#include <parallel/settings.h>
+
+int main()
+{
+  __gnu_parallel::_Settings s;
+  s.algorithm_strategy = __gnu_parallel::force_parallel;
+  __gnu_parallel::_Settings::set(s);
+
+  // Do work... all algorithms will be parallelized, always.
+
+  return 0;
+}
+
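Along the same lines, here is a hedged sketch that also adjusts the members discussed earlier; the member names (sort_algorithm, fill_minimal_n, L1_cache_size) are taken from the text above, so check settings.h for the exact declarations and units before relying on them:

#include <parallel/algorithm>
#include <parallel/settings.h>

int main()
{
  // Start from the current global settings rather than the defaults.
  __gnu_parallel::_Settings s = __gnu_parallel::_Settings::get();

  // Prefer balanced quicksort as the default sort variant.
  s.sort_algorithm = __gnu_parallel::QS_BALANCED;

  // Do not parallelize fill for fewer than 1000 elements.
  s.fill_minimal_n = 1000;

  // Describe the target machine's L1 data cache (assumed here to be
  // given in bytes).
  s.L1_cache_size = 32 * 1024;

  __gnu_parallel::_Settings::set(s);

  // Subsequent algorithm calls use these settings.
  return 0;
}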