From 554fd8c5195424bdbcabf5de30fdc183aba391bd Mon Sep 17 00:00:00 2001
From: upstream source tree
Date: Sun, 15 Mar 2015 20:14:05 -0400
Subject: obtained gcc-4.6.4.tar.bz2 from upstream website; verified
 gcc-4.6.4.tar.bz2.sig; imported gcc-4.6.4 source tree from verified upstream
 tarball.

downloading a git-generated archive based on the 'upstream' tag
should provide you with a source tree that is binary identical
to the one extracted from the above tarball.

if you have obtained the source via the command 'git clone', however,
do note that line-endings of files in your working directory might
differ from line-endings of the respective files in the upstream
repository.
---
 gcc/doc/loop.texi | 655 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 655 insertions(+)
 create mode 100644 gcc/doc/loop.texi

diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
new file mode 100644
index 000000000..356c00d02
--- /dev/null
+++ b/gcc/doc/loop.texi
@@ -0,0 +1,655 @@
+@c Copyright (c) 2006, 2007, 2008 Free Software Foundation, Inc.
+@c This is part of the GCC manual.
+@c For copying conditions, see the file gcc.texi.
+
+@c ---------------------------------------------------------------------
+@c Loop Representation
+@c ---------------------------------------------------------------------
+
+@node Loop Analysis and Representation
+@chapter Analysis and Representation of Loops
+
+GCC provides extensive infrastructure for working with natural loops,
+i.e., strongly connected components of the CFG with only one entry
+block.  This chapter describes the representation of loops in GCC,
+both on GIMPLE and on RTL, as well as the interfaces to loop-related
+analyses (induction variable analysis and number of iterations
+analysis).
+
+@menu
+* Loop representation::         Representation and analysis of loops.
+* Loop querying::               Getting information about loops.
+* Loop manipulation::           Loop manipulation functions.
+* LCSSA::                       Loop-closed SSA form.
+* Scalar evolutions::           Induction variables on GIMPLE.
+* loop-iv::                     Induction variables on RTL.
+* Number of iterations::        Number of iterations analysis.
+* Dependency analysis::         Data dependency analysis.
+* Lambda::                      Linear loop transformations framework.
+* Omega::                       A solver for linear programming problems.
+@end menu
+
+@node Loop representation
+@section Loop representation
+@cindex Loop representation
+@cindex Loop analysis
+
+This chapter describes the representation of loops in GCC, and the
+functions that can be used to build, modify and analyze this
+representation.  Most of the interfaces and data structures are
+declared in @file{cfgloop.h}.  At the moment, loop structures are
+analyzed and this information is updated only by the optimization
+passes that deal with loops, but some efforts are being made to make
+it available throughout most of the optimization passes.
+
+In general, a natural loop has one entry block (header) and possibly
+several back edges (latches) leading to the header from the inside of
+the loop.  Loops with several latches may appear if several loops
+share a single header, or if there is branching in the middle of the
+loop.  The representation of loops in GCC, however, allows only loops
+with a single latch.  During loop analysis, the headers of such loops
+are split and forwarder blocks are created in order to disambiguate
+their structures.
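+For illustration, consider the following C fragment (a made-up
+example, not from the upstream manual): the loop has two back edges
+to the same header, one from each arm of the @code{if}, so loop
+discovery inserts a forwarder block so that the represented loop has
+a single latch:
+
+@smallexample
+/* Illustrative only: both branches jump back to the loop header,
+   creating two latch edges in the CFG.  */
+while (p != NULL)
+  if (p->flag)
+    p = p->next;      /* back edge 1 */
+  else
+    p = p->other;     /* back edge 2 */
+@end smallexample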
+A heuristic based on profile information and on the structure of the
+induction variables in the loops is used to determine whether the
+latches correspond to sub-loops or to control flow in a single loop.
+This means that the analysis sometimes changes the CFG, and if you
+run it in the middle of an optimization pass, you must be able to
+deal with the new blocks.  You may avoid CFG changes by passing the
+@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery;
+note, however, that most other loop manipulation functions will not
+work correctly for loops with multiple latch edges (the functions
+that only query membership of blocks in loops and subloop
+relationships, or that enumerate and test loop exits, can be expected
+to work).
+
+The body of the loop is the set of blocks that are dominated by its
+header and reachable from its latch against the direction of edges in
+the CFG@.  The loops are organized in a containment hierarchy (tree)
+such that all the loops immediately contained inside loop L are the
+children of L in the tree.  This tree is represented by the
+@code{struct loops} structure.  The root of this tree is a fake loop
+that contains all blocks in the function.  Each of the loops is
+represented in a @code{struct loop} structure.  Each loop is assigned
+an index (the @code{num} field of the @code{struct loop} structure),
+and the pointer to the loop is stored in the corresponding field of
+the @code{larray} vector in the loops structure.  The indices do not
+have to be consecutive; there may be empty (@code{NULL}) entries in
+the @code{larray}, created by deleting loops.  Also, there is no
+guarantee on the relative order of a loop and its subloops in the
+numbering.  The index of a loop never changes.
+
+The entries of the @code{larray} field should not be accessed
+directly.  The function @code{get_loop} returns the loop description
+for a loop with the given index.  The @code{number_of_loops} function
+returns the number of loops in the function.  To traverse all loops,
+use the @code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the
+macro is used to determine the direction of traversal and the set of
+loops visited.  Each loop is guaranteed to be visited exactly once,
+regardless of the changes to the loop tree, and the loops may be
+removed during the traversal.  Newly created loops are never
+traversed; if they need to be visited, this must be done separately
+after their creation.  The @code{FOR_EACH_LOOP} macro allocates
+temporary variables.  If the @code{FOR_EACH_LOOP} loop were exited
+using @code{break} or @code{goto}, they would not be released; the
+@code{FOR_EACH_LOOP_BREAK} macro must be used instead.
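+For illustration, here is a sketch of such a traversal (made-up code;
+@code{should_stop_p} and @code{transform_loop} are hypothetical
+helpers, and the @code{loop_iterator}-based interface of this GCC
+version is assumed):
+
+@smallexample
+loop_iterator li;
+struct loop *loop;
+
+FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
+  @{
+    if (should_stop_p ())
+      /* Releases the temporaries allocated by FOR_EACH_LOOP
+         before leaving the traversal.  */
+      FOR_EACH_LOOP_BREAK (li);
+
+    transform_loop (loop);
+  @}
+@end smallexample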
+Each basic block contains a reference to the innermost loop it
+belongs to (@code{loop_father}).  For this reason, it is only
+possible to have one @code{struct loops} structure initialized at a
+time for each CFG@.  The global variable @code{current_loops}
+contains the @code{struct loops} structure.  Many of the loop
+manipulation functions assume that dominance information is
+up-to-date.
+
+The loops are analyzed through the @code{loop_optimizer_init}
+function.  The argument of this function is a set of flags
+represented in an integer bitmask.  These flags specify what other
+properties of the loop structures should be calculated/enforced and
+preserved later:
+
+@itemize
+@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
+changes to the CFG will be performed in the loop analysis; in
+particular, loops with multiple latch edges will not be
+disambiguated.  If a loop has multiple latches, its latch block is
+set to NULL@.  Most of the loop manipulation functions will not work
+for loops in this shape.  No other flags that require CFG changes can
+be passed to @code{loop_optimizer_init}.
+@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in
+such a way that each loop has only one entry edge, and additionally,
+the source block of this entry edge has only one successor.  This
+creates a natural place where the code can be moved out of the loop,
+and ensures that the entry edge of the loop leads from its immediate
+super-loop.
+@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created
+to force the latch block of each loop to have only one successor.
+This ensures that the latch of the loop does not belong to any of its
+sub-loops, and makes manipulation of the loops significantly easier.
+Most of the loop manipulation functions assume that the loops are in
+this shape.  Note that with this flag, the ``normal'' loop without
+any control flow inside and with one exit consists of two basic
+blocks.
+@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
+edges in the strongly connected components that are not natural loops
+(have more than one entry block) are marked with
+@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.
+The flag is not set for blocks and edges that belong to natural loops
+that are in such an irreducible region (but it is set for the entry
+and exit edges of such a loop, if they lead to/from this region).
+@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are
+recorded and updated for each loop.  This makes some functions (e.g.,
+@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
+@code{single_exit}) can be used only if the lists of exits are
+recorded.
+@end itemize
+
+These properties may also be computed/enforced later, using the
+functions @code{create_preheaders}, @code{force_single_succ_latches},
+@code{mark_irreducible_loops} and @code{record_loop_exits}.
+
+The memory occupied by the loops structures should be freed with the
+@code{loop_optimizer_finalize} function.
+
+The CFG manipulation functions in general do not update loop
+structures.  Specialized versions that additionally do so are
+provided for the most common tasks.  On GIMPLE, the
+@code{cleanup_tree_cfg_loop} function can be used to clean up the CFG
+while updating the loop structures if @code{current_loops} is set.
+
+@node Loop querying
+@section Loop querying
+@cindex Loop querying
+
+The functions to query the information about loops are declared in
+@file{cfgloop.h}.  Some of the information can be taken directly from
+the structures.  The @code{loop_father} field of each basic block
+contains the innermost loop to which the block belongs.  The most
+useful fields of the loop structure (that are kept up-to-date at all
+times) are:
+
+@itemize
+@item @code{header}, @code{latch}: Header and latch basic blocks of
+the loop.
+@item @code{num_nodes}: Number of basic blocks in the loop (including
+the basic blocks of the sub-loops).
+@item @code{depth}: The depth of the loop in the loops tree, i.e.,
+the number of super-loops of the loop.
+@item @code{outer}, @code{inner}, @code{next}: The super-loop, the
+first sub-loop, and the sibling of the loop in the loops tree.
+@end itemize
+
+There are other fields in the loop structures, many of them used only
+by some of the passes, or not updated during CFG changes; in general,
+they should not be accessed directly.
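+For illustration, the loop tree can be walked through the
+@code{inner} and @code{next} fields documented above (a made-up
+sketch; only the fields named in this section are relied upon):
+
+@smallexample
+/* Recursively dump the subtree of the loop tree rooted at LOOP,
+   indented by DEPTH.  */
+static void
+dump_loop_subtree (FILE *file, struct loop *loop, int depth)
+@{
+  struct loop *sub;
+
+  fprintf (file, "%*sloop %d: header bb %d, %u blocks\n",
+           2 * depth, "", loop->num, loop->header->index,
+           loop->num_nodes);
+  for (sub = loop->inner; sub; sub = sub->next)
+    dump_loop_subtree (file, sub, depth + 1);
+@}
+@end smallexample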
+
+The most important functions to query loop structures are:
+
+@itemize
+@item @code{flow_loops_dump}: Dumps the information about loops to a
+file.
+@item @code{verify_loop_structure}: Checks consistency of the loop
+structures.
+@item @code{loop_latch_edge}: Returns the latch edge of a loop.
+@item @code{loop_preheader_edge}: If loops have preheaders, returns
+the preheader edge of a loop.
+@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop
+of another loop.
+@item @code{flow_bb_inside_loop_p}: Tests whether a basic block
+belongs to a loop (including its sub-loops).
+@item @code{find_common_loop}: Finds the common super-loop of two
+loops.
+@item @code{superloop_at_depth}: Returns the super-loop of a loop
+with the given depth.
+@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates
+the number of insns in the loop, on GIMPLE and on RTL.
+@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from
+a loop.
+@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
+with the @code{EDGE_LOOP_EXIT} flag.
+@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
+@code{get_loop_body_in_bfs_order}: Enumerates the basic blocks in the
+loop in depth-first search order in the reversed CFG, in an order
+respecting the dominance relation, and in breadth-first search order,
+respectively.
+@item @code{single_exit}: Returns the single exit edge of the loop,
+or @code{NULL} if the loop has more than one exit.  You can only use
+this function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is
+used.
+@item @code{get_loop_exit_edges}: Enumerates the exit edges of a
+loop.
+@item @code{just_once_each_iteration_p}: Returns true if the basic
+block is executed exactly once during each iteration of a loop (that
+is, it does not belong to a sub-loop, and it dominates the latch of
+the loop).
+@end itemize
+
+@node Loop manipulation
+@section Loop manipulation
+@cindex Loop manipulation
+
+The loops tree can be manipulated using the following functions:
+
+@itemize
+@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
+@item @code{flow_loop_tree_node_remove}: Removes a node from the
+tree.
+@item @code{add_bb_to_loop}: Adds a basic block to a loop.
+@item @code{remove_bb_from_loops}: Removes a basic block from loops.
+@end itemize
+
+Most low-level CFG functions update loops automatically.  The
+following functions handle some more complicated cases of CFG
+manipulations:
+
+@itemize
+@item @code{remove_path}: Removes an edge and all blocks it
+dominates.
+@item @code{split_loop_exit_edge}: Splits an exit edge of the loop,
+ensuring that PHI node arguments remain in the loop (this ensures
+that loop-closed SSA form is preserved).  Only useful on GIMPLE.
+@end itemize
+
+Finally, there are some higher-level loop transformations
+implemented.  While some of them are written so that they should work
+on non-innermost loops, they are mostly untested in that case, and at
+the moment, they are only reliable for the innermost loops:
+
+@itemize
+@item @code{create_iv}: Creates a new induction variable.  Only works
+on GIMPLE@.  @code{standard_iv_increment_position} can be used to
+find a suitable place for the iv increment.
+@item @code{duplicate_loop_to_header_edge},
+@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL
+and on GIMPLE) duplicate the body of the loop a prescribed number of
+times on one of the edges entering the loop header, thus performing
+either loop unrolling or loop peeling.
+@code{can_duplicate_loop_p} (@code{can_unroll_loop_p} on GIMPLE) must
+be true for the duplicated loop.
+@item @code{loop_version}, @code{tree_ssa_loop_version}: These
+functions create a copy of a loop, and a branch before the two copies
+that selects one of them depending on the prescribed condition.  This
+is useful for optimizations that need to verify some assumptions at
+runtime (one of the copies of the loop is usually left unchanged,
+while the other one is transformed in some way).
+@item @code{tree_unroll_loop}: Unrolls the loop, including peeling
+the extra iterations to make the number of iterations divisible by
+the unroll factor, updating the exit condition, and removing the
+exits that now cannot be taken.  Works only on GIMPLE.
+@end itemize
+
+@node LCSSA
+@section Loop-closed SSA form
+@cindex LCSSA
+@cindex Loop-closed SSA form
+
+Throughout the loop optimizations on tree level, one extra condition
+is enforced on the SSA form: no SSA name is used outside of the loop
+in which it is defined.  The SSA form satisfying this condition is
+called ``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI
+nodes must be created at the exits of the loops for the SSA names
+that are used outside of them.  Only the real operands (not virtual
+SSA names) are held in LCSSA, in order to save memory.
+
+There are various benefits of LCSSA:
+
+@itemize
+@item Many optimizations (value range analysis, final value
+replacement) are interested in the values that are defined in the
+loop and used outside of it, i.e., exactly those for which we create
+new PHI nodes.
+@item In induction variable analysis, it is not necessary to specify
+the loop in which the analysis should be performed -- the scalar
+evolution analysis always returns the results with respect to the
+loop in which the SSA name is defined.
+@item It makes updating of SSA form during loop transformations
+simpler.  Without LCSSA, operations like loop unrolling may force
+creation of PHI nodes arbitrarily far from the loop, while in LCSSA,
+the SSA form can be updated locally.  However, since we only keep
+real operands in LCSSA, we cannot use this advantage (we could have
+local updating of real operands, but it is not much more efficient
+than using generic SSA form updating for it as well; the amount of
+changes to SSA is the same).
+@end itemize
+
+However, it also means LCSSA must be updated.  This is usually
+straightforward, unless you create a new value in a loop and use it
+outside, or unless you manipulate loop exit edges (functions are
+provided to make these manipulations simple).
+@code{rewrite_into_loop_closed_ssa} is used to rewrite SSA form to
+LCSSA, and @code{verify_loop_closed_ssa} to check that the invariant
+of LCSSA is preserved.
+
+@node Scalar evolutions
+@section Scalar evolutions
+@cindex Scalar evolutions
+@cindex IV analysis on GIMPLE
+
+Scalar evolutions (SCEV) are used to represent results of induction
+variable analysis on GIMPLE@.  They enable us to represent variables
+with complicated behavior in a simple and consistent way (we only use
+it to express values of polynomial induction variables, but it is
+possible to extend it).  The interfaces to SCEV analysis are declared
+in @file{tree-scalar-evolution.h}.  To start using the scalar
+evolution analysis, call @code{scev_initialize}; to stop using it,
+call @code{scev_finalize}.  SCEV analysis caches its results in order
+to save time and memory.  This cache, however, is invalidated by most
+of the loop transformations, including removal of code.  If such a
+transformation is performed, @code{scev_reset} must be called to
+clean the caches.
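+A minimal sketch of this lifecycle (made-up code;
+@code{my_loop_transform} is a hypothetical pass body):
+
+@smallexample
+scev_initialize ();
+
+/* @dots{} queries of scalar evolutions @dots{} */
+
+/* A transformation that changes the CFG or removes code
+   invalidates the cached results.  */
+my_loop_transform ();
+scev_reset ();
+
+/* @dots{} further queries are safe again @dots{} */
+
+scev_finalize ();
+@end smallexample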
+Given an SSA name, its behavior in loops can be analyzed using the
+@code{analyze_scalar_evolution} function.  The returned SCEV,
+however, does not have to be fully analyzed and it may contain
+references to other SSA names defined in the loop.  To resolve these
+(potentially recursive) references, the @code{instantiate_parameters}
+or @code{resolve_mixers} functions must be used.
+@code{instantiate_parameters} is useful when you use the results of
+SCEV only for some analysis, and when you work with a whole nest of
+loops at once.  It will try to replace all SSA names by their SCEV in
+all loops, including the super-loops of the current loop, thus
+providing complete information about the behavior of the variable in
+the loop nest.  @code{resolve_mixers} is useful if you work with only
+one loop at a time, and if you possibly need to create code based on
+the value of the induction variable.  It will only resolve the SSA
+names defined in the current loop, leaving the SSA names defined
+outside unchanged, even if their evolution in the outer loops is
+known.
+
+The SCEV is a normal tree expression, except for the fact that it may
+contain several special tree nodes.  One of them is
+@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
+expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
+chrec has three arguments -- base, step and loop (both base and step
+may contain further polynomial chrecs).  The type of the expression
+and of the base and step must be the same.  A variable has evolution
+@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
+loop) equivalent to @code{x_1} in the following example:
+
+@smallexample
+while (@dots{})
+  @{
+    x_1 = phi (base, x_2);
+    x_2 = x_1 + step;
+  @}
+@end smallexample
+
+Note that this includes the language restrictions on the operations.
+For example, if we compile C code and @code{x} has signed type, then
+the overflow in addition would cause undefined behavior, and we may
+assume that this does not happen.  Hence, the value with this SCEV
+cannot overflow (which restricts the number of iterations of such a
+loop).
+
+In many cases, one wants to restrict attention just to affine
+induction variables.  Here, the extra expressive power of SCEV is not
+useful, and may complicate the optimizations.  The @code{simple_iv}
+function may then be used to analyze a value -- the result is a
+loop-invariant base and step.
+
+@node loop-iv
+@section IV analysis on RTL
+@cindex IV analysis on RTL
+
+The induction variable analysis on RTL is simpler: it only allows
+analysis of affine induction variables, and only in one loop at a
+time.  The interface is declared in @file{cfgloop.h}.  Before
+analyzing induction variables in a loop L, the
+@code{iv_analysis_loop_init} function must be called on L.  After the
+analysis (possibly calling @code{iv_analysis_loop_init} for several
+loops) is finished, @code{iv_analysis_done} should be called.  The
+following functions can be used to access the results of the
+analysis:
+
+@itemize
+@item @code{iv_analyze}: Analyzes a single register used in the given
+insn.  If no use of the register in this insn is found, the following
+insns are scanned, so that this function can be called on the insn
+returned by @code{get_condition}.
+@item @code{iv_analyze_result}: Analyzes the result of the assignment
+in the given insn.
+@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
+All its operands are analyzed by @code{iv_analyze}, and hence they
+must be used in the specified insn or one of the following insns.
+@end itemize
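+For illustration, a typical use of this interface might look as
+follows (made-up code; the way @code{loop}, @code{insn} and
+@code{reg} are obtained, and the exact signatures, are assumptions
+based on this GCC version):
+
+@smallexample
+struct rtx_iv iv;
+
+iv_analysis_loop_init (loop);
+if (iv_analyze (insn, reg, &iv))
+  @{
+    /* iv now describes the value of reg in terms of the iteration
+       number; see the description of struct rtx_iv below.  */
+    @dots{}
+  @}
+iv_analysis_done ();
+@end smallexample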
+The description of the induction variable is provided in
+@code{struct rtx_iv}.  In order to handle subregs, the representation
+is a bit complicated; if the value of the @code{extend} field is not
+@code{UNKNOWN}, the value of the induction variable in the i-th
+iteration is
+
+@smallexample
+delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
+@end smallexample
+
+with the following exception: if @code{first_special} is true, then
+the value in the first iteration (when @code{i} is zero) is
+@code{delta + mult * base}.  However, if @code{extend} is equal to
+@code{UNKNOWN}, then @code{first_special} must be false, @code{delta}
+must be 0, @code{mult} must be 1, and the value in the i-th iteration
+is
+
+@smallexample
+subreg_@{mode@} (base + i * step)
+@end smallexample
+
+The function @code{get_iv_value} can be used to perform these
+calculations.
+
+@node Number of iterations
+@section Number of iterations analysis
+@cindex Number of iterations analysis
+
+Both on GIMPLE and on RTL, there are functions available to determine
+the number of iterations of a loop, with a similar interface.  The
+number of iterations of a loop in GCC is defined as the number of
+executions of the loop latch.  In many cases, it is not possible to
+determine the number of iterations unconditionally -- the determined
+number is correct only if some assumptions are satisfied.  The
+analysis tries to verify these conditions using the information
+contained in the program; if it fails, the conditions are returned
+together with the result.  The following information and conditions
+are provided by the analysis:
+
+@itemize
+@item @code{assumptions}: If this condition is false, the rest of the
+information is invalid.
+@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE:
+If this condition is true, the loop exits in the first iteration.
+@item @code{infinite}: If this condition is true, the loop is
+infinite.  This condition is only available on RTL@.  On GIMPLE,
+conditions for finiteness of the loop are included in
+@code{assumptions}.
+@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The
+expression that gives the number of iterations.  The number of
+iterations is defined as the number of executions of the loop latch.
+@end itemize
+
+Both on GIMPLE and on RTL, it is necessary for the induction variable
+analysis framework to be initialized (SCEV on GIMPLE, loop-iv on
+RTL).  On GIMPLE, the results are stored in the
+@code{struct tree_niter_desc} structure.  The number of iterations
+before the loop is exited through a given exit can be determined
+using the @code{number_of_iterations_exit} function.  On RTL, the
+results are returned in the @code{struct niter_desc} structure.  The
+corresponding function is named @code{check_simple_exit}.  There are
+also functions that pass through all the exits of a loop and try to
+find one with an easy-to-determine number of iterations --
+@code{find_loop_niter} on GIMPLE and @code{find_simple_exit} on RTL@.
+Finally, there are functions that provide the same information, but
+additionally cache it, so that repeated queries for the number of
+iterations are not so costly -- @code{number_of_latch_executions} on
+GIMPLE and @code{get_simple_loop_desc} on RTL.
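+For illustration, a GIMPLE-side query might look as follows (made-up
+code; it assumes exits are recorded so that @code{single_exit} can be
+used, and the signature shown is an assumption based on this GCC
+version):
+
+@smallexample
+struct tree_niter_desc niter;
+edge exit = single_exit (loop);
+
+if (exit && number_of_iterations_exit (loop, exit, &niter, false))
+  @{
+    /* The result is valid only under niter.assumptions.  If
+       niter.may_be_zero holds, the loop may exit before the first
+       execution of the latch; otherwise niter.niter gives the
+       number of latch executions.  */
+    @dots{}
+  @}
+@end smallexample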
+
+Note that some of these functions may behave slightly differently
+than others -- some of them return only the expression for the number
+of iterations, and fail if any assumptions are required.  The
+function @code{number_of_latch_executions} works only for single-exit
+loops.  The function @code{number_of_cond_exit_executions} can be
+used to determine the number of executions of the exit condition of a
+single-exit loop (i.e., the @code{number_of_latch_executions}
+increased by one).
+
+@node Dependency analysis
+@section Data Dependency Analysis
+@cindex Data Dependency Analysis
+
+The code for the data dependence analysis can be found in
+@file{tree-data-ref.c} and its interface and data structures are
+described in @file{tree-data-ref.h}.  The function that computes the
+data dependences for all the array and pointer references for a given
+loop is @code{compute_data_dependences_for_loop}.  This function is
+currently used by the linear loop transform and the vectorization
+passes.  Before calling this function, one has to allocate two
+vectors: the first vector will contain the set of data references
+that are contained in the analyzed loop body, and the second vector
+will contain the dependence relations between the data references.
+Thus if the vector of data references is of size @code{n}, the vector
+containing the dependence relations will contain @code{n*n} elements.
+However, if the analyzed loop contains side effects, such as calls
+that potentially can interfere with the data references in the
+current analyzed loop, the analysis stops while scanning the loop
+body for data references, and inserts a single
+@code{chrec_dont_know} in the dependence relation array.
+
+The data references are discovered in a particular order during the
+scanning of the loop body: the loop body is analyzed in execution
+order, and the data references of each statement are pushed at the
+end of the data reference array.  Two data references thus occur
+syntactically in the program in the same order as they appear in the
+array of data references.  This syntactic order is important in some
+classical data dependence tests, and mapping this order to the
+elements of this array avoids costly queries to the loop body
+representation.
+
+Three types of data references are currently handled: ARRAY_REF,
+INDIRECT_REF and COMPONENT_REF@.  The data structure for the data
+reference is @code{data_reference}, where @code{data_reference_p} is
+the name of a pointer type to the data reference structure.  The
+structure contains the following elements:
+
+@itemize
+@item @code{base_object_info}: Provides information about the base
+object of the data reference and its access functions.  These access
+functions represent the evolution of the data reference in the loop
+relative to its base, in keeping with the classical meaning of the
+data reference access function for the support of arrays.  For
+example, for a reference @code{a.b[i][j]}, the base object is
+@code{a.b} and the access functions, one for each array subscript,
+are: @code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.
+
+@item @code{first_location_in_loop}: Provides information about the
+first location accessed by the data reference in the loop and about
+the access function used to represent evolution relative to this
+location.  This data is used to support pointers, and is not used for
+arrays (for which we have base objects).  Pointer accesses are
+represented as a one-dimensional access that starts from the first
+location accessed in the loop.
+For example:
+
+@smallexample
+for1 i
+   for2 j
+    *((int *)p + i + j) = a[i][j];
+@end smallexample
+
+The access function of the pointer access is @code{@{0, +, 4B@}_for2}
+relative to @code{p + i}.  The access functions of the array are
+@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +,
+j_step@}_for2} relative to @code{a}.
+
+Usually, the object the pointer refers to is either unknown, or we
+can't prove that the access is confined to the boundaries of a
+certain object.
+
+Two data references can be compared only if at least one of these two
+representations has all its fields filled for both data references.
+
+The current strategy for data dependence tests is as follows:
+If both @code{a} and @code{b} are represented as arrays, compare
+@code{a.base_object} and @code{b.base_object};
+if they are equal, apply dependence tests (use access functions based
+on base_objects).
+Else if both @code{a} and @code{b} are represented as pointers,
+compare @code{a.first_location} and @code{b.first_location};
+if they are equal, apply dependence tests (use access functions based
+on first location).
+However, if @code{a} and @code{b} are represented differently, only
+try to prove that the bases are definitely different.
+
+@item Aliasing information.
+@item Alignment information.
+@end itemize
+
+The structure describing the relation between two data references is
+@code{data_dependence_relation} and the shorter name for a pointer to
+such a structure is @code{ddr_p}.  This structure contains:
+
+@itemize
+@item a pointer to each data reference,
+@item a tree node @code{are_dependent} that is set to
+@code{chrec_known} if the analysis has proved that there is no
+dependence between these two data references, to
+@code{chrec_dont_know} if the analysis was not able to determine any
+useful result and potentially there could exist a dependence between
+these data references, and to @code{NULL_TREE} if there exists a
+dependence relation between the data references; in this last case,
+the description of the dependence relation is given in the
+@code{subscripts}, @code{dir_vects}, and @code{dist_vects} arrays,
+@item a boolean that determines whether the dependence relation can
+be represented by a classical distance vector,
+@item an array @code{subscripts} that contains a description of each
+subscript of the data references.  Given two array accesses, a
+subscript is the tuple composed of the access functions for a given
+dimension.  For example, given @code{A[f1][f2][f3]} and
+@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1),
+(f2, g2), (f3, g3)}.
+@item two arrays @code{dir_vects} and @code{dist_vects} that contain
+classical representations of the data dependences in the form of
+direction and distance dependence vectors,
+@item an array of loops @code{loop_nest} that contains the loops to
+which the distance and direction vectors refer.
+@end itemize
+
+Several functions for pretty printing the information extracted by
+the data dependence analysis are available: @code{dump_ddrs} prints
+with maximum verbosity the details of a data dependence relations
+array, @code{dump_dist_dir_vectors} prints only the classical
+distance and direction vectors for a data dependence relations array,
+and @code{dump_data_references} prints the details of the data
+references contained in a data reference array.
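+As a worked illustration (an example added here, not from the
+upstream manual), consider the dependence between the write and the
+read of @code{A} in the following nest:
+
+@smallexample
+for (i = 1; i < N; i++)
+  for (j = 0; j < N; j++)
+    A[i][j] = A[i-1][j];
+@end smallexample
+
+The value written in iteration @code{(i, j)} is read again in
+iteration @code{(i+1, j)}, so the dependence has distance vector
+@code{(1, 0)} and direction vector @code{(<, =)} -- exactly the kind
+of information recorded in the @code{dist_vects} and @code{dir_vects}
+arrays described above.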
+
+@node Lambda
+@section Linear loop transformations framework
+@cindex Linear loop transformations framework
+
+Lambda is a framework that allows transformations of loops using
+non-singular matrix-based transformations of the iteration space and
+loop bounds.  This allows compositions of skewing, scaling,
+interchange, and reversal transformations.  These transformations are
+often used to improve cache behavior or remove inner loop
+dependencies to allow parallelization and vectorization to take
+place.
+
+To perform these transformations, Lambda requires that the loopnest
+be converted into an internal form that can be matrix transformed
+easily.  To do this conversion, the function
+@code{gcc_loopnest_to_lambda_loopnest} is provided.  If the loop
+cannot be transformed using lambda, this function will return
+@code{NULL}.
+
+Once a @code{lambda_loopnest} is obtained from the conversion
+function, it can be transformed by using
+@code{lambda_loopnest_transform}, which takes a transformation matrix
+to apply.  Note that it is up to the caller to verify that the
+transformation matrix is legal to apply to the loop (dependence
+respecting, etc.); Lambda simply applies whatever matrix it is given.
+It can be extended to make legal matrices out of any non-singular
+matrix, but this is not currently implemented.  Legality of a matrix
+for a given loopnest can be verified using
+@code{lambda_transform_legal_p}.
+
+Given a transformed loopnest, conversion back into GCC IR is done by
+@code{lambda_loopnest_to_gcc_loopnest}.  This function will modify
+the loops so that they match the transformed loopnest.
+
+
+@node Omega
+@section Omega, a solver for linear programming problems
+@cindex Omega a solver for linear programming problems
+
+The data dependence analysis contains several solvers triggered
+sequentially from the less complex ones to the more sophisticated
+ones.  To ensure the consistency of the results of these solvers, a
+data dependence check pass has been implemented, based on two
+different solvers.  The second method that has been integrated into
+GCC is based on the Omega dependence solver, written in the 1990s by
+William Pugh and David Wonnacott.  Data dependence tests can be
+formulated using a subset of Presburger arithmetic that can be
+translated to linear constraint systems.  These linear constraint
+systems can then be solved using the Omega solver.
+
+The Omega solver uses the Fourier-Motzkin algorithm for variable
+elimination: a linear constraint system containing @code{n} variables
+is reduced to a linear constraint system with @code{n-1} variables.
+The Omega solver can also be used for solving other problems that can
+be expressed as a system of linear equalities and inequalities.  The
+Omega solver is known to have an exponential worst case, also known
+as the ``omega nightmare'' in the literature, but in practice, the
+omega test is known to be efficient for the common data dependence
+tests.
+
+The interface used by the Omega solver for describing the linear
+programming problems is described in @file{omega.h}, and the entry
+point of the solver is @code{omega_solve_problem}.
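+For illustration (an example added here, not from the upstream
+manual), the question whether @code{A[2*i]} and @code{A[2*j + 1]} can
+touch the same element, for iterations @code{i} and @code{j} of a
+loop running from 0 to @code{N}, translates to the constraint system
+
+@smallexample
+2*i = 2*j + 1
+0 <= i <= N
+0 <= j <= N
+@end smallexample
+
+The equality has no integer solution (its left-hand side is even and
+its right-hand side is odd), so the solver reports the system as
+infeasible and the two references are proved independent.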