1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Diagnostics</title><meta name="generator" content="DocBook XSL-NS Stylesheets V1.76.1"/><meta name="keywords" content=" C++ , library , profile "/><meta name="keywords" content=" ISO C++ , library "/><link rel="home" href="../spine.html" title="The GNU C++ Library"/><link rel="up" href="profile_mode.html" title="Chapter 19. Profile Mode"/><link rel="prev" href="bk01pt03ch19s06.html" title="Developer Information"/><link rel="next" href="ext_allocators.html" title="Chapter 20. Allocators"/></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Diagnostics</th></tr><tr><td align="left"><a accesskey="p" href="bk01pt03ch19s06.html">Prev</a> </td><th width="60%" align="center">Chapter 19. Profile Mode</th><td align="right"> <a accesskey="n" href="ext_allocators.html">Next</a></td></tr></table><hr/></div><div class="section" title="Diagnostics"><div class="titlepage"><div><div><h2 class="title"><a id="manual.ext.profile_mode.diagnostics"/>Diagnostics</h2></div></div></div><p>
The table below presents all the diagnostics we intend to implement.
Each diagnostic has a corresponding compile time switch
<code class="code">-D_GLIBCXX_PROFILE_<diagnostic></code>.
Groups of related diagnostics can be turned on with a single switch.
For instance, <code class="code">-D_GLIBCXX_PROFILE_LOCALITY</code> is equivalent to
<code class="code">-D_GLIBCXX_PROFILE_SOFTWARE_PREFETCH
-D_GLIBCXX_PROFILE_RBTREE_LOCALITY</code>.
</p><p>
The benefit, cost, expected frequency and accuracy of each diagnostic
was given a grade from 1 to 10, where 10 is highest.
A high benefit means that, if the diagnostic is accurate, the expected
performance improvement is high.
A high cost means that turning this diagnostic on leads to high slowdown.
A high frequency means that we expect this to occur relatively often.
A high accuracy means that the diagnostic is unlikely to be wrong.
These grades are not perfect. They are just meant to guide users with
specific needs or time budgets.
</p><div class="table"><a id="id487386"/><p class="title"><strong>Table 19.2. Profile Diagnostics</strong></p><div class="table-contents"><table summary="Profile Diagnostics" border="1"><colgroup><col style="text-align: left" class="c1"/><col style="text-align: left" class="c2"/><col style="text-align: left" class="c3"/><col style="text-align: left" class="c4"/><col style="text-align: left" class="c5"/><col style="text-align: left" class="c6"/><col style="text-align: left" class="c7"/></colgroup><thead><tr><th style="text-align: left">Group</th><th style="text-align: left">Flag</th><th style="text-align: left">Benefit</th><th style="text-align: left">Cost</th><th style="text-align: left">Freq.</th><th style="text-align: left">Implemented</th><td class="auto-generated"> </td></tr></thead><tbody><tr><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.containers" title="Containers">
CONTAINERS</a></td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.hashtable_too_small" title="Hashtable Too Small">
HASHTABLE_TOO_SMALL</a></td><td style="text-align: left">10</td><td style="text-align: left">1</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.hashtable_too_large" title="Hashtable Too Large">
HASHTABLE_TOO_LARGE</a></td><td style="text-align: left">5</td><td style="text-align: left">1</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.inefficient_hash" title="Inefficient Hash">
INEFFICIENT_HASH</a></td><td style="text-align: left">7</td><td style="text-align: left">3</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.vector_too_small" title="Vector Too Small">
VECTOR_TOO_SMALL</a></td><td style="text-align: left">8</td><td style="text-align: left">1</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.vector_too_large" title="Vector Too Large">
VECTOR_TOO_LARGE</a></td><td style="text-align: left">5</td><td style="text-align: left">1</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.vector_to_hashtable" title="Vector to Hashtable">
VECTOR_TO_HASHTABLE</a></td><td style="text-align: left">7</td><td style="text-align: left">7</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.hashtable_to_vector" title="Hashtable to Vector">
HASHTABLE_TO_VECTOR</a></td><td style="text-align: left">7</td><td style="text-align: left">7</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.vector_to_list" title="Vector to List">
VECTOR_TO_LIST</a></td><td style="text-align: left">8</td><td style="text-align: left">5</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">yes</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.list_to_vector" title="List to Vector">
LIST_TO_VECTOR</a></td><td style="text-align: left">10</td><td style="text-align: left">5</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.assoc_ord_to_unord" title="Ordered to Unordered Associative Container">
ORDERED_TO_UNORDERED</a></td><td style="text-align: left">10</td><td style="text-align: left">5</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">only map/unordered_map</td></tr><tr><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.algorithms" title="Algorithms">
ALGORITHMS</a></td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.algorithms.sort" title="Sort Algorithm Performance">
SORT</a></td><td style="text-align: left">7</td><td style="text-align: left">8</td><td style="text-align: left"> </td><td style="text-align: left">7</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.locality" title="Data Locality">
LOCALITY</a></td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.locality.sw_prefetch" title="Need Software Prefetch">
SOFTWARE_PREFETCH</a></td><td style="text-align: left">8</td><td style="text-align: left">8</td><td style="text-align: left"> </td><td style="text-align: left">5</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.locality.linked" title="Linked Structure Locality">
RBTREE_LOCALITY</a></td><td style="text-align: left">4</td><td style="text-align: left">8</td><td style="text-align: left"> </td><td style="text-align: left">5</td><td style="text-align: left">no</td></tr><tr><td style="text-align: left"> </td><td style="text-align: left"><a class="link" href="bk01pt03ch19s07.html#manual.ext.profile_mode.analysis.mthread.false_share" title="False Sharing">
FALSE_SHARING</a></td><td style="text-align: left">8</td><td style="text-align: left">10</td><td style="text-align: left"> </td><td style="text-align: left">10</td><td style="text-align: left">no</td></tr></tbody></table></div></div><br class="table-break"/><div class="section" title="Diagnostic Template"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.template"/>Diagnostic Template</h3></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_<diagnostic></code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> What problem will it diagnose?
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>.
What is the fundamental reason why this is a problem</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>
Percentage reduction in execution time. When reduction is more than
a constant factor, describe the reduction rate formula.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
What would the advise look like?</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span>
What stdlibc++ components need to be instrumented?</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
How do we decide when to issue the advice?</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
How do we measure benefits? Math goes here.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
program code
...
advice sample
</pre><p>
</p></li></ul></div></div><div class="section" title="Containers"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.containers"/>Containers</h3></div></div></div><p>
<span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_CONTAINERS</code>.
</p><div class="section" title="Hashtable Too Small"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.hashtable_too_small"/>Hashtable Too Small</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_HASHTABLE_TOO_SMALL</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect hashtables with many
rehash operations, small construction size and large destruction size.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span> Rehash is very expensive.
Read content, follow chains within bucket, evaluate hash function, place at
new location in different order.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span> 36%.
Code similar to example below.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
Set initial size to N at construction site S.
</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span>
<code class="code">unordered_set, unordered_map</code> constructor, destructor, rehash.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">unordered_[multi]set|map</code>,
record initial size and call context of the constructor.
Record size increase, if any, after each relevant operation such as insert.
Record the estimated rehash cost.</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Number of individual rehash operations * cost per rehash.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 unordered_set<int> us;
2 for (int k = 0; k < 1000000; ++k) {
3 us.insert(k);
4 }
foo.cc:1: advice: Changing initial unordered_set size from 10 to 1000000 saves 1025530 rehash operations.
</pre><p>
</p></li></ul></div></div><div class="section" title="Hashtable Too Large"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.hashtable_too_large"/>Hashtable Too Large</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_HASHTABLE_TOO_LARGE</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect hashtables which are
never filled up because fewer elements than reserved are ever
inserted.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span> Save memory, which
is good in itself and may also improve memory reference performance through
fewer cache and TLB misses.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span> unknown.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
Set initial size to N at construction site S.
</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span>
<code class="code">unordered_set, unordered_map</code> constructor, destructor, rehash.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">unordered_[multi]set|map</code>,
record initial size and call context of the constructor, and correlate it
with its size at destruction time.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Number of iteration operations + memory saved.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<unordered_set<int>> v(100000, unordered_set<int>(100)) ;
2 for (int k = 0; k < 100000; ++k) {
3 for (int j = 0; j < 10; ++j) {
4 v[k].insert(k + j);
5 }
6 }
foo.cc:1: advice: Changing initial unordered_set size from 100 to 10 saves N
bytes of memory and M iteration steps.
</pre><p>
</p></li></ul></div></div><div class="section" title="Inefficient Hash"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.inefficient_hash"/>Inefficient Hash</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_INEFFICIENT_HASH</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect hashtables with polarized
distribution.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span> A non-uniform
distribution may lead to long chains, thus possibly increasing complexity
by a factor up to the number of elements.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span> factor up
to container size.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span> Change hash function
for container built at site S. Distribution score = N. Access score = S.
Longest chain = C, in bucket B.
</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span>
<code class="code">unordered_set, unordered_map</code> constructor, destructor, [],
insert, iterator.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Count the exact number of link traversals.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Total number of links traversed.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
class dumb_hash {
public:
size_t operator() (int i) const { return 0; }
};
...
unordered_set<int, dumb_hash> hs;
...
for (int i = 0; i < COUNT; ++i) {
hs.find(i);
}
</pre><p>
</p></li></ul></div></div><div class="section" title="Vector Too Small"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.vector_too_small"/>Vector Too Small</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_VECTOR_TOO_SMALL</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span>Detect vectors with many
resize operations, small construction size and large destruction size..
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>Resizing can be expensive.
Copying large amounts of data takes time. Resizing many small vectors may
have allocation overhead and affect locality.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
Set initial size to N at construction site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">vector</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">vector</code>,
record initial size and call context of the constructor.
Record size increase, if any, after each relevant operation such as
<code class="code">push_back</code>. Record the estimated resize cost.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Total number of words copied * time to copy a word.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<int> v;
2 for (int k = 0; k < 1000000; ++k) {
3 v.push_back(k);
4 }
foo.cc:1: advice: Changing initial vector size from 10 to 1000000 saves
copying 4000000 bytes and 20 memory allocations and deallocations.
</pre><p>
</p></li></ul></div></div><div class="section" title="Vector Too Large"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.vector_too_large"/>Vector Too Large</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_VECTOR_TOO_LARGE</code>
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span>Detect vectors which are
never filled up because fewer elements than reserved are ever
inserted.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>Save memory, which
is good in itself and may also improve memory reference performance through
fewer cache and TLB misses.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
Set initial size to N at construction site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">vector</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">vector</code>,
record initial size and call context of the constructor, and correlate it
with its size at destruction time.</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Total amount of memory saved.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<vector<int>> v(100000, vector<int>(100)) ;
2 for (int k = 0; k < 100000; ++k) {
3 for (int j = 0; j < 10; ++j) {
4 v[k].insert(k + j);
5 }
6 }
foo.cc:1: advice: Changing initial vector size from 100 to 10 saves N
bytes of memory and may reduce the number of cache and TLB misses.
</pre><p>
</p></li></ul></div></div><div class="section" title="Vector to Hashtable"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.vector_to_hashtable"/>Vector to Hashtable</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_VECTOR_TO_HASHTABLE</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect uses of
<code class="code">vector</code> that can be substituted with <code class="code">unordered_set</code>
to reduce execution time.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Linear search in a vector is very expensive, whereas searching in a hashtable
is very quick.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>factor up
to container size.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>Replace
<code class="code">vector</code> with <code class="code">unordered_set</code> at site S.
</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">vector</code>
operations and access methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">vector</code>,
record call context of the constructor. Issue the advice only if the
only methods called on this <code class="code">vector</code> are <code class="code">push_back</code>,
<code class="code">insert</code> and <code class="code">find</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Cost(vector::push_back) + cost(vector::insert) + cost(find, vector) -
cost(unordered_set::insert) + cost(unordered_set::find).
</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<int> v;
...
2 for (int i = 0; i < 1000; ++i) {
3 find(v.begin(), v.end(), i);
4 }
foo.cc:1: advice: Changing "vector" to "unordered_set" will save about 500,000
comparisons.
</pre><p>
</p></li></ul></div></div><div class="section" title="Hashtable to Vector"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.hashtable_to_vector"/>Hashtable to Vector</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_HASHTABLE_TO_VECTOR</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect uses of
<code class="code">unordered_set</code> that can be substituted with <code class="code">vector</code>
to reduce execution time.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Hashtable iterator is slower than vector iterator.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>95%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>Replace
<code class="code">unordered_set</code> with <code class="code">vector</code> at site S.
</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">unordered_set</code>
operations and access methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">unordered_set</code>,
record call context of the constructor. Issue the advice only if the
number of <code class="code">find</code>, <code class="code">insert</code> and <code class="code">[]</code>
operations on this <code class="code">unordered_set</code> are small relative to the
number of elements, and methods <code class="code">begin</code> or <code class="code">end</code>
are invoked (suggesting iteration).</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Number of .</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 unordered_set<int> us;
...
2 int s = 0;
3 for (unordered_set<int>::iterator it = us.begin(); it != us.end(); ++it) {
4 s += *it;
5 }
foo.cc:1: advice: Changing "unordered_set" to "vector" will save about N
indirections and may achieve better data locality.
</pre><p>
</p></li></ul></div></div><div class="section" title="Vector to List"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.vector_to_list"/>Vector to List</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_VECTOR_TO_LIST</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect cases where
<code class="code">vector</code> could be substituted with <code class="code">list</code> for
better performance.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Inserting in the middle of a vector is expensive compared to inserting in a
list.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>factor up to
container size.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>Replace vector with list
at site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">vector</code>
operations and access methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
For each dynamic instance of <code class="code">vector</code>,
record the call context of the constructor. Record the overhead of each
<code class="code">insert</code> operation based on current size and insert position.
Report instance with high insertion overhead.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
(Sum(cost(vector::method)) - Sum(cost(list::method)), for
method in [push_back, insert, erase])
+ (Cost(iterate vector) - Cost(iterate list))</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<int> v;
2 for (int i = 0; i < 10000; ++i) {
3 v.insert(v.begin(), i);
4 }
foo.cc:1: advice: Changing "vector" to "list" will save about 5,000,000
operations.
</pre><p>
</p></li></ul></div></div><div class="section" title="List to Vector"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.list_to_vector"/>List to Vector</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_LIST_TO_VECTOR</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect cases where
<code class="code">list</code> could be substituted with <code class="code">vector</code> for
better performance.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Iterating through a vector is faster than through a list.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>64%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>Replace list with vector
at site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">vector</code>
operations and access methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Issue the advice if there are no <code class="code">insert</code> operations.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
(Sum(cost(vector::method)) - Sum(cost(list::method)), for
method in [push_back, insert, erase])
+ (Cost(iterate vector) - Cost(iterate list))</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 list<int> l;
...
2 int sum = 0;
3 for (list<int>::iterator it = l.begin(); it != l.end(); ++it) {
4 sum += *it;
5 }
foo.cc:1: advice: Changing "list" to "vector" will save about 1000000 indirect
memory references.
</pre><p>
</p></li></ul></div></div><div class="section" title="List to Forward List (Slist)"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.list_to_slist"/>List to Forward List (Slist)</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_LIST_TO_SLIST</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect cases where
<code class="code">list</code> could be substituted with <code class="code">forward_list</code> for
better performance.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
The memory footprint of a forward_list is smaller than that of a list.
This has beneficial effects on memory subsystem, e.g., fewer cache misses.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>40%.
Note that the reduction is only noticeable if the size of the forward_list
node is in fact larger than that of the list node. For memory allocators
with size classes, you will only notice an effect when the two node sizes
belong to different allocator size classes.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>Replace list with
forward_list at site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span><code class="code">list</code>
operations and iteration methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Issue the advice if there are no <code class="code">backwards</code> traversals
or insertion before a given node.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Always true.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 list<int> l;
...
2 int sum = 0;
3 for (list<int>::iterator it = l.begin(); it != l.end(); ++it) {
4 sum += *it;
5 }
foo.cc:1: advice: Change "list" to "forward_list".
</pre><p>
</p></li></ul></div></div><div class="section" title="Ordered to Unordered Associative Container"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.assoc_ord_to_unord"/>Ordered to Unordered Associative Container</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_ORDERED_TO_UNORDERED</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect cases where ordered
associative containers can be replaced with unordered ones.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Insert and search are quicker in a hashtable than in
a red-black tree.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>52%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
Replace set with unordered_set at site S.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span>
<code class="code">set</code>, <code class="code">multiset</code>, <code class="code">map</code>,
<code class="code">multimap</code> methods.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Issue the advice only if we are not using operator <code class="code">++</code> on any
iterator on a particular <code class="code">[multi]set|map</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
(Sum(cost(hashtable::method)) - Sum(cost(rbtree::method)), for
method in [insert, erase, find])
+ (Cost(iterate hashtable) - Cost(iterate rbtree))</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 set<int> s;
2 for (int i = 0; i < 100000; ++i) {
3 s.insert(i);
4 }
5 int sum = 0;
6 for (int i = 0; i < 100000; ++i) {
7 sum += *s.find(i);
8 }
</pre><p>
</p></li></ul></div></div></div><div class="section" title="Algorithms"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.algorithms"/>Algorithms</h3></div></div></div><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_ALGORITHMS</code>.
</p><div class="section" title="Sort Algorithm Performance"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.algorithms.sort"/>Sort Algorithm Performance</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_SORT</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Give measure of sort algorithm
performance based on actual input. For instance, advise Radix Sort over
Quick Sort for a particular call context.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
See papers:
<a class="link" href="http://portal.acm.org/citation.cfm?doid=1065944.1065981">
A framework for adaptive algorithm selection in STAPL</a> and
<a class="link" href="http://ieeexplore.ieee.org/search/wrapper.jsp?arnumber=4228227">
Optimizing Sorting with Machine Learning Algorithms</a>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>60%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span> Change sort algorithm
at site S from X Sort to Y Sort.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span> <code class="code">sort</code>
algorithm.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Issue the advice if the cost model tells us that another sort algorithm
would do better on this input. Requires us to know what algorithm we
are using in our sort implementation in release mode.</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Runtime(algo) for algo in [radix, quick, merge, ...]</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
</pre><p>
</p></li></ul></div></div></div><div class="section" title="Data Locality"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.locality"/>Data Locality</h3></div></div></div><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_LOCALITY</code>.
</p><div class="section" title="Need Software Prefetch"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.locality.sw_prefetch"/>Need Software Prefetch</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_SOFTWARE_PREFETCH</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Discover sequences of indirect
memory accesses that are not regular, thus cannot be predicted by
hardware prefetchers.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Indirect references are hard to predict and are very expensive when they
miss in caches.</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>25%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span> Insert prefetch
instruction.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span> Vector iterator and
access operator [].
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
First, get cache line size and page size from system.
Then record iterator dereference sequences for which the value is a pointer.
For each sequence within a container, issue a warning if successive pointer
addresses are not within cache lines and do not form a linear pattern
(otherwise they may be prefetched by hardware).
If they also step across page boundaries, make the warning stronger.
</p><p>The same analysis applies to containers other than vector.
However, we cannot give the same advice for linked structures, such as list,
as there is no random access to the n-th element. The user may still be
able to benefit from this information, for instance by employing frays (user
level light weight threads) to hide the latency of chasing pointers.
</p><p>
This analysis is a little oversimplified. A better cost model could be
created by understanding the capability of the hardware prefetcher.
This model could be trained automatically by running a set of synthetic
cases.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Total distance between pointer values of successive elements in vectors
of pointers.</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 int zero = 0;
2 vector<int*> v(10000000, &zero);
3 for (int k = 0; k < 10000000; ++k) {
4 v[random() % 10000000] = new int(k);
5 }
6 for (int j = 0; j < 10000000; ++j) {
7 count += (*v[j] == 0 ? 0 : 1);
8 }
foo.cc:7: advice: Insert prefetch instruction.
</pre><p>
</p></li></ul></div></div><div class="section" title="Linked Structure Locality"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.locality.linked"/>Linked Structure Locality</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_RBTREE_LOCALITY</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Give measure of locality of
objects stored in linked structures (lists, red-black trees and hashtables)
with respect to their actual traversal patterns.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>Allocation can be tuned
to a specific traversal pattern, to result in better data locality.
See paper:
<a class="link" href="http://www.springerlink.com/content/8085744l00x72662/">
Custom Memory Allocation for Free</a>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>30%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span>
High scatter score N for container built at site S.
Consider changing allocation sequence or choosing a structure conscious
allocator.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span> Methods of all
containers using linked structures.</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
First, get cache line size and page size from system.
Then record the number of successive elements that are on different line
or page, for each traversal method such as <code class="code">find</code>. Give advice
only if the ratio between this number and the number of total node hops
is above a threshold.</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Sum(same_cache_line(this,previous))</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 set<int> s;
2 for (int i = 0; i < 10000000; ++i) {
3 s.insert(i);
4 }
5 set<int> s1, s2;
6 for (int i = 0; i < 10000000; ++i) {
7 s1.insert(i);
8 s2.insert(i);
9 }
...
// Fast, better locality.
10 for (set<int>::iterator it = s.begin(); it != s.end(); ++it) {
11 sum += *it;
12 }
// Slow, elements are further apart.
13 for (set<int>::iterator it = s1.begin(); it != s1.end(); ++it) {
14 sum += *it;
15 }
foo.cc:5: advice: High scatter score NNN for set built here. Consider changing
the allocation sequence or switching to a structure conscious allocator.
</pre><p>
</p></li></ul></div></div></div><div class="section" title="Multithreaded Data Access"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.mthread"/>Multithreaded Data Access</h3></div></div></div><p>
The diagnostics in this group are not meant to be implemented short term.
They require compiler support to know when container elements are written
to. Instrumentation can only tell us when elements are referenced.
</p><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_MULTITHREADED</code>.
</p><div class="section" title="Data Dependence Violations at Container Level"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.mthread.ddtest"/>Data Dependence Violations at Container Level</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_DDTEST</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect container elements
that are referenced from multiple threads in the parallel region or
across parallel regions.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span>
Sharing data between threads requires communication and perhaps locking,
which may be expensive.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>?%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span> Change data
distribution or parallel algorithm.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span> Container access methods
and iterators.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
Keep a shadow for each container. Record iterator dereferences and
container member accesses. Issue advice for elements referenced by
multiple threads.
See paper: <a class="link" href="http://portal.acm.org/citation.cfm?id=207110.207148">
The LRPD test: speculative run-time parallelization of loops with
privatization and reduction parallelization</a>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Number of accesses to elements referenced from multiple threads
</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
</pre><p>
</p></li></ul></div></div><div class="section" title="False Sharing"><div class="titlepage"><div><div><h4 class="title"><a id="manual.ext.profile_mode.analysis.mthread.false_share"/>False Sharing</h4></div></div></div><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_FALSE_SHARING</code>.
</p></li><li class="listitem"><p><span class="emphasis"><em>Goal:</em></span> Detect elements in the
same container which share a cache line, are written by at least one
thread, and accessed by different threads.
</p></li><li class="listitem"><p><span class="emphasis"><em>Fundamentals:</em></span> Under these assumptions,
cache protocols require
communication to invalidate lines, which may be expensive.
</p></li><li class="listitem"><p><span class="emphasis"><em>Sample runtime reduction:</em></span>68%.
</p></li><li class="listitem"><p><span class="emphasis"><em>Recommendation:</em></span> Reorganize container
or use padding to avoid false sharing.</p></li><li class="listitem"><p><span class="emphasis"><em>To instrument:</em></span> Container access methods
and iterators.
</p></li><li class="listitem"><p><span class="emphasis"><em>Analysis:</em></span>
First, get the cache line size.
For each shared container, record all the associated iterator dereferences
and member access methods with the thread id. Compare the address lists
across threads to detect references in two different threads to the same
cache line. Issue a warning only if the ratio to total references is
significant. Do the same for iterator dereference values if they are
pointers.</p></li><li class="listitem"><p><span class="emphasis"><em>Cost model:</em></span>
Number of accesses to same cache line from different threads.
</p></li><li class="listitem"><p><span class="emphasis"><em>Example:</em></span>
</p><pre class="programlisting">
1 vector<int> v(2, 0);
2 #pragma omp parallel for shared(v, SIZE) schedule(static, 1)
3 for (i = 0; i < SIZE; ++i) {
4 v[i % 2] += i;
5 }
OMP_NUM_THREADS=2 ./a.out
foo.cc:1: advice: Change container structure or padding to avoid false
sharing in multithreaded access at foo.cc:4. Detected N shared cache lines.
</pre><p>
</p></li></ul></div></div></div><div class="section" title="Statistics"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.analysis.statistics"/>Statistics</h3></div></div></div><p>
<span class="emphasis"><em>Switch:</em></span>
<code class="code">_GLIBCXX_PROFILE_STATISTICS</code>.
</p><p>
In some cases the cost model may not tell us anything because the costs
appear to offset the benefits. Consider the choice between a vector and
a list. When there are both inserts and iteration, an automatic advice
may not be issued. However, the programmer may still be able to make use
of this information in a different way.
</p><p>
This diagnostic will not issue any advice, but it will print statistics for
each container construction site. The statistics will contain the cost
of each operation actually performed on the container.
</p></div></div><div class="navfooter"><hr/><table width="100%" summary="Navigation footer"><tr><td align="left"><a accesskey="p" href="bk01pt03ch19s06.html">Prev</a> </td><td align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td align="right"> <a accesskey="n" href="ext_allocators.html">Next</a></td></tr><tr><td align="left" valign="top">Developer Information </td><td align="center"><a accesskey="h" href="../spine.html">Home</a></td><td align="right" valign="top"> Chapter 20. Allocators</td></tr></table></div></body></html>
|