libstdc++-v3/doc/xml/manual/internals.xml



<section xmlns="http://docbook.org/ns/docbook" version="5.0" 
	 xml:id="appendix.porting.internals" xreflabel="Porting Internals">
<?dbhtml filename="internals.html"?>

<info><title>Porting to New Hardware or Operating Systems</title>
  <keywordset>
    <keyword>
      ISO C++
    </keyword>
    <keyword>
      internals
    </keyword>
  </keywordset>
</info>




<para>This document explains how to port libstdc++ (the GNU C++ library) to
a new target.
</para>

   <para>In order to make the GNU C++ library (libstdc++) work with a new
target, you must edit some configuration files and provide some new
header files.  Unless this is done, libstdc++ will use generic
settings which may not be correct for your target; even if they are
correct, they will likely be inefficient.
   </para>

   <para>Before you get started, make sure that you have a working C library on
your target.  The C library need not precisely comply with any
particular standard, but should generally conform to the requirements
imposed by the ANSI/ISO standard.
   </para>

   <para>In addition, you should try to verify that the C++ compiler generally
works.  It is difficult to test the C++ compiler without a working
library, but you should at least try some minimal test cases.
   </para>

   <para>(Note that what we think of as a "target," the library refers to as
a "host."  The comment at the top of <code>configure.ac</code> explains why.)
   </para>


<section xml:id="internals.os"><info><title>Operating System</title></info>


<para>If you are porting to a new operating system (as opposed to a new chip
using an existing operating system), you will need to create a new
directory in the <code>config/os</code> hierarchy.  For example, the IRIX
configuration files are all in <code>config/os/irix</code>.  There is no set
way to organize the OS configuration directory.  For example,
<code>config/os/solaris/solaris-2.6</code> and
<code>config/os/solaris/solaris-2.7</code> are used as configuration
directories for these two versions of Solaris.  On the other hand, both
Solaris 2.7 and Solaris 2.8 use the <code>config/os/solaris/solaris-2.7</code>
directory.  The important information is that there needs to be a
directory under <code>config/os</code> to store the files for your operating
system.
</para>

   <para>You might have to change the <code>configure.host</code> file to ensure that
your new directory is activated.  Look for the switch statement that sets
<code>os_include_dir</code>, and add a pattern to handle your operating system
if the default will not suffice.  The switch statement switches on only
the OS portion of the standard target triplet; e.g., the <code>solaris2.8</code>
in <code>sparc-sun-solaris2.8</code>.  If the new directory is named after the
OS portion of the triplet (the default), then nothing needs to be changed.
   </para>

   <para>The first file to create in this directory should be called
<code>os_defines.h</code>.  This file contains basic macro definitions
that are required to allow the C++ library to work with your C library.
   </para>

   <para>Several libstdc++ source files unconditionally define the macro
<code>_POSIX_SOURCE</code>.  On many systems, defining this macro causes
large portions of the C library header files to be eliminated
at preprocessing time.  Therefore, you may have to <code>#undef</code> this
macro, or define other macros (like <code>_LARGEFILE_SOURCE</code> or
<code>__EXTENSIONS__</code>).  You won't know what macros to define or
undefine at this point; you'll have to try compiling the library and
seeing what goes wrong.  If you see errors about calling functions
that have not been declared, look in your C library headers to see if
the functions are declared there, and then figure out what macros you
need to define.  You will need to add them to the
<code>CPLUSPLUS_CPP_SPEC</code> macro in the GCC configuration file for your
target.  It will not work to simply define these macros in
<code>os_defines.h</code>.
   </para>

   <para>At this time, there are a few libstdc++-specific macros which may be
defined:
   </para>

   <para><code>_GLIBCXX_USE_C99_CHECK</code> may be defined to 1 to check C99
function declarations (which are not covered by specialization below)
found in system headers against versions found in the library headers
derived from the standard.
   </para>

   <para><code>_GLIBCXX_USE_C99_DYNAMIC</code> may be defined to an expression that
yields 0 if and only if the system headers are exposing proper support
for C99 functions (which are not covered by specialization below).  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>

   <para><code>_GLIBCXX_USE_C99_LONG_LONG_CHECK</code> may be defined to 1 to check
the set of C99 long long function declarations found in system headers
against versions found in the library headers derived from the
standard.
   </para>
   <para><code>_GLIBCXX_USE_C99_LONG_LONG_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers are
exposing proper support for the set of C99 long long functions.  If
defined, it must be 0 while bootstrapping the compiler/rebuilding the
library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC</code> may be defined to an
expression that yields 0 if and only if the system headers
are exposing proper support for the related set of macros.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_CHECK</code> may be defined
to 1 to check the related set of function declarations found in system
headers against versions found in the library headers derived from
the standard.
   </para>
   <para><code>_GLIBCXX_USE_C99_FLOAT_TRANSCENDENTALS_DYNAMIC</code> may be defined
to an expression that yields 0 if and only if the system headers
are exposing proper support for the related set of functions.  If defined,
it must be 0 while bootstrapping the compiler/rebuilding the library.
   </para>
   <para>Finally, you should bracket the entire file in an include-guard, like
this:
   </para>

<programlisting>

#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES
...
#endif
</programlisting>

   <para>We recommend copying an existing <code>os_defines.h</code> to use as a
starting point.
   </para>
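
   <para>As a rough illustration only, an <code>os_defines.h</code> for a
hypothetical operating system called <code>myos</code> might look like the
sketch below.  The directory name and the <code>__MYOS_C_LEVEL</code> macro
are assumptions invented for this example; the libstdc++ macros are the ones
described above, and which of them (if any) you need depends on your system
headers.
   </para>

<programlisting>
// config/os/myos/os_defines.h ("myos" is a hypothetical OS name)
#ifndef _GLIBCXX_OS_DEFINES
#define _GLIBCXX_OS_DEFINES 1

// Check the system's C99 declarations against the library's own.
#define _GLIBCXX_USE_C99_CHECK 1

// __MYOS_C_LEVEL is a hypothetical macro set by the system headers.
// The expression yields 0 when C99 support is exposed; it must also be
// 0 while the compiler is bootstrapped or the library is rebuilt.
#define _GLIBCXX_USE_C99_DYNAMIC (!(__MYOS_C_LEVEL &gt;= 199901L))

#endif
</programlisting>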
</section>


<section xml:id="internals.cpu"><info><title>CPU</title></info>


<para>If you are porting to a new chip (as opposed to a new operating system
running on an existing chip), you will need to create a new directory in the
<code>config/cpu</code> hierarchy.  Much like the <link linkend="internals.os">Operating system</link> setup,
there are no strict rules on how to organize the CPU configuration
directory, but careful naming choices will allow the configury to find your
setup files without explicit help.
</para>

   <para>We recommend that for a target triplet <code>&lt;CPU&gt;-&lt;vendor&gt;-&lt;OS&gt;</code>, you
name your configuration directory <code>config/cpu/&lt;CPU&gt;</code>.  If you do this,
the configury will find the directory by itself.  Otherwise you will need to
edit the <code>configure.host</code> file and, in the switch statement that sets
<code>cpu_include_dir</code>, add a pattern to handle your chip.
   </para>

   <para>Note that some chip families share a single configuration directory, for
example, <code>alpha</code>, <code>alphaev5</code>, and <code>alphaev6</code> all use the
<code>config/cpu/alpha</code> directory, and there is an entry in the
<code>configure.host</code> switch statement to handle this.
   </para>

   <para>The <code>cpu_include_dir</code> variable sets the default location for the
files controlling <link linkend="internals.thread_safety">Thread safety</link> and <link linkend="internals.numeric_limits">Numeric limits</link>, which you
provide if the generic defaults are not appropriate for your chip.
   </para>

</section>


<section xml:id="internals.char_types"><info><title>Character Types</title></info>


<para>The library requires that you provide three header files to implement
character classification, analogous to that provided by the C library's
<code>&lt;ctype.h&gt;</code> header.  You can model these on the files provided in
<code>config/os/generic</code>.  However, these files will almost
certainly need some modification.
</para>

   <para>The first file to write is <code>ctype_base.h</code>.  This file provides
some very basic information about character classification.  The libstdc++
library assumes that your C library implements <code>&lt;ctype.h&gt;</code> by using
a table (indexed by character code) containing integers, where each of
these integers is a bit-mask indicating whether the character is
upper-case, lower-case, alphabetic, etc.  The <code>ctype_base.h</code>
file gives the type of the integer, and the values of the various bit
masks.  You will have to peer at your own <code>&lt;ctype.h&gt;</code> to figure out
how to define the values required by this file.
   </para>

   <para>The <code>ctype_base.h</code> header file does not need include guards.
It should contain a single <code>struct</code> definition called
<code>ctype_base</code>.  This <code>struct</code> should contain two type
declarations, and one enumeration declaration, like this example, taken
from the IRIX configuration:
   </para>

<programlisting>
  struct ctype_base
     {
       typedef unsigned int 	mask;
       typedef int* 		__to_type;

       enum
       {
	 space = _ISspace,
	 print = _ISprint,
	 cntrl = _IScntrl,
	 upper = _ISupper,
	 lower = _ISlower,
	 alpha = _ISalpha,
	 digit = _ISdigit,
	 punct = _ISpunct,
	 xdigit = _ISxdigit,
	 alnum = _ISalnum,
	 graph = _ISgraph
       };
     };
</programlisting>

<para>The <code>mask</code> type is the type of the elements in the table.  If your
C library uses a table to map lower-case characters to upper-case characters,
and vice versa, you should define <code>__to_type</code> to be the type of the
elements in that table.  If you don't mind taking a minor performance
penalty, or if your library doesn't implement <code>toupper</code> and
<code>tolower</code> in this way, you can pick any pointer-to-integer type,
but you must still define the type.
</para>

   <para>The enumeration should give definitions for all the values in the above
example, using the values from your native <code>&lt;ctype.h&gt;</code>.  They can
be given symbolically (as above), or numerically, if you prefer.  You do
not have to include <code>&lt;ctype.h&gt;</code> in this header; it will always be
included before <code>ctype_base.h</code> is included.
   </para>
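
   <para>To make the table-of-bit-masks model concrete, here is a small
self-contained sketch.  It is not taken from any port: the names
<code>sketch_ctype_base</code>, <code>sketch_table</code> and
<code>sketch_is</code> are hypothetical, the mask values are invented and
given numerically, and the table is filled from <code>&lt;cctype&gt;</code>
only so that the example runs anywhere.  A real <code>ctype_base.h</code>
would instead reuse the values from your native <code>&lt;ctype.h&gt;</code>.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;cctype&gt;

// Hypothetical stand-in for ctype_base.  The mask values are invented
// for this sketch and do not match any particular C library.
struct sketch_ctype_base
{
  typedef unsigned int mask;

  enum
  {
    space = 0x01,
    upper = 0x02,
    lower = 0x04,
    digit = 0x08,
    alpha = upper | lower,
    alnum = alpha | digit
  };
};

// Build the kind of table the library expects behind &lt;ctype.h&gt;:
// one bit-mask per character code.
inline const sketch_ctype_base::mask*
sketch_table()
{
  static sketch_ctype_base::mask table[256];
  static bool built = false;
  if (!built)
    {
      for (int c = 0; c &lt; 256; ++c)
        {
          sketch_ctype_base::mask m = 0;
          if (std::isspace(c)) m |= sketch_ctype_base::space;
          if (std::isupper(c)) m |= sketch_ctype_base::upper;
          if (std::islower(c)) m |= sketch_ctype_base::lower;
          if (std::isdigit(c)) m |= sketch_ctype_base::digit;
          table[c] = m;
        }
      built = true;
    }
  return table;
}

// Classification is a table lookup followed by a bit test.
inline bool
sketch_is(sketch_ctype_base::mask m, char c)
{ return (sketch_table()[static_cast&lt;unsigned char&gt;(c)] &amp; m) != 0; }

// Usage: query single characters against one or more mask bits.
inline void
sketch_ctype_base_example()
{
  assert(sketch_is(sketch_ctype_base::digit, '7'));
  assert(sketch_is(sketch_ctype_base::alpha, 'Q'));
  assert(!sketch_is(sketch_ctype_base::upper, 'a'));
}
</programlisting>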

   <para>The next file to write is <code>ctype_noninline.h</code>, which also does
not require include guards.  This file defines a few member functions
that will be included in <code>include/bits/locale_facets.h</code>.  The first
function that must be written is the <code>ctype&lt;char&gt;::ctype</code>
constructor.  Here is the IRIX example:
   </para>

<programlisting>
ctype&lt;char&gt;::ctype(const mask* __table = 0, bool __del = false,
	   size_t __refs = 0)
       : _Ctype_nois&lt;char&gt;(__refs), _M_del(__table != 0 &amp;&amp; __del),
	 _M_toupper(NULL),
	 _M_tolower(NULL),
	 _M_ctable(NULL),
	 _M_table(!__table
		  ? (const mask*) (__libc_attr._ctype_tbl-&gt;_class + 1)
		  : __table)
       { }
</programlisting>

<para>There are two parts of this that you might choose to alter. The first,
and most important, is the line involving <code>__libc_attr</code>.  That is
IRIX system-dependent code that gets the base of the table mapping
character codes to attributes.  You need to substitute code that obtains
the address of this table on your system.  If you want to use your
operating system's tables to map upper-case letters to lower-case, and
vice versa, you should initialize <code>_M_toupper</code> and
<code>_M_tolower</code> with those tables, in similar fashion.
</para>

   <para>Now, you have to write two functions to convert from upper-case to
lower-case, and vice versa.  Here are the IRIX versions:
   </para>

<programlisting>
     char
     ctype&lt;char&gt;::do_toupper(char __c) const
     { return _toupper(__c); }

     char
     ctype&lt;char&gt;::do_tolower(char __c) const
     { return _tolower(__c); }
</programlisting>

<para>Your C library should provide equivalents to IRIX's <code>_toupper</code> and
<code>_tolower</code>.  If you initialized <code>_M_toupper</code> and
<code>_M_tolower</code> above, then you could index those tables instead.
</para>
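
   <para>The table-driven alternative can be sketched as follows.  This is an
illustration under assumptions, not code from a port: the struct
<code>sketch_case_tables</code> and the functions
<code>sketch_do_toupper</code> and <code>sketch_do_tolower</code> are
hypothetical, and the tables are filled from <code>&lt;cctype&gt;</code>
where a real port would point at the tables its C library already maintains.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;cctype&gt;

// Hypothetical tables playing the role of _M_toupper and _M_tolower.
struct sketch_case_tables
{
  int upper[256];
  int lower[256];

  sketch_case_tables()
  {
    for (int c = 0; c &lt; 256; ++c)
      {
        upper[c] = std::toupper(c);
        lower[c] = std::tolower(c);
      }
  }
};

inline const sketch_case_tables&amp;
sketch_tables()
{
  static const sketch_case_tables tables;
  return tables;
}

// Each conversion is a single lookup, indexed by the unsigned
// character code so that negative char values cannot index backwards.
inline char
sketch_do_toupper(char c)
{
  const unsigned char i = static_cast&lt;unsigned char&gt;(c);
  return static_cast&lt;char&gt;(sketch_tables().upper[i]);
}

inline char
sketch_do_tolower(char c)
{
  const unsigned char i = static_cast&lt;unsigned char&gt;(c);
  return static_cast&lt;char&gt;(sketch_tables().lower[i]);
}

// Usage: letters change case, other characters pass through.
inline void
sketch_case_tables_example()
{
  assert(sketch_do_toupper('a') == 'A');
  assert(sketch_do_tolower('Q') == 'q');
  assert(sketch_do_toupper('7') == '7');
}
</programlisting>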

   <para>Finally, you have to provide two utility functions that convert strings
of characters.  The versions provided here will always work, but you
could use specialized routines for greater performance if your system
has the machinery for that:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::do_toupper(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
	 {
	   *__low = do_toupper(*__low);
	   ++__low;
	 }
       return __high;
     }

     const char*
     ctype&lt;char&gt;::do_tolower(char* __low, const char* __high) const
     {
       while (__low &lt; __high)
	 {
	   *__low = do_tolower(*__low);
	   ++__low;
	 }
       return __high;
     }
</programlisting>
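
   <para>For orientation, user code reaches these range functions through the
public <code>std::ctype&lt;char&gt;::toupper</code> and
<code>tolower</code> members of the standard library.  The sketch below
shows one way that might look; the helper <code>sketch_upcase</code> is a
hypothetical name, and the example assumes the classic "C" locale.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;cstring&gt;
#include &lt;locale&gt;

// Upper-case a NUL-terminated buffer in place using the "C" locale.
inline void
sketch_upcase(char* s)
{
  const std::ctype&lt;char&gt;&amp; ct
    = std::use_facet&lt;std::ctype&lt;char&gt; &gt;(std::locale::classic());
  char* const high = s + std::strlen(s);
  // The public range form forwards to the virtual do_toupper.
  const char* end = ct.toupper(s, high);
  assert(end == high);   // the range form returns its second argument
}

// Usage: convert a small buffer and compare the result.
inline void
sketch_upcase_example()
{
  char buf[] = "irix";
  sketch_upcase(buf);
  assert(std::strcmp(buf, "IRIX") == 0);
}
</programlisting>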

   <para>You must also provide the <code>ctype_inline.h</code> file, which
contains a few more functions.  On most systems, you can just copy
<code>config/os/generic/ctype_inline.h</code> and use it on your system.
   </para>

   <para>In detail, the functions provided test characters for particular
properties; they are analogous to functions like <code>isalpha</code> and
<code>islower</code> provided by the C library.
   </para>

   <para>The first function is implemented like this on IRIX:
   </para>

<programlisting>
     bool
     ctype&lt;char&gt;::
     is(mask __m, char __c) const throw()
     { return (_M_table)[(unsigned char)(__c)] &amp; __m; }
</programlisting>

<para>The <code>_M_table</code> is the table passed in above, in the constructor.
This is the table that contains the bitmasks for each character.  The
implementation here should work on all systems.
</para>

   <para>The next function is:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     is(const char* __low, const char* __high, mask* __vec) const throw()
     {
       while (__low &lt; __high)
	 *__vec++ = (_M_table)[(unsigned char)(*__low++)];
       return __high;
     }
</programlisting>

<para>This function is similar; it copies the masks for all the characters
from <code>__low</code> up to, but not including, <code>__high</code> into the
array pointed to by <code>__vec</code>.
</para>
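
   <para>Seen from the caller's side, the two forms of <code>is</code> might be
used as in the following sketch.  It goes through the standard
<code>std::ctype&lt;char&gt;</code> facet of the classic "C" locale; the
helper <code>sketch_count_digits</code> is a hypothetical name introduced
only for this example.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;locale&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Count the digits in s by fetching every character's mask at once.
inline std::size_t
sketch_count_digits(const std::string&amp; s)
{
  const std::ctype&lt;char&gt;&amp; ct
    = std::use_facet&lt;std::ctype&lt;char&gt; &gt;(std::locale::classic());

  std::vector&lt;std::ctype_base::mask&gt; vec(s.size());
  if (!s.empty())
    ct.is(s.data(), s.data() + s.size(), &amp;vec[0]);   // range form

  std::size_t n = 0;
  for (std::size_t i = 0; i &lt; vec.size(); ++i)
    if (vec[i] &amp; std::ctype_base::digit)
      ++n;
  return n;
}

// Usage: the single-character form and the range form agree.
inline void
sketch_is_example()
{
  const std::ctype&lt;char&gt;&amp; ct
    = std::use_facet&lt;std::ctype&lt;char&gt; &gt;(std::locale::classic());
  assert(ct.is(std::ctype_base::digit, '7'));
  assert(!ct.is(std::ctype_base::digit, 'x'));
  assert(sketch_count_digits("solaris2.8") == 2);
}
</programlisting>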

   <para>The last two functions again are entirely generic:
   </para>

<programlisting>
     const char*
     ctype&lt;char&gt;::
     scan_is(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; !this-&gt;is(__m, *__low))
	 ++__low;
       return __low;
     }

     const char*
     ctype&lt;char&gt;::
     scan_not(mask __m, const char* __low, const char* __high) const throw()
     {
       while (__low &lt; __high &amp;&amp; this-&gt;is(__m, *__low))
	 ++__low;
       return __low;
     }
</programlisting>
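
   <para>As an illustration of how <code>scan_is</code> and
<code>scan_not</code> combine, the sketch below extracts the first run of
digits from a string.  It calls the public members of the standard
<code>std::ctype&lt;char&gt;</code> facet in the classic "C" locale;
<code>sketch_first_number</code> is a hypothetical helper, not part of the
library.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;cstring&gt;
#include &lt;locale&gt;
#include &lt;string&gt;

// Return the first run of digits in s, or an empty string if none.
inline std::string
sketch_first_number(const char* s)
{
  const std::ctype&lt;char&gt;&amp; ct
    = std::use_facet&lt;std::ctype&lt;char&gt; &gt;(std::locale::classic());
  const char* end = s + std::strlen(s);

  // scan_is stops at the first character that matches the mask ...
  const char* first = ct.scan_is(std::ctype_base::digit, s, end);
  // ... and scan_not stops at the first one that does not.
  const char* last = ct.scan_not(std::ctype_base::digit, first, end);
  return std::string(first, last);
}

// Usage: both scans return the end pointer when nothing is found.
inline void
sketch_scan_example()
{
  assert(sketch_first_number("sparc-sun-solaris2.8") == "2");
  assert(sketch_first_number("no digits") == "");
}
</programlisting>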

</section>


<section xml:id="internals.thread_safety"><info><title>Thread Safety</title></info>


<para>The C++ library string functionality requires a couple of atomic
operations to provide thread-safety.  If you don't take any special
action, the library will use stub versions of these functions that are
not thread-safe.  They will work fine, unless your applications are
multi-threaded.
</para>

   <para>If you want to provide custom, safe, versions of these functions, there
are two distinct approaches.  One is to provide a version for your CPU,
using assembly language constructs.  The other is to use the
thread-safety primitives in your operating system.  In either case, you
make a file called <code>atomicity.h</code>, and the variable
<code>ATOMICITYH</code> must point to this file.
   </para>

   <para>If you are using the assembly-language approach, put this code in
<code>config/cpu/&lt;chip&gt;/atomicity.h</code>, where chip is the name of
your processor (see <link linkend="internals.cpu">CPU</link>).  No additional changes are necessary to
locate the file in this case; <code>ATOMICITYH</code> will be set by default.
   </para>

   <para>If you are using the operating system thread-safety primitives approach,
you can also put this code in the same CPU directory, in which case no more
work is needed to locate the file.  For examples of this approach,
see the <code>atomicity.h</code> file for IRIX or IA64.
   </para>

   <para>Alternatively, if the primitives are more closely related to the OS
than they are to the CPU, you can put the <code>atomicity.h</code> file in
the <link linkend="internals.os">Operating system</link> directory instead.  In this case, you must
edit <code>configure.host</code>, and in the switch statement that handles
operating systems, override the <code>ATOMICITYH</code> variable to point to
the appropriate <code>os_include_dir</code>.  For examples of this approach,
see the <code>atomicity.h</code> file for AIX.
   </para>

   <para>With those bits out of the way, you have to actually write
<code>atomicity.h</code> itself.  This file should be wrapped in an
include guard named <code>_GLIBCXX_ATOMICITY_H</code>.  It should define one
type, and two functions.
   </para>

   <para>The type is <code>_Atomic_word</code>.  Here is the version used on IRIX:
   </para>

<programlisting>
typedef long _Atomic_word;
</programlisting>

<para>This type must be a signed integral type supporting atomic operations.
If you're using the OS approach, use the same type used by your system's
primitives.  Otherwise, use the type for which your CPU provides atomic
primitives.
</para>

   <para>Then, you must provide two functions.  The bodies of these functions
must be equivalent to those provided here, but using atomic operations:
   </para>

<programlisting>
     static inline _Atomic_word
     __attribute__ ((__unused__))
     __exchange_and_add (_Atomic_word* __mem, int __val)
     {
       _Atomic_word __result = *__mem;
       *__mem += __val;
       return __result;
     }

     static inline void
     __attribute__ ((__unused__))
     __atomic_add (_Atomic_word* __mem, int __val)
     {
       *__mem += __val;
     }
</programlisting>
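
   <para>One possible way to obtain atomic behaviour is sketched below.  It
assumes a GCC-compatible compiler that provides the
<code>__atomic_fetch_add</code> built-in; whether that built-in is
available and efficient on your target is something you would have to
check.  The names <code>sketch_atomic_word</code>,
<code>sketch_exchange_and_add</code> and <code>sketch_atomic_add</code> are
hypothetical and are used here only to avoid clashing with the library's
own definitions; a real <code>atomicity.h</code> would use the names given
above.
   </para>

<programlisting>
#include &lt;cassert&gt;

// Hypothetical counterpart of _Atomic_word.
typedef int sketch_atomic_word;

// Atomically add val to *mem and return the previous value.
// Assumes the GCC __atomic_fetch_add built-in is available.
inline sketch_atomic_word
sketch_exchange_and_add(sketch_atomic_word* mem, int val)
{ return __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }

// Atomically add val to *mem, discarding the previous value.
inline void
sketch_atomic_add(sketch_atomic_word* mem, int val)
{ __atomic_fetch_add(mem, val, __ATOMIC_ACQ_REL); }

// Usage: the observable results match the non-atomic stubs.
inline void
sketch_atomicity_example()
{
  sketch_atomic_word count = 5;
  assert(sketch_exchange_and_add(&amp;count, 3) == 5);
  assert(count == 8);
  sketch_atomic_add(&amp;count, -8);
  assert(count == 0);
}
</programlisting>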

</section>


<section xml:id="internals.numeric_limits"><info><title>Numeric Limits</title></info>


<para>The C++ library requires information about the fundamental data types,
such as the minimum and maximum representable values of each type.
You can define each of these values individually, but it is usually
easiest just to indicate how many bits are used in each of the data
types and let the library do the rest.  For information about the
macros to define, see the top of <code>include/bits/std_limits.h</code>.
</para>
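
   <para>To show why a bit count is enough, the following sketch derives the
extreme values of a signed type from its width and compares them with
<code>std::numeric_limits</code>.  It assumes two's-complement integers
with no padding bits, and the function names are hypothetical; the macros
the library actually consults are the ones documented in the header
mentioned above.
   </para>

<programlisting>
#include &lt;cassert&gt;
#include &lt;climits&gt;
#include &lt;limits&gt;

// Number of bits in T, assuming no padding bits.
template&lt;typename T&gt;
inline int
sketch_bits()
{ return static_cast&lt;int&gt;(sizeof(T) * CHAR_BIT); }

// Largest value of a signed two's-complement type of the given width.
// Computed as (2^(bits-2) - 1) + 2^(bits-2) so that no step overflows.
template&lt;typename T&gt;
inline T
sketch_signed_max(int bits)
{
  assert(bits &gt;= 2);
  const T half = static_cast&lt;T&gt;(static_cast&lt;T&gt;(1) &lt;&lt; (bits - 2));
  return static_cast&lt;T&gt;((half - 1) + half);
}

// Smallest value: one below the negated maximum.
template&lt;typename T&gt;
inline T
sketch_signed_min(int bits)
{ return static_cast&lt;T&gt;(-sketch_signed_max&lt;T&gt;(bits) - 1); }

// Usage: the derived values agree with the library's own limits.
inline void
sketch_limits_example()
{
  assert(sketch_signed_max&lt;int&gt;(sketch_bits&lt;int&gt;())
         == std::numeric_limits&lt;int&gt;::max());
  assert(sketch_signed_min&lt;int&gt;(sketch_bits&lt;int&gt;())
         == std::numeric_limits&lt;int&gt;::min());
}
</programlisting>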

   <para>If you need to define any macros, you can do so in <code>os_defines.h</code>.
However, if all operating systems for your CPU are likely to use the
same values, you can provide a CPU-specific file instead so that you
do not have to provide the same definitions for each operating system.
To take that approach, create a new file called <code>cpu_limits.h</code> in
your CPU configuration directory (see <link linkend="internals.cpu">CPU</link>).
   </para>

</section>


<section xml:id="internals.libtool"><info><title>Libtool</title></info>


<para>The C++ library is compiled, archived and linked with libtool.
Explaining the full workings of libtool is beyond the scope of this
document, but there are a few particular bits that are necessary for
porting.
</para>

   <para>Some parts of the libstdc++ library are compiled with the libtool
<code>--tags CXX</code> option (the C++ definitions for libtool).  Therefore,
<code>ltcf-cxx.sh</code> in the top-level directory needs logic for
compiling and archiving objects that is equivalent to the logic in the C
version of libtool, <code>ltcf-c.sh</code>.  Some libtool targets have
definitions for C but not for C++, or C++ definitions which have not been
kept up to date.
   </para>

   <para>The C++ run-time library contains initialization code that needs to be
run as the library is loaded.  Often, that requires linking in special
object files when the C++ library is built as a shared library, or
taking other system-specific actions.
   </para>

   <para>The libstdc++ library is linked with the C version of libtool, even
though it is a C++ library.  Therefore, the C version of libtool needs to
ensure that the run-time library initializers are run.  The usual way to
do this is to build the library using <code>gcc -shared</code>.
   </para>

   <para>If you need to change how the library is linked, look at
<code>ltcf-c.sh</code> in the top-level directory.  Find the switch statement
that sets <code>archive_cmds</code>.  Here, adjust the setting for your
operating system.
   </para>


</section>

</section>