Diffstat (limited to 'gcc/README.Portability')
-rw-r--r-- | gcc/README.Portability | 197 |
1 files changed, 197 insertions, 0 deletions
diff --git a/gcc/README.Portability b/gcc/README.Portability
new file mode 100644
index 000000000..32a33e27b
--- /dev/null
+++ b/gcc/README.Portability
@@ -0,0 +1,197 @@
+Copyright (C) 2000, 2003 Free Software Foundation, Inc.
+
+This file is intended to contain a few notes about writing C code
+within GCC so that it compiles without error on the full range of
+compilers GCC needs to be able to compile on.
+
+The problem is that many ISO-standard constructs are not accepted by
+either old or buggy compilers, and we keep getting bitten by them.
+This knowledge until now has been sparsely spread around, so I
+thought I'd collect it in one useful place. Please add and correct
+any problems as you come across them.
+
+I'm going to start from a base of the ISO C90 standard, since that is
+probably what most people code to naturally. Obviously using
+constructs introduced after that is not a good idea.
+
+For the complete coding style conventions used in GCC, please read
+http://gcc.gnu.org/codingconventions.html
+
+
+String literals
+---------------
+
+Irix6 "cc -n32" and OSF4 "cc" have problems with constant string
+initializers with parens around them, e.g.
+
+const char string[] = ("A string");
+
+This is unfortunate since this is what the GNU gettext macro N_
+produces. You need to find a different way to code it.
+
+Some compilers like MSVC++ have fairly low limits on the maximum
+length of a string literal; 509 is the lowest we've come across. You
+may need to break up a long printf statement into many smaller ones.
+
+
+Empty macro arguments
+---------------------
+
+ISO C (6.8.3 in the 1990 standard) specifies the following:
+
+If (before argument substitution) any argument consists of no
+preprocessing tokens, the behavior is undefined.
+
+This was relaxed by ISO C99, but some older compilers emit an error,
+so code like
+
+#define foo(x, y) x y
+foo (bar, )
+
+needs to be coded in some other way.
+
+
+free and realloc
+----------------
+
+Some implementations crash upon attempts to free or realloc the null
+pointer. Thus if mem might be null, you need to write
+
+  if (mem)
+    free (mem);
+
+
+Trigraphs
+---------
+
+You weren't going to use them anyway, but some otherwise ISO C
+compliant compilers do not accept trigraphs.
+
+
+Suffixes on Integer Constants
+-----------------------------
+
+You should never use a 'l' suffix on integer constants ('L' is fine),
+since it can easily be confused with the number '1'.
+
+
+                        Common Coding Pitfalls
+                        ======================
+
+errno
+-----
+
+errno might be declared as a macro.
+
+
+Implicit int
+------------
+
+In C, the 'int' keyword can often be omitted from type declarations.
+For instance, you can write
+
+  unsigned variable;
+
+as shorthand for
+
+  unsigned int variable;
+
+There are several places where this can cause trouble. First, suppose
+'variable' is a long; then you might think
+
+  (unsigned) variable
+
+would convert it to unsigned long. It does not. It converts to
+unsigned int. This mostly causes problems on 64-bit platforms, where
+long and int are not the same size.
+
+Second, if you write a function definition with no return type at
+all:
+
+  operate (int a, int b)
+  {
+    ...
+  }
+
+that function is expected to return int, *not* void. GCC will warn
+about this.
+
+Implicit function declarations always have return type int. So if you
+correct the above definition to
+
+  void
+  operate (int a, int b)
+  ...
+
+but operate() is called above its definition, you will get an error
+about a "type mismatch with previous implicit declaration". The cure
+is to prototype all functions at the top of the file, or in an
+appropriate header.
+
+Char vs unsigned char vs int
+----------------------------
+
+In C, unqualified 'char' may be either signed or unsigned; it is the
+implementation's choice. When you are processing 7-bit ASCII, it does
+not matter. But when your program must handle arbitrary binary data,
+or fully 8-bit character sets, you have a problem. The most obvious
+issue is if you have a look-up table indexed by characters.
+
+For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
+WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
+true. But if you read '\341' from a file and store it in a plain
+char, isalpha(c) may look up character 225, or it may look up
+character -31. And the ctype table has no entry at offset -31, so
+your program will crash. (If you're lucky.)
+
+It is wise to use unsigned char everywhere you possibly can. This
+avoids all these problems. Unfortunately, the routines in <string.h>
+take plain char arguments, so you have to remember to cast them back
+and forth - or avoid the use of strxxx() functions, which is probably
+a good idea anyway.
+
+Another common mistake is to use either char or unsigned char to
+receive the result of getc() or related stdio functions. They may
+return EOF, which is outside the range of values representable by
+char. If you use char, some legal character value may be confused
+with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
+The correct choice is int.
+
+A more subtle version of the same mistake might look like this:
+
+  unsigned char pushback[NPUSHBACK];
+  int pbidx;
+  #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
+  #define get(c) (pbidx ? pushback[--pbidx] : getchar())
+  ...
+  unget(EOF);
+
+which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
+WITH UMLAUT.
+
+
+Other common pitfalls
+---------------------
+
+o Expecting 'plain' char to be either sign or unsigned extending.
+
+o Shifting an item by a negative amount or by greater than or equal to
+  the number of bits in a type (expecting shifts by 32 to be sensible
+  has caused quite a number of bugs at least in the early days).
+
+o Expecting ints shifted right to be sign extended.
+
+o Modifying the same value twice between two sequence points.
+
+o Host vs. target floating point representation, including emitting NaNs
+  and Infinities in a form that the assembler handles.
+
+o qsort being an unstable sort function (unstable in the sense that
+  multiple items that sort the same may be sorted in different orders
+  by different qsort functions).
+
+o Passing incorrect types to fprintf and friends.
+
+o Adding a declaration for a function defined in another file to
+  a .c file instead of to a .h file.
+
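Two of the pitfalls listed at the end of the README (out-of-range shift
counts and qsort instability) can be worked around in plain C90. The
following is only a minimal sketch and is not taken from the GCC sources:
struct item, compare_items, sort_items and shift_left are hypothetical
names invented for illustration. The comparator breaks ties on each
element's original position, so it never reports two distinct elements as
equal and every qsort implementation should return the same order. The
shift helper checks the count before shifting.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical record type: 'key' is what we sort on, 'seq' holds the
   element's original position and is used only to break ties.  */
struct item
{
  int key;
  unsigned int seq;
};

/* Comparator that returns 0 only for an element compared with itself,
   so the result does not depend on which qsort is in use.  */
static int
compare_items (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

/* Sort ITEMS by key, keeping the input order of elements whose keys
   compare equal.  Overwrites the 'seq' field of every element.  */
static void
sort_items (struct item *items, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    items[i].seq = (unsigned int) i;
  qsort (items, n, sizeof (struct item), compare_items);
}

/* Left shift that is defined for any COUNT: a count equal to or larger
   than the width of the type yields 0 instead of undefined behavior.
   COUNT is unsigned, so a negative amount cannot be passed.  */
static unsigned long
shift_left (unsigned long value, unsigned int count)
{
  if (count >= sizeof (unsigned long) * CHAR_BIT)
    return 0;
  return value << count;
}
```

Storing the original index costs one extra field per element. That is
usually cheaper than debugging output that differs from host to host.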