README.Portability

Copyright (C) 2000-2015 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
This knowledge until now has been sparsely spread around, so I
thought I'd collect it in one useful place.  Please add and correct
any problems as you come across them.

I'm going to start from a base of the ISO C90 standard, since that is
probably what most people code to naturally.  Obviously using
constructs introduced after that is not a good idea.

For the complete coding style conventions used in GCC, please read
http://gcc.gnu.org/codingconventions.html


String literals
---------------

Irix6 "cc -n32" and OSF4 "cc" have problems with constant string
initializers that have parentheses around them, e.g.

        const char string[] = ("A string");

This is unfortunate since this is what the GNU gettext macro N_
produces.  You need to find a different way to code it.

Some compilers like MSVC++ have fairly low limits on the maximum
length of a string literal; 509 characters is the lowest we've come
across.  You may need to break up a long printf statement into many
smaller ones.
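
One way to stay under such a limit, sketched here with made-up help
text, is to keep each literal short and emit them one call at a time.
Note that adjacent-literal concatenation does not help, since the
limit applies to the literal after concatenation.  The names
help_lines and print_help, and the text itself, are hypothetical.

```c
#include <stdio.h>

/* Hypothetical help text: one short, unparenthesized literal per
   array element, so no single literal approaches the 509 limit.  */
static const char *const help_lines[] = {
  "Usage: frob [options] file...\n",
  "  -o FILE   write output to FILE\n",
  "  -v        print each step as it runs\n"
};

#define N_HELP_LINES (sizeof help_lines / sizeof help_lines[0])

/* Write the help text with one fputs call per literal.
   Returns the number of lines written.  */
int
print_help (FILE *stream)
{
  size_t i;

  for (i = 0; i < N_HELP_LINES; i++)
    fputs (help_lines[i], stream);
  return (int) N_HELP_LINES;
}
```

A caller would simply use print_help (stdout) where it previously had
one very long printf.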


Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

  If (before argument substitution) any argument consists of no
  preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

        #define foo(x, y) x y
        foo (bar, )

needs to be coded in some other way.
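
One possible recoding, given only as a sketch, is to pass a
placeholder macro that expands to nothing.  The argument then
consists of one preprocessing token before substitution, so the C90
rule quoted above is not triggered.  EMPTY, DECL and the variable
names are hypothetical and not taken from GCC.

```c
/* Hypothetical placeholder: a single preprocessing token that expands
   to nothing, so the macro argument itself is never empty.  */
#define EMPTY

/* Hypothetical declaration helper: QUAL is a storage class or EMPTY.  */
#define DECL(qual, type, name) qual type name

DECL (static, int, file_local) = 1;	/* static int file_local = 1; */
DECL (EMPTY, int, visible) = 2;		/* int visible = 2; */

/* Use both variables so the example has something to check.  */
int
sum_decls (void)
{
  return file_local + visible;
}
```

Another option, when the macro has few callers, is to define a second
macro that takes one argument fewer.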


Avoid unnecessary test before free
----------------------------------

Since SunOS 4 stopped being a reasonable portability target
(which happened around 2007), there has been no need to guard
against "free (NULL)".  Thus, any guard like the following
constitutes a redundant test:

        if (P)
          free (P);

It is better to avoid the test.[*]
Instead, simply free P, regardless of whether it is NULL.

[*] However, if your profiling exposes a test like this in a
performance-critical loop, say where P is nearly always NULL, and
the cost of calling free on a NULL pointer would be prohibitively
high, consider using __builtin_expect, e.g., like this:

        if (__builtin_expect (ptr != NULL, 0))
          free (ptr);
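
As an illustration of the unguarded style, here is a minimal sketch
of a cleanup routine.  The struct buffer type and buffer_release are
hypothetical; the only thing relied on is that free accepts a null
pointer and does nothing.

```c
#include <stdlib.h>

/* Hypothetical object owning two heap blocks, either of which may
   be NULL.  */
struct buffer
{
  char *data;
  char *name;
};

/* Release both blocks without testing them first; free (NULL) is a
   no-op.  The pointers are reset so a second call is harmless.  */
void
buffer_release (struct buffer *b)
{
  free (b->data);
  free (b->name);
  b->data = NULL;
  b->name = NULL;
}
```

Calling buffer_release on a buffer whose members are already NULL is
safe, which is what makes the guard redundant.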


Trigraphs
---------

You weren't going to use them anyway, but some otherwise ISO C
compliant compilers do not accept trigraphs.
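
The flip side is text that merely looks like a trigraph.  As a small
sketch, a string meant to contain two question marks followed by an
exclamation mark reads the same on every compiler if one question
mark is written with the standard \? escape.  The name puzzled is
hypothetical.

```c
/* Two question marks followed by '!' would be replaced by a compiler
   that honours trigraphs and left alone by one that does not.
   Escaping one question mark gives the same seven characters on
   both kinds of compiler.  */
const char puzzled[] = "what?\?!";
```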


Suffixes on Integer Constants
-----------------------------

You should never use a 'l' suffix on integer constants ('L' is fine),
since it can easily be confused with the number '1'.


Common Coding Pitfalls
======================

errno
-----

errno might be declared as a macro, so include <errno.h> instead of
declaring it yourself (for example with "extern int errno;").
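
A minimal sketch of using errno only through <errno.h>, with no
declaration of our own.  The helper parse_long is hypothetical;
strtol and its setting of errno on overflow are standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse TEXT as a decimal long into *OUT.
   Returns 0 on success, -1 on overflow or malformed input.
   errno comes from <errno.h>; it is never declared by hand.  */
int
parse_long (const char *text, long *out)
{
  char *end;
  long value;

  errno = 0;
  value = strtol (text, &end, 10);
  if (errno != 0 || end == text || *end != '\0')
    return -1;
  *out = value;
  return 0;
}
```

Because errno is only ever named, assigned and read, the code works
the same whether the implementation makes it a macro or a plain
variable.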


Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

        unsigned variable;

as shorthand for

        unsigned int variable;

There are several places where this can cause trouble.  First, suppose
'variable' is a long; then you might think

        (unsigned) variable

would convert it to unsigned long.  It does not.  It converts to
unsigned int.  This mostly causes problems on 64-bit platforms, where
long and int are not the same size.

Second, if you write a function definition with no return type at
all:

        operate (int a, int b)
        {
          ...
        }

that function is expected to return int, *not* void.  GCC will warn
about this.

Implicit function declarations always have return type int.  So if you
correct the above definition to

        void
        operate (int a, int b)
        ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration".  The cure
is to prototype all functions at the top of the file, or in an
appropriate header.
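
A sketch of that cure, reusing the name operate from above.  The
static storage class, the last_sum variable and the caller run are
hypothetical additions made so the example is complete.

```c
/* Prototype at the top of the file, so every later call sees the
   real return type rather than an implicit int.  */
static void operate (int a, int b);

static int last_sum;

/* Calls operate before its definition; this is fine because the
   prototype above is already in scope.  */
int
run (void)
{
  operate (2, 3);
  return last_sum;
}

static void
operate (int a, int b)
{
  last_sum = a + b;
}
```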


Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice.  When you are processing 7-bit ASCII, it does
not matter.  But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem.  The most obvious
issue arises when you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT.  In the proper locale, isalpha('\341') will be
true.  But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31.  And the ctype table has no entry at offset -31, so
your program will crash.  (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can.  This
avoids all these problems.  Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.
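
Where plain char cannot be avoided, for example when walking a C
string, converting each element to unsigned char before the ctype
call keeps the argument non-negative.  This is a minimal sketch;
count_alpha is a hypothetical helper.

```c
#include <ctype.h>

/* Count the alphabetic characters in TEXT.  Each char is converted
   to unsigned char first, so isalpha never sees a negative value
   even where plain char is signed.  */
int
count_alpha (const char *text)
{
  int n = 0;

  for (; *text != '\0'; text++)
    if (isalpha ((unsigned char) *text))
      n++;
  return n;
}
```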

Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions.  They may
return EOF, which is outside the range of values representable by
char.  If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
If you use unsigned char, the stored value can never compare equal
to EOF, so end of file is never noticed.  The correct choice is int.
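
A minimal sketch of a read loop with the result held in an int.
The function count_bytes is hypothetical.

```c
#include <stdio.h>

/* Count the bytes remaining on FP.  C is an int, so every byte value
   0..255 stays distinct from EOF, including '\377'.  */
long
count_bytes (FILE *fp)
{
  int c;
  long n = 0;

  while ((c = getc (fp)) != EOF)
    n++;
  return n;
}
```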

A more subtle version of the same mistake might look like this:

        unsigned char pushback[NPUSHBACK];
        int pbidx;
        #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
        #define get(c) (pbidx ? pushback[--pbidx] : getchar())
        ...
        unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.
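
One possible repair, sketched with the same names, is to give the
pushback buffer elements of type int so that EOF survives the round
trip.  The value 8 for NPUSHBACK is an arbitrary assumption, and
get is written here without its unused parameter.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 8	/* arbitrary size, chosen for the example */

/* int elements can hold EOF as well as every unsigned char value.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get() (pbidx ? pushback[--pbidx] : getchar ())
```

With this change, unget (EOF) followed by get () hands back EOF
rather than character 255.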


Other common pitfalls
---------------------

o Expecting 'plain' char to be either sign-extending or unsigned
  (zero-extending).

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting ints shifted right to be sign extended.

o Modifying the same value twice between two sequence points.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to
  a .c file instead of to a .h file.
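
Two of the items above lend themselves to a short sketch: guarding a
shift count, and making qsort results reproducible by breaking ties
on the original position.  All names here (shift_left_checked,
struct item, compare_items, sort_items) are hypothetical, and
returning 0 for an oversized shift is just one possible policy.

```c
#include <limits.h>
#include <stdlib.h>

/* Shift VALUE left by COUNT bits, returning 0 when COUNT is not
   smaller than the width of the type, where "<<" would be undefined.  */
unsigned long
shift_left_checked (unsigned long value, unsigned int count)
{
  if (count >= sizeof value * CHAR_BIT)
    return 0;
  return value << count;
}

/* Hypothetical record: KEY is the sort key, TAG identifies the item,
   SEQ records its position before sorting.  */
struct item
{
  int key;
  char tag;
  int seq;
};

/* Order by KEY, then by original position, so no two distinct items
   ever compare equal and every qsort gives the same result.  */
static int
compare_items (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  if (a->seq != b->seq)
    return a->seq < b->seq ? -1 : 1;
  return 0;
}

/* Stamp each item with its current index, then sort.  */
void
sort_items (struct item *items, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    items[i].seq = (int) i;
  qsort (items, n, sizeof items[0], compare_items);
}
```

The tie-break costs one extra field per item, but it removes the
dependence on whichever qsort the host C library happens to provide.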