This chapter presents several topics related to program performance.
It first describes some of the tradeoffs that need to be considered
and some of the techniques for making your program run faster.
It then documents the gnatelim tool and the unused subprogram/data
elimination feature, which can reduce the size of program executables.
7.1 Performance Considerations
7.2 Text_IO Suggestions
7.3 Reducing Size of Ada Executables with gnatelim
7.4 Reducing Size of Executables with Unused Subprogram/Data Elimination
7.1 Performance Considerations
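The GNAT system provides a number of options that allow a trade-off between the performance of the generated code, the speed of compilation, the minimization of dependences and recompilation, and the degree of run-time checking.
The defaults (if no options are selected) aim at improving the speed of compilation and minimizing dependences, at the expense of performance of the generated code: no optimization, no inlining of subprogram calls, and all run-time checks enabled except overflow and elaboration checks.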
These options are suitable for most program development purposes. This chapter describes how you can modify these choices, and also provides some guidelines on debugging optimized code.
By default, GNAT generates all run-time checks, except integer overflow checks, stack overflow checks, and checks for access before elaboration on subprogram calls. The latter are not required in default mode, because all necessary checking is done at compile time. Two gnat switches, `-gnatp' and `-gnato', allow this default to be modified. See section 3.2.6 Run-Time Checks.
Our experience is that the default is suitable for most development purposes.
We treat integer overflow specially because these are quite expensive and in our experience are not as important as other run-time checks in the development process. Note that division by zero is not considered an overflow check, and divide by zero checks are generated where required by default.
Elaboration checks are off by default, and also not needed by default, since GNAT uses a static elaboration analysis approach that avoids the need for run-time checking. This manual contains a full chapter discussing the issue of elaboration checks, and if the default is not satisfactory for your use, you should read this chapter.
For validity checks, the minimal checks required by the Ada Reference Manual (for case statements and assignments to array elements) are on by default. These can be suppressed by use of the `-gnatVn' switch. Note that in Ada 83, there were no validity checks, so if the Ada 83 mode is acceptable (or when comparing GNAT performance with an Ada 83 compiler), it may be reasonable to routinely use `-gnatVn'. Validity checks are also suppressed entirely if `-gnatp' is used.
Note that the setting of the switches controls the default setting of the checks. They may be modified using either pragma Suppress (to remove checks) or pragma Unsuppress (to add back suppressed checks) in the program source.
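As a minimal sketch of how these pragmas refine the switch defaults, the following program suppresses all checks inside one subprogram and adds overflow checking back inside another. The names Check_Demo, Fill, and Double are hypothetical and serve only as illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Demo is
   type Index is range 1 .. 10;
   Table : array (Index) of Integer := (others => 0);

   --  Hypothetical hot routine: checks are removed for this subprogram
   --  only, whatever the command-line switches say.
   procedure Fill (Value : Integer) is
      pragma Suppress (All_Checks);
   begin
      for I in Table'Range loop
         Table (I) := Value;
      end loop;
   end Fill;

   --  Hypothetical routine that relies on overflow detection: the check
   --  is added back here even if it is off by default.
   function Double (X : Integer) return Integer is
      pragma Unsuppress (Overflow_Check);
   begin
      return X * 2;
   end Double;

begin
   Fill (21);
   Put_Line (Integer'Image (Double (Table (1))));
end Check_Demo;
```

Placing the pragmas in the declarative part of a subprogram limits their effect to that subprogram, so the rest of the unit keeps whatever defaults the switches select.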
The use of pragma Restrictions allows you to control which features are permitted in your program. Apart from the obvious point that avoiding relatively expensive features like finalization (enforceable by the use of pragma Restrictions (No_Finalization)) saves their cost, the use of this pragma does not affect the generated code in most cases.
One notable exception to this rule is that the possibility of task abort results in some distributed overhead, particularly if finalization or exception handlers are used. The reason is that certain sections of code have to be marked as non-abortable.
If you use neither the abort statement nor asynchronous transfer of control (select ... then abort), then this distributed overhead is removed, which may have a general positive effect in improving overall performance. In particular, code involving frequent use of tasking constructs and controlled types will show much improved performance. The relevant restrictions pragmas are

    pragma Restrictions (No_Abort_Statements);
    pragma Restrictions (Max_Asynchronous_Select_Nesting => 0);

It is recommended that these restriction pragmas be used if possible. Note that this also means that you can write code without worrying about the possibility of an immediate abort at any point.
Without any optimization option, the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the subprogram and get exactly the results you would expect from the source code.
Turning on optimization makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.
The default is optimization off. This results in the fastest compile times, but GNAT makes absolutely no attempt to optimize, and the generated programs are considerably larger and slower than when optimization is enabled. You can use the `-O' switch to gcc (the permitted forms are `-O0', `-O1', `-O2', `-O3', and `-Os') to control the optimization level.
Note that many other compilers do fairly extensive optimization even if "no optimization" is specified. With gcc, it is very unusual to use -O0 for production if execution time is of any concern, since -O0 really does mean no optimization at all. This difference between gcc and other compilers should be kept in mind when doing performance comparisons.
Higher optimization levels perform more global transformations on the program and apply more expensive analysis algorithms in order to generate faster and more compact code. The price in compilation time, and the resulting improvement in execution time, both depend on the particular application and the hardware environment. You should experiment to find the best level for your application.
Since the precise set of optimizations done at each level will vary from release to release (and sometimes from target to target), it is best to think of the optimization settings in general terms. See section `Options That Control Optimization' in Using the GNU Compiler Collection (GCC), for details about the `-O' settings and a number of `-f' options that individually enable or disable specific optimizations.
Unlike some other compilation systems, gcc has been tested extensively at all optimization levels. There are some bugs which appear only with optimization turned on, but there have also been bugs which show up only in unoptimized code. Selecting a lower level of optimization does not improve the reliability of the code generator, which in practice is highly reliable at all optimization levels.
Note regarding the use of `-O3': The use of this optimization level is generally discouraged with GNAT, since it often results in larger executables which may run more slowly. See further discussion of this point in 7.1.5 Inlining of Subprograms.
Although it is possible to do a reasonable amount of debugging at nonzero optimization levels, the higher the level the more likely that source-level constructs will have been eliminated by optimization. For example, if a loop is strength-reduced, the loop control variable may be completely eliminated and thus cannot be displayed in the debugger. This can only happen at `-O2' or `-O3'. Explicit temporary variables that you code might be eliminated at level `-O1' or higher.
The use of the `-g' switch, which is needed for source-level debugging, affects the size of the program executable on disk, and indeed the debugging information can be quite large. However, it has no effect on the generated code (and thus does not degrade performance).
Since the compiler generates debugging tables for a compilation unit before it performs optimizations, the optimizing transformations may invalidate some of the debugging data. You therefore need to anticipate certain anomalous situations that may arise while debugging optimized code. These are the most common cases:
step or next commands show the PC bouncing back and forth in the code. This may result from any of the following optimizations:
goto, a return, or a break in a C switch statement.
In general, when an unexpected value appears for a local variable or parameter you should first ascertain if that value was actually computed by your program, as opposed to being incorrectly reported by the debugger. Record fields or array elements in an object designated by an access value are generally less of a problem, once you have ascertained that the access value is sensible. Typically, this means checking variables in the preceding code and in the calling subprogram to verify that the value observed is explainable from other values (one must apply the procedure recursively to those other values); or re-running the code and stopping a little earlier (perhaps before the call) and stepping to better see how the variable obtained the value in question; or continuing to step from the point of the strange value to see if code motion had simply moved the variable's assignments later.
In light of such anomalies, a recommended technique is to use `-O0' early in the software development cycle, when extensive debugging capabilities are most needed, and then move to `-O1' and later `-O2' as the debugger becomes less critical. Whether to use the `-g' switch in the release version is a release management issue. Note that if you use `-g' you can then use the strip program on the resulting executable, which removes both debugging information and global symbols.
7.1.5 Inlining of Subprograms
A call to a subprogram in the current unit is inlined if all the following conditions are met:
- The called subprogram is suitable for inlining: it must not contain anything that gcc cannot support in inlined subprograms.
- Any one of the following applies: pragma Inline is applied to the subprogram and the `-gnatn' switch is specified; the subprogram is local to the unit and called once from within it; the subprogram is small and optimization level `-O2' is specified; or optimization level `-O3' is specified.
Calls to subprograms in with'ed units are normally not inlined. To achieve actual inlining (that is, replacement of the call by the code in the body of the subprogram), the following conditions must all be true.
- The called subprogram is suitable for inlining: it must not contain anything that gcc cannot support in inlined subprograms.
- There is a pragma Inline for the subprogram.
Even if all these conditions are met, it may not be possible for the compiler to inline the call, due to the length of the body, or features in the body that make it impossible for the compiler to do the inlining.
Note that specifying the `-gnatn' switch causes additional compilation dependencies. Consider the following:

    package R is
       procedure Q;
       pragma Inline (Q);
    end R;

    package body R is
       ...
    end R;

    with R;
    procedure Main is
    begin
       ...
       R.Q;
    end Main;

With the default behavior (no `-gnatn' switch specified), the compilation of the Main procedure depends only on its own source, `main.adb', and the spec of the package in file `r.ads'. This means that editing the body of R does not require recompiling Main.
On the other hand, the call R.Q is not inlined under these circumstances. If the `-gnatn' switch is present when Main is compiled, the call will be inlined if the body of Q is small enough, but now Main depends on the body of R in `r.adb' as well as on the spec. This means that if this body is edited, the main program must be recompiled. Note that this extra dependency occurs whether or not the call is in fact inlined by gcc.
The use of front end inlining with `-gnatN' generates similar additional dependencies.
Note: The `-fno-inline' switch can be used to prevent all inlining. This switch overrides all other conditions and ensures that no inlining occurs. The extra dependences resulting from `-gnatn' will still be active, even if this switch is used to suppress the resulting inlining actions.
Note: The `-fno-inline-functions' switch can be used to prevent automatic inlining of subprograms if `-O3' is used.
Note: The `-fno-inline-small-functions' switch can be used to prevent automatic inlining of small subprograms if `-O2' is used.
Note: The `-fno-inline-functions-called-once' switch can be used to prevent inlining of subprograms local to the unit and called once from within it if `-O1' is used.
Note regarding the use of `-O3': There is no difference in inlining behavior between `-O2' and `-O3' for subprograms with an explicit pragma Inline assuming the use of `-gnatn' or `-gnatN' (the switches that activate inlining). If you have used pragma Inline in appropriate cases, then it is usually much better to use `-O2' and `-gnatn' and avoid the use of `-O3' which in this case only has the effect of inlining subprograms you did not think should be inlined. We often find that the use of `-O3' slows down code by performing excessive inlining, leading to increased instruction cache pressure from the increased code size. So the bottom line here is that you should not automatically assume that `-O3' is better than `-O2', and indeed you should use `-O3' only if tests show that it actually improves performance.
You can take advantage of the auto-vectorizer present in the gcc back end to vectorize loops with GNAT. The corresponding command line switch is `-ftree-vectorize' but, as it is enabled by default at `-O3' and other aggressive optimizations helpful for vectorization also are enabled by default at this level, using `-O3' directly is recommended.
You also need to make sure that the target architecture features a supported SIMD instruction set. For example, for the x86 architecture, you should at least specify `-msse2' to get significant vectorization (but you don't need to specify it for x86-64 as it is part of the base 64-bit architecture). Similarly, for the PowerPC architecture, you should specify `-maltivec'.
The preferred loop form for vectorization is the for iteration scheme. Loops with a while iteration scheme can also be vectorized if they are very simple, but the vectorizer will quickly give up otherwise. With either iteration scheme, the flow of control must be straight, in particular no exit statement may appear in the loop body. The loop may however contain a single nested loop, if it can be vectorized when considered alone:

    A : array (1..4, 1..4) of Long_Float;
    S : array (1..4) of Long_Float;

    procedure Sum is
    begin
       for I in A'Range(1) loop
          for J in A'Range(2) loop
             S (I) := S (I) + A (I, J);
          end loop;
       end loop;
    end Sum;

The vectorizable operations depend on the targeted SIMD instruction set, but the adding and some of the multiplying operators are generally supported, as well as the logical operators for modular types. Note that, in the former case, enabling overflow checks, for example with `-gnato', totally disables vectorization. The other checks are not supposed to have the same definitive effect, although compiling with `-gnatp' might well reveal cases where some checks do thwart vectorization.
Type conversions may also prevent vectorization if they involve semantics that are not directly supported by the code generator or the SIMD instruction set. A typical example is direct conversion from floating-point to integer types. The solution in this case is to use the following idiom:

    Integer (S'Truncation (F))

if S is the subtype of floating-point object F.
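A minimal sketch of this idiom inside a loop over constrained arrays with static bounds follows. The names Convert_Demo, Float_Array, Int_Array, F, and R are hypothetical, and whether the loop is actually vectorized depends on the target and on the switches in use (for example `-O3').

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Convert_Demo is
   --  Hypothetical array types with static bounds
   type Float_Array is array (1 .. 1024) of Float;
   type Int_Array   is array (1 .. 1024) of Integer;

   F : Float_Array := (others => 1.5);
   R : Int_Array;
begin
   --  Straight-line loop body with no exit statement. Truncating
   --  explicitly before the conversion avoids the rounding semantics
   --  of a direct Integer (F (I)) conversion.
   for I in F'Range loop
      R (I) := Integer (Float'Truncation (F (I)));
   end loop;

   Put_Line (Integer'Image (R (R'First)));
end Convert_Demo;
```

Here Float plays the role of S in the idiom above, since the elements of F are of subtype Float.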
In most cases, the vectorizable loops are loops that iterate over arrays. All kinds of array types are supported, i.e. constrained array types with static bounds:

    type Array_Type is array (1 .. 4) of Long_Float;

constrained array types with dynamic bounds:

    type Array_Type is array (1 .. Q.N) of Long_Float;
    type Array_Type is array (Q.K .. 4) of Long_Float;
    type Array_Type is array (Q.K .. Q.N) of Long_Float;

or unconstrained array types:

    type Array_Type is array (Positive range <>) of Long_Float;

The quality of the generated code decreases when the dynamic aspect of the array type increases, the worst code being generated for unconstrained array types. This is so because, the less information the compiler has about the bounds of the array, the more fallback code it needs to generate in order to fix things up at run time.
You can obtain information about the vectorization performed by the compiler by specifying `-ftree-vectorizer-verbose=N'. For more details of this switch, see section `Options for Debugging Your Program or GCC' in Using the GNU Compiler Collection (GCC).
Since GNAT uses the gcc back end, all the specialized gcc optimization switches are potentially usable. These switches have not been extensively tested with GNAT but can generally be expected to work. Examples of switches in this category are `-funroll-loops' and the various target-specific `-m' options (in particular, it has been observed that `-march=xxx' can significantly improve performance on appropriate machines). For full details of these switches, see section `Hardware Models and Configurations' in Using the GNU Compiler Collection (GCC).
The strong typing capabilities of Ada allow an optimizer to generate efficient code in situations where other languages would be forced to make worst case assumptions preventing such optimizations. Consider the following example:

    procedure R is
       type Int1 is new Integer;
       type Int2 is new Integer;
       type Int1A is access Int1;
       type Int2A is access Int2;
       Int1V : Int1A;
       Int2V : Int2A;
       ...

    begin
       ...
       for J in Data'Range loop
          if Data (J) = Int1V.all then
             Int2V.all := Int2V.all + 1;
          end if;
       end loop;
       ...
    end R;

In this example, since the variable Int1V can only access objects of type Int1, and Int2V can only access objects of type Int2, there is no possibility that the assignment to Int2V.all affects the value of Int1V.all. This means that the compiler optimizer can "know" that the value Int1V.all is constant for all iterations of the loop and avoid the extra memory reference required to dereference it each time through the loop.
This kind of optimization, called strict aliasing analysis, is triggered by specifying an optimization level of `-O2' or higher or `-Os' and allows GNAT to generate more efficient code when access values are involved.
However, although this optimization is always correct in terms of the formal semantics of the Ada Reference Manual, difficulties can arise if features like Unchecked_Conversion are used to break the typing system. Consider the following complete program example:

    package p1 is
       type int1 is new integer;
       type int2 is new integer;
       type a1 is access int1;
       type a2 is access int2;
    end p1;

    with p1; use p1;
    package p2 is
       function to_a2 (Input : a1) return a2;
    end p2;

    with Unchecked_Conversion;
    package body p2 is
       function to_a2 (Input : a1) return a2 is
          function to_a2u is
            new Unchecked_Conversion (a1, a2);
       begin
          return to_a2u (Input);
       end to_a2;
    end p2;

    with p2; use p2;
    with p1; use p1;
    with Text_IO; use Text_IO;
    procedure m is
       v1 : a1 := new int1;
       v2 : a2 := to_a2 (v1);
    begin
       v1.all := 1;
       v2.all := 0;
       put_line (int1'image (v1.all));
    end;

This program prints out 0 in `-O0' or `-O1' mode, but it prints out 1 in `-O2' mode. That's because in strict aliasing mode, the compiler can and does assume that the assignment to v2.all could not affect the value of v1.all, since different types are involved.
This behavior is not a case of non-conformance with the standard, since the Ada RM specifies that an unchecked conversion where the resulting bit pattern is not a correct value of the target type can result in an abnormal value and attempting to reference an abnormal value makes the execution of a program erroneous. That's the case here since the result does not point to an object of type int2. This means that the effect is entirely unpredictable.
However, although that explanation may satisfy a language lawyer, in practice an applications programmer expects an unchecked conversion involving pointers to create true aliases and the behavior of printing 1 seems plain wrong. In this case, the strict aliasing optimization is unwelcome.
Indeed the compiler recognizes this possibility, and the unchecked conversion generates a warning:

    p2.adb:5:07: warning: possible aliasing problem with type "a2"
    p2.adb:5:07: warning: use -fno-strict-aliasing switch for references
    p2.adb:5:07: warning:  or use "pragma No_Strict_Aliasing (a2);"

Unfortunately the problem is recognized when compiling the body of package p2, but the actual "bad" code is generated while compiling the body of m and this latter compilation does not see the suspicious Unchecked_Conversion.
As implied by the warning message, there are approaches you can use to avoid the unwanted strict aliasing optimization in a case like this.
One possibility is to simply avoid the use of `-O2', but that is a bit drastic, since it throws away a number of useful optimizations that do not involve strict aliasing assumptions.
A less drastic approach is to compile the program using the option `-fno-strict-aliasing'. Actually it is only the unit containing the dereferencing of the suspicious pointer that needs to be compiled. So in this case, if we compile unit m with this switch, then we get the expected value of zero printed. Analyzing which units might need the switch can be painful, so a more reasonable approach is to compile the entire program with options `-O2' and `-fno-strict-aliasing'. If the performance is satisfactory with this combination of options, then the advantage is that the entire issue of possible "wrong" optimization due to strict aliasing is avoided.
To avoid the use of compiler switches, the configuration pragma No_Strict_Aliasing with no parameters may be used to specify that for all access types, the strict aliasing optimization should be suppressed.
However, these approaches are still overkill, in that they cause all manipulations of all access values to be deoptimized. A more refined approach is to concentrate attention on the specific access type identified as problematic.
First, if a careful analysis of uses of the pointer shows that there are no possible problematic references, then the warning can be suppressed by bracketing the instantiation of Unchecked_Conversion to turn the warning off:

    pragma Warnings (Off);
    function to_a2u is
      new Unchecked_Conversion (a1, a2);
    pragma Warnings (On);

Of course that approach is not appropriate for this particular example, since indeed there is a problematic reference. In this case we can take one of two other approaches.
The first possibility is to move the instantiation of unchecked conversion to the unit in which the type is declared. In this example, we would move the instantiation of Unchecked_Conversion from the body of package p2 to the spec of package p1. Now the warning disappears. That's because any use of the access type knows there is a suspicious unchecked conversion, and the strict aliasing optimization is automatically suppressed for the type.
If it is not practical to move the unchecked conversion to the same unit in which the destination access type is declared (perhaps because the source type is not visible in that unit), you may use pragma No_Strict_Aliasing for the type. This pragma must occur in the same declarative sequence as the declaration of the access type:

    type a2 is access int2;
    pragma No_Strict_Aliasing (a2);

Here again, the compiler now knows that the strict aliasing optimization should be suppressed for any reference to type a2 and the expected behavior is obtained.
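For illustration only, the example can be condensed into a single compilation unit with the pragma applied to the destination access type. This is a sketch, not the layout of the original program: the procedure name Alias_Demo is hypothetical, and the types are declared locally so that the pragma sits in the same declarative sequence as the access type.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Unchecked_Conversion;

procedure Alias_Demo is
   type Int1 is new Integer;
   type Int2 is new Integer;
   type A1 is access Int1;
   type A2 is access Int2;

   --  Tell the compiler that objects designated by A2 values may alias
   --  objects of other types.
   pragma No_Strict_Aliasing (A2);

   function To_A2 is new Ada.Unchecked_Conversion (A1, A2);

   V1 : A1 := new Int1;
   V2 : A2 := To_A2 (V1);
begin
   V1.all := 1;
   V2.all := 0;   --  writes through the alias created above
   Put_Line (Int1'Image (V1.all));
end Alias_Demo;
```

With the pragma in place, the assignment through V2 should be treated as possibly modifying V1.all, so the value printed is expected to be the same at `-O2' as at `-O0'.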
Finally, note that although the compiler can generate warnings for simple cases of unchecked conversions, there are trickier and more indirect ways of creating type incorrect aliases which the compiler cannot detect. Examples are the use of address overlays and unchecked conversions involving composite types containing access types as components. In such cases, no warnings are generated, but there can still be aliasing problems. One safe coding practice is to forbid the use of address clauses for type overlaying, and to allow unchecked conversion only for primitive types. This is not really a significant restriction since any possible desired effect can be achieved by unchecked conversion of access values.
The aliasing analysis done in strict aliasing mode can certainly have significant benefits. We have seen cases of large scale application code where the time is increased by up to 5% by turning this optimization off. If you have code that includes significant usage of unchecked conversion, you might want to just stick with `-O1' and avoid the entire issue. If you get adequate performance at this level of optimization, that's probably the safest approach. If tests show that you really need higher levels of optimization, then you can experiment with `-O2' and `-O2 -fno-strict-aliasing' to see how much effect this has on size and speed of the code. If you really need to use `-O2' with strict aliasing in effect, then you should review any uses of unchecked conversion of access types, particularly if you are getting the warnings described above.
7.2 Text_IO Suggestions
The Ada.Text_IO package has fairly high overheads due in part to the requirement of maintaining page and line counts. If performance is critical, a recommendation is to use Stream_IO instead of Text_IO for volume output, since this package has less overhead.
If Text_IO must be used, note that by default output to the standard output and standard error files is unbuffered (this provides better behavior when output statements are used for debugging, or if the progress of a program is observed by tracking the output, e.g. by using the Unix tail -f command to watch redirected output).
If you are generating large volumes of output with Text_IO and performance is an important factor, use a designated file instead of the standard output file, or change the standard output file to be buffered using Interfaces.C_Streams.setvbuf.
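A minimal sketch of the second suggestion follows. It assumes that Interfaces.C_Streams exposes setvbuf, stdout, and IOFBF with the usual C meanings, as in the GNAT run-time library. The procedure name Buffered_Out and the 64 KB buffer size are arbitrary choices for illustration.

```ada
with Ada.Text_IO;          use Ada.Text_IO;
with Interfaces.C_Streams; use Interfaces.C_Streams;
with System;

procedure Buffered_Out is
   use type Interfaces.C_Streams.int;

   Status : int;
begin
   --  Request full buffering for standard output before anything is
   --  written to it. A null buffer address lets the C library allocate
   --  the buffer itself.
   Status := setvbuf (stdout, System.Null_Address, IOFBF, 65_536);

   if Status /= 0 then
      Put_Line (Standard_Error, "setvbuf failed");
   end if;

   --  Volume output through Text_IO now goes through the buffer
   for I in 1 .. 100_000 loop
      Put_Line ("line" & Integer'Image (I));
   end loop;
end Buffered_Out;
```

The call is made first because the C library only guarantees the effect of setvbuf when it precedes any other operation on the stream.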
7.3 Reducing Size of Ada Executables with gnatelim
This section describes gnatelim, a tool which detects unused subprograms and helps the compiler to create a smaller executable for your program.
gnatelim
When a program shares a set of Ada packages with other programs, it may happen that this program uses only a fraction of the subprograms defined in these packages. The code created for these unused subprograms increases the size of the executable.
gnatelim tracks unused subprograms in an Ada program and outputs a list of GNAT-specific pragmas Eliminate marking all the subprograms that are declared but never called. By placing the list of Eliminate pragmas in the GNAT configuration file `gnat.adc' and recompiling your program, you may decrease the size of its executable, because the compiler will not generate the code for 'eliminated' subprograms. See section `Pragma Eliminate' in GNAT Reference Manual, for more information about this pragma.
gnatelim needs as its input data the name of the main subprogram.
If a set of source files is specified as gnatelim arguments, it treats these files as a complete set of sources making up a program to analyse, and analyses only these sources.
After a full successful build of the main subprogram, gnatelim can be called without specifying sources to analyse; in this case it computes the source closure of the main unit from the `ALI' files.
The following command will create the set of `ALI' files needed for gnatelim:

    $ gnatmake -c Main_Prog

Note that gnatelim does not need object files.
gnatelim
gnatelim has the following command-line interface:

    $ gnatelim [switches] -main=main_unit_name {filename} [-cargs gcc_switches]

main_unit_name should be a name of a source file that contains the main subprogram of a program (partition).
Each filename is the name (including the extension) of a source file to process. "Wildcards" are allowed, and the file name may contain path information.
`gcc_switches' is a list of switches for gcc. They will be passed on to all compiler invocations made by gnatelim to generate the ASIS trees. Here you can provide `-I' switches to form the source search path, use the `-gnatec' switch to set the configuration file, use the `-gnat05' switch if sources should be compiled in Ada 2005 mode, etc.
gnatelim has the following switches:
gnatelim. You also can combine this switch with an explicit list of files.
gnatelim output into a specified file. If this file already exists, it is overwritten. If this switch is not used, gnatelim outputs its results into `stderr'.
gnatelim outputs to the standard error stream the number of program units left to be processed. This option turns this trace off.
gnatelim version information is printed as Ada comments to the standard output stream. Also, in addition to the number of program units left, gnatelim will output the name of the current unit being processed.
Note: to invoke gnatelim with a project file, use the gnat driver (see 12.2 The GNAT Driver and Project Files).
If some program uses a precompiled Ada library, it can be processed by gnatelim in the usual way. gnatelim will never generate an Eliminate pragma for a subprogram if the body of this subprogram has not been analysed; this is a typical case for subprograms from precompiled libraries. Switch `-wq' may be used to suppress warnings about missing source files and non-analyzed subprogram bodies that can be generated when processing precompiled Ada libraries.
In some rare cases gnatelim may try to eliminate subprograms that are actually called in the program. In this case, the compiler will generate an error message of the form:

    main.adb:4:08: cannot reference subprogram "P" eliminated at elim.out:5

You will need to manually remove the wrong Eliminate pragmas from the configuration file indicated in the error message. You should recompile your program from scratch after that, because you need consistent configuration file(s) during the entire compilation.
In order to get a smaller executable for your program you now have to recompile the program completely with the configuration file containing pragmas Eliminate generated by gnatelim. If these pragmas are placed in `gnat.adc' file located in your current directory, just do:

    $ gnatmake -f main_prog

(Use the `-f' option for gnatmake to recompile everything with the set of pragmas Eliminate that you have obtained with gnatelim).
Be aware that the set of Eliminate pragmas is specific to each program. It is not recommended to merge sets of Eliminate pragmas created for different programs in one configuration file.
gnatelim Usage Cycle
Here is a quick summary of the steps to be taken in order to reduce the size of your executables with gnatelim. You may use other GNAT options to control the optimization level, to produce the debugging information, to set search path, etc.
- Create the set of `ALI' files needed for gnatelim:

    $ gnatmake -c main_prog

- Generate a list of Eliminate pragmas in the default configuration file `gnat.adc' in the current directory:

    $ gnatelim main_prog >[>] gnat.adc

- Recompile the program completely with these pragmas in effect:

    $ gnatmake -f main_prog

7.4 Reducing Size of Executables with Unused Subprogram/Data Elimination
This section describes how you can eliminate unused subprograms and data from your executable just by setting options at compilation time.
7.4.1 About unused subprogram/data elimination
7.4.2 Compilation options
7.4.3 Example of unused subprogram/data elimination
7.4.1 About unused subprogram/data elimination
By default, an executable contains all code and data of its composing objects (directly linked or coming from statically linked libraries), even data or code never used by this executable.
This feature allows you to eliminate such unused code and data from your executable, making it smaller (on disk and in memory).
This functionality is available on all Linux platforms except for the IA-64 architecture and on all cross platforms using the ELF binary file format. In both cases GNU binutils version 2.16 or later are required to enable it.
7.4.2 Compilation options
The operation of eliminating the unused code and data from the final executable is directly performed by the linker.
In order to do this, it has to work with objects compiled with the following options: `-ffunction-sections' `-fdata-sections'. These options are usable with C and Ada files. They place each function or data item, respectively, in a separate section in the resulting object file.
Once the objects and static libraries are created with these options, the linker can perform the dead code elimination. You can do this by passing the `-Wl,--gc-sections' option to the gcc command or in the `-largs' section of gnatmake. This will perform a garbage collection of code and data never referenced.
If the linker performs a partial link (`-r' ld linker option), then you will need to provide one or several entry points using the `-e' / `--entry' ld option.
Note that objects compiled without the `-ffunction-sections' and `-fdata-sections' options can still be linked with the executable. However, no dead code elimination will be performed on those objects (they will be linked as is).
The GNAT static library is now compiled with -ffunction-sections and -fdata-sections on some platforms. This allows you to eliminate the unused code and data of the GNAT library from your executable.
7.4.3 Example of unused subprogram/data elimination
Here is a simple example:

    with Aux;

    procedure Test is
    begin
       Aux.Used (10);
    end Test;

    package Aux is
       Used_Data   : Integer;
       Unused_Data : Integer;

       procedure Used   (Data : Integer);
       procedure Unused (Data : Integer);
    end Aux;

    package body Aux is
       procedure Used (Data : Integer) is
       begin
          Used_Data := Data;
       end Used;

       procedure Unused (Data : Integer) is
       begin
          Unused_Data := Data;
       end Unused;
    end Aux;

Unused and Unused_Data are never referenced in this code excerpt, and hence they may be safely removed from the final executable.

    $ gnatmake test

    $ nm test | grep used
    020015f0 T aux__unused
    02005d88 B aux__unused_data
    020015cc T aux__used
    02005d84 B aux__used_data

    $ gnatmake test -cargs -fdata-sections -ffunction-sections \
         -largs -Wl,--gc-sections

    $ nm test | grep used
    02005350 T aux__used
    0201ffe0 B aux__used_data

It can be observed that the procedure Unused and the object Unused_Data are removed by the linker when using the appropriate options.