A Handbook For ELF

Last Significant Page Update:

2025-09-12

Comments to:

Since I am dealing with a lot of ELF-related detail for my Elftoolchain work, I thought that it might be a good idea to collate the material into a Handbook.

The current plan is to structure the ELF Handbook as a reference book, with emphasis on how ELF concepts translate to the bits inside of files, and how ELF’s semantics are implemented by an operating system. There will be plenty of diagrams illustrating the behavior expected at the level of machine code.

Please see the section “Concept Index” below for the list of concepts that are planned to be covered.

Proposed Handbook Structure

The structure below is tentative, and very much subject to change.

Chapters / Appendices

Introduction/Overview

Hello World (Static)

A simple, standalone, hello-world executable.
Examination of the generated ELF object.
How the resulting process is laid out in virtual memory.

Hello World (Dynamic)

Examination of the generated ELF objects.
How the resulting process is laid out in virtual memory.
Overview of the startup process.

ELF File Structure

Sufficient detail to make the following sections comprehensible.

Dynamic Executables

Relocations Explained

Linking

Static Linking

Dynamic Linking

Link time actions and runtime actions.
A step-by-step guide (for the default symbol lookup rules).

Handling Archives

Loading

Process layout in virtual memory.
Static vs dynamic executables.
Init/Fini sections.

Multi-Threading Support

PT_TLS segments.

C++ Support

Name mangling.

Rust Support

Symbol Versioning

How it works.
How it is represented in the ELF.

Symbol and Object Capabilities

How capabilities work; what they are used for.
Symbol and object lookup rules.

Using The Runtime Loader

Various RTLD_* options to dlopen().

(How-To) Evolving A Shared Library

How to preserve binary compatibility with prior binaries.
Tools to check ABI compatibility.
Using symbol versioning.
Using multiple .so files for binary-incompatible changes.
The notion of ‘interfaces’.

(How-To) Interposing Shared Objects

An example showing how to override the definition of, say, printf().
Finding and invoking the downstream definition using dlsym(RTLD_NEXT).

Figures

ar(1) Archive Structure: Showing the header, and member chaining.
ELF structure (executables): Showing the header, PHDR table, and segments.
ELF structure (relocatables): Showing the header, SHDR table, and section data.
Linking (static linking): Diagram of how sections are merged.
Linking (dynamic linking): Link time merging of sections, runtime resolution of symbols.
Process Layout (Static Executables): Showing the mapping of text, bss and data segments.
Process Layout (Dynamic Executables): Showing the mapping of text, bss, and data segments across multiple objects in an address space.
Shared Object Lookup: Showing how objects are found on disk.
Shared Objects: Disk sharing: Showing the sharing of disk space vs. static linking.
Shared Objects: RAM sharing: Showing the sharing of physical memory of text segments by the virtual memory system.
Symbol Lookup (Capabilities): Showing the selection of capability-specific symbols in preference to a generic one.
Symbol Lookup (Default Lookup): Showing symbol binding within a dynamic object and to a different object.
Symbol Lookup (Group Lookup): Restricting symbol binding to within a group of objects.
Symbol Lookup (Versions): Showing resolution of versioned symbols.

Tables

Auto-generated Symbols: Each auto-generated symbol along with a short description of the symbol.
LD_* environment variables: All LD_* environment variables and their impact on the linker’s behavior.
Linker Expansions: All string tokens (like $ORIGIN) known to the linker with their meaning, and the contexts where they are expanded.
Program Segments: Segments known to the kernel or runtime linker, with their semantics.
Reserved Symbols: Reserved symbols and their meanings.
Section Types: Short definitions of each section type; examples of where elements of that type would be used.
Relocation Types: Each relocation type defined for RISC-V or ARM (say), with examples of source code that results in that relocation type.

Concept Index

A (partial) list of concepts to be covered in the handbook.

32- vs 64-bit objects

Compile-time and runtime search paths.
Pitfalls in mix-and-match linking (if at all supported by an architecture).

Absolute Symbols

What these are.
How to define such symbols.
Where such symbols are needed.
Why these kinds of symbols are discouraged in shared objects.

Ancillary Shared Objects

E.g., for holding debug information separately from a ‘main’ object:

Why ancillary objects are useful or needed.
How to generate these files.

ar(1) archives

What these are.
How used during program development.
Archive structure. Extensions to the format.
Archive symbol tables: SVR4 and BSD tables.
Special members (like __.LIBDEP, //, etc.).
Using libelf to access archive members.
Using libarchive to access archive members.
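
As a rough illustration of the archive structure listed above, the sketch below walks the members of an ar(1) archive held in memory. It assumes the common layout: an 8-byte global magic followed by 60-byte member headers, with member data padded to an even offset. The helper name iter_ar_members is hypothetical, and names are returned raw, so special members (such as the symbol table or the // string table) and long-name schemes are not interpreted.

```python
AR_MAGIC = b"!<arch>\n"
AR_HDR_SIZE = 60  # name[16] date[12] uid[6] gid[6] mode[8] size[10] fmag[2]


def iter_ar_members(data):
    """Yield (raw_name, payload) for each member of an ar(1) archive."""
    if data[:len(AR_MAGIC)] != AR_MAGIC:
        raise ValueError("not an ar archive")
    offset = len(AR_MAGIC)
    while offset + AR_HDR_SIZE <= len(data):
        header = data[offset:offset + AR_HDR_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("bad member header at offset %d" % offset)
        # Header fields are space-padded ASCII text.
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))
        start = offset + AR_HDR_SIZE
        yield name, data[start:start + size]
        # Members start on even offsets; odd-sized payloads are padded.
        offset = start + size + (size & 1)
```

Because the header chain is just "header, payload, optional pad byte", the same loop serves both SVR4-style and BSD-style archives; only the interpretation of the raw names differs.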

Archive Processing by the Linker

How linkers look for symbol definitions in archives.
Speeding up the link using lorder and tsort.
Using undefined symbols to force an archive lookup.
One-pass linkers.
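
The effect of command-line order on a one-pass linker can be sketched with a small simulation. This is a simplified model, not the algorithm of any particular linker: objects are always linked, while archive members are pulled in only when they define a symbol that is undefined at the moment the archive is visited. The function one_pass_link and its input layout are assumptions made for illustration, and the model assumes the archive has a symbol table so that members can be taken in any order.

```python
def one_pass_link(inputs):
    """Simulate one left-to-right pass over objects and archives.

    `inputs` is a list of ("object", (name, defs, refs)) or
    ("archive", [(name, defs, refs), ...]) entries.
    Returns (names of linked objects, set of still-undefined symbols).
    """
    defined, undefined, linked = set(), set(), []

    def add_object(name, defs, refs):
        linked.append(name)
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(set(refs) - defined)

    for kind, payload in inputs:
        if kind == "object":
            add_object(*payload)
            continue
        # Archive: keep extracting members that satisfy an undefined
        # symbol until a full scan extracts nothing new.
        pending = list(payload)
        progress = True
        while progress:
            progress = False
            for member in list(pending):
                name, defs, refs = member
                if undefined & set(defs):
                    add_object(name, defs, refs)
                    pending.remove(member)
                    progress = True
    return linked, undefined
```

With main.o referencing f, and an archive holding f.o and g.o, placing the archive after main.o links all three objects, while placing it first leaves f undefined because nothing was undefined when the archive was visited.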

Auditing the Runtime Linker

What this is and why it is useful.
Environment variables controlling the audit.

Auto-Generated Symbols

What these are.
When these are generated during the link.

Backward Compatibility

See Object Evolution.

Capabilities (Link-time)

Describe what these are and where they are useful.
Types of capabilities: hardware, platform, machine, software, etc.
Object capabilities vs symbol capabilities.
How archives and capabilities interact.
The notion of a ‘lead symbol’.

C++ Name Mangling

The need for name mangling.
Mangling schemes in current use.
The Itanium ‘standard’ for name mangling.
Potential issues with mangled names.
Tools to mangle and demangle C++ names.
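
To give a feel for what mangled names look like, here is a deliberately tiny sketch of the Itanium scheme for free functions whose parameters are all builtin types. The function mangle is hypothetical and covers only a small corner of the scheme: templates, pointers, qualifiers, substitutions, and the special handling of the std namespace are left out.

```python
# One-letter codes for a few builtin types in the Itanium scheme.
BUILTIN_CODES = {
    "void": "v", "bool": "b", "char": "c", "short": "s", "int": "i",
    "unsigned int": "j", "long": "l", "unsigned long": "m",
    "long long": "x", "unsigned long long": "y",
    "float": "f", "double": "d",
}


def mangle(qualified_name, param_types=()):
    """Mangle a free function such as 'ns::f' taking builtin parameters."""
    parts = qualified_name.split("::")
    if parts[0] == "std":
        raise ValueError("the std namespace is abbreviated; not handled here")
    # Each identifier is emitted as <length><identifier>.
    names = "".join("%d%s" % (len(p), p) for p in parts)
    if len(parts) > 1:
        # Nested names are bracketed by N ... E.
        names = "N" + names + "E"
    # An empty parameter list is encoded as a single 'v'.
    params = "".join(BUILTIN_CODES[t] for t in param_types) or "v"
    return "_Z" + names + params
```

For example, mangle("foo", ["int"]) yields _Z3fooi, the symbol name a compiler following this scheme would emit for foo(int).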

C++ Templates

What these are, and the ELF structures that they are compiled to.
Handling duplicate template expansions at link time: code, data, entries that would go into .bss.
.gnu.linkonce sections.

COMDAT Sections

Why needed and where used.
Examples of C and FORTRAN sources that require COMDAT semantics.
How these sections are handled during linking.
Interaction with dynamic objects.
Any gotchas.

Compensating Dependencies

Why programmers end up specifying these.
Build problems due to their use.
Solutions.

Compilation vs Runtime Environments

List the differences between link behavior in compile-time and runtime environments.

Compilation Models

Why needed by some architectures.
Mix-and-match of objects built with differing models.

Compressed Debug Sections

How represented in an ELF object.
How to create these.
Costs and benefits.

Controlling Symbol Visibility

Why useful.
How to control symbol visibility.

Copy Relocations

What these are.
Why these are useful.

Cross- vs native linking

Issues in handling non-native architectures.

Cyclic Link-time Dependencies

Why these arise.
How to handle them.

Data Segments

Various types (.data, .bss, etc.) and their properties.
How the linker assembles these from its input relocatable objects.
Control of data placement using linker scripts.
Relocations that are applicable.
When potentially sharable across processes by the virtual memory system.
How data segments are distributed across shared objects.

Default Symbol Lookup Process

Describe the sequence of objects searched by the runtime loader for symbols.
Searching in dependent objects.
Lazy loading of dependent objects.
Searching in dlopen()’ed objects.
Search scopes: world vs local.

Deferred Symbol References

What these are.
Where useful.

Dependent Shared Objects

Define the notion of a ‘dependent object’.
How object dependencies are specified at link time.
How dependencies are represented in the ELF format.
Rules for look up of dependent objects at link time and at run time.
Control of the lookup path using linker variables like $ORIGIN.

Direct Bindings

What these are.
Where useful.
How to specify direct bindings.
Conversely, how to prevent a symbol from being directly bound.
Why singleton symbols cannot be directly bound.

Displacement Relocations

What these are.
How they arise.

dlopen() and dlsym()

What these APIs are for.
How these APIs work: search paths, linker variables like $ORIGIN, the effect for the various RTLD_* flags.

Dynamic Linking

Describe what this is.
Advantages and disadvantages.
How represented in the ELF file format.
The need for position-independent code.
The need for a GOT and PLT.
Evolving APIs with backward compatibility when using dynamic linking.

Dynamic Objects

Shared objects, dynamic executables, position-independent executables.
How these look to the OS virtual memory manager.
Startup (.init) and teardown (.fini) semantics, when these are invoked.
Relocations used by dynamic objects; examples of source code that generates such relocations.

.dynamic Segment

The function of this section in an ELF object.
How pointed to from the ELF program header.
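
A minimal sketch of how the dynamic entries can be read, assuming a little-endian 64-bit object whose .dynamic contents and dynamic string table have already been extracted as bytes. Each entry is a signed 64-bit tag followed by a 64-bit value; for DT_NEEDED entries the value is an offset into the dynamic string table. The helper needed_libraries is a hypothetical name.

```python
import struct

DT_NULL = 0    # marks the end of the dynamic array
DT_NEEDED = 1  # value is a string table offset naming a dependency


def needed_libraries(dynamic, dynstr):
    """Return the DT_NEEDED names found in a 64-bit LSB dynamic array."""
    names = []
    for d_tag, d_val in struct.iter_unpack("<qQ", dynamic):
        if d_tag == DT_NULL:
            break
        if d_tag == DT_NEEDED:
            end = dynstr.index(b"\0", d_val)
            names.append(dynstr[d_val:end].decode("utf-8"))
    return names
```

The order of the returned names matches the order of the entries in the file, which is also the order in which the dependencies were recorded at link time.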

Dynamic String Tokens

List the linker tokens (e.g. $ORIGIN, etc.) supported by BSD and GNU runtime linkers, and their semantics.
Security considerations with setuid executables.
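
The idea behind $ORIGIN can be sketched as a plain string substitution over a colon-separated search path. The function expand_origin is hypothetical and only a model: real runtime linkers derive the origin from the resolved location of the loaded object, support other tokens too, and apply restrictions for setuid executables that are not modelled here.

```python
import posixpath


def expand_origin(search_path, object_path):
    """Expand $ORIGIN / ${ORIGIN} in a colon-separated search path.

    `object_path` is the path of the object that carries the search
    path; its directory is used as the origin.
    """
    origin = posixpath.dirname(object_path)
    expanded = []
    for directory in search_path.split(":"):
        # Handle the braced form first so that it is replaced whole.
        directory = directory.replace("${ORIGIN}", origin)
        directory = directory.replace("$ORIGIN", origin)
        expanded.append(directory)
    return expanded
```

This is what lets an application installed under an arbitrary prefix find its private libraries relative to its own location.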

elfdump / objdump / readelf (Utilities)

Briefly describe how to use these tools.

ELF File Format

Describe the format.
The ELF header and its features.
How sections are described in a relocatable.
How segments are described in an executable object.
Possibly reuse/rework content from Libelf by Example.
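
As a first taste of how the format maps to bytes, the sketch below decodes the start of an ELF header: the identification bytes and the three fields that follow them. These have the same layout in 32-bit and 64-bit objects. The function name parse_elf_header_start is an assumption made for illustration, and only a handful of constants are tabulated.

```python
import struct

EI_CLASS = {1: "ELFCLASS32", 2: "ELFCLASS64"}
EI_DATA = {1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}
E_TYPE = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}


def parse_elf_header_start(data):
    """Decode e_ident, e_type, e_machine and e_version from raw bytes."""
    if len(data) < 24 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF object")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in EI_CLASS or ei_data not in EI_DATA:
        raise ValueError("unknown ELF class or data encoding")
    # The data encoding byte selects the byte order of all later fields.
    order = "<" if ei_data == 1 else ">"
    # e_type, e_machine and e_version follow the 16-byte e_ident array.
    e_type, e_machine, e_version = struct.unpack_from(order + "HHI", data, 16)
    return {
        "class": EI_CLASS[ei_class],
        "data": EI_DATA[ei_data],
        "type": E_TYPE.get(e_type, hex(e_type)),
        "machine": e_machine,
        "version": e_version,
    }
```

For a file on disk, reading the first 24 bytes in binary mode and passing them to this function is enough to tell a relocatable from an executable or shared object.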

ELF Section Groups

Describe what these are.
Why useful.
How to specify and use section groups.
Linker behavior with section groups.

ELF Sections

What these are, and the role they play.
How represented in an ELF file.
Various properties that sections can have.
Which ELF sections end up in an executable and which are purely for link-time use.

ELF Sections vs Segments

The point in the object’s lifecycle that each concept (section or segment) is relevant.
How sections map to segments.
How the linker combines sections.
Discarded sections.

ELF Segments

How these are constructed from sections.
How represented in the ELF file format.
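
The on-disk representation of a segment is one program header table entry. The sketch below decodes a single 64-bit entry, assuming little-endian encoding; decode_phdr64 is a hypothetical helper, and only the generic segment types are named.

```python
import struct

# Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
#             p_filesz, p_memsz, p_align (56 bytes in total).
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
PT_NAMES = {0: "PT_NULL", 1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP",
            4: "PT_NOTE", 5: "PT_SHLIB", 6: "PT_PHDR", 7: "PT_TLS"}
PF_X, PF_W, PF_R = 1, 2, 4


def decode_phdr64(entry):
    """Decode one little-endian Elf64_Phdr entry into a dictionary."""
    (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
     p_filesz, p_memsz, p_align) = ELF64_PHDR.unpack(entry)
    perms = (("r" if p_flags & PF_R else "-") +
             ("w" if p_flags & PF_W else "-") +
             ("x" if p_flags & PF_X else "-"))
    return {
        "type": PT_NAMES.get(p_type, hex(p_type)),
        "perms": perms,
        "offset": p_offset,
        "vaddr": p_vaddr,
        "filesz": p_filesz,
        # Bytes beyond p_filesz are not in the file and are zero-filled.
        "zero_fill": p_memsz - p_filesz,
        "align": p_align,
    }
```

A non-zero zero_fill on a writable PT_LOAD segment is how uninitialized data (.bss) shows up at the segment level.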

ELF Standards

The main ‘standard’ and who (apparently) maintains it.
Processor-specific ABIs and who maintains these.
psABI documents lacking clear owners.
Architectures using ELF but without a formal psABI (e.g. VAX).

ELF Startup: Dynamic Executables

The PT_INTERP segment and its contents.
How the kernel invokes the ELF interpreter.
What the interpreter does.
Relocations and fix-ups needed at startup.
Loading dependent shared objects.
Lazy loading.

ELF Startup: Static Executables

The _start entry point.
Relocations needed at load time.
How program segments map to virtual memory, copy-on-write sharing.
The machine environment when control passes to _start.
The runtime environment expected by the C language main().

Encapsulation Symbols

What these are, and where these would be useful.
How to get a linker to generate these.

Environment Variables

Describe the environment variables that influence linker behavior (e.g. LD_LIBRARY_PATH, LD_AUDIT, etc.)
Describe the environment variables that influence runtime loader behavior.

Filter Objects

Where filter objects are useful.
Filter object types.
Creating filter objects and linking against them.
Symbol lookup when using filter objects.

.fini and .fini_array Sections

What these are for.
When invoked during the process/shared object’s usage lifecycle.
How represented in the ELF object.
The runtime environment that code in these sections should expect.
Specifying code to be added to these sections.
How the linker merges these sections across relocatables.

Global Offset Table (GOT)

Why a GOT is needed, source/machine code examples.
How represented in an ELF object.
Relocations applicable.

Hiding Obsolete APIs

Using stubs to remove functions from future compilations, while keeping them around for older programs.
See Object Evolution below.

Immediate Reference

Where triggered.
Forcing using LD_BIND_NOW.

.init and .init_array Sections

What these sections are for.
When the code in these sections is invoked during a process/shared object’s usage lifecycle.
How represented in the ELF object.
The runtime environment that code in these sections should expect.
Specifying code to be added to these sections.
How the linker merges these sections across relocatables.

.interp Segment

What this is and what it holds.

Kernel Loader

What this does.
Describe differences from the runtime loader for user programs.

Kernel Modules

What these are.
How these are different from dynamic executables meant for userspace.
How represented in ELF form.
Describe NetBSD/FreeBSD kernel modules.

Lazy Loading (of Objects)

What this is.
Advantages/drawbacks of lazy loading.
How to specify objects as lazily loaded.

Lazy Reference (of Symbols)

Why needed.

Lead Symbols

What these are.
How they guide the lookup for capability-specific symbols.

Library Naming Conventions

For unix.

Linker Scripts

What these are and where these are useful.
Implicit (default) scripts.
Linker script features (for a few linkers).

Link Editor

Describe broadly what a link editor does.
How invoked.
How symbols are resolved.
Impact of command-line position of objects and archives on symbol resolution.
Controlling the layout and content of the linker’s output.

Link Editor Extensions

How to extend the functionality of the link editor at runtime.

Linker Environment Variables

See Environment Variables.

Mapfiles

What these are.
Where and how used.

Multiple Symbol Definitions

How these are resolved when using dynamic objects.
Impact of search order.
Examples of unexpected behavior.
Difference between static linking and dynamic linking.

Non-symbolic Relocations

Describe what these are.

Object Evolution

Techniques to preserve backward compatibility when evolving objects.
How to use symbol versioning.
How to use multiple .so objects.
SONAME functionality.
Defining interfaces for shared objects.
Additive vs non-additive changes to interfaces.
Controlling symbol scope / keeping symbols ‘local’.
Using filter objects.
Differences in linker lookup vs runtime lookup of symbols.
Differences in linker lookup vs runtime lookup of dependent objects.

Object Groups (Link time)

Why useful.
Forcing symbol lookup to be within a group.

Object Interposition

What this is.
Why useful.
How to use LD_PRELOAD.

Object Versioning

How to version files to preserve backward compatibility.
Using .soname.

Parent Objects (Plugins)

What these are.
How to specify a ‘parent’ object using mapfiles.

.plt Segment

What this is.
Why needed.
Lazy resolution of procedure symbols.

Position Independent Code

What PIC looks like at the machine level.
Advantages of PIC.
Disadvantages of PIC: extra indirections, loss of performance, code size, etc.

Position Independent Executables

What these are.
How to create PIEs.

.preinit_array Segment

Describe the semantics of this segment.

Procedure Linkage Table (PLT)

What this is.
Why necessary for dynamic objects.
Lazy resolution of procedure symbols: why useful.
Content of a PLT entry before and after symbol resolution.

Relocatable Object

The structure of a relocatable object.
What specifically is ‘relocatable’ about the contents.
How relocations needed are represented in ELF.

Relocations

What relocations are, and why they are needed.
Types of relocations.
Where relocations are defined in a psABI.
Examples of instructions being modified by relocation.
When relocations are applied during the linking and loading process.
Tools to look at relocations in an ELF object.
Errors during relocation processing.
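
As an illustration of how a relocation entry is laid out, the sketch below decodes one Elf64_Rela record, assuming little-endian encoding. The symbol index and relocation type are packed into r_info: the upper 32 bits hold the symbol table index and the lower 32 bits the type. The meaning of the type number is defined by each psABI and is not interpreted here; decode_rela64 is a hypothetical helper.

```python
import struct


def decode_rela64(entry):
    """Decode one little-endian Elf64_Rela entry (24 bytes)."""
    r_offset, r_info, r_addend = struct.unpack("<QQq", entry)
    return {
        "offset": r_offset,            # where the fix-up is applied
        "symbol": r_info >> 32,        # index into the symbol table
        "type": r_info & 0xFFFFFFFF,   # psABI-specific relocation type
        "addend": r_addend,            # signed constant used in the fix-up
    }
```

In 32-bit objects the same information is packed differently (symbol index in the upper 24 bits of a 32-bit r_info, type in the lower 8), so a decoder needs to know the ELF class before splitting the field.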

Reserved Symbols

A list of reserved symbols (like _init, _etext, etc.) and their meanings.

Runnable Process

How a file on disk is transformed into a runnable process.
Read-only vs read-write parts.
Executable vs non-executable memory.
Stacks.
Threads.

Runpaths

How used to find shared object dependencies.
How specified at link time.
How stored in the ELF file format.

Runtime Linker

Where found on the file system.
How invoked by the kernel, how information about the current executable is passed to the linker.
Initialization steps.
When and how control is passed to the main executable.
Directories searched for dependent objects.
How dependencies are represented in the ELF format.
Symbol lookup within a set of loaded objects.
Application access using dlsym().
Flags controlling runtime linker behavior.

Rust

Name mangling rules for Rust.
Anything else that is Rust specific.

Singleton Symbols

What these are, and why they are useful.
How multiple definitions across shared objects are handled.

Startup Performance (Dynamic Objects)

Slow-downs due to startup actions, relative to static executables.
Lazy symbol lookup (of procedure symbols).
Measuring startup performance.
Mitigations.
Caching by the runtime loader.

String Tables

What they hold.
Which string tables exist in an ELF object.
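
A string table is a sequence of NUL-terminated strings, and other structures refer to a string by its byte offset into the table. A minimal lookup, with strtab_lookup as a hypothetical helper name, might look like this:

```python
def strtab_lookup(strtab, offset):
    """Return the NUL-terminated string that starts at `offset`."""
    if not 0 <= offset < len(strtab):
        raise IndexError("offset outside string table")
    end = strtab.index(b"\0", offset)
    return strtab[offset:end].decode("utf-8")
```

Since an offset may point into the middle of a longer string, a table holding ".symtab" can also serve a reference to "tab" without storing it separately, which is one of the ways string tables are kept small.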

String Table Compression

Why compression is needed.
Methods of compressing a string table.
Performance implications.

Stub Objects

Used for speeding up builds.
Also used for ensuring build correctness, to separate interface from implementation.
How to create, and use.

Symbol Binding (concept)

Define the notion of binding.
Weak vs strong binding.

Symbol Capabilities

What these are used for.
Examples of use for hardware-optimized functionality.
The ‘lead symbol’ that represents a family of related symbols distinguished by capability.
How represented in an ELF object.

Symbol Elimination

Why needed.
How to specify symbols to be eliminated from symbol tables.

Symbolic Binding (of Symbols)

Why needed.
How to specify symbolic binding for a shared object.
Differences from Direct Binding for symbols.

Symbolic Relocations

What these are.
Contrast with non-symbolic relocations.
Source code examples requiring symbolic relocation.

Symbol Interposition

What this is.
Where useful to override functionality.
Where unwanted; how to prevent unwanted interposition with Direct Binding.

Symbol Resolution

Simple vs Complex resolutions of symbols.
Handling symbols with differing characteristics.
Symbol resolution states: ‘undefined’, ‘tentative’, ‘defined’.
Precedence of resolution states.

Symbol Scopes

‘Global’ and ‘Local’ scope.
Reducing the scope of symbols using mapfiles / linker scripts.

Symbol Search Order

Default Symbol Search Order (‘World’ search), vs local (‘Group’) searches.
The impact of the order of loading of dependent objects.
Linker lookups vs runtime lookups.

Symbol Tables

What these are, and what they are used for.
The structure of each symbol table entry.
The associated string table.
Link-time tables (.symtab/.strtab) vs runtime table (.dynsym/.dynstr).
Table generation process.
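
The structure of a symbol table entry can be sketched by decoding one Elf64_Sym record against its associated string table. The sketch assumes little-endian encoding; decode_sym64 is a hypothetical helper and only the generic binding, type, and visibility values are named.

```python
import struct

# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size.
ELF64_SYM = struct.Struct("<IBBHQQ")
STB_NAMES = {0: "STB_LOCAL", 1: "STB_GLOBAL", 2: "STB_WEAK"}
STT_NAMES = {0: "STT_NOTYPE", 1: "STT_OBJECT", 2: "STT_FUNC",
             3: "STT_SECTION", 4: "STT_FILE", 5: "STT_COMMON", 6: "STT_TLS"}
STV_NAMES = {0: "STV_DEFAULT", 1: "STV_INTERNAL",
             2: "STV_HIDDEN", 3: "STV_PROTECTED"}


def decode_sym64(entry, strtab):
    """Decode one little-endian Elf64_Sym entry (24 bytes)."""
    st_name, st_info, st_other, st_shndx, st_value, st_size = \
        ELF64_SYM.unpack(entry)
    end = strtab.index(b"\0", st_name)
    binding, sym_type = st_info >> 4, st_info & 0xF
    return {
        "name": strtab[st_name:end].decode("utf-8"),
        "binding": STB_NAMES.get(binding, binding),
        "type": STT_NAMES.get(sym_type, sym_type),
        "visibility": STV_NAMES[st_other & 0x3],
        "shndx": st_shndx,
        "value": st_value,
        "size": st_size,
    }
```

The same decoder applies to entries from the link-time table and the runtime table; only the string table passed in differs.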

Symbol Types

The types expressible in ELF.

Symbol Versioning

How to specify versions for sets of symbols.
The use of symbol versioning.
How represented in the ELF file.
Symbol lookup in the presence of multiple symbol versions.
The ‘base’ symbol version, based on the object’s name.
“Empty” versions.
Using dlsym() to look up version identifiers.
Pinning down version dependencies at link time.

Symbol Visibility

The meaning of ‘Local’ or ‘Global’ visibility.
Adjusting symbol visibility at link time.
Singleton symbols.

Tentative Symbols

How to define these.
Where useful.
Lack of ordering guarantees in output files.

Text Relocations

What these are.
Why they are bad (pessimal copy-on-write behavior).
Why they arise. Source code examples.
How to find them, e.g., using the findtextrel utility.

Thread-local Storage

The semantics of thread-local storage.
How to specify thread-local storage in source code.
How represented in an ELF object.
Segments holding thread-local storage (of type PT_TLS).
Sections with TLS data: (initialized) .tdata, (uninitialized) .tbss.
Various TLS models.
How TLS segments are processed at startup and termination of a dynamic object.

Undefined Symbols

What these are.
The effect of undefined symbols on a static link.
The effect of undefined symbols on dynamic objects.
Specifying additional undefined symbols at link time, and why one would do such a thing.

Unused Sections

How the linker determines that a section is unused.
How the linker handles cyclic references between sections where the section group is otherwise unused.
The default behavior of unused sections.
Forcing unused sections to be kept around.

Versioned Filenames

Using versioned filenames to allow older binaries to run.
Rules for using version numbers in filenames for shared objects.
Using symbolic links.
How dependencies are recorded in an ELF object.
Runtime lookup of the correct versioned filename.

Weak Symbols

What these are.
Why they are useful.
How to define weak symbols.
Why they are considered to be fragile and error-prone.
Using dlsym(RTLD_PROBE) instead of weak symbols.