[Exploit development] 6- Dealing with ELF files programmatically
Intro
Welcome to the third part of our exploration of executable binary files. This article looks at the structure of ELF files, the critical information they contain, and how to interact with them programmatically. As mentioned previously, the PE format is vital for cybersecurity specialists, especially in the specializations that grow out of reverse engineering, but the ELF format is also important, and it is our focus here. Before continuing, read the first part, which gives a theoretical overview of executable binary files, and then the second part, which offers a practical in-depth exploration of Windows PE files. Many of the base concepts used here were explained in that part.
Prepare ELF parser
Now, it’s time to begin crafting our ELF file parser. First, we implement the main function that takes the ELF file name via command line arguments.
int main(int argc, char **argv)
{
void *pBuffer;
int nRet = EXIT_FAILURE;
if ( argc <= 1 )
return puts("Usage:\n\t./ELFParser </path/to/elffile>");
pBuffer = ReadELFFile(argv[1]);
if ( ! pBuffer )
{
printf("Failed to read %s\n", argv[1]);
goto LEAVE;
}
/*
PARSE HERE
*/
nRet = EXIT_SUCCESS;
LEAVE:
if ( pBuffer ) free(pBuffer);
return nRet;
}
The main function reads the ELF file using the ReadELFFile function, which is implemented as follows:
void *ReadELFFile(char *cpFileName)
{
FILE *pFile;
void *pBuffer = NULL; // Stays NULL on any failure so the caller can detect it
long lSize;
// Get a file handle
if ( (pFile = fopen(cpFileName, "rb")) )
{
// Move to the end of the file to obtain its size
fseek(pFile, 0L, SEEK_END);
// Get the file size
lSize = ftell(pFile);
// Rewind to the beginning of the file
fseek(pFile, 0L, SEEK_SET);
// Reject empty files and ftell failures
if ( lSize <= 0 )
goto LEAVE;
// Allocate some memory for the buffer
if ( !(pBuffer = malloc((size_t) lSize)) ) // OOM CASE
goto LEAVE;
// Read the file content; discard the buffer on a short read
if ( fread(pBuffer, (size_t) lSize, 1L, pFile) != 1 )
{
free(pBuffer);
pBuffer = NULL;
}
}
LEAVE:
if ( pFile ) fclose(pFile);
return pBuffer;
}
Headers
ELF header (Ehdr)
The ELF file’s header is described by either the Elf32_Ehdr or Elf64_Ehdr data type, depending on the architecture:
#define EI_NIDENT 16
typedef struct {
unsigned char e_ident[EI_NIDENT];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
ElfN_Addr e_entry;
ElfN_Off e_phoff;
ElfN_Off e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
} ElfN_Ehdr;
- e_ident: This array of bytes specifies how to interpret the file, independent of the processor or the file’s remaining contents. The first four bytes are the magic number: the byte 0x7F followed by the ASCII characters ELF.
- e_type: This member identifies the object file type:
  - ET_NONE: An unknown type.
  - ET_REL: A relocatable file.
  - ET_EXEC: An executable file.
  - ET_DYN: A shared object.
  - ET_CORE: A core file.
- e_machine: This member specifies the required architecture for an individual file.
- e_entry: This member gives the virtual address to which the system first transfers control, thus starting the process. If the file has no associated entry point, this member holds zero.
- e_phoff: This member holds the program header table’s file offset in bytes. If the file has no program header table, this member holds zero.
- e_shoff: This member holds the section header table’s file offset in bytes. If the file has no section header table, this member holds zero.
- e_ehsize: This member holds the ELF header’s size in bytes.
- e_phentsize: This member holds the size in bytes of one entry in the file’s program header table; all entries are the same size.
- e_phnum: This member holds the number of entries in the program header table. Thus the product of e_phentsize and e_phnum gives the table’s size in bytes. If a file has no program header, e_phnum holds the value zero.
- e_shentsize: This member holds a section header’s size in bytes. A section header is one entry in the section header table; all entries are the same size.
- e_shnum: This member holds the number of entries in the section header table. Thus the product of e_shentsize and e_shnum gives the section header table’s size in bytes. If a file has no section header table, e_shnum holds the value zero.
- e_shstrndx: This member holds the section header table index of the entry associated with the section name string table. If the file has no section name string table, this member holds the value SHN_UNDEF.
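The size arithmetic described above also gives a cheap sanity check before any table is walked. The following is a minimal sketch and not part of the parser built in this article: IsSaneElf64Header is a hypothetical helper, and it assumes the caller knows the file size, which ReadELFFile as written does not return.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the parser in this article.
 * Returns true when the buffer starts with a 64-bit ELF header whose
 * program and section header tables lie entirely inside the file. */
bool IsSaneElf64Header(const void *pBuffer, size_t nFileSize)
{
    Elf64_Ehdr hdr;
    uint64_t phSize, shSize;

    assert(pBuffer != NULL);

    if (nFileSize < sizeof hdr)
        return false;
    /* Copy the header out so the check does not depend on buffer alignment */
    memcpy(&hdr, pBuffer, sizeof hdr);

    /* e_ident starts with the magic bytes 0x7F 'E' 'L' 'F' */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return false;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return false;

    /* Table size in bytes = entry size * number of entries */
    phSize = (uint64_t) hdr.e_phentsize * hdr.e_phnum;
    shSize = (uint64_t) hdr.e_shentsize * hdr.e_shnum;

    /* Each table must start and end inside the file */
    if (hdr.e_phoff > nFileSize || phSize > nFileSize - hdr.e_phoff)
        return false;
    if (hdr.e_shoff > nFileSize || shSize > nFileSize - hdr.e_shoff)
        return false;

    return true;
}
```

A check like this would run once, right after the file is read, so that later pointer arithmetic on e_phoff and e_shoff cannot leave the buffer on a truncated or malformed file.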
Program header (Phdr)
The program header table is an array of structures, each describing a segment or other information the system needs to prepare the program for execution. An object file segment contains one or more sections. Program headers are meaningful only for executable and shared object files. A file specifies its own program header size with the ELF header’s e_phentsize and e_phnum members. The ELF program header is described by the type Elf32_Phdr or Elf64_Phdr, depending on the architecture:
typedef struct {
uint32_t p_type;
Elf32_Off p_offset;
Elf32_Addr p_vaddr;
Elf32_Addr p_paddr;
uint32_t p_filesz;
uint32_t p_memsz;
uint32_t p_flags;
uint32_t p_align;
} Elf32_Phdr;
typedef struct {
uint32_t p_type;
uint32_t p_flags;
Elf64_Off p_offset;
Elf64_Addr p_vaddr;
Elf64_Addr p_paddr;
uint64_t p_filesz;
uint64_t p_memsz;
uint64_t p_align;
} Elf64_Phdr;
- p_type: This member indicates what kind of segment this array element describes or how to interpret the array element’s information. It can be one of the following values:
  - PT_NULL: The array element is unused and the other members’ values are undefined.
  - PT_LOAD: The array element specifies a loadable segment, described by p_filesz and p_memsz. The bytes from the file are mapped to the beginning of the memory segment. If the segment’s memory size p_memsz is larger than the file size p_filesz, the “extra” bytes are defined to hold the value 0 and to follow the segment’s initialized area. The file size may not be larger than the memory size. Loadable segment entries in the program header table appear in ascending order, sorted on the p_vaddr member.
  - PT_INTERP: The array element specifies the location and size of a null-terminated pathname to invoke as an interpreter.
  - PT_NOTE: The array element specifies the location of notes (ElfN_Nhdr).
  - PT_SHLIB: This segment type is reserved but has unspecified semantics.
  - PT_PHDR: The array element, if present, specifies the location and size of the program header table itself, both in the file and in the memory image of the program.
  - PT_LOPROC, PT_HIPROC: Reserved for processor-specific semantics.
  - PT_GNU_STACK: GNU extension which is used by the Linux kernel to control the state of the stack via the flags set in the p_flags member.
- p_offset: This member holds the offset from the beginning of the file at which the first byte of the segment resides.
- p_vaddr: This member holds the virtual address at which the first byte of the segment resides in memory.
- p_paddr: On systems for which physical addressing is relevant, this member is reserved for the segment’s physical address. Under BSD this member is not used and must be zero.
- p_filesz: This member holds the number of bytes in the file image of the segment. It may be zero.
- p_memsz: This member holds the number of bytes in the memory image of the segment. It may be zero.
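The relationship between p_offset, p_vaddr, p_filesz and p_memsz is easiest to see in code. The sketch below is illustrative only: SegmentZeroFill and VaddrToFileOffset are hypothetical helpers, not part of the parser in this article, and they assume the program headers have already been located and are properly aligned.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Bytes of a segment that exist only in memory and are zero-filled */
uint64_t SegmentZeroFill(const Elf64_Phdr *pPhdr)
{
    assert(pPhdr != NULL);
    return pPhdr->p_memsz > pPhdr->p_filesz
         ? pPhdr->p_memsz - pPhdr->p_filesz
         : 0;
}

/* Maps a virtual address to a file offset using the PT_LOAD entries.
 * Returns false when no loadable segment backs the address with file bytes. */
bool VaddrToFileOffset(const Elf64_Phdr *pPhdrs, size_t nPhdrs,
                       Elf64_Addr vaddr, Elf64_Off *pOffset)
{
    assert(pPhdrs != NULL && pOffset != NULL);

    for (size_t i = 0; i < nPhdrs; i++) {
        const Elf64_Phdr *p = &pPhdrs[i];

        if (p->p_type != PT_LOAD)
            continue;
        /* Only the first p_filesz bytes of the segment come from the file */
        if (vaddr >= p->p_vaddr && vaddr - p->p_vaddr < p->p_filesz) {
            *pOffset = p->p_offset + (vaddr - p->p_vaddr);
            return true;
        }
    }
    return false;
}
```

With the writable LOAD segment from the test run at the end of this article (offset 0x2DE8, virtual address 0x3DE8, file size 584, memory size 592), the address 0x4020 maps to file offset 0x3020, and the segment has 8 zero-filled bytes that exist only in memory.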
Section header (Shdr)
The section header table is an array of Elf32_Shdr or Elf64_Shdr structures. The ElfN_Ehdr.e_shoff member gives the byte offset from the beginning of the file to the section header table. ElfN_Ehdr.e_shnum holds the number of entries the section header table contains. ElfN_Ehdr.e_shentsize holds the size in bytes of each entry.
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint32_t sh_flags;
Elf32_Addr sh_addr;
Elf32_Off sh_offset;
uint32_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint32_t sh_addralign;
uint32_t sh_entsize;
} Elf32_Shdr;
typedef struct {
uint32_t sh_name;
uint32_t sh_type;
uint64_t sh_flags;
Elf64_Addr sh_addr;
Elf64_Off sh_offset;
uint64_t sh_size;
uint32_t sh_link;
uint32_t sh_info;
uint64_t sh_addralign;
uint64_t sh_entsize;
} Elf64_Shdr;
- sh_name: An index into the section header string table section, giving the location of a null-terminated section’s name.
- sh_type: This member categorizes the section’s contents and semantics.
- sh_flags: One-bit flags that describe miscellaneous attributes, such as whether the section is writable or whether it occupies memory during process execution.
- sh_addr: This member holds the address at which the section’s first byte should reside, or zero if the section does not appear in the memory image of a process.
- sh_offset: This member’s value holds the byte offset from the beginning of the file to the first byte in the section.
- sh_size: The section’s size in bytes.
- sh_link: This member holds a section header table index link, whose interpretation depends on the section type.
- sh_entsize: The size in bytes of fixed-sized entries, or zero if the section does not hold a table of fixed-size entries.
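Two of these fields are commonly combined: sh_size divided by sh_entsize gives the number of entries in a table-like section, and sh_flags is a bit mask that can be decoded flag by flag. The following sketch shows both. The helper names SectionEntryCount and SectionFlagsToString are hypothetical, and the single-letter flag output is a convention chosen for this example, not something the article's parser prints.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Number of fixed-size entries in a section, or 0 if it holds no table */
uint64_t SectionEntryCount(const Elf64_Shdr *pShdr)
{
    assert(pShdr != NULL);
    if (pShdr->sh_entsize == 0)
        return 0;
    return pShdr->sh_size / pShdr->sh_entsize;
}

/* Writes a compact flag string ("W", "A", "X", in that order) into pOut,
 * which must have room for at least 4 bytes including the terminator. */
void SectionFlagsToString(uint64_t flags, char *pOut)
{
    size_t n = 0;

    assert(pOut != NULL);
    if (flags & SHF_WRITE)     pOut[n++] = 'W';
    if (flags & SHF_ALLOC)     pOut[n++] = 'A';
    if (flags & SHF_EXECINSTR) pOut[n++] = 'X';
    pOut[n] = '\0';
}
```

For example, a .dynsym section of 168 bytes whose entries are sizeof(Elf64_Sym) bytes each holds 7 symbols.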
Parse headers
Let’s develop a function that parses this valuable data.
void ParseHeaders(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfSectionPtr;
Elf64_Half nEntries;
Elf64_Word wPadding;
PRINT_LINE("ELF HEADER", 100);
// Pointing now to the beginning of the file
elfPtr = (Elf64_Addr) pBaseAddress;
printf("Image magic number => 0x%X (%.4s)\n", *(uint32_t *) elfPtr, (char *) elfPtr);
printf("Arch => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_ident[EI_CLASS] )
{
case ELFCLASS64:
puts("x64");
break;
case ELFCLASS32:
puts("x32");
break;
default:
puts("Invalid");
}
printf("ELF type => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_type )
{
case ET_REL:
puts("Relocatable");
break;
case ET_EXEC:
puts("Executable");
break;
case ET_DYN:
puts("DYN (Position-Independent Executable file)");
break;
case ET_NONE:
default:
puts("Unknown");
}
printf("Machine => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_machine )
{
case EM_386:
puts("Intel x86");
break;
case EM_IA_64:
puts("Intel Itanium");
break;
case EM_X86_64:
puts("AMD x86-64");
break;
case EM_NONE:
default:
puts("Unknown");
}
// 64-bit fields are cast to unsigned long long to match the %llX conversion
printf("Entry point => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_entry);
printf("ELF header size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_ehsize);
printf("Program header offset => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_phoff);
printf("Program header Entry size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_phentsize);
printf("Number of entries => %d\n", ((Elf64_Ehdr *) elfPtr)->e_phnum);
printf("Section header offset => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_shoff);
printf("Section header size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_shentsize);
printf("Number of entries => %d\n", ((Elf64_Ehdr *) elfPtr)->e_shnum);
printf("Section name table offset => 0x%X\n", ((Elf64_Ehdr *) elfPtr)->e_shstrndx);
//**************************************************************************
PRINT_LINE("Program Header", 100);
// Number of entries
nEntries = ((Elf64_Ehdr *) elfPtr)->e_phnum;
// Pointing now to Program Header
elfPtr += ((Elf64_Ehdr *) elfPtr)->e_phoff;
puts("Type\t\tOffset\t\tVirtAddr\tPhysAddr\tFileSize\tMemSize\tFlag");
while ( nEntries-- )
{
if ( ((Elf64_Phdr *) elfPtr)->p_type != PT_NULL )
{
switch ( ((Elf64_Phdr *) elfPtr)->p_type )
{
case PT_LOAD:
printf("LOAD\t");
break;
case PT_DYNAMIC:
printf("DYNAMIC\t");
break;
case PT_INTERP:
printf("INTERP\t");
break;
case PT_NOTE:
printf("NOTE\t");
break;
case PT_SHLIB:
printf("SHLIB\t");
break;
case PT_PHDR:
printf("PHDR\t");
break;
case PT_GNU_STACK:
printf("GNU_STACK");
break;
case PT_LOPROC:
printf("LOPROC\t");
break;
case PT_HIPROC:
printf("HIPROC\t");
break;
default:
printf("Unknown\t");
}
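printf("\t0x%08llX\t0x%08llX\t0x%08llX\t%llu\t\t%llu\t",
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_offset,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_vaddr,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_paddr,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_filesz,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_memsz
);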
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_X )
printf("EXECUTABLE ");
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_W )
printf("WRITABLE ");
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_R )
printf("READABLE ");
// New Line
puts("");
}
// Next entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_phentsize;
}
//**************************************************************************
PRINT_LINE("Section Header", 100);
// Move to section header
elfPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
// Section name string header
elfSectionPtr = elfPtr + ( ((Elf64_Ehdr *) pBaseAddress)->e_shstrndx * sizeof(Elf64_Shdr) );
// Jump on the names table
elfSectionPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfSectionPtr)->sh_offset;
// Number of entries
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Offset\tAddr\t\tSize\t\tType\t\tName\t\t\tFlags");
while ( nEntries-- )
{
if ( ((Elf64_Shdr *) elfPtr)->sh_type != SHT_NOBITS && ((Elf64_Shdr *) elfPtr)->sh_size )
{
printf("0x%llX\t0x%08llX\t%llu\t\t0x%08X\t%s",
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_offset,
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_addr,
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_size,
((Elf64_Shdr *) elfPtr)->sh_type,
(char *) elfSectionPtr + ((Elf64_Shdr *) elfPtr)->sh_name
);
// Alignment: pad the name to 24 columns; longer names get a single space
wPadding = (Elf64_Word) strlen((char *) elfSectionPtr + ((Elf64_Shdr *) elfPtr)->sh_name);
wPadding = wPadding < 24 ? 24 - wPadding : 1;
while ( wPadding-- ) printf(" ");
// Handle flags
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_WRITE )
printf("WRITABLE ");
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_ALLOC )
printf("ALLOCATABLE ");
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_EXECINSTR )
printf("EXECUTABLE ");
// New Line
puts("");
}
// Next section
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
String and symbol tables (Imports & Exports)
In contrast to the Windows PE format, ELF files consolidate imports and exports within a single structure: the symbol table. A symbol section holds an array of symbol table entries, and each entry stores the index of its name in the associated string table.
String table sections
An array of null-terminated strings that contains symbol and section names.
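Fields such as sh_name and st_name are byte offsets into a table like this one, so a lookup is just pointer arithmetic plus, ideally, a bounds check. The following sketch assumes the table's start address and size have already been taken from the string section's header. StrTabLookup is a hypothetical helper and is not used by the parser in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only.
 * Returns the string that starts at byte offset `index` inside a string
 * table of nSize bytes, or NULL when the index is out of range or the
 * string is not NUL-terminated within the table. */
const char *StrTabLookup(const char *pTable, size_t nSize, uint32_t index)
{
    assert(pTable != NULL);

    if (index >= nSize)
        return NULL;
    /* Make sure a terminator exists before the end of the table */
    if (memchr(pTable + index, '\0', nSize - index) == NULL)
        return NULL;
    return pTable + index;
}
```

Because the index is a byte offset and not an entry number, it may point into the middle of another string: in a table holding "main", index 3 within that string yields "in". Index 0 conventionally refers to the empty string.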
Symbol table
A symbol table holds information needed to locate and relocate a program’s symbolic definitions and references. A symbol table index is a subscript into this array.
typedef struct {
uint32_t st_name;
Elf32_Addr st_value;
uint32_t st_size;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
} Elf32_Sym;
typedef struct {
uint32_t st_name;
unsigned char st_info;
unsigned char st_other;
uint16_t st_shndx;
Elf64_Addr st_value;
uint64_t st_size;
} Elf64_Sym;
- st_name: An index into the object file’s symbol string table. If the value is zero, the symbol has no name.
- st_info: This member specifies the symbol’s type and binding attributes.
- st_other: This member defines the symbol visibility.
- st_shndx: The index of the section header table entry that the symbol is defined in relation to.
- st_value: This member gives the value of the associated symbol.
- st_size: The size of the symbol in bytes. A value of zero means the symbol has no size or its size is unknown.
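The st_info byte packs two values: the binding in the high four bits and the type in the low four bits. The sketch below decodes it with the ELF64_ST_BIND and ELF64_ST_TYPE macros from elf.h and maps the common values to names. The function names are hypothetical, only a subset of types and bindings is covered, and treating a named symbol with st_shndx equal to SHN_UNDEF as an import is a simplification made for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* Name of the symbol type stored in the low 4 bits of st_info */
const char *SymbolTypeName(unsigned char st_info)
{
    switch (ELF64_ST_TYPE(st_info)) {
    case STT_NOTYPE:  return "NOTYPE";
    case STT_OBJECT:  return "OBJECT";
    case STT_FUNC:    return "FUNC";
    case STT_SECTION: return "SECTION";
    case STT_FILE:    return "FILE";
    default:          return "OTHER";
    }
}

/* Name of the symbol binding stored in the high 4 bits of st_info */
const char *SymbolBindName(unsigned char st_info)
{
    switch (ELF64_ST_BIND(st_info)) {
    case STB_LOCAL:  return "LOCAL";
    case STB_GLOBAL: return "GLOBAL";
    case STB_WEAK:   return "WEAK";
    default:         return "OTHER";
    }
}

/* A named symbol with no defining section must be resolved elsewhere.
 * The st_name check skips the reserved all-zero entry at index 0. */
bool IsUndefinedSymbol(const Elf64_Sym *pSym)
{
    assert(pSym != NULL);
    return pSym->st_shndx == SHN_UNDEF && pSym->st_name != 0;
}
```

In the test output at the end of this article, main is printed with type 0x02 and bind 0x01, which these helpers would render as FUNC and GLOBAL.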
Parse Symbols
Let’s develop a function that parses those sections and obtains their valuable data.
void ParseSymbols(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfSectionHdrPtr;
Elf64_Addr elfSymPtr;
Elf64_Addr elfStrSectionPtr;
Elf64_Half nEntries;
Elf64_Word wSymbols;
PRINT_LINE("Symbols", 100);
// Pointing now to the section header
elfSectionHdrPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
elfPtr = elfSectionHdrPtr;
// Number of section entries
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Value\t\tType\tBind\tSize\tName");
// Iterate over all section entries
while ( nEntries-- )
{
// Find symbols
if (
((Elf64_Shdr *) elfPtr)->sh_type == SHT_SYMTAB ||
((Elf64_Shdr *) elfPtr)->sh_type == SHT_DYNSYM
)
{
// Jump on the symbols section
elfSymPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfPtr)->sh_offset;
// Find string section header table
elfStrSectionPtr = elfSectionHdrPtr + ( sizeof(Elf64_Shdr) * ((Elf64_Shdr *) elfPtr)->sh_link );
// Jump on the string table ( .strtab )
elfStrSectionPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfStrSectionPtr)->sh_offset;
// Calculate the number of symbols
wSymbols = ((Elf64_Shdr *) elfPtr)->sh_size / sizeof(Elf64_Sym);
// Iterate over all symbol entries
while ( wSymbols-- )
{
if ( ((Elf64_Sym *) elfSymPtr)->st_size )
{
printf("0x%08llX\t0x%02X\t0x%02X\t%llu\t%s\n",
(unsigned long long) ((Elf64_Sym *) elfSymPtr)->st_value,
ELF64_ST_TYPE( ((Elf64_Sym *) elfSymPtr)->st_info ),
ELF64_ST_BIND( ((Elf64_Sym *) elfSymPtr)->st_info ),
(unsigned long long) ((Elf64_Sym *) elfSymPtr)->st_size,
(char *) elfStrSectionPtr + ((Elf64_Sym *) elfSymPtr)->st_name
);
}
// Next symbol entry
elfSymPtr += sizeof(Elf64_Sym);
}
}
// Next section entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
Relocations
Relocation is the process of connecting symbolic references with symbolic definitions. Relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process’s program image.
Relocation table
One of the following structures is found in each relocation section, depending on the architecture and on whether the entries carry an addend:
// do not need an addend
typedef struct {
Elf32_Addr r_offset;
uint32_t r_info;
} Elf32_Rel;
typedef struct {
Elf64_Addr r_offset;
uint64_t r_info;
} Elf64_Rel;
// need an addend
typedef struct {
Elf32_Addr r_offset;
uint32_t r_info;
int32_t r_addend;
} Elf32_Rela;
typedef struct {
Elf64_Addr r_offset;
uint64_t r_info;
int64_t r_addend;
} Elf64_Rela;
- r_offset: The location at which to apply the relocation action. For a relocatable file, the value is the byte offset from the beginning of the section to the storage unit affected by the relocation. For an executable file or shared object, the value is the virtual address of the storage unit affected by the relocation.
- r_info: This member gives both the symbol table index with respect to which the relocation must be made and the type of relocation to apply.
- r_addend: A constant addend used to compute the value to be stored into the relocatable field.
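For 64-bit files, r_info keeps the symbol table index in its upper 32 bits and the relocation type in its lower 32 bits. The sketch below splits it with the ELF64_R_SYM and ELF64_R_TYPE macros from elf.h and names a few x86-64 relocation types. DecodeRelaInfo and X86_64RelocTypeName are hypothetical helpers, and the name table is deliberately incomplete.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Splits r_info into the symbol table index and the relocation type */
void DecodeRelaInfo(const Elf64_Rela *pRela, uint32_t *pSymIndex, uint32_t *pType)
{
    assert(pRela != NULL && pSymIndex != NULL && pType != NULL);
    *pSymIndex = (uint32_t) ELF64_R_SYM(pRela->r_info);
    *pType     = (uint32_t) ELF64_R_TYPE(pRela->r_info);
}

/* Names for a small subset of x86-64 relocation types */
const char *X86_64RelocTypeName(uint32_t type)
{
    switch (type) {
    case R_X86_64_64:        return "R_X86_64_64";
    case R_X86_64_GLOB_DAT:  return "R_X86_64_GLOB_DAT";
    case R_X86_64_JUMP_SLOT: return "R_X86_64_JUMP_SLOT";
    case R_X86_64_RELATIVE:  return "R_X86_64_RELATIVE";
    default:                 return "OTHER";
    }
}
```

The type values 6, 7 and 8 that appear in the test output at the end of this article correspond to R_X86_64_GLOB_DAT, R_X86_64_JUMP_SLOT and R_X86_64_RELATIVE.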
Parse relocations
Let’s develop a function that parses the SHT_RELA sections and obtains their data.
void ParseRelocations(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfRelocPtr;
Elf64_Half nEntries;
Elf64_Word wRelocEntries;
PRINT_LINE("Relocations", 100);
// Pointing now to the section header
elfPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
// Number of sections
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Offset\t\tAddend\t\tType\t\tSymtab index");
// Iterate over all sections
while ( nEntries-- )
{
// Find relocation section header
if ( ((Elf64_Shdr *) elfPtr)->sh_type == SHT_RELA )
{
// Find relocations entries of current section
elfRelocPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfPtr)->sh_offset;
// Number of relocation entries
wRelocEntries = ((Elf64_Shdr *) elfPtr)->sh_size / sizeof(Elf64_Rela);
// Iterate over all relocation entries
while ( wRelocEntries-- )
{
printf("0x%08llX\t0x%08llX\t0x%08llX\t%llu\n",
(unsigned long long) ((Elf64_Rela *) elfRelocPtr)->r_offset,
(unsigned long long) ((Elf64_Rela *) elfRelocPtr)->r_addend,
(unsigned long long) ELF64_R_TYPE( ((Elf64_Rela *) elfRelocPtr)->r_info ),
(unsigned long long) ELF64_R_SYM( ((Elf64_Rela *) elfRelocPtr)->r_info )
);
// Next reloc entry
elfRelocPtr += sizeof(Elf64_Rela);
}
}
// Next section entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
The full code
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <elf.h>
#define PRINT_LINE(str, n) \
printf("\n------%s", str); \
for (uint16_t i = 0; i < n - strlen(str); i++) \
printf("-"); \
puts("")
void ParseHeaders(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfSectionPtr;
Elf64_Half nEntries;
Elf64_Word wPadding;
PRINT_LINE("ELF HEADER", 100);
// Pointing now to the beginning of the file
elfPtr = (Elf64_Addr) pBaseAddress;
printf("Image magic number => 0x%X (%.4s)\n", *(uint32_t *) elfPtr, (char *) elfPtr);
printf("Arch => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_ident[EI_CLASS] )
{
case ELFCLASS64:
puts("x64");
break;
case ELFCLASS32:
puts("x32");
break;
default:
puts("Invalid");
}
printf("ELF type => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_type )
{
case ET_REL:
puts("Relocatable");
break;
case ET_EXEC:
puts("Executable");
break;
case ET_DYN:
puts("DYN (Position-Independent Executable file)");
break;
case ET_NONE:
default:
puts("Unknown");
}
printf("Machine => ");
switch ( ((Elf64_Ehdr *) elfPtr)->e_machine )
{
case EM_386:
puts("Intel x86");
break;
case EM_IA_64:
puts("Intel Itanium");
break;
case EM_X86_64:
puts("AMD x86-64");
break;
case EM_NONE:
default:
puts("Unknown");
}
// 64-bit fields are cast to unsigned long long to match the %llX conversion
printf("Entry point => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_entry);
printf("ELF header size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_ehsize);
printf("Program header offset => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_phoff);
printf("Program header Entry size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_phentsize);
printf("Number of entries => %d\n", ((Elf64_Ehdr *) elfPtr)->e_phnum);
printf("Section header offset => 0x%llX\n", (unsigned long long) ((Elf64_Ehdr *) elfPtr)->e_shoff);
printf("Section header size => %d\n", ((Elf64_Ehdr *) elfPtr)->e_shentsize);
printf("Number of entries => %d\n", ((Elf64_Ehdr *) elfPtr)->e_shnum);
printf("Section name table offset => 0x%X\n", ((Elf64_Ehdr *) elfPtr)->e_shstrndx);
//**************************************************************************
PRINT_LINE("Program Header", 100);
// Number of entries
nEntries = ((Elf64_Ehdr *) elfPtr)->e_phnum;
// Pointing now to Program Header
elfPtr += ((Elf64_Ehdr *) elfPtr)->e_phoff;
puts("Type\t\tOffset\t\tVirtAddr\tPhysAddr\tFileSize\tMemSize\tFlag");
while ( nEntries-- )
{
if ( ((Elf64_Phdr *) elfPtr)->p_type != PT_NULL )
{
switch ( ((Elf64_Phdr *) elfPtr)->p_type )
{
case PT_LOAD:
printf("LOAD\t");
break;
case PT_DYNAMIC:
printf("DYNAMIC\t");
break;
case PT_INTERP:
printf("INTERP\t");
break;
case PT_NOTE:
printf("NOTE\t");
break;
case PT_SHLIB:
printf("SHLIB\t");
break;
case PT_PHDR:
printf("PHDR\t");
break;
case PT_GNU_STACK:
printf("GNU_STACK");
break;
case PT_LOPROC:
printf("LOPROC\t");
break;
case PT_HIPROC:
printf("HIPROC\t");
break;
default:
printf("Unknown\t");
}
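printf("\t0x%08llX\t0x%08llX\t0x%08llX\t%llu\t\t%llu\t",
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_offset,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_vaddr,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_paddr,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_filesz,
(unsigned long long) ((Elf64_Phdr *) elfPtr)->p_memsz
);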
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_X )
printf("EXECUTABLE ");
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_W )
printf("WRITABLE ");
if ( ((Elf64_Phdr *) elfPtr)->p_flags & PF_R )
printf("READABLE ");
// New Line
puts("");
}
// Next entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_phentsize;
}
//**************************************************************************
PRINT_LINE("Section Header", 100);
// Move to section header
elfPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
// Section name string header
elfSectionPtr = elfPtr + ( ((Elf64_Ehdr *) pBaseAddress)->e_shstrndx * sizeof(Elf64_Shdr) );
// Jump on the names table
elfSectionPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfSectionPtr)->sh_offset;
// Number of entries
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Offset\tAddr\t\tSize\t\tType\t\tName\t\t\tFlags");
while ( nEntries-- )
{
if ( ((Elf64_Shdr *) elfPtr)->sh_type != SHT_NOBITS && ((Elf64_Shdr *) elfPtr)->sh_size )
{
printf("0x%llX\t0x%08llX\t%llu\t\t0x%08X\t%s",
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_offset,
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_addr,
(unsigned long long) ((Elf64_Shdr *) elfPtr)->sh_size,
((Elf64_Shdr *) elfPtr)->sh_type,
(char *) elfSectionPtr + ((Elf64_Shdr *) elfPtr)->sh_name
);
// Alignment: pad the name to 24 columns; longer names get a single space
wPadding = (Elf64_Word) strlen((char *) elfSectionPtr + ((Elf64_Shdr *) elfPtr)->sh_name);
wPadding = wPadding < 24 ? 24 - wPadding : 1;
while ( wPadding-- ) printf(" ");
// Handle flags
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_WRITE )
printf("WRITABLE ");
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_ALLOC )
printf("ALLOCATABLE ");
if ( ((Elf64_Shdr *) elfPtr)->sh_flags & SHF_EXECINSTR )
printf("EXECUTABLE ");
// New Line
puts("");
}
// Next section
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
void ParseSymbols(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfSectionHdrPtr;
Elf64_Addr elfSymPtr;
Elf64_Addr elfStrSectionPtr;
Elf64_Half nEntries;
Elf64_Word wSymbols;
PRINT_LINE("Symbols", 100);
// Pointing now to the section header
elfSectionHdrPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
elfPtr = elfSectionHdrPtr;
// Number of section entries
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Value\t\tType\tBind\tSize\tName");
// Iterate over all section entries
while ( nEntries-- )
{
// Find symbols
if (
((Elf64_Shdr *) elfPtr)->sh_type == SHT_SYMTAB ||
((Elf64_Shdr *) elfPtr)->sh_type == SHT_DYNSYM
)
{
// Jump on the symbols section
elfSymPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfPtr)->sh_offset;
// Find string section header table
elfStrSectionPtr = elfSectionHdrPtr + ( sizeof(Elf64_Shdr) * ((Elf64_Shdr *) elfPtr)->sh_link );
// Jump on the string table ( .strtab )
elfStrSectionPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfStrSectionPtr)->sh_offset;
// Calculate the number of symbols
wSymbols = ((Elf64_Shdr *) elfPtr)->sh_size / sizeof(Elf64_Sym);
// Iterate over all symbol entries
while ( wSymbols-- )
{
if ( ((Elf64_Sym *) elfSymPtr)->st_size )
{
printf("0x%08llX\t0x%02X\t0x%02X\t%llu\t%s\n",
(unsigned long long) ((Elf64_Sym *) elfSymPtr)->st_value,
ELF64_ST_TYPE( ((Elf64_Sym *) elfSymPtr)->st_info ),
ELF64_ST_BIND( ((Elf64_Sym *) elfSymPtr)->st_info ),
(unsigned long long) ((Elf64_Sym *) elfSymPtr)->st_size,
(char *) elfStrSectionPtr + ((Elf64_Sym *) elfSymPtr)->st_name
);
}
// Next symbol entry
elfSymPtr += sizeof(Elf64_Sym);
}
}
// Next section entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
void ParseRelocations(void *pBaseAddress)
{
Elf64_Addr elfPtr;
Elf64_Addr elfRelocPtr;
Elf64_Half nEntries;
Elf64_Word wRelocEntries;
PRINT_LINE("Relocations", 100);
// Pointing now to the section header
elfPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Ehdr *) pBaseAddress)->e_shoff;
// Number of sections
nEntries = ((Elf64_Ehdr *) pBaseAddress)->e_shnum;
puts("Offset\t\tAddend\t\tType\t\tSymtab index");
// Iterate over all sections
while ( nEntries-- )
{
// Find relocation section header
if ( ((Elf64_Shdr *) elfPtr)->sh_type == SHT_RELA )
{
// Find relocations entries of current section
elfRelocPtr = (Elf64_Addr) pBaseAddress + ((Elf64_Shdr *) elfPtr)->sh_offset;
// Number of relocation entries
wRelocEntries = ((Elf64_Shdr *) elfPtr)->sh_size / sizeof(Elf64_Rela);
// Iterate over all relocation entries
while ( wRelocEntries-- )
{
printf("0x%08llX\t0x%08llX\t0x%08llX\t%llu\n",
(unsigned long long) ((Elf64_Rela *) elfRelocPtr)->r_offset,
(unsigned long long) ((Elf64_Rela *) elfRelocPtr)->r_addend,
(unsigned long long) ELF64_R_TYPE( ((Elf64_Rela *) elfRelocPtr)->r_info ),
(unsigned long long) ELF64_R_SYM( ((Elf64_Rela *) elfRelocPtr)->r_info )
);
// Next reloc entry
elfRelocPtr += sizeof(Elf64_Rela);
}
}
// Next section entry
elfPtr += ((Elf64_Ehdr *) pBaseAddress)->e_shentsize;
}
}
bool ParseELF(void *pBaseAddress)
{
// Check if a valid ELF file
if (!(
((Elf64_Ehdr *) pBaseAddress)->e_ident[EI_MAG0] == ELFMAG0 &&
((Elf64_Ehdr *) pBaseAddress)->e_ident[EI_MAG1] == ELFMAG1 &&
((Elf64_Ehdr *) pBaseAddress)->e_ident[EI_MAG2] == ELFMAG2 &&
((Elf64_Ehdr *) pBaseAddress)->e_ident[EI_MAG3] == ELFMAG3
))
return false;
// Check if the given ELF is x32 bit
if ( ((Elf64_Ehdr *) pBaseAddress)->e_ident[EI_CLASS] == ELFCLASS32 )
return false;
ParseHeaders(pBaseAddress);
ParseSymbols(pBaseAddress);
ParseRelocations(pBaseAddress);
return true;
}
void *ReadELFFile(char *cpFileName)
{
FILE *pFile;
void *pBuffer = NULL; // Stays NULL on any failure so the caller can detect it
long lSize;
// Get a file handle
if ( (pFile = fopen(cpFileName, "rb")) )
{
// Move to the end of the file to obtain its size
fseek(pFile, 0L, SEEK_END);
// Get the file size
lSize = ftell(pFile);
// Rewind to the beginning of the file
fseek(pFile, 0L, SEEK_SET);
// Reject empty files and ftell failures
if ( lSize <= 0 )
goto LEAVE;
// Allocate some memory for the buffer
if ( !(pBuffer = malloc((size_t) lSize)) ) // OOM CASE
goto LEAVE;
// Read the file content; discard the buffer on a short read
if ( fread(pBuffer, (size_t) lSize, 1L, pFile) != 1 )
{
free(pBuffer);
pBuffer = NULL;
}
}
LEAVE:
if ( pFile ) fclose(pFile);
return pBuffer;
}
int main(int argc, char **argv)
{
void *pBuffer;
int nRet = EXIT_FAILURE;
if ( argc <= 1 )
return puts("Usage:\n\t./ELFParser </path/to/elffile>");
pBuffer = ReadELFFile(argv[1]);
if ( ! pBuffer )
{
printf("Failed to read %s\n", argv[1]);
goto LEAVE;
}
if ( ! ParseELF(pBuffer) )
{
puts("Invalid ELF file, or maybe 32-bit application");
goto LEAVE;
}
nRet = EXIT_SUCCESS;
LEAVE:
if ( pBuffer ) free(pBuffer);
return nRet;
}
Test
Let’s test the parser on the following application:
#include <stdio.h>
void main(void)
{
printf("Hello World!\n");
}
Compiled via gcc test.c -o test. Let’s run the parser.
┌──(user㉿host)-[~/path/to]
└─$ ./ELFParser ~/test
------ELF HEADER------------------------------------------------------------------------------------------
Image magic number => 0x464C457F (ELF)
Arch => x64
ELF type => DYN (Position-Independent Executable file)
Machine => AMD x86-64
Entry point => 0x1050
ELF header size => 64
Program header offset => 0x40
Program header Entry size => 56
Number of entries => 13
Section header offset => 0x3738
Section header size => 64
Number of entries => 31
Section name table offset => 0x1E
------Program Header--------------------------------------------------------------------------------------
Type Offset VirtAddr PhysAddr FileSize MemSize Flag
PHDR 0x00000040 0x00000040 0x00000040 728 728 READABLE
INTERP 0x00000318 0x00000318 0x00000318 28 28 READABLE
LOAD 0x00000000 0x00000000 0x00000000 1528 1528 READABLE
LOAD 0x00001000 0x00001000 0x00001000 445 445 EXECUTABLE READABLE
LOAD 0x00002000 0x00002000 0x00002000 344 344 READABLE
LOAD 0x00002DE8 0x00003DE8 0x00003DE8 584 592 WRITABLE READABLE
DYNAMIC 0x00002DF8 0x00003DF8 0x00003DF8 480 480 WRITABLE READABLE
NOTE 0x00000338 0x00000338 0x00000338 32 32 READABLE
NOTE 0x00000358 0x00000358 0x00000358 68 68 READABLE
Unknown 0x00000338 0x00000338 0x00000338 32 32 READABLE
Unknown 0x00002014 0x00002014 0x00002014 60 60 READABLE
GNU_STACK 0x00000000 0x00000000 0x00000000 0 0 WRITABLE READABLE
Unknown 0x00002DE8 0x00003DE8 0x00003DE8 536 536 READABLE
------Section Header--------------------------------------------------------------------------------------
Offset Addr Size Type Name Flags
0x318 0x00000318 28 0x00000001 .interp ALLOCATABLE
0x338 0x00000338 32 0x00000007 .note.gnu.property ALLOCATABLE
0x358 0x00000358 36 0x00000007 .note.gnu.build-id ALLOCATABLE
0x37C 0x0000037C 32 0x00000007 .note.ABI-tag ALLOCATABLE
0x3A0 0x000003A0 36 0x6FFFFFF6 .gnu.hash ALLOCATABLE
0x3C8 0x000003C8 168 0x0000000B .dynsym ALLOCATABLE
0x470 0x00000470 130 0x00000003 .dynstr ALLOCATABLE
0x4F2 0x000004F2 14 0x6FFFFFFF .gnu.version ALLOCATABLE
0x500 0x00000500 32 0x6FFFFFFE .gnu.version_r ALLOCATABLE
0x520 0x00000520 192 0x00000004 .rela.dyn ALLOCATABLE
0x5E0 0x000005E0 24 0x00000004 .rela.plt ALLOCATABLE
0x1000 0x00001000 23 0x00000001 .init ALLOCATABLE EXECUTABLE
0x1020 0x00001020 32 0x00000001 .plt ALLOCATABLE EXECUTABLE
0x1040 0x00001040 8 0x00000001 .plt.got ALLOCATABLE EXECUTABLE
0x1050 0x00001050 353 0x00000001 .text ALLOCATABLE EXECUTABLE
0x11B4 0x000011B4 9 0x00000001 .fini ALLOCATABLE EXECUTABLE
0x2000 0x00002000 17 0x00000001 .rodata ALLOCATABLE
0x2014 0x00002014 60 0x00000001 .eh_frame_hdr ALLOCATABLE
0x2050 0x00002050 264 0x00000001 .eh_frame ALLOCATABLE
0x2DE8 0x00003DE8 8 0x0000000E .init_array WRITABLE ALLOCATABLE
0x2DF0 0x00003DF0 8 0x0000000F .fini_array WRITABLE ALLOCATABLE
0x2DF8 0x00003DF8 480 0x00000006 .dynamic WRITABLE ALLOCATABLE
0x2FD8 0x00003FD8 40 0x00000001 .got WRITABLE ALLOCATABLE
0x3000 0x00004000 32 0x00000001 .got.plt WRITABLE ALLOCATABLE
0x3020 0x00004020 16 0x00000001 .data WRITABLE ALLOCATABLE
0x3030 0x00000000 31 0x00000001 .comment
0x3050 0x00000000 960 0x00000002 .symtab
0x3410 0x00000000 526 0x00000003 .strtab
0x361E 0x00000000 282 0x00000003 .shstrtab
------Symbols---------------------------------------------------------------------------------------------
Value Type Bind Size Name
0x0000037C 0x01 0x00 32 __abi_tag
0x00004030 0x01 0x00 1 completed.0
0x000011B0 0x02 0x01 1 __libc_csu_fini
0x00002000 0x01 0x01 4 _IO_stdin_used
0x00001150 0x02 0x01 93 __libc_csu_init
0x00001050 0x02 0x01 43 _start
0x00001139 0x02 0x01 22 main
------Relocations-----------------------------------------------------------------------------------------
Offset Addend Type Symtab index
0x00003DE8 0x00001130 0x00000008 0
0x00003DF0 0x000010F0 0x00000008 0
0x00004028 0x00004028 0x00000008 0
0x00003FD8 0x00000000 0x00000006 1
0x00003FE0 0x00000000 0x00000006 3
0x00003FE8 0x00000000 0x00000006 4
0x00003FF0 0x00000000 0x00000006 5
0x00003FF8 0x00000000 0x00000006 6
0x00004018 0x00000000 0x00000007 2
Conclusion
This topic is not easy; understanding it well takes effort, focus, and practice, so I encourage you to try implementing your own parser. Thank you for reading.