The Cpu0 backend generates object files in ELF format. ELF (Executable and Linkable Format) is a common standard file format for executables, object code, shared libraries and core dumps. First published in the System V Application Binary Interface specification, and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999 it was chosen as the standard binary file format for Unix and Unix-like systems on x86 by the x86open project. Please refer to [1].
The binary encoding of the cpu0 instruction set in obj files was checked in the previous chapters. But we didn’t dig into the ELF file format itself, such as the ELF header and relocation records, at that time. This chapter uses the binutils, installed in “sub-section Install other tools on iMac” of Appendix A: “Installing LLVM” [2], to analyze cpu0 ELF files. You will learn the objdump, readelf, ..., tools and understand the ELF file format itself by using these tools to analyze the cpu0 generated obj in this chapter. LLVM has the llvm-objdump tool, which is similar to objdump. We will make cpu0 support llvm-objdump in this chapter. The binutils can also dump ELF files of other CPUs, as a cross-compiler tool chain does. Linux already ships binutils, so there is no need to install it. We use the Linux binutils in this chapter only because the iMac version displays Chinese text. The iMac binutils work equally well, except that the command names carry a g prefix, for example gobjdump instead of objdump, and the output is displayed in your locale language instead of plain English.
The binutils we use are not part of the LLVM tools, but they are powerful tools for ELF analysis. This chapter introduces them because we think knowledge of the popular ELF format and of the binutils ELF analysis tools is valuable. An LLVM compiler engineer is responsible for analyzing the ELF output, since the obj has to be handled by the linker or loader later. With these tools, you can verify the ELF files you generate.
The cpu0 author has published a “System Software” book which introduces the topics of assembler, linker, loader, compiler and OS in concept, and at the same time demonstrates how to use binutils and gcc to analyze ELF through the example code in his book. It is a Chinese book on “System Software” in concept and practice. This book does real analysis through binutils. The “System Software” [3] written by Beck is a famous book that explains in concept what the compiler output is, what the linker output is, what the loader output is, and how they work together. But it covers the concept only. You can refer to it to understand how the “Relocation Record” works if you need to refresh or learn this knowledge for this chapter.
[4], [5], [6] are the Chinese documents available from the cpu0 author on his web site.
ELF is a format used for both obj and executable files. So, there are two views of it, as shown in Figure 1.
As Figure 1 shows, the “Section header table” includes sections .text, .rodata, ..., .data, which are the sections laid out for code, read-only data, ..., and read/write data. The “Program header table” includes segments, which contain run-time code and data. Segments are the run-time layout for code and data, and sections are the link-time layout for code and data.
Let’s run Chapter7_7/ with ch6_1.cpp, and dump the ELF header with readelf -h to see what information it contains.
[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
[Gamma@localhost InputFiles]$ readelf -h ch6_1.cpu0.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 08 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - IRIX
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           <unknown>: 0xc9
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          212 (bytes into file)
  Flags:                             0x70000001
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         10
  Section header string table index: 7
[Gamma@localhost InputFiles]$
[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=mips -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.mips.o
[Gamma@localhost InputFiles]$ readelf -h ch6_1.mips.o
ELF Header:
  Magic:   7f 45 4c 46 01 02 01 08 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, big endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - IRIX
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           MIPS R3000
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          212 (bytes into file)
  Flags:                             0x70000001
  Size of this header:               52 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           40 (bytes)
  Number of section headers:         11
  Section header string table index: 8
[Gamma@localhost InputFiles]$
As the ELF header display above shows, it contains the magic number, version, ABI, and so on. The Machine field of cpu0 is unknown while that of mips is MIPS R3000. This is because cpu0 is not a popular CPU recognized by the readelf utility. Let’s check the ELF segment information as follows,
[Gamma@localhost InputFiles]$ readelf -l ch6_1.cpu0.o

There are no program headers in this file.
[Gamma@localhost InputFiles]$
The result is as expected, because the cpu0 obj is for linking only, not for execution. So, there are no segments. Check the ELF section information as follows. It contains offset and size information for every section.
[Gamma@localhost InputFiles]$ readelf -S ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000034 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000310 000018 08      8   1  4
  [ 3] .data             PROGBITS        00000000 000068 000004 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 00006c 000000 00  WA  0   0  4
  [ 5] .eh_frame         PROGBITS        00000000 00006c 000028 00   A  0   0  4
  [ 6] .rel.eh_frame     REL             00000000 000328 000008 08      8   5  4
  [ 7] .shstrtab         STRTAB          00000000 000094 00003e 00      0   0  1
  [ 8] .symtab           SYMTAB          00000000 000264 000090 10      9   6  4
  [ 9] .strtab           STRTAB          00000000 0002f4 00001b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
[Gamma@localhost InputFiles]$
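To make the header fields printed by readelf -h more concrete, the following is a minimal C++ sketch that reads a few of them from the 52-byte ELF32 header. The field offsets follow the standard ELF32 header layout. The byte array is rebuilt by hand from the values readelf printed for ch6_1.cpu0.o, and the names readBE16, readBE32, Elf32Summary, summarize, sampleCpu0Header and sectionHeaderTableEnd are hypothetical helpers made up for this illustration; they are not part of LLVM or binutils.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Big-endian readers (the cpu0 obj above is "2's complement, big endian").
static uint16_t readBE16(const uint8_t *p) {
  return static_cast<uint16_t>((p[0] << 8) | p[1]);
}
static uint32_t readBE32(const uint8_t *p) {
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) | static_cast<uint32_t>(p[3]);
}

// Hypothetical summary of the header fields discussed in this chapter.
struct Elf32Summary {
  bool hasMagic;      // e_ident[0..3] == 0x7f 'E' 'L' 'F'
  uint16_t machine;   // e_machine
  uint32_t shoff;     // e_shoff: start of section headers
  uint16_t shentsize; // e_shentsize: size of one section header
  uint16_t shnum;     // e_shnum: number of section headers
  uint16_t shstrndx;  // e_shstrndx: section header string table index
};

Elf32Summary summarize(const std::array<uint8_t, 52> &h) {
  // This sketch only handles ELF32 (EI_CLASS == 1), big endian (EI_DATA == 2).
  assert(h[4] == 1 && h[5] == 2);
  Elf32Summary s;
  s.hasMagic = h[0] == 0x7f && h[1] == 'E' && h[2] == 'L' && h[3] == 'F';
  s.machine = readBE16(&h[18]);
  s.shoff = readBE32(&h[32]);
  s.shentsize = readBE16(&h[46]);
  s.shnum = readBE16(&h[48]);
  s.shstrndx = readBE16(&h[50]);
  return s;
}

// Header bytes reconstructed from the readelf -h listing of ch6_1.cpu0.o.
std::array<uint8_t, 52> sampleCpu0Header() {
  std::array<uint8_t, 52> h = {{
      0x7f, 0x45, 0x4c, 0x46, 0x01, 0x02, 0x01, 0x08, // magic, class, data, ...
      0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // padding
      0x00, 0x01,                                     // e_type: REL
      0x00, 0xc9,                                     // e_machine: 0xc9
      0x00, 0x00, 0x00, 0x01,                         // e_version
      0x00, 0x00, 0x00, 0x00,                         // e_entry
      0x00, 0x00, 0x00, 0x00,                         // e_phoff
      0x00, 0x00, 0x00, 0xd4,                         // e_shoff: 212
      0x70, 0x00, 0x00, 0x01,                         // e_flags
      0x00, 0x34,                                     // e_ehsize: 52
      0x00, 0x00,                                     // e_phentsize
      0x00, 0x00,                                     // e_phnum
      0x00, 0x28,                                     // e_shentsize: 40
      0x00, 0x0a,                                     // e_shnum: 10
      0x00, 0x07                                      // e_shstrndx: 7
  }};
  return h;
}

// First file offset past the section header table.
uint32_t sectionHeaderTableEnd(const Elf32Summary &s) {
  return s.shoff + static_cast<uint32_t>(s.shentsize) * s.shnum;
}
```

For this particular file the section header table ends at 212 + 10 * 40 = 612 = 0x264, which is the Off value readelf -S reports for .symtab. The machine value 0xc9 is decimal 201.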
The cpu0 backend translates a global variable as follows,
[Gamma@localhost InputFiles]$ clang -c ch6_1.cpp -emit-llvm -o ch6_1.bc
[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=cpu0 -relocation-model=pic -filetype=asm ch6_1.bc -o ch6_1.cpu0.s
[Gamma@localhost InputFiles]$ cat ch6_1.cpu0.s
    .section .mdebug.abi32
    .previous
    .file "ch6_1.bc"
    .text
    .globl main
    .align 2
    .type main,@function
    .ent main                    # @main
main:
    .cfi_startproc
    .frame $sp,8,$lr
    .mask 0x00000000,0
    .set noreorder
    .cpload $t9
...
    ld $2, %got(gI)($gp)
...
    .type gI,@object             # @gI
    .data
    .globl gI
    .align 2
gI:
    .4byte 100                   # 0x64
    .size gI, 4

[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch6_1.bc -o ch6_1.cpu0.o
[Gamma@localhost InputFiles]$ objdump -s ch6_1.cpu0.o

ch6_1.cpu0.o:     file format elf32-big

Contents of section .text:
// .cpload machine instruction
 0000 09a00000 1eaa0010 09aa0000 13aa6000  ..............`.
 ...
 0020 002a0000 00220000 012d0000 09dd0008  .*..."...-......
 ...

[Gamma@localhost InputFiles]$ readelf -tr ch6_1.cpu0.o
There are 10 section headers, starting at offset 0xd4:

Section Headers:
  [Nr] Name           Type     Addr     Off    Size   ES Lk Inf Al Flags
  [ 0]                NULL     00000000 000000 000000 00  0   0  0 [00000000]:
  [ 1] .text          PROGBITS 00000000 000034 000034 00  0   0  4 [00000006]: ALLOC, EXEC
  [ 2] .rel.text      REL      00000000 000310 000018 08  8   1  4 [00000000]:
  [ 3] .data          PROGBITS 00000000 000068 000004 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 4] .bss           NOBITS   00000000 00006c 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 5] .eh_frame      PROGBITS 00000000 00006c 000028 00  0   0  4 [00000002]: ALLOC
  [ 6] .rel.eh_frame  REL      00000000 000328 000008 08  8   5  4 [00000000]:
  [ 7] .shstrtab      STRTAB   00000000 000094 00003e 00  0   0  1 [00000000]:
  [ 8] .symtab        SYMTAB   00000000 000264 000090 10  9   6  4 [00000000]:
  [ 9] .strtab        STRTAB   00000000 0002f4 00001b 00  0   0  1 [00000000]:

Relocation section '.rel.text' at offset 0x310 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000805 unrecognized: 5   00000000   _gp_disp
00000008  00000806 unrecognized: 6   00000000   _gp_disp
00000020  00000609 unrecognized: 9   00000000   gI

Relocation section '.rel.eh_frame' at offset 0x328 contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000202 unrecognized: 2   00000000   .text

[Gamma@localhost InputFiles]$ readelf -tr ch6_1.mips.o
There are 10 section headers, starting at offset 0xd0:

Section Headers:
  [Nr] Name           Type     Addr     Off    Size   ES Lk Inf Al Flags
  [ 0]                NULL     00000000 000000 000000 00  0   0  0 [00000000]:
  [ 1] .text          PROGBITS 00000000 000034 000030 00  0   0  4 [00000006]: ALLOC, EXEC
  [ 2] .rel.text      REL      00000000 00030c 000018 08  8   1  4 [00000000]:
  [ 3] .data          PROGBITS 00000000 000064 000004 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 4] .bss           NOBITS   00000000 000068 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 5] .eh_frame      PROGBITS 00000000 000068 000028 00  0   0  4 [00000002]: ALLOC
  [ 6] .rel.eh_frame  REL      00000000 000324 000008 08  8   5  4 [00000000]:
  [ 7] .shstrtab      STRTAB   00000000 000090 00003e 00  0   0  1 [00000000]:
  [ 8] .symtab        SYMTAB   00000000 000260 000090 10  9   6  4 [00000000]:
  [ 9] .strtab        STRTAB   00000000 0002f0 00001b 00  0   0  1 [00000000]:

Relocation section '.rel.text' at offset 0x30c contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000805 R_MIPS_HI16       00000000   _gp_disp
00000004  00000806 R_MIPS_LO16       00000000   _gp_disp
00000018  00000609 R_MIPS_GOT16      00000000   gI

Relocation section '.rel.eh_frame' at offset 0x324 contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000202 R_MIPS_32         00000000   .text
As depicted in section Handle $gp register in PIC addressing mode, it translates “.cpload %reg” into the following.
// Lower ".cpload $reg" to
//  "addiu $gp, $zero, %hi(_gp_disp)"
//  "shl   $gp, $gp, 16"
//  "addiu $gp, $gp, %lo(_gp_disp)"
//  "addu  $gp, $gp, $t9"
The _gp_disp value is determined by the loader. So, it’s undefined in the obj. You can find the relocation records for offsets 0 and 8 of the .text section, which refer to the _gp_disp value. Offsets 0 and 8 of the .text section are the instructions “addiu $gp, $zero, %hi(_gp_disp)” and “addiu $gp, $gp, %lo(_gp_disp)”, and their corresponding obj encodings are 09a00000 and 09aa0000. The obj translates %hi(_gp_disp) and %lo(_gp_disp) into 0 because the loader only knows the _gp_disp value when it loads this obj into memory at run time, and it then patches the two locations named by these relocation records with the correct values. You can check that the cpu0 relocation types for %hi(_gp_disp) and %lo(_gp_disp) are correct by comparing them with the mips relocation records R_MIPS_HI16 and R_MIPS_LO16 for _gp_disp above, even though cpu0 is not a CPU recognized by the readelf utility. The instruction “ld $2, %got(gI)($gp)” is handled the same way, since we don’t know the address at which the .data section variable will be loaded. So, the address is translated to 0 and a relocation record is made at 0x00000020 of the .text section. The loader will patch this address too.
Running with ch8_3_3.cpp leaves _Z5sum_iiz and other symbol references unresolved, as shown below. The loader or linker will take care of them according to the relocation records generated by the compiler.
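The relocation records discussed above can be read by hand. In ELF32 the Info column packs the symbol table index into the upper 24 bits and the relocation type into the low 8 bits, so 00000805 means symbol 8, type 5. The sketch below decodes that field and shows one way the 16-bit immediate of an instruction word such as 09a00000 could be patched. The %hi/%lo split used here is the MIPS-style convention (the low half is added as a sign-extended immediate), which is an assumption for illustration and not taken from the cpu0 loader. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ELF32 r_info: symbol index in the upper 24 bits, type in the low 8 bits.
struct RelInfo {
  uint32_t symIndex;
  uint32_t type;
};

RelInfo decodeRelInfo(uint32_t info) {
  return RelInfo{info >> 8, info & 0xffu};
}

// ASSUMPTION: MIPS-style %hi/%lo split. The low half is later added as a
// sign-extended 16-bit immediate, so %hi is rounded up to compensate.
uint16_t hi16(uint32_t value) {
  return static_cast<uint16_t>((value + 0x8000u) >> 16);
}
uint16_t lo16(uint32_t value) {
  return static_cast<uint16_t>(value & 0xffffu);
}

// Overwrite the 16-bit immediate field (bits 15..0) of an instruction word.
uint32_t patchImm16(uint32_t insn, uint16_t imm) {
  return (insn & 0xffff0000u) | imm;
}

// Emulate "addiu hi; shl 16; addiu lo" to rebuild the 32-bit value.
uint32_t rebuild(uint16_t hi, uint16_t lo) {
  uint32_t upper = static_cast<uint32_t>(hi) << 16;
  // Sign-extend the low half the way a signed 16-bit immediate would be.
  uint32_t lower = (lo & 0x8000u) ? (0xffff0000u | lo) : lo;
  return upper + lower; // wraps modulo 2^32, like register arithmetic
}

// Values taken from the readelf -tr listing of ch6_1.cpu0.o above.
void checkExamples() {
  assert(decodeRelInfo(0x00000805u).symIndex == 8); // _gp_disp
  assert(decodeRelInfo(0x00000805u).type == 5);
  assert(decodeRelInfo(0x00000609u).symIndex == 6); // gI
  assert(decodeRelInfo(0x00000609u).type == 9);
}
```

With a made-up _gp_disp value of 0x12348000, hi16 gives 0x1235 and lo16 gives 0x8000, so the two addiu words 09a00000 and 09aa0000 would become 09a01235 and 09aa8000 under this convention.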
[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_3_3.bc -o ch8_3_3.cpu0.o
[Gamma@localhost InputFiles]$ readelf -tr ch8_3_3.cpu0.o
There are 11 section headers, starting at offset 0x248:

Section Headers:
  [Nr] Name           Type     Addr     Off    Size   ES Lk Inf Al Flags
  [ 0]                NULL     00000000 000000 000000 00  0   0  0 [00000000]:
  [ 1] .text          PROGBITS 00000000 000034 000178 00  0   0  4 [00000006]: ALLOC, EXEC
  [ 2] .rel.text      REL      00000000 000538 000058 08  9   1  4 [00000000]:
  [ 3] .data          PROGBITS 00000000 0001ac 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 4] .bss           NOBITS   00000000 0001ac 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 5] .rodata.str1.1 PROGBITS 00000000 0001ac 000008 01  0   0  1 [00000032]: ALLOC, MERGE, STRINGS
  [ 6] .eh_frame      PROGBITS 00000000 0001b4 000044 00  0   0  4 [00000002]: ALLOC
  [ 7] .rel.eh_frame  REL      00000000 000590 000010 08  9   6  4 [00000000]:
  [ 8] .shstrtab      STRTAB   00000000 0001f8 00004d 00  0   0  1 [00000000]:
  [ 9] .symtab        SYMTAB   00000000 000400 0000e0 10 10   8  4 [00000000]:
  [10] .strtab        STRTAB   00000000 0004e0 000055 00  0   0  1 [00000000]:

Relocation section '.rel.text' at offset 0x538 contains 11 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000c05 unrecognized: 5   00000000   _gp_disp
00000008  00000c06 unrecognized: 6   00000000   _gp_disp
0000001c  00000b09 unrecognized: 9   00000000   __stack_chk_guard
000000b8  00000b09 unrecognized: 9   00000000   __stack_chk_guard
000000dc  00000a0b unrecognized: b   00000000   __stack_chk_fail
000000e8  00000c05 unrecognized: 5   00000000   _gp_disp
000000f0  00000c06 unrecognized: 6   00000000   _gp_disp
00000140  0000080b unrecognized: b   00000000   _Z5sum_iiz
00000154  00000209 unrecognized: 9   00000000   $.str
00000158  00000206 unrecognized: 6   00000000   $.str
00000160  00000d0b unrecognized: b   00000000   printf

Relocation section '.rel.eh_frame' at offset 0x590 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000302 unrecognized: 2   00000000   .text
00000034  00000302 unrecognized: 2   00000000   .text

[Gamma@localhost InputFiles]$ /usr/local/llvm/test/cmake_debug_build/bin/llc -march=mips -relocation-model=pic -filetype=obj ch8_3_3.bc -o ch8_3_3.mips.o
[Gamma@localhost InputFiles]$ readelf -tr ch8_3_3.mips.o
There are 11 section headers, starting at offset 0x254:

Section Headers:
  [Nr] Name           Type     Addr     Off    Size   ES Lk Inf Al Flags
  [ 0]                NULL     00000000 000000 000000 00  0   0  0 [00000000]:
  [ 1] .text          PROGBITS 00000000 000034 000184 00  0   0  4 [00000006]: ALLOC, EXEC
  [ 2] .rel.text      REL      00000000 000544 000058 08  9   1  4 [00000000]:
  [ 3] .data          PROGBITS 00000000 0001b8 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 4] .bss           NOBITS   00000000 0001b8 000000 00  0   0  4 [00000003]: WRITE, ALLOC
  [ 5] .rodata.str1.1 PROGBITS 00000000 0001b8 000008 01  0   0  1 [00000032]: ALLOC, MERGE, STRINGS
  [ 6] .eh_frame      PROGBITS 00000000 0001c0 000044 00  0   0  4 [00000002]: ALLOC
  [ 7] .rel.eh_frame  REL      00000000 00059c 000010 08  9   6  4 [00000000]:
  [ 8] .shstrtab      STRTAB   00000000 000204 00004d 00  0   0  1 [00000000]:
  [ 9] .symtab        SYMTAB   00000000 00040c 0000e0 10 10   8  4 [00000000]:
  [10] .strtab        STRTAB   00000000 0004ec 000055 00  0   0  1 [00000000]:

Relocation section '.rel.text' at offset 0x544 contains 11 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000000  00000c05 R_MIPS_HI16       00000000   _gp_disp
00000004  00000c06 R_MIPS_LO16       00000000   _gp_disp
00000024  00000b09 R_MIPS_GOT16      00000000   __stack_chk_guard
000000c8  00000b09 R_MIPS_GOT16      00000000   __stack_chk_guard
000000f0  00000a0b R_MIPS_CALL16     00000000   __stack_chk_fail
00000100  00000c05 R_MIPS_HI16       00000000   _gp_disp
00000104  00000c06 R_MIPS_LO16       00000000   _gp_disp
00000134  0000080b R_MIPS_CALL16     00000000   _Z5sum_iiz
00000154  00000209 R_MIPS_GOT16      00000000   $.str
00000158  00000206 R_MIPS_LO16       00000000   $.str
0000015c  00000d0b R_MIPS_CALL16     00000000   printf

Relocation section '.rel.eh_frame' at offset 0x59c contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0000001c  00000302 R_MIPS_32         00000000   .text
00000034  00000302 R_MIPS_32         00000000   .text
[Gamma@localhost InputFiles]$
Files Cpu0ELFObjectWriter.cpp and Cpu0MC*.cpp are the files that take care of the obj format. Most of the obj code translation is defined by Cpu0InstrInfo.td and Cpu0RegisterInfo.td. With these td descriptions, LLVM translates instructions into obj format automatically.
lld is the LLVM linker project. It’s under development, and we could not finish the installation by following the directions on its web site. Even so, it makes sense to develop a new linker according to the lld web site information. Please visit the web site [7].
On iMac, gobjdump -tr can display the relocation record information like readelf -tr. The LLVM tool llvm-objdump is the counterpart of objdump. Let’s run the gobjdump and llvm-objdump commands as follows to see the differences.
118-165-83-12:InputFiles Jonathan$ clang -c ch8_3_3.cpp -emit-llvm -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk/usr/include/ -o ch8_3_3.bc
118-165-83-10:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llc -march=cpu0 -relocation-model=pic -filetype=obj ch8_3_3.bc -o ch8_3_3.cpu0.o
118-165-78-12:InputFiles Jonathan$ gobjdump -t -r ch8_3_3.cpu0.o

ch8_3_3.cpu0.o:     file format elf32-big

SYMBOL TABLE:
00000000 l    df *ABS*           00000000 ch8_3_3.bc
00000000 l     O .rodata.str1.1  00000008 $.str
00000000 l    d  .text           00000000 .text
00000000 l    d  .data           00000000 .data
00000000 l    d  .bss            00000000 .bss
00000000 l    d  .rodata.str1.1  00000000 .rodata.str1.1
00000000 l    d  .eh_frame       00000000 .eh_frame
00000000 g     F .text           000000d4 _Z5sum_iiz
000000d4 g     F .text           00000074 main
00000000         *UND*           00000000 __stack_chk_fail
00000000         *UND*           00000000 __stack_chk_guard
00000000         *UND*           00000000 printf

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE
00000008 UNKNOWN           __stack_chk_guard
00000010 UNKNOWN           __stack_chk_guard
000000d0 UNKNOWN           __stack_chk_fail
00000118 UNKNOWN           _Z5sum_iiz
00000124 UNKNOWN           $.str
0000012c UNKNOWN           $.str
00000134 UNKNOWN           printf

RELOCATION RECORDS FOR [.eh_frame]:
OFFSET   TYPE              VALUE
0000001c UNKNOWN           .text
00000034 UNKNOWN           .text

118-165-83-10:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llvm-objdump -t -r ch8_3_3.cpu0.o

ch8_3_3.cpu0.o:     file format ELF32-CPU0

RELOCATION RECORDS FOR [.text]:
0 R_CPU0_HI16 _gp_disp
8 R_CPU0_LO16 _gp_disp
28 R_CPU0_GOT16 __stack_chk_guard
188 R_CPU0_GOT16 __stack_chk_guard
224 R_CPU0_CALL24 __stack_chk_fail
236 R_CPU0_HI16 _gp_disp
244 R_CPU0_LO16 _gp_disp
324 R_CPU0_CALL24 _Z5sum_iiz
344 R_CPU0_GOT16 $.str
348 R_CPU0_LO16 $.str
356 R_CPU0_CALL24 printf

RELOCATION RECORDS FOR [.eh_frame]:
28 R_CPU0_32 .text
52 R_CPU0_32 .text

SYMBOL TABLE:
00000000 l    df *ABS*           00000000 ch8_3_3.bc
00000000 l       .rodata.str1.1  00000008 $.str
00000000 l    d  .text           00000000 .text
00000000 l    d  .data           00000000 .data
00000000 l    d  .bss            00000000 .bss
00000000 l    d  .rodata.str1.1  00000000 .rodata.str1.1
00000000 l    d  .eh_frame       00000000 .eh_frame
00000000 g     F .text           000000ec _Z5sum_iiz
000000ec g     F .text           00000094 main
00000000         *UND*           00000000 __stack_chk_fail
00000000         *UND*           00000000 __stack_chk_guard
00000000         *UND*           00000000 _gp_disp
00000000         *UND*           00000000 printf
The latter, llvm-objdump, can display the file format and relocation record information because we added the cpu0 relocation record information to ELF.h as follows,
include/support/ELF.h
// Machine architectures
enum {
  ...
  EM_CPU0 = 201, // Document Write An LLVM Backend Tutorial For Cpu0
  ...
}

// include/object/ELF.h
...
template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
                        ::getRelocationTypeName(DataRefImpl Rel,
                                          SmallVectorImpl<char> &Result) const {
  ...
  switch (Header->e_machine) {
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    switch (type) {
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_NONE);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_REL32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_24);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_HI16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_LO16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GPREL16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_LITERAL);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_PC24);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_CALL24);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GPREL32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_SHIFT5);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_SHIFT6);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_64);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT_DISP);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT_PAGE);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT_OFST);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT_HI16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GOT_LO16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_SUB);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_INSERT_A);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_INSERT_B);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_DELETE);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_HIGHER);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_HIGHEST);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_CALL_HI16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_CALL_LO16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_SCN_DISP);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_REL16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_ADD_IMMEDIATE);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_PJUMP);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_RELGOT);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_JALR);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPMOD32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPREL32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPMOD64);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPREL64);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_GD);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_LDM);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPREL_HI16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_DTPREL_LO16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_GOTTPREL);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_TPREL32);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_TPREL64);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_TPREL_HI16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_TLS_TPREL_LO16);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_GLOB_DAT);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_COPY);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_JUMP_SLOT);
      LLVM_ELF_SWITCH_RELOC_TYPE_NAME(R_CPU0_NUM);
    default:
      res = "Unknown";
    }
    break;
  ...
  }

template<support::endianness target_endianness, bool is64Bits>
error_code ELFObjectFile<target_endianness, is64Bits>
                        ::getRelocationValueString(DataRefImpl Rel,
                                          SmallVectorImpl<char> &Result) const {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    res = symname;
    break;
  ...
}

template<support::endianness target_endianness, bool is64Bits>
StringRef ELFObjectFile<target_endianness, is64Bits>
                       ::getFileFormatName() const {
  switch(Header->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
    switch(Header->e_machine) {
    ...
    case ELF::EM_CPU0:  // llvm-objdump -t -r
      return "ELF32-CPU0";
    ...
}

template<support::endianness target_endianness, bool is64Bits>
unsigned ELFObjectFile<target_endianness, is64Bits>::getArch() const {
  switch(Header->e_machine) {
  ...
  case ELF::EM_CPU0:  // llvm-objdump -t -r
    return (target_endianness == support::little) ?
           Triple::cpu0el : Triple::cpu0;
  ...
}
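The idea behind the long switch above is simply a table from relocation type number to a printable name, built with a stringizing macro. The following self-contained C++ sketch shows that idea in isolation. RELOC_TYPE_NAME is a simplified stand-in written for this example, not the real LLVM_ELF_SWITCH_RELOC_TYPE_NAME macro, and only five relocation types are listed. Their numeric values are inferred by matching the "unrecognized: N" entries printed by readelf with the names printed by llvm-objdump for the same symbols earlier in this chapter, so treat them as assumptions rather than as the authoritative ELF.h definitions.

```cpp
#include <cassert>
#include <string>

// Assumed numbering, inferred from the readelf and llvm-objdump listings.
enum Cpu0Reloc : unsigned {
  R_CPU0_32 = 2,
  R_CPU0_HI16 = 5,
  R_CPU0_LO16 = 6,
  R_CPU0_GOT16 = 9,
  R_CPU0_CALL24 = 11
};

// Simplified stand-in macro: one switch case per enumerator, using the
// preprocessor's # operator to turn the enumerator into its own name.
#define RELOC_TYPE_NAME(enumerator)                                            \
  case enumerator:                                                             \
    return #enumerator;

std::string relocTypeName(unsigned type) {
  switch (type) {
    RELOC_TYPE_NAME(R_CPU0_32)
    RELOC_TYPE_NAME(R_CPU0_HI16)
    RELOC_TYPE_NAME(R_CPU0_LO16)
    RELOC_TYPE_NAME(R_CPU0_GOT16)
    RELOC_TYPE_NAME(R_CPU0_CALL24)
  default:
    return "Unknown";
  }
}

// The type is the low 8 bits of the Info column, e.g. 0x00000c05 -> 5.
void checkExamples() {
  assert(relocTypeName(0x00000c05u & 0xffu) == "R_CPU0_HI16");
  assert(relocTypeName(0x00000d0bu & 0xffu) == "R_CPU0_CALL24");
}
```

A type number that is not listed falls through to "Unknown", which mirrors the default branch of the switch in ELF.h.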
Run Chapter8_9/ with the command llvm-objdump -d to dump the ELF file, as follows,
JonathantekiiMac:InputFiles Jonathan$ clang -c ch7_1_1.cpp -emit-llvm -o ch7_1_1.bc
JonathantekiiMac:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llc -march=cpu0 -relocation-model=pic -filetype=obj ch7_1_1.bc -o ch7_1_1.cpu0.o
JonathantekiiMac:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llvm-objdump -d ch7_1_1.cpu0.o

ch7_1_1.cpu0.o: file format ELF32-unknown

Disassembly of section .text:error: no disassembler for target cpu0-unknown-unknown
To support llvm-objdump, the following code is added to Chapter9_1/.
LLVMBackendTutorialExampleCode/Chapter9_1/CMakeLists.txt
tablegen(LLVM Cpu0GenDisassemblerTables.inc -gen-disassembler)
...
LLVMBackendTutorialExampleCode/Chapter9_1/LLVMBuild.txt
[common]
subdirectories = Disassembler ...
...
has_disassembler = 1
...
LLVMBackendTutorialExampleCode/Chapter9_1/Cpu0InstrInfo.td
class CmpInstr<bits<8> op, string instr_asm,
               InstrItinClass itin, RegisterClass RC, RegisterClass RD,
               bit isComm = 0>:
  FA<op, (outs RD:$rc), (ins RC:$ra, RC:$rb),
     !strconcat(instr_asm, "\t$ra, $rb"), [], itin> {
  ...
  let DecoderMethod = "DecodeCMPInstruction";
}

class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
              list<Register> UseRegs>:
  FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
     !strconcat(instr_asm, "\t$addr"),
     [(brcond RC:$ra, bb:$addr)], IIBranch> {
  ...
  let DecoderMethod = "DecodeBranchTarget";
}

let isBranch=1, isTerminator=1, isBarrier=1, imm16=0, hasDelaySlot = 1,
    isIndirectBranch = 1 in
class JumpFR<bits<8> op, string instr_asm, RegisterClass RC>:
  FL<op, (outs), (ins RC:$ra),
     !strconcat(instr_asm, "\t$ra"), [(brind RC:$ra)], IIBranch> {
  let rb = 0;
  let imm16 = 0;
}

let isCall=1, hasDelaySlot=0 in {
  class JumpLink<bits<8> op, string instr_asm>:
    FJ<op, (outs), (ins calltarget:$target, variable_ops),
       !strconcat(instr_asm, "\t$target"), [(Cpu0JmpLink imm:$target)],
       IIBranch> {
    let DecoderMethod = "DecodeJumpAbsoluteTarget";
  }
}

def JR : JumpFR<0x2C, "ret", CPURegs>;
LLVMBackendTutorialExampleCode/Chapter9_1/Disassembler/CMakeLists.txt
include_directories( ${CMAKE_CURRENT_BINARY_DIR}/.. ${CMAKE_CURRENT_SOURCE_DIR}/.. )

add_llvm_library(LLVMCpu0Disassembler
  Cpu0Disassembler.cpp
  )

# workaround for hanging compilation on MSVC9 and 10
if( MSVC_VERSION EQUAL 1400 OR MSVC_VERSION EQUAL 1500 OR MSVC_VERSION EQUAL 1600 )
set_property(
  SOURCE Cpu0Disassembler.cpp
  PROPERTY COMPILE_FLAGS "/Od"
  )
endif()

add_dependencies(LLVMCpu0Disassembler Cpu0CommonTableGen)
LLVMBackendTutorialExampleCode/Chapter9_1/Disassembler/LLVMBuild.txt
;===- ./lib/Target/Cpu0/Disassembler/LLVMBuild.txt --------------*- Conf -*--===;
;
;                     The LLVM Compiler Infrastructure
;
; This file is distributed under the University of Illinois Open Source
; License. See LICENSE.TXT for details.
;
;===------------------------------------------------------------------------===;
;
; This is an LLVMBuild description file for the components in this subdirectory.
;
; For more information on the LLVMBuild system, please see:
;
;   http://llvm.org/docs/LLVMBuild.html
;
;===------------------------------------------------------------------------===;

[component_0]
type = Library
name = Cpu0Disassembler
parent = Cpu0
required_libraries = MC Support Cpu0Info
add_to_library_groups = Cpu0
LLVMBackendTutorialExampleCode/Chapter9_1/Disassembler/Cpu0Disassembler.cpp
//===- Cpu0Disassembler.cpp - Disassembler for Cpu0 -------------*- C++ -*-===//
//
//                     The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file is part of the Cpu0 Disassembler.
//
//===----------------------------------------------------------------------===//

#include "Cpu0.h"
#include "Cpu0Subtarget.h"
#include "Cpu0RegisterInfo.h"
#include "llvm/MC/MCDisassembler.h"
#include "llvm/MC/MCFixedLenDisassembler.h"
#include "llvm/Support/MemoryObject.h"
#include "llvm/Support/TargetRegistry.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/MC/MCInst.h"
#include "llvm/Support/MathExtras.h"

using namespace llvm;

typedef MCDisassembler::DecodeStatus DecodeStatus;

/// Cpu0Disassembler - a disassembler class for Cpu032.
class Cpu0Disassembler : public MCDisassembler {
public:
  /// Constructor - Initializes the disassembler.
  ///
  Cpu0Disassembler(const MCSubtargetInfo &STI, bool bigEndian) :
    MCDisassembler(STI), isBigEndian(bigEndian) {
  }

  ~Cpu0Disassembler() {
  }

  /// getInstruction - See MCDisassembler.
  DecodeStatus getInstruction(MCInst &instr,
                              uint64_t &size,
                              const MemoryObject &region,
                              uint64_t address,
                              raw_ostream &vStream,
                              raw_ostream &cStream) const;

private:
  bool isBigEndian;
};

// Decoder tables for Cpu0 register
static const unsigned CPURegsTable[] = {
  Cpu0::ZERO, Cpu0::AT, Cpu0::V0, Cpu0::V1,
  Cpu0::A0, Cpu0::A1, Cpu0::T9, Cpu0::S0,
  Cpu0::S1, Cpu0::S2, Cpu0::GP, Cpu0::FP,
  Cpu0::SW, Cpu0::SP, Cpu0::LR, Cpu0::PC
};

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder);
static DecodeStatus DecodeCMPInstruction(MCInst &Inst,
                                         unsigned Insn,
                                         uint64_t Address,
                                         const void *Decoder);
static DecodeStatus DecodeBranchTarget(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder);
static DecodeStatus DecodeJumpRelativeTarget(MCInst &Inst,
                                             unsigned Insn,
                                             uint64_t Address,
                                             const void *Decoder);
static DecodeStatus DecodeJumpAbsoluteTarget(MCInst &Inst,
                                             unsigned Insn,
                                             uint64_t Address,
                                             const void *Decoder);
static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder);
static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder);

namespace llvm {
extern Target TheCpu0elTarget, TheCpu0Target, TheCpu064Target,
              TheCpu064elTarget;
}

static MCDisassembler *createCpu0Disassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI) {
  return new Cpu0Disassembler(STI, true);
}

static MCDisassembler *createCpu0elDisassembler(
                       const Target &T,
                       const MCSubtargetInfo &STI) {
  return new Cpu0Disassembler(STI, false);
}

extern "C" void LLVMInitializeCpu0Disassembler() {
  // Register the disassembler.
  TargetRegistry::RegisterMCDisassembler(TheCpu0Target,
                                         createCpu0Disassembler);
  TargetRegistry::RegisterMCDisassembler(TheCpu0elTarget,
                                         createCpu0elDisassembler);
}

#include "Cpu0GenDisassemblerTables.inc"

/// readInstruction - read four bytes from the MemoryObject
/// and return 32 bit word sorted according to the given endianness
static DecodeStatus readInstruction32(const MemoryObject &region,
                                      uint64_t address,
                                      uint64_t &size,
                                      uint32_t &insn,
                                      bool isBigEndian) {
  uint8_t Bytes[4];

  // We want to read exactly 4 Bytes of data.
  if (region.readBytes(address, 4, (uint8_t*)Bytes, NULL) == -1) {
    size = 0;
    return MCDisassembler::Fail;
  }

  if (isBigEndian) {
    // Encoded as a big-endian 32-bit word in the stream.
    insn = (Bytes[3] <<  0) |
           (Bytes[2] <<  8) |
           (Bytes[1] << 16) |
           (Bytes[0] << 24);
  }
  else {
    // Encoded as a little-endian 32-bit word in the stream.
    insn = (Bytes[0] <<  0) |
           (Bytes[1] <<  8) |
           (Bytes[2] << 16) |
           (Bytes[3] << 24);
  }

  return MCDisassembler::Success;
}

DecodeStatus
Cpu0Disassembler::getInstruction(MCInst &instr,
                                 uint64_t &Size,
                                 const MemoryObject &Region,
                                 uint64_t Address,
                                 raw_ostream &vStream,
                                 raw_ostream &cStream) const {
  uint32_t Insn;

  DecodeStatus Result = readInstruction32(Region, Address, Size,
                                          Insn, isBigEndian);
  if (Result == MCDisassembler::Fail)
    return MCDisassembler::Fail;

  // Calling the auto-generated decoder function.
  Result = decodeInstruction(DecoderTableCpu032, instr, Insn, Address,
                             this, STI);
  if (Result != MCDisassembler::Fail) {
    Size = 4;
    return Result;
  }

  return MCDisassembler::Fail;
}

static DecodeStatus DecodeCPURegsRegisterClass(MCInst &Inst,
                                               unsigned RegNo,
                                               uint64_t Address,
                                               const void *Decoder) {
  // CPURegsTable has 16 entries, so the valid indexes are 0..15.
  if (RegNo > 15)
    return MCDisassembler::Fail;

  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[RegNo]));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeMem(MCInst &Inst,
                              unsigned Insn,
                              uint64_t Address,
                              const void *Decoder) {
  int Offset = SignExtend32<16>(Insn & 0xffff);
  int Reg = (int)fieldFromInstruction(Insn, 20, 4);
  int Base = (int)fieldFromInstruction(Insn, 16, 4);

  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[Reg]));
  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[Base]));
  Inst.addOperand(MCOperand::CreateImm(Offset));

  return MCDisassembler::Success;
}

/* The CMP instruction defines $rc and then $ra, $rb; printOperand() prints
   operand 1 and operand 2 (operand 0 is $rc and operand 1 is $ra), so we
   create register $rc first and create $ra next, as follows,

// Cpu0InstrInfo.td
class CmpInstr<bits<8> op, string instr_asm,
                    InstrItinClass itin, RegisterClass RC, RegisterClass RD,
                    bit isComm = 0>:
  FA<op, (outs RD:$rc), (ins RC:$ra, RC:$rb),
     !strconcat(instr_asm, "\t$ra, $rb"), [], itin> {

// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
  case 3:
    // CMP, JEQ, JGE, JGT, JLE, JLT, JNE
    printOperand(MI, 1, O);
    break;
...
  case 1:
    // CMP
    printOperand(MI, 2, O);
    return;
    break;
*/
static DecodeStatus DecodeCMPInstruction(MCInst &Inst,
                                         unsigned Insn,
                                         uint64_t Address,
                                         const void *Decoder) {
  int Reg_a = (int)fieldFromInstruction(Insn, 20, 4);
  int Reg_b = (int)fieldFromInstruction(Insn, 16, 4);
  int Reg_c = (int)fieldFromInstruction(Insn, 12, 4);

  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[Reg_c]));
  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[Reg_a]));
  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[Reg_b]));
  return MCDisassembler::Success;
}

/* The CBranch instruction defines $ra and then imm24; printOperand() prints
   operand 1 (operand 0 is $ra and operand 1 is imm24), so we create the
   register operand first and create imm24 next, as follows,

// Cpu0InstrInfo.td
class CBranch<bits<8> op, string instr_asm, RegisterClass RC,
                   list<Register> UseRegs>:
  FJ<op, (outs), (ins RC:$ra, brtarget:$addr),
             !strconcat(instr_asm, "\t$addr"),
             [(brcond RC:$ra, bb:$addr)], IIBranch> {

// Cpu0AsmWriter.inc
void Cpu0InstPrinter::printInstruction(const MCInst *MI, raw_ostream &O) {
...
  case 3:
    // CMP, JEQ, JGE, JGT, JLE, JLT, JNE
    printOperand(MI, 1, O);
    break;
*/
static DecodeStatus DecodeBranchTarget(MCInst &Inst,
                                       unsigned Insn,
                                       uint64_t Address,
                                       const void *Decoder) {
  int BranchOffset = fieldFromInstruction(Insn, 0, 24);
  // Bit 23 is the sign bit of the 24-bit offset, so every value above
  // 0x7fffff is negative.
  if (BranchOffset > 0x7fffff)
    BranchOffset = -1*(0x1000000 - BranchOffset);
  Inst.addOperand(MCOperand::CreateReg(CPURegsTable[0]));
  Inst.addOperand(MCOperand::CreateImm(BranchOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpRelativeTarget(MCInst &Inst,
                                             unsigned Insn,
                                             uint64_t Address,
                                             const void *Decoder) {
  int JumpOffset = fieldFromInstruction(Insn, 0, 24);
  // Same 24-bit sign test as in DecodeBranchTarget.
  if (JumpOffset > 0x7fffff)
    JumpOffset = -1*(0x1000000 - JumpOffset);
  Inst.addOperand(MCOperand::CreateImm(JumpOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeJumpAbsoluteTarget(MCInst &Inst,
                                             unsigned Insn,
                                             uint64_t Address,
                                             const void *Decoder) {
  unsigned JumpOffset = fieldFromInstruction(Insn, 0, 24);
  Inst.addOperand(MCOperand::CreateImm(JumpOffset));
  return MCDisassembler::Success;
}

static DecodeStatus DecodeSimm16(MCInst &Inst,
                                 unsigned Insn,
                                 uint64_t Address,
                                 const void *Decoder) {
  Inst.addOperand(MCOperand::CreateImm(SignExtend32<16>(Insn)));
  return MCDisassembler::Success;
}
The code above adds a Disassembler directory, which handles the reverse translation from obj to assembly code. To build it, add Disassembler/Cpu0Disassembler.cpp, modify CMakeLists.txt and LLVMBuild.txt so that the Disassembler directory is built, and enable generation of the disassembler tables with “has_disassembler = 1”. Most of the decoding is handled by tables generated from the *.td files. Not every instruction in the *.td files can be disassembled without trouble, even though it can be translated into assembly and obj successfully. For instructions that cannot be disassembled automatically, LLVM supplies the “let DecoderMethod” keyword, which lets programmers implement their own decode function. In the Cpu0 example, we define the functions DecodeCMPInstruction(), DecodeBranchTarget() and DecodeJumpAbsoluteTarget() in Cpu0Disassembler.cpp and tell the LLVM table-driven system about them by writing “let DecoderMethod = ...” in the corresponding instruction definitions or ISD nodes of Cpu0InstrInfo.td. LLVM calls these decoder methods when the user runs a tool that disassembles, such as llvm-objdump -d. The comments above these functions explain how they work. The CMP instruction has three operands, $rc, $ra and $rb, in CmpInstr<...>, but its assembly form prints only $ra and $rb. The table-generated function printInstruction() therefore prints operands 1 and 2 ($ra and $rb) and does not print operand 0 ($rc). In the CMP decode function we do not decode the shamt field, because we do not want it displayed and it is not in the assembly print pattern in Cpu0InstrInfo.td.
RET (Cpu0ISD::Ret) and JR (ISD::BRIND) both stand for the “ret” instruction. The former is used when encoding into assembly and obj, while the latter is used when decoding in the disassembler. The node Cpu0ISD::Ret is created in LowerReturn(), which is called at the function exit point.
Now build Chapter9_1/ and run the command llvm-objdump -d ch7_1_1.cpu0.o to get the following result.
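As a rough standalone illustration of the CMP operand ordering, the sketch below mimics what DecodeCMPInstruction() does without any LLVM headers. The helper names (extractField, decodeCmpOperands) and the sample instruction word are made up for this example; only the bit positions (20, 16 and 12) come from the decoder code in this section.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for LLVM's fieldFromInstruction(): returns `width`
// bits of `insn` starting at bit `start` (bit 0 is the least significant).
// Only valid for width < 32.
constexpr std::uint32_t extractField(std::uint32_t insn, unsigned start,
                                     unsigned width) {
  return (insn >> start) & ((1u << width) - 1u);
}

// Returns the register numbers in operand order:
// operand 0 = $rc, operand 1 = $ra, operand 2 = $rb.
inline std::vector<unsigned> decodeCmpOperands(std::uint32_t insn) {
  const unsigned ra = extractField(insn, 20, 4);
  const unsigned rb = extractField(insn, 16, 4);
  const unsigned rc = extractField(insn, 12, 4);
  return {rc, ra, rb};
}
```

For the made-up word 0x1023c000 this yields the operand list {12, 2, 3}. A printer that emits only operands 1 and 2 would show registers 2 and 3, and register 12 ($rc) stays hidden.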
JonathantekiiMac:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llc -march=cpu0 -relocation-model=pic -filetype=obj ch7_1_1.bc -o ch7_1_1.cpu0.o
JonathantekiiMac:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llvm-objdump -d ch7_1_1.cpu0.o

ch7_1_1.cpu0.o: file format ELF32-CPU0

Disassembly of section .text:
main:
       0:  09 dd ff d8  addiu $sp, $sp, -40
       4:  09 30 00 00  addiu $3, $zero, 0
       8:  02 3d 00 24  st $3, 36($sp)
       c:  02 3d 00 20  st $3, 32($sp)
      10:  09 20 00 01  addiu $2, $zero, 1
      14:  02 2d 00 1c  st $2, 28($sp)
      18:  09 40 00 02  addiu $4, $zero, 2
      1c:  02 4d 00 18  st $4, 24($sp)
      20:  09 40 00 03  addiu $4, $zero, 3
      24:  02 4d 00 14  st $4, 20($sp)
      28:  09 40 00 04  addiu $4, $zero, 4
      2c:  02 4d 00 10  st $4, 16($sp)
      30:  09 40 00 05  addiu $4, $zero, 5
      34:  02 4d 00 0c  st $4, 12($sp)
      38:  09 40 00 06  addiu $4, $zero, 6
      3c:  02 4d 00 08  st $4, 8($sp)
      40:  09 40 00 07  addiu $4, $zero, 7
      44:  02 4d 00 04  st $4, 4($sp)
      48:  09 40 00 08  addiu $4, $zero, 8
      4c:  02 4d 00 00  st $4, 0($sp)
      50:  01 4d 00 20  ld $4, 32($sp)
      54:  28 40 00 0c  bne $4, $zero, 12
      58:  01 4d 00 20  ld $4, 32($sp)
      5c:  09 44 00 01  addiu $4, $4, 1
      60:  02 4d 00 20  st $4, 32($sp)
      64:  01 4d 00 1c  ld $4, 28($sp)
      68:  27 40 00 0c  beq $4, $zero, 12
      6c:  01 4d 00 1c  ld $4, 28($sp)
      70:  09 44 00 01  addiu $4, $4, 1
      74:  02 4d 00 1c  st $4, 28($sp)
      78:  01 4d 00 18  ld $4, 24($sp)
      7c:  0a 44 00 01  slti $4, $4, 1
      80:  28 40 00 0c  bne $4, $zero, 12
      84:  01 4d 00 18  ld $4, 24($sp)
      88:  09 44 00 01  addiu $4, $4, 1
      8c:  02 4d 00 18  st $4, 24($sp)
      90:  01 4d 00 14  ld $4, 20($sp)
      94:  0a 44 00 00  slti $4, $4, 0
      98:  28 40 00 0c  bne $4, $zero, 12
      9c:  01 4d 00 14  ld $4, 20($sp)
      a0:  09 44 00 01  addiu $4, $4, 1
      a4:  02 4d 00 14  st $4, 20($sp)
      a8:  01 4d 00 10  ld $4, 16($sp)
      ac:  09 50 ff ff  addiu $5, $zero, -1
      b0:  20 45 40 00  slt $4, $5, $4
      b4:  28 40 00 0c  bne $4, $zero, 12
      b8:  01 4d 00 10  ld $4, 16($sp)
      bc:  09 44 00 01  addiu $4, $4, 1
      c0:  02 4d 00 10  st $4, 16($sp)
      c4:  01 4d 00 0c  ld $4, 12($sp)
      c8:  20 33 40 00  slt $3, $3, $4
      cc:  28 30 00 0c  bne $3, $zero, 12
      d0:  01 3d 00 0c  ld $3, 12($sp)
      d4:  09 33 00 01  addiu $3, $3, 1
      d8:  02 3d 00 0c  st $3, 12($sp)
      dc:  01 3d 00 08  ld $3, 8($sp)
      e0:  20 22 30 00  slt $2, $2, $3
      e4:  28 20 00 0c  bne $2, $zero, 12
      e8:  01 2d 00 08  ld $2, 8($sp)
      ec:  09 22 00 01  addiu $2, $2, 1
      f0:  02 2d 00 08  st $2, 8($sp)
      f4:  01 2d 00 04  ld $2, 4($sp)
      f8:  0a 22 00 01  slti $2, $2, 1
      fc:  28 20 00 0c  bne $2, $zero, 12
     100:  01 2d 00 04  ld $2, 4($sp)
     104:  09 22 00 01  addiu $2, $2, 1
     108:  02 2d 00 04  st $2, 4($sp)
     10c:  01 2d 00 04  ld $2, 4($sp)
     110:  01 3d 00 00  ld $3, 0($sp)
     114:  20 23 20 00  slt $2, $3, $2
     118:  27 20 00 0c  beq $2, $zero, 12
     11c:  01 2d 00 00  ld $2, 0($sp)
     120:  09 22 00 01  addiu $2, $2, 1
     124:  02 2d 00 00  st $2, 0($sp)
     128:  01 2d 00 1c  ld $2, 28($sp)
     12c:  01 3d 00 20  ld $3, 32($sp)
     130:  27 32 00 0c  beq $3, $2, 12
     134:  01 2d 00 20  ld $2, 32($sp)
     138:  09 22 00 01  addiu $2, $2, 1
     13c:  02 2d 00 20  st $2, 32($sp)
     140:  01 2d 00 20  ld $2, 32($sp)
     144:  09 dd 00 28  addiu $sp, $sp, 40
     148:  2c 00 00 00  ret $zero
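The first lines of the listing can be decoded by hand in the same way the disassembler does it. The following is a minimal sketch, assuming the layout that the listing and DecodeMem() suggest: an 8-bit opcode in the top byte, two 4-bit register fields, and a signed 16-bit immediate, stored big-endian. The struct and function names are hypothetical and not part of the Cpu0 backend.

```cpp
#include <cstdint>

// Fields of a Cpu0 instruction with two registers and a 16-bit immediate,
// laid out as the decoders in this chapter read them (assumed layout).
struct DecodedInsn {
  std::uint32_t opcode;  // bits 31..24
  std::uint32_t ra;      // bits 23..20
  std::uint32_t rb;      // bits 19..16
  std::int32_t imm;      // bits 15..0, sign-extended
};

// Assemble four bytes from the object file into a big-endian 32-bit word.
constexpr std::uint32_t bigEndianWord(std::uint8_t b0, std::uint8_t b1,
                                      std::uint8_t b2, std::uint8_t b3) {
  return (std::uint32_t{b0} << 24) | (std::uint32_t{b1} << 16) |
         (std::uint32_t{b2} << 8) | std::uint32_t{b3};
}

// Sign-extend the low 16 bits of `value`.
constexpr std::int32_t signExtend16(std::uint32_t value) {
  return static_cast<std::int32_t>(value & 0xffffu) -
         ((value & 0x8000u) ? 0x10000 : 0);
}

constexpr DecodedInsn decode(std::uint32_t insn) {
  return DecodedInsn{insn >> 24, (insn >> 20) & 0xfu, (insn >> 16) & 0xfu,
                     signExtend16(insn)};
}

// "09 dd ff d8" is listed as: addiu $sp, $sp, -40
static_assert(decode(bigEndianWord(0x09, 0xdd, 0xff, 0xd8)).imm == -40,
              "0xffd8 sign-extends to -40");
```

Decoding the bytes 09 dd ff d8 gives opcode 0x09, register 13 in both register fields and the immediate -40, which agrees with the listed addiu $sp, $sp, -40 if register 13 is $sp.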
We explain how dynamic linking works for Cpu0 even though the Cpu0 linker and dynamic linker do not exist at this point. As with other parts, the Cpu0 dynamic link implementation is borrowed from the Mips ABI. We traced the dynamic link implementation with lldb on the X86 platform and found that X86 and Mips both use the plt to implement dynamic linking.
This section shows the code that the compiler generates to support dynamic linking, and the code that the linker generates for it.
Compile main.cpp to get the Cpu0 PIC assembly code as follows,
LLVMBackendTutorialExampleCode/InputFiles/main.cpp
extern int foo(int x1, int x2);
extern int bar();

int main()
{
  int a = foo(1, 2);
  a += foo(3, 4);
  a += bar();

  return a;
}
118-165-77-200:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llc -march=cpu0 -relocation-model=pic -filetype=asm main.bc -o -
        .section .mdebug.abi32
        .previous
        .file "main.bc"
        .text
        .globl main
        .align 2
        .type main,@function
        .ent main                       # @main
main:
        .cfi_startproc
        .frame $sp,40,$lr
        .mask 0x00004080,-4
        .set noreorder
        .cpload $t9
        .set nomacro
# BB#0:
        addiu $sp, $sp, -40
$tmp2:
        .cfi_def_cfa_offset 40
        st $lr, 36($sp)                 # 4-byte Folded Spill
        st $7, 32($sp)                  # 4-byte Folded Spill
$tmp3:
        .cfi_offset 14, -4
$tmp4:
        .cfi_offset 7, -8
        .cprestore 8
        addiu $2, $zero, 0
        st $2, 28($sp)
        addiu $2, $zero, 2
        st $2, 4($sp)
        addiu $2, $zero, 1
        st $2, 0($sp)
        ld $7, %call24(_Z3fooii)($gp)
        add $6, $zero, $7
        jalr $6
        ld $gp, 8($sp)
        st $2, 24($sp)
        addiu $2, $zero, 4
        st $2, 4($sp)
        addiu $2, $zero, 3
        st $2, 0($sp)
        add $6, $zero, $7
        jalr $6
        ld $gp, 8($sp)
        ld $3, 24($sp)
        add $2, $3, $2
        st $2, 24($sp)
        ld $6, %call24(_Z3barv)($gp)
        jalr $6
        ld $gp, 8($sp)
        ld $3, 24($sp)
        add $2, $3, $2
        st $2, 24($sp)
        ld $7, 32($sp)                  # 4-byte Folded Reload
        ld $lr, 36($sp)                 # 4-byte Folded Reload
        addiu $sp, $sp, 40
        ret $2
        .set macro
        .set reorder
        .end main
$tmp5:
        .size main, ($tmp5)-main
        .cfi_endproc

118-165-77-200:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llc -march=cpu0 -relocation-model=pic -filetype=obj main.bc -o main.cpu0.o
118-165-77-200:InputFiles Jonathan$ /Users/Jonathan/llvm/test/cmake_debug_build/bin/Debug/llvm-objdump -r main.cpu0.o

main.cpu0.o: file format ELF32-CPU0

RELOCATION RECORDS FOR [.text]:
4 R_CPU0_LO16 _gp_disp
52 R_CPU0_CALL24 _Z3fooii
112 R_CPU0_CALL24 _Z3barv

RELOCATION RECORDS FOR [.eh_frame]:
28 R_CPU0_32 .text
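The offsets in the relocation records can be cross-checked against the assembly listing. The sketch below assumes that every Cpu0 instruction is 4 bytes (as the disassembly in this chapter shows), that .cpload expands to three instructions and that .cprestore expands to one. The last two are assumptions, chosen because they agree with the LO16 record at offset 4 and with both CALL24 records. The constant and function names are hypothetical.

```cpp
constexpr unsigned kInsnBytes = 4;       // one 32-bit word per instruction
constexpr unsigned kCploadInsns = 3;     // assumption: .cpload -> 3 instructions
constexpr unsigned kCprestoreInsns = 1;  // assumption: .cprestore -> 1 instruction

// Byte offset in .text of an instruction in main, given how many ordinary
// instructions precede it. The two directives .cpload and .cprestore both
// come earlier in main, so their expansions are always counted.
constexpr unsigned textOffset(unsigned ordinaryInsnsBefore) {
  return (kCploadInsns + kCprestoreInsns + ordinaryInsnsBefore) * kInsnBytes;
}

// 9 ordinary instructions precede "ld $7, %call24(_Z3fooii)($gp)".
static_assert(textOffset(9) == 52, "offset of the _Z3fooii relocation");
// 24 ordinary instructions precede "ld $6, %call24(_Z3barv)($gp)".
static_assert(textOffset(24) == 112, "offset of the _Z3barv relocation");
```

Under these assumptions the computed offsets 52 and 112 match the two R_CPU0_CALL24 records, which suggests that llvm-objdump -r prints the offsets here in decimal.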
Suppose we have a linker that supports dynamic linking for Cpu0. After linking, the plt entries for the dynamically linked functions _Z3fooii and _Z3barv are resolved, and the ELF file looks like the following,
SYMBOL TABLE:
...
0040035c l    d  .dynsym  00000000  .dynsym
...
Disassembly of section .plt:
...
00400720 <_Z3barv@plt>:
  400720:  lui   $8,0x41
  400724:  ld    $6,0x0acc($8)
  400728:  addiu $9,$zero,0x18
  40072c:  jr    $6

00400730 <_Z3fooii@plt>:
  400730:  lui   $8,0x41
  400734:  ld    $6,0x0ad0($8)
  400738:  addiu $9,$zero,0x08
  40073c:  jr    $6
...
004009f0 <.CPU0.stubs>:
  4009F0:  8f998010  ld   $6,-32752(gp)
  4009F4:  03e07821  add  $8,$zero, lr
  4009F8:  0320f809  jalr $6
To support dynamic linking, Cpu0 sets the protocol shown in the table “registers changed for call dynamic link function _Z3fooii()”. The entry at .dynsym+0x08 holds the information about the dynamically linked function. Usually this information includes which library the function lives in and its offset within that library. It can be obtained and saved in the ELF file at link time.
After the ELF is loaded to memory, it looks like Figure 2
register/memory | call _Z3fooii first time | call _Z3fooii second time |
---|---|---|
0x410ad0 | point to CPU0.stubs | point to _Z3fooii |
.dynsym+0x08 | (libfoobar.so, offset, length) about _Z3fooii | useless |
-32752(gp) | point to dynamic_linker | useless |
$8 | the next instruction of _Z3fooii() | useless |
$9 | .dynsym+0x08 | useless |
$6 (at the end of _Z3fooii@plt) | point to CPU0.stubs | point to _Z3fooii |
$6 (at the end of CPU0.stubs) | point to dynamic_linker | useless |
Explains it as follows,
After _Z3fooii() is called a second time, it looks like Figure 3. The code in <_Z3fooii@plt> jumps to _Z3fooii() directly, since the contents of address 0x410ad0 were changed to the memory address of _Z3fooii() at step 6 of Figure 2. From now on, every call to _Z3fooii() jumps to it directly from the _Z3fooii@plt instructions.
According to the Mips Application Binary Interface (ABI), $t9 is the register alias for $25 in Mips. $t9 is the register used in jalr $25 for calls through a long-distance function pointer (far subroutine call). Cpu0 uses register $6 in the role of Mips $t9 ($25). The jal %subroutine instruction has a 24-bit address offset relative to the Program Counter (PC), while jalr reaches the full 32-bit address range because its target is held in a 32-bit register. Shared libraries are one use of PIC mode, as in this example. A shared library is position-independent code that can be loaded at a memory address decided at run time. As shown above, jalr makes the implementation of dynamically linked functions easier and faster.
We trace the dynamic link on X86 below. You can skip it if you have no interest or already know how to trace it via lldb or gdb.
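The lazy binding behaviour in the table can be imitated in a few lines of ordinary C++. This is only a toy model under stated assumptions: a function pointer plays the role of the memory slot at 0x410ad0, fooStub() plays the role of .CPU0.stubs together with the dynamic linker, and realFoo() stands in for _Z3fooii in libfoobar.so. All names are hypothetical, and no real dynamic linker is involved.

```cpp
using Fn = int (*)(int, int);

// Stands in for _Z3fooii inside libfoobar.so.
int realFoo(int x1, int x2) { return x1 + x2; }

// Counts how often the (pretend) dynamic linker had to resolve the symbol.
int resolveCount = 0;

int fooStub(int x1, int x2);

// Plays the role of the slot at 0x410ad0: it points to the stub at first.
Fn gotFoo = fooStub;

// Pretend dynamic linker: looks up the real address of the function.
Fn dynamicLinkerResolve() {
  ++resolveCount;
  return realFoo;
}

// Plays the role of .CPU0.stubs: resolve once, patch the slot, then call.
int fooStub(int x1, int x2) {
  gotFoo = dynamicLinkerResolve();
  return gotFoo(x1, x2);
}

// Plays the role of _Z3fooii@plt: always jump through the slot.
int callFooViaPlt(int x1, int x2) { return gotFoo(x1, x2); }
```

Calling callFooViaPlt(1, 2) goes through the stub and patches the slot, while a later callFooViaPlt(3, 4) reaches realFoo() directly, so resolveCount stays at 1.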
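To put numbers on the difference between the two call instructions, the sketch below computes the range of a signed 24-bit offset and checks whether a target can be reached from a given PC. It assumes the offset is a signed byte distance measured from the address of the jal itself, which this chapter does not spell out, and the function names are invented for illustration.

```cpp
#include <cstdint>

// Smallest and largest values of a signed two's-complement field.
constexpr std::int64_t minSignedOffset(unsigned bits) {
  return -(std::int64_t{1} << (bits - 1));
}
constexpr std::int64_t maxSignedOffset(unsigned bits) {
  return (std::int64_t{1} << (bits - 1)) - 1;
}

constexpr bool fitsInField(std::int64_t offset, unsigned bits) {
  return offset >= minSignedOffset(bits) && offset <= maxSignedOffset(bits);
}

// Assumption: jal encodes (target - pc) as a signed 24-bit byte offset.
constexpr bool reachableByJal(std::uint32_t pc, std::uint32_t target) {
  return fitsInField(static_cast<std::int64_t>(target) -
                         static_cast<std::int64_t>(pc),
                     24);
}

static_assert(maxSignedOffset(24) == 8388607, "about 8 MB forward");
static_assert(minSignedOffset(24) == -8388608, "about 8 MB backward");
```

Under these assumptions jal reaches roughly 8 MB on either side of the PC. A call from 0x00400000 into a library loaded at, say, 0x7f000000 does not fit, and needs the register-based jalr.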
118-165-77-200:InputFiles Jonathan$ clang -fPIC -g -c foobar.cpp
118-165-77-200:InputFiles Jonathan$ clang -shared -g foobar.o -o libfoobar.so
118-165-77-200:InputFiles Jonathan$ clang -g -c main.cpp
118-165-77-200:InputFiles Jonathan$ clang -g main.o libfoobar.so
118-165-77-200:InputFiles Jonathan$ gobjdump -d main.o

main.o: file format mach-o-x86-64

Disassembly of section .text:

0000000000000000 <_main>:
   0:  55                    push %rbp
   1:  48 89 e5              mov %rsp,%rbp
   4:  48 83 ec 10           sub $0x10,%rsp
   8:  bf 01 00 00 00        mov $0x1,%edi
   d:  be 02 00 00 00        mov $0x2,%esi
  12:  c7 45 fc 00 00 00 00  movl $0x0,-0x4(%rbp)
  19:  e8 00 00 00 00        callq 1e <_main+0x1e>
  1e:  bf 03 00 00 00        mov $0x3,%edi
  23:  be 04 00 00 00        mov $0x4,%esi
  28:  89 45 f8              mov %eax,-0x8(%rbp)
  2b:  e8 00 00 00 00        callq 30 <_main+0x30>
  30:  8b 75 f8              mov -0x8(%rbp),%esi
  33:  01 c6                 add %eax,%esi
  35:  89 75 f8              mov %esi,-0x8(%rbp)
  38:  e8 00 00 00 00        callq 3d <_main+0x3d>
  3d:  8b 75 f8              mov -0x8(%rbp),%esi
  40:  01 c6                 add %eax,%esi
  42:  89 75 f8              mov %esi,-0x8(%rbp)
  45:  8b 45 f8              mov -0x8(%rbp),%eax
  48:  48 83 c4 10           add $0x10,%rsp
  4c:  5d                    pop %rbp
  4d:  c3                    retq

118-165-77-200:InputFiles Jonathan$ gobjdump -d a.out
...
main.o: file format mach-o-x86-64

Disassembly of section .text:

0000000100000ef0 <_main>:
  100000ef0:  55                    push %rbp
  100000ef1:  48 89 e5              mov %rsp,%rbp
  100000ef4:  48 83 ec 10           sub $0x10,%rsp
  100000ef8:  bf 01 00 00 00        mov $0x1,%edi
  100000efd:  be 02 00 00 00        mov $0x2,%esi
  100000f02:  c7 45 fc 00 00 00 00  movl $0x0,-0x4(%rbp)
  100000f09:  e8 36 00 00 00        callq 100000f44 <__Z3fooii$stub>
  100000f0e:  bf 03 00 00 00        mov $0x3,%edi
  100000f13:  be 04 00 00 00        mov $0x4,%esi
  100000f18:  89 45 f8              mov %eax,-0x8(%rbp)
  100000f1b:  e8 24 00 00 00        callq 100000f44 <__Z3fooii$stub>
  100000f20:  8b 75 f8              mov -0x8(%rbp),%esi
  100000f23:  01 c6                 add %eax,%esi
  100000f25:  89 75 f8              mov %esi,-0x8(%rbp)
  100000f28:  e8 11 00 00 00        callq 100000f3e <__Z3barv$stub>
  100000f2d:  8b 75 f8              mov -0x8(%rbp),%esi
  100000f30:  01 c6                 add %eax,%esi
  100000f32:  89 75 f8              mov %esi,-0x8(%rbp)
  100000f35:  8b 45 f8              mov -0x8(%rbp),%eax
  100000f38:  48 83 c4 10           add $0x10,%rsp
  100000f3c:  5d                    pop %rbp
  100000f3d:  c3                    retq

Disassembly of section __TEXT.__stubs:

0000000100000f3e <__Z3barv$stub>:
  100000f3e:  ff 25 cc 00 00 00     jmpq *0xcc(%rip)  # 100001010 <__Z3barv$stub>

0000000100000f44 <__Z3fooii$stub>:
  100000f44:  ff 25 ce 00 00 00     jmpq *0xce(%rip)  # 100001018 <__Z3fooii$stub>

Disassembly of section __TEXT.__stub_helper:

0000000100000f4c <__TEXT.__stub_helper>:
  100000f4c:  4c 8d 1d b5 00 00 00  lea 0xb5(%rip),%r11  # 100001008 <>
  100000f53:  41 53                 push %r11
  100000f55:  ff 25 a5 00 00 00     jmpq *0xa5(%rip)  # 100001000 <dyld_stub_binder$stub>
  100000f5b:  90                    nop
  100000f5c:  68 00 00 00 00        pushq $0x0
  100000f61:  e9 e6 ff ff ff        jmpq 100000f4c <__Z3fooii$stub+0x8>
  100000f66:  68 0f 00 00 00        pushq $0xf
  100000f6b:  e9 dc ff ff ff        jmpq 100000f4c <__Z3fooii$stub+0x8>

Disassembly of section __TEXT.__unwind_info:
 ...

118-165-77-200:InputFiles Jonathan$ lldb a.out
Current executable set to 'a.out' (x86_64).
(lldb) run main
Process 702 launched: '/Users/Jonathan/test/lbd/docs/BackendTutorial/LLVMBackendTutorialExampleCode/InputFiles/a.out' (x86_64)
Process 702 exited with status = 15 (0x0000000f)
(lldb) b main
Breakpoint 1: where = a.out`main + 25 at main.cpp:7, address = 0x0000000100000f09
(lldb) target stop-hook add
Enter your stop hook command(s). Type 'DONE' to end.
> disassemble --pc
> DONE
Stop hook #1 added.
(lldb) run
Process 705 launched: '/Users/Jonathan/test/lbd/docs/BackendTutorial/LLVMBackendTutorialExampleCode/InputFiles/a.out' (x86_64)
dyld`_dyld_start:
-> 0x7fff5fc01028:  popq %rdi
   0x7fff5fc01029:  pushq $0
   0x7fff5fc0102b:  movq %rsp, %rbp
   0x7fff5fc0102e:  andq $-16, %rsp
Process 753 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f09 a.out`main + 25 at main.cpp:7, stop reason = breakpoint 1.1
    frame #0: 0x0000000100000f09 a.out`main + 25 at main.cpp:7
   4
   5    int main()
   6    {
-> 7      int a = foo(1, 2);
   8      a += foo(3, 4);
   9      a += bar();
   10
a.out`main + 25 at main.cpp:7:
-> 0x100000f09:  callq 0x100000f44  ; symbol stub for: foo(int, int)
   0x100000f0e:  movl $3, %edi
   0x100000f13:  movl $4, %esi
   0x100000f18:  movl %eax, -8(%rbp)
(lldb) stepi
Process 753 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f44 a.out`foo(int, int), stop reason = instruction step into
    frame #0: 0x0000000100000f44 a.out`foo(int, int)
a.out`symbol stub for: foo(int, int):
-> 0x100000f44:  jmpq *206(%rip)  ; (void *)0x0000000100000f66
a.out`symbol stub for: foo(int, int):
-> 0x100000f44:  jmpq *206(%rip)  ; (void *)0x0000000100000f66
   0x100000f4a:  addb %al, (%rax)
   0x100000f4c:  addb %al, (%rax)
   0x100000f4e:  addb %al, (%rax)
(lldb) p $rip
(unsigned long) $1 = 4294971204
(lldb) memory read/4xw 4294971410
0x100001012: 0x00010000 0x0f660000 0x00010000 0x00000000
(lldb) stepi
Process 859 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f66 a.out, stop reason = instruction step into
    frame #0: 0x0000000100000f66 a.out
-> 0x100000f66:  pushq $15
   0x100000f6b:  jmpq 0x100000f4c
-> 0x100000f66:  pushq $15
   0x100000f6b:  jmpq 0x100000f4c
   0x100000f70:  addb %al, (%rax)
   0x100000f72:  addb %al, (%rax)
(lldb) stepi
Process 859 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f6b a.out, stop reason = instruction step into
    frame #0: 0x0000000100000f6b a.out
-> 0x100000f6b:  jmpq 0x100000f4c
-> 0x100000f6b:  jmpq 0x100000f4c
   0x100000f70:  addb %al, (%rax)
   0x100000f72:  addb %al, (%rax)
   0x100000f74:  addb %al, (%rax)
(lldb) stepi
Process 859 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f4c a.out, stop reason = instruction step into
    frame #0: 0x0000000100000f4c a.out
-> 0x100000f4c:  leaq 181(%rip), %r11  ; (void *)0x0000000000000000
   0x100000f53:  pushq %r11
   0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
-> 0x100000f4c:  leaq 181(%rip), %r11  ; (void *)0x0000000000000000
   0x100000f53:  pushq %r11
   0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
(lldb)
Process 859 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f53 a.out, stop reason = instruction step into
    frame #0: 0x0000000100000f53 a.out
-> 0x100000f53:  pushq %r11
   0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
   0x100000f5c:  pushq $0
-> 0x100000f53:  pushq %r11
   0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
   0x100000f5c:  pushq $0
(lldb)
Process 859 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f55 a.out, stop reason = instruction step into
    frame #0: 0x0000000100000f55 a.out
-> 0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
   0x100000f5c:  pushq $0
   0x100000f61:  jmpq 0x100000f4c
-> 0x100000f55:  jmpq *165(%rip)  ; (void *)0x00007fff978da878: dyld_stub_binder
   0x100000f5b:  nop
   0x100000f5c:  pushq $0
   0x100000f61:  jmpq 0x100000f4c
(lldb)
Process 859 stopped
* thread #1: tid = 0x1c03, 0x00007fff978da878 libdyld.dylib`dyld_stub_binder, stop reason = instruction step into
    frame #0: 0x00007fff978da878 libdyld.dylib`dyld_stub_binder
libdyld.dylib`dyld_stub_binder:
-> 0x7fff978da878:  pushq %rbp
   0x7fff978da879:  movq %rsp, %rbp
   0x7fff978da87c:  subq $192, %rsp
   0x7fff978da883:  movq %rdi, (%rsp)
libdyld.dylib`dyld_stub_binder:
-> 0x7fff978da878:  pushq %rbp
   0x7fff978da879:  movq %rsp, %rbp
   0x7fff978da87c:  subq $192, %rsp
   0x7fff978da883:  movq %rdi, (%rsp)
(lldb) cont
Process 753 resuming
Process 753 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f1b a.out`main + 43 at main.cpp:8, stop reason = breakpoint 2.1
    frame #0: 0x0000000100000f1b a.out`main + 43 at main.cpp:8
   5    int main()
   6    {
   7      int a = foo(1, 2);
-> 8      a += foo(3, 4);
   9      a += bar();
   10
   11     return a;
a.out`main + 43 at main.cpp:8:
-> 0x100000f1b:  callq 0x100000f44  ; symbol stub for: foo(int, int)
   0x100000f20:  movl -8(%rbp), %esi
   0x100000f23:  addl %eax, %esi
   0x100000f25:  movl %esi, -8(%rbp)
(lldb) stepi
Process 753 stopped
* thread #1: tid = 0x1c03, 0x0000000100000f44 a.out`foo(int, int), stop reason = instruction step into
    frame #0: 0x0000000100000f44 a.out`foo(int, int)
a.out`symbol stub for: foo(int, int):
-> 0x100000f44:  jmpq *206(%rip)  ; (void *)0x0000000100003f20: foo(int, int) at /Users/Jonathan/test/lbd/docs/BackendTutorial/LLVMBackendTutorialExampleCode/InputFiles/foobar.cpp:3
a.out`symbol stub for: foo(int, int):
-> 0x100000f44:  jmpq *206(%rip)  ; (void *)0x0000000100003f20: foo(int, int) at /Users/Jonathan/test/lbd/docs/BackendTutorial/LLVMBackendTutorialExampleCode/InputFiles/foobar.cpp:3
   0x100000f4a:  addb %al, (%rax)
   0x100000f4c:  addb %al, (%rax)
   0x100000f4e:  addb %al, (%rax)
(lldb) p $rip
(unsigned long) $2 = 4294971204
(lldb) memory read/4xw 4294971410
0x100001012: 0x00010000 0x3f200000 0x00010000 0x00000000
(lldb) stepi
Process 753 stopped
* thread #1: tid = 0x1c03, 0x0000000100003f20 libfoobar.so`foo(x1=0, x2=3) at foobar.cpp:3, stop reason = instruction step into
    frame #0: 0x0000000100003f20 libfoobar.so`foo(x1=0, x2=3) at foobar.cpp:3
   1
   2    int foo(int x1, int x2)
-> 3    {
   4      int sum = x1 + x2;
   5
   6      return sum;
libfoobar.so`foo(int, int) at foobar.cpp:3:
-> 0x100003f20:  pushq %rbp
   0x100003f21:  movq %rsp, %rbp
   0x100003f24:  movl %edi, -4(%rbp)
   0x100003f27:  movl %esi, -8(%rbp)
(lldb)
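The two memory read commands in the trace can be checked by hand. On x86-64 a RIP-relative displacement is added to the address of the next instruction, so the slot used by jmpq *206(%rip) at 0x100000f44 (a 6-byte instruction, ff 25 ce 00 00 00) is 0x100001018, which matches the gobjdump comment. The trace reads from 4294971410 = 0x100001012, which is $rip + 206 without the instruction length, so the pointer starts 6 bytes into the dump. The sketch below redoes this arithmetic; the function names are invented for illustration.

```cpp
#include <array>
#include <cstdint>

// Address referenced by a RIP-relative operand: the displacement is
// relative to the address of the *next* instruction.
constexpr std::uint64_t ripRelativeTarget(std::uint64_t instrAddr,
                                          unsigned instrLen,
                                          std::int32_t disp) {
  return instrAddr + instrLen + static_cast<std::int64_t>(disp);
}

// Rebuild an 8-byte little-endian pointer from four 32-bit words as shown
// by "memory read/4xw", starting `byteOffset` bytes into the dump.
// Requires byteOffset <= 8.
inline std::uint64_t pointerInDump(const std::array<std::uint32_t, 4> &words,
                                   unsigned byteOffset) {
  std::uint8_t bytes[16];
  for (unsigned w = 0; w < 4; ++w)
    for (unsigned b = 0; b < 4; ++b)
      bytes[4 * w + b] = static_cast<std::uint8_t>(words[w] >> (8 * b));

  std::uint64_t value = 0;
  for (unsigned i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(bytes[byteOffset + i]) << (8 * i);
  return value;
}

static_assert(ripRelativeTarget(0x100000f44ull, 6, 206) == 0x100001018ull,
              "slot address shown in the gobjdump comment");
```

Applied to the first dump, pointerInDump gives 0x100000f66 (the stub helper entry), and applied to the second dump it gives 0x100003f20 (foo in libfoobar.so), in line with the addresses lldb annotates on the jmpq instruction before and after binding.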
[1] | http://en.wikipedia.org/wiki/Executable_and_Linkable_Format |
[2] | http://jonathan2251.github.com/lbd/install.html#install-other-tools-on-imac |
[3] | Leland Beck, System Software: An Introduction to Systems Programming. |
[4] | http://ccckmit.wikidot.com/lk:aout |
[5] | http://ccckmit.wikidot.com/lk:objfile |
[6] | http://ccckmit.wikidot.com/lk:elf |
[7] | http://lld.llvm.org/ |
[8] | http://developer.mips.com/clang-llvm/ |