The GNU Toolchain is the de facto standard of the IT industry and has been improved by extensive open source contributions. This session covers the mechanisms of the compiler driver, its interaction with the system (using GNU/Linux as the example), the linker, the C runtime library, and the related dynamic linker. Rather than analyzing the system design, the session is use-case driven and builds up progressively.
1. FROM SOURCE TO BINARY
How GNU Toolchain Works
從原始碼到二進制 / ソースからバイナリへ / Von Quelle Zu Binären / De source au binaire / Desde fuente a binario / Binarium ut a fonte
Luse Cheng, Deputy Manager, Andes Technology
Jim Huang ( 黃敬群 ) <jserv@0xlab.org>, Developer & Co-founder, 0xlab
March 31, 2011 / National Taipei University of Technology ( 臺北科技大學 )
46. Binutils – GNU Linker
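What the linker does (for an ordinary statically linked executable)
Combine all object files into a single executable
Search high and low for every symbol (symbol resolution)
Handle everything by the book (process each relocation type)
[Figure: the TEXT and DATA sections of several object files go into the LINKER, which emits one merged TEXT section and one merged DATA section.]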
47. Dynamically Linked Shared Libraries
[Figure: build and load flow]
  m.c, a.c
    -> Translators (cc1, as)
    -> m.o, a.o
    -> Linker (ld), together with libc.so (shared library: dynamically relocatable object files)
    -> Partially linked executable program (on disk)
    -> Loader / Dynamic Linker (ld-linux.so), together with libc.so
    -> Fully linked executable (in memory): Program'
  In-memory image labels: arg vector, main( ), printf( ), ...
The libc.so functions called by m.c and a.c are loaded, linked, and (potentially) shared among processes.
$ ldd hello
libc.so.6 =>
/lib/ld-linux.so.2 (0x00524000)
48. Relocatable Object Files -> Executable Object File
Source files:
  m.c:
    int e=7;
    int main() {
      int r = a();
      exit(0);
    }
  a.c:
    extern int e;
    int *ep=&e, x=15, y;
    int a() {
      return *ep+x+y;
    }
Relocatable object files:
  (system): system code (.text), system data (.data)
  m.o: main() (.text), int e = 7 (.data)
  a.o: a() (.text), int *ep = &e and int x = 15 (.data), int y (.bss)
Executable object file (starting at 0):
  headers
  .text: system code, main(), a(), more system code
  .data: system data, int e = 7, int *ep = &e, int x = 15
  .bss: uninitialized data
  .symtab
  .debug
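The split between .data and .bss on this slide can be checked with a small sketch. It puts the slide's globals and a() into one file for convenience (an assumption; the slide keeps them in m.c and a.c), and the comments describe where a typical GNU/Linux toolchain is expected to place each object.

```c
#include <assert.h>

int e = 7;      /* initialized: stored in .data */
int *ep = &e;   /* initialized with an address: .data, patched by a relocation */
int x = 15;     /* initialized: stored in .data */
int y;          /* no initializer: typically ends up in .bss, zero-filled at load */

/* Same body as a() on the slide. */
int a(void)
{
    assert(ep != 0);        /* ep was set up by the linker/loader */
    return *ep + x + y;     /* 7 + 15 + 0 */
}
```

Because y occupies no space in the file and is zero-filled when the program is loaded, a() evaluates to 22.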
49. Every symbol is assigned a specific value, generally a memory address
Code consists of symbol definitions and references
References are either local or external
  m.c:
    int e=7;              /* definition of local symbol e */
    int main() {
      int r = a();        /* reference to external symbol a */
      exit(0);            /* reference to external symbol exit */
    }
  a.c:
    extern int e;
    int *ep=&e;           /* definition of local symbol ep; reference to external symbol e */
    int x=15;             /* definition of local symbol x */
    int y;                /* definition of local symbol y */
    int a() {             /* definition of local symbol a */
      return *ep+x+y;     /* references to local symbols x, y */
    }
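A minimal sketch of symbol resolution over the m.c / a.c example. The record type `struct symbol`, the function `resolve`, and the table `example_symbols` are hypothetical names used only for illustration; the real ELF symbol table (`Elf32_Sym`) carries more fields, and ld also applies rules for weak and duplicate symbols that are not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol record (not the real Elf32_Sym). */
struct symbol {
    const char *name;
    const char *module;   /* object file that mentions the symbol */
    int defined;          /* 1 = definition, 0 = reference only */
};

/* Return the module that defines `name`, or NULL if no input defines it. */
const char *resolve(const struct symbol *tab, size_t n, const char *name)
{
    assert(tab != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].defined && strcmp(tab[i].name, name) == 0)
            return tab[i].module;
    return NULL;
}

/* Definitions and references taken from the slide's m.c and a.c. */
const struct symbol example_symbols[] = {
    { "e",    "m.o", 1 }, { "main", "m.o", 1 },
    { "a",    "m.o", 0 }, { "exit", "m.o", 0 },
    { "e",    "a.o", 0 }, { "ep",   "a.o", 1 },
    { "x",    "a.o", 1 }, { "y",    "a.o", 1 },
    { "a",    "a.o", 1 },
};
const size_t example_count =
    sizeof example_symbols / sizeof example_symbols[0];
```

With only m.o and a.o as inputs, `a` resolves to a.o and `e` to m.o, while `exit` stays unresolved until a library that defines it (libc) is added to the link.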
50. GNU Linker - ld
m.c
  int e=7;
  int main() {
    int r = a();
    exit(0);
  }
Disassembly of section .text:
00000000 <main>:
   0: 55               pushl %ebp
   1: 89 e5            movl %esp,%ebp
   3: e8 fc ff ff ff   call 4 <main+0x4>
        4: R_386_PC32 a              (relocation info)
   8: 6a 00            pushl $0x0
   a: e8 fc ff ff ff   call b <main+0xb>
        b: R_386_PC32 exit           (relocation info)
   f: 90               nop
Disassembly of section .data:
00000000 <e>:
   0: 07 00 00 00
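The R_386_PC32 entries above tell the linker to overwrite the four placeholder bytes `fc ff ff ff` with S + A - P, where S is the symbol's final address, A is the addend already stored in the field (0xfffffffc, i.e. -4, because the CPU measures the displacement from the end of the 4-byte field), and P is the address of the field itself. The sketch below applies that formula; `apply_pc32` and the addresses used with it are hypothetical, not what ld would actually choose.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian word, the byte order i386 uses. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Write a 32-bit little-endian word. */
void write_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* Apply one R_386_PC32 relocation: field = S + A - P (mod 2^32).
   section_addr is where the section will live in memory,
   offset is the relocation offset inside the section,
   symbol_addr is the resolved address of the target symbol. */
void apply_pc32(uint8_t *section, size_t section_len,
                uint32_t section_addr, uint32_t offset,
                uint32_t symbol_addr)
{
    assert(section != NULL && (size_t)offset + 4 <= section_len);
    uint32_t addend = read_le32(section + offset);  /* A */
    uint32_t place = section_addr + offset;         /* P */
    write_le32(section + offset, symbol_addr + addend - place);
}
```

For example, if main's .text were placed at 0x1000 and a() at 0x1010 (both made-up addresses), the field at offset 4 becomes 0x00000008: the instruction after the call starts at 0x1008, and 0x1008 + 8 reaches 0x1010.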
52. What the Linker Has to Do
What the linker knows:
The length of each .text and .data section
The order of the .text and .data sections
What the linker computes:
The absolute address of each label to be jumped to (internal or external) and of each piece of data being referenced
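The computation described above can be sketched as follows. The sketch assumes all .text sections are laid out back to back from a base address, with all .data sections immediately after them and no alignment padding; a real linker follows a linker script and alignment rules. `struct module_sizes`, `struct label`, and `absolute_address` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum section_kind { SEC_TEXT, SEC_DATA };

/* What the linker knows about each input module: its section lengths. */
struct module_sizes {
    uint32_t text_len;
    uint32_t data_len;
};

/* A label or data item, located by module, section, and offset. */
struct label {
    size_t module;            /* index into the ordered module list */
    enum section_kind sec;
    uint32_t offset;          /* offset inside that module's section */
};

/* Absolute address = base + lengths of everything placed before it + offset.
   Assumption: DATA follows TEXT directly, with no padding. */
uint32_t absolute_address(const struct module_sizes *mods, size_t n_mods,
                          uint32_t text_base, const struct label *lab)
{
    assert(mods != NULL && lab != NULL && lab->module < n_mods);
    uint32_t text_total = 0, text_before = 0, data_before = 0;
    for (size_t i = 0; i < n_mods; i++) {
        if (i < lab->module) {
            text_before += mods[i].text_len;
            data_before += mods[i].data_len;
        }
        text_total += mods[i].text_len;
    }
    if (lab->sec == SEC_TEXT)
        return text_base + text_before + lab->offset;
    return text_base + text_total + data_before + lab->offset;
}
```

Taking m.o with 16 bytes of .text and 4 bytes of .data (as in the disassembly of main and e), a made-up a.o with 32 bytes of .text and 8 bytes of .data, and a made-up base of 0x1000, the function a() lands at 0x1010 and the variable e at 0x1030.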
53. ELF (Executable and Linkable Format)
ELF header: magic number, type (.o / .so / exec), machine, byte order
Program header table: page size, virtual address, memory segment (sections), segment size, …
File layout, starting at offset 0:
  ELF header
  Program header table (required for executables)
  .text section: code
  .data section: initialized (static) data
  .bss section: uninitialized (static) data; "Block Started by Symbol"; has a section header but occupies no space
  .symtab
  .rel.text
  .rel.data
  .debug
  Section header table (required for relocatables)
At run time only the left-hand column is needed:
  ELF header
  Program header table (required for executables)
  .text section
  .data section
  .bss section
Sections that are not needed can be removed with the "strip" command.
Note: .dynsym is still kept.
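The ELF header fields listed on this slide (magic number, type, machine, byte order) sit at fixed offsets at the start of the file, so they can be read from the first 20 bytes. The sketch below defines the few constants it needs itself, with values from the ELF specification (on Linux they are also available from `<elf.h>`); `struct elf_summary`, `summarize_elf`, and `elf_type_name` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets and values from the ELF specification. */
enum { EI_CLASS = 4, EI_DATA = 5, EI_NIDENT = 16 };
enum { ELFDATA2LSB = 1, ELFDATA2MSB = 2 };
enum { ET_REL = 1, ET_EXEC = 2, ET_DYN = 3 };

/* Hypothetical summary of the fields named on the slide. */
struct elf_summary {
    int is_elf;         /* 1 if the magic number matched */
    int elf_class;      /* 1 = 32-bit, 2 = 64-bit */
    int big_endian;     /* byte order of the file */
    uint16_t type;      /* ET_REL, ET_EXEC, ET_DYN, ... */
    uint16_t machine;   /* e.g. 3 for Intel 80386 */
};

/* Read a 16-bit field in the file's own byte order. */
uint16_t read_u16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)(p[0] << 8 | p[1])
                      : (uint16_t)(p[1] << 8 | p[0]);
}

/* Inspect the first bytes of a file image; is_elf stays 0 on mismatch. */
struct elf_summary summarize_elf(const uint8_t *buf, size_t len)
{
    struct elf_summary s;
    memset(&s, 0, sizeof s);
    assert(buf != NULL);
    /* "\x7f" and "ELF" are split so the hex escape does not swallow 'E'. */
    if (len < EI_NIDENT + 4 || memcmp(buf, "\x7f" "ELF", 4) != 0)
        return s;
    s.is_elf = 1;
    s.elf_class = buf[EI_CLASS];
    s.big_endian = (buf[EI_DATA] == ELFDATA2MSB);
    s.type = read_u16(buf + EI_NIDENT, s.big_endian);         /* e_type */
    s.machine = read_u16(buf + EI_NIDENT + 2, s.big_endian);  /* e_machine */
    return s;
}

/* Map e_type to the three kinds named on the slide. */
const char *elf_type_name(uint16_t type)
{
    switch (type) {
    case ET_REL:  return "relocatable (.o)";
    case ET_EXEC: return "executable";
    case ET_DYN:  return "shared object (.so)";
    default:      return "other";
    }
}
```

To try it, read the first 20 bytes of a file such as m.o into a buffer and pass them to `summarize_elf`; a relocatable i386 object should report type ET_REL and machine 3.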
60. Relocation: Platform-Specific Implementation
glibc elf/dynamic-link.h
/* This can't just be an inline function because GCC is too dumb
   to inline functions containing inlines themselves.  */
# define ELF_DYNAMIC_RELOCATE(map, lazy, consider_profile) \
  do { \
    int edr_lazy = elf_machine_runtime_setup ((map), (lazy), \
                                              (consider_profile)); \
    ELF_DYNAMIC_DO_REL ((map), edr_lazy); \
    ELF_DYNAMIC_DO_RELA ((map), edr_lazy); \
  } while (0)
glibc sysdeps/i386/dl-machine.h
/* Set up the loaded object described by L so its unrelocated PLT
   entries will jump to the on-demand fixup code in dl-runtime.c.  */
static inline int __attribute__ ((unused, always_inline))
elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
{
  :
  got[1] = (Elf32_Addr) l;  /* Identify this shared object.  */
  :
  got[2] = (Elf32_Addr) &_dl_runtime_resolve;  /* the ELF resolver */
  :
}
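The effect of pointing unrelocated PLT entries at the resolver can be imitated in plain C with a function pointer that starts out aimed at a resolver stub and is patched on the first call. This is only a simulation of the idea: the real mechanism uses got[1], got[2], and assembly trampolines as in the glibc excerpt, and every name below (`got_slot`, `resolver_stub`, `real_double`, `call_via_plt`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* Stand-in for a function that lives in a shared library. */
int real_double(int v)
{
    return 2 * v;
}

int resolver_stub(int v);   /* defined below */

/* One-entry stand-in for the GOT: initially points at the resolver. */
func_t got_slot = resolver_stub;

/* Counts how often the slow resolution path runs. */
int resolve_count = 0;

/* Stand-in for _dl_runtime_resolve: look up the target, patch the
   slot so later calls bypass the resolver, then complete the call. */
int resolver_stub(int v)
{
    resolve_count++;
    got_slot = real_double;
    return got_slot(v);
}

/* Stand-in for a PLT entry: always jumps through the GOT slot. */
int call_via_plt(int v)
{
    assert(got_slot != NULL);
    return got_slot(v);
}
```

The first `call_via_plt` goes through the resolver once; every later call jumps straight to the target, which is the point of lazy binding.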
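62. Summary
Small words, deep meaning
How the Hello World program starts up
How the Hello World program is compiled
Compiler Driver
Three major steps
Five major stages, three key techniques
Compile, assemble, link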
63. References
Linkers and Loaders, John R. Levine, 2000
程序員的自我修養 – 連結、載入與程式庫 (A Programmer's Self-Cultivation: Linking, Loading, and Libraries)
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
http://llvm.org/pubs/2004-01-30-CGO-LLVM.html
GCC, the GNU Compiler Collection
http://gcc.gnu.org/
GNU Binutils
http://www.gnu.org/software/binutils/
Embedded GLIBC
http://www.eglibc.org/home