Tutorial on Essential Abstractions in GCC
Concluding Remarks
Uday Khedker
(www.cse.iitb.ac.in/grc)
GCC Resource Center,
Department of Computer Science and Engineering, Indian Institute of Technology, Bombay
April 2011
Outline
• Essential Abstractions in GCC
• About GCC Resource Center
Part 1
Essential Abstractions in GCC
Compilation Models
Aho Ullman Model
Davidson Fraser Model
Front End AST Optimizer Target Indep. IR
Code Generator Target Program
Front End AST Expander Register Transfers
Optimizer Register Transfers
Recognizer Target Program Aho Ullman: Instruction selection
• over optimized IR using
• cost based tree pattern matching Davidson Fraser: Instruction selection
• over AST using
• structural tree pattern matching
• naive code which is
◮
target dependent, and is
◮
optimized subsequently
The GNU Tool Chain
gcc Source Program
Target Program
cc1 cpp
cc1 cpp
as
ld
glibc/newlib
GCC
The Architecture of GCC
Language Specific
Code
Language and Machine Independent Generic Code
Machine Dependent
Generator Code
Machine Descriptions Compiler Generation Framework
Parser Gimplifier Tree SSA Optimizer
RTL
Generator Optimizer Code Generator Generated Compiler (cc1)
Source Program Assembly Program
Input Language Target Name
Selected Copied Copied
Generated
Generated
Development Time
Build Time
Time Use
Configuring GCC
configure config.guess
configure.in config/*
config.sub
config.log config.cache config.status
config.h.in Makefile.in
Makefile config.h
Bootstrapping: The Conventional View
C
n−1C
n−2m/c C
nC
n−1m/c
input language output language
implementation language
Level n C
A Native Build on i386
Requirement: BS = HS = TS = i386
• Stage 1 build compiled using cc
• Stage 2 build compiled using gcc
• Stage 3 build compiled using gcc
• Stage 2 and Stage 3 Builds must be identical for a successful native build GCC
Source
C i386
i386
cc C i386
C i386
i386
gcc
Stage 1 Build
C i386
i386
gcc
Stage 2 Build
C i386
i386
gcc
Stage 3 Build
Build for a Given Machine This is what actually happens!
• Generation
◮
Generator sources
($(SOURCE D)/gcc/gen*.c) are read and generator executables are created in
$(BUILD)/gcc/build
◮
MD files are read by the generator executables and back end source code is generated in $(BUILD)/gcc
• Compilation
Other source files are read from
$(SOURCE D) and executables created in corresponding subdirectories of $(BUILD)
• Installation
Created executables and libraries are copied in $(INSTALL)
genattr
gencheck
genconditions
genconstants
genflags
genopinit
genpreds
genattrtab
genchecksum
gencondmd
genemit
gengenrtl
genmddeps
genoutput
genrecog
genautomata
gencodes
genconfig
genextract
gengtype
genmodes
genpeep
More Details of an Actual Stage 1 Build for C
native cc + native binutils GCC sources
libraries
libiberty
fixincl
gen*
cc1 cpp
xgcc libgcc target
binutils
cc + binutils
for stage 2
Basic Transformations in GCC
Tranformation from a language to a different language
Target Independent Target Dependent
Parse Gimplify Tree SSA Optimize
Generate
RTL Optimize RTL Generate
ASM
GIMPLE → RTL RTL → ASM
RTL Passes
GIMPLE Passes
Instruction Specification and Translation: A Recap
Target Independent Target Dependent
Parse Gimplify Tree SSA Optimize
Generate
RTL Optimize RTL Generate
ASM
GIMPLE
→RTL RTL
→ASM
•
GIMPLE: target independent
•
RTL: target dependent
• Need: associate thesemantics
⇒
GCC Solution:
Standard Pattern Names GIMPLE ASSIGNRTL Template ASM
(define_insn "movsi"
[(set (match_operand 0 "register_operand" "r") (match_operand 1 "const_int_operand" "k"))]
"" /* C boolean expression, if required */
"mov %0, %1"
)
Translation Sequence in GCC
(define_insn
"movsi"
[(set
(match_operand 0 "register_operand" "r") (match_operand 1 "const_int_operand" "k") )]
"" /* C boolean expression, if required */
"li %0, %1"
)
D.1283 = 10;
(set
(reg:SI 58 [D.1283]) (const int 10: [0xa]) )
li $t0, 10
D e ve lopm e nt U se
Retargetability Mechanism of GCC
Language Specific
Code
Language and Machine Independent Generic Code
Machine Dependent
Generator Code
Machine Descriptions Compiler Generation Framework
Input Language Target Name
Parser Gimplifier Tree SSA Optimizer
RTL
Generator Optimizer Code Generator Selected Copied
Copied
Generated
Generated
Generated Compiler
Development Time
Build Time
Use Time
GIMPLE → PN + PN → IR-RTL
+ IR-RTL → ASM
GIMPLE → IR-RTL +
IR-RTL → ASM
Plugin Structure in cc1
toplev main
front end
pass manager
pass 1 pass 2
. . .
pass expand
. . .
pass n
code for pass 2 code for pass 1
code for pass n expander code
optab table langhook
. . . code for language 1 code for language 2 code for language n
insn data generated code for machine 1
MD 1
MD 2
MD n
Hooking up Back End Details
optab table
. . . .
OTI mov
mov optab
handler
SI insn code CODE FOR movsi SF insn code
CODE FOR nothing
$(SOURCE)/gcc/optabs.h
$(SOURCE)/gcc/optabs.c $(BUILD)/gcc/insn-output.c insn data
. . . . . . 1280
"movsi"
. . .
gen movsi . . .
$BUILD/gcc/insn-codes.h CODE FOR movsi=1280
CODE FOR movsf=CODE FOR nothing
$BUILD/gcc/insn-opinit.c ...
Runtime initialization
of data structure
Part 2
Genesis and Objectives of GRC
How Did It All Begin?
• An Informal Group
• A Desire
How Did It All Begin?
• An Informal Group
◮
CSE faculty members at IITB: Uday Khedker Amitabha Sanyal Supratim Biswas
◮
Reasonably long and deep experience of research in compilers
• A Desire
How Did It All Begin?
• An Informal Group
◮
CSE faculty members at IITB: Uday Khedker Amitabha Sanyal Supratim Biswas
◮
Reasonably long and deep experience of research in compilers
• A Desire
Performing research grounded in theory and corroborated by empirical evidence
◮
Exploring research issues in real compilers
◮
Demonstrating the relevance and effectiveness of our research in real
compilers
A Modest Start in 2003. . .
• Our Tool of Experiment
• Our Guinea Pigs
A Modest Start in 2003. . .
• Our Tool of Experiment The GNU Compiler Collection
◮
Compiler generation framework
Stable compiler generated for several dozen targets
◮
Millions of users
• Our Guinea Pigs
A Modest Start in 2003. . .
• Our Tool of Experiment The GNU Compiler Collection
◮
Compiler generation framework
Stable compiler generated for several dozen targets
◮
Millions of users
• Our Guinea Pigs
Several unsuspecting M.Tech. students, external B.E. students, and
project engineers
And Then in 2007. . .
And then in 2008. . .
Thanks to small seed grants from IITB and IBM Faculty Award. . .
And then in 2008. . .
Finally in 2009. . .
A generous grant from the Department of Information Technology, Ministry of Communication and Information Technology, Gov. of India.
• National Resource Center for F/OSS, Phase II
• Participating agencies:
CDAC Chennai (coordinating agency), CDAC Mumbai, CDAC Hyderabad, IIT Bombay, IIT Madras, Anna University
• IIT Bombay’s focus: GCC
2009. . .
July 2009. . .
July 2010. . .
Objectives of GCC Resource Center
1. To support the open source movement
Providing training and technical know-how of the GCC framework to academia and industry.
2. To include better technologies in GCC
Whole program optimization, Optimizer generation, Tree tiling based instruction selection.
3. To facilitate easier and better quality deployments/enhancements of GCC
Restructuring GCC and devising methodologies for systematic construction of machine descriptions in GCC.
4. To bridge the gap between academic research and practical implementation
Designing suitable abstractions of GCC architecture
Broad Areas of Interests
• Program Analysis and Optimization
• Translation Validation
• Retargetable compilation
• Parallelization and Vectorization for SIMD and MIMD Architectures
General explorations applied in the context of GCC
Broad Research Goals of GCC Resource Center
• Using GCC as a means
◮
Adding new optimizations to GCC
◮
Adding flow and context sensitive analyses to GCC (In particular, pointer analysis)
◮
Translation validation of GCC
• Using GCC as an end in itself
◮
Changing the retargetability mechanism of GCC
◮
Cleaning up the machine descriptions of GCC
◮
Systematic construction of machine descriptions
◮