Multi-Tier Compilation
Multi-tier compilation smooths the warmup curve in JIT compilers.
In HotSpot, C1 is the first-tier compiler and C2 is the second-tier compiler.
In GraalVM, C1 is the first-tier compiler and Graal is the second-tier compiler.
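The warmup curve can be observed by timing repeated runs of the same hot method. The sketch below is only an illustration: the class `WarmupProbe`, the method names and the workload are hypothetical, and a real measurement would use a proper benchmark harness.

```java
final class WarmupProbe {
    // Hypothetical hot method: a small arithmetic loop the JIT can optimize.
    static long workload(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (long) i * i % 7;
        }
        return acc;
    }

    // Runs the workload `iterations` times and records the wall-clock time of each run.
    static long[] timeIterations(int iterations, int n) {
        long[] nanos = new long[iterations];
        long sink = 0;
        for (int k = 0; k < iterations; k++) {
            long start = System.nanoTime();
            sink += workload(n);
            nanos[k] = System.nanoTime() - start;
        }
        // Use the accumulated result so the workload cannot be discarded as dead code.
        if (sink == 42) {
            System.out.println(sink);
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] times = timeIterations(20, 1_000_000);
        for (int k = 0; k < times.length; k++) {
            System.out.println(k + "\t" + times[k] + " ns");
        }
    }
}
```

Early iterations typically run in the interpreter or in first-tier code, later ones in second-tier code. On HotSpot, running with `-XX:TieredStopAtLevel=1` restricts compilation to C1, which should give a different curve from the default tiered setup.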
Graal as the first-tier compiler?
Graal IR
Based on "sea-of-nodes".
Graal Compiler
Frontend: high, mid and low parts.
Backend: low-level optimizations, register allocation, code generation.
Graal Frontend (High Part)
IR nodes are close to Java bytecode.
Bytecode parsing
Escape analysis and read elimination
Tail duplication
Loop unswitching, peeling, unrolling
First lowering
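Loop unswitching from the list above, shown at source level. This is a hand-written sketch, not Graal output: the compiler performs the transformation on its IR, and the class and method names here are hypothetical.

```java
final class LoopUnswitchingSketch {
    // Before: the loop-invariant condition `negate` is tested on every iteration.
    static long before(int[] data, boolean negate) {
        long sum = 0;
        for (int v : data) {
            if (negate) {
                sum -= v;
            } else {
                sum += v;
            }
        }
        return sum;
    }

    // After: the invariant test is hoisted out and the loop is duplicated,
    // so each copy has a branch-free body.
    static long after(int[] data, boolean negate) {
        long sum = 0;
        if (negate) {
            for (int v : data) {
                sum -= v;
            }
        } else {
            for (int v : data) {
                sum += v;
            }
        }
        return sum;
    }
}
```

The price is code size: the loop body exists twice, which is one reason such optimizations are candidates for removal in a faster, leaner compiler configuration.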
Partial Escape Analysis and Inlining
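A source-level sketch of what partial escape analysis achieves. The names (`PartialEscapeSketch`, `Point`, `cache`) are hypothetical, and the "after" method is written by hand to mimic the effect; the compiler does this on the IR, usually after inlining has exposed the allocation.

```java
import java.util.ArrayList;
import java.util.List;

final class PartialEscapeSketch {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Shared list through which a Point can escape the method.
    static final List<Point> cache = new ArrayList<>();

    // Before: the Point is always allocated, although it escapes only when `publish` is true.
    static int before(int x, int y, boolean publish) {
        Point p = new Point(x, y);
        if (publish) {
            cache.add(p);
        }
        return p.x + p.y;
    }

    // After: the allocation is pushed into the escaping branch;
    // on the other path the fields are replaced by plain scalars.
    static int after(int x, int y, boolean publish) {
        if (publish) {
            cache.add(new Point(x, y));
        }
        return x + y;
    }
}
```

Ordinary escape analysis would give up here because the object escapes on some path; the partial variant still removes the allocation on the path where it does not.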
Tail Duplication and Simplification
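Tail duplication copies the code that follows a control-flow merge into each predecessor branch, where it can then be simplified with branch-specific knowledge. A hand-written, source-level sketch with hypothetical names:

```java
final class TailDuplicationSketch {
    // Before: both branches merge, then a shared tail re-tests the merged value.
    static int before(Object o) {
        int len;
        if (o instanceof String) {
            len = ((String) o).length();
        } else {
            len = -1;
        }
        // Tail after the merge.
        if (len >= 0) {
            return len * 2;
        }
        return 0;
    }

    // After: the tail is duplicated into each branch and simplified there.
    static int after(Object o) {
        if (o instanceof String) {
            // A string length is never negative, so the check folds to true.
            return ((String) o).length() * 2;
        } else {
            // len is the constant -1 here, so the check folds to false.
            return 0;
        }
    }
}
```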
Graal Frontend (Mid Part)
Optimizations unrelated to bytecode.
Lock elimination
Adding and removing loop-safepoints
Replacing guards with deopt stubs
Frame-state assignment
Partial loop unrolling
Deoptimization grouping
Second lowering
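One form of lock elimination from the list above, sketched at source level: when a monitor exit is immediately followed by a monitor enter on the same object, the pair can be dropped and the two regions merged. The class below is hypothetical and hand-written; it assumes this adjacent-pair case and does not cover other lock optimizations.

```java
final class LockEliminationSketch {
    private final Object lock = new Object();
    private int a;
    private int b;

    // Before: two back-to-back critical sections on the same lock.
    void before() {
        synchronized (lock) {
            a++;
        }
        synchronized (lock) {
            b++;
        }
    }

    // After: the exit/enter pair between the sections is removed,
    // leaving one critical section.
    void after() {
        synchronized (lock) {
            a++;
            b++;
        }
    }

    int sum() {
        synchronized (lock) {
            return a + b;
        }
    }
}
```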
Replacing guards with stubs
Frame-State Assignment
Graal Frontend (Low Part)
Mostly lowering and scheduling.
Third lowering
Floating reads reduction
Dead code elimination
Scheduling to basic blocks
Scheduling of Nodes
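In a sea-of-nodes IR, value nodes float freely and are constrained only by their data dependencies, so scheduling has to put them into a concrete order. The sketch below is a strong simplification with hypothetical names: it orders an acyclic node graph so that every node comes after its inputs, and ignores basic-block placement, control flow and any earliest/latest placement strategy.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class SchedulingSketch {
    // A floating node: a name plus the nodes whose values it consumes.
    static final class Node {
        final String name;
        final List<Node> inputs;
        Node(String name, Node... inputs) {
            this.name = name;
            this.inputs = List.of(inputs);
        }
    }

    // Returns the names of all nodes reachable from `root`,
    // ordered so that each node appears after all of its inputs.
    static List<String> schedule(Node root) {
        Set<Node> visited = new LinkedHashSet<>();
        visit(root, visited);
        List<String> order = new ArrayList<>();
        for (Node n : visited) {
            order.add(n.name);
        }
        return order;
    }

    // Post-order depth-first walk; assumes the graph is acyclic.
    private static void visit(Node n, Set<Node> visited) {
        if (visited.contains(n)) {
            return;
        }
        for (Node in : n.inputs) {
            visit(in, visited);
        }
        visited.add(n);
    }
}
```

A shared input such as `a` in `(a + b) * a` is emitted once and reused, which is the point of keeping values as graph nodes instead of re-evaluating expressions.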
Graal Economy Frontend
High tier: bytecode parsing, simplification, lowering
Mid tier: adding loop-safepoints, guards-to-deopts, lowering, frame-state assignment
Low tier: lowering, logic lowering, scheduling
Compare peak performance, warmup and code size.
Warmup comparisons are tricky.
The Graal codebase itself is currently always compiled with the first-tier compiler!
Compare C1, C1+C2, C1+GraalCE, C1+GraalEE, C1+GraalEconomy.
DaCapo and Scalabench suites.
Evaluation Conclusions
Graal Economy (Eco) has faster warmup than Graal even in C1+Eco mode.
On DaCapo, C1 and Eco are within 1-2x of peak performance.
On Scalabench, C1 and Eco are within 2.5-5.5x of peak performance.
On 9 out of 22 benchmarks, C1 is faster by less than 20%.
On 5 out of 22 benchmarks, Eco is faster by less than 20%.
Code size is reasonable (close to C1).
Future Steps
Tuning: enabling optimizations, changing register allocation, etc.
Replacing snippets with runtime stubs
To solve the warmup problem, compile Graal ahead-of-time (libgraal)
Graal Economy can replace C1.
Thank you.