Presentation from JVMLS 2015
One bottleneck in the Nashorn JavaScript engine is startup time. Nashorn, as it currently works in Java 8, JITs everything to Java bytecode, accruing overhead in code generation and class installation. Nashorn in Java 9 can, in unfortunate cases, increase this compilation workload significantly, as the new optimistic type system, which has greatly increased steady-state performance, requires more code invalidation during warmup. Based on our optimistic type compilation framework, which contains all the mechanisms for quick code replacement and on-stack replacement on the bytecode level, I will present the new execution architecture we are developing. It will minimize compile time intelligently, while maintaining or possibly even increasing code performance, due to extra profiling and execution frequency information being passed to the JIT. I will also talk about what the future will bring in terms of other dynamic languages on the Nashorn engine, partial method compilation of hot paths, and other intriguing possibilities that our new execution model opens up.
4. Safe Harbor Statement
"THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE."
@lagergren
6. Agenda
• Dynamic languages on the JVM – Why?
• Nashorn
• Performance
• Optimistic Types, Steady State Performance
• Startup and Warmup Performance
• Nashorn Java 9 execution architecture
• Future Work
13. Why “Alien” Languages on the JVM?
• Automatic memory management
• State of the art JIT optimizations
• Native threading capability
• Hybridization
– (javax.script, JSR-223)
• Man decades of high tech
– Already in the JVM
14. Why “Alien” Languages on the JVM?
[Bar chart: lines of code per code base; axis from 0 to 500,000 lines]
16. Nashorn Goals, 2010-
• Invokedynamic POC
• Compliant JavaScript runtime
• Open
• Fast*
• Hybrid
– JSR-223, javax.script
• Extensible
* At first comparable in performance to native implementations, in domains where it matters.
17. Long Term Nashorn Goals
• Toolbox for other dynamic languages on top of the JVM
– Dynalink
– TypeScript thesis
– JRuby 9000 synergy
– “the invokedynamic way of language implementation”
20. What Does “Performance” Mean?
Total Performance = Execution Time + Runtime Overhead
[JavaScript & native time – brought down by: invokedynamic optimizations, incremental inlining, field access time minimization, efficient native code implementation, type specialization, optimistic type guesses, JIT optimizations of bytecode]
21. What Does “Performance” Mean?
Total Performance = Execution Time + Runtime Overhead
The goal of 8u60 (main functionality, like --optimistic-types, still disabled by default)
22. What Does “Performance” Mean?
Total Performance = Execution Time + Runtime Overhead
The stretch goal for 9
23. What Does “Performance” Mean?
Total Performance = Execution Time + Runtime Overhead
[increase BC/Nashorn jit speed, minimize relinking of callsite/bytecode regeneration, tiered JIT recompilation, class installation speed, time spent in GC etc – MAKE STARTUP FASTER]
24. What Does “Performance” Mean?
Total Performance = Execution Time + Warmup/Steady State Overhead + Runtime Overhead
Especially important: Time to reach steady state!
Use cases: frequent restarts, REPL, redeployments, evals
25. What Does “Performance” Mean?
Total Performance = Execution Time + Warmup/Steady State Overhead + Runtime Overhead
Especially important: Time to reach steady state!
8u60 has lazy compilation & code caching that helps some for runs n…, n > 1
26. Generating Code That Runs Faster: Optimistic Types
[for even more in depth info, see my JVMLS presentations from 2013 and 2014]
28. Attack Execution Time
• invokedynamic implementation
– JVM
– java.lang.invoke implementation
• Boxing, boxing everywhere
– In the libraries
– Representation of generated code
– Insufficient escape analysis, or even opportunities for it
32. Optimistic Types
function() {
    return a + b;
}
try {
    operation; // get a, get b or iadd
} catch (final UnwarrantedOptimismException e) {
    throw new RewriteException(e.getLocalVariables(), e.getProgramPoint());
}
33. Optimistic Types
• Use whatever static types there are
• Guess the rest
• Take a continuation and recompile if wrong
[Diagram: type ladder int, long, double, Object (pessimistic)]
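The guess-and-widen idea on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: `OptimisticAdd`, `TooNarrow`, and the three `add*` methods are hypothetical names, not Nashorn APIs, and the `long` step is left out for brevity. Where the real engine throws from generated bytecode and recompiles the function, the sketch simply re-executes the operation with the next wider type.

```java
final class OptimisticAdd {
    /** Hypothetical signal that a type guess was too narrow; not a Nashorn class. */
    static final class TooNarrow extends RuntimeException {}

    /** Most optimistic guess: both operands and the sum are ints. */
    static int addInt(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            int x = (Integer) a;
            int y = (Integer) b;
            try {
                return Math.addExact(x, y);
            } catch (ArithmeticException overflow) {
                throw new TooNarrow(); // the sum does not fit in an int
            }
        }
        throw new TooNarrow(); // at least one operand is not an int
    }

    /** Wider guess: any two numbers, added as doubles. */
    static double addDouble(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return ((Number) a).doubleValue() + ((Number) b).doubleValue();
        }
        throw new TooNarrow();
    }

    /** Pessimistic fallback: treat everything as Object (here: string concatenation). */
    static Object addObject(Object a, Object b) {
        return String.valueOf(a) + String.valueOf(b);
    }

    /** Tries the narrowest version first and widens one step per failed guess. */
    static Object add(Object a, Object b) {
        try {
            return addInt(a, b);
        } catch (TooNarrow tooNarrowForInt) {
            try {
                return addDouble(a, b);
            } catch (TooNarrow tooNarrowForDouble) {
                return addObject(a, b);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));                 // stays in the int version
        System.out.println(add(Integer.MAX_VALUE, 1)); // int guess fails, double is used
        System.out.println(add("a", 1));               // both guesses fail, Object is used
    }
}
```

The cost the later slides worry about is visible even here: every failed guess means running (and in the real engine, generating) another version of the same code.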
34. Optimistic Types
• Retain primitive storage if possible
– Dual fields, later VarHandles/TaggedArrays
– Method specialization
• Add specialized versions of native methods
35. Optimistic Types
@Function(arity = 2,
          attributes = Attribute.NOT_ENUMERABLE,
          where = Where.CONSTRUCTOR)
public static double max(final Object self, final Object... args) {
    switch (args.length) {
    case 0:
        return Double.NEGATIVE_INFINITY;
    case 1:
        return JSType.toNumber(args[0]);
    default:
        double res = JSType.toNumber(args[0]);
        for (int i = 1; i < args.length; i++) {
            res = Math.max(res, JSType.toNumber(args[i]));
        }
        return res;
    }
}
39. Optimistic Types
@SpecializedFunction
public static int max(final Object self, final int x, final int y) {
    return Math.max(x, y);
}
@SpecializedFunction
public static long max(final Object self, final long x, final long y) {
    return Math.max(x, y);
}
@SpecializedFunction
public static double max(final Object self, final double x, final double y) {
    return Math.max(x, y);
}
@SpecializedFunction
public static double max(final Object self, final Object x, final Object y) {
    return Math.max(JSType.toNumber(x), JSType.toNumber(y));
}
42. The Cost of Steady State Performance
Startup and warmup time until steady state.
Bytecode generation time.
Memory usage / GC overhead.
43. Startup Time With and Without Optimistic Types
[Bar chart comparing 8u60 (jit) and 9 (jit); axis from 0 to 4]
44. Bytecodes generated during Startup With and Without Optimistic Types
[Bar chart comparing 8u60 (jit) and 9 (jit); axis from 0 to 20]
45. # classes generated during Startup With and Without Optimistic Types
[Bar chart comparing 8u60 (jit) and 9 (jit); axis from 0 to 4.5]
46. Nightmare Use Cases
• Dynamically evaluated throwaway code
• Lots of relinking
– Lots of type invalidation (special case of the above)
function mtd() {
    // x is an Object, but starts out assumed int
    // 9 times, 9 indy call sites,
    // 9 continuations, 9 recompilations
    //
    // Or x is just a getter with side effects or
    // whatever – JavaScript: anything goes
    return x * x * x * x * x * x * x * x * x;
}
51. Stable Runtime Performance
• Optimistic types definitely get us runtime performance in steady state
• But they also make HotSpot unfeasibly slow to warm up
– (and also: the bigger the method, the worse the optimizations)
53. Startup & Warmup Overhead
• Observation: most type guesses are invalidated once, and are then correct forever
– ~95-99% of the time in Octane
• Do we really need to generate so much new code?
54. Startup & Warmup Overhead
• Observation: even if callsites aren’t monomorphic, for small polymorphism a guard tree still works fine
• Gets rid of most relinks
if (propertyMap == propertyMap1) {
    …
} else if (propertyMap == propertyMap2) {
    …
…
} else {
    … megamorphic slow dispatch
}
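A guard tree like the one on this slide can be composed from standard method handles with `MethodHandles.guardWithTest`. The following is a minimal sketch, not Nashorn's linker code: `GuardTree`, `Shape` (standing in for a property map), `Obj`, and the getter methods are all hypothetical names introduced for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class GuardTree {
    /** Hypothetical stand-in for a property map; identity is all that matters. */
    static final class Shape {}

    /** Hypothetical script object: a shape plus a single value. */
    static final class Obj {
        final Shape shape;
        final Object value;
        Obj(Shape shape, Object value) {
            this.shape = shape;
            this.value = value;
        }
    }

    static boolean hasShape(Shape expected, Obj o) { return o.shape == expected; }
    static Object fastGet(String label, Obj o) { return label + ":" + o.value; }
    static Object slowGet(Obj o) { return "megamorphic:" + o.value; }

    /** Builds: if shape1 then fast1, else if shape2 then fast2, else slow dispatch. */
    static MethodHandle build(Shape s1, Shape s2) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle test = lookup.findStatic(GuardTree.class, "hasShape",
                    MethodType.methodType(boolean.class, Shape.class, Obj.class));
            MethodHandle fast = lookup.findStatic(GuardTree.class, "fastGet",
                    MethodType.methodType(Object.class, String.class, Obj.class));
            MethodHandle slow = lookup.findStatic(GuardTree.class, "slowGet",
                    MethodType.methodType(Object.class, Obj.class));
            // The innermost guard is built first and becomes the fallback of the outer one.
            MethodHandle inner = MethodHandles.guardWithTest(
                    test.bindTo(s2), fast.bindTo("map2"), slow);
            return MethodHandles.guardWithTest(
                    test.bindTo(s1), fast.bindTo("map1"), inner);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the tree without exposing the checked Throwable of MethodHandle.invoke. */
    static Object get(MethodHandle tree, Obj o) {
        try {
            return (Object) tree.invoke(o);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Adding a third shape means wrapping the existing tree in one more `guardWithTest`, so a call site that sees a handful of shapes can keep one target and avoid relinking each time a different shape arrives.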
55. Startup & Warmup Overhead
• Let’s assume that steady state performance is indeed good enough (for now)
– So for steady state, optimistic types are definitely a performance success.
• Attacking startup and warmup time
56. Previous Work
• Lazy Compilation (jdk 8u40) alleviates some of this
• Code Caching (jdk 8u40) alleviates some of this too, for runs 2..n
– Optimistic Type Caching (jdk 8u60)
57. Attacking Startup & Warmup Overhead?
• Attacking startup and warmup time
• First try: Tier the JIT?
• Profiling JIT level with pessimistic, non faulting code?
60. Tier the JIT?
• Pro: very little new code needs to be written.
• Pro: we already generate arbitrary level of pessimism on demand for RewriteExceptions
• Pro: no significant test matrix growth
61. Tier the JIT?
• Con: Type pollution
– Too wide values written to scopes
• Con: (dealbreaker) code generation overhead
– We don’t get away from the root cause of overhead
– Smallest compile unit is a class with a method
– System dictionary locks
– Byte code verification
– Various other class registration horrors
• (also a problem in lambda forms or any woven bytecode)
64. Attacking Startup & Warmup Overhead?
• It looks like JIT overhead is expensive
• We don’t get close to Rhino interpreter startup with the tiered JIT approach
• We also miss optimistic types in scope, by never using their narrowest form
– Has to be corrected afterwards
– Too slow & complex
67. AST profiling?
• Don’t add more bytecode generation
• Execute AST once?
– Collect types and use them for first JIT if called again
• Or keep executing AST until something is hot – only then send it to the JIT.
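The "execute the AST, collect types, JIT only when hot" idea can be sketched as a toy tree-walking interpreter. Everything here is an assumption made for illustration: `ProfilingInterpreter`, `Node`, `Local`, `Add`, `ScriptFn`, and the threshold value are hypothetical and unrelated to the actual Nashorn IR classes.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class ProfilingInterpreter {
    /** Hypothetical AST node that can execute itself against an array of locals. */
    interface Node {
        Object interpret(Object[] locals);
    }

    /** Reads one local variable. */
    static final class Local implements Node {
        final int index;
        Local(int index) { this.index = index; }
        public Object interpret(Object[] locals) { return locals[index]; }
    }

    /** Numeric addition that records every result type it produces. */
    static final class Add implements Node {
        final Node lhs;
        final Node rhs;
        final Set<Class<?>> seenTypes = new LinkedHashSet<>();
        Add(Node lhs, Node rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
        public Object interpret(Object[] locals) {
            Number l = (Number) lhs.interpret(locals);
            Number r = (Number) rhs.interpret(locals);
            Object result;
            if (l instanceof Integer && r instanceof Integer) {
                long sum = (long) l.intValue() + r.intValue();
                if (sum == (int) sum) {
                    result = Integer.valueOf((int) sum);
                } else {
                    result = Double.valueOf((double) sum); // overflow widens to double
                }
            } else {
                result = Double.valueOf(l.doubleValue() + r.doubleValue());
            }
            seenTypes.add(result.getClass()); // the type profile a later JIT could start from
            return result;
        }
    }

    /** Interprets its body and flags itself as hot after HOT_THRESHOLD calls. */
    static final class ScriptFn {
        static final int HOT_THRESHOLD = 2; // hypothetical value
        final Node body;
        int invocations;
        boolean hot;
        ScriptFn(Node body) { this.body = body; }
        Object call(Object... args) {
            if (++invocations >= HOT_THRESHOLD) {
                hot = true; // a real engine would hand body and profile to the JIT here
            }
            return body.interpret(args);
        }
    }
}
```

No bytecode is generated for code that runs once, and a function that does become hot arrives at the JIT with observed types instead of blind guesses.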
73. Example: WhileNode
package jdk.nashorn.internal.ir;
public class WhileNode extends LoopNode {
    @Override
    public Object interpret(final Frame frame) throws Throwable {
        final Label label = frame.getCurrentLabel();
        Object result = ScriptRuntime.UNDEFINED;
        while (test == null || JSType.toBoolean(test.interpret(frame))) {
            try {
                checkOSR(frame); // Might need to transition to JIT on backedge
                result = body.interpret(frame);
            } catch (final BreakException e) {
                if (e.matchesLabel(label)) {
                    break;
                }
                throw e; // not my break
            } catch (final ContinueException e) {
                if (e.matchesLabel(label)) {
                    continue;
                }
                throw e; // not my continue
            }
        }
        return result;
    }
}
74. CompiledFunction
• Represents one version of a JavaScript or Native function
– Contains invoker method handle
– Potentially also constructor method handle
– Is of a certain type (e.g. specialized on params or generic)
• A ScriptFunctionData has >= 0 CompiledFunctions.
78. Trampolines
• Basically add a CompiledFunction subclass
– InterpretedFunction
– CallNode.interpret returns a ScriptFunction that is really a trampoline
– Trampoline: “interpret yourself” when invoked
• Add a Node.interpret(Frame frame) method to the IR Node
– Frame is an interpreter state (locals/scope)
80. Duplicated Functionality?
• But then we need to implement a lot of code that already exists in the JIT
• Just think of reimplementing link logic for all indy calls!!!
• And that’s just one thing!
81. Duplicated Functionality?
• But then we need to implement a lot of code that already exists in the JIT
• Just think of reimplementing link logic for all indy calls!!!
– Native calls, JavaScript calls, different guards, nested receiver checks, call site reuse
– Even through Dynalink
– Infeasible to duplicate codegen logic for this
– And the testing! My god!
84. Reuse Link Logic
• Reuse link logic?
protected GuardedInvocation findSetMethod(final CallSiteDescriptor desc, final LinkRequest request) {
    final String name = desc.getNameToken(CallSiteDescriptor.NAME_OPERAND);
    if (request.isCallSiteUnstable() || hasWithScope()) {
        return findMegaMorphicSetMethod(desc, name);
    }
    final boolean explicitInstanceOfCheck = explicitInstanceOfCheck(desc, request);
    /*
     * If doing property set on a scope object, we should stop proto search on the first
     * non-scope object. Without this, for example, when assigning "toString" on global scope,
     * we'll end up assigning it on its proto - which is Object.prototype.toString !!
     *
     * toString = function() { print("global toString"); } // don't affect Object.prototype.toString
     */
    FindProperty find = findProperty(name, true, this);
    // If it's not a scope search, then we don't want any inherited properties except those with user defined accessors.
    if (find != null && find.isInherited() && !(find.getProperty() instanceof UserAccessorProperty)) {
        // We should still check if inherited data property is not writable
        if (isExtensible() && !find.getProperty().isWritable()) {
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "property.not.writable", true);
        }
        // Otherwise, forget the found property unless this is a scope callsite and the owner is a scope object as well.
        if (!NashornCallSiteDescriptor.isScope(desc) || !find.getOwner().isScope()) {
            find = null;
        }
    }
    if (find != null) {
        if (!find.getProperty().isWritable() && !NashornCallSiteDescriptor.isDeclaration(desc)) {
            if (NashornCallSiteDescriptor.isScope(desc) && find.getProperty().isLexicalBinding()) {
                throw typeError("assign.constant", name); // Overwriting ES6 const should throw also in non-strict mode.
            }
            // Existing, non-writable property
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "property.not.writable", true);
        }
    } else {
        if (!isExtensible()) {
            return createEmptySetMethod(desc, explicitInstanceOfCheck, "object.non.extensible", false);
        }
    }
    final GuardedInvocation inv = new SetMethodCreator(this, find, desc, request).createGuardedInvocation(findBuiltinSwitchPoint(name));
    final GlobalConstants globalConstants = getGlobalConstants();
    if (globalConstants != null) {
        final GuardedInvocation cinv = globalConstants.findSetMethod(find, this, inv, desc, request);
        if (cinv != null) {
            return cinv;
        }
    }
    return inv;
}
Example: linking a set method for a ScriptObject
85. Link Logic
• InterpreterAccessor
– IndexNode, AccessNode, IdentNode
– get, set methods (take a Frame)
– interpret calls get
– Lookup delegates to findGetMethod, findSetMethod
• InterpreterCall
– Lookup delegates to findCallMethod, findNewMethod
• InterpreterCallable
– interpret, creates/gets a ScriptFunction
• Trampolined to call invoke, possibly wormhole
– call method (actual invocation)
86. Reusing Link Logic
package jdk.nashorn.internal.ir;
public class AccessNode extends BaseNode {
    @Override
    public Object interpret(final Frame frame) throws Throwable {
        interpreterEnter(frame);
        try {
            return get(frame);
        } finally {
            interpreterLeave(frame);
        }
    }
    @Override
    public Object get(final Frame frame, final Object interpretedBase) throws Throwable {
        try {
            // lookupGetter uses existing ScriptObject/Dynalink link logic (set is analogous)
            final CallSite cs = lookupGetter(frame, getterType(), interpretedBase, 0);
            final Object value = cs.getTarget().invokeExact(interpretedBase);
            return interpreterReturn(frame, value);
        } catch (final ECMAException e) {
            if (e.hasScriptStackTrace()) { // has stack trace been rewritten
                throw e;
            }
            throw e.rewriteStackTrace(frame);
        }
    }
}
87. Interpreter?
• The link logic reuse actually makes us end up with relatively little new code!
• For most logic, we can just use ScriptObject and ScriptRuntime functions that already exist for slow cases
• And add type narrowing
88. Interpreter Speed; Observations
• Startup is significantly faster (even early in the project)
• But execution is 5-100 times slower than executing optimized warmed up written code
• We get automatic type profiles before JITting
• We need to transition from interpreted to JITted code fairly quickly
– Time-to-steady-state must not suffer from fast startup
89. Transitioning to JIT; Determinism
• Right now we are using “number of invocations” as the only JIT metric
– No explicit bytecode – MH return value filter counter
• Tests are then deterministic
• We are rather aggressive in transferring to JIT code as
– It doesn’t take long to do a stable type profile (1-2 executions)
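One plausible reading of "MH return value filter counter" is that invocations are counted by wrapping the function's method handle with `MethodHandles.filterReturnValue`, so no counting bytecode has to be emitted. The sketch below shows that technique with standard `java.lang.invoke` calls only; `InvocationCounter`, `countAndReturn`, and the `twice` target are hypothetical names, and whether Nashorn wires its counter exactly this way is an assumption.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.concurrent.atomic.AtomicInteger;

final class InvocationCounter {
    /** Identity filter on the return value that bumps a counter as a side effect. */
    static Object countAndReturn(AtomicInteger counter, Object result) {
        counter.incrementAndGet();
        return result;
    }

    /** Hypothetical function body used as the wrapped target. */
    static Object twice(Object x) {
        return ((Integer) x) * 2;
    }

    /** Returns a handle to twice() whose completed invocations are counted. */
    static MethodHandle countedTwice(AtomicInteger counter) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle target = lookup.findStatic(InvocationCounter.class, "twice",
                    MethodType.methodType(Object.class, Object.class));
            MethodHandle filter = lookup.findStatic(InvocationCounter.class, "countAndReturn",
                    MethodType.methodType(Object.class, AtomicInteger.class, Object.class))
                    .bindTo(counter);
            // The filter runs on every normal return of the target.
            return MethodHandles.filterReturnValue(target, filter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes a handle without exposing the checked Throwable of MethodHandle.invoke. */
    static Object call(MethodHandle handle, Object arg) {
        try {
            return (Object) handle.invoke(arg);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because the filter only runs on normal return, calls that throw are not counted in this sketch. A counter driven purely by invocation count also behaves identically from run to run, which fits the slide's point about deterministic tests.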
90. Transitioning to JIT; Loops
• At n backedge executions, have the interpreter throw a RewriteException
– Contains in type map (no new code)
– Works just like in the JIT for a too wide type
• Very little code
– Logic for all this already exists in optimistic JIT
91. Transitioning to JIT; Loops
• Need OSR support
• Reuse Program Point concept from optimistic types
• Add an “invisible” optimistic program point
– LoopNode implements Optimistic
– LoopNode.getProgramPoint()
93. The Stack Trace Problem
• A stack trace is not a special case in the JIT
– All bytecode has line number and file name information
• An interpreter would contain Node.interpret methods instead of the correct JavaScript line numbers
– Need a special case to rewrite stack traces from Interpreter mode to script code
– Doable
94. The Stack Trace Problem
function f() {
    print(a);
}
function g() {
    f();
}
g();
95. The Stack Trace Problem
stacktrace.js:2 ReferenceError: "a" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:66)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:331)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:304)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2400)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:2005)
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1866)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.scripts.Script$Recompilation$3$stacktrace.f(stacktrace.js:2)
at jdk.nashorn.internal.scripts.Script$Recompilation$2$31$stacktrace.g(stacktrace.js:5)
at jdk.nashorn.internal.scripts.Script$Recompilation$1$stacktrace.:program(stacktrace.js:7)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:772)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:267)
at jdk.nashorn.internal.runtime.ScriptRuntime.applyThrow(ScriptRuntime.java:434)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:411)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
96. The Stack Trace Problem
stacktrace.js:4 ReferenceError: "a" is not defined
at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:66)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:331)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:304)
at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
at jdk.nashorn.internal.runtime.ScriptObject.noSuchProperty(ScriptObject.java:2400)
at jdk.nashorn.internal.runtime.ScriptObject.findGetMethod(ScriptObject.java:2005)
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.nashorn.internal.runtime.ScriptObject.lookup(ScriptObject.java:1866)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:104)
at jdk.nashorn.internal.runtime.linker.NashornLinker.getGuardedInvocation(NashornLinker.java:98)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.ir.IdentNode.get(IdentNode.java:422)
at jdk.nashorn.internal.ir.IdentNode.interpret(IdentNode.java:400)
at jdk.nashorn.internal.runtime.interpreter.Interpreter.interpret(Interpreter.java:395)
at jdk.nashorn.internal.ir.CallNode$1.interpretArguments(CallNode.java:473)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:625)
at jdk.nashorn.internal.ir.CallNode$1.invokeCallSite(CallNode.java:511)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at java.lang.invoke.MethodHandle.invokeWithArguments(MethodHandle.java:625)
at jdk.nashorn.internal.ir.CallNode$1.invokeCallSite(CallNode.java:511)
at jdk.nashorn.internal.ir.CallNode$1.execute(CallNode.java:449)
at jdk.nashorn.internal.runtime.interpreter.ExceptionInterpreterOperation.run(ExceptionInterpreterOperation.java:32)
at jdk.nashorn.internal.ir.CallNode.interpret(CallNode.java:640)
at jdk.nashorn.internal.ir.BinaryNode.interpret(BinaryNode.java:633)
at jdk.nashorn.internal.ir.ExpressionStatement.interpret(ExpressionStatement.java:102)
at jdk.nashorn.internal.ir.Block.interpret(Block.java:653)
at jdk.nashorn.internal.ir.FunctionNode.invoke(FunctionNode.java:1486)
at jdk.nashorn.internal.ir.FunctionNode.call(FunctionNode.java:1519)
at jdk.nashorn.internal.runtime.interpreter.InterpreterCallable.wormholeInterpreterCall(InterpreterCallable.java:101)
at jdk.nashorn.internal.scripts.Wormhole$=stacktrace,js.wormholeInterpreterCall(stacktrace.js:4)
at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:763)
at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:267)
at jdk.nashorn.internal.runtime.ScriptRuntime.applyThrow(ScriptRuntime.java:434)
at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:411)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
97. The Stack Trace Problem
stacktrace.js:2 ReferenceError: "a" is not defined
at jdk.nashorn.internal.objects.Global.findGetMethod(Global.java:2428)
at jdk.internal.dynalink.support.CompositeTypeBasedGuardingDynamicLinker.getGuardedInvocation(CompositeTypeBasedGuardingDynamicLinker.java:176)
at jdk.internal.dynalink.support.CompositeGuardingDynamicLinker.getGuardedInvocation(CompositeGuardingDynamicLinker.java:124)
at jdk.internal.dynalink.support.LinkerServicesImpl.getGuardedInvocation(LinkerServicesImpl.java:154)
at jdk.internal.dynalink.DynamicLinker.relink(DynamicLinker.java:253)
at jdk.nashorn.internal.scripts.Script$Interpreted$.f(stacktrace.js:2)
at jdk.nashorn.internal.scripts.Script$Interpreted$.g(stacktrace.js:5)
at jdk.nashorn.internal.scripts.Script$Interpreted$.:program(stacktrace.js:7)
at jdk.nashorn.tools.Shell.apply(Shell.java:410)
at jdk.nashorn.tools.Shell.runScripts(Shell.java:339)
at jdk.nashorn.tools.Shell.run(Shell.java:172)
at jdk.nashorn.tools.Shell.main(Shell.java:136)
at jdk.nashorn.tools.Shell.main(Shell.java:112)
99. The Stack Trace Problem
• Any NativeError trace passed out of the interpreter needs to be trapped in e.g. FunctionNode.interpret and potentially rewritten
• nasgen tool needs to support Interpreter state in some cases, NativeError
– Added @needsInterpreterFrame=[true|false] annotation
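Rewriting a trace so that interpreter frames are replaced by script frames can be done with the standard `Throwable.getStackTrace` and `setStackTrace` methods. The sketch below is a simplified illustration, not the engine's rewriting logic: `TraceRewriter` is a hypothetical helper, the package prefix used to recognise interpreter frames is an assumption taken from the traces shown earlier, and the script frames are supplied by the caller rather than derived from interpreter state.

```java
import java.util.ArrayList;
import java.util.List;

final class TraceRewriter {
    /** Assumed marker for frames that belong to the AST interpreter. */
    static final String INTERPRETER_PREFIX = "jdk.nashorn.internal.ir.";

    /**
     * Drops every interpreter frame and puts the given script frames
     * where the first interpreter frame used to be.
     */
    static <T extends Throwable> T rewrite(T error, List<StackTraceElement> scriptFrames) {
        List<StackTraceElement> rewritten = new ArrayList<>();
        boolean inserted = false;
        for (StackTraceElement frame : error.getStackTrace()) {
            if (frame.getClassName().startsWith(INTERPRETER_PREFIX)) {
                if (!inserted) {
                    rewritten.addAll(scriptFrames);
                    inserted = true;
                }
                continue; // hide the Node.interpret machinery
            }
            rewritten.add(frame);
        }
        error.setStackTrace(rewritten.toArray(new StackTraceElement[0]));
        return error;
    }
}
```

A fuller version would also have to hide helper frames outside the IR package, such as the `MethodHandle.invokeWithArguments` and `jdk.nashorn.internal.runtime.interpreter` entries visible in the unrewritten trace.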
102. Security Problems
• The JIT code uses a MethodHandles.lookup() reachable only from the jdk.nashorn.internal.scripts package.
• If we use a lookup from the IR package, it is too privileged
– The Interpreter needs a Source.getLookup() method that returns the restricted lookup.
– Lookup is written to Source, whenever we enter a source we haven’t found before, through a “wormhole” in the script package.
• One wormhole method per Source is all that’s required
103. Security Problems
public static wormholeInterpreterCall(InterpreterCallable;Frame;ScriptFunction;Object;[Object;)Object;
0 aload 0
1 invokeinterface InterpreterCallable.getSource()Source;
6 dup
// CHECK IF SOURCE ALREADY HAS LOOKUP
7 invokevirtual Source.hasLookup()Z
10 ifne 22
// NO – GET ONE WITH SCRIPT PACKAGE PRIVILEGE
13 invokestatic MethodHandles.lookup()MethodHandles$Lookup;
16 invokevirtual Source.setLookup(MethodHandles$Lookup;)V
19 goto 23
22 pop
// MARSHALL PARAMETERS TO READ INVOCATION
23 aload 0
24 aload 1
25 aload 2
26 aload 3
27 aload 4
29 invokestatic InterpreterCallable.wormholeInterpreterCall(InterpreterCallable;Frame;ScriptFunction;Object;[Object;)Object;
32 areturn
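The bytecode above reads roughly like the following Java. This is a sketch under assumptions: Source, Frame, ScriptFunction and InterpreterCallable are reduced to minimal stand-ins so the example compiles on its own, and the invoke method on the stand-in interface is hypothetical. The key point is that MethodHandles.lookup() is caller sensitive, so the lookup stored in the Source carries the privileges of the class that contains the wormhole method.

```java
import java.lang.invoke.MethodHandles;

final class WormholeSketch {
    // Stand-in for the Nashorn Source class; only the lookup slot is modelled.
    static final class Source {
        private MethodHandles.Lookup lookup;
        boolean hasLookup() { return lookup != null; }
        void setLookup(MethodHandles.Lookup lookup) { this.lookup = lookup; }
        MethodHandles.Lookup getLookup() { return lookup; }
    }

    static final class Frame {}
    static final class ScriptFunction {}

    interface InterpreterCallable {
        Source getSource();

        // Hypothetical: the actual interpreter entry point of the callable.
        Object invoke(Frame frame, ScriptFunction fn, Object self, Object[] args);

        // Stand-in for the static target invoked at bytecode offset 29.
        static Object wormholeInterpreterCall(InterpreterCallable callable, Frame frame,
                ScriptFunction fn, Object self, Object[] args) {
            return callable.invoke(frame, fn, self, args);
        }
    }

    // Java rendering of the wormhole method shown in the bytecode listing.
    public static Object wormholeInterpreterCall(InterpreterCallable callable, Frame frame,
            ScriptFunction fn, Object self, Object[] args) {
        Source source = callable.getSource();
        if (!source.hasLookup()) {
            // The lookup class is this class, not the (more privileged) IR package.
            source.setLookup(MethodHandles.lookup());
        }
        return InterpreterCallable.wormholeInterpreterCall(callable, frame, fn, self, args);
    }
}
```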
105. Avoiding too much link time
• CallSite caching
• Can actually be used to solve problems that we don’t detect in the JIT
function f() {
    g(); // new bootstrap/lookup, store in cache
    g(); // no need to link separately, reuse g()
    g(); // -"-
    g(); // -"-
    g(); // -"-
    g = function() { return 17; } // invalidate
    g(); // new bootstrap/lookup
}
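The caching behavior in the comments above can be sketched in Java as follows. This is an illustration under assumptions, not the Nashorn implementation: CallSiteCacheSketch is a hypothetical class, callables are modelled as Supplier instead of linked MethodHandles, and linkCount only exists to make the number of bootstrap/lookup operations visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of a per-scope call site cache.
final class CallSiteCacheSketch {
    private final Map<String, Supplier<Object>> scope = new HashMap<>();
    private final Map<String, Supplier<Object>> linked = new HashMap<>();
    int linkCount; // how many times we paid for a bootstrap/lookup

    // Assigning a function (e.g. g = function() {...}) invalidates the cached link.
    void define(String name, Supplier<Object> function) {
        scope.put(name, function);
        linked.remove(name);
    }

    // Calling reuses the cached target when one exists.
    Object call(String name) {
        Supplier<Object> target = linked.get(name);
        if (target == null) {
            target = link(name);
            linked.put(name, target);
        }
        return target.get();
    }

    // Stand-in for the expensive bootstrap and lookup.
    private Supplier<Object> link(String name) {
        linkCount++;
        Supplier<Object> function = scope.get(name);
        if (function == null) {
            throw new IllegalStateException("ReferenceError: \"" + name + "\" is not defined");
        }
        return function;
    }
}
```

Five consecutive calls to g cost one link. Redefining g costs one more on the next call.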
106. Runtime overhead
• Add known set of specialized, non invalidated CallSites already linked in current scope
• InterpreterAccessor
– isCachedCallSite(…)
• InterpreterCall
– isCachedCallSite(…)
107. “Code Shape Overhead” – Program Points
• Nashorn concept: program point
• Used to identify a program point in a method when re-JITTING?
• A lot of JIT only transforms change code shape
– Splitting, Inlining Finallies, Folding (interpreter doesn’t need that)
– Program points are assigned very late
108. “Code Shape Overhead” – Program Points
• Should avoid JIT only transforms in interpreted mode for speed
– Splitting, Lowering etc
• Still need to preserve code shape to correctly map program point information
• Alternative “fuzzier” program point representation
– Tuple (# in expression, source position)?
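One way to picture the tuple idea is a small value class used as a map key. This is a sketch under assumptions: FuzzyProgramPointSketch, ProgramPoint and the record/lookup methods are hypothetical names. Since the key depends only on the source text, it would stay valid whether or not the JIT only transforms have run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical model of a "fuzzier" program point: (# in expression, source position).
final class FuzzyProgramPointSketch {
    static final class ProgramPoint {
        final int indexInExpression;
        final int sourcePosition;

        ProgramPoint(int indexInExpression, int sourcePosition) {
            this.indexInExpression = indexInExpression;
            this.sourcePosition = sourcePosition;
        }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof ProgramPoint)) {
                return false;
            }
            ProgramPoint p = (ProgramPoint) other;
            return indexInExpression == p.indexInExpression && sourcePosition == p.sourcePosition;
        }

        @Override
        public int hashCode() {
            return Objects.hash(indexInExpression, sourcePosition);
        }
    }

    private final Map<ProgramPoint, Class<?>> observedTypes = new HashMap<>();

    // Called by the interpreter when it observes a type at a program point.
    void record(int indexInExpression, int sourcePosition, Class<?> type) {
        observedTypes.put(new ProgramPoint(indexInExpression, sourcePosition), type);
    }

    // Called by the JIT to pick up what the interpreter saw; null if nothing was recorded.
    Class<?> lookup(int indexInExpression, int sourcePosition) {
        return observedTypes.get(new ProgramPoint(indexInExpression, sourcePosition));
    }
}
```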
109. Background Processing
• This is a main strength with two code execution environments
• We can do (even speculative) JITting in the background early
• Even (non explicit) multithreaded if we want
– Balancing heuristics
• java.util.concurrent.Future<CompiledFunction>
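A minimal sketch of how a Future<CompiledFunction> could tie the two execution environments together: keep interpreting until the background compile has finished, then switch. The names here are assumptions made for illustration. CompiledFunction is reduced to a one-method interface, and interpret and compile are trivial stand-ins for the interpreter and the JIT.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BackgroundJitSketch {
    // Stand-in for Nashorn's CompiledFunction.
    interface CompiledFunction {
        int invoke(int arg);
    }

    // Daemon thread so a pending compile never keeps the VM alive.
    private final ExecutorService compilerPool = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "background-jit");
        t.setDaemon(true);
        return t;
    });

    private Future<CompiledFunction> pending;
    int interpretedCalls;
    int compiledCalls;

    private static int interpret(int x) { return x * 2; }        // slow path stand-in
    private static CompiledFunction compile() { return x -> x * 2; } // JIT stand-in

    int call(int x) {
        if (pending == null) {
            // Start compiling speculatively on first use.
            Callable<CompiledFunction> task = BackgroundJitSketch::compile;
            pending = compilerPool.submit(task);
        }
        if (pending.isDone()) {
            compiledCalls++;
            return awaitCompilation().invoke(x);
        }
        interpretedCalls++;
        return interpret(x);
    }

    // Blocks until the background compile has finished.
    CompiledFunction awaitCompilation() {
        try {
            return pending.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The caller never blocks on the compiler in call. Whether a given call runs interpreted or compiled depends on timing.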
111. Current Results
• Tests are clean
• We are JavaScript compliant in “interpreter only” and “mixed” modes.
• Startup performance is significantly better than before
• Initial footprint / code generation time is much lower
– It is important to go to JIT quickly
– Type info is usually already correct
112. JEP
• A JEP is coming, and is moving through the process
• Will be made public shortly
119. Call and Allocation Site Profiling
• Execution Overhead
– Allocation site profiling
– Call site profiling
120. Allocation Profiling
• Simple to do
• Eliminate optimistic data structure invalidations
function vector() {
    return new Array(); // defaults to optimistic int
}
var vectors = [];
for (var i = 0; i < 1e6; i++) {
    var v = vector();
    v.push("string"); // invalidates the optimistic int representation
    vectors.push(v);
}
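A sketch of the idea in Java, under assumptions: AllocationSiteSketch, Site, ProfiledArray and the three-level Kind are hypothetical and far simpler than Nashorn's ArrayData. The allocation site remembers the widest representation its arrays have needed, so later arrays from the same site start out in that representation and skip the invalidation.

```java
// Hypothetical model of allocation site profiling for array representations.
final class AllocationSiteSketch {
    // Ordered from narrowest to widest representation.
    enum Kind { INT, DOUBLE, OBJECT }

    static final class Site {
        Kind preferred = Kind.INT; // optimistic default

        ProfiledArray allocate() {
            return new ProfiledArray(this, preferred);
        }
    }

    static final class ProfiledArray {
        private final Site site;
        Kind kind;
        int conversions; // number of representation invalidations on this array

        ProfiledArray(Site site, Kind kind) {
            this.site = site;
            this.kind = kind;
        }

        void push(Object value) {
            Kind needed = kindOf(value);
            if (needed.ordinal() > kind.ordinal()) {
                kind = needed; // widen the backing representation
                conversions++;
                if (needed.ordinal() > site.preferred.ordinal()) {
                    site.preferred = needed; // feed the profile back to the site
                }
            }
            // Element storage is omitted; only the representation is modelled.
        }
    }

    static Kind kindOf(Object value) {
        if (value instanceof Integer) {
            return Kind.INT;
        }
        if (value instanceof Number) {
            return Kind.DOUBLE;
        }
        return Kind.OBJECT;
    }
}
```

In the loop above, only the first array from the site would pay for a conversion.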
121. Partial Compilation
• Partial compilation
• Reuse compiled Nodes*
– Hang on to MethodHandles by “Signature”
• Parameter types
• Parameter values
– A lot less code
* Potential path profile pollution issues that need to be worked around, similar to those in LambdaForm caching
122. Partial Compilation
package jdk.nashorn.internal.ir;
public abstract class Node {
    protected WeakHashMap<Signature, MethodHandle> code;
    // …
    public abstract Object interpret(Frame frame);
    // …
}
124. Partial Compilation
• Enables partial evaluation
– with MethodHandles.constant and MethodHandle combinators
– Truffle style behavior without requiring a modified VM
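A small sketch of partial evaluation with MethodHandles.constant and a combinator. The assumptions are stated up front: PartialEvalSketch is a hypothetical class, the "node" is just integer addition via Math.addExact, and the cache is keyed by the known parameter value in a plain HashMap instead of the WeakHashMap<Signature, MethodHandle> shown on the Node slide.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

final class PartialEvalSketch {
    // Generic form of the node: (int, int) -> int.
    private static final MethodHandle ADD;

    static {
        try {
            ADD = MethodHandles.lookup().findStatic(Math.class, "addExact",
                    MethodType.methodType(int.class, int.class, int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Specialized code, kept per known parameter value.
    private final Map<Integer, MethodHandle> code = new HashMap<>();

    // Binds the right operand to a constant, yielding a handle of type (int) -> int.
    MethodHandle specializeAdd(int knownRight) {
        return code.computeIfAbsent(knownRight, k ->
                MethodHandles.collectArguments(ADD, 1, MethodHandles.constant(int.class, k)));
    }

    int run(int left, int knownRight) {
        try {
            return (int) specializeAdd(knownRight).invokeExact(left);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Because the constant is baked into the handle chain, the JVM's own JIT can fold it when it compiles the call.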
125. The Peeks and The Pokes (but safely)
• Not interpreter specific
• VarHandles
– Fast getters and setters
– No extra storage for primitive/object versions
– No bounds checks (e.g. spill pool, ArrayData, TypedArrays – who needs Unsafe?)
– (sun.misc.TaggedArray)
126. JFR Integration; Events
• Dynalink
– Relinking callsites
– Megamorphic callsites
• Language agnostic
– Multi language data generated
– Type changes
• Language specific
– Array like object layout, packed / sparse
– ScriptObject layouts
127. Parallelism
• The more cores, the more we can speculatively work (e.g. JIT) in the background
• Speculative parallel background processing
• java.util.concurrent.Future<CompiledFunction>
128. Leveraging JDK Changes
• Improvements and speedups of java.lang.invoke
• Improvements (or removal) of LambdaForms
• We Project Valhalla, we think
– VarHandles complete
– The minimum compile unit - can it shrink?
• Could spring from ClassDynamic?
129. Research
• Multi language framework
– Multi Platform Typeless IR (JRuby 9000 style)
– TypeScript
• “On implementing multiple pluggable dynamic language frontends on the JVM, using the Nashorn runtime” [Gabrielsson, Lagergren, Szegedi]
• Pluggable VM frontends