[TOC]
# A Bytecode Compiler for Soy
This package implements a bytecode compiler for the Soy language. The high
level goals are to
* Increase rendering performance over Tofu
* Allow async rendering so that Soy can pause/resume rendering when
    * It encounters an unfinished future.
    * The output buffer is full.
The general strategy is to generate a new Java class for each Soy `{template}`.
Full details on how different pieces of Soy syntax map to Java code are detailed
below.
## Package design
The jbcsrc implementation is split across several packages.
* `com.google.template.soy.jbcsrc`
The base package contains the core compiler implementation and the public
compiler entry point: `BytecodeCompiler`
* `com.google.template.soy.jbcsrc.runtime`
This package contains helper classes and utility routines that are only
accessed by the generated code. A lot of the `jbcsrc` runtime is actually
defined in other soy packages (such as
`com.google.template.soy.shared.internal.SharedRuntime` or
`com.google.template.soy.shared.data`), when it is possible to share with
Tofu. So this package is really intended for jbcsrc specific functionality.
* `com.google.template.soy.jbcsrc.api`
This package contains the public api for rendering jbcsrc compiled templates
via the `SoySauce` class.
* `com.google.template.soy.jbcsrc.shared`
This package contains functionality that is shared by the compiler and
runtime, but is meant to be private to soy.
## Background
The Soy server side renderer is currently implemented as a
[recursive visitor](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/sharedpasses/render/RenderVisitor.java)
on the Soy AST. This implementation is expedient since the renderer uses the
same API as all the parse visitors and other 'compiler passes' and thus can
benefit from developer familiarity. However, this design
makes it very difficult to perform even basic optimizations. By contrast,
the JS implementation of Soy works by generating JS code and thus can
benefit from all the optimizations in the JS compiler and browser.
The new Python implementation will work in a similar way (generating Python
code).
Finally, Soy rendering is one of the last sources of blocking IO in modern Java
servers. Soy will block the request thread when coming across unfinished futures
or when the output buffer becomes full. The current design of Soy rendering
makes it very difficult to move to a fully asynchronous rendering model. This is
important for production stability and resource utilization since it is much
easier to provision servers when the number of threads needed to serve incoming
requests doesn't depend on worst case backend latency.
## Overview
For each Soy template we will generate a Java class by generating bytecode
directly from the parse tree. The Soy language is simple and all
the basic language constructs map directly into Java constructs. For example
this template:
~~~
{template .foo}
{@param p : string}
{@param p2 : string}
{$p}
{$p2}
{/template}
~~~
could be implemented by a Java function like:
~~~
void render(Appendable output, SoyRecord params) {
params.getField("p").render(output);
params.getField("p2").render(output);
}
~~~
Which is, in fact, effectively the code that the current implementation
executes (it is just that between each method call there are many many visitor
operations). So at least initially most of the benefit would be realized
simply by not traversing the AST. However, this solution still doesn’t address
our concerns about asynchrony.
There are two kinds of asynchrony we will wish to handle:
1. Asynchronous data. Any piece of data passed into Soy is wrapped in a
`SoyValueProvider`. If the item is a [`Future`](http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Future.html),
it is wrapped in a `SoyFutureValueProvider`. If `$p` above was passed in as
a `Future`, then we would block during the `getField` method call.
2. Asynchronous output. In a modern HTTP server, it is desirable to handle
slow clients (e.g. mobile devices). However, in the current Soy design if
the server writes too fast it will either block the rendering thread
(causing poor thread utilization) or it will buffer unbounded bytes in RAM.
If we are buffering too much, it may be better for rendering to pause and
for the request thread to serve another request while waiting for output
buffers to drain.
Given these constraints, the above direct approach will not work. So instead
we could generate something like this:
~~~
class Foo {
  int state = 0;
  StringData p;
  StringData p2;
  Result render(AdvisingAppendable output, SoyRecord params) {
    // Declared once, outside the switch, so that every case can assign them.
    SoyValueProvider provider;
    Result r;
    switch (state) {
      case 0:
        provider = params.getFieldProvider("p");
        r = provider.canResolve();
        if (r.type() != Result.Type.DONE) {
          return r;
        }
        p = (StringData) provider.resolve();
        p.render(output);
        state = 1;
        if (output.softLimitReached()) {
          return Result.limited();
        }
        // fall through
      case 1:
        provider = params.getFieldProvider("p2");
        r = provider.canResolve();
        if (r.type() != Result.Type.DONE) {
          return r;
        }
        p2 = (StringData) provider.resolve();
        p2.render(output);
        state = 2;
        if (output.softLimitReached()) {
          return Result.limited();
        }
        // fall through
      case 2:
        return Result.done();
      default:
        throw new AssertionError();
    }
  }
}
~~~
In this example, we are now checking whether the output is full (after every
write operation) and we are checking if the `SoyValueProviders`
can be 'resolved' without blocking prior to resolving. Additionally, we are
storing resolved parameters in fields so that we don’t have to re-resolve them
when re-entering the method.
This is the heart of the design: to generate for each template a tiny state
machine that can be used to save and restore state up to the point of the last
'detach'. A sophisticated rendering client could then use these return types
to detach from the request thread or find other work to do while buffers are
being flushed or futures are completing.
This approach is similar to how `yield` generators are implemented in
C#/Python or how `async/await` are implemented in C#/Scala/Dart.
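
As a rough illustration of how a rendering client might drive such a state machine, here is a minimal sketch. All of the types in it (`Status`, `BoundedBuffer`, `TinyTemplate`, `RenderDriver`) are hypothetical stand-ins invented for this example, not Soy classes. It only models the "output buffer is full" case.

```java
import java.util.List;

/** Hypothetical stand-in for two of the render outcomes described above. */
enum Status { DONE, LIMITED }

/** Hypothetical output buffer that advises the writer once a soft limit is hit. */
final class BoundedBuffer {
  private final StringBuilder pending = new StringBuilder();
  private final StringBuilder flushed = new StringBuilder();
  private final int softLimit;

  BoundedBuffer(int softLimit) { this.softLimit = softLimit; }

  void append(String s) { pending.append(s); }

  boolean softLimitReached() { return pending.length() >= softLimit; }

  /** Simulates the network draining the buffer. */
  void flush() {
    flushed.append(pending);
    pending.setLength(0);
  }

  String flushedText() { return flushed.toString(); }
}

/** A hand-written state machine: one state per chunk of output. */
final class TinyTemplate {
  private final List<String> chunks;
  private int state = 0; // survives across render() calls

  TinyTemplate(List<String> chunks) { this.chunks = chunks; }

  Status render(BoundedBuffer out) {
    while (state < chunks.size()) {
      out.append(chunks.get(state));
      state++; // record progress before possibly returning
      if (out.softLimitReached()) {
        return Status.LIMITED;
      }
    }
    return Status.DONE;
  }
}

final class RenderDriver {
  /** Re-enters render() until it reports DONE; returns the number of pauses. */
  static int renderFully(TinyTemplate template, BoundedBuffer out) {
    int pauses = 0;
    while (template.render(out) != Status.DONE) {
      pauses++;
      out.flush(); // a real server would release the thread here instead
    }
    out.flush();
    return pauses;
  }
}
```

The driver loop is deliberately synchronous. The point is only that `render` can be called repeatedly and picks up where it left off.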
### Implementation strategy
All the examples below use Java code to demonstrate what the generated code
will look like. However, the actual implementation will be using
[ASM](http://asm.ow2.org/) to generate bytecode directly. This comes with a
number of pros and cons.
* Pros
    * Small library. Fast code generation.
    * Greater control flow flexibility (bytecode GOTO is more powerful than a
      Java switch statement)
    * Can generate debugging information that points directly to the Soy
      template resources
    * Makes refresh-to-reload more straightforward than a source compiler
      based approach would be.
* Cons
    * Few people are familiar with bytecode. This may be a high barrier to entry
      for contributions.
    * Verbose/tedious! (we lose all the javac compiler magic that you normally
      get)
    * New compile time dependency for Soy (ASM library)
To demonstrate the control flow issues mentioned above, consider the following
example:
~~~
{template .foo}
{@param p1 : [f: bool, v: list<string>]}
{if $p1.f}
{for $s in $p1.v}
<div>{$s}</div>
{/for}
{/if}
{/template}
~~~
This is a simple template with a `for` loop inside an `if` statement.
To allow the renderer to suspend rendering after print statements or to
implement detaching when handling `$s` we would need to implement something like
this:
~~~
int index;
int state;
public Result render(SoyRecord params, Appendable output) throws IOException {
  while (true) {
    switch (state) {
      case 0:
        SoyRecord soyRecord = (SoyRecord) params.getField("p1");
        if (soyRecord.getField("f").coerceToBoolean()) {
          state = 1;
        } else {
          state = 3;
        }
        break;
      case 1:
        SoyListData vList =
            ((SoyListData) ((SoyRecord) params.getField("p1")).getField("v"));
        if (vList.length() > index) {
          output.append("<div>");
          state = 2;
        } else {
          state = 3;
        }
        break;
      case 2:
        SoyValueProvider s =
            ((SoyListData) ((SoyRecord) params.getField("p1")).getField("v"))
                .asJavaList()
                .get(index);
        Result resolvable = s.canResolve();
        if (resolvable.isDone()) {
          s.resolve().render(output);
          output.append("</div>");
          state = 1;
          index++;
        } else {
          return resolvable;
        }
        break;
      case 3:
        return Result.done();
    }
  }
}
~~~
We could generate code like this, but in doing so we would lose the major
benefits of source generation: human readability and debuggability. So given
that, we have decided not to generate Java sources and instead to generate
bytecode directly. For example, if Java had a `goto` keyword we could rewrite
the above as:
~~~
int index;
int state = 0;
public Result render(SoyRecord params, Appendable output) throws IOException {
  goto state;  // pseudocode: jump to the label for the saved state
  L0:
  SoyRecord soyRecord = (SoyRecord) params.getField("p1");
  if (soyRecord.getField("f").coerceToBoolean()) {
    List<? extends SoyValueProvider> asJavaList =
        ((SoyListData) soyRecord.getField("v")).asJavaList();
    for (index = 0; index < asJavaList.size(); index++) {
      output.append("<div>");
      L1:
      SoyValueProvider s = asJavaList.get(index);
      Result r = s.canResolve();
      if (!r.isDone()) {
        // save where to resume, then detach
        state = 1;
        return r;
      }
      s.resolve().render(output);
      output.append("</div>");
    }
  }
  return Result.done();
}
~~~
The strategy is to generate bytecode that looks like that.
# Structure of Compiled Templates
For every Soy template we compile a number of classes to implement our
functionality:
* A [CompiledTemplate](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/jbcsrc/api/CompiledTemplate.java)
subclass. This has a single `render` method that will render the template
* A [CompiledTemplate.Factory](api/CompiledTemplate.java) subclass. This
provides a non-reflective mechanism for constructing CompiledTemplate
instances
* A [SoyAbstractCachingValueProvider](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/data/SoyAbstractCachingValueProvider.java)
subclass for each [CallParamValueNode](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/soytree/CallParamValueNode.java)
and each [LetValueNode](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/soytree/LetValueNode.java).
These allow us to implement 'lazy' `{let ...}` and `{param ...}` statements.
* A [RenderableThunk](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/data/internal/RenderableThunk.java)
subclass for each [CallParamContentNode](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/soytree/CallParamContentNode.java)
and each [LetContentNode](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/soytree/LetContentNode.java).
These allow us to implement 'lazy' `{let ...}` and `{param ...}` statements
that render content blocks.
### Glossary
A few specialized terms are used throughout this document and the
implementation.
* `detach`: A 'detach' is the act of pausing rendering, saving execution state
and returning control to our caller. We are 'detaching' the current rendering
thread from the render operation.
* `attach`: The counterpart of `detach`. This is the act of attaching a new
thread to a detached rendering operation. We may also use the term 'reattach'.
### Helper Objects and APIs
Our implementation will depend on a few new helper objects.
#### AdvisingAppendable
A simple `Appendable` subtype that exposes an additional method, `boolean
softLimitReached()`. This method can be queried to see if writes should be
suspended.
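
A minimal sketch of what such a type could look like follows. The interface name `AdvisingAppendableSketch`, the `BufferingAdvisingAppendable` class, its character-count limit, and the `drain()` method are all assumptions made for illustration. Only the `softLimitReached()` method comes from the description above.

```java
/** Hypothetical shape of the interface described above. */
interface AdvisingAppendableSketch extends Appendable {
  boolean softLimitReached();
}

/** Buffers into a StringBuilder and advises once softLimit chars are pending. */
final class BufferingAdvisingAppendable implements AdvisingAppendableSketch {
  private final StringBuilder buffer = new StringBuilder();
  private final int softLimit;

  BufferingAdvisingAppendable(int softLimit) { this.softLimit = softLimit; }

  @Override public BufferingAdvisingAppendable append(CharSequence csq) {
    buffer.append(csq);
    return this;
  }

  @Override public BufferingAdvisingAppendable append(CharSequence csq, int start, int end) {
    buffer.append(csq, start, end);
    return this;
  }

  @Override public BufferingAdvisingAppendable append(char c) {
    buffer.append(c);
    return this;
  }

  /** The limit is advisory: writes past it still succeed. */
  @Override public boolean softLimitReached() {
    return buffer.length() >= softLimit;
  }

  /** Hands the pending text to the caller and resets the buffer. */
  String drain() {
    String text = buffer.toString();
    buffer.setLength(0);
    return text;
  }
}
```

Because the limit is only advisory, a writer that ignores it degrades to the unbounded buffering behavior rather than failing.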
#### RenderResult
A value type that indicates the result of a rendering operation. The 3 kinds
are: Result.done(), meaning that rendering completed fully; Result.limited(),
meaning that the output informed us that the limit was reached;
Result.detach(Future), meaning that rendering found an incomplete future and is
detaching on that.
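
One way to model these three kinds is a small immutable value type. The sketch below is an assumption about the shape, not the actual class: the name `RenderResultSketch` and the `future()` accessor are invented for this example.

```java
import java.util.concurrent.Future;

/** Hypothetical value type with the three kinds described above. */
final class RenderResultSketch {
  enum Type { DONE, LIMITED, DETACH }

  // DONE and LIMITED carry no data, so a single shared instance suffices.
  private static final RenderResultSketch DONE = new RenderResultSketch(Type.DONE, null);
  private static final RenderResultSketch LIMITED = new RenderResultSketch(Type.LIMITED, null);

  private final Type type;
  private final Future<?> future; // non-null only for DETACH

  private RenderResultSketch(Type type, Future<?> future) {
    this.type = type;
    this.future = future;
  }

  static RenderResultSketch done() { return DONE; }

  static RenderResultSketch limited() { return LIMITED; }

  static RenderResultSketch detach(Future<?> future) {
    if (future == null) {
      throw new NullPointerException("future");
    }
    return new RenderResultSketch(Type.DETACH, future);
  }

  Type type() { return type; }

  boolean isDone() { return type == Type.DONE; }

  /** The future that rendering is waiting on; only valid for DETACH results. */
  Future<?> future() {
    if (type != Type.DETACH) {
      throw new IllegalStateException("no future for result of type " + type);
    }
    return future;
  }
}
```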
#### RenderContext
A somewhat catch-all object for propagating cross cutting data items. Via the
`RenderContext` object, templates should be able to access:
* The SoyMessageBundle
* SoyFunction instances
* PrintDirective instances
* renaming maps (css, xid)
* EscapingDirective instances
* IJ params
* DeltemplateSelector
We will propagate this as a single object from the top level (directly through
the render() calls), because this object will be constant per render.
As future work we should move many of these into compiler plugins. For example,
instead of looking up the PrintDirective instances each time we need to apply
it, we could instead introduce a `SoyJbcsrcPrintDirective` that would run in
the compiler and then we wouldn't need to look up instances at runtime. This
would be similar to how `jssrc` implements SoyFunctions and SoyPrintDirectives.
Additionally we will enhance some core APIs to expose additional information:
#### SoyValue
`void SoyValue.render(Appendable)` will change to `RenderResult
render(AdvisingAppendable)`. That will allow individual values to detach
mid-render. Most Soy values will have trivial implementations of this method,
but for our [lazy transclusion values](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/data/internal/RenderableThunk)
we will need this.
#### SoyValueProvider
SoyValueProviders are used to encapsulate values that aren’t done yet. This
includes lazy expressions as well as Futures. In order to identify and
conditionally resolve these providers we will need a new `Result canResolve()`
method.
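
The sketch below shows the idea for the two cases mentioned (futures and lazy expressions). It is simplified in one important way: `canResolve()` returns a `boolean` here instead of a `Result`, and all type names (`ValueProviderSketch`, `FutureProviderSketch`, `LazyProviderSketch`) are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical, simplified provider: canResolve() is a boolean, not a Result. */
interface ValueProviderSketch<T> {
  boolean canResolve();
  T resolve();
}

/** Wraps a Future; resolvable without blocking only once the future is done. */
final class FutureProviderSketch<T> implements ValueProviderSketch<T> {
  private final Future<T> future;

  FutureProviderSketch(Future<T> future) { this.future = future; }

  @Override public boolean canResolve() { return future.isDone(); }

  @Override public T resolve() {
    try {
      return future.get(); // does not block if canResolve() returned true
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while resolving", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("future failed", e.getCause());
    }
  }
}

/** Wraps a lazy expression; evaluates it at most once and caches the value. */
final class LazyProviderSketch<T> implements ValueProviderSketch<T> {
  private final Supplier<T> expression;
  private T value;
  private boolean resolved;

  LazyProviderSketch(Supplier<T> expression) { this.expression = expression; }

  // Simplification: a real lazy expression may itself depend on futures.
  @Override public boolean canResolve() { return true; }

  @Override public T resolve() {
    if (!resolved) {
      value = expression.get();
      resolved = true;
    }
    return value;
  }
}
```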
## Compilation Strategy
The main details of the design will be a discussion of exactly how the code of
the render method is generated. The Soy language is logically divided into two
parts: The expression language and the command language. The expression
language is everything inside of a set of `{}`'s while the command
language is everything outside of it. Since the expression language is the
simplest part, we will start there.
### Compiling Soy expressions
Soy has a relatively simple expression language divided into 4 main parts:
1. Literals: `1`, `'foo'`, `[1,2,3,4]`, `['k': 'v', 'k2': 'v2']`
2. Operators: `+`, `-`, `==`, `?:` etc.
3. Function invocations: `index()`, `isFirst()`, etc.
4. Data access expressions: `$foo`, `$foo.bar.baz`, `$foo[$key]`, `$foo[1]`
Since expressions are (for the most part) where data access occurs, it is in
the expressions that we must handle resolving SoyValueProviders to SoyValues
and optionally detaching if we come across a future. One simplifying
assumption we will make is that Soy expressions are idempotent and sufficiently
cheap (relative to a detaching operation) that it is fine to re-execute an
expression when re-attaching.
#### Operators, literals and function invocations
These 3 parts of the expression language translate quite directly and do not
interact with either the output stream or any data that may contain futures and
therefore do not have any complex control flow requirements.
The biggest optimization opportunities exist in this part of the
implementation. Soy tracks a fair bit of type information in order to flag
issues at parse time as well as to generate type casts in the JS implementation.
However, the Java runtime hasn’t been able to take advantage of any benefits
from specialization due to these types. For example, the expression `$a + $b` has
somewhat complex semantics since Soy has essentially the javascript rules for
the `+` operator. So in order to execute the operator we need to know if either
of the parameters is a string or a number and then decide to concat or sum.
The current Tofu implementation is
[here](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/sharedpasses/render/EvalVisitor.java)
and implements this by a sequence of explicit type checks at runtime. This is
unfortunate since in a large number of cases the types are fixed at parse time.
Because we will generate code for each `+` operator we can specialize the
implementation based on the types of the subexpressions and move many of these
type checks from runtime to compile time.
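
To make the difference concrete, the sketch below contrasts a runtime-dispatched `+` with specialized versions a compiler could emit when the operand types are known. The dispatch rules in `plusDynamic` are a deliberately simplified assumption and do not reproduce Soy's exact semantics. The class and method names are hypothetical.

```java
/** Hypothetical illustration of moving type checks from runtime to compile time. */
final class PlusSketch {
  /** Untyped path: the operand types are inspected on every evaluation. */
  static Object plusDynamic(Object a, Object b) {
    if (a instanceof Long && b instanceof Long) {
      return (Long) a + (Long) b;
    }
    if (a instanceof Number && b instanceof Number) {
      return ((Number) a).doubleValue() + ((Number) b).doubleValue();
    }
    // Otherwise fall back to string concatenation.
    return String.valueOf(a) + String.valueOf(b);
  }

  /** Specialized path: both operands are known to be ints at compile time. */
  static long plusInts(long a, long b) {
    return a + b;
  }

  /** Specialized path: both operands are known to be strings at compile time. */
  static String plusStrings(String a, String b) {
    return a + b;
  }
}
```

The specialized methods contain no `instanceof` checks and no boxing, which is the effect the compiler is aiming for when static types are available.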
Finally, another obvious optimization in expression evaluation is the removal
of SoyValue boxes. If an expression is fully typed, then we could eliminate
all the SoyValue wrappers and instead operate directly on raw `longs`, `doubles`
and `Strings`.
For future work we should consider using the java7 `invokedynamic` instruction
to optimize this further. This would allow us to specialize based on runtime
types.
#### Data access
When coming across a data reference we will need to generate code to
conditionally resolve it. There are two kinds of data access:
* VarRef
    * For each of these we will generate a field to hold the SoyValueProvider
    * To access, we first check if the provider has been resolved; if it hasn’t
      been, we then resolve the variable
        * If the provider is resolvable (via the `canResolve` method), then we
          `resolve()`.
        * If it isn’t resolvable, we calculate a RenderResult object, store our
          state and return.
    * For future work we can use a version of definite assignment analysis to
      eliminate some checks. For example, if it is definitely not the first
      access, then we can just read the field, no need to generate any code
      beyond that. An initial version of this is in `TemplateAnalysis`
* DataAccess
    * These are for accessing subfields, map entries or list items
    * There are no fields to check so we grab the item as a provider, check
      canResolve and conditionally detach.
For example, a VarRef data access `$foo` referring to a template param may be
implemented as:
~~~
SoyValue fooValue = fooField;
if (fooValue == null) {
  // first access
  case N:
    SoyValueProvider valueProvider =
        params.getFieldProvider("foo");
    Result r = valueProvider.canResolve();
    if (r.type() != Type.DONE) {
      return r;
    }
    fooValue = fooField = valueProvider.resolve();
}
state = N + 1;
~~~
Obvious optimizations of this code may include:
* eliminating the field if the var is only accessed once
* not checking for `null` (or generating a new state) if this is provably not
the first reference in the template
DataAccess nodes will be similar with the caveat that they will be referencing
subfields of other SoyValues instead of from the params.
## Compiling Soy commands
Soy commands are the most complex part of the design. They may contain complex
control flow or define complex objects. The next section will go through all
the Soy commands and discuss exactly how they would be implemented.
### RAW\_TEXT\_NODE
Trivial compilation. Simply translates to:
~~~
case N:
  output.append(RAW_TEXT);
  state = N + 1;
  if (output.softLimitReached()) {
    return Result.limited();
  }
~~~
So for each RAW\_TEXT command we will need to allocate a state and check the
output for being limited after writing.
Issues:
* The text constant may be very large. We may want to rewrite as multiple
write operations if the constant is very large (>1K? >4K?)
* The jvm limits string constants to <64K bytes (in modified UTF8), so for
very large content blocks we have to split into multiple writes.
* For small writes we should attempt to eliminate soft limit
checks if we have only written a few characters. Coming up with reasonable
heuristics here will be the hard part (e.g. <100 chars? bytes?)
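
The splitting described in the second issue could be sketched as follows. The class `ConstantSplitter` and its methods are hypothetical. The byte counting follows the JVM's modified UTF-8 encoding, in which U+0000 takes two bytes and each UTF-16 surrogate is encoded separately as three bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits text into class-file-safe string constants. */
final class ConstantSplitter {
  /** A class file string constant stores its byte length in an unsigned 16-bit field. */
  static final int MAX_UTF8_BYTES = 65535;

  /** Number of bytes one UTF-16 char occupies in modified UTF-8. */
  static int modifiedUtf8Length(char c) {
    if (c >= 0x0001 && c <= 0x007F) {
      return 1;
    }
    if (c <= 0x07FF) {
      return 2; // includes U+0000, which modified UTF-8 encodes in two bytes
    }
    return 3; // surrogates are encoded individually, three bytes each
  }

  /** Splits text into parts whose encoded size is at most maxBytes each. */
  static List<String> split(String text, int maxBytes) {
    if (maxBytes < 3) {
      throw new IllegalArgumentException("maxBytes must fit at least one char");
    }
    List<String> parts = new ArrayList<>();
    int start = 0;
    int bytes = 0;
    for (int i = 0; i < text.length(); i++) {
      int len = modifiedUtf8Length(text.charAt(i));
      if (bytes + len > maxBytes) {
        parts.add(text.substring(start, i));
        start = i;
        bytes = 0;
      }
      bytes += len;
    }
    if (start < text.length()) {
      parts.add(text.substring(start));
    }
    return parts;
  }
}
```

Note that this sketch may split a surrogate pair across two parts. That is harmless when the parts are appended back to back, but a production version might prefer to keep pairs together.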
### PRINT\_NODE, PRINT\_DIRECTIVE\_NODE
The general form of a print command is
~~~
{print <expr>|<directive1>|<directive2>...}
~~~
(Note that the `print` command name is optional and often omitted)
To evaluate this statement we will first use the expression compiler to
generate code that produces a SoyValue object, then we will invoke code that
looks like this:
~~~
case N:
  expr = …;
  expr = context.getPrintDirective("directive1").apply(expr);
  expr = context.getPrintDirective("directive2").apply(expr);
  state = N + 1;
case N+1:
  Result r = expr.render(output);
  if (r.type() != Type.DONE) {
    return r;
  }
  state = N + 2;
~~~
### XID\_NODE, CSS\_NODE
These nodes are truly trivial. In fact it was probably a mistake to implement
them as commands instead of just a `SoyFunction`.
In Tofu we currently use a single-element cache to optimize renaming (see
[CssNode.renameCache](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/soytree/CssNode.java&l=83)).
This is one of the few examples of an optimization that would be lost in the
redesign. Based on profiling of SoySauce applications, renaming does not appear
to be on the hot path, but if we thought it was important we could optimize
this (via the same technique, or possibly by using integer keys and array
lookups instead of hash lookups, which may be simpler/smaller/faster).
### LET\_VALUE\_NODE, LET\_CONTENT\_NODE
`{let ..}` statements are more complex than you might think! Due to our desire
for laziness we cannot simply evaluate and stash in a field. Instead we
generate a class for each `{let}` command. For let value nodes, we will
generate a `DetachableSoyValueProvider` subclass; for `SoyContentNodes` we will
generate a `DetachableContentProvider` subclass. For example, if the
template `ns.owner` declares the let variable `{let $foo : $a + 1 /}`, we will
generate the following code:
~~~
private static final class let$foo_1 extends DetachableSoyValueProvider {
  private final ns$$owner owner;
  private int state;
  let$foo_1(ns$$owner owner) {
    this.owner = owner;
  }
  @Override protected Result doResolve() {
    // evaluate expression using normal rules
    // finally take the resolved expression and
    // assign to the value field (defined by our
    // super class)
    this.value = expr;
    return Result.done();
  }
}
~~~
Then the owner class will declare a field of type `let$foo_1` and initialize it
at the normal declaration point. Let-content nodes will be very similar with
the caveat that the base class will be different (RenderableThunk). Unlike
params, the fields for let nodes need to be cleared (nulled out) when they go
out of scope. This is to ensure that they behave properly in loops (re-evaluated
per iteration) and it will also make sure we don’t pin their values in memory
too long.
Optimizations performed on lets:
* Identify constant lets and eagerly evaluate the expression to avoid
generating the closure.
* Identify lets/params that simply alias other lets/params and 'inline' the
references. e.g. `{let $foo : $bar /}` doesn't need a subclass.
* TODO: identify lets that (based on control flow analysis) will not need
detach logic and eagerly evaluate. (Work for this has started in
`TemplateAnalysis`)
### IF\_NODE, IF\_COND\_NODE, IF\_ELSE\_NODE
If conditions will translate quite naturally since the Soy semantics and the
java semantics are identical.
### SWITCH\_NODE, SWITCH\_CASE\_NODE, SWITCH\_DEFAULT\_NODE
The behavior of switch is fairly similar to a sequence of if and else-if
statements (and will be implemented just like that). However, because each
comparison references the same SoyValue and we could detach mid-comparison, we
need to store the switch expression in a field.
Note: this analysis is based on the assumption that switch case statements may
be arbitrary expressions. The AST and current implementation imply that they
are.
TODO(lukes): change soy semantics to ensure that switch case expressions are
constants, then the implementation could resolve to something like a Java
`switch()` statement, which would be preferable.
### FOREACH\_NODE, FOREACH\_NONEMPTY\_NODE, FOREACH\_IFEMPTY\_NODE, FOR\_NODE
For loops are also pretty straightforward with 2 important caveats.
1. The loop variable, the loop collection, and the current index all need to
be stored as fields so that the loop state can be recovered when
reattaching.
2. The non-empty and if-empty blocks can be implemented via simple loop
unrolling.
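
Both caveats can be seen in the following sketch of a resumable loop. The `ResumableLoop` class is hypothetical, uses `CompletableFuture` in place of `SoyValueProvider`, and returns a plain `boolean` instead of a result object. The if-empty block is handled with a simple emptiness check before the loop.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical resumable loop: loop state lives in fields so it survives a detach. */
final class ResumableLoop {
  private final List<CompletableFuture<String>> items; // the loop collection
  private final StringBuilder out;
  private int index = 0; // the current index, kept in a field rather than a local

  ResumableLoop(List<CompletableFuture<String>> items, StringBuilder out) {
    this.items = items;
    this.out = out;
  }

  /** Returns true when rendering is complete, false if it paused on an unfinished item. */
  boolean render() {
    if (items.isEmpty()) {
      out.append("(empty)"); // stands in for the if-empty block
      return true;
    }
    for (; index < items.size(); index++) {
      CompletableFuture<String> item = items.get(index);
      if (!item.isDone()) {
        return false; // detach; index still points at the item to retry
      }
      // join() does not block here because the future is already done.
      out.append("<div>").append(item.join()).append("</div>");
    }
    return true;
  }
}
```

Calling `render()` again after the pending future completes resumes at the saved index, so items that were already written are not written twice.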
### LOG\_NODE
A `{log}...{/log}` statement is simply a collection of Soy statements that
should render to `System.out` instead of the user supplied output buffer. This
is implemented by simply generating code for all the child statements
(as normal), but replace references to `output` with a trivial adapter of
`System.out` to the AdvisingAppendable interface.
Additionally, we can skip generating any and all `softLimitChecks` since
`System.out` doesn’t have an appropriate implementation.
NOTE: this does mean that log statements can block the render thread while
waiting for stdout buffers to flush to disk. This is considered acceptable
since log statements are generally only used for debugging.
### DEBUGGER\_NODE
No op implementation. We can generate a label with a line number here, but
that is about it.
### CALL\_PARAM\_VALUE\_NODE,CALL\_PARAM\_CONTENT\_NODE
See the section on [`{let}` commands](#let_value_node_let_content_node),
`{param}` commands will use an identical strategy for defining the values.
Each one will be stored in a SoyRecord that will be passed as an argument to
the next template.
N.B. None of the `{param}` values or the SoyRecord holding them for a call will
be stored as fields, see the section on
[template calling](#call_basic_nodecall_delegate_node) for a detailed example.
### CALL\_BASIC\_NODE,CALL\_DELEGATE\_NODE
There are several styles of calls. For now I will demonstrate a normal call with
no data param, e.g.
`{call .foo}{param bar : 1 /}{/call}`
This will generate code that looks like:
~~~
private ns$$foo fooTemplate;
case N:
  SoyEasyDict record = new SoyEasyDict();
  record.put("bar", <generate bar param>);
  fooTemplate = new ns$$foo(record);
  state = N+1;
case N+1:
  Result r = fooTemplate.render(output, context);
  if (r.type() != Type.DONE) {
    return r;
  }
~~~
Parameters like `data="all"` or `data="$expr"` will simply modify how the
record is initialized.
For `{delcall ...}` commands the process is mostly the same, but instead of invoking the
callee constructor directly, we instead trigger deltemplate selection by
invoking `RenderContext.getDelTemplate` which selects and constructs the target
callee.
Optimizations and future work:
* We should eliminate the `SoyDict` parameter map whenever possible. Most
calls pass a fixed set of params and in those cases we can eliminate
allocations and map operations by just generating a specialized constructor in
the callee.
### MSG\_NODE,MSG\_FALLBACK\_GROUP\_NODE
Soy has direct support for translations. In `jssrc`, this is mostly delegated
to `goog.getMessage`, but in SoySauce we don't have such a good option, instead
we handle rendering and placeholder substitution ourselves. `{msg ..}`
rendering breaks into 2 cases
* Simple constant messages: This is for when there are no parameters, in these
cases we can calculate the message id in the compiler and look it up directly
in the `SoyMsgBundle`. Here we generate code that directly calls:
`renderContext.getSoyMessge(<id>).getParts().get(0).getRawText()`.
* Messages with placeholders (including gendered messages and plurals): For
these the rendering strategy is much more complex since translators may move
placeholders around, introduce new plurals cases, etc. So for this we use a
runtime library to interpret the `SoyMsg` object against a map of placeholder
objects. So the compiler mostly generates code to populate the placeholder
map. See `Runtime.renderSoyMsgWithPlaceholders`
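
The placeholder case can be pictured as interpreting a list of message parts against a map. The sketch below is a simplified assumption about that process. `MsgSketch` and `Part` are hypothetical names, and plurals and gender selection are left out entirely.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical interpreter for a translated message with placeholders. */
final class MsgSketch {
  /** A message part is either raw text or a reference to a named placeholder. */
  static final class Part {
    final String rawText;         // non-null for raw text parts
    final String placeholderName; // non-null for placeholder parts

    private Part(String rawText, String placeholderName) {
      this.rawText = rawText;
      this.placeholderName = placeholderName;
    }

    static Part raw(String text) { return new Part(text, null); }

    static Part placeholder(String name) { return new Part(null, name); }
  }

  /** Renders the parts in the translator's order, looking placeholders up by name. */
  static String render(List<Part> parts, Map<String, String> placeholders) {
    StringBuilder sb = new StringBuilder();
    for (Part part : parts) {
      if (part.rawText != null) {
        sb.append(part.rawText);
        continue;
      }
      String value = placeholders.get(part.placeholderName);
      if (value == null) {
        throw new IllegalArgumentException("missing placeholder: " + part.placeholderName);
      }
      sb.append(value);
    }
    return sb.toString();
  }
}
```

Because the parts come from the translated message rather than from the template, a translation that moves a placeholder to a different position renders correctly with the same placeholder map.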
Future Optimizations:
* For plurals and gendered messages we can generate more specialized calls to
avoid boxing the plurals variable and having to pass the gender parameter in
the placeholder map.
## Compiling Soy Types
The Soy type system mostly follows the JS type system (as understood by the js
compiler). Notably, it doesn’t really fit into the Java type system. The
current renderer manages this disconnect via the
[SoyType type hierarchy](https://github.com/google/closure-templates/blob/master/java/src/com/google/template/soy/types/SoyType.java)
and a plethora of runtime type checks. Currently the runtime checks are a
combination of explicit SoyType operations and the Java `instanceof`
operator. This will cause a variety of problems for the compiler:
* Soy has a number of places where it checks types. We will need to generate
code that performs these checks by generating `checkcast` instructions.
* Soy has union types. These are not (easily) representable in the Java type
system, so every union typed variable will most likely be represented by a
static `SoyValue` type
* Soy has both `null` values and a `null` type
* The Soy type system is pluggable (notably for protos).
* The soy type system is not completely accurate (e.g. nullability is not
trustable)
Due to these issues we take a conservative approach to how we make use of the
type system. The key principles we will use are:
* If the user declared it, we should enforce it. (with `checkcast` operations)
This way type errors will get caught early and often.
* Nullability information from the type system cannot be relied on. For
example, if `map` contains integer values, then `$map['key']` will be
assigned the type `int` by the type system, but `int|null` is probably more
accurate since we don't know whether or not the key exists. So in general
we need to be careful when dealing with possibly null expressions. To deal
with this we have our own concept of nullability (`Expression.isNullable()`).
* Type information from the compiler is best effort only. The soy type system
was designed mostly for adding some compile time checks and generating
accessors in the jssrc backend. Using it to generate code for soy
expressions is quite difficult. (this is really just a generalization of
the above point)
## Runtime Dependencies
The generated `Soy` classes will need access to a number of runtime libraries to
perform basic logic. The most obvious ones are:
* `com.google.template.soy.jbcsrc.runtime`:
Contains `jbcsrc` specific runtime libraries
* `com.google.template.soy.jbcsrc.shared`:
Defines common interfaces for normal java code to interact with the generated
code.
* `com.google.template.soy.data`:
Defines common data representations for passing data to/from soy
* `com.google.template.soy.shared.internal.SharedRuntime`
Defines runtime libraries that are shared between jbcsrc and tofu.
There is a long tail of additional libraries that are needed that are scattered
across soy packages. In the long run we should seek to consolidate this kind of
runtime support into a smaller set of packages and then eventually release a
separate maven artifact to encapsulate them. In this way servers can avoid
depending on the compiler at runtime.
## Non-Functional Requirements
Cross cutting architectural issues that influence overall design choices.
* Refresh to reload. Soy development mode should not change at all.
    * This should be pretty straightforward with bytecode generation since
      loading the classes into the current VM is no harder than generating a
      Jar.
    * For reloading we would just reparse and recompile into a different (heap
      sourced) class loader. There is some risk that we will leak permgen,
      so we should write leak tests for the classloaders.
* Stack traces are readable! Currently the tofu renderer does a lot of work
  to generate stack traces that point to the templates. We should do the
  same.
* Efficiency! This new system should be significantly faster (>20% cpu
  reduction) than the current approach.
* Reasonable permgen usage. This will add a lot of new classes to the JVM
  which may consume too much permgen. In general, we are fine with trading
  server ram for cpu, but there are limits.
## Compatibility
There are a number of places where SoySauce has slightly different semantics
than Tofu. We have tried to minimize these as much as possible but in a few
cases we prefer the SoySauce semantics (generally because they demonstrate
errors or ambiguity in user templates). I will attempt to document all known
incompatibilities here:
* SoySauce disallows (at compile time) calls to undefined templates. Tofu
turns these into runtime failures. This may create build failures in
otherwise dead code.
* Stricter type checking of template parameters. Tofu does runtime type
  checking but it is somewhat limited in the accuracy of these checks. For
  example:
    * If you declare that a template has a param `{@param foos : list<string>}`
      Tofu will assert that the value is actually a list. SoySauce will do that,
      but it will also assert that `$foos[0]` is a string by checking it on access
      (this is the same strategy that Java uses for generics).
    * Tofu fails to type check params which are statically typed to `?`; this is
      a known bug. SoySauce does not have this bug, so user templates relying on
      it will have to be fixed.
* SoySauce is stricter about dereferencing `null` objects. For example, given
  the expression `isNonnull($foo.bar.baz)`, if `bar` is `null` then accessing
  `.baz` on it should cause an error, and it does in SoySauce and the JS
  backend. In Tofu this doesn’t happen (though there is a TODO);
  instead it only causes an error if you perform certain operations with the
  result of the expression (calling `isNonnull` and simple comparisons are
  the only things you can do). An appropriate fix would be
  to rewrite it as `isNonnull($foo.bar?.baz)`.
* SoySauce interprets 'required' template parameters slightly differently than
Tofu. Imagine this template:
```
{template .foo}
{@param p : string}
{$p}
{/template}
```
In Tofu, if you call `.foo` without passing `$p` there are a few things that
can happen:
* If it is a top level call (Java code calling `.foo`), then you will get a
`SoyTofuException` saying that a required parameter is missing.
* If it is a Soy->Soy call then you will get `null` for `$p`
In SoySauce you always get `null`. We chose this option because it is more
internally consistent (soy->soy and java->soy calls are treated equivalently)
and it is more consistent with the behavior of the Javascript Soy backend.
## A Bytecode Primer
### Definitions
* Runtime/operand stack - The implicit runtime stack of the virtual machine
* Basic Block - a sequence of instructions with no branches
* Frame - the set of values on the runtime stack and in the local variable
table
### Stack Machine
Java bytecode is a 'stack machine', this means that all operators perform some
kind of operations on an implicit runtime stack. For example, the opcode `IADD`
will pop 2 `int` values off the runtime stack and put their sum back onto the
stack. Bytecode also has a local variable table that can be used to store
named (well indexed) values. However, there are no opcodes that can operate
on local variables (other than pushing them onto the stack).
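
To make the stack discipline concrete, here is a toy interpreter for a handful of instructions. It is not the JVM and does not use ASM: the `ToyStackMachine` class and its textual instruction format are invented for this example, and `ICONST n` is a simplified pseudo-opcode (real bytecode uses `ICONST_0` through `ICONST_5`, `BIPUSH`, and so on).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy interpreter showing how operators work on an operand stack. */
final class ToyStackMachine {
  /** Runs the program and returns the value left on top of the stack. */
  static int run(String[] program, int[] locals) {
    Deque<Integer> stack = new ArrayDeque<>();
    for (String insn : program) {
      String[] parts = insn.split(" ");
      switch (parts[0]) {
        case "ICONST": // push a constant
          stack.push(Integer.parseInt(parts[1]));
          break;
        case "ILOAD": // push a local variable onto the stack
          stack.push(locals[Integer.parseInt(parts[1])]);
          break;
        case "ISTORE": // pop the stack into a local variable
          locals[Integer.parseInt(parts[1])] = stack.pop();
          break;
        case "IADD": { // pop two ints, push their sum
          int right = stack.pop();
          int left = stack.pop();
          stack.push(left + right);
          break;
        }
        case "IMUL": { // pop two ints, push their product
          int right = stack.pop();
          int left = stack.pop();
          stack.push(left * right);
          break;
        }
        default:
          throw new IllegalArgumentException("unknown opcode: " + insn);
      }
    }
    return stack.pop();
  }
}
```

For example, the expression `a + b * 2` with `a` in local 0 and `b` in local 1 becomes `ILOAD 0`, `ILOAD 1`, `ICONST 2`, `IMUL`, `IADD`. The arithmetic instructions never name a local directly. They only see what earlier instructions pushed.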
### Types
The Java bytecode type system mostly maps to the normal Java type system with a
notable exception that boolean is not a type, boolean is just any int,
`0 == false` and `non zero == true`. However, types impose some important
constraints on how bytecode can be written. Every value on the runtime stack has
a type associated, as well as every local variable. At any instruction there
is a notion of an active 'frame'. For a basic block, frames are trivial to
maintain and update. However, for branch target instructions (jump locations),
the frames at each branching instruction must be identical. The jvm has
dedicated opcodes to manipulate frame state for branch targets. For the most
part ASM will calculate these, but errors due to inconsistent frames are easy
to introduce (and do not have pretty failures).
### Opcodes
A good resource for figuring out what each jvm opcode does is from the
[java spec](http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html)
### ASM Tips
The asm library has a lot of benefits. It is small, blazing fast, and well
supported. However, it can be very error prone to generate bytecode. In
particular, asm has no error checking, so when you make a mistake the errors
produced can be inscrutable. Here is what I know:
* NegativeArraySizeException is thrown from MethodWriter.visitMaxes. The most
likely explanation is that you have accidentally popped too many items off
the runtime stack. Look for stray POP instructions, or using the wrong
branch instruction (IF_ICMPEQ pops two ints, IFEQ pops one).