Feather / implementation.md
A few notes on the implementation, which are only applicable to those reading/writing the feather compiler... As I don't expect any help writing Feather, this is really only for future-me!
General Structure
FeatherCompiler is the outward face, visible to the users of Feather. CompilerWorker is where all the hard work happens!
Compilation is a two-stage process.
The First Phase
The feather script is parsed, and turned into data, including MetaDataClass, MetaDataMethod etc., which in turn have blocks of code (see class Block). Blocks have a set of statements, which are stored as StackEntry (there are many different types!).

StackEntry is badly named. They ARE in a stack during the first phase, but they are also the "statements" in a Block, and that's a simple list, not a stack! Oops.
Example of how a simple expression is parsed: 1 + i

* Antlr does its magic, and tells us it has found 1. We push an IntConstant on the stack.
* Then it tells us about "i". We push a new LocalVariableStackEntry.
* Then it tells us that the binary operator is complete. We pop both operands, and the stack is now empty again.
* Now we create a "plus" stack entry, giving it the two StackEntries we just popped. Push the "plus" on the stack, and we are done!
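The push/pop sequence for 1 + i can be sketched in a few lines of Kotlin. This is only an illustration: the classes below are cut-down stand-ins that borrow names from the text, `Plus` and `ExpressionBuilder` are made-up names, and the real listener callbacks take ANTLR context objects rather than plain values.

```kotlin
// Simplified stand-ins for the real StackEntry classes (assumed shapes, not Feather's own).
interface StackEntry
data class IntConstant(val value: Int) : StackEntry
data class LocalVariableStackEntry(val name: String) : StackEntry
data class Plus(val left: StackEntry, val right: StackEntry) : StackEntry

// Hypothetical builder: each method plays the role of one parser callback.
class ExpressionBuilder {
    private val stack = ArrayDeque<StackEntry>()

    fun intLiteral(value: Int) { stack.addLast(IntConstant(value)) }

    fun identifier(name: String) { stack.addLast(LocalVariableStackEntry(name)) }

    // The binary operator is complete: pop both operands, push the combined entry.
    fun exitPlus() {
        val right = stack.removeLast() // operands come off in reverse order
        val left = stack.removeLast()
        stack.addLast(Plus(left, right))
    }

    // After a complete expression exactly one entry should remain.
    fun result(): StackEntry = stack.single()
}

// Replays the callbacks in the order described for "1 + i".
fun parseOnePlusI(): StackEntry {
    val builder = ExpressionBuilder()
    builder.intLiteral(1)
    builder.identifier("i")
    builder.exitPlus()
    return builder.result()
}
```

The entry that ends up on the stack is a small tree: a `Plus` holding the two entries that were popped.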
During the parsing phase, the types of fields, variables, method parameters and return types are not all known, because we have yet to reach some source code, and the names of the types may only be found later.
If we see a class name, but cannot find it, we assume it will be found later and create an `UnresolvedType`, which holds enough information to deduce the actual type later (or throw an exception if the class name is still not found).
Therefore, during the first phase, we cannot search for methods.
Instead, we make use of Lazy StackEntry classes (there are a few).
They store enough information to discover the actual method call/field access during the second phase (or throw an exception if the method isn't found).
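A rough sketch of the "remember now, look up later" idea, in Kotlin. Every name here is hypothetical (`UnresolvedTypeSketch`, `LazyMethodCallSketch`, `CompileError`), and a plain map of Java classes stands in for whatever the compiler really knows by the second phase.

```kotlin
import java.lang.reflect.Method

// Hypothetical error type for this sketch.
class CompileError(message: String) : Exception(message)

// Phase one: keep only the name, because the class may not have been parsed yet.
class UnresolvedTypeSketch(val className: String) {
    // Phase two: every class name is known by now, so a miss is a genuine error.
    fun resolve(knownClasses: Map<String, Class<*>>): Class<*> =
        knownClasses[className] ?: throw CompileError("Unknown type: $className")
}

// Phase one: record the receiver type and method name instead of searching for the method.
class LazyMethodCallSketch(val receiver: UnresolvedTypeSketch, val methodName: String) {
    // Phase two: the search can finally happen (no-argument methods only, to keep it short).
    fun resolve(knownClasses: Map<String, Class<*>>): Method {
        val type = receiver.resolve(knownClasses)
        return type.methods.firstOrNull { it.name == methodName && it.parameterCount == 0 }
            ?: throw CompileError("No method $methodName() on ${type.name}")
    }
}
```

Nothing is looked up when the objects are created; both failures (unknown class, unknown method) only surface when `resolve` is called.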
The Second Phase
During the second pass, we take all the MetaDataClass (and its gubbins, including StackEntries), and begin creating Java bytecode. All class names and methods are now known.
During this phase, we can resolve the types, i.e. turn an UnresolvedType into a MetaDataType.
Special Types
Array
When dealing with arrays, during the compilation, the internal representation uses uk.co.nickthecoder.feather.internal.Array. The compiled code does NOT reference this class in any way.
Function
Unlike Java, Feather treats functions as first-class citizens. You can store them in variables, pass them to methods etc. The type for a function is uk.co.nickthecoder.feather.reflect.Function. This is a Java class, and is part of the Feather runtime. This type is found in compiled code (assuming it uses function types!).
Implied Types
During the second phase, there is still a problem with Types. Feather allows us to declare fields and variables without explicitly declaring their type. When the implied type refers to a Java library, things are simple. When it refers to a Feather type, it's not so simple.
This is because class A may reference class B (in a different source file), and we may only be able to deduce an implied type in A, after B has been compiled.
The ImpliedType class helps out, by automatically compiling class B halfway through compiling class A (when the implied type is needed).
It also checks for cyclic dependencies, in which case, the script writer must explicitly declare one of the types.
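The compile-on-demand behaviour, with its cycle check, might look something like this. It is a sketch under assumptions: `OnDemandCompiler`, `CyclicDependency` and the `dependsOn` map are invented for illustration, and "compiling" is reduced to recording the order in which classes finish.

```kotlin
// Hypothetical error raised when two classes need each other's implied types.
class CyclicDependency(message: String) : Exception(message)

// `dependsOn` maps a class name to the classes whose implied types it needs (assumed input).
class OnDemandCompiler(private val dependsOn: Map<String, List<String>>) {
    // Order in which classes finished compiling.
    val compiled = mutableListOf<String>()

    // Classes that are currently part-way through compilation.
    private val inProgress = LinkedHashSet<String>()

    fun compile(className: String) {
        if (className in compiled) return
        if (!inProgress.add(className)) {
            // We came back to a class we are still in the middle of: a cycle.
            throw CyclicDependency(
                "Cyclic implied types: ${inProgress.joinToString(" -> ")} -> $className"
            )
        }
        // Compile each dependency "halfway through" compiling this class.
        for (dependency in dependsOn[className].orEmpty()) compile(dependency)
        inProgress.remove(className)
        compiled.add(className)
    }
}
```

Asking for A when A needs B finishes B first, then A. If B also needs A, the exception tells the script writer which classes are involved, so one of the types can be declared explicitly.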
NOTE: Currently (Aug-2022), inferring implied types isn't perfect. Sometimes vals/functions will need explicit types, when in theory it would be possible to resolve them.
Boxed vs Primitive Types
Currently, I've not allowed Feather to reference the boxed versions of Java primitives directly (you cannot use the type Float for example). The main reason for this: I'm a Kotlin programmer, and it would be very easy to accidentally use the word "Float" when I mean "float".
Kotlin does an excellent job hiding the naffness of Java primitives. Feather doesn't. This is in part because Feather doesn't have a nullable qualifier on its types.
Kotlin's Int? is the same as Java's Integer class, whereas Kotlin's Int is the same as Java's int primitive.
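The Kotlin side of this can be checked from plain Kotlin/JVM code, with no Feather involved. The two function names are just for this example; `javaPrimitiveType` and `javaObjectType` are the standard Kotlin/JVM extension properties on `KClass`.

```kotlin
// The class Kotlin uses for a non-null Int on the JVM: Java's primitive int.
fun primitiveInt(): Class<*> = Int::class.javaPrimitiveType!!

// The class Kotlin uses when an Int has to be boxed (e.g. Int?): java.lang.Integer.
fun boxedInt(): Class<*> = Int::class.javaObjectType
```

`primitiveInt().name` is "int" and `boxedInt().name` is "java.lang.Integer", which are exactly the two Java types that Feather makes you tell apart yourself.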
Gotchas
My knowledge/skill writing a compiler and understanding ANTLR has improved greatly. There may well be some amateurish stuff still lurking about which is tricky to understand, because it is doing things the "wrong" way.
Antlr is written in Java, and so we don't have Kotlin's differentiation between nullable and non-nullable values. To make things worse, parts of the grammar which are declared as mandatory can still be passed as null. I'm not sure if this is an Antlr bug, or not. But it is bloody annoying! For example, when I use a keyword when declaring a local variable, the "simpleIdentifier" is null. So, I've gone through and made my code "defensive" - checking for null values, and returning.

    fun exitXXX( ctx : xxx ) {
        ctx.foo() ?: return // If foo is mandatory, but is missing, there's nothing we can do!!!
    }

Fingers crossed Antlr eventually gets around to spitting out an error, because if it doesn't the compiled code will be garbage!
When a bug in feather creates incorrect bytecode, it is handy to see what it created. Alas, some illegal bytecode doesn't play nicely with ASM, which strips chunks of bytecode from the results. The "fix" for this is to temporarily remove the COMPUTE_FRAMES flag at the top of fun CompileWorker.compileClass(...). FYI, this happens when a point in the bytecode can be reached in two different ways, with different things on the stack, so it is only related to loops/ifs etc.
Method vs Function
Internally, we use "Java" terminology, so "method" is used a LOT. We only use the word "function" when referring to the Feather-specific data type: Function.
Documentation for Feather script developers should NOT use the word method.
Argument vs Parameter
I have a bad habit of using the words argument and parameter interchangeably.
That's wrong!
* `parameter` refers to the signature of the method/function
* `argument` refers to the values passed to a method/function
I will endeavour to fix my mistake, but for now, don't believe what you read!
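A tiny Kotlin illustration of the difference (the function names are made up for this example):

```kotlin
// `greeting` and `times` are parameters: they belong to the signature.
fun repeatGreeting(greeting: String, times: Int): String = greeting.repeat(times)

// "Hi" and 2 are arguments: the values passed at this particular call site.
fun callSite(): String = repeatGreeting("Hi", 2)
```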
Type Naming Conventions
Consider the method call: `foo<Bar>( "Hello" )`

`Bar` is a type argument (which I generally refer to as `typeArg`).

"Hello" is an argument of type `String` (which I generally refer to as `argType`).

When finding a method, `paramTypes` refers to the types that the method expects.
So we need to check that the `argTypes` match the `paramTypes`.

For extra clarity, I try to prefix `argType` and `typeArg` with `resolved` to indicate that `ImpliedTypes` and `UnresolvedTypes` have been taken care of already, i.e. we should have a concrete type such as a Class or a MetaDataType.

It is a bug to pass an ImpliedType or UnresolvedType as an argument whose parameter name contains the word `resolved`.
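To make the naming concrete, here is a minimal sketch of checking argument types against parameter types. `argsMatchParams` is a hypothetical helper, not Feather's actual method lookup, and it uses plain Java classes only. It also ignores boxing and primitive widening, which a real lookup would have to handle.

```kotlin
// Hypothetical helper. By the time it is called, the argument types are "resolved":
// plain Classes, with no ImpliedType or UnresolvedType left.
fun argsMatchParams(paramTypes: List<Class<*>>, resolvedArgTypes: List<Class<*>>): Boolean =
    paramTypes.size == resolvedArgTypes.size &&
        paramTypes.zip(resolvedArgTypes).all { (paramType, argType) ->
            // Each argument must be usable where the parameter's type is expected.
            paramType.isAssignableFrom(argType)
        }
```

A String argument matches a CharSequence parameter, while an Integer argument does not match a String parameter.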
Possible Optimisations
As part of feather-core tests, I wrote code in both Feather and Java, and checked that they produced identical bytecode.
I stopped writing these side-by-side bytecode comparisons quite a while ago.
However, I found one notable difference: Feather doesn't use `IINC` with the ++ operator.
I should probably create a special case in the `Increment` class to check if the assignable operand is a LocalVariable.
Note, this isn't as bad as you might imagine, because most uses of ++ in Java are in for loops, and Tickle does use IINC in its for loops.
Also, when I look at generated bytecode, there's often ASTORE followed by ALOAD, which would probably be more efficient if it used DUP instead. I have no intention of optimising this, because it would be a lot of work for little gain. And I suspect that Java's JIT will optimise this out anyway!
String concatenation is not optimal. Feather uses a StringBuilder, even for simple expressions such as :
"Hello $name"