
Feather / implementation.md

A few notes on the implementation, which are only applicable to those reading/writing the Feather compiler. As I don't expect any help writing Feather, this is really only for future-me!

General Structure

FeatherCompiler is the outward face, visible to the users of Feather. CompilerWorker is where all the hard work happens!

Compilation is a two-stage process.

The First Phase

The Feather script is parsed and turned into data, including MetaDataClass, MetaDataMethod etc., which in turn have blocks of code (see class Block). Blocks have a set of statements, which are stored as StackEntry (there are many different types!).

StackEntry is badly named. They ARE in a stack during the first phase, but they are also the "statements" in a Block, and that's a simple list, not a stack! Oops.

Example of how a simple expression is parsed: 1 + i

Antlr does its magic, and tells us it has found 1. We push an IntConstant on the stack. Then it tells us about "i", and we push a new LocalVariableStackEntry. Then it tells us that the binary operator is complete. We pop both operands. The stack is now empty again. Now we create a "plus" stack entry, giving it the two StackEntries we just popped. Push the "plus" on the stack, and we are done!
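The walk-through above can be sketched in a few lines of Kotlin. This is only an illustration: IntConstant and LocalVariableStackEntry borrow their names from the notes, but the classes here are simplified stand-ins, and Plus, ExpressionBuilder and its callback names are hypothetical, not Feather's real listener.

```kotlin
// Simplified stand-ins for Feather's StackEntry classes (the real ones hold far more).
sealed class StackEntry
data class IntConstant(val value: Int) : StackEntry()
data class LocalVariableStackEntry(val name: String) : StackEntry()
data class Plus(val left: StackEntry, val right: StackEntry) : StackEntry()

// Hypothetical builder: each function mimics one callback from the parser.
class ExpressionBuilder {
    val stack = ArrayDeque<StackEntry>()

    fun intLiteral(value: Int) {
        stack.addLast(IntConstant(value))
    }

    fun identifier(name: String) {
        stack.addLast(LocalVariableStackEntry(name))
    }

    fun exitPlus() {
        // The right-hand operand was pushed last, so it is popped first.
        val right = stack.removeLast()
        val left = stack.removeLast()
        stack.addLast(Plus(left, right))
    }
}

// Replays the callbacks for "1 + i" and returns the single entry left on the stack.
fun parseOnePlusI(): StackEntry {
    val builder = ExpressionBuilder()
    builder.intLiteral(1)
    builder.identifier("i")
    builder.exitPlus()
    return builder.stack.single()
}
```

The result is a small tree rather than a flat list, which is why the same objects can later serve as the "statements" of a Block.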

During the parsing phase the types of fields, variables, method parameters and return types are not all known, because we have yet to reach some source code, and the names of the types may only be found later. If we see a class name, but cannot find it, we assume it will be found later and create an UnresolvedType which holds enough information to deduce the actual type later (or throw an exception if the class name is still not found).

Therefore, during the first phase, we cannot search for methods. Instead, we make use of Lazy StackEntries (there are a few). They store enough information to discover the actual method call/field access during the second phase (or throw an exception if the method isn't found).

The Second Phase

During the second phase, we take all the MetaDataClasses (and their gubbins, including StackEntries), and begin creating Java bytecode. All class names and methods are now known.

During this phase, we can resolve the types i.e. turn an UnresolvedType into a MetaDataType.
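A minimal sketch of the two-phase idea, assuming a much simpler model than the real one. The names UnresolvedType and MetaDataType come from these notes, but the fields, the TypeTable class and its functions are hypothetical.

```kotlin
// Hypothetical, simplified model of the two kinds of type.
sealed class Type
data class MetaDataType(val name: String) : Type()
data class UnresolvedType(val name: String) : Type()

class TypeTable {
    private val known = mutableMapOf<String, MetaDataType>()

    // Phase one: record each class as its declaration is parsed.
    fun declare(name: String) {
        known[name] = MetaDataType(name)
    }

    // Phase one: use the real type if it has been seen, otherwise a placeholder.
    fun lookup(name: String): Type = known[name] ?: UnresolvedType(name)

    // Phase two: every class is known, so a placeholder either resolves or is an error.
    fun resolve(type: Type): MetaDataType = when (type) {
        is MetaDataType -> type
        is UnresolvedType -> known[type.name]
            ?: throw IllegalStateException("Unknown class: ${type.name}")
    }
}
```

The placeholder only needs to remember the name it was asked for; anything else can be looked up once all the source has been read.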

Special Types

Array

When dealing with arrays, during the compilation, the internal representation uses uk.co.nickthecoder.feather.internal.Array. The compiled code does NOT reference this class in any way.

Function

Unlike Java, Feather treats functions as first-class citizens. You can store them in variables, pass them to methods etc. The type for a function is uk.co.nickthecoder.feather.reflect.Function. This is a Java class, and is part of the Feather runtime. This type is found in compiled code (assuming it uses function types!).

Implied Types

During the second phase, there is still a problem with Types. Feather allows us to declare fields and variables without explicitly declaring their type. When the implied type refers to a Java library, things are simple. When it refers to a Feather type, it's not so simple.

This is because class A may reference class B (in a different source file), and we may only be able to deduce an implied type in A, after B has been compiled.

The ImpliedType class helps out, by automatically compiling class B halfway through compiling class A (when the implied type is needed).

It also checks for cyclic dependencies, in which case, the script writer must explicitly declare one of the types.
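The on-demand compilation and the cycle check can be sketched like this. It is an assumption-laden illustration, not the real ImpliedType: OnDemandCompiler and its dependency map are hypothetical, and "compiling" a class is reduced to recording its name.

```kotlin
// Hypothetical sketch: `dependencies` maps a class to the classes whose implied types it needs.
class OnDemandCompiler(private val dependencies: Map<String, List<String>>) {
    private val inProgress = LinkedHashSet<String>()

    // The order in which classes finished compiling.
    val compiled = mutableListOf<String>()

    fun compile(className: String) {
        if (className in compiled) return

        // Re-entering a class that is still being compiled means the implied types form a cycle.
        if (!inProgress.add(className)) {
            val cycle = (inProgress.dropWhile { it != className } + className).joinToString(" -> ")
            throw IllegalStateException("Cyclic implied types: $cycle. Declare one type explicitly.")
        }

        // Part-way through this class, compile the classes it depends on.
        for (dependency in dependencies[className].orEmpty()) {
            compile(dependency)
        }

        inProgress.remove(className)
        compiled.add(className)
    }
}
```

With A depending on B, asking for A finishes B first. If B also depends on A, the second visit to A is detected and reported instead of recursing forever.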

NOTE: Currently (Aug-2022), inferring implied types isn't perfect. Sometimes vals/functions will need explicit types when, in theory, it would be possible to resolve them.

Boxed vs Primitive Types

Currently, I've not allowed Feather to reference the boxed versions of Java primitives directly (you cannot use the type Float for example). The main reason for this: I'm a Kotlin programmer, and it would be very easy to accidentally use the word "Float" when I mean "float".

Kotlin does an excellent job hiding the naffness of Java primitives. Feather doesn't. This is in part because Feather doesn't have a nullable qualifier on its types.

Kotlin's Int? is the same as Java's Integer class, whereas Kotlin's Int is the same as Java's int primitive.
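The Int / Int? mapping can be checked from Kotlin itself using plain Java reflection. Sample and firstParameterType are hypothetical names used only for this illustration.

```kotlin
class Sample {
    fun primitive(x: Int): Int = x    // compiled as a method taking Java's int
    fun boxed(x: Int?): Int? = x      // compiled as a method taking java.lang.Integer
}

// Looks up a method of Sample by name and returns the JVM type of its first parameter.
fun firstParameterType(methodName: String): Class<*> =
    Sample::class.java.declaredMethods.first { it.name == methodName }.parameterTypes[0]
```

Calling firstParameterType("primitive") gives the primitive int class, while firstParameterType("boxed") gives java.lang.Integer.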

Gotchas

My knowledge/skill writing a compiler and understanding ANTLR has improved greatly. There may well be some amateurish stuff still lurking about which is tricky to understand, because it is doing things the "wrong" way.

Antlr is written in Java, and so we don't have Kotlin's differentiation between nullable and non-nullable values. To make things worse, parts of the grammar which are declared as mandatory can still be passed as null. I'm not sure if this is an Antlr bug, or not. But it is bloody annoying! For example, when I use a keyword in place of the name when declaring a local variable, the "simpleIdentifier" is null. So, I've gone through and made my code "defensive", checking for null values, and returning:

    fun exitXXX( ctx : xxx ) {
        // If foo is mandatory, but is missing, there's nothing we can do!!!
        ctx.foo() ?: return
    }

Fingers crossed Antlr eventually gets around to spitting out an error, because if it doesn't the compiled code will be garbage!
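A self-contained sketch of the same defensive pattern. VariableDeclarationContext and Listener are hypothetical stand-ins; the real contexts are Java classes generated by Antlr, which is why Kotlin cannot know that the accessor may return null.

```kotlin
// Hypothetical stand-in for a generated parser context.
class VariableDeclarationContext(private val identifier: String?) {
    fun simpleIdentifier(): String? = identifier
}

class Listener {
    val declared = mutableListOf<String>()

    fun exitVariableDeclaration(ctx: VariableDeclarationContext) {
        // Mandatory in the grammar, yet missing: give up and rely on the parser to report it.
        val name = ctx.simpleIdentifier() ?: return
        declared.add(name)
    }
}
```

The early return keeps the listener from crashing, but nothing is recorded for the broken declaration, so the result is only safe if the parser reports the error too.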

When a bug in Feather creates incorrect bytecode, it is handy to see what it created. Alas, some illegal bytecode doesn't play nicely with ASM, and it strips chunks of bytecode from the results. The "fix" for this is to temporarily remove the COMPUTE_FRAMES flag at the top of fun CompilerWorker.compileClass(...). FYI, this happens when a point in the bytecode can be reached in two different ways, with different things on the stack, so it is only related to loops/ifs etc.

Method vs Function

Internally, we use "Java" terminology, so "method" is used a LOT. We only use the word "function" when referring to the Feather-specific data type: Function.

Documentation for Feather script developers should NOT use the word method.

Argument vs Parameter

I have a bad habit of using the words argument and parameter interchangeably. That's wrong!

* `parameter` refers to the signature of the method/function
* `argument` refers to the values passed to a method/function

I will endeavour to fix my mistake, but for now, don't believe what you read!

Type Naming Conventions

Consider the method call : foo<Bar>( "Hello" )

Bar is a type argument (which I generally refer to as typeArg). "Hello" is an argument of type String (which I generally refer to as argType).

When finding a method, paramTypes refers to the types that the method expects. So we need to check that the argTypes match the paramTypes.

For extra clarity, I try to prefix argType and typeArg with resolved to indicate that ImpliedTypes and UnresolvedTypes have been taken care of already, i.e. we should have a Class or a MetaDataType.

It is a bug to pass an ImpliedType or UnresolvedType as an argument to a parameter whose name contains the word resolved.
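To make the naming concrete, here is a minimal sketch of checking argTypes against paramTypes. It assumes every type has already been resolved to a plain Java Class; argsMatchParams is a hypothetical helper, and it ignores primitives, boxing and type arguments, all of which a real method search has to handle.

```kotlin
// Hypothetical helper: true if each argument could be passed to the matching parameter.
fun argsMatchParams(resolvedArgTypes: List<Class<*>>, paramTypes: List<Class<*>>): Boolean =
    resolvedArgTypes.size == paramTypes.size &&
        paramTypes.zip(resolvedArgTypes).all { (param, arg) -> param.isAssignableFrom(arg) }
```

For the call foo<Bar>( "Hello" ), resolvedArgTypes would be a list holding just String, and it matches a method whose single parameter is a String or a CharSequence.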

Possible Optimisations

As part of the feather-core tests, I wrote code in Feather and Java, and checked that they produced identical bytecode.

I stopped writing these side-by-side bytecode comparisons quite a while ago.

However, I found one notable difference: Feather doesn't use IINC with the ++ operator. I should probably create a special case in the Increment class to check if the assignable operand is a LocalVariable.

Note, this isn't as bad as you might imagine, because most uses of ++ in Java are in for loops, and Tickle does use IINC in its for loops.

Also, when I look at generated bytecode, there's often ASTORE followed by ALOAD, which would probably be more efficient if it used DUP instead. I have no intention of optimising this, because it would be a lot of work for little gain. And I suspect that Java's JIT will optimise this out anyway!

String concatenation is not optimal. Feather uses a StringBuilder, even for simple expressions such as :

"Hello $name"