Notes
=====

How the compiler works
----------------------

You don't have to read this unless you want to tinker with the compiler (or are just curious!).


The FeatherCompiler instance can be reused multiple times.
It has attributes such as classFilter, simplifyMain, impliedImportPackages etc., which must be
set before the first call to compile (or you can choose to keep the default values).

FeatherCompiler has the following compile methods :
    compileFiles
    compileFromStrings : Useful if the source code isn't stored in a file.

Each compilation creates a Worker (an inner class of FeatherCompiler), and it is Worker that
holds the state information of a compilation.
So if you start two compilations, then there will be two Worker instances.
This allows compilations to be thread safe, as each is self contained using its own Worker.
NOTE: Changing the values of FeatherCompiler's attributes is NOT thread safe.
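
The Worker-per-compilation idea can be sketched in Java as follows. This is only an illustration,
not the real code: CompilerSketch, run and the fake "compilation" step are made-up placeholders,
and only the inner Worker holding the state comes from the notes above.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for FeatherCompiler. Only the "one Worker per compilation" idea is from the notes.
class CompilerSketch {
    // Shared configuration: set it before the first compile, then leave it alone.
    boolean simplifyMain = false;

    // All mutable state of a single compilation lives in its own Worker.
    class Worker {
        final Map<String, byte[]> results = new HashMap<>();

        Map<String, byte[]> run(List<String> sources) {
            for (String source : sources) {
                // Placeholder for real parsing and byte code generation.
                results.put("Class" + results.size(), source.getBytes(StandardCharsets.UTF_8));
            }
            return results;
        }
    }

    Map<String, byte[]> compileFromStrings(List<String> sources) {
        return new Worker().run(sources); // fresh state for every call
    }

    public static void main(String[] args) {
        CompilerSketch compiler = new CompilerSketch();
        Map<String, byte[]> first = compiler.compileFromStrings(List.of("x", "y"));
        Map<String, byte[]> second = compiler.compileFromStrings(List.of("z"));
        System.out.println(first.size() + " " + second.size());
    }
}
```

Because each call builds its own Worker, two compilations never share a results map. The shared
simplifyMain field is exactly the kind of attribute that should not be changed mid-compilation.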


A source file may contain more than one top-level class, and the name of the source file does
not need to be related in any way to the names of the classes defined inside.

The Worker creates an instance of Source for each piece of source code (a File or a String).

We then have two phases, the parsing phase, and the compilation phase.

Once both phases are complete, the result of the compilation is returned as a Map.
The key is the fully qualified class name, and the value is the Java byte code for that class.

FeatherCompiler does not save the byte code to a file, nor does it set up a classloader
which will allow the classes to be executed.
These tasks are quite straightforward. Look at the code in class Feather for details on how to
save byte code to .class or .jar files, and also how to create a ClassLoader.
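
For a rough idea of what those two tasks involve, here is a minimal Java sketch. It is not the
code from class Feather: ByteCodeOutput, saveClasses and MapClassLoader are hypothetical names, and
the only assumption taken from the notes is a Map from fully qualified class name to byte code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper, not part of Feather.
class ByteCodeOutput {

    // Writes each entry to <dir>/<package path>/<SimpleName>.class
    static void saveClasses(Map<String, byte[]> byteCode, Path dir) throws IOException {
        for (Map.Entry<String, byte[]> entry : byteCode.entrySet()) {
            Path file = dir.resolve(entry.getKey().replace('.', '/') + ".class");
            if (file.getParent() != null) {
                Files.createDirectories(file.getParent());
            }
            Files.write(file, entry.getValue());
        }
    }

    // Defines classes on demand, straight from the in-memory map.
    static class MapClassLoader extends ClassLoader {
        private final Map<String, byte[]> byteCode;

        MapClassLoader(Map<String, byte[]> byteCode, ClassLoader parent) {
            super(parent);
            this.byteCode = byteCode;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            byte[] bytes = byteCode.get(name);
            if (bytes == null) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("bytecode");
        // Dummy bytes, just to show the file layout; real byte code would come from the compiler.
        saveClasses(Map.of("com.example.Foo", new byte[] {1, 2, 3}), dir);
        System.out.println(Files.exists(dir.resolve("com/example/Foo.class")));
    }
}
```

The parent ClassLoader is consulted first, so only names it cannot find are looked up in the map.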

Parsing Phase
-------------

Each Source is parsed (using ANTLR).
The header section (containing package and import statements) updates the Source data.

As class or interface definitions are found, a MetaDataClass is created and added to the
Worker's metaDataMap. metaDataMap is keyed on the fully qualified class name, and the value
is a MetaDataClass.
MetaDataClass is similar to java.lang.Class; it holds meta data about the classes, such as
their constructors, methods and fields.

Each method body and each field's initial value is transformed into a StackEntry.
Here are some of the concrete types of StackEntry :

Identifier
    This could be the name of a local variable, a field or a Class name.
    During the parsing phase we don't know what the identifier will refer to.
    Its attribute "actual" is null for now, and will only be set during the compilation phase.

ConstantStackEntry
    Constant strings, numbers, boolean values etc.

An operator (e.g. AdditionOperator, SubtractionOperator etc).
    During the parsing phase, we have a stack of StackEntry, and when we find an operator,
    the operands are popped off the stack, and the operator is pushed onto the stack.
    For example the expression : a + 1
    Identifier for a is pushed onto the stack, then a ConstantStackEntry for 1 is pushed,
    then the "+" will cause both to be popped, and an AdditionOperator is pushed onto the stack.
    So the stack ends up with a single StackEntry (not three!)

BlockStackEntry
    A Block contains a sequence of other StackEntries. Each time you see code
    enclosed in curly braces, that is a new Block. Sometimes a Block will be created containing a
    single expression.

FunctionCall, MethodCall
    This is where things start getting tricky! During the parsing phase, we know very little
    about the Types. The code a.foo(b) is obviously a method call, but we don't know the types
    of a or b yet (well we will in some cases, but not all, so we can't assume anything yet).
    "foo" might be a real method of a, but it may also be an extension function, or it may even
    be an error, and there is no method called "foo".
    So at this stage FunctionCall and MethodCall have a field called "actual" which will be
    filled in later.

There are many other kinds of StackEntry, such as Return, For, While, DoWhile, Assignment...
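
The "a + 1" example from the operator description can be sketched in Java as follows. The class
names are borrowed from the notes, but these tiny classes are stand-ins written for illustration,
and their fields and constructors are assumptions rather than the real implementations.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: simplified stand-ins for the real StackEntry classes.
class ParseStackSketch {
    interface StackEntry {}

    static final class Identifier implements StackEntry {
        final String name;
        Object actual = null; // unknown until the compilation phase
        Identifier(String name) { this.name = name; }
    }

    static final class ConstantStackEntry implements StackEntry {
        final Object value;
        ConstantStackEntry(Object value) { this.value = value; }
    }

    static final class AdditionOperator implements StackEntry {
        final StackEntry left;
        final StackEntry right;
        AdditionOperator(StackEntry left, StackEntry right) {
            this.left = left;
            this.right = right;
        }
    }

    // Builds the stack for the expression: a + 1
    static Deque<StackEntry> parseAPlusOne() {
        Deque<StackEntry> stack = new ArrayDeque<>();
        stack.push(new Identifier("a"));
        stack.push(new ConstantStackEntry(1));
        // On "+": the right operand is on top, so it is popped first.
        StackEntry right = stack.pop();
        StackEntry left = stack.pop();
        stack.push(new AdditionOperator(left, right));
        return stack;
    }

    public static void main(String[] args) {
        System.out.println(parseAPlusOne().size()); // prints 1
    }
}
```

The single remaining entry is the AdditionOperator, which now owns the two operands as children.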

Many of these StackEntries store the position within the source code, so that if an error is
found in the next stage, the position of the error can be reported.
Most errors are detected in the compilation phase.

The types of StackEntries can be unknown for two reasons :
    1) We haven't yet parsed the source which defines the class to which they refer
    2) The type is an implied type, such as : val a = foo.baz()
       The type of "a" is implied; it is whatever the return type of the method baz() is.

So during the parsing stage, we never refer to StackEntry.type() as it won't give us a good answer!

Receivers
---------

When looking up a field or a method when the subject is not explicit :
    e.g. foo = 1 // Assuming foo is a field on the current class
    or   foo(1) // Assuming foo is a method of the current class
we use a stack of Receivers.
"this" is the most common receiver, but there can be others.
The special "apply" construct adds to the stack of receivers.
In the following example, when we get to "length", there are two receivers, the String "Hello"
as well as "this". In this case the result will be 5, because the String receiver comes before "this".

    class Foo {
        fun length() = 0
        fun foo() : int {
            "Hello".apply {
                return length()
            }
        }
    }

Currently "apply" is the only special construct which adds a receiver, but I may add more later.
Alas, I couldn't implement "apply" using extension functions, due to Feather's lack of lambdas
(and also limitations of its type system), so "apply" is hard-wired into the compiler, and is a keyword.
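
The lookup order can be sketched in Java like this. It is a simplified model, not the compiler's
code: Receiver, findMethod and the use of plain method-name sets are assumptions made for the
example, which mirrors the "Hello".apply { ... } case above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

// Hypothetical model: a receiver is just a label plus the method names it offers.
class ReceiverSketch {
    static final class Receiver {
        final String description;
        final Set<String> methodNames;
        Receiver(String description, Set<String> methodNames) {
            this.description = description;
            this.methodNames = methodNames;
        }
    }

    // Returns the innermost receiver offering the method, or null if none does.
    static Receiver findMethod(Deque<Receiver> receivers, String methodName) {
        // An ArrayDeque used as a stack iterates from the most recently pushed entry.
        for (Receiver receiver : receivers) {
            if (receiver.methodNames.contains(methodName)) {
                return receiver;
            }
        }
        return null;
    }

    // The receivers in scope inside the "apply" block of the example.
    static Deque<Receiver> receiversInsideApply() {
        Deque<Receiver> receivers = new ArrayDeque<>();
        receivers.push(new Receiver("this (Foo)", Set.of("foo", "length")));
        receivers.push(new Receiver("String \"Hello\"", Set.of("length")));
        return receivers;
    }

    public static void main(String[] args) {
        Deque<Receiver> receivers = receiversInsideApply();
        System.out.println(findMethod(receivers, "length").description); // prints String "Hello"
        System.out.println(findMethod(receivers, "foo").description);    // prints this (Foo)
    }
}
```

A name that only "this" offers (foo here) is still found, because the search simply carries on
down the stack when an inner receiver does not match.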

The Compilation Phase
---------------------

After all the Sources have been parsed (converted into MetaDataClass), we are ready to start
generating Java byte code.

Worker.compileClass is called for each item in the metaDataMap.
This in turn calls Worker.compileField and Worker.compileMethod for each field and method.

For each method, we compile the method's BlockStackEntry, which in turn compiles each of its
StackEntries.

Fields and methods that have implied types are given actual types, so the MetaDataField.returnType()
and MetaDataMethod.returnType() will be a "real" type (either a Class or a MetaDataClass).

In the simplest case, we iterate through each class once, and we are done.
However, we may not be that lucky. Perhaps a method (or field) from class A cannot be determined
until another class B has been compiled. In this case compiling A will throw an exception,
which is caught; we can then start compiling class B, and return to A later.
This part is vague because I haven't written it yet ;-)
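
One possible shape for that retry loop, sketched in Java. Since this part of the compiler is
unwritten, everything here is an assumption: NotReadyException, compileAll and the explicit
dependency map are invented for the example, and the real compiler would discover the
dependencies while compiling rather than be told them up front.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "throw, catch, and come back later".
class RetrySketch {
    static final class NotReadyException extends RuntimeException {
        NotReadyException(String message) { super(message); }
    }

    // Stand-in for Worker.compileClass: fails if a class it needs has not been compiled yet.
    static void compileClass(String name, Set<String> needs, List<String> done) {
        if (!done.containsAll(needs)) {
            throw new NotReadyException(name + " is waiting for another class");
        }
    }

    // Returns the class names in the order they were successfully compiled.
    static List<String> compileAll(Map<String, Set<String>> dependencies) {
        Deque<String> pending = new ArrayDeque<>(dependencies.keySet());
        List<String> done = new ArrayList<>();
        int failuresInARow = 0;
        while (!pending.isEmpty()) {
            String name = pending.removeFirst();
            try {
                compileClass(name, dependencies.get(name), done);
                done.add(name);
                failuresInARow = 0;
            } catch (NotReadyException e) {
                pending.addLast(name); // return to this class later
                failuresInARow++;
                // Every pending class failed without progress: give up rather than loop forever.
                if (failuresInARow >= pending.size()) {
                    throw new IllegalStateException("Cannot resolve: " + pending, e);
                }
            }
        }
        return done;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> dependencies = new LinkedHashMap<>();
        dependencies.put("A", Set.of("B"));
        dependencies.put("B", Set.of());
        System.out.println(compileAll(dependencies)); // prints [B, A]
    }
}
```

The failure counter is the important detail: without it, two classes whose implied types depend
on each other would bounce around the queue forever instead of being reported as an error.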