diff --git a/HOME_EXAM.md b/HOME_EXAM.md index 046d480551c28d03bc7bac185385ec575ffa2ef5..00646219e10b9c6fd953cef060b6dda8a9a0cf18 100644 --- a/HOME_EXAM.md +++ b/HOME_EXAM.md @@ -4,66 +4,61 @@ ## Your repo -- Link to your repo here: +- Link to your repo here: [https://github.com/widforss/compiler](https://github.com/widforss/compiler) ## Your syntax -- Give an as complete as possible EBNF grammar for your language - -- Give an example that showcases all rules of your EBNF. The program should "do" something as used in the next excercise. - -- If you support pointers, make sure your example covers pointers as well. - -- Compare your solution to the requirements (as stated in the README.md). What are your contributions to the implementation. - +- Give an as complete as possible EBNF grammar for your language: [ebnf.ebnf](ebnf.ebnf) +- Give an example that showcases all rules of your EBNF. The program should "do" something as used in the next exercise: [programs/prime_gen](https://github.com/widforss/compiler/blob/master/programs/prime_gen) ## Your semantics -- Give an as complete as possible Structural Operetional Semantics (SOS) for your language +- Give an as complete as possible Structural Operational Semantics (SOS) for your language: [structural_operational_semantics.pdf](structural_operational_semantics.pdf) -- Explain (in text) what an interpretation of your example should produce, do that by dry running your given example step by step. Relate back to the SOS rules. You may skip repetions to avoid cluttering. - -- Compare your solution to the requirements (as stated in the README.md). What are your contributions to the implementation. +- Explain (in text) what an interpretation of your example should produce, do that by dry running your given example step by step. Relate back to the SOS rules. 
You may skip repetitions to avoid cluttering: [walkthrough.md](walkthrough.md) ## Your type checker -- Give an as complete as possible set of Type Checking Rules for your language (those rules look very much like the SOS rules, but over types not values). - -- Demonstrate each "type rule" by an example. +- Give an as complete as possible set of Type Checking Rules for your language (those rules look very much like the SOS rules, but over types not values): [type_checking_rules.pdf](type_checking_rules.pdf) -- Compare your solution to the requirements (as stated in the README.md). What are your contributions to the implementation. +- Demonstrate each "type rule" by an example: [programs/type_checking](https://github.com/widforss/compiler/blob/master/programs/type_checking) -## Your borrrow checker +## Your borrow checker - Give a specification for well versus ill formed borrows. (What are the rules the borrow checker should check). -- Demonstrate the cases of ill formed borrows that your borrow checker is able to detect and reject. - -- Compare your solution to the requirements (as stated in the README.md). What are your contributions to the implementation. + - The return lifetime must be declared. + - The return lifetime must exist in the parameter lifetimes. + - Two return lifetimes that are declared identical must originate from the same function's state. + - The declared return lifetime must correspond with the lifetime of return statements, calculated by tracking references through the function's program flow. -## Your LLVM backend +- Demonstrate the cases of ill formed borrows that your borrow checker is able to detect and reject: [programs/bad_borrows](https://github.com/widforss/compiler/blob/master/programs/bad_borrows) -- Let your backend produces LLVM-IR for your example program. - -- Describe where and why you introduced allocations and phi nodes. 
- - If you have added optimization passes and/or attributed your code for better optimization (using e.g., `noalias`). - -- Compare your solution to the requirements (as stated in the README.md). What are your contributions to the implementation. - -## Overal course goals and learning outcomes. +## Overall course goals and learning outcomes. Comment on the alignment of the concrete course goals (taken from the course description) to the theory presented, work You have done and knowledge You have gained. (I have put some comments in [...]). - Lexical analysis, syntax analysis, and translation into abstract syntax. + - Probably the hardest part theoretically, as it was important to get the data structures right here. However, it was the easiest part to program in Rust due to the strictly hierarchical structure of the program flow. Precedence climbing and figuring out in what order the parser should check for things were fun problems to work with. This was the part I had the best knowledge of beforehand. + - The work I did here on piggybacking metadata onto the AST proved very helpful when implementing error messages. + - Regular expressions and grammars, context-free languages and grammars, lexer and parser generators. [Nom is lexer/parser library (and replaces the need for a generator, while lalr-pop is a classical parser generator)] + - I basically read through the code yesterday and built my context-free grammar from that. This was very easy, as it was simply a translation of the existing code. EBNF is rather trivial, so it did not pose a problem. + - Identifier handling and symbol table organization. Type-checking, logical inference systems. [SOS is a logical inference system] + + - I find SOS a hard concept to grasp. It works well for simple expressions and statements, but what about function calls and return statements? 
+ - The same is true for the Type Checking rules for statements, as my statements are type-free (although they are dependent on internal type-checking). + - The practical type-checking worked out rather well though. It's very liberating when working with the interpreter to know that the AST is type-correct. + - Ditto for borrow-checking. - Intermediate representations and transformations for different languages. [LLVM is a cross language compiler infrastructure] + - I did very little work in this area. - Code optimization and register allocation. Machine code generation for common architectures. [LLVM is a cross target compiler infrastructure, doing the "dirty work" of optimazation/register allocation leveraging the SSA form of the LLVM-IR] + - I did very little work in this area. Comment on additional things that you have experienced and learned throughout the course. diff --git a/ebnf.ebnf b/ebnf.ebnf new file mode 100644 index 0000000000000000000000000000000000000000..ae0866d3b9e10573b0f0b08bb62f0247b0b3cf83 --- /dev/null +++ b/ebnf.ebnf @@ -0,0 +1,62 @@ +program = { { ws }, function }, { ws } ; + +function = "fn ", { ws }, ident, { ws }, + "<", [ lifetime, { ws }, { ",", lifetime } ], ">", { ws }, + "(", [ param, { ws }, { ",", param } ], ")", { ws }, + "->", { ws }, type_life, { ws }, block ; +param = { ws }, ident, { ws }, ":", { ws }, type_life ; + +stmt = { ws }, ( block | let | while | if_else + | return | print | assign | fn_stmt ) ; + +block = "{", { stmt }, { ws }, "}" ; +let = "let ", { ws }, [ "mut " ], ident, { ws }, + ":", { ws }, type, { ws }, "=", expr, { ws }, ";" ; +while = "while", ( parens | " ", expr ), { ws }, block ; +if_else = "if", ( parens | " ", expr ), { ws }, block, { ws }, + [ "else", ( { ws }, block | ws, if_else ) ] ; +return = "return ", { ws }, expr, { ws }, ";" ; +print = "print ", { ws }, expr, { ws }, ";" ; +assign = { "*", { ws } }, ident, { ws }, "=", expr, { ws }, ";" ; +fn_stmt = expr, { ws }, ";" ; + +type_life = primitive | 
"&", lifetime, type_suffix ; +type = primitive | "&", type_suffix ; +lifetime = { ws }, "'", ident ; +type_suffix = { ws }, [ "mut " ], { ws }, type ; +primitive = "()" | "bool" | "i64" | "f64" ; + +expr = { ws } ( binexpr | nobinexpr ) ; +nobinexpr = parens | literal | call | ident | unexpr ; + +unexpr = unop, nobinexpr ; +binexpr = nobinexpr, { ws }, binop, expr ; +parens = "(", expr, { ws }, ")" ; +ident = ( alpha | "_" ) , { alphanum | "_" } ; +call = ident, { ws }, + "(", [ ident, { ws }, { ",", { ws }, ident } ], ")" ; + +unop = "-" | "!" | "." | "*" | "&" ; +binop = "**" | "*" | "%" | "//" | "/" | "+" | "-" | "<=" + | ">=" | "<" | ">" | "==" | "!=" | "&&" | "||" ; + +int = [ "-" ], { ws }, numeric, { numeric } ; +float = [ "-" ], [ ws ], numeric, { numeric }, ".", { numeric } + | "NAN" | [ "-" ], [ white_space ], "INFINITY" ; +unit = "()" +bool = [ "!" ], { ws }, ( "true" | "false" ) ; +number = int | float ; +literal = unit | bool | number ; + +ws = "\t" | "\n" | "\r" | " " ; +numeric = "0" | "1" | "2" | "3" | "4" + | "5" | "6" | "7" | "8" | "9" ; +alpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" + | "H" | "I" | "J" | "K" | "L" | "M" | "N" + | "O" | "P" | "Q" | "R" | "S" | "T" | "U" + | "V" | "W" | "X" | "Y" | "Z" + | "a" | "b" | "c" | "d" | "e" | "f" | "g" + | "h" | "i" | "j" | "k" | "l" | "m" | "n" + | "o" | "p" | "q" | "r" | "s" | "t" | "u" + | "v" | "w" | "x" | "y" | "z" ; +alphanum = numeric | alpha ; \ No newline at end of file diff --git a/structural_operational_semantics.pdf b/structural_operational_semantics.pdf new file mode 100644 index 0000000000000000000000000000000000000000..433783c95b0256d438fb9cc852801b5e15a09781 Binary files /dev/null and b/structural_operational_semantics.pdf differ diff --git a/type_checking_rules.pdf b/type_checking_rules.pdf new file mode 100644 index 0000000000000000000000000000000000000000..a4f9f9d699b9156c1e4175b9aa74beaa27ab6ae4 Binary files /dev/null and b/type_checking_rules.pdf differ diff --git a/walkthrough.md 
b/walkthrough.md new file mode 100644 index 0000000000000000000000000000000000000000..e4b63a3186c1aa264bd3285ed64e9f3352abb56a --- /dev/null +++ b/walkthrough.md @@ -0,0 +1,252 @@ + fn main<>() -> () { + // State: + // [] + + let mut n: &mut i64 = &mut 2; + // SOS eqs. 44, 1 + // State: + // [ + // { State for main() + // !anonymous: [2], + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + let max_int: i64 = (1 + 1) ** 7; + // SOS eqs. 12, 38, 33, 1 + // State: + // [ + // { State for main() + // !anonymous: [2], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // ] + + while *n < max_int { + // SOS eqs. 13, 45, 15, 4 + // Evaluates to `2 < 128`, i.e. `true` + + print *n; + // SOS eqs. 13, 45, 10 + + next_prime(n); + // SOS eqs. 13, 14, 3 + fn next_prime<'a>(n: &'a mut i64) -> () { + // State: + // [ + // { State for main() + // !anonymous: [2], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + while true { + // SOS eq. 4 + + *n = *n + 1; + // SOS eqs. 13, 45, 38, 2 + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + if is_prime(*n) { + // SOS eqs. 13, 45, 14, 6 + fn is_prime<>(n: i64) -> bool { + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // { State for is_prime() + // n: [3], + // }, + // ] + + let mut div: f64 = 2.0; + // SOS eq. 1 + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // { State for is_prime() + // n: [3], + // div: [2.0], + // }, + // ] + + while div ** 2. 
<= .n { + // SOS eqs. 13, 33, 20, 43, 5 + // Evaluates to `2.0 ** 2.0 <= 3.0`, i.e. `false` + } + + return true; + // SOS eq. 11 + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + // if condition that called is_prime() evaluates to `true` + return (); + // SOS eq. 11 + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // ] + + while *n < max_int { + // SOS eqs. 13, 45, 15, 4 + // Evaluates to `3 < 128`, i.e. `true` + + print *n; + // SOS eqs. 13, 45, 10 + + next_prime(n); + // SOS eqs. 13, 14, 3 + fn next_prime<'a>(n: &'a mut i64) -> () { + // State: + // [ + // { State for main() + // !anonymous: [3], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + while true { + // SOS eq. 4 + + *n = *n + 1; + // SOS eqs. 13, 45, 38, 2 + // State: + // [ + // { State for main() + // !anonymous: [4], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + if is_prime(*n) { + // SOS eqs. 13, 45, 14, 7 + fn is_prime<>(n: i64) -> bool { + // State: + // [ + // { State for main() + // !anonymous: [4], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // { State for is_prime() + // n: [4], + // }, + // ] + + let mut div: f64 = 2.0; + // SOS eq. 1 + // State: + // [ + // { State for main() + // !anonymous: [4], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // { State for is_prime() + // n: [4], + // div: [2.0], + // }, + // ] + + while div ** 2. <= .n { + // SOS eqs. 
13, 33, 20, 43, 4 + // Evaluates to `2.0 ** 2.0 <= 4.0`, i.e. `true` + + if .n % div == 0. { + // SOS eqs. 13, 43, 35, 23, 6 + // Evaluates to `4.0 % 2.0 == 0.0`, i.e. `true` + return false; + // SOS eq. 11 + // State: + // [ + // { State for main() + // !anonymous: [4], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + // if condition that called is_prime() evaluates to `false` + } + } + + while true { + // SOS eq. 4 + + *n = *n + 1; + // SOS eqs. 13, 45, 38, 2 + // State: + // [ + // { State for main() + // !anonymous: [5], + // n: [Ref(0, !anonymous, 0)], + // max_int: [128], + // }, + // { State for next_prime() + // n: [Ref(0, !anonymous, 0)], + // }, + // ] + + if is_prime(*n) { + // SOS eqs. 13, 45, 14, 7 + + ... \ No newline at end of file
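
For readers who want to check the dry run above against a running program, here is a minimal sketch of the same control flow in plain Rust. It is a reconstruction from the walkthrough only, not the contents of `programs/prime_gen`. The helper `primes_below` is hypothetical and exists only so the output can be tested, and the `div += 1.0` step is an assumption, since the walkthrough never reaches the point where `div` is updated.

```rust
// Reconstruction of the walked-through program in ordinary Rust.
// Names follow the walkthrough; `primes_below` is a hypothetical helper.

fn is_prime(n: i64) -> bool {
    let mut div: f64 = 2.0;
    // Mirrors `while div ** 2. <= .n`, where `.n` converts n to f64.
    while div * div <= n as f64 {
        // Mirrors `if .n % div == 0. { return false; }`.
        if (n as f64) % div == 0.0 {
            return false;
        }
        // ASSUMPTION: the divisor is incremented by one; this step is
        // not shown in the walkthrough.
        div += 1.0;
    }
    true
}

fn next_prime(n: &mut i64) {
    // Mirrors `while true { *n = *n + 1; if is_prime(*n) { return (); } }`.
    loop {
        *n += 1;
        if is_prime(*n) {
            return;
        }
    }
}

// Collects the values that `print *n` would emit in main()'s loop.
fn primes_below(max_int: i64) -> Vec<i64> {
    let mut found = Vec::new();
    let mut n: i64 = 2;
    while n < max_int {
        found.push(n);
        next_prime(&mut n);
    }
    found
}

fn main() {
    // Mirrors `let max_int: i64 = (1 + 1) ** 7;`, which evaluates to 128.
    let max_int: i64 = (1_i64 + 1).pow(7);
    for p in primes_below(max_int) {
        println!("{}", p);
    }
}
```

Under these assumptions the first printed values are 2 and 3, matching the two `print *n` steps traced in the walkthrough, and the value 4 is rejected by `is_prime` just as in the second call traced above.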