Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Commits on Source (19)
......@@ -2,9 +2,16 @@
- Fork this repo and put your answers (or links to answers) in THIS file.
_Answers can be found at the end of this file_
- [Syntax](#syntax)
- [Semantics](#semantics)
- [Type Checker](#type-checker)
- [Borrow Checker](#borrow-checker)
- [Conclusion](#conclusion)
## Your repo
- Link to your repo here:
- Link to your repo here: https://github.com/wilkru-7/D7050E
## Your syntax
......@@ -69,3 +76,648 @@ Comment on the alignment of the concrete course goals (taken from the course des
- Code optimization and register allocation. Machine code generation for common architectures. [Both LLVM/Crane-Lift does the "dirty work" of backend optimization/register allocation leveraging the SSA form of the LLVM-IR]
Comment on additional things that you have experienced and learned throughout the course.
# Home Exam - Answers by Wilma Krutrök
The following sections cover the bullet points above. The answers are based on the attached code.
## Syntax
The implemented language is described below using an EBNF grammar. The language is implemented using lalrpop.
#### EBNF
```ebnf
Program
:Function*
;
```
```ebnf
Function
:"fn" Id "(" (Arg ",")* Arg? ")" "->" Type BlockExpr
|"fn" Id "()" "->" "()" BlockExpr
;
```
```ebnf
BlockExpr
:"{" (Stmt ";")* Stmt? "}"
;
```
```ebnf
Stmt
:"if" Expr BlockExpr ("else" BlockExpr)?
|"while" Expr BlockExpr
|Decl
|Expr
;
```
```ebnf
Decl
:"let" Id ":" Type? "=" Expr
|"let mut" Id ":" Type? "=" Expr
|Id "=" Expr
|"*" Id "=" Expr
;
```
```ebnf
Arg
:Id ":" Type
;
```
```ebnf
Expr
:Expr ExprOp Factor
|Expr LogicOp Factor
|Id "(" (Expr ",")* Expr? ")"
|Factor
;
```
```ebnf
ExprOp
:"+"
|"-"
;
```
```ebnf
LogicOp
:"&&"
|"||"
|"<"
|">"
|"<="
|">="
|"!="
|"=="
;
```
```ebnf
Factor
:Factor FactorOp Term
|Term
;
```
```ebnf
FactorOp
:"*"
|"/"
;
```
```ebnf
Term
:PrefixOp NumOrId
|NumOrId
|"(" Expr ")"
;
```
```ebnf
PrefixOp
:"-"
|"!"
;
```
```ebnf
Type
:"&" Types
|"&" "mut" Types
|Types
;
```
```ebnf
Types
:"i32"
|"bool"
|"()"
;
```
```ebnf
NumOrId
:Num
|Id
|Bool
|"&" Id
|"&" "mut" Id
|"*" Id
;
```
```ebnf
Num
:[0-9]+
;
```
```ebnf
Id
:([a-z]|[A-Z]|_)([a-z]|[A-Z]|[0-9]|_)*
;
```
```ebnf
Bool
:"true"
|"false"
;
```
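The layering of Expr, Factor and Term in the grammar above is what gives "*" and "/" higher precedence than "+" and "-". The following is a minimal hand-written sketch of that idea, not the lalrpop grammar from the attached code. The names `Parser` and `eval` are hypothetical, only integer arithmetic is covered, and for brevity the prefix minus is allowed in front of any term.

```rust
use std::iter::Peekable;
use std::str::Chars;

// Hypothetical recursive-descent evaluator mirroring Expr -> Factor -> Term.
struct Parser<'a> {
    chars: Peekable<Chars<'a>>,
}

impl<'a> Parser<'a> {
    fn new(src: &'a str) -> Self {
        Parser { chars: src.chars().peekable() }
    }

    // Returns the next non-whitespace character without consuming it.
    fn peek(&mut self) -> Option<char> {
        while matches!(self.chars.peek(), Some(c) if c.is_whitespace()) {
            self.chars.next();
        }
        self.chars.peek().copied()
    }

    // Expr: Expr ("+" | "-") Factor | Factor
    fn expr(&mut self) -> Option<i32> {
        let mut value = self.factor()?;
        loop {
            match self.peek() {
                Some('+') => {
                    self.chars.next();
                    value = value.checked_add(self.factor()?)?;
                }
                Some('-') => {
                    self.chars.next();
                    value = value.checked_sub(self.factor()?)?;
                }
                _ => return Some(value),
            }
        }
    }

    // Factor: Factor ("*" | "/") Term | Term
    fn factor(&mut self) -> Option<i32> {
        let mut value = self.term()?;
        loop {
            match self.peek() {
                Some('*') => {
                    self.chars.next();
                    value = value.checked_mul(self.term()?)?;
                }
                Some('/') => {
                    self.chars.next();
                    // checked_div yields None on division by zero.
                    value = value.checked_div(self.term()?)?;
                }
                _ => return Some(value),
            }
        }
    }

    // Term: "-" Term | Num | "(" Expr ")"   (simplified, see the note above)
    fn term(&mut self) -> Option<i32> {
        match self.peek()? {
            '-' => {
                self.chars.next();
                self.term()?.checked_neg()
            }
            '(' => {
                self.chars.next();
                let value = self.expr()?;
                if self.peek() == Some(')') {
                    self.chars.next();
                    Some(value)
                } else {
                    None
                }
            }
            c if c.is_ascii_digit() => {
                let mut n: i32 = 0;
                while let Some(d) = self.chars.peek().and_then(|c| c.to_digit(10)) {
                    n = n.checked_mul(10)?.checked_add(d as i32)?;
                    self.chars.next();
                }
                Some(n)
            }
            _ => None,
        }
    }
}

// Evaluates a whole input string; None means a syntax or arithmetic error.
fn eval(src: &str) -> Option<i32> {
    let mut parser = Parser::new(src);
    let value = parser.expr()?;
    if parser.peek().is_none() { Some(value) } else { None }
}

fn main() {
    println!("{:?}", eval("1 - 5 + 3 * 4 / 2"));
}
```

Because `expr` only ever calls `factor`, and `factor` only ever calls `term`, a multiplication is always fully evaluated before the surrounding addition sees it. The loops make the operators left-associative, matching the left-recursive rules in the grammar.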
#### Showcase
The following program will be accepted by the parser.
```rust
fn test1(x: i32, y: i32) -> i32 {
if x < y {
if x > 0 {
- x + (2 * y)
};
x + y
} else {
- x - (2 * y)
}
};
fn test2(x: bool) -> bool {
let mut a: i32 = 3;
let mut b: bool = false;
let c: i32 = test1(a, 10/2);
while a >= 0 {
b = !b;
a = a - 1;
};
b
};
fn main() -> () {
let start: bool = true;
test2(start || false);
}
```
```rust
fn b(x: &mut i32, y: &mut i32) -> () {
*x = *x + 1;
*y = *y + 1;
};
fn main() -> () {
let mut x: i32 = 5;
let mut y: i32 = 5;
b(&mut x, &mut y);
}
```
Example of references accepted by my compiler
#### Illegal examples
Three examples that will not pass the parser and are therefore rejected by the compiler:
```rust
fn a(x: bool, y: bool) {
x || y
}
```
No return type.
```rust
fn a(x) -> bool {
x
};
fn main() -> () {
a();
}
```
No type given for argument in function a.
```rust
fn a(x: i32, y: i32) -> bool {
if x > y {
true
}
else x < y {
false
}
}
```
Expressions are not allowed in an else statement.
#### Coverage
The following bullet points are covered by the parser.
- Two different function definitions (with arguments and return type or without arguments and return type)
- Let and mutable let
- Assignments
- Function calls
- If with or without else
- While
- Expressions (parenthesized expressions, prefix, several operands, precedence)
- Types: bool, i32 and unit
- References, mutable references, dereferenced assignments
All statements have explicit types.
#### Future implementation
In the future, the following things could be implemented to allow the parser to accept more programs.
- Function definition with arguments but no return type or no arguments but return type
- Else if
- Other loops than while
- More types
- Nested functions
- Pretty printing
Currently, functions need to be separated with ";" (except for the last one) for the parser to interpret them as a vector. It would be nice to rewrite this in the future.
## Semantics
Below is the Structural Operational Semantics (SOS) described using small-step semantics for the implemented language.
- σ, Store
- σ' and σ'', Changed store
- n, Integer
- b, Boolean
- e, Expression
- c, Commands
- x, Variable
##### Constant
```math
\frac{}{<n,σ> → n}
```
##### Variable
```math
\frac{}{<x,σ> → σ(x)}
```
##### Arithmetic, boolean & comparison operations
```math
\frac{<e0,σ> → n0 <e1,σ> → n1}{<e0 + e1,σ> → n}, \text{ where } n = n0 + n1
```
SOS rule for addition of two expressions.
Similar rules apply to the following operators:
- "-", subtraction
- "*", multiplication
- "/", division
- ">", greater
- "<", less
- ">=", greater or equal
- "<=", less or equal
- "==", equals
- "!=", not equals
These rules can also be combined in a derivation tree.
```rust
1 - 5 + 3 * 4 / 2;
3 <= 5;
```
##### Unary operations
```math
\frac{<e0,σ> → n0}{<-e0,σ> → n}, \text{ where } n = -n0
```
SOS rule for unary minus applied to one expression.
A similar rule applies to the following operator:
- "!", not
```rust
-(5+1);
!false;
```
##### Assignment
```math
\frac{<e,σ> → n}{<x:=e,σ> → σ[x:=n]}
```
Let statements work in a similar way.
```rust
let mut x: bool = true;
x = false;
```
##### Command sequence
```math
\frac{<c0,σ> → σ'' <c1,σ''> → σ'}{<c0;c1,σ> → σ'}
```
```rust
let x: i32 = 5;
x > 2;
```
##### Command conditional
```math
\frac{<b,σ> → \text{ true } <c0,σ> → σ'}{<\text{ if } b \text{ then } c0 \text{ else } c1,σ> → σ'};
```
```math
\frac{<b,σ> → \text{ false } <c1,σ> → σ'}{<\text{ if } b \text{ then } c0 \text{ else } c1,σ> → σ'}
```
```rust
if x != y {
x = x + 1;
} else {
x = x - 1;
}
```
##### Command while
```math
\frac{<b,σ> → \text{ false }}{<\text{ while } b \text{ do } c,σ> → σ};
```
```math
\frac{<b,σ> → \text{ true } <c,σ> → σ'' <\text{ while } b \text{ do } c,σ''> → σ'}{<\text{ while } b \text{ do } c,σ> → σ'}
```
```rust
while x != y {
x = x - 1;
}
```
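The rules for assignment, command sequence, conditional and while above can be sketched as a small interpreter. This is only an illustration under assumptions: the types `Val`, `Expr`, `Cmd` and `Store` and the helper `countdown` are made up for this example and are not taken from the attached code, and only the operators needed by the while example are included.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Val {
    Int(i32),
    Bool(bool),
}

// Hypothetical AST, reduced to what the example needs.
enum Expr {
    Num(i32),
    Var(String),
    Sub(Box<Expr>, Box<Expr>),
    Ne(Box<Expr>, Box<Expr>),
}

#[allow(dead_code)] // `If` is shown for completeness but not used by `countdown`.
enum Cmd {
    Assign(String, Expr),
    Seq(Box<Cmd>, Box<Cmd>),
    If(Expr, Box<Cmd>, Box<Cmd>),
    While(Expr, Box<Cmd>),
}

type Store = HashMap<String, Val>;

// <e,σ> → n : expressions read the store but never change it.
fn eval(e: &Expr, s: &Store) -> Option<Val> {
    match e {
        Expr::Num(n) => Some(Val::Int(*n)),
        Expr::Var(x) => s.get(x).copied(),
        Expr::Sub(a, b) => match (eval(a, s)?, eval(b, s)?) {
            (Val::Int(x), Val::Int(y)) => x.checked_sub(y).map(Val::Int),
            _ => None,
        },
        Expr::Ne(a, b) => Some(Val::Bool(eval(a, s)? != eval(b, s)?)),
    }
}

// <c,σ> → σ' : commands update the store in place; None signals an error.
fn exec(c: &Cmd, s: &mut Store) -> Option<()> {
    match c {
        Cmd::Assign(x, e) => {
            let v = eval(e, s)?;
            s.insert(x.clone(), v);
            Some(())
        }
        // c1 runs in the store left behind by c0 (σ'' in the rule).
        Cmd::Seq(c0, c1) => {
            exec(c0, s)?;
            exec(c1, s)
        }
        Cmd::If(b, c0, c1) => match eval(b, s)? {
            Val::Bool(true) => exec(c0, s),
            Val::Bool(false) => exec(c1, s),
            Val::Int(_) => None,
        },
        // The recursive while rule is unrolled into a loop.
        Cmd::While(b, body) => loop {
            match eval(b, s)? {
                Val::Bool(true) => exec(body, s)?,
                Val::Bool(false) => return Some(()),
                Val::Int(_) => return None,
            }
        },
    }
}

// Hypothetical helper: runs `x := x0; y := y0; while x != y { x = x - 1 }`.
// Intended for x0 >= y0; otherwise the loop only stops on integer underflow.
fn countdown(x0: i32, y0: i32) -> Option<Store> {
    let var = |n: &str| Box::new(Expr::Var(n.to_string()));
    let body = Cmd::Assign("x".into(), Expr::Sub(var("x"), Box::new(Expr::Num(1))));
    let program = Cmd::Seq(
        Box::new(Cmd::Assign("x".into(), Expr::Num(x0))),
        Box::new(Cmd::Seq(
            Box::new(Cmd::Assign("y".into(), Expr::Num(y0))),
            Box::new(Cmd::While(Expr::Ne(var("x"), var("y")), Box::new(body))),
        )),
    );
    let mut store = Store::new();
    exec(&program, &mut store)?;
    Some(store)
}

fn main() {
    println!("{:?}", countdown(5, 2).map(|s| s["x"]));
}
```

Each match arm corresponds to one rule: the premises are the recursive calls, and the conclusion is the value or store that the arm returns.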
##### Function call
```math
\frac{(<e_1,σ> → n_1 ... <e_i,σ> → n_i) → σ[arg_1 := n_1, ..., arg_i := n_i]}{<c,σ> → (x,σ')}, \text{ where x is the return value}
```
```rust
fn a(x: i32) -> () {
x = x + 1;
}
fn main() -> () {
a(5+2);
}
```
#### Showcase step by step
Following the showcase alongside the SOS rules for the language shows that the program is executed in sequence. First, the value true is assigned to the variable start, followed by a function call to test2(). Before test2() is executed, the argument expression must be evaluated; it yields the value true, as seen in the SOS rules above, and is assigned to the variable x. In the scope of test2(), the variables a and b are assigned, followed by a function call to test1(). In test1(), an if-then-else is introduced; following the SOS above, its condition evaluates to true, and the nested if is evaluated, whose condition is also true. From here a return statement is evaluated, and the value 7 is returned to test2() and assigned to the variable c. Continuing in test2(), a while loop is started, and because the condition is true the block is evaluated, as seen in the SOS for while. This is repeated three more times before the condition becomes false and the state no longer changes. Finally, the value of the variable b is returned to main().
#### Coverage
The following bullet points are covered by the interpreter.
- Expressions (Arithmetic, boolean, comparison)
- Let statements (Let and mutable let)
- Assignments (For mutable variables in the right scope)
- If and else
- While
- Function calls (With return value)
- References, mutable references, dereferenced assignments
The interpreter requires a main() function in order to accept a program.
#### Future improvements
- Shadowing
- Global variables
- Better handling of scopes
Currently the interpreter does not handle scopes completely correctly, but this is not a problem because the type checker takes care of it.
## Type checker
Supported types:
- i32
- bool
- unit
##### Arithmetic, boolean & comparison operations
```math
\frac{<e0,σ> → i32 <e1,σ> → i32}{<e0 + e1,σ> → i32}
```
SOS rule for addition of two i32 types; an expression that does not follow this rule is rejected.
Similar rules apply to the following operators:
- "-", subtraction
- "*", multiplication
- "/", division
```math
\frac{<e0,σ> → i32 <e1,σ> → i32}{<e0 > e1,σ> → bool}
```
SOS rule for the greater-than comparison between two i32 types, returning a boolean.
Similar rules apply to the following operators:
- "<", less
- ">=", greater or equal
- "<=", less or equal
- "==", equals
- "!=", not equals
```math
\frac{<e0,σ> → bool <e1,σ> → bool}{<e0 || e1,σ> → bool}
```
SOS rule for the or operator between two boolean types, returning a boolean.
Similar rules apply to the following operators:
- "&&", and
- "==", equals
- "!=", not equals
```rust
5 * 3;
3 > 2;
(true || false) == true;
1 + false; //Invalid type bool for operand +
true && 5; //Invalid type i32 for operand &&
```
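The typing rules above can be sketched as a function that maps an expression to its type or to an error. This is a minimal illustration and not the type checker from the attached code: the names `Type`, `Op`, `Expr`, `bin` and `type_of` are hypothetical, only a subset of the operators is included, and the error text is made up.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type {
    I32,
    Bool,
}

#[derive(Debug, Clone, Copy)]
enum Op {
    Add,
    Mul,
    Gt,
    Or,
    And,
    Eq,
}

// Hypothetical AST; literal values are irrelevant for typing.
#[allow(dead_code)]
enum Expr {
    Num(i32),
    Bool(bool),
    Bin(Box<Expr>, Op, Box<Expr>),
}

// Small constructor to keep the examples readable.
fn bin(l: Expr, op: Op, r: Expr) -> Expr {
    Expr::Bin(Box::new(l), op, Box::new(r))
}

fn type_of(e: &Expr) -> Result<Type, String> {
    match e {
        Expr::Num(_) => Ok(Type::I32),
        Expr::Bool(_) => Ok(Type::Bool),
        Expr::Bin(l, op, r) => {
            // Premises: both operands must be well typed.
            let (lt, rt) = (type_of(l)?, type_of(r)?);
            match (*op, lt, rt) {
                // i32 op i32 -> i32
                (Op::Add | Op::Mul, Type::I32, Type::I32) => Ok(Type::I32),
                // i32 > i32 -> bool
                (Op::Gt, Type::I32, Type::I32) => Ok(Type::Bool),
                // bool op bool -> bool
                (Op::Or | Op::And, Type::Bool, Type::Bool) => Ok(Type::Bool),
                // "==" accepts two operands of the same type.
                (Op::Eq, a, b) if a == b => Ok(Type::Bool),
                _ => Err(format!(
                    "Invalid types {:?} and {:?} for operator {:?}",
                    lt, rt, op
                )),
            }
        }
    }
}

fn main() {
    // 1 + false is rejected, (true || false) == true is accepted.
    let bad = bin(Expr::Num(1), Op::Add, Expr::Bool(false));
    let good = bin(
        bin(Expr::Bool(true), Op::Or, Expr::Bool(false)),
        Op::Eq,
        Expr::Bool(true),
    );
    println!("{:?}", type_of(&bad));
    println!("{:?}", type_of(&good));
}
```

The structure is the same as for an interpreter, except that the function computes types instead of values, so no store of runtime values is needed.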
##### Unary operations
```math
\frac{<e0,σ> → i32}{<-e0,σ> → i32}
```
SOS rule for unary minus applied to one expression.
A similar rule applies to the following operator:
- "!", not (bool instead of i32)
```rust
-(5+1);
!false;
```
##### Assignment
```math
\frac{<x,σ> → bool <e,σ> → bool}{<x:=e,σ> → ()}
```
Let statements work in a similar way. For an assignment to be valid, the variable needs to be declared as mutable together with a type. An assignment is only accepted by the type checker if the new type matches the current one.
The definition of a function follows a similar SOS rule: the type of its block must match the declared return type.
```rust
fn a() -> () {
let mut x: bool = true;
x = false;
x = 5; // Invalid, x has type bool
}
fn b() -> bool {
1 + 2
} // Invalid, block returns i32 but return type is bool
```
##### Conditional command
```math
\frac{<e,σ> → \text{ bool } <c0,σ> → t}{<\text{ if e then } c0 \text{ else } c1,σ> → t}, \text{ where t = type of last statement }
```
For the if statement to pass the type checker, the expression e needs to be a boolean.
```rust
if 5 > 3 {
x = x + 1;
} // Valid
if 5 + 3 {
x = x - 1;
} // Invalid, expression 5 + 3 does not return a boolean
```
##### While command
```math
\frac{<e,σ> → \text{ bool } <c,σ> → () <\text{ while e do } c,σ''> → ()}{<\text{ while e do } c,σ> → ()}
```
```rust
while 3 > 5 {
x = x - 1;
} // Valid for type checker
while 3 * 5 {
x = x + 1;
} // Invalid, expression is not a boolean
```
The first example above is valid for the type checker because it only checks that the expression is a boolean. The interpreter will not execute the block because the expression is false.
##### Function call
```math
\frac{<e_1,σ> → t_1 ... <e_i,σ> → t_i}{<c,σ> → t_{res}}, \text{ where } t_{res} = \text{return type of function }
```
Each argument expression must evaluate to the same type as the corresponding parameter of the function.
```rust
fn a(x: i32) -> bool {
x < 5;
}
fn b(x: bool, y: i32) -> () {
x = x + 1;
}
fn main() -> () {
a(5+2); // Valid
b(true, 5 < 7); //Invalid for type checker
}
```
#### Coverage
The type checker rejects programs where:
- Expressions contain mismatching types
- Assignments are made with conflicting types (both for expressions and function calls)
- Assignments are made to non-mutable variables
- If and while statements do not have a boolean as an expression
- Function does not return the specified type
- Variables not found in scope
- References, mutable references, dereferenced assignments
Information about the error is printed when it occurs.
#### Future improvements
It would be nice to add the following in the future:
- Multiple error reporting
- Type inference
## Borrow checker
#### Invalid examples
```rust
fn a(x: &mut i32) -> () {
*x += 1;
}
fn main() -> () {
let mut x = 0;
let y = &x;
a(&mut x);
println!("{}", y);
}
```
Variable x is borrowed both as mutable and immutable which is not accepted.
```rust
fn b(x: &mut i32, y: &mut i32) -> () {
*x += 1;
*y += 1;
}
fn main() -> () {
let mut a: i32 = 5;
b(&mut a, &mut a);
}
```
Not a valid program because variable a is borrowed mutably twice, so the references are not unique.
```rust
fn main() -> () {
let mut x = 0;
let y = &mut x;
x += 1;
*y += 1;
}
```
The variable x is changed both directly and through the mutable reference y while the borrow is still in use, which is not accepted.
#### Valid examples
Below follows the rewritten examples that are accepted by the borrow checker.
```rust
fn a(x: &mut i32) -> () {
*x = *x + 1;
};
fn main() -> () {
let mut y: i32 = 5;
a(&mut y);
}
```
```rust
fn b(x: &mut i32, y: &mut i32) -> () {
*x = *x + 1;
*y = *y + 1;
};
fn main() -> () {
let mut x: i32 = 5;
let mut y: i32 = 5;
b(&mut x, &mut y);
}
```
```rust
fn main() -> () {
let mut x: i32 = 0;
let y: &mut i32 = &mut x;
*y = *y + 1;
x = x + 1;
}
```
(Functions should not be separated by ";", but this was the only solution that worked so I decided to move on)
The borrow checker should make sure that:
- Only one mutable reference can exist at a time.
- Multiple immutable references can be used.
- One mutable reference and multiple immutable references cannot exist at the same time.
This ensures that a variable can not be changed at the same time that another reference tries to read it.
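The three rules above can be sketched as a small bookkeeping structure that counts the active borrows per variable. This is a hedged illustration only: `BorrowState` and `BorrowTracker` are hypothetical names that do not exist in the attached code, and the sketch ignores lifetimes entirely (a borrow stays active until `release` is called).

```rust
use std::collections::HashMap;

// Active borrows of one variable.
#[derive(Debug, Default)]
struct BorrowState {
    shared: usize, // number of immutable references
    mutable: bool, // whether a mutable reference exists
}

#[derive(Default)]
struct BorrowTracker {
    vars: HashMap<String, BorrowState>,
}

impl BorrowTracker {
    // `&x`: allowed unless a mutable reference exists.
    fn borrow_shared(&mut self, var: &str) -> Result<(), String> {
        let state = self.vars.entry(var.to_string()).or_default();
        if state.mutable {
            return Err(format!("{} is already borrowed as mutable", var));
        }
        state.shared += 1;
        Ok(())
    }

    // `&mut x`: allowed only if there are no other references at all.
    fn borrow_mut(&mut self, var: &str) -> Result<(), String> {
        let state = self.vars.entry(var.to_string()).or_default();
        if state.mutable || state.shared > 0 {
            return Err(format!("{} is already borrowed", var));
        }
        state.mutable = true;
        Ok(())
    }

    // Ends all borrows of `var`, for example when a scope is left.
    fn release(&mut self, var: &str) {
        self.vars.remove(var);
    }
}

fn main() {
    let mut tracker = BorrowTracker::default();
    // Models `b(&mut a, &mut a)`: the second mutable borrow is rejected.
    println!("{:?}", tracker.borrow_mut("a"));
    println!("{:?}", tracker.borrow_mut("a"));
    tracker.release("a");
    // Models `let y = &x; a(&mut x);`: the mutable borrow is rejected.
    println!("{:?}", tracker.borrow_shared("x"));
    println!("{:?}", tracker.borrow_mut("x"));
}
```

A real borrow checker would also have to decide when each borrow ends, which is the part this sketch leaves out.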
The implementation of the borrow checking is not completed. It is possible to create references and change the value via the reference. The interpreter currently does not reject illegal borrows.
## Conclusion
It has been very interesting to learn how a programming language is built, from the lexical analysis, where the input is split into tokens, to the syntax analysis, where the order of the tokens is checked. I found it very interesting and instructive to build the AST and parser using lalrpop. In the beginning it was tricky to understand what should be done, since I was learning lalrpop and Rust at the same time. This fell into place with time, and after rewriting the AST completely it made a lot more sense.
When building the parser we got to use both regular expressions and context-free grammars, where every production rule consists of a nonterminal symbol that is defined in terms of nonterminals and/or terminals. In this way a language can be built recursively. For example, a nonterminal <Expr> can be defined as x <Operand> y, which is an abstraction of all operators that can form an expression. The parser and lexer generator lalrpop has helped a lot in simplifying this process.
The processes of implementing the type checker and the interpreter were similar. They gave me more insight into the order in which a program is executed and how the compiler makes sure that the types are correct in all the different cases. To execute and type check previously declared variables and functions, environments were used, in which the variables and functions are stored in a VecDeque containing HashMaps from strings to the matching type or value. For each scope, a new HashMap is pushed to the front of the VecDeque. It was good to work on the type checker and interpreter both in code and as a logical inference system to get a deeper understanding.
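The environment described above, a VecDeque of HashMaps with one map per scope, could look roughly like the sketch below. The struct `Env` and its method names are assumptions made for this illustration and are not copied from the attached code. It is generic so that the same structure can hold types (for the type checker) or values (for the interpreter).

```rust
use std::collections::{HashMap, VecDeque};

// Hypothetical scoped environment: the front of the deque is the innermost scope.
struct Env<T> {
    scopes: VecDeque<HashMap<String, T>>,
}

impl<T> Env<T> {
    // Starts with a single outermost scope.
    fn new() -> Self {
        let mut scopes = VecDeque::new();
        scopes.push_front(HashMap::new());
        Env { scopes }
    }

    // Entering a block or function pushes a new map to the front.
    fn push_scope(&mut self) {
        self.scopes.push_front(HashMap::new());
    }

    // Leaving the block drops everything declared inside it.
    fn pop_scope(&mut self) {
        self.scopes.pop_front();
    }

    // Declares (or overwrites) a name in the innermost scope.
    fn declare(&mut self, name: &str, value: T) {
        if let Some(scope) = self.scopes.front_mut() {
            scope.insert(name.to_string(), value);
        }
    }

    // Searches from the innermost scope outwards.
    fn lookup(&self, name: &str) -> Option<&T> {
        self.scopes.iter().find_map(|scope| scope.get(name))
    }
}

fn main() {
    let mut env: Env<i32> = Env::new();
    env.declare("x", 1);
    env.push_scope();
    env.declare("x", 2);
    println!("{:?}", env.lookup("x")); // inner declaration is found first
    env.pop_scope();
    println!("{:?}", env.lookup("x")); // outer declaration is visible again
}
```

Pushing to the front means a plain front-to-back iteration visits the innermost scope first, so a name declared in an inner scope hides an outer one without any extra logic.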
The next step would be to generate machine code from the parser output that has been accepted by the type checker and the interpreter. The code generator Cranelift could be used for this. The code could be optimized in several ways; for example, due to the rules of borrow checking, it is possible to combine all changes made through a mutable reference into one expression, because we know that only one mutable reference is allowed at a time. One way to implement the intermediate representation could be to use the SSA form, where each variable can only be assigned once.
I have not implemented this part; instead I focused on understanding the parser, type checker, interpreter and borrow checking.
It has been difficult to write well-structured code because this is the first time I have worked with Rust. The rules and structure were therefore new to me, which resulted in very messy code. None of the parts works as well as I would like, and there are some examples that will not be accepted. I decided to let it be and move on to learn more instead, so I focused on getting the different parts to work as far as possible to get a broad understanding, rather than putting too much time into one part. It would be really fun to rewrite everything from the beginning now, with the knowledge and insight the course has given me!
\ No newline at end of file