Initial view of the language
Variables
- Variables define the scope, length and name of a symbol in code
- Variables only live inside the bounds of the code block they are defined in
- Compiler-enforced naming conventions: the compiler warns when they are not followed. Snake case, maybe?
- Primitive types u8...u64, f32, f64, bool
- Bitfields!
- usize, isize, architecture specific
- char (these are Unicode scalar values, 32 bits); you can use u8 as an old-school C char
- Strings. I would really want to have UTF-8 supported from the start.
- If you have explicitly specified an address for a variable, it is handled as volatile, since it is likely memory-mapped IO. Such variables are always loaded from and stored to memory on every read and write.
- Global variables are visible only inside their source file, unless they are exported and then imported in another source file.
- I guess we want generic types?
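Rough sketch of the "explicit address means volatile" rule, written in Rust since our language does not exist yet. VolatileU32 and its methods are made-up names for illustration; the only real API used is core::ptr::read_volatile / write_volatile.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Hypothetical stand-in for a variable pinned to an explicit address.
/// Every read and write goes through a volatile access, so the compiler
/// may not cache the value or drop a store.
struct VolatileU32 {
    address: *mut u32,
}

impl VolatileU32 {
    /// Safety: `address` must be valid, aligned and usable as a u32
    /// for as long as this value is used.
    unsafe fn at(address: *mut u32) -> Self {
        VolatileU32 { address }
    }

    /// Always performs a real load from the address.
    fn read(&self) -> u32 {
        unsafe { read_volatile(self.address) }
    }

    /// Always performs a real store to the address.
    fn write(&self, value: u32) {
        unsafe { write_volatile(self.address, value) }
    }
}
```

On real hardware the address would be a memory-mapped register; for trying this out on a desktop, the address of an ordinary local u32 works as a fake register.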
Structs
- Structs always have a known size
- You can put any type into them!
- used to construct complex types
Unions
- Unions are structs whose size is the size of their biggest element. All members point to the same address.
- You can also explicitly give a size for the union! This way you can allocate a block from global memory or the stack.
- in memory, it's like a struct that has a length and data
- The value of each member is just pointed to by the data pointer
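To make the "size of the biggest element, same address" point concrete, here is a small Rust sketch. It only covers the plain union case, not the explicitly sized variant; Value, union_size and low_byte are made-up names.

```rust
use std::mem::size_of;

/// Hypothetical union: all members start at the same address and the
/// size is that of the biggest member (u64 here, so 8 bytes).
#[repr(C)]
union Value {
    byte: u8,
    word: u32,
    wide: u64,
}

/// Size of the union in bytes.
fn union_size() -> usize {
    size_of::<Value>()
}

/// Writes the widest member and reads the narrowest one back.
/// The value is stored in little-endian byte order so that the first
/// byte in memory is always the lowest byte, whatever the host is.
fn low_byte(wide: u64) -> u8 {
    let v = Value { wide: wide.to_le() };
    // Reading a union member is unsafe in Rust: the compiler cannot
    // know which member was written last.
    unsafe { v.byte }
}
```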
Enums
- Special user-defined types with a range known at compile time
- useful for pattern matching
- two different kinds of enums, typed or typeless
- a typeless enum is just an appropriately sized unsigned integer, where each name matches an integer value (called the discriminant).
- a typed enum is, in memory, an array of elements of the given enum type. Each name matches some appropriately sized integer value.
- so typeless enums are just the same, but the array length is zero
- can be used as a type, not just as a value.
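A loose Rust analogy for the two kinds of enums. This is an assumption on my part about how they would feel to use: Rust keeps a discriminant next to a payload, which only approximates the array layout described above. Color, Message and the helper functions are invented for the example.

```rust
/// "Typeless" enum: just a small unsigned integer with named values.
#[repr(u8)]
#[derive(Clone, Copy, PartialEq, Debug)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

/// "Typed" enum: a variant can also carry data of a given type.
#[derive(Debug, PartialEq)]
enum Message {
    Quit,
    Move(i32),
}

/// The integer value (discriminant) behind a name.
fn discriminant_of(c: Color) -> u8 {
    c as u8
}

/// Matching is done on the variant, and the payload is read inside.
fn distance(m: &Message) -> i32 {
    match m {
        Message::Quit => 0,
        Message::Move(d) => *d,
    }
}
```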
Pointers
- Pointers are special, they are more than the address.
- A pointer is a struct that has the following members:
  - type (VariableType enum)
  - address, which has the members:
    - valid (boolean)
    - address (usize)
- Pointers have a type!
- The compiler warns if you use a pointer to reference a type other than the one it was previously assigned. No implicit casting.
- Makes the code easier to follow, no void* business like in C
- The pointer carries the type, and thus the data length, at runtime, so you can do ptr.size
- Arrays are just structs with a pointer to the start of the data and a length in bytes
- This is why we cannot index arrays. No array indexing, since they are just structs.
- But an array uses a pointer, so we get the same result with something like array.get_element(nth=5). We can do that since we know the variable type, so we can get the length.
- The address of any variable can be extracted with & (reads "address of ...")
- As usual, addresses are always usize (so 64 bits on 64-bit machines, etc.)
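The pointer and array layout above could look roughly like this in Rust. Field names follow the list above; the variant names of VariableType, the size() helpers and the exact bounds check in get_element are my assumptions. Nothing is dereferenced here, it is only the address arithmetic.

```rust
/// Hypothetical type tag, standing in for the VariableType enum.
#[derive(Clone, Copy, PartialEq, Debug)]
enum VariableType {
    U8,
    U32,
    U64,
}

impl VariableType {
    /// Size in bytes of one value of this type.
    fn size(self) -> usize {
        match self {
            VariableType::U8 => 1,
            VariableType::U32 => 4,
            VariableType::U64 => 8,
        }
    }
}

/// The address part: a validity flag plus the raw address.
#[derive(Clone, Copy, Debug)]
struct Address {
    valid: bool,
    address: usize,
}

/// A pointer that knows the type it points to.
#[derive(Clone, Copy, Debug)]
struct Pointer {
    ty: VariableType,
    address: Address,
}

impl Pointer {
    /// Equivalent of `ptr.size`: known at run time from the type tag.
    fn size(&self) -> usize {
        self.ty.size()
    }
}

/// An array is a pointer to the first element plus a length in bytes.
struct Array {
    start: Pointer,
    length: usize,
}

impl Array {
    /// Pointer to the nth element, or None if the start pointer is
    /// invalid or the element would fall outside `length` bytes.
    fn get_element(&self, nth: usize) -> Option<Pointer> {
        let offset = nth.checked_mul(self.start.size())?;
        let end = offset.checked_add(self.start.size())?;
        if !self.start.address.valid || end > self.length {
            return None;
        }
        Some(Pointer {
            ty: self.start.ty,
            address: Address {
                valid: true,
                address: self.start.address.address + offset,
            },
        })
    }
}
```

Because the element size comes from the type tag, get_element can do the bounds check itself, which is the part plain C pointers cannot do.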
Constants
- Constants are read-only variables that can only be defined in global scope, meaning the scope of the source file, unless exported and imported in another source file.
Registers
- The user needs to be able to clear registers on CPU startup.
- Registers are a special kind of variable that allows things like setting a stack pointer.
- The user can use registers as they wish, but for the compiler to know not to use the same registers, they need to be reserved and abandoned.
- during reserve, the initial value will be stored somewhere.
- during abandon, the stored value will be read back into the register.
- these also provide a way to do constant-time operations.
- these can only be used with assembly instructions, so arithmetic, load and store, branching/jumping and environment calls.
- first thought about syntax:
let i1:i32 = 1;
let i2:i32 = 2;
let output:i32;
let x1 = reserve(r10); // each handle reserves its own register
let x2 = reserve(r11);
let x3 = reserve(r12);
x1.load(i1);
x2.load(i2);
x3 = x1 + x2; // basic instructions would map to basic operators.
x3.store(output);
x1.abandon();
x2.abandon();
x3.abandon();
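A toy model of the reserve/abandon bookkeeping in Rust, assuming a simulated register file instead of real hardware. RegisterFile, Reserved and all method names are invented; a real compiler would emit save and restore instructions rather than keep arrays around.

```rust
/// Simulated register file. `reserved` tracks which registers the
/// user has claimed so nothing else may touch them.
struct RegisterFile {
    values: [u64; 32],
    reserved: [bool; 32],
}

/// Handle returned by `reserve`; remembers the value to restore.
struct Reserved {
    index: usize,
    saved: u64,
}

impl RegisterFile {
    fn new() -> Self {
        RegisterFile { values: [0; 32], reserved: [false; 32] }
    }

    /// Reserve a register. Fails if the index is out of range or the
    /// register is already reserved; otherwise stores the current
    /// value so `abandon` can put it back.
    fn reserve(&mut self, index: usize) -> Option<Reserved> {
        if index >= self.values.len() || self.reserved[index] {
            return None;
        }
        self.reserved[index] = true;
        Some(Reserved { index, saved: self.values[index] })
    }

    /// Load a value into a reserved register.
    fn load(&mut self, reg: &Reserved, value: u64) {
        self.values[reg.index] = value;
    }

    /// Read the current value of a reserved register.
    fn read(&self, reg: &Reserved) -> u64 {
        self.values[reg.index]
    }

    /// Abandon a register: restore the saved value and release it.
    fn abandon(&mut self, reg: Reserved) {
        self.values[reg.index] = reg.saved;
        self.reserved[reg.index] = false;
    }
}
```

Taking the handle by value in abandon means a register cannot be used after it has been given back, and reserving the same register twice is rejected.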
Sections
- Any symbol, meaning a global variable, constant or function, can be explicitly placed in any section.
- By default, these implicitly go to the .bss, .data or .text sections.
- I guess we follow the Executable and Linkable Format (ELF)
Arithmetics
- Debug builds should have run-time assertion checking for integer overflow and division by zero. These would just be warnings anyway.
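For reference, this is how the two checks look with Rust's standard checked arithmetic, where the failure is returned as None instead of raised. The wrapper function names are made up; checked_add and checked_div are real methods on Rust integers.

```rust
/// Adds two integers, reporting overflow instead of wrapping silently.
fn checked_sum(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Divides two integers, reporting division by zero. The one
/// overflowing case, i32::MIN / -1, is reported as None as well.
fn checked_quotient(a: i32, b: i32) -> Option<i32> {
    a.checked_div(b)
}
```

A debug build could turn a None here into a warning and carry on, which matches the "just warnings" idea above.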
Code blocks
- it is necessary to be able to split code into blocks to define the scope of variables.
- proposed syntax: { let x; { let y; } // y no longer lives here } // x no longer lives here
Functions
- if you have not called a function, it's optimized out of the build.
- if you have explicitly given an address to a function, it will not be optimized out. It's likely an interrupt function.
- functions with &self as the first argument can be chained like variable.function1().function2(). Any variable with the same type as the self argument can use these functions.
- functions should use as many registers for their inputs as the hardware allows. This should all be implicit.
- nameless functions. These are just regular functions that the compiler has to give a name to. They are nice to pass as function arguments.
- closures. These are a bit weird but people use them quite a lot.
- functions always go to global memory, whether they are named, nameless or closures.
- calling a function with argument names should be possible, like this: array.get_element(nth=5, size=sizeof(u32)).
- Not checking a return value should result in a warning
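Three of the points above (chaining through &self, passing nameless functions or closures as arguments, and warning on ignored return values) sketched in Rust. Counter and apply are invented names; #[must_use] is the real Rust attribute that produces the warning. Named arguments are not shown, since Rust has no equivalent.

```rust
/// Hypothetical value type used to show chaining.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Counter(u32);

impl Counter {
    /// Takes `&self` and returns a new value, so calls can be chained.
    #[must_use] // ignoring the result triggers a compiler warning
    fn add(&self, n: u32) -> Counter {
        Counter(self.0 + n)
    }

    /// Second chainable function on the same type.
    #[must_use]
    fn double(&self) -> Counter {
        Counter(self.0 * 2)
    }
}

/// Accepts any function or closure from u32 to u32 as an argument.
fn apply(f: impl Fn(u32) -> u32, value: u32) -> u32 {
    f(value)
}
```

Usage would be along the lines of Counter(1).add(2).double(), or apply(|x| x + offset, 5) with offset captured from the surrounding scope, which is what makes it a closure rather than a plain nameless function.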
if's and else's
- these are just conditional jumps in disguise.
Loops
- no strong preferences here.
Iterators and ranges
- Maybe
Pattern matching
- Any primitive or enum type can be matched. Structs, and therefore also arrays and unions, cannot be matched.
- Matching values need to be defined at compile time
- The whole range of the variable needs to be covered
- a default can be set if preferred.
- just creates a jump table (an array of function pointers) with one entry per possible value of the type, initializes every entry to the default function pointer, and sets the specified cases at the right indexes.
- I would not suggest trying to match u64s, since you would run out of memory: you end up with an array of 2^64 function pointers, each usize bytes.
- enums are preferred for these. This also makes the code more readable.
- enums can only be matched by variant, not by value.
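The jump-table idea for a u8 match, sketched in Rust: 256 slots, all set to the default handler first, then the listed cases overwrite their own index. Handler, the on_* functions and the two chosen cases (0 and 255) are made up for the example.

```rust
/// One table entry: a pointer to the code for that case.
type Handler = fn() -> &'static str;

fn on_default() -> &'static str { "default" }
fn on_zero() -> &'static str { "zero" }
fn on_max() -> &'static str { "max" }

/// Builds the table for a u8 match: one slot per possible value,
/// every slot starts as the default handler, then the listed cases
/// are written to their own indexes.
fn build_table() -> [Handler; 256] {
    let mut table: [Handler; 256] = [on_default as Handler; 256];
    table[0] = on_zero;
    table[255] = on_max;
    table
}

/// Matching is a single indexed call. A u8 can never be out of range
/// for a 256-entry table, so the whole range is covered.
fn dispatch(table: &[Handler; 256], value: u8) -> &'static str {
    table[value as usize]()
}
```

The table grows with the number of possible values of the matched type, which is why this is fine for u8 or a small enum and hopeless for u64.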
Error handling
Errors should be values. No try/catch here.
I like Rust-like option / error types
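For comparison, this is what errors-as-values looks like with Rust's Result and Option. ParseError, parse_digit and first_even are invented for the example; Result, Option, ok_or and the ? operator are standard Rust.

```rust
/// The ways parsing can fail, as an ordinary enum.
#[derive(Debug, PartialEq)]
enum ParseError {
    Empty,
    NotADigit(char),
}

/// Parses the first character as a decimal digit. Failure is an
/// ordinary return value that the caller has to look at.
fn parse_digit(text: &str) -> Result<u32, ParseError> {
    // `?` returns early with the error if there is no first character.
    let c = text.chars().next().ok_or(ParseError::Empty)?;
    c.to_digit(10).ok_or(ParseError::NotADigit(c))
}

/// Option covers the "no value" case without an error payload.
fn first_even(values: &[u32]) -> Option<u32> {
    values.iter().copied().find(|v| v % 2 == 0)
}
```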
Memory management
- This is not exactly a language problem. We can do whatever we want.
- If someone wants a garbage collector, they can implement an allocator with garbage collection.
- We mostly want stack allocators for small things and arena allocators for big things. We should implement what most people would be using.
- C++ Smart pointers are great! We should have some mechanism for calling destructors when leaving scope.
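One existing mechanism for "call the destructor when leaving scope" is Rust's Drop trait, sketched below with nested blocks. Guard and destructor_order are made-up names, and the log only exists to make the order visible; a real destructor would free memory or release a resource instead.

```rust
use std::cell::RefCell;

/// Records its name in a shared log when it goes out of scope,
/// standing in for a destructor that frees a resource.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl<'a> Drop for Guard<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Runs two nested scopes and returns the order in which the
/// destructors ran.
fn destructor_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _x = Guard { name: "x", log: &log };
        {
            let _y = Guard { name: "y", log: &log };
        } // _y no longer lives here, its destructor runs
    } // _x no longer lives here, its destructor runs
    log.into_inner()
}
```

The inner variable is destroyed first, so the returned order is y and then x.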