# pan lang ## hello world c-like ``` var# std = @import( "std" ); // int main( int argc, char **argv ); pub var# c_main = fn | argc :isize, argv :[*:nil]?[*:0]u8 | :isize { std.print( "hello world\n" ) iferr |err| { return -1; }; // error // note: `std.print` will likely change return 0; // success }; ``` idiomatic pan ``` var# std = @import( "std" ); pub var# main = fn | args :[][]u8 | :!void { std.print( "hello world\n" ) tried; // note: `std.print` will likely change }; ``` ## toc - pan lang - hello world - toc - origins - objectives - ecosystem - formatter - compiler - qbe - wasm - features - errors - tried - allocators - compile-time - generics - interfaces - non-features - types - integers - bits - floats - records - strings - pointers TODO finish /update.. - hello-world-example - foundations-and-priorities - reference ( types ( strings ), operators, control-flow ) - idioms ( dos-and-donts ) - advanced ( compile-time, generics, interfaces, etc ) note: *literals* and *syntax* are appropriate subsections ## origins major: c, zig minor: rust, java ## objectives systems programming language consistent, elegant no hidden control flow prioritize: 1. correctness 2. speed, optimality 3. ease of use 4. terseness correct > fast/optimal > easy > concise prefer identifying bugs at: 1. compile-time 2. runtime 3. never identifying bugs compile-time > runtime > unrecognized explicit > implicit ambitions: **generics** compile-time type allocators compile-time interfaces; readers writers ~async~ .. TODO worry about mvp first resources heap pointers as something to be freed new heap pointers as something to be assigned/written before being read ### ecosystem ( *vaporware*, nothing exists yet; these are ordered priorities. ) 1. compiler 2. language server 3. formatter 4. std lib 5. package manager.. compiler prioritizing simplicity, language server addressing feedback loop time compiler includes testing ambition: c-compatible #### formatter options / levels ( under consideration ) - insert and remove newlines ( consistent output ) - align indentation ( tabs, maintain spaces ) - align rows ( spaces ) ### compiler targets - x86 ( 64 ) - arm ( 64 ) - wasm ( 32 & 64 ) - riscv ( 64 ) modes + debug + safe - fast - small #### qbe https://c9x.me/compile/ https://c9x.me/compile/doc/il.html https://c9x.me/git/qbe.git #### wasm https://github.com/WebAssembly/spec instructions https://webassembly.github.io/spec/core/syntax/instructions.html https://webassembly.github.io/spec/core/exec/instructions.html https://webassembly.github.io/spec/core/binary/instructions.html https://webassembly.github.io/spec/core/text/instructions.html https://webassembly.github.io/spec/core/appendix/index-instructions.html ## features manual memory management, allocators compile-time execution sentinel types control flow expressions consistency TODO generics TODO linear or affine types, resource tracking objective: prevent resource leaks, prevent double free non-objective: prevent use-after-free https://en.wikipedia.org/wiki/Substructural_type_system ### errors minimal effectively an enum, but tracked and grouped by the compiler. similar to sqlite3 errors, but recognized as enums with language conveniences. TODO consider how /if /how-much to associate data with errors.. #### tried TODO find a better word/phrase/char-sequence.. containing function *must* be fallible, return `!T` does not apply to optionals ``` var x = fallible() tried; ``` is equivalent to ``` var x = fallible() iferr |err| { return err; }; ``` and ``` var x = if( fallible() ) |val| { val } else |err| { return err; }; ``` ### allocators single: `create( type )` & `destroy( ptr )` multi: `alloc( type, len )` & `free( slice )` `realloc( slice, len )` - allocate more, either in-place, or copying to the new allocation and freeing the old ? `calloc( type, len )` - clear memory possibly with a `self` param for each method.. ### compile-time ### generics `<||>` is used in definitions, when the generics are introduced and *used*. `<()>` is used in invocations, when the generics are assigned and *provided*, monomorphized. note: generics may be inferred. ``` var# swap = fn <| T :type |> | a :*mut T, b :*mut T | :void { var tmp = a.*; a.* = b.*; b.* = tmp; }; var mut x :usize = 24; var mut y :usize = 42; swap<( .T = usize )>( x.&, y.& ); ``` see also `index_of` ``` var# Node = record <| T :type |> |Self| { .data :T, // must be set when initialized .next :?*Self = nil, // methods .#push = fn | self :*mut ?*Self, all :Allocator, val :T | :!void { var node = all.create( Self ) tried; node.* = .( .data = val, .next = self.*, ); self.* = node; }, .#pop = fn | self :*mut ?*Self, all :Allocator | :?T { var head :*Self = self.* ifnil { return nil; }; self.* = head.next.*; var data = head.data.*; all.destroy( head ); return data; }, }; @test( "node push-pop", fn | test :std.Test | :!void { var mut root :?*Node<( .T = usize )> = nil; for( 0..5 ) |i| { root.&.push( test.allocator, i ) tried; }; for( 0..5 ) |j| { test.expect_equal( root.&.pop( test.allocator ) ifnil { return error.nil; }, 5 - j - 1, ) tried; }; }; ``` integers will be defined generically `@integer< .sign :bool, .bits :u16 >` `#{ assert( u8 == @integer<( .signed = false, .bits = 8 )> ); }` ### interfaces compile-time + generics ``` var# Collection = record <| .Col :type, .Ele :type, .fn_table :record { .size :fn( self :*Col ) :usize, .add :fn( self :*mut Col, allocator :Allocator, ele :Ele ) :!void, .iterator :fn< It :type >( self :*Col ) :!It, .next :fn( it :It ) :?*Ele, }, |> { // runtime state .self :Col, // methods .#size = fn| col :*Col | :usize { fn_table.size( col ) }, .#add = fn| self :*mut Col, allocator :Allocator, ele :Ele ) :!void { fn_table.add( self, allocator, ele ) tried }, .#iterator = fn| self :*Col | :!It { fn_table.iterator( self ) tried }, // TODO should `next` be used here ? }; ``` TODO Allocator, Reader, Writer ### non-features shadowing identifiers ; never macros ; never inheritance; at least for mvp .. inheritance is not implicitly bad, but oop shoehorns it into polymorphism, which it is *bad* at. globals; i hate globals. could be convinced they're necessary, but need a compelling case, need to make me want them.. overloading functions ; never overloading operators ; never overloading ; never ## types integers, floats, bool, void, noreturn/never optionals, sentinels, enums, records, classes/opaque, variant /tagged-union /sum-type /rust-enum, flags(?) pointers: single, multi, any, ( mutability, readability ) functions records basic user-defined composed type either named (struct) or indexed (tuple) fields packed variant for bit-fields transparent, structural equivalence; memory layout: fields are sorted first by alignment, then by name or index; so a record instance can be assigned to a variable of a structually equivalent record without overhead. TODO example memory layout of record with fields sorted. variants /tagged-unions /sum-types packed-records /bit-fields fields of these types cannot have their address taken ### integers two's complement, modern hardware.. usize, isize, u0, u1, s0, s1, u8, s8, u16, u32, u64, u128 ``` var ascii :u7 = 'a'; var unicode :u21 = 'λ'; // TODO decide if unicode glyphs are valid character literals.. // var utf8 :u36; // if header_byte is allowed 0b1111_1110 // var utf8 :u42; // if header_byte is allowed 0b1111_1111 ``` #### literals basic ``` var zero :u0 = 0 ; var one :u1 = 1 ; var two :u8 = 2 ; var three :usize = 3 ; var ten = 10 ; ``` signed ``` var negative :i1 = -1; var positive :u1 = +1; ``` radixes ``` // default radix is decimal assert( 255 == 0xff ); // hexadecimal assert( 255 == 0o377 ); // octal assert( 255 == 0b11111111 ); // binary ``` note: non-zero decimals *may* begin with zero '0'. underscore ``` var one_million = 1_000_000; var dead_beef = 0xdead_beef; var zig_undefined = 0b1010_1010; ``` ##### bits "shrink-wrapped": 5 is 0b101 so the literal `05` is `u3` #### ranges | min | max | ---|-----|-----| u0 | 0 | 0 | u1 | 0 | 1 | u2 | 0 | 3 | u3 | 0 | 7 | u4 | 0 | 15 | u5 | 0 | 31 | u6 | 0 | 63 | u7 | 0 | 127 | u8 | 0 | 255 | ---|-----|-----| i0 | 0 | 0 | i1 | -1 | 0 | i2 | -2 | 1 | i3 | -4 | 3 | i4 | -8 | 7 | i5 | -16 | 15 | i6 | -32 | 31 | i7 | -64 | 63 | i8 |-128 | 127 | ---|-----|-----| ### floats f32 "float", f64 "double" fractional ``` var three_halves :f32 = 1.5; assert( three_halves == 0x1.8 ); assert( three_halves == 0o1.4 ); assert( three_halves == 0b1.1 ); ``` signed ``` var negative_half = -0.5; ``` exponents ``` // TODO floats with exponents.. 'e' -vs- 'p' ``` ### records *named* == struct *indexed* == tuple `packed` `extern` or `clang` - c memory layout, fields in declaration order.. #### literals TODO examples of named and indexed ``` var# Point = record { .x :u16, .y :u16 }; var point = Point( .x = 3, .y = 4 ); var# String_Key__Integer_Value = record { .:[]u8, .:u32 }; var tuple = String_Key__Integer_Value( "hello", 1 ); var tuple :String_Key__Integer_Value = .( "world", 2 ); // this would normally be preferred named-style as //var# Dictionary_Entry = record { .key :[]u8, .val :u32 }; //var entry = Dictionary_Entry( .key = "name", .val = 42 ); //var entry :Dictionary_Entry = .( .key = "name", .val = 42 ); ``` tuples can be convenient, but are usually poor choices for apis; exception that proves the rule: `print()`.. `print`'s second parameter is a tuple. ### strings `[]u8` no utf-8 affordances, use a library zig uses `[:0]const u8` .. pan defaults to const/immutable .. pan still considering maintaining slice suffix values like `[:0]`.. string literals will include a suffix zero-byte; the question remains whether the type system should track such slice-suffixes.. ### pointers no referencing or dereferencing occur automatically `.&` is required for referencing, taking the address of a value. `.*` is required for de-referencing, following a pointer. note: `for` loops over slices provide single-pointers to each element note: field accesses only work on "local" types ( non-pointers ) and single-dereference pointers ( `*T` ), not on multiply dereferenced types ( `**T` ). ``` var# AType = record { .field :usize }; var a :AType = .( .field = 42 ); var b :*AType = a.&; var c :**AType = b.&; assert( a.field == b.field.* ); assert( a.field == b.*.field ); assert( b == c.* ); // XXX assert( a.field == c.field.*.* ); // invalid // `c.field` does not compute, a memory dereference is needed assert( a.field == c.*.field.* ); assert( a.field == c.*.*.field ); ``` ### slices slices are conceptually `record <| T :type |> { .ptr :*T, .len :usize }` however, there are also details of mutability, and possibly readability. some pointers can be used to read, but not write - those are "mutable". some pointers refer to undefined memory - those a not "readable". one idea is to treat such un-readable pointers as resources that are "consumed"/"freed" when the pointer is assigned to. the assignment/write operation would need to return a value so that a new variable could be defined with the new type "writable, readable, pointer-to-T".. for heap pointers, this "readable resource attribute" would be a second, after the first "allocation resource attribute" which requires freeing via allocator. ## literals ### keyword literals ``` var b0 :bool = false; var b1 :bool = true; var opt :?u8 = nil; ``` ### void ``` var x :void = void(); ``` zero-size-type .. not useful itself. instantiating `void` may be useful with generics. ### optionals `nil` value only allowed in optional types ``` var optional :?usize = nil; if( optional ) |val| { fx( val ); } else { handle_nil_value(); }; var default_value = 42; var non_optional :usize = optional ifnil { default_value }; ``` note: `??usize` is valid multiple layers of optional *is* valid, but it does make some control flow strange ``` var nxx :??usize = nil; // which "layer" ? ..the first outermost i think if( nxx ) |nx| { // even though `nx` was just "unwrapped", it can be `nil`.. if( nx ) |n| { print( "the number is (d)\n", .( n ) ); }; }; var oz :?usize = nil; var myy :??usize = oz; if( myy ) |my| { if( my ) |m| { print( "the number is (d)\n", .( m ) ); } else { print( "also, nil\n", .() ); print( "what does this mean semantically?\n", .() ); }; } else { print( "nil\n", .() ); }; var q :?usize = 13; var p :?usize = myy ifnil { q }; // `ifnil` "unwraps" only one "layer" ``` #### sentinel types implicit to pointers applicable to integers and enums ``` var# Id = @sentinel( u32, 0 ); // the `Id` type is 32 bits, but cannot hold the semantic value zero ( `0` ). // for the `Id` type, the bit-pattern of zero represents `nil`. var my_id :Id = @cast( ?Id, 1 ) ifnil { @panic( "id-one-nil" ); }; var# ceo_id :Id = 42; // ok; compile-time verifies integer value is not reserved // XXX var invalid_id :Id = 0; // XXX invalid // XXX assert( @cast( ?Id, 1 ) == 1 ); assert( if( @cast( ?Id, 1 ) ) |val| { val == 1 } else { false } ); assert( @cast( ?Id, 0 ) == nil ); // XXX assert( @cast( ?Id, 0 ) <> 0 ); assert( if( @cast( ?Id, 0 ) ) |val| { false } else { true } ); ``` example ``` var# max_usize = #{ std.max_val( usize ) }; var# Index = @sentinel( usize, max_usize ); var# index_of = fn <| T :type |> | haystack :[]T, needle :T | :?Index { assert( haystack.len < max_usize ); return for( haystack, 0.. ) |hay, i| { if( hay == needle ) { break @cast( ?Index, i ) ifnil { @panic( "unreachable" ); }; }; } end { nil }; }; var idx = index_of( "hello world", 'w' ) ifnil { @panic( "unreachable" ); }; assert( idx == 6 ); ``` useful to avoid padding due to boolean optional bit. ( integers and enums with a reserved value for nil ) ## variables var x; // immutable, type inferred var y :usize; // immutable, usize var mut z :bool = false; // mutable, boolean, initialized var# Point :type = record { .x :u16, .y :u16 }; // compile-time (`#`), type ## builtins @import @test // TODO specific syntax @sentinel @panic @cast TODO should these include parenthesis or no ?.. ## control flow if for while fn and or tried ( `try` as a postfix/suffix operator ) switch iferr ifnil defer errdefer okdefer TODO examples ### defer, errdefer, okdefer `defer` good for cleaning up resources particularly when multiple block exits exist. `errdefer` specifically useful for releasing resources when a function errors before it can return those resources. database transaction - `errdefer { transaction.rollback(); };` - `okdefer { commit(); };` while-`update` block is very similar to `okdefer` except for on `break`.. - on `break`, `okdefer` does run; - on `break`, `update` does not run. `okdefer` useful for `flush`ing buffers.. ### and, or binary operators, boolean only short-circuiting, hence *control flow* ### catch ``` var x = fallible() iferr |err| { switch( err ) case( .error_a ) { def_val_a } default { return err; } }; ``` is equivalent to ``` var x = fallible() catch case( .error_a ) { def_val_a } default { return err; } ; ``` ``` var socket = while( true ) { break connect( addr ) catch case( .Timed_Out ) { continue; } // retry default |err| { return err; } ; } end { @panic( "unreachable" ); }; // XXX would this need an `end` block ? // `while( true )` will never normally end, // but that requires evaluating logic.. hmm ``` ### symbols *general* associations `.` is for fields or inference `()` are for "consuming" expressions `||` is for control-flow-introduced or "unwrapped" values `{}` are for logic and control flow `''` are for characters `""` are for strings `!` is for errors `?` is for optionals `@` is for builtins `#` is for compile-time `$` will be for resources // TODO `[]` are for slices, arrays, indexing, etc `*` is for multiplication and de-referencing /pointers `&` is for addressing `=` is for equality or assignment `+`, `-`, `%` are addition, subtraction, modulo respectively ### whitespace "tabs for scope, spaces for alignment" ### comments ``` // this is a comment // there is only one syntax for comments ``` ### blocks blocks are expressions ``` var two = { 1 + 1 }; var fourteen = 2 * { 3 + 4 }; // var x = 2 * ( 3 + 4 ); // XXX invalid ``` parenthesis are not allowed around arbitrary expressions; use braces. blocks can contain statements before the block-expression ``` var fourteen = { var a = 3 + 4; print( "a: ()\n", .( a ) ); 2 * a }; ``` ### conditionals braces are required for all control flow bodies. most control flow mechanisms are expressions; exceptions include `defer`. note: blocks without a tail-expression (normally) evaluate to void, a valid expression. ( exceptions include when the block is type `never`. ) ``` if( true ) { print( "hello world\n", .() ); }; // `else` not required if( true ) { print( "hello world\n", .() ); } else { @panic( "unreachable" ); }; // if true { // XXX invalid, no parenthesis // std.io.print( "hello world\n" ); // } else { // @panic( "unreachable" ); // }; // if( true ) // XXX invalid, no braces // std.io.print( "hello world\n" ); // else // @panic( "unreachable" ); var a = 11; var b = 13; var min = if( a < b ) { a } else { b }; ``` unwrapping optionals ``` var r = rand( 0.0, 1.0 ); var my_optional :?f32 = if( r < 0.5 ) { r } else { nil }; if( my_optional ) |val| { print( "r is ()\n", .( val ) ); } else { print( "r is nil\n", .() ); }; ``` unwrapping errors ``` if( fopen( "myfile", .rw ) ) |handle| { defer { fclose( handle ); }; read_and_write_file( handle ) tried; } else |e| { print( "opening file failed: (s)\n", .( @errorName( e ) ) ); }; ``` ### loops `use` blocks for declaring loop-control variables without indenting the loop body again. use-declarations are available within the loop body and end block, but not after. `break`, `continue` note: `break` can take a value, and combos with an `end` block to provide a loop expression. inifinite loops, `never` block type #### `for` simple and safe for-loops; iterate over slices and ranges. cannot infinite loop; the iteration count of for loops is known before the loop is entered. examples ``` for( 0..10 ) |i| { print( "i: (d)\n", .( i ) ); }; for( 0..10 ) |j| { if( j % 2 == 0 ) { continue; }; print( "odd: (d)\n", .( j ) ); }; var mut numbers = [.]u8( 1, 2, 3 ); for( numbers ) |n_ptr :*mut u8| { n_ptr.* += 1; }; var# rot13 = fn | char :u8 | :u8 { var alpha = "abcdefghijklmnopqrstuvwxyz"; // return if( 'a' <= char <= 'z' ) { // alpha[ { char - 'a' + 13 } % 26 ] // } elseif( 'A' <= char <= 'Z' ) { // alpha[ { char - 'A' + 13 } % 26 ] - 'a' + 'A' // } else { // char // }; return switch( char ) // ranges are [inclusive,exclusive) case( 'a'..'z'+1 ) { alpha[ { char - 'a' + 13 } % 26 ] } case( 'A'..'Z'+1 ) { alpha[ { char - 'A' + 13 } % 26 ] - 'a' + 'A' } default { char } ; }; var text = #{ var str = "one fun clerk; what she ebbs, she errs."; var mut array :[ str.len ]u8 = undefined; // XXX for( str, array[..] ) |cptr, el| { el.* = rot13( cptr.* ); }; array }; ``` #### `while` venerable while-loops; loop with arbitrary boolean conditions, or unwrapping optionals like `iterator.next()`. ``` { var mut i :usize = 0; while({ i < 10 }) { update { i += 1; } print( "i: (d)\n", .( i ) ); }; }; use { var mut j :usize = 10; } while({ 0 < j }) update { j -= 1; } { print( "(d).. ", .( j ) ); }; print( "boom!\n", .() ); use { var it = collection.iterator(); } while({ it.next() }) |val| { print( "val: ()\n", .( val ) ); }; use { var file = fopen( "some.csv", .readonly ) tried; defer { file.close(); }; } while({ read_line( file ) }) |line| { use_line( line ) tried; }; ``` ### `switch` switch is an expression `goto` can jump to another switch-case; can make a `switch` loop.. `default` is required iff cases are not exhaustive. ``` var# Color = enum { .red, .blue, .green, .nil }; var favorite_color = Color.blue; var number = switch( favorite_color ) case( .red ) { print( "bright and sharp\n" , .() ); 1 } case( .blue ) { print( "calm and cool\n" , .() ); 2 } case( .green ) { print( "quiet and serene\n" , .() ); 3 } ; assert( number == 2 ); var# SomeType = variant |Self| { .boolean :void, .integer :record { .sign :bool, .bits :u16 }, .optional :*Self, // XXX this would be terrible for ownership tracking // optionals should also have sentinel info .pointer :record { .child :*Self, .mutable :bool, .readable :bool }, // TODO more types; slices, enums, variants, etc. }; var some_bool = SomeType.boolean(); var some_pointer :SomeType = .( .pointer = some_bool.&, .mutable = false, .readable = true ); switch( some_pointer ) case( .boolean ) { print( "bool" ); } case( .integer ) |int_data| { print( "(c)(d)", .( if( int_data.sign ) { 's' } else { 'u' }, int_data.bits, ) ); } case( .optional ) |opt_data| { print( "?" ); goto opt_data.*; // `never` case } case( .pointer ) |ptr_data| { print( "*" ); if( ptr_data.mutable ) { print( "mut " ); }; if( !ptr_data.readable ) { print( "und" ); }; } default { return error.unimplemented; } ; ``` ## idioms ### syntax trailing commas allowed, encouraged on multiline lists leading periods imply type inference control flow expressions are *generally* of the form: ``` ( ,* ) | ,* |? { } ``` ## std lib ### `print` `print`'s first parameter is a compile-time "string", `print`'s second parameter's type is computed at compile-time based on the *value* of the first parameter. normally, comptime parameters are passed as generics; TODO consider if/how-to pass compile-time format-strings to `print`.. ``` // hypothetical syntax.. print( "my_var is ()\n", .( my_var ) ); print<( "my_var is ()\n" )>( .( my_var ) ); print<( .fmt = "my_var is ()\n" )>( .( my_var ) ); ``` TODO possibly reference `print` alongside or before *interfaces* self-note: stdout is a normal file descriptor, writing to it may fail; however, *debugging* can allow the debug mode to cover such errors by internally `stdout.write(...) iferr { @panic( "debug" ); };` and relying on the compiler mechanisms around `@panic` in *debug* mode.