#pldev — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #pldev, aggregated by home.social.
-
Rearranged the register assignments in my softrisc32 ISA to match that of RV32I because there's no point in maintaining a variant register map just because I find the RV32I map "untidy" (due to them arranging stuff to make sense when the top half are missing in RV32E).
This has the side-effect of making (textual) sr32 assembly even closer to rv32i assembly.
About to shift from passing parameters on the stack to passing parameters in registers.
-
Starting to allocate some registers. I need more test cases with greater register pressure. Most of them fit within 4 working registers just fine.
I did update my live range graph in the IR dump to use dashed lines for spilled registers.
Here's one that spills at 4 and spills a bit more at 3.
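For readers who want to play with register pressure themselves, here is a minimal sketch of a basic linear-scan allocator. It assumes live ranges are plain (start, end) intervals over instruction indices; the function name `linear_scan` and the interval format are hypothetical and not taken from the compiler discussed in this post.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to live intervals; return (assignment, spilled).

    intervals: dict of name -> (start, end), inclusive instruction indices.
    num_regs must be at least 1. When no register is free, the candidate
    whose interval ends furthest away is spilled.
    """
    active = []                    # (end, name) pairs currently holding a register
    free = list(range(num_regs))   # register numbers not in use
    assignment = {}                # name -> register number
    spilled = set()
    ordered = sorted(intervals.items(), key=lambda kv: kv[1][0])
    for name, (start, end) in ordered:
        # Release registers whose intervals ended before this one starts.
        for old_end, old_name in list(active):
            if old_end < start:
                active.remove((old_end, old_name))
                free.append(assignment[old_name])
        if free:
            assignment[name] = free.pop()
            active.append((end, name))
            continue
        # No free register: spill whichever interval ends last.
        victim_end, victim = max(active)
        if victim_end > end:
            assignment[name] = assignment.pop(victim)
            active.remove((victim_end, victim))
            active.append((end, name))
            spilled.add(victim)
        else:
            spilled.add(name)
    return assignment, spilled
```

Running the same set of intervals with a shrinking `num_regs` is a quick way to find the point where spills start to appear.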
-
Housekeeping to allow the -out path (for final compilation) and the -xir path (for eXecutable IR, useful for validation) to coexist in a single compiler invocation. Also some tidying up of argument wrangling, improving the XIR format so writing it is non-destructive (to allow generating pre/post optimization variants), tidying up output file argument handling in main, and adding separate flags for dumping -ir0 (initial IR generated from the AST) and -ir1 (final IR).
-
Next: Finish up register allocation and selection and final code generation from the IR. At which point it should be self-hosting through the full stage3 compiler. Guessing it'll wind up around 8000 lines of code total once that's done, but we shall see!
Not tiny, but not enormous either.
-
Cleaning up the validated AST form involved having the AST be more consistent about types, in particular pointers (which are generally not explicit in the language syntax, except when indicating if a struct field that is an array or struct is inline or not).
This resulted in an explosion of Type objects (2438 total, 1978 of them pointer-to-x types) when building the compiler.
Adding a pointer-to cache field in Type dropped that to 580 total. ~76% savings.
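A minimal sketch of the pointer-to cache idea, assuming each type node simply remembers the one pointer type that points at it. The `Type` class, its fields, and both helper functions are hypothetical illustrations, not the compiler's actual data structures.

```python
class Type:
    """A type node; `ptr_to` caches the single pointer-to-this type."""

    created = 0  # counts every Type object ever constructed

    def __init__(self, name, base=None):
        self.name = name
        self.base = base      # pointee for pointer types, otherwise None
        self.ptr_to = None    # cache slot, filled on first pointer_to() call
        Type.created += 1


def pointer_to_uncached(t):
    """Build a fresh pointer type on every request (the wasteful version)."""
    return Type("*" + t.name, base=t)


def pointer_to(t):
    """Return the cached pointer type for t, creating it only once."""
    if t.ptr_to is None:
        t.ptr_to = Type("*" + t.name, base=t)
    return t.ptr_to
```

A side benefit of caching is that two pointer types to the same base can be compared by identity rather than structurally.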
-
That took a few days to get sorted, and it's not entirely done until I fix up the stage3 compiler's IR generation to work with the revised AST that the validation phase now generates, but stage1 and stage2 pass all tests and a bunch of weird quirks from the early days of the project have been sorted out.
Responsibility for validating types, handling lhs vs rhs differences, managing other bookkeeping now lives entirely in the validator.
https://github.com/swetland/spl/commit/0f2f73521ea9ef55e9a6b57ae204b763bdf519a3
-
Needing to better formalize the rules for pointers (which only exist explicitly in structure or array type definitions to indicate if fields or elements are in-line or not) to make sure the implicit dereferencing (or not) happens correctly. Getting closer. Doing it post-parsing but pre-codegen is definitely feeling better than the original side-effect-of-codegen approach that got really messy.
Only a couple tests not passing with all these changes.
-
Putting the sr32 code generator back together around the new AST nodes the post-parser validation step generates/transforms, hoping that the result will indeed be simpler, cleaner, easier to follow, and worth the big mess I made tearing everything apart.
-
You might remember that I was working on my own programming language a few months ago. I've decided to write a blog post about what this language is, how it is implemented, and what my future plans are!
#programming #pldev #compiler #arm64
https://lisyarus.github.io/blog/posts/making-your-own-programming-language.html
-
Today's Compiler Project Adventures... sorting out transformation from the post-parser AST that is nearly 1:1 with the source text to an adjusted form reflecting operations that differ on the LHS/RHS of assignments, resolving various Type related issues, and so on...
I've been staring at FIELD (all caps) in AST_FIELD_... defines and AST debug dumps for so long now it has ceased being a word and is now an incomprehensible jumble of letters.
-
I was going to wash the dishes but inspiration struck and I had to sketch out a 2-bit reference counting scheme.
The refcount values are 0, 1, any (2), unused (3).
4 refcounts can fit in a byte.
If we pre-generate inc and dec (with conditional jump if refcount is 0) for each offset in the byte, each version of them takes only about 4 x86 instructions:
add imm to al,
test al against imm,
branch on zero (in dec) or parity (in inc),
and al, imm (only in inc, to correct for infinity-plus-one)
Objects with 0 refs are not reachable during GC (by definition), so that value (along with 3) can be reused as colors in the graph traversal (classic depth-first).
Objects with only one reference can then be absorbed.
Consider an array with a slice. If the slice is the only reference to the array, then the slice can become (in the #Smalltalk sense) the array, and relinquish its unreferenced portions to the allocator.
On the more sane side: if a reference is unique, we can use it to optimize a pure function into an in-place update.
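To make the packing concrete, here is a rough sketch of the four-state refcounts described above, stored four to a byte. It models only the counting rules, not the x86 instruction sequence. The function names, the sticky behaviour of the "any" state on decrement, and raising an error when decrementing a zero count are all assumptions made for illustration.

```python
ANY = 2      # saturated: "two or more references", treated as sticky
UNUSED = 3   # spare encoding; the post suggests reusing it as a GC color


def get_rc(counts, i):
    """Read the 2-bit refcount of object i (four counts per byte)."""
    return (counts[i // 4] >> ((i % 4) * 2)) & 0b11


def set_rc(counts, i, value):
    """Overwrite the 2-bit refcount of object i, leaving its neighbours alone."""
    shift = (i % 4) * 2
    cleared = counts[i // 4] & ~(0b11 << shift) & 0xFF
    counts[i // 4] = cleared | ((value & 0b11) << shift)


def inc(counts, i):
    """0 -> 1 -> ANY; once at ANY the count stays there."""
    rc = get_rc(counts, i)
    if rc == UNUSED:
        raise ValueError("slot does not hold a live refcount")
    if rc < ANY:
        set_rc(counts, i, rc + 1)


def dec(counts, i):
    """Return True when the count drops to 0; ANY is never decremented."""
    rc = get_rc(counts, i)
    if rc == 1:
        set_rc(counts, i, 0)
        return True
    if rc == ANY:
        return False   # true count is unknown, so leave it for the GC
    raise ValueError("decrement of a zero or unused refcount")
```

Because ANY never counts back down, objects that were ever shared can only be reclaimed by the tracing pass, which is where the spare encodings come in.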
Idk, #PLdev people, are these just mad ramblings or could this be used for something?
I should mention that the context is wanting to run a high-ish level language on old 8-16 bit (mostly Intel) PDAs. (which probably pushes the whole idea into mad rambling territory, if it wasn't there already :neofox_laugh_sweat: )
-
Today I expanded on the direct to backend compiler directives:
#emit puts the given string directly on the source code as a LinearOp
#funattr adds function attributes to the current function
#global puts the string into the top of the generated file
#local puts the string into the top of the current function
With these in place you can do things like add linear assembly (useful for inserting optimization fences or other shenanigans), hook up instrumentation, and configure your functions as you would with a C compiler (add always_inline, force loop unrolls, put a function into a given section, etc.). I think these form a base that could work for most or all backends I can think of, so they are not limited to the current C one.
-
Last #DecemberAdventure day, but work won't stop after today, have a really long trip ahead still, and will celebrate new year's on the plane.
In the meantime, I used the new `#emit` directives to move all the stdlib specific code from the backend into std and added the option to compile without main to create standalone lib code that could run on any target with a crt. I also added raw string literals that extend until (and including) the newline.
I would like to fully get rid of the need for libc, but I'm not familiar enough with macos syscalls to start writing assembly for those, would probably do that on my linux machine after I'm back home.
-
Some more #DecemberAdventure work from the bus:
- Improved codegen to avoid generating code for unused functions.
- Added the `#if` compiler directive for conditional compilation based on a given --flag.
- Added the `#error` compiler directive to ensure we have a way to signal a compilation error in some path (for example, unimplemented library functions for a given OS and such).
- Added the `#emit` compiler directive to be able to generate code directly on the backend verbatim. Now #badlang is a C macro assembler lol.
Here is everything together.
-
I did a short survey of #compiler backend targets: https://abhinavsarkar.net/notes/2025-compiler-backend-survey/
-
If someone were to write a new #compiler book today, what would you prefer the backend to emit? Learning about which backend would help the readers most these days?
#poll #compilers #PLdev #LangDev
-
I’m officially working part time so that I can have more time for myself to work on #ArkScript (and also rest, go on small adventures…)
And this month I’ll finally publish the next (hopefully last) major version of ArkScript, on which I’ve been working for about 3 years now!
https://arkscript-lang.dev
I’m looking for sponsors so that I can keep working on the project and deliver high quality code (https://github.com/sponsors/SuperFola)
-
Apparently this needs to be said? Bismuth VM is written in C, not C#. The reason there's .net stuff in the path of the hex editor I took a screenshot of is because the *compiler* is written in C#, because I don't hate myself nearly enough to be doing a whole bunch of string manipulation in C
-
I’ve started writing an article on the #ArkScript blog, to show how it differs from other Lisps (since ArkScript is just Lisp-inspired, not aiming to be a complete Lisp replacement/variant)
So far I’ve talked about scoping, namespacing, declaring variables and functions, and touched on quoting and data types.
What is an important point for you when looking at Lisp-like languages?
(Btw here is the blog https://arkscript-lang.dev/blog/)
-
New Bismuth VM blog post, detailing the life cycle of a hello world program for Bismuth from high level Bronze code to text-based IR, transpiled C, binary IR, and finally bytecode: https://enikofox.com/posts/hello-world-in-bismuth/
(Quiet) public/unlisted replies to this post will be shown on the blog as comments on the post
-
#ArkScript April 2025 update is up!
-
On another note, I’ve added instruction source location tracking to #ArkScript!
This means we can (finally) have runtime errors that point to the line which threw the error, and also walk up the call tree and display it with the line of each call!
However I’m still dueling with #msvc that loves generating weird errors at runtime (and my favorite OS, Windows, using backslashes in paths instead of forward slashes…)
-
Finally adding struct literals. Initially I was going to use a prefix to denote compound literals, like `#Vec(x = 1, y = 2)` but I found them aesthetically unappealing.
So alas, we overload the parenthesis syntax a bit (though I was already doing this for sum type literals anyway).
Also going with parens instead of curly braces. Technically parens mean grouping in the language, so it's somewhat consistent with other use cases.
-
wrote a tiny script to output an indexed aseprite image to hex (thanks to https://github.com/boombuler/aseprite-gbexport for letting me cheat off their homework) and now i have a lil yuzu test sprite blitted to the framebuffer :3
-
got the 8-bit palette going in my VM. had to add specifying data from a string of hex characters to my IR to get it there but it works :D
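As an illustration of feeding data in as a string of hex characters, here is a small sketch. The helper names and the assumption that a palette entry is three bytes (red, green, blue) are hypothetical; the post does not describe the VM's real palette layout.

```python
def parse_hex_data(text):
    """Decode a string of hex digits into raw bytes (ASCII whitespace is skipped)."""
    return bytes.fromhex(text)


def palette_entries(data):
    """Group raw bytes into (r, g, b) tuples, assuming 3 bytes per color."""
    if len(data) % 3 != 0:
        raise ValueError("palette data must be a multiple of 3 bytes")
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
```

For example, `palette_entries(parse_hex_data("000000 ff8800"))` yields black followed by an orange.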
-
ArkScript December 2024 update is here!
https://lexp.lt/posts/arkscript_update_december_2024/
Quite a long read because there were a lot of changes. I’m now thinking I should maybe do a 2024 wrap-up article?
Or people can just read https://lexp.lt/categories/arkscript/ if they want to see everything 🤔
-
deciding to just push system call args in order (so last arg at top of stack) made things so much easier and now i have MemCopy, MemClear, and MemSet system calls :D
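A sketch of why pushing arguments in order keeps the handlers simple: the handler takes the top n stack entries as one slice and they are already in call order. The handler names, the argument orders (dst, src, length and so on), and the bounds checks are all assumptions for illustration, not the VM's real system call interface.

```python
def pop_args(stack, n):
    """Pop n arguments that were pushed first-to-last; return them in call order."""
    if n > len(stack):
        raise IndexError("stack underflow")
    if n == 0:
        return []
    args = stack[-n:]   # last argument sits on top, so the slice is already ordered
    del stack[-n:]
    return args


def check_range(memory, addr, length):
    """Reject ranges that fall outside memory so slice writes never resize it."""
    if addr < 0 or length < 0 or addr + length > len(memory):
        raise IndexError("memory access out of bounds")


def sys_mem_copy(memory, stack):
    dst, src, length = pop_args(stack, 3)   # assumed argument order
    check_range(memory, dst, length)
    check_range(memory, src, length)
    memory[dst:dst + length] = memory[src:src + length]


def sys_mem_set(memory, stack):
    dst, value, length = pop_args(stack, 3)  # assumed argument order
    check_range(memory, dst, length)
    memory[dst:dst + length] = bytes([value & 0xFF]) * length


def sys_mem_clear(memory, stack):
    dst, length = pop_args(stack, 2)         # assumed argument order
    check_range(memory, dst, length)
    memory[dst:dst + length] = bytes(length)
```

With the opposite push order each handler would have to reverse its arguments or pop them one at a time in reverse.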
-
here's the code for the framebuffer shenanigans:
(mov fb (sys 0x30 0)) ; set graphics mode 0
(mov i 0)
(while (ltu i (mul 320 200)) {
  (stob fb i (band i 0xFF))
  (mov i (add i 1))
})
(sys 0x31) ; present back buffer
-
Programming Languages: Application and Interpretation
Shriram Krishnamurthi
Brown University
3rd Edition
@shriramk
#programminglanguages #PLdev #plt #lop #Racket #LanguageOrientedProgramming