Programmable Whitelist-based Configs: Embedding Rye in Go
15 points by refaktor
Just a suggestion: I believe allowlist and denylist are much better terms than whitelist and blacklist. Many non-native speakers don't get the meaning of the latter initially and need to actively learn them.
Thanks, I am a non-native speaker too. I didn't know the words denylist / allowlist until now, but I see why they would be better. I'll see; maybe I will just change it in the blog post. I was a little unsure whether these terms are problematic in any way, but later forgot about it.
Your entire response, @refaktor, is wonderful.
As a native English speaker, "denylist" and "allowlist" are completely intelligible and would work great!
Thank you for letting me crash your thread: today was rough but everything about your comment was just a delightfully refreshing breath of much-needed humanity :)
Cheers!
This is an interesting approach
Today I am using CUE as the config in one of my projects. Even did a module system in there.
I like the expressiveness, reminds me of Nix but without wanting to solve everything and having to install globally.
CUE was surprisingly nice for encoding data validation schemas.
I've started modelling a large data set with it and it's surprisingly nice
Thanks. Basically I wanted to build a tutorial on how to embed Rye into Go, and decided on a config theme ... then I fell into the config files rabbit hole :)
I will google CUE.
Every time i read about Rye, i want to play around with it, but i haven't found a cool usecase yet.
maybe i should just implement a raytracer in it
I'm afraid a raytracer will be quite slow. While flat code is quite fast for an interpreted language, implementing complex hot code is not Rye's forte.
I've used Rye in many different niches. Which ones do you work in, besides raytracing? :)
(edit: the article mentions Starlark at the end)
Starlark is a similar embedded configuration language; its Go implementation go.starlark.net/starlark is quite capable. By default it's fully hermetic and deterministic, and its syntax is a subset of Python (with one little asterisk around the load() function replacing import).
The main selling point of Rye here seems to be that it's Forth-ish, so the syntax is determined by the host program. It reminds me a lot of how Lisp-based programs often evolve into special-purpose DSLs (cf. https://wiki.c2.com/?LispIsTooPowerful).
I'm also in the middle of designing a new configuration language named Ficus, which is similar to Starlark but leans more toward Go in terms of syntax and has static typing. It's an interesting area: there's a surprising lack of little languages that fit in the gap between plain data structures (YAML, TOML) and full-powered embedded scripting languages (Lua, JavaScript).
I'm also in the middle of designing a new configuration language named Ficus, which is similar to Starlark but leans more toward Go in terms of syntax and has static typing.
I dabbled with Starlark a long time ago, and one of the things I was most annoyed about was the lack of static typing. I think since then there have been quite a few configuration-esque languages that have introduced static typing, but a fair number of those use Haskell syntax and, well, Haskell and derivatives convinced me that syntax matters a lot more than I previously thought. Looking forward to hearing more about Ficus!
I was trying to make a tutorial about how Rye can be embedded in Go programs and decided to theme it around configuration, although it could basically serve as a general embedding tutorial (since it shows how Rye can call Go code, and vice versa).
So I am not particularly trying to sell the idea that this is some great config solution. I suspect languages that focus on config can be more specific about the problems that arise in config scenarios, problems that go beyond capabilities.
Do you have a link to Ficus?
Do you have a link to Ficus?
Not yet, though I hope to have enough for a v0.1 within a month or two. It's being developed as part of a larger project (a Bazel-like build system) so huge chunks of the Ficus parser and interpreter are todo!() while I build out the higher-level layers.
It's not very interesting from a PL theory perspective, basically just a mishmash of Python + Go + Rust syntax on top of a Go-like type system.
    # can also group as `import { .. }` like in Go
    import "some/module.fig" as m
    import "other/module.fig" { imported_fn, ImportedType }

    @balsa.rule(
        name = "cc_library",
        attrs = {
            "srcs": balsa.attr.label_list(),
            "cc_flags": balsa.attr.string_list(),
        },
    )
    func _cc_library(ctx: &balsa.RuleContext) {
        let out = ctx.actions.declare_file(
            filename = ctx.attr.name + ".o",
        )
        # implementation of cc_library build rule here ...
        return balsa.FilesProvider.new(files = [out])
    }
The only maybe-interesting part is its number type, which has the range +/- 999,999,999,999,999,999,999 and 9 digits of fractional precision. JavaScript's unified number type (without floating-point semantics) + Python's "big enough for anyone" int type (without a hard dependency on arbitrary-precision arithmetic) + enough precision for nanoseconds in a timestamp.
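To put rough numbers on that range, here is a small Go sketch. It assumes the value is stored as a scaled integer counting units of 10^-9; that layout is a guess for illustration only, not something the comment states about Ficus. Under that assumption, the largest magnitude is 10^30 - 1, and the code checks how many bits that needs.

```go
package main

import (
	"fmt"
	"math/big"
)

const (
	// intDigits is the digit count of 999,999,999,999,999,999,999.
	intDigits = 21
	// fracDigits is the fractional precision mentioned in the comment.
	fracDigits = 9
)

// maxScaled returns the largest magnitude when the number is stored as an
// integer count of 10^-9 units, i.e. 10^(21+9) - 1. The scaled-integer
// layout is an assumption for illustration.
func maxScaled() *big.Int {
	limit := new(big.Int).Exp(big.NewInt(10), big.NewInt(intDigits+fracDigits), nil)
	return limit.Sub(limit, big.NewInt(1))
}

// fitsInSigned128 reports whether magnitude m fits in a signed 128-bit integer.
func fitsInSigned128(m *big.Int) bool {
	return m.BitLen() <= 127
}

func main() {
	m := maxScaled()
	fmt.Println("bits needed for magnitude:", m.BitLen())
	fmt.Println("fits in signed 128-bit:", fitsInSigned128(m))
}
```

If that reading is right, a fixed 128-bit integer would be enough, which would fit the "no hard dependency on arbitrary-precision arithmetic" goal. Nine fractional digits of a second is exactly one nanosecond of resolution.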
Not to be confused with the phonetically-similar Rhai language, which is also an embeddable scripting language but for Rust.
The granularity of this approach arises nicely from Rye's forthy principle of "no special primitives", but I don't think this essay clearly articulates why it's practically valuable to so severely subset the language in a config file.
When I think about the expressiveness of configuration files, I generally want to ensure that they have no side effects (apart from, perhaps, debugging facilities) and (ideally) that the configuration language is sub-Turing-complete, to facilitate reasoning about configurations. Turing-completeness can arise alarmingly easily with many simple sets of primitives, but can be blocked tidily with execution limits, as you provide extrinsically. Without side-effects the only other serious misbehavior that's possible is consuming excessive quantities of memory, which could likewise be avoided with an appropriate heap/stack limit.
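As a rough illustration of an extrinsic execution limit, here is a minimal Go sketch of a toy evaluator with a step budget. The `Op` type, the `Run` function, and the two opcodes are all hypothetical and have nothing to do with Rye's actual interpreter; the point is only that a host-side counter stops a looping program even though the toy language itself can loop forever.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStepLimit is returned when a program exhausts its step budget.
var ErrStepLimit = errors.New("step limit exceeded")

// Op is one instruction of a toy, side-effect-free config VM (hypothetical).
type Op struct {
	Code string // "add" or "jmp"
	Arg  int    // amount to add, or jump target
}

// Run executes prog and stops with ErrStepLimit once maxSteps
// instructions have been executed.
func Run(prog []Op, maxSteps int) (int, error) {
	acc, pc := 0, 0
	for steps := 0; pc >= 0 && pc < len(prog); steps++ {
		if steps >= maxSteps {
			return acc, ErrStepLimit
		}
		op := prog[pc]
		switch op.Code {
		case "add":
			acc += op.Arg
			pc++
		case "jmp":
			pc = op.Arg
		default:
			return acc, fmt.Errorf("unknown op %q at %d", op.Code, pc)
		}
	}
	return acc, nil
}

func main() {
	finite := []Op{{Code: "add", Arg: 2}, {Code: "add", Arg: 3}}
	fmt.Println(Run(finite, 100))

	// An unconditional backward jump never terminates on its own.
	loop := []Op{{Code: "add", Arg: 1}, {Code: "jmp", Arg: 0}}
	fmt.Println(Run(loop, 100))
}
```

A memory limit could be handled in the same spirit, by charging allocations against a budget, though that is not shown here.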
Just as a point of comparison, in Lil I draw hard lines between syntactic constructs (if/while/each/queries, etc.), primitive operators, and everything else. The eval[] function provided in the standard library offers all the syntactic constructs and primitive operators of the language, but only the variable bindings one explicitly passes in. Since the syntactic constructs and primitive operators are all pure, evaluated code is inherently sandboxed unless you give it access to some of the stdlib functions, and you have the option to wrap those functions individually in guard-rails before exposing them to eval[]-ed code fragments. Much less fine-grained than Rye's approach, but I get the practical benefits, as I see them, of whitelist-based capabilities.
You raise good points. I'm not saying the granularity is the only or the best way. Choosing specific words / functions to expose, or delivering config-specific ones, can have its merits, I think. You could also just register full Rye with one call, or specific Rye subcontexts / function groups, or make your own selection of functions. I imagine I wouldn't want all functions in my config language, but it all depends on the situation.
Rye also distinguishes between pure and regular builtins - and functions, so you could make a helper that only registers pure functions for example.
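To make the "helper that only registers pure functions" idea concrete, here is a hedged Go sketch. Everything in it (`Builtin`, `Env`, `Allow`, `PureOnly`, `hostBuiltins`) is a hypothetical stand-in and not Rye's embedding API; it only shows the general shape of an allowlist where the host decides which names a config script can see.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Builtin is a hypothetical host function plus a purity flag.
// This is not Rye's API; it only illustrates the allowlist idea.
type Builtin struct {
	Pure bool
	Fn   func(args ...string) string
}

// Env is the set of names a config script is allowed to call.
type Env map[string]Builtin

// Allow copies only the named builtins from all into a new Env.
func Allow(all Env, names ...string) Env {
	out := Env{}
	for _, n := range names {
		if b, ok := all[n]; ok {
			out[n] = b
		}
	}
	return out
}

// PureOnly copies only the builtins flagged as pure.
func PureOnly(all Env) Env {
	out := Env{}
	for n, b := range all {
		if b.Pure {
			out[n] = b
		}
	}
	return out
}

// Call runs a builtin if it was exposed, and fails otherwise.
func (e Env) Call(name string, args ...string) (string, error) {
	b, ok := e[name]
	if !ok {
		return "", fmt.Errorf("%q is not exposed to config code", name)
	}
	return b.Fn(args...), nil
}

// Names lists the exposed names in sorted order.
func (e Env) Names() []string {
	names := make([]string, 0, len(e))
	for n := range e {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// hostBuiltins returns everything the host could offer (made-up examples).
func hostBuiltins() Env {
	return Env{
		"upper": {Pure: true, Fn: func(a ...string) string { return strings.ToUpper(strings.Join(a, " ")) }},
		"join":  {Pure: true, Fn: func(a ...string) string { return strings.Join(a, ",") }},
		"print": {Pure: false, Fn: func(a ...string) string { fmt.Println(strings.Join(a, " ")); return "" }},
	}
}

func main() {
	env := PureOnly(hostBuiltins())
	fmt.Println(env.Names())
	fmt.Println(env.Call("upper", "debug"))
	_, err := env.Call("print", "hi")
	fmt.Println(err)
}
```

`Allow(hostBuiltins(), "upper")` would give the hand-picked variant instead, where each exposed name is listed explicitly.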
If you make a distinction between syntactic constructs, you just haven't been bitten by the REBOL bug yet :). Otherwise, each approach has some benefits; here I am exploiting some of those of Rye/REBOL/Red. As the nginx case shows, sometimes it could make sense to have "if" and not all the other control flow constructs.
There are a fair number of embedded language implementations in Go (see comparison table and tengo itself). It makes for a very good user experience when there's just a single static binary and no runtime dependency IMO.
When I implemented my custom TRMNL server, I ended up declaring screens with a custom JSX by leveraging esbuild and goja, which I think turned out quite nice.