When if is just a function
41 points by veqq
Tcl is (infamously?) like this as well: {} are just a kind of string quote, so if is a perfectly normal procedure, and you can write your own control structures. It also allows such delights as If we had no if (archive since the wiki is down for me at the moment).
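A rough Python sketch of the idea, not of how Tcl itself is implemented: if the branches are just strings, then `if` can be an ordinary function that decides which string to evaluate. The name `tcl_if` and the explicit `scope` dict are made up for illustration; the dict stands in for evaluating in the caller's scope.

```python
def tcl_if(cond_src, then_src, else_src, scope):
    """An ordinary function: all three arguments are plain strings of code."""
    # Only the condition and the chosen branch are ever evaluated.
    if eval(cond_src, {}, scope):
        return eval(then_src, {}, scope)
    return eval(else_src, {}, scope)


scope = {"x": 5}
# The else branch would raise ZeroDivisionError if it were evaluated.
result = tcl_if("x > 3", "'greater'", "1 / 0", scope)
```

Because the untaken branch stays an unevaluated string, the division by zero never happens.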
In REBOL (Red and Rye) these are not strings, but blocks of REBOL values. REBOL was "famous" for having more than 30 datatypes, many of them literal datatypes (you can enter them directly in code or the REPL rather than converting to them), and this was very useful for dialecting as it provides additional information. For example, URL, email address and file path were each their own datatype, so you could just do read %file.txt and read https://www....
Rye also has a lot of datatypes and adds the idea of kinds and generic methods on top of them (as seen in Factor). Generic methods are used extensively, especially against external libraries where you work with certain resources, but there are still some open design questions around them.
In Rye you can write Get https:/www..... where Get is a generic method that dispatches on the https-schema kind. There can be other generic Get methods. For example, a key-value DB could have an unrelated Get method.
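For readers who don't know Rye, here is a loose Python analogy for a generic method that dispatches on the kind of its first argument, using `functools.singledispatch`. The classes `HttpsUri` and `KeyValueDb` and the function `get` are invented for this sketch and are not Rye's API; the HTTPS variant only returns a description instead of doing a real request.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class HttpsUri:          # hypothetical stand-in for an https-schema kind
    url: str


@dataclass
class KeyValueDb:        # hypothetical stand-in for a key-value store kind
    data: dict


@singledispatch
def get(resource, *args):
    # Fallback when no method is registered for this kind.
    raise TypeError(f"no get method for kind {type(resource).__name__}")


@get.register
def _(resource: HttpsUri, *args):
    return f"would fetch {resource.url}"


@get.register
def _(resource: KeyValueDb, key):
    # Unrelated meaning of "get": look up a key.
    return resource.data[key]
```

The two registered functions share a name but nothing else, which is the point made above about unrelated Get methods coexisting.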
I think that's the better design. Tcl's string-based approach results in some odd behavior, like comments with unbalanced braces in them behaving in unexpected ways. Also, I think the Tcl approach marries you to dynamic scoping.
I don't have direct experience with Tcl, but having specific types and more type sensitive code seems a positive to me, even if the language is dynamic.
One interesting thing that somehow evolved in Rye: it is dynamically typed, but it is now constant by default. If almost everything has a constant value, that seems much more constrained than almost everything having a "constant" type (Go, for example, also has Any).
Combinatory programming (and array programming) offers such functional control flow. Here is a straightforward explanation which inspired me to write better-cond in Janet:
(defn better-cond
  [& pairs]
  (fn [& arg]
    (label result
           (defn argy [f] (if (> (length arg) 0) (f ;arg) (f arg))) # naming is hard
           (each [pred body] (partition 2 pairs)
             (when (argy pred)
               (return result (if (function? body)
                                (argy body) # calls body on args
                                body)))))))
Most Lisps have cond like this:
(def x 5)
(cond
  ((odd? x) "odd") ; note wrapping around each test-result pair
  ((even? x) "even"))
Clojure (and children Fennel and Janet) don't require wrapping the pairs:
(def x 5)
(cond
  (odd? x) "odd"
  (even? x) "even")
My combinatoresque better-cond doesn't require a variable at all and is simply a function call which you can map over etc.:
((better-cond
   odd?   "odd"
   even?  "even") 5)
Of course, it can work over multiple variables too and have cool function output:
(defn recombine # 3 train in APL or ϕ combinator
  [f g h]
  (fn (& x) (f (g ;x) (h ;x))))
(((better-cond
  |(function? (constant ;$&))
  |($ array + -)) recombine) 1 2) # |( ) is Janet's short function syntax with $ as vars
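For anyone who doesn't read Janet, a rough Python rendering of the same idea may help. This is my own sketch, not a line-by-line port: `better_cond` takes alternating predicate/body arguments and returns a function, and a body that is callable is applied to the arguments while any other body is returned as-is.

```python
def better_cond(*pairs):
    """Build a function from alternating (predicate, body) arguments."""
    def run(*args):
        # Walk the pairs in order and stop at the first predicate that holds.
        for pred, body in zip(pairs[0::2], pairs[1::2]):
            if pred(*args):
                return body(*args) if callable(body) else body
        return None  # no predicate matched
    return run


classify = better_cond(
    lambda n: n % 2 == 1, "odd",
    lambda n: n % 2 == 0, "even",
)

# Two-argument case with callable bodies.
distance = better_cond(
    lambda a, b: a > b, lambda a, b: a - b,
    lambda a, b: True,  lambda a, b: b - a,
)
```

Since the result is a plain function, `map(classify, [1, 2, 3])` works without naming a variable in the condition.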
Surprised not to see more of a discussion of if as expression, common in functional languages, and the issue of mandatory “else” branch.
I think that's because the two topics are orthogonal to each other. You can have if be an expression and a special form at the same time, but this post has been mainly about what if it's not a special form, regardless of where it stands.
That being said, yeah, I love if expressions myself, it's not impossible to code without it, but it makes a lot of forms nicer if it's present. Especially conditional assignment.
Yes, the focus of the blog post was on whether these are special forms or not. The focus also wasn't on if specifically, although in the title I used if as the most common example to get the message across.
In Rye everything returns something, hence everything is an expression. In REBOL I think print didn't return anything which I found odd given that all (or at least most) else did return. Rye doesn't have a null value, either.
I think that's because the two topics are orthogonal to each other.
Then why did TFA bring it up as if they were related?
In Forth, IF is just another word (Forth for "function"). It does, however, do a bit more than just conditionally evaluate what follows: it typically generates code for a conditional branch. PostScript (another language based on Reverse Polish Notation) is a bit more pure in that its if actually takes two parameters, a bool and a block.
Yes, Forth and Factor are among the languages that inspired REBOL (Forth) and Rye (Factor).
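To make the PostScript-style variant concrete, here is a toy stack machine in Python where `if` is an ordinary operator that pops a block and a bool. It is only a sketch under my own assumptions: the operator table, the names `run`, `op_gt` and `op_if`, and the use of Python lists as blocks are all invented for illustration.

```python
def run(program, stack=None):
    """Execute tokens left to right against a shared stack."""
    stack = [] if stack is None else stack
    for token in program:
        if isinstance(token, str) and token in OPERATORS:
            OPERATORS[token](stack)
        else:
            # Literals and blocks (nested lists) are pushed as values.
            stack.append(token)
    return stack


def op_gt(stack):
    b, a = stack.pop(), stack.pop()
    stack.append(a > b)


def op_if(stack):
    # Takes two parameters from the stack: a bool and a block.
    block, cond = stack.pop(), stack.pop()
    if cond:
        run(block, stack)


OPERATORS = {"gt": op_gt, "if": op_if}

# Roughly: 5 3 gt { (greater) } if
taken = run([5, 3, "gt", ["greater"], "if"])
skipped = run([1, 3, "gt", ["greater"], "if"])
```

The block is just data on the stack until `if` chooses to run it.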
It gets even cooler when if is just a method:
a > 3 ifTrue: { 'greater' } ifFalse: { 'less' }.
In most other languages, even when if is just a function, it's not a function that can be implemented in the language itself, it needs some primitive definition of conditional branching that's not in the language. In Smalltalk (and Objective-S) it is.
Because ifTrue:ifFalse: is a method that is implemented in both the False and True classes:
class True : Boolean { 
    -ifTrue: trueBlock ifFalse: falseBlock {
            trueBlock value.
    }
}
class False : Boolean { 
    -ifTrue: trueBlock ifFalse: falseBlock {
            falseBlock value.
    }
}
The reason, of course, is that conditional branching is included in the dynamic message dispatch primitive.
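The same trick can be imitated in Python, as a sketch only: two classes that each implement the method differently, so choosing a branch is nothing more than method dispatch. The names `TrueClass`, `FalseClass`, `if_true_if_false` and `greater_than` are mine, and lambdas stand in for blocks.

```python
class TrueClass:
    def if_true_if_false(self, true_block, false_block):
        return true_block()


class FalseClass:
    def if_true_if_false(self, true_block, false_block):
        return false_block()


def greater_than(a, b):
    # Indexing by the comparison result picks the class without an if statement.
    return (FalseClass, TrueClass)[a > b]()


answer = greater_than(5, 3).if_true_if_false(
    lambda: "greater",
    lambda: "less",
)
```

Neither class contains a conditional; the only decision is which object receives the message.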
True and False are their own types and you dispatch on Type. Cool :)
Is there a concept of True and False belonging to Boolean? Can you make a function that dispatches just on both?
You might wonder: “Won’t the block execute immediately when passed as an argument?” Here’s the key insight: in Rye, code blocks { ... } are values.
This is really cool, I don't know any other language where this is the case.
Tcl, Smalltalk, Postscript. Arguably Ruby.
This kind of block is basically a lambda expression with minimal syntactic noise, with the caveat that languages like these often don’t do lexical scoping.
In PowerShell, {} is a lambda function. The downside of the syntax is that defining parameters is slightly convoluted, but you can write very keyword-like commands with it.
e.g.
1,2,3 | foreach {$_ + 1}
foreach is just an ordinary command that receives a lambda function as a parameter and invokes it for each value from the pipeline.
Cases like this are where the "oddball" idea of injected blocks came from: the left value is injected before the block is evaluated, and this is used in a lot of functions in Rye.
{ 1 2 3 } |for { + 1 }    ; this would return result of the last loop so 4
{ 1 2 3 } |map { + 1 } ; would return { 2 3 4 }
I hate to be that person, but:
What does [ mean and how is it different from {?
I couldn't find a place in the docs where this is explained. This is very annoying.
Your question is perfectly valid. There was no difference between { } and [ ] for several years, but this year [ ] became the vals (reduce in REBOL) block, and ( ) is the do block. This is so recent that I haven't yet properly documented it in "Meet Rye". Constant by default, mod-words and the var function are also new. Otherwise the language design was pretty stable for a few years.
( 1 + 1 "hel" ++ "lo" )      ; is the same as
do { 1 + 1 "hel" ++ "lo" } ; both return last result of the expression "hello"
[ 1 + 1 "hel" ++ "lo" ]        ; is the same as
vals { 1 + 1 "hel" ++ "lo" } ; in rebol vals=reduce and returns { 2 "hello" } 
{ 1 + 1 "hel" ++ "lo" }         ; just returns itself { 1 + 1 "hel" ++ "lo" }
Thanks. To check my understanding: in this particular blog post, all square brackets can be replaced with curly brackets.
Rye is pretty cool, it has everything I could wish for in a scripting language, except for the two most important things: arg parsing and subprocess management.
Hi, thank you for your feedback.
Do you think any specific arg parsing, or what would that arg parsing look like? Scripts have access to script arguments in general but it doesn't have any higher level functionality related to script arguments.
Can you also give me some more information on the subprocess management you would need / expect? Maybe an example and a use case. (Rye has goroutines inherited from Go)
You have no idea, how happy I am that you ask :D If you actually implement this, I'll start testing out Rye seriously as my replacement for bash and python scripts.
So regarding argparsing: It's a spectrum and everything has tradeoffs, additionally it's a matter of taste. Take python for example. You can construct quite complex parsers, but it's verbose. Still way better than what you need to do in bash. On the other end of the spectrum, we have something like Raku, which does everything for you automagically, but robs you of certain freedoms, it also doesn't really follow the default unix-conventions, if I recall correctly, I'm not sure though. I think ideal would be some sort of DSL that allows for absolute minimal syntax for the common cases, and still allows you to expand for more complex examples. E.g.: to just have boolean flags, argument flags, i.e. flags with additional value, and positional arguments you should have to write nearly nothing. To have modes (cargo build, cargo run), you should not need much more. To restrict the valid values, to add help-strings or to have flags with multiple arguments, you could then use more elaborate syntax.
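For reference, this is roughly what the Python end of that spectrum looks like with the standard `argparse` module: a boolean flag, a flag with a restricted value, positional arguments and two modes. The tool name, modes and option names are invented for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="tool")
    # Boolean flag that belongs to the executable, not to a mode.
    parser.add_argument("-v", "--verbose", action="store_true")

    # Modes, in the style of `cargo build` / `cargo run`.
    modes = parser.add_subparsers(dest="mode", required=True)

    build = modes.add_parser("build")
    # Flag with a value, restricted to a set of valid choices.
    build.add_argument("--target", default="debug",
                       choices=["debug", "release"],
                       help="build profile to use")

    run = modes.add_parser("run")
    run.add_argument("script")            # positional argument
    run.add_argument("args", nargs="*")   # remaining positionals
    return parser


build_args = build_parser().parse_args(["-v", "build", "--target", "release"])
run_args = build_parser().parse_args(["run", "x.rye", "a", "b"])
```

It covers the cases listed above, but even this small interface takes around fifteen lines, which is the verbosity a DSL would be competing with.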
As for subprocess management: The simple cases are: start a process, synchronously or asynchronously, though sync might suffice if you can start a process in a goroutine. Be able to catch its output or redirect it. Be able to merge the output streams (and redirect the merged stream), and be able to provide input. So much for the basics.
Then it gets harder: start a process asynchronously and be able to read out the streams while it runs, and to provide input based on the outputs. Most subprocess libraries only offer the possibility to set the data that should be sent to stdin before the process is started, but not after starting it. Additionally, if you could do simple call chains like ls -la | wc -l and then catch the last output, that would be very nice too.
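To illustrate the harder case, here is a small Python sketch using `subprocess.Popen`: the parent starts a child, reads a line of its output while it is still running, and chooses the next input based on that output. The child is a made-up stand-in (a Python one-off that upper-cases each line), and `converse` is a hypothetical helper name.

```python
import subprocess
import sys

# Hypothetical child process: upper-cases each input line and answers at once.
CHILD = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    print(line.strip().upper(), flush=True)\n"
)


def converse(first_message):
    with subprocess.Popen(
        [sys.executable, "-u", "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered pipes on the parent side
    ) as proc:
        replies = []
        proc.stdin.write(first_message + "\n")
        proc.stdin.flush()
        replies.append(proc.stdout.readline().strip())

        # The second input depends on what the child just printed.
        proc.stdin.write(f"got {replies[0]}\n")
        proc.stdin.flush()
        replies.append(proc.stdout.readline().strip())

        proc.stdin.close()  # signal end of input so the child exits
        proc.wait()
        return replies, proc.returncode
```

This only works because both sides flush after every line; a child that buffers its output would make the `readline` call block, which is part of why this case is harder than it looks.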
About argument parsing. That is exactly why I didn't yet make anything more specific about it. I see Go has libraries for this, but I'm not sure they cover all the cases or if they are just too strict / static and I should write something more specialized for Rye. Basically I don't know what are the better solutions ... for myself, so far I always sort of did my own parsing, but I didn't have complex or standard CLI needs so far. Go has flag package, but it's similar to what you describe in Raku (not fully standard, still limited).
Do you have any preferences or examples what is the right level of features or structure it should provide? I will look at what Raku does.
About processes, I need to think and experiment with it. We have the cmd and cmd\capture functions, but they are rather basic as I didn't need them that much so far. Can you give me a practical example where you would need this, so I have a realistic goal / motivation and also the ability to test, and maybe it will be useful for me too :)
Like I said, I'd prefer some sort of DSL for argparsing, where the most common scenarios are the briefest ones. Even though Python is somewhat verbose, I'm still fine with it. Argparsing can get complicated, for example when you want to separate arguments to the executable from arguments to the mode, and designing something that's elegant and complete is a non-trivial task. Rust's clap crate does a pretty good job, maybe you could take inspiration from it and turn it into a DSL? It works by letting you define a structure which holds all possible argument values, which is used to automatically generate a parser. For the details, you can use annotations on the struct.
For subprocesses: My needs mainly originate from writing test scripts for interactive programs, and utility scripts. One example: I have a script that starts multiple processes: one http-server, one vue-frontend and emulators for embedded devices that need to interact with each other and the http server. The script filters the outputs of the server and the emulators for certain things and reacts to those outputs.
In lisp these are called FEXPRs. They are equivalent to special cases in eval itself. Powerful stuff. In my lisp, I used them to implement if as a library function that gets imported like all the others. It simplified eval into its true essence: self-evaluating values, and function application.
Lisp sometimes has a "scary words" problem. I've noticed FEXPRs many times, but since I can't even pronounce the word I haven't really looked at what exactly they are. From your description and the first sentence of Wikipedia it seems that yes, this is how Rye (REBOL) evaluates all code, whereas in Lisp, if I understand correctly, this is activated per function type (FEXPR). Interesting.
So all FEXPR Lisp would be like Rebol?
I find that it's easier to explain lisp stuff with Python syntax.
Lisp works like this:
def eval(code):
    ...  # magical stuff
def lisp_if(condition, true_case, false_case):
    if eval(condition):
        return eval(true_case)
    else:
        return eval(false_case)
def lisp_gt(x, y):
    return x > y
# etc.
code = ["if", [">", ["+", 5, 5], 20],
              ["print", "true??"],
              ["print", "false, as expected."]]
eval(code) # prints "false, as expected."
The lisp_gt function is a normal function whose arguments are evaluated. So eval itself does the work of reducing the arguments to values.
The condition would be evaluated like this:
eval([">", eval(["+", 5, 5]), 20])
eval([">", 10, 20])
False
The numbers evaluate to themselves and are passed to +, which returns 10. This gets passed along with 20 to > which returns False.
When implementing the > function in the form of lisp_gt, there's no need to write eval(x) > eval(y) because eval itself does it implicitly.
The lisp_if is an FEXPR which just skips all that. So when it executes it's like this:
def lisp_if(condition, true_case, false_case):
    # condition  = [">", ["+", 5, 5], 20]
    # true_case  = ["print", "true??"]
    # false_case = ["print", "false, as expected."]
    if eval(condition):
        # ...
It gets the unevaluated code itself as its arguments. It can evaluate it however it wants or even not at all. The if function only evaluates two out of three parameters.
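If you want to run this, here is one way to fill in the "magical stuff" as a minimal sketch. I renamed the evaluator to `lisp_eval` to avoid shadowing Python's built-in `eval`, and the two lookup tables `FUNCTIONS` and `FEXPRS` are my own assumption about how to tell the two kinds of callee apart; a real Lisp would do this differently.

```python
def lisp_gt(x, y):
    return x > y


def lisp_add(x, y):
    return x + y


def lisp_if(condition, true_case, false_case):
    # FEXPR: receives code and decides for itself what to evaluate.
    if lisp_eval(condition):
        return lisp_eval(true_case)
    return lisp_eval(false_case)


FUNCTIONS = {">": lisp_gt, "+": lisp_add, "print": print}
FEXPRS = {"if": lisp_if}


def lisp_eval(code):
    if not isinstance(code, list):
        return code  # numbers and strings evaluate to themselves
    name, *args = code
    if name in FEXPRS:
        return FEXPRS[name](*args)  # arguments passed unevaluated
    # Normal function: evaluate every argument first, then apply.
    return FUNCTIONS[name](*[lisp_eval(arg) for arg in args])


code = ["if", [">", ["+", 5, 5], 20],
              ["print", "true??"],
              ["print", "false, as expected."]]
lisp_eval(code)
```

With `if` moved into a table entry, `lisp_eval` itself only knows about self-evaluating values and application.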
I'm not sure if Rebol internally works like this. It might not even have an eval function. It does seem to have similar mechanisms and ideas though.
you “just” have to make function calls [...] as fast as possible
And, unless I'm missing something, you also have to deal with the GC pressure created by allocating a closure every time you use a conditional.
No, Rye follows REBOL in this case. Plain block invocation doesn't create its own scope / context. That holds for do, if, either, loop, for, map, etc.
Yes, it would be costly to have this ON by default. If you want separation, there are many ways to achieve it. Rye has many functions related to contexts / scopes.
A lot of builtins directly accept anonymous functions in place of blocks of code if you want that. For example for function:
for { 1 2 3 } fn { x } { print x }
; which can also be written with fn1 where first arg is anonymous and injected into block
for { 1 2 3 } fn1 { .print }
In Rye (not REBOL) the default set-word ( word: 123 ) creates a constant, so the few cases where you need to modify a variable visibly stick out. You can define a variable via the var function ( var 'i 0 ) or via a mod-word ( i:: inc i ), which is also needed to modify a variable. Using a set-word on it would produce an error, and modifying a word that was first assigned via set-word also produces an error.
There is no point in creating a closure for the argument to if because the block cannot go out of its original scope.
Yeah, it's evaluated right there and then, so a closure specifically isn't needed. You can create a subcontext explicitly via fn, context or any other related method, but it has a cost.