The Wrong Question About Type Systems
16 points by voutilad
You use maps, vectors, sets. Not custom classes. Every function you already know works on every piece of data. No "how do I get the name out of this Person object." It is just (:name person).
I've worked with a Typescript codebase at work that does this: no classes, just plain objects. It's utter hell to work with because you can't hide anything behind abstraction boundaries. You can't enforce invariants like "this list must be nonempty" because someone can just write { key: [] }.
The author’s argument is to use a schema library like malli to validate data at the boundaries of an application and keep things loose internally. I do share your frustration with loose, dynamic type idioms, even inside a boundary protected by a schema. I just think there is a more generalized counterargument to be made for strict typing throughout an application.
You can't enforce invariants like "this list must be nonempty"
malli does in fact allow you to model data types with non-empty vectors. Also, for what it’s worth, it is possible to specify non-empty tuples in TypeScript, which may or may not meet your needs since tuples and arrays are related but not quite the same.
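For the TypeScript side, a minimal sketch (the NonEmpty alias is just an illustrative name):

    // A "non-empty array" type: the leading tuple slot requires at least one
    // element, and the rest spreads as an ordinary array of the same type.
    type NonEmpty<T> = [T, ...T[]];

    const ok: NonEmpty<number> = [1, 2, 3];
    // const bad: NonEmpty<number> = []; // error: source has 0 element(s)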
no classes, just plain objects. It's utter hell to work with because you can't hide anything behind abstraction boundaries.
In JavaScript (or TypeScript), when objects are created by factories instead of classes, the closure created by that factory is the abstraction boundary. One can also use getters, setters, or proxies with POJOs.
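A rough sketch of that pattern (makeRoster is a made-up example, not something from the article):

    // The factory's closure is the abstraction boundary: callers only get
    // back plain functions, and the underlying array is unreachable.
    function makeRoster(initial: string[]) {
      if (initial.length === 0) throw new Error("roster must be non-empty");
      const names = [...initial]; // private state captured by the closure
      return {
        first: () => names[0],
        add: (name: string) => { names.push(name); },
        size: () => names.length,
      };
    }

    const roster = makeRoster(["Ada"]);
    roster.first(); // "Ada" -- nothing outside the closure can empty `names`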
If you use runtime validation, that incurs runtime overhead, especially when working with objects the size of the ones in my work codebase. And there's no way to statically know "has this object been validated," so you have to be careful to get it right.
And we"re not using factories, we're just directly writing const o = {k: v};. But even if we were using factories, that wouldn't stop you from directly initializing objects anyway unless you have the factory include some field named [Symbol.wasProperlyInitialized] or something, at which point it's not clear what you get from POJOs.
(I'm aware of the fancy tricks you can pull in the TS type system, having done quite a few myself.)
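For anyone who hasn't seen those tricks, the usual shape is a branded type that records validation at the type level (the names here are illustrative):

    // The brand only exists at compile time, but it lets downstream code
    // require, statically, that validation has already happened.
    declare const validated: unique symbol;
    type Validated<T> = T & { readonly [validated]: true };

    interface Person { name: string; friends: string[] }

    function validatePerson(p: Person): Validated<Person> {
      if (p.friends.length === 0) throw new Error("friends must be non-empty");
      return p as Validated<Person>;
    }

    // Accepting only Validated<Person> turns "did we remember to validate?"
    // into a compile-time question instead of a runtime one.
    function bestFriend(p: Validated<Person>): string {
      return p.friends[0];
    }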
Funnily enough, in the Rust codebase I'm working on there's a case where I'm trying to change a particular ID from "ID of a Foo" to "ID of a Bar". We have runtime checks for this, but it would still be a lot easier to make the change if those were separate types.
I really don’t want to quibble because I share your sentiments about type safety, but playing devil’s advocate a bit more, if the author’s argument is to do runtime type checking at the boundaries of the application (inputs from users and external data sources), one still needs to do this kind of runtime validation in a statically typed application, right? It’s only when one uses runtime type checking to validate internal state in a dynamically typed language where static type checking performs better, and the author is arguing not to do that.
If you're loading in existing input from the outside world then yeah, you need to validate it even in a statically typed application. No disagreement there. The problem arises when one part of the application creates a new object that another one consumes.
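A hand-rolled sketch of that boundary step (a schema library would normally generate the check; parseUser and User are illustrative):

    // Validate untrusted input once at the boundary; past this point,
    // internal producers and consumers of User rely on the static type.
    interface User { name: string; age: number }

    function parseUser(raw: unknown): User {
      if (typeof raw !== "object" || raw === null) throw new Error("not an object");
      const { name, age } = raw as Record<string, unknown>;
      if (typeof name !== "string" || typeof age !== "number") {
        throw new Error("malformed User payload");
      }
      return { name, age };
    }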
Been thinking about this after going from 600k+ lines of Elm to 1M+ lines of Elixir. In Elm it was easy to coordinate, as the author puts it, a change across the entire code base (a large refactor). In Elixir I find it much, much more challenging. Like Clojure I do have access to a repl but I'm not writing my code there, I'm writing in static files that I then have to compile. I don't know how schemas work in Clojure, so Elixir's might be slightly different or even very similar. Either way I do have access to schemas but I have to know when and where to use them, whereas static types are always there (sometimes a language requires you to type them out and sometimes they are implicit regardless of being typed out).
I've also thought about this in comparison to my years writing JS, both pre-Node and during the early days of Node, pre-bundlers and build steps. In those early days there was a fairly fast feedback loop between writing and using the code. When bundlers and build steps got involved it started to slow the process. The description of working with Clojure sounds similar to those early days where changes can be tested nearly instantly. Elixir feels so much slower, where I first have to wait for a build step and only then can I test my changes before iterating.
I guess, regardless of the type system, it's nice when you can have tight iteration loops and tooling that coordinates well to give you feedback about your code. Also, I don't have a lot of Elixir experience, so maybe there's something I'm missing that would get me closer to what the author describes.
Like Clojure I do have access to a repl but I'm not writing my code there, I'm writing in static files that I then have to compile.
I haven't used Clojure, but it seems from context that it's being used the same way as Common Lisp: you are writing code in static files (usually), but evaluating and compiling them into an image in memory from the editor. You then have those definitions available to work with, test, and debug in the repl. Like the article says, you're poking at a live system, validating your mental model of it. I don't think anyone writes non-trivial code directly in the repl.
I've definitely not worked with a 1M line Elixir project, but the speed of Elixir/Erlang's incremental compile in dev has never been too bad IME. But I do agree that the spec/type situation makes large-scale refactors painful. I've told friends before that Phoenix's motto - Peace of mind from prototype to production - hasn't really panned out for me because that transition of prototype to production has always involved very painful refactors.
1M lines of Elixir has been far better than 1M lines of TypeScript or Java, I would not want to go back to those. But yeah, definitely has some gaps with the lack of required static types (I know there's work on the gradual types).
this is comparing clojure to a couple of the weakest static type systems (typescript and go; java would be similar)
typescript is especially bad, because the only purpose of the types is to cause compiler errors and maybe generate docs. the compiler checks the types, and then erases them
go's types are a little better, since they at least mean the compiler knows the offset of fields. but go doesn't have overloading or anything. and if you're passing by interface, it's basically the same as using deftype/defrecord in clojure, minus the static check
i don't mean that the conclusion is bad. for some ways of programming, clojure is a fine language and you won't miss them. but, i don't think the comparison tells you much, unless you look at type systems like C#, rust, haskell, or C++, where the types are doing more work
typescript is especially bad, because the only purpose of the types is to cause compiler errors and maybe generate docs. the compiler checks the types, and then erases them
A language's lack of typing at runtime or in later compiler stages is completely irrelevant to how "strong" its type system is. OCaml has what's generally considered a strong type system, but past the type checking stage it's (almost?) completely untyped. The difference with TypeScript (which I would personally consider a very strong type system with too many escape hatches) is that OCaml is built around its type system, so you can't write code that will work at runtime but is untypable or too complicated to type, as you often do with TS.
i know what you mean, about calling it strong/weak, but i have no idea what to call the property i'm talking about
it's about erasure and overloading, mostly. like, you also don't get runtime types if you do monomorphization in some languages, but monomorphization is very much "doing something" in a way that erasure isn't
(i can't comment on ocaml's presence or lack of these features, because i don't know ocaml)
but i have no idea what to call the property i'm talking about
Are you perhaps looking for Reflection?
Which of the 5 or more meanings of “strongly/weakly typed” are you using to judge these languages? I ask because your complaints about Typescript seem weird to me.
i'm using the words strong/weak loosely, and i've described what i mean. no need to go to a dictionary but if you have a suggestion on what to call that, i'm interested
basically, do the types matter to the meaning of the program (besides a filter on whether or not it compiles)
Typescript has one of the most sophisticated static type systems, so the types can do a lot of work to prevent bugs happening later on. You complained that it’s bad that types are erased, then you went on to praise a bunch of languages that erase types. And it’s unusual to say Golang’s type system is better than Typescript’s. Which is why I was hoping you would unpack your comment.
typescript has sophisticated types, but the only thing it does with them is reject programs. they're not really used by the compiler for anything other than validation. if you take a valid typescript program and remove all the types, you have the same program
go's types aren't fancy at all. but if you have a function with a pointer-to-struct arg, the compiler will use the struct's definition to emit "load at offset" instruction when you access a field. the types are doing work, beyond just rejecting programs
the other languages i mentioned all use the types for other purposes. the types actually matter for what code gets emitted, and what the program does
rejecting programs can be useful. but it's also not the only thing a type system can do
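To make the erasure point concrete: for a valid TypeScript program, the emitted JavaScript is the same program with the annotations stripped (a trivial illustrative function):

    // TypeScript source:
    function area(r: { w: number; h: number }): number {
      return r.w * r.h;
    }

    // What tsc emits -- the annotations are gone, nothing else changes:
    // function area(r) {
    //   return r.w * r.h;
    // }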
I think those are two different properties that a type system can have: describing physical layout and enforcing invariants.
I would consider a type system "strong" when it has the capability of strongly enforcing invariants and rejecting programs that don't uphold them: in this case Go would be a weak type system because of its policy towards zero values, error handling, pointer dereferencing, etc., while TypeScript is much stronger because it has lots of expressive power.
Describing the physical layout of data is a secondary function of a type system and does not factor into considering a type system strong or weak. I think in theory it could be a wholly separate system from the type system (meaning that you could tell the compiler that a field is a pointer to a struct, but whether the field is a pointer or embedded into the containing struct would be transparent to the type system and the language), but in most cases it makes sense to bundle them together.
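A small illustration of the "enforcing invariants" side in TypeScript (Result here is hand-rolled, not a built-in):

    // A value is either Ok or Err -- never a half-initialized zero value --
    // and under strict settings every case must be handled before returning.
    type Result<T, E> =
      | { kind: "ok"; value: T }
      | { kind: "err"; error: E };

    function unwrapOr<T, E>(r: Result<T, E>, fallback: T): T {
      switch (r.kind) {
        case "ok":  return r.value;
        case "err": return fallback;
        // adding a third variant makes this switch non-exhaustive, and the
        // compiler rejects the function instead of letting it fall through
      }
    }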
basically, do the types matter to the meaning of the program (besides a filter on whether or not it compiles)
One possibly relevant distinction is between Church types and Curry types. But a separate issue is that TypeScript's gradual type system is explicitly unsound as most (all?) practical gradual type systems tend to be.
There was a strand of research into sound gradual type systems, exemplified by Typed Racket. A few months ago I read the paper Is sound gradual typing dead? which discusses its performance problems due to the way Typed Racket ensures soundness when crossing the boundary between dynamically and statically typed code, and its desire to assign blame for type errors as accurately as possible to the code that’s at fault. There were a few informative comments on the orange site. (My link log posted it there automatically; it was previously discussed here too.)
The biggest bugs I have seen were never type errors. They were misunderstandings about the domain. They were wrong assumptions about what the system should do. They were coordination failures between teams or between code and reality. Type systems do not catch these. You need testing anyway. And the coupling that types introduce makes the system harder to change when you discover you were wrong.
I think few proponents of (static) type systems would claim that types would catch these bugs, especially not for the exemplar typed languages Typescript and Go. Though I do believe there is merit in having stronger type systems that allow for tighter coupling between code and expectation, I think that is a discussion for a blog post comparing Haskell to Lean.
It is still possible for type systems to have utility if they only catch small bugs.
I mostly program in statically typed languages, but sometimes I write a quick Python utility program. It is only after rerunning your code 6 times to fix all of the typos and silly mistakes that you can truly appreciate the splendor of your code Just Working on its first successful compilation. And --- of course --- I can and should be using a linter or what-have-you that would detect all of these problems, but a sufficiently powerful linter should converge to some kind of a type system anyway.
There are enough things to keep track of when programming (like the aforementioned disconnects between code and reality). I prefer to offload as much menial validation to my computer as is reasonably possible. And I am optimistic about what can be done to alleviate the burdens this brings, such as what the author alludes to in the last quoted sentence.
I do think that having a good REPL is a (possibly huge) boon to a language (where is the Rust REPL?), but I am not sure that a static type system precludes one. And even compiled languages can have serviceable REPLs (such as GHCi).
Look, I like Clojure, but...
On easier code review: In practice, dependency injection-heavy typed OOP codebases have a different problem. You click on a method and land on an interface. You click on the interface and find four implementations. Which one is actually called? You check the dependency injection configuration, maybe in a separate XML file or scattered across annotations. [...]
The author should review the Polylith architecture, and maybe compare it with their description of dependency injection. For those not familiar, it allows you to break up your code into reusable components, including, possibly, multiple implementations of the same interface, and then combine those components in different ways at build-time using a config file in order to have builds that serve different clients and use-cases.
And that's before you even start using the (absolutely delightful, imo) integrant library! (Or, for that matter, any of the other DI libraries for Clojure.)
In Clojure: it is a function. Click on it and you land on the code, not an interface.
... unless that function is a multimethod, because it turns out that runtime polymorphism is useful sometimes.
You catch this with tests, but you have to write them.
A usual argument. I really like type systems doing this job for me.
This is so detached from what type systems are to a logician or computer scientist that it is hard not to get lost in semantics and definitions here.
One thing is refactoring the name of a variable; another is completely swapping implementations and having the compiler guide you, via the type system, until it's done.