A poor man’s types
57 points by chreke
A lot of types-vs-tests wordslinging essentially tries to compare them on how well they do the same job (finding bugs), but I get the most value from types at the point of designing a program, when I can find out if what I’m trying to make can even logically exist. After that, I can test whether it does what I want it to. They’re different things!
One of the biggest advantages of types and tests in my mind is that they serve as documentation.
The type signature on a function tells me precisely what I need to pass it, and what I’m guaranteed to get back in return. This should all be theoretically present in the docs, but in practice they’re often incomplete or ambiguous, whereas the type signature is checked by the compiler so you know it’s complete and accurate.
A test case for a function shows me a single full example of how to call it.
These are nicely complementary; neither subsumes the other.
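To make the complementarity concrete, here’s a minimal Haskell sketch – all the names (lookupUser, UserId, alice) are made up for illustration:

    import Data.Map (Map)
    import qualified Data.Map as Map

    newtype UserId = UserId Int deriving (Eq, Ord, Show)
    newtype User   = User String deriving (Eq, Show)

    alice :: User
    alice = User "alice"

    -- The signature alone documents what must go in and what comes
    -- back out (including that the user may be absent):
    lookupUser :: UserId -> Map UserId User -> Maybe User
    lookupUser = Map.lookup

    -- The test documents one full, concrete example of a call:
    test_lookupUser :: Bool
    test_lookupUser =
      lookupUser (UserId 42) (Map.singleton (UserId 42) alice) == Just alice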
I agree; in most scenarios you want both. What I find interesting is a kind of “duality” between tests and types:
You have to put a lot of effort into writing tests to assert things that a type system would give you for free. On the other hand, there are many kinds of tests (e.g. does this string match this regular expression?) where the assertion essentially comes for free, and which can only be replicated in a type system with a great deal of effort (e.g. dependent types).
Both tests and types constrain the valid states of your program. An invariant can be upheld by a type or a test, depending on what is most appropriate for the language/paradigm/style/aesthetic.
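A minimal sketch of that duality, with a made-up “five-digit zip code” invariant in Haskell:

    import Data.Char (isDigit)

    -- Upheld by a type: the smart constructor makes invalid zip codes
    -- unrepresentable everywhere downstream.
    newtype ZipCode = ZipCode String deriving (Eq, Show)

    mkZipCode :: String -> Maybe ZipCode
    mkZipCode s
      | length s == 5 && all isDigit s = Just (ZipCode s)
      | otherwise                      = Nothing

    -- Upheld by a test: the assertion is nearly free to write, but it
    -- only checks the outputs you remember to check.
    test_zip :: Bool
    test_zip =
      let z = "90210"  -- imagine this came from the code under test
      in length z == 5 && all isDigit z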
I’ve used types and tests the opposite way!
As a thought experiment: write down the ML-style types for the AST of a Unix shell right now. I claim that you can’t do it – you need to “measure the world” first, by writing tests.
So that is what we did for https://oils.pub/ , starting with black box reverse engineering of other shells:
https://oils.pub/release/0.29.0/test/spec.wwz/osh-py/index.html
And then later adding types. It was a dynamically typed program, and now it’s 100% statically typed. I think of it as “schema discovery” or “type discovery”: https://github.com/oils-for-unix/oils/blob/master/frontend/syntax.asdl [1]
The static typing has really helped keep the program clean in the last few years – there is less fear of refactoring. But in the early days, dynamic typing makes the code SHORT, and avoids committing to a structure that has to be unwound later. I.e. the structure for 10% of the program is different from the one for 60% of the program.
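For flavor, here is a drastically simplified, made-up fragment of what “ML-style types for a shell AST” might look like, written as Haskell data types – the real syntax.asdl schema is far bigger, which is the point:

    -- Hypothetical, tiny fragment of a shell AST; not the real schema.
    data ShellWord
      = Literal String          -- echo hi
      | VarSub String           -- $x
      | CommandSub Command      -- $(ls)

    data Redirect = Redirect Int FilePath   -- e.g. 2> err.txt

    data Command
      = Simple [ShellWord] [Redirect]   -- ls -l > out.txt
      | Pipeline [Command]              -- ls | wc -l
      | AndOr Command String Command    -- a && b, operator kept as a string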
So yes I think this only underscores the point that types and tests are different. They are complementary techniques for correctness, not competing ones. And you can even do things in EITHER order!
[1] On schema discovery: You will see a similar thing in data science/ML. Because you need to write code to discover what’s in the data. The data is messy, and from different sources. In this domain, data comes before types.
Right! I guess we can converge despite the difference in order: tests for the interface with the outside world (here the shell AST, or “what I want it to do”) and types for internal consistency.
And as you say, when you grow the program from the interface, rather than from the middle, dynamic types are more useful for the exploration required, like a slime mould discovering a maze and being able to prune the extraneous bits (the ill-typed code, I guess) once it’s complete.
› As a thought experiment: write down the ML-style types for the AST of a Unix shell right now. I claim that you can’t do it – you need to “measure the world” first, by writing tests.
You also can’t write tests that pass from the beginning, right? Then your comparison does not make sense.
I believe you are misunderstanding how type-driven development can work. Basically, it’s similar to what you are doing with tests. In the case where you are interacting with an already typed API, you can just use those types. But if that is not the case and you are interacting with an untyped or partially typed (or “stringly typed”) API, such as in your case, you can still use types.
The way you do this is by writing your assumptions as types and then parsing the APIs output into those types. Of course you have to run the code (just like you have to run your tests) to confirm if it works or not.
Concrete example: you expect some part of the AST of Unix shell to always return a string of a certain shape. So you encode that in your type, maybe simply with a String in the beginning. Basically, you expect a Result of String or Error (often modeled as Result[String, Exception] or whatever, depending on your language). Then you run your code and it will return an exception. Why? Because the output of the API was not a string – it was null, because you forgot to handle the case where there is no string returned at all (which maybe was unknown to you). So you refine your types to Result[String | null, Exception] and continue from there.
Rinse and repeat and you have a fully fledged type model of that API. It can still fail in some places, just as if you had forgotten to test some important part.
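In Haskell-ish terms, one such refinement step might look like this (RawValue, ApiError, and parseField are all made up for illustration):

    -- A stand-in for "untyped API output".
    data RawValue = RawNull | RawString String | RawNumber Double
      deriving Show

    newtype ApiError = UnexpectedShape String deriving Show

    -- First guess (doesn't survive contact with reality): the field
    -- is always a string.
    -- parseField :: RawValue -> Either ApiError String

    -- After running it and hitting a null, the type is refined to
    -- admit the missing case – Result[String | null, Exception] above:
    parseField :: RawValue -> Either ApiError (Maybe String)
    parseField RawNull       = Right Nothing
    parseField (RawString s) = Right (Just s)
    parseField other         = Left (UnexpectedShape (show other))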
There is not a lot of difference between those approaches, other than the classical differences between tests and types, e.g. that it is sometimes difficult or impossible to specify a constraint as a type, depending on your language.
What I’d say is that while Oils is now very strongly typed, in a fine-grained way, I still always write the tests BEFORE the types.
Historically, that may be because we have “oracles” in the form of other shells, i.e. a hard constraint from the outside world.
But even when there is no oracle, I still find it more useful to write the tests first, which is quite different from what the OP said:
› I get the most value from types at the point of designing a program, when I can find out if what I’m trying to make can even logically exist
“can’t” might be a little strong – you can certainly write down some types (and you have to if you are not writing in a dynamically typed language), but it’s a question of whether that helps or not.
› What I’d say is that while Oils is now very strongly typed, in a fine-grained way, I still always write the tests BEFORE the types.
I don’t want to argue that you can replace all tests (or even the majority of them) with a type-driven approach in any case. But you are not really making a point here, other than saying “I do it how I do it”.
› I still find it more useful to write the tests first
Why is that? And which way/style in which programming language are you comparing it against?
› “can’t” might be a little strong
No, actually not. OP is spot on here.
I’ll give you an example. Let’s say you use Idris (to pick an example of a very, very strict language with a very powerful static type system) and you define a function foo. As its input type, foo receives a list of strings. As its output type, foo returns one of the strings from its input (which might already come as a surprise, but that is a constraint that Idris’s type system can express).
You will be absolutely unable to write an implementation of that function that compiles. Impossible. Unless you use the dedicated escape hatch of unsafePerformIO, which basically means “screw you, type system, I explicitly choose to ignore you”.
The reason you cannot implement such a function is kind of obvious: the input list might be empty, and in that case there is no string to return.
So you are forced to refine the type signature of that function in some way.
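Haskell can’t express the full “returns one of its inputs” part, but the refinement the compiler forces for the empty case looks like this (a sketch, with a made-up foo):

    import Data.List.NonEmpty (NonEmpty (..))
    import qualified Data.List.NonEmpty as NE

    -- Not totally implementable: what would foo return for []?
    -- foo :: [String] -> String

    -- The forced refinement: rule out empty input at the type level,
    -- and the implementation becomes trivial.
    foo :: NonEmpty String -> String
    foo = NE.head

    -- e.g. foo ("a" :| ["b"]) == "a"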
Now, you might say “it’s obvious that such a function cannot work”. True, but now imagine a big piece of software with a lot of interconnected parts. In some part of your application you expect a value to be there, but in some other part you allow empty inputs. You might not realize that this cannot work until you connect those parts. The types will then lead the way. And they can capture much more nuanced cases than a simple “value might not be present”.
Whether such an approach is productive or not is another question. I personally find a mix to be best. But what I have also experienced is that most developers don’t know both approaches – they only know and have experience with the testing approach, which leads to them making suboptimal choices.
If you find it interesting, I recommend reading https://www.manning.com/books/type-driven-development-with-idris – it’s a mind-blowing book that will most likely change how you view programming in a statically typed language.
I have a side project of a DataFrame display widget, and it’s a very challenging area to work in. DataFrames in the wild can contain so many different types of columns, and designing a generic display tool bites you hard when you make assumptions. I’m starting to work with mypy to add typing in, but it’s only partially helpful, because you know so little about incoming data that a lot of types become Any.
What approaches have you seen to data science typing that work well? Any projects come to mind?
I created a package for statically typed dataframes in Julia called TypedTables.jl, mostly for speed in very particular edge cases we had (these days I like to think of it as a cross between a pandas dataframe and a Zig MultiArrayList).
Julia itself is in an odd place where it has dynamic semantics but strong typing; however, more recently you can use tools like JET to make sure everything is both statically inferred and well typed. That can include checking that the columns’ types implement the interfaces for displaying their values in your table widget.
In any case it worked well for the geospatial data I was working on at the time, and last I heard it is still being used. I’m not 100% convinced python’s nascent type system is powerful enough to do everything you’d want and need for “static dataframes” and for your widget, but I certainly am not an expert (Rust, Zig and even TypeScript would work fine).
I use and like MyPy for certain things, and many years before that I did data science work … but I actually did it in R. Even though Python is my favorite/main language, I think R has nicer APIs for data frames
So I’m not sure I have that much useful to add. I’m not sure static typing really goes with data frames. As others have said, it can be useful to think of typed Python and dynamic Python as different languages, with different idioms
But maybe there have been developments I don’t know about – MyPy is pretty flexible like TypeScript
Yeah… I like both types and tests and find they support each other. If I have a static type system I don’t need to write as many tests for mundane things, and the types help me alter code (including the tests) over time, and remember how things fit together when I come back to it months later. But I can’t enforce every property in the type system, and trying to be too fancy with types often leads to code that is brittle and difficult to change. And even if I was using formal verification I wouldn’t be confident without a decent number of integration tests checking my assumptions. TL;DR: there are no silver bullets.
Knuth vs Curry-Howard, “Beware of bugs in the above code; I have only proved it correct, not tried it.”
IMO, Benjamin Pierce’s “Types And Programming Languages” book said it best:
On one end of the spectrum [of formal methods] are powerful frameworks such as Hoare logic, algebraic specification languages, modal logics, and denotational semantics. These can be used to express very general correctness properties but are often cumbersome to use and demand a good deal of sophistication on the part of programmers. At the other end are techniques of much more modest power – modest enough that automatic checkers can be built into compilers, linkers, or program analyzers and thus be applied even by programmers unfamiliar with the underlying theories.
…
A type system can be regarded as calculating a kind of static approximation to the run-time behaviors of […] a program.
Exactly, I came here to say the same thing. Conventional type systems are lightweight formal methods. When building large pieces of software, types are very useful because they rule out a large class of errors.
For example, the main complaint about Clojure, one of my favorite languages, is that large systems require a lot of discipline. Components built by different people, and large data structures being passed back and forth, make it quite probable that type errors arise.
I’d argue @tentacloids upthread said it better :-)
Correctness checking just isn’t the most important aspect of types; they help in several other ways too.
Specifically, types help correctness for cases where the code isn’t run even once (but it is surprisingly hard to run code in some cases, like error handling or concurrency).
I think this talk has some good intuition around that (I may have posted it before)
Ron Pressler - Why Writing Correct Software Is Hard and Why Math (Alone) Won’t Help Us
https://pron.github.io/posts/correctness-and-complexity
https://youtu.be/dWdy_AngDp0?t=2367
Specifically, that part claims that you can take any Haskell program, remove all the function bodies, and replace them with finite state machines that are completely WRONG, yet the whole program type checks anyway.
i.e. there are an infinite number of wrong programs for any set of consistent function signatures (at least I think this is obvious, he didn’t say that)
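A trivial Haskell illustration of that point (mine, not from the talk): the signature promises a sort, both bodies are wrong for almost every input, and both type check fine.

    -- Well-typed and completely wrong:
    sort1 :: [Int] -> [Int]
    sort1 _ = []

    -- Equally well-typed, wrong in a different way:
    sort2 :: [Int] -> [Int]
    sort2 = reverse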
The whole talk is pretty interesting, although it goes fast, and I don’t have the background for all of it
A test asserts that for some mock input an invariant holds. A type asserts that for all inputs of that type an invariant holds.
Conversely, a failing test gives you a specific counterexample to your property, while a failing typecheck does not. Usually you can figure out a counterexample from the typechecking error but this becomes more of a problem with refinement and dependent types (where a failed typecheck may not indicate an actual problem).
I think there’s an interesting pattern in how broad forms of verification gives you more information than narrow forms when they both succeed, while narrow forms give you more information when they both fail.
› Conversely, a failing test gives you a specific counterexample to your property, while a failing typecheck does not.
It doesn’t have to be this way, as https://arxiv.org/pdf/1606.07557 shows, even though it’s the most common situation as of now.
› Conversely, a failing test gives you a specific counterexample to your property, while a failing typecheck does not. Usually you can figure out a counterexample from the typechecking error but this becomes more of a problem with refinement and dependent types (where a failed typecheck *may not indicate an actual problem*).
(emphasis my own)
See this recent paper for counterexamples from failed refinement type checking. Additionally, while a failed typecheck may indeed (and dare I say often does) not indicate an issue with the code-as-executed, I would say an improperly specified type is still a problem – at least insofar as it prevents the type checker from verifying the code.
Types can prove that their asserted invariants hold for all inputs, modulo weaknesses of the type system (e.g. infinite loops, null, unsafe coercions, etc.).
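In Haskell, for instance, the infinite-loop weakness means any type can be “proved” by a diverging term:

    -- Type checks at every type, yet delivers no value at all:
    bottom :: a
    bottom = bottom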
Tests can also assert universally-quantified invariants; but they can only prove such assertions for a handful of inputs (either given explicitly, or generated randomly, etc.). I find such tests work better as documentation than if the input data were written inline, e.g. result.shouldContain(givenUser.id) tells me more than result.shouldContain(1).
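For example, a QuickCheck-style property in Haskell asserts a universally-quantified invariant (“for all lists xs”), but a test run only samples it, typically on ~100 random inputs:

    import Test.QuickCheck

    -- Quantified over all lists in spirit, checked on a random sample
    -- in practice.
    prop_reverseInvolutive :: [Int] -> Bool
    prop_reverseInvolutive xs = reverse (reverse xs) == xs

    main :: IO ()
    main = quickCheck prop_reverseInvolutive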
Is there already a Venn diagram sorting the different and the similar use cases for types and tests?