Parse, don’t validate

ephemera@lemmy.blahaj.zone · 8 months ago

livingcoder@programming.dev · 8 months ago

This was a good blog post. I particularly appreciated the statement about the validate and parse function comparison: “Both of these functions check the same thing, but parseNonEmpty gives the caller access to the information it learned, while validateNonEmpty just throws it away.”
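For anyone who wants to see that contrast side by side, here is a minimal sketch. The function names come from the quote; using `Either String` as the error channel is my own assumption for keeping the example pure, and is not necessarily how the post writes them.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Checks the list, but the caller gets back only (), so the fact
-- "this list is non-empty" is lost as soon as the call returns.
validateNonEmpty :: [a] -> Either String ()
validateNonEmpty (_ : _) = Right ()
validateNonEmpty []      = Left "list cannot be empty"

-- Performs the same check, but hands the result back as a NonEmpty,
-- so the type carries what was learned.
parseNonEmpty :: [a] -> Either String (NonEmpty a)
parseNonEmpty (x : xs) = Right (x :| xs)
parseNonEmpty []       = Left "list cannot be empty"
```

Both functions accept and reject exactly the same inputs; only the return type differs.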

QT1@lemm.ee · 8 months ago

I first read this post back in 2019 when it was released, and I have to say it has really left an impact on the way I write programs these days. The “make illegal states unrepresentable” and “push proofs up” guidelines are so simple yet so effective. Sure, there is some initial cost to creating new datatypes, but it really pays off in the long run. Not having to worry about null or wrongly shaped data structures down the line is really nice, especially if you’re working on older code or developing in a team. Even though the post uses Haskell to explain the concepts, I found that they also work well in other languages, even Java or Python.

Deckweiss@lemmy.world · 8 months ago

confused java dev: what do you mean a function can’t return void???

QT1@lemm.ee · 8 months ago

void in Java and Void in Haskell are quite different. As the post explains, in Haskell it’s a type with no possible values. In Java, the equivalent would be a class without a constructor (not sure if that’s even possible). It defines a type, but you cannot construct a value or object with that type. The equivalent of Java’s void in Haskell is the unit type () which has exactly one possible value, also called (). It can be returned by a function, but it does not give you any information, just like void. By the way, Rust also uses the unit type instead of void.
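A small sketch of the difference, using `Void` and `absurd` from `Data.Void` in base. The function names `logAndForget`, `fromVoid`, and `onlyRight` are made up for illustration.

```haskell
import Data.Void (Void, absurd)

-- Unit: exactly one value, written (). Returning it tells the caller
-- nothing, which is roughly what Java's void conveys.
logAndForget :: String -> ()
logAndForget _ = ()

-- Void: no values at all, so nobody can ever supply a real argument.
-- `absurd` converts that impossibility into any result type.
fromVoid :: Void -> Int
fromVoid = absurd

-- Because the Left side cannot be constructed, the Right side is
-- the only case that can occur at runtime.
onlyRight :: Either Void a -> a
onlyRight (Left v)  = absurd v
onlyRight (Right x) = x
```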

Deckweiss@lemmy.world · 8 months ago

yeah that was the joke, thanks for explaining it

onlinepersona@programming.dev · 8 months ago

data NonEmpty a = a :| [a]

Note that NonEmpty a is really just a tuple of an a and an ordinary, possibly-empty [a]. This conveniently models a non-empty list by storing the first element of the list separately from the list’s tail: even if the [a] component is [], the a component must always be present.

Wat? How can I “store the first element of the list separately from the list’s tail” when the list is empty? Whether a list is empty or not is a runtime possibility, not a compile-time possibility.

Someone care to explain this part? It does not compute at all for me.

Anti Commercial-AI license

Corbin@programming.dev · 8 months ago

A list can store zero or more elements. A NonEmpty can store one or more elements. That’s all.

This overall strategy – representing the top of a list as a dedicated value – shows up elsewhere, notably in Forths, where it is called “top of stack” and often stored in a dedicated CPU register.

Kache@lemm.ee · 8 months ago

You cannot, and that’s why that type declaration models a NonEmpty that a type checker can enforce.
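To make that concrete, here is a sketch that defines the type locally instead of importing it from `Data.List.NonEmpty`. The helper names `neHead` and `neToList` are made up for this example.

```haskell
infixr 5 :|

-- The only constructor demands an `a` up front, so no value of this
-- type can exist without at least one element.
data NonEmpty a = a :| [a]
  deriving (Show, Eq)

-- Total: there is no empty case to handle, and the compiler knows it.
neHead :: NonEmpty a -> a
neHead (x :| _) = x

-- Going back to an ordinary list is always possible.
neToList :: NonEmpty a -> [a]
neToList (x :| xs) = x : xs
```

There is simply no expression that builds an "empty" `NonEmpty`: the smallest one you can write is `x :| []`, which still holds `x`.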

onlinepersona@programming.dev · 8 months ago

So it’s the implementation that has to ensure a NonEmpty is returned, but that’s up to the developer, correct? The developer still holds the gun to shoot themselves in the foot by returning an empty list, IINM.

Ephera@lemmy.ml · 8 months ago

During the parsing step, you check that the list has at least one element. If it does not, you report an error to the user and exit. If it does, you take the first element in the list and store it in the left side of your tuple, and then the remaining elements of the input list go into the right side of your tuple.

So, for example: [1, 2, 3] → (1, [2, 3])
Or also: [1] → (1, [])
If the user gives you [], then you cannot represent that with your tuple, you necessarily have to error.
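In Haskell, that parsing step could look roughly like this. `nonEmpty` is the real function from `Data.List.NonEmpty`; `parseInput`, `firstElement`, and the error message are names I made up for the sketch.

```haskell
import Data.List.NonEmpty (NonEmpty (..), nonEmpty)

-- Runs once, at the boundary where user input enters the program.
-- `nonEmpty` has type [a] -> Maybe (NonEmpty a).
parseInput :: [Int] -> Either String (NonEmpty Int)
parseInput xs = case nonEmpty xs of
  Just ne -> Right ne
  Nothing -> Left "expected at least one element"

-- Downstream code asks for the parsed type, so it needs no
-- emptiness check and cannot be handed an empty list.
firstElement :: NonEmpty Int -> Int
firstElement (x :| _) = x
```

Here `parseInput [1, 2, 3]` gives `Right (1 :| [2, 3])`, `parseInput [1]` gives `Right (1 :| [])`, and `parseInput []` gives the `Left` error, since there is no `NonEmpty` value it could return instead.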