NULL != ‘NULL’
How do devs make this mistake?
It’s baffling to me. Maybe I’m just used to using “modern” frameworks, but the only way this could be an issue is if you literally check if the string value equals “null” and then replace it with a null value.
lastName = lastName.ToUpper() == "NULL" ? null : lastName;
Either that or the database has some bug where it’s converting a string value of “null” into a null.
How do devs make this mistake
it can happen many different ways if you’re not explicitly watching out for these types of things
example let’s say you have a csv file with a bunch of names
id,last_name
1,schaffer
2,thornton
3,NULL
4,smith
5,"NULL"
if you use the following to import into postgres
COPY user_data (id, last_name) FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);
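as written, that command won’t actually turn number 3 into a NULL: in postgres’s csv format only an unquoted empty field is read as NULL by default, so 3 and 5 both come in as the string “NULL”. the trouble starts when somebody adds NULL 'NULL' to the options (or uses a loader that treats the bare word as null). then number 5 is still imported as the string “NULL” but number 3 is imported as a NULL value. of course, this is why you sanitize the data (GIGO) but I can imagine this happening countless times at companies all over the country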
there are easy fixes if you’re paying attention
COPY user_data (id, last_name) FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true, NULL '');
treats only an unquoted empty field as a NULL value (this is also the csv default), so a literal NULL in the file stays a string.
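to make the quoted vs unquoted rule concrete, here’s a rough js sketch of what the NULL option does on import: only an unquoted field that matches the marker becomes null, a quoted one is always a string. parseCsvLine and toValue are names made up for this example (not from any library), and the parser is deliberately tiny (no embedded newlines), so treat it as an illustration rather than a real csv reader.

```javascript
// Hypothetical helper: split one CSV line into fields and remember whether
// each field was quoted. Handles "" escapes inside quotes, nothing fancier.
function parseCsvLine(line) {
  const fields = [];
  let value = "";
  let quoted = false;   // did the current field start with a quote?
  let inQuotes = false; // are we currently inside a quoted section?
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        value += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        value += ch;
      }
    } else if (ch === '"' && value === "") {
      inQuotes = true;
      quoted = true;
    } else if (ch === ",") {
      fields.push({ value, quoted });
      value = "";
      quoted = false;
    } else {
      value += ch;
    }
  }
  fields.push({ value, quoted });
  return fields;
}

// Hypothetical helper: apply a NULL marker the way the text above describes.
// Only an unquoted field equal to the marker turns into null.
function toValue(field, nullMarker) {
  return !field.quoted && field.value === nullMarker ? null : field.value;
}

const sampleRows = ['3,NULL', '5,"NULL"', '6,'];
for (const marker of ["NULL", ""]) {
  for (const row of sampleRows) {
    const [id, lastName] = parseCsvLine(row);
    console.log("marker:", JSON.stringify(marker), "id:", id.value, "last_name:", toValue(lastName, marker));
  }
}
```

with the marker set to NULL, row 3 comes out as null and row 5 stays a string. with the empty marker, both stay strings and only the blank field in row 6 becomes null.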
example with js
fetch('/api/user/1')
  .then(response => response.json())
  .then(data => {
    if (data.lastName == "null") {
      console.log("No last name found");
    } else {
      console.log("Last name is:", data.lastName);
    }
  });
if data is
data = { id: 5, lastName: "null" };
then the if statement will trigger, as if there was no last name. that’s why you gotta know the language you’re using and the potential pitfalls
now you may ask – why not just do
if (data.lastName === null)
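instead? But what if something upstream has already turned the null into text, say String(null), a template literal, or a form field or query string that only carries strings? JSON.parse(data) itself keeps a JSON null as null and doesn’t convert anything to a string, but by then the value you get really is "null", and it’s a very natural move to check for the string "null"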
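a quick sketch of where the string "null" can actually come from in js. JSON.parse keeps a JSON null as a real null; it’s explicit stringification (String(), template literals) that produces "null". hasLastName is a hypothetical helper for this example, not something from the code above.

```javascript
// JSON.parse preserves the difference between null and the string "null".
const fromNull = JSON.parse('{"id": 3, "lastName": null}');
const fromString = JSON.parse('{"id": 5, "lastName": "null"}');

// The confusion usually starts when a real null gets stringified somewhere.
const stringified = String(null);         // the four-character string "null"
const templated = `${fromNull.lastName}`; // same thing via a template literal

// Hypothetical helper: only a real null/undefined counts as "missing",
// so a person whose last name is literally "null" is left alone.
function hasLastName(user) {
  return user.lastName !== null && user.lastName !== undefined;
}

console.log(fromNull.lastName === null);   // real null survives JSON.parse
console.log(fromString.lastName === null); // the string "null" is not null
console.log(stringified === templated);    // both are the string "null"
console.log(hasLastName(fromNull), hasLastName(fromString));
```

the point of the helper is that it never compares against the text "null", so it can’t mistake a name for a missing value.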
obviously if you’re paying attention and understand the pitfalls of certain languages (like javascript’s type coercion and the particularities of JSON.parse()) it becomes easy but it’s something that is honestly very easy to overlook
Like you said, GIGO, but I can’t say I’m familiar with any csv looking like that. Maybe I’m living a lucky life, but true null would generally be an empty string, which of course would still be less than ideal. From a general csv perspective, NULL without quotes is still a string.
If it’s the string “NULL”, then lord help us, but I would be inclined to handle it as defined unless instructed otherwise. I guess it’s up to the dev to point it out and not everyone cares enough to do so. My point is these things should be caught early.
I’ll admit I’m much more versed in mysql than postgres.
really it’s a cautionary tale about the intersections of different technologies. for example, csv going into a sql database and then querying that database from another language (whether it’s JS or C# or whatever)
when i was 16 and in driver’s ed, I remember the day where the instructor told us that we were going to go drive on the highway. I told him I was worried because the highway sounds scary- everybody is going so fast. he told me something that for some weird reason stuck with me: the highway is one of the safest places to be because everybody is going straight in the same direction.
the most dangerous places to be, and the data backs this up, are actually intersections. the points where different roads converge. why? well, it’s pretty intuitive. it’s where you have a lot of cars in close proximity. the more cars in a specific square footage the higher probability of a car hitting another car.
that logic follows with software too. in a lot of ways devs are traffic engineers controlling the flow of data. that’s why, like you said, it’s up to the devs to catch these things early. intersections are the points where different technologies meet and all data flows through these technologies. it’s important to be extra careful at these points. like in the example i gave above…
the difference between
WITH (FORMAT csv, HEADER true, NULL 'NULL');
and
WITH (FORMAT csv, HEADER true, NULL '');
could be the difference between one guy living a normal life and another guy receiving thousands of speeding tickets: https://www.wired.com/story/null-license-plate-landed-one-hacker-ticket-hell/
Code is easy in a vacuum. 50 moving parts all with their own quirks and insufficient testing is how you get stuff like this to happen.
“True”
How do devs make off-by-one mistakes?
The most common source of security vulnerabilities is memory corruption and off by one errors.
(to make the joke more obvious)
The two most common sources of security vulnerabilities are buffer overflows, use-after-free, and off-by-one errors.
I can’t even think of a language that does that. I don’t think even JS does it, and if anything was going to it’s fucking that.
I was NaN years old when I learned this.
/me changes name to '); DROP TABLE STUDENTS; --
Oh. Yes. Little Bobby Tables, we call him.
Are there character escapes for SQL, to protect against stuff like that?
Yes, but it’s a dangerous process. You should use parameterized queries instead.
Yup, then it becomes a front-end problem to deal with wonky input. As a backend dev, this is ideal, just give me data and I’ll store it for ya.
Use parameters, that way data and queries are separate.
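For anyone wondering what “data and queries are separate” looks like, here is a minimal sketch in JS. It doesn’t talk to a database; it only builds what would be sent. The function names are made up for this example, and the { text, values } shape and the $1 placeholder are assumptions modeled on Postgres-style clients (other drivers use ? instead), so check your own driver’s docs.

```javascript
// Anti-pattern (hypothetical helper): the input is spliced into the SQL text,
// so a crafted name can change what the statement does.
function unsafeInsert(name) {
  return "INSERT INTO students (name) VALUES ('" + name + "');";
}

// Parameterized shape (hypothetical helper): the SQL text is a constant and
// the value travels separately, so the database only ever treats it as data.
function parameterizedInsert(name) {
  return {
    text: "INSERT INTO students (name) VALUES ($1)",
    values: [name],
  };
}

const bobby = "Robert'); DROP TABLE STUDENTS; --";

// The concatenated statement now contains a second command.
console.log(unsafeInsert(bobby));

// The parameterized text is identical for every input, Bobby included.
console.log(parameterizedInsert(bobby));
console.log(parameterizedInsert("NULL"));
```

Same idea covers the NULL case: a last name of "NULL" passed as a parameter is just a four-letter string, never a keyword.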
Input sanitization typically handles this as a string that only allows characters supported by the data type specified by the table field in question. A permissive strategy might scrub the string of unexpected characters. A strict one might throw an error. The point, however, is to prevent the evaluation of inputs as anything other than their intended type, whether or not reserved characters are present.
Only noobs get hit by this (called SQL injection). That’s why we have leads review code…
Knew a guy who had the license plate ‘NULL’ and he was telling me how he never got a toll bill or red light ticket.
How about XÆa-12? Asking for a friend.
Ah yes, little Nell=%00\u0000’\0’“”‘0’0x000x30’';
Nellie Null we call her.
She and her cousin Bobby Tables love to scamper around, but they are good kids. They would never break anything intentionally
Mandatory xkcd:
There is an infosec guy in California who had NULL as his car license plate. If a license-plate reader detects a ticketable event but the license plate is unreadable, guess how the system handles those events?
Infosec guy was not a happy bunny.