Sick kid, broken promises and bad design decisions.
So, I’m on sick leave caring for my daughter. But she’s sleeping, so I have some time to play with my very small service: one query to mysql database, format results to CSV, that’s it. And it works on my machine. But it fails in cloud, we’re getting 500 on endpoint with CSV and, most interesting, we don’t have logs at all.
That service is a test bed to some middlewares and formatters that I’m going to promote for use across the company. So, now I’m on a journey, on an adventure: how to find out what’s going on in Fargate container, without any logs and abilities to attach my remote debugger. To make things more interesting, I built that application as self-contained, ready-to-run .net core 3.1 application.
Fortunately, problem was easy to replicate in local docker container. No logs, 500, same as in cloud. We’re using rider to develop .net applications. Unfortunately, Rider (at the moment of writing) can’t debug ready-to-run applications. It can debug apps in docker, though. Ok, let’s fallback to VS2019: fortunately it can debug ready-to-run application. One of devs from Jetbrains team said that they will try to ship debug of ready-to-run applications in release after 2020.1. But no promises were made and that’s understandable. Still, we’re waiting for this feature.
VS2019 has an ability to attach debugger to a process inside of container. Probably, even remote ones, I’ve never tested it yet. I’m using FROM mcr.microsoft.com/dotnet/core/runtime-deps:3.1.3-buster-slim as base image. It doesn’t have curl or wget in there, so you need to install one of those by yourself to make that remote debugging work. And yes, VS2019 can debug ready-to-run applications. And we’re getting this exception in VS2019:
So, problem, obviously lies somewhere in connection to mysql. But why doesn’t exception appear in logs? Why? This is actually important, because we could miss logs from production and I hate that. At this time I’ve already spent 4 hours dissecting my 200 lines service, trying to understand - who mutes my exception, who deserves most painful execution in history of mankind?
And finally, I have removed JsonFormatter (we have custom json formatter for Serilog, tailored to our company needs) from logger configuration and, voila, we see that beatiful exception as above. “Hmmmm”, I thought, - “What happened? JsonFormatter is dumb and simple”. So I injected the test code into controller and I’ve got another exception:
Gotcha! Line 56 is where we’re passing stacktrace to serilog method that should format a string. That method assumes that passed string isn’t null. And this is public API. So, let’s talk about broken promises. We’re getting to broken promises part of the
story.
In my project file I have this code:
As you can see, nullable feature of C#8 is enabled. All warnings are propagated to errors. Nuget package with logger was published, that means that compilation was successful, that means that we do not pass anywhere something that is nullable to non-nullable APIs, right? Let’s look to signature of the method: public static void WriteQuotedJsonString(string str, TextWriter output)
str isn’t nullable! There is no question mark there! So, if stacktrace property is nullable, compiler should emit error in our case. But it doesn’t. Promise about “null safety” is broken. But why? Answer to that is third part of my story. It’s about design decisions. You see, C# is a language, one of many in .net world. And feature of nullable reference types was introduced on language level, not on runtime level. That means that nullable string and non-nullable string to runtime are the same types. There is no real difference. And compiler emits some attributes on types to mark them either nullable or not. So, Serilog package was compiled either before that feature was introduced to C#, or with disabled feature, so no attributes on types.
And that was a story about bad design decisions. From my point of view, being able to check if type is nullable or not on runtime level is very important and worth of breaking backward compatibility. Current implementation, where string and string? are essentially same types, with no real ability of differintiate between them - is just bad decision, which led me, and will lead
other people, into false sense of safety: “see, zero errors with nullable enabled! We can’t get NRE!” “Yes, sweet summer child, you can”.