Jim's Depository

this code is not yet written

While testing SMTP message reception and DKIM validation I ran into an evil action from Postfix.

In my basically default Debian Postfix/Dovecot system, if you send a message to Postfix, it DKIM signs it, then sends it off to the destination. But if your message needed 8BITMIME and your destination doesn't support that, then Postfix quietly re-encodes your message body as quoted-printable.

Now your DKIM signature is invalid!
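
Here is a minimal sketch of why that breaks things. The DKIM body hash is computed over the body bytes, so any re-encoding after signing changes the hash. The `quotedPrintable` function below is a hypothetical, simplified encoder written for illustration (it is not Postfix's code and skips the line-length and trailing-whitespace rules of RFC 2045), and the sample body is made up.

```swift
// Simplified quoted-printable encoder, for illustration only.
// It escapes '=' and every byte outside printable ASCII, and passes CR/LF through.
func quotedPrintable(_ body: [UInt8]) -> [UInt8] {
    let hex = Array("0123456789ABCDEF".utf8)
    var out: [UInt8] = []
    for byte in body {
        let isLineBreak = byte == 0x0D || byte == 0x0A
        let isPlain = (0x20...0x7E).contains(byte) && byte != UInt8(ascii: "=")
        if isLineBreak || isPlain {
            out.append(byte)
        } else {
            out.append(UInt8(ascii: "="))
            out.append(hex[Int(byte >> 4)])
            out.append(hex[Int(byte & 0x0F)])
        }
    }
    return out
}

// The body as it was signed: 8-bit UTF-8 ("\u{E9}" is e-acute, bytes C3 A9).
let signedBody = Array("Caf\u{E9} menu attached.\r\n".utf8)

// The body as a relay without 8BITMIME support might forward it.
let relayedBody = quotedPrintable(signedBody)

print(String(decoding: relayedBody, as: UTF8.self))
print("bytes identical:", signedBody == relayedBody)
```

The verifier hashes `relayedBody`, the signer hashed `signedBody`, and the two byte sequences differ, so the body hash in the DKIM header can no longer match.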

It sounds like you can configure Postfix to bounce that instead, but that's not what I got for a simple installation.

That was about a day and a half of me debugging my DKIM verification code because my body hash calculation kept not matching the one in the DKIM header.

Morals:

  • Just go ahead and support 8BITMIME and SMTPUTF8 if you can. No need to poke the bear.
  • If your body hash matches for some messages, but not others, it's probably something upstream altering the message bodies after they were signed.

Lost half a day today to a strange Cloudflare DNS resolver anomaly.

When you make a UDP request, it may answer correctly, or it may indicate truncation by setting the TC (TrunCation) bit and not give you an answer. The answer easily fits, but it just decides, "nope, not this time". If you reissue the query over TCP you will get the result every time (in my testing). But UDP gives an erroneous TC non-result about 25% of the time on the name I was testing.

So, I guess be alert for that. If your DNS code can flip over to TCP you may never realize this is happening, other than noticing that some of your queries are oddly delayed compared to the others.
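
A minimal sketch of the check involved, assuming you already have the raw UDP response bytes in hand. In the 12-byte DNS header the TC flag is bit 0x02 of the third byte. The names `isTruncated`, `NextStep`, and `nextStep` are hypothetical, and the sample headers are hand-built rather than taken from my capture.

```swift
/// True when a raw DNS message has the TC (truncation) flag set.
/// Header layout: bytes 0-1 are the ID, byte 2 holds QR, Opcode, AA, TC, RD.
func isTruncated(_ message: [UInt8]) -> Bool {
    guard message.count >= 12 else { return false }
    return (message[2] & 0x02) != 0
}

enum NextStep {
    case useAnswer      // the UDP response is complete
    case retryOverTCP   // TC was set: ask again over TCP
    case discard        // too short to be a DNS message
}

/// Decide what to do with a UDP response.
func nextStep(afterUDPResponse message: [UInt8]) -> NextStep {
    guard message.count >= 12 else { return .discard }
    return isTruncated(message) ? .retryOverTCP : .useAnswer
}

// Hand-built headers: ID 0x1234, one question.
let truncated: [UInt8] = [0x12, 0x34, 0x83, 0x80, 0, 1, 0, 0, 0, 0, 0, 0] // QR, TC, RD set
let complete: [UInt8]  = [0x12, 0x34, 0x81, 0x80, 0, 1, 0, 1, 0, 0, 0, 0] // QR, RD set, one answer

print(nextStep(afterUDPResponse: truncated))
print(nextStep(afterUDPResponse: complete))
```

Counting how often `retryOverTCP` comes back for the same name is a cheap way to notice the behavior without watching packets.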

Just the simple host command can show the behavior, though you won't realize it happened because it falls back to TCP. If you are watching packets with tcpdump you will see it.

I don't see the behavior using Google's 8.8.8.8.

Anyway, here's a little tcpdump capture if you want to look at it. That's a bunch of requests for the same TXT record (the SPF for lunarware.com, which is ultimately DNS hosted at Cloudflare.)

I don't have a resolution, unfortunately. I just banned 1.1.1.1 until I add TCP lookup to the affected code.


I needed to annotate regions of a bunch of images to let Create ML train up an object recognition model. I found a program for that, but it decided to nag me every few images and I didn't like the UI much.

So… I decided to see how writing a new app by just exhorting Codex to apply itself works. The answer is: beautifully! I spent about a day feeling out how I want the app to work, with a couple of false starts from trying to make it too simple.

But now the world can enjoy Image ML Annotator for all your image labeling needs, as long as you need exactly what I needed.

screenshot

I built it to train a sudoku puzzle locator for images, but the example data for playing with the app is a bunch of motivational pictures of puppies, kittens, and ducklings that I had an LLM and an image generator cough out. So, lots of AI in this project. The web site was AI generated, the help is AI generated (and is surprisingly good), all of the code is AI generated. I may have edited a small line here and there, but it is essentially 100% written by Codex with me standing over its shoulder and annoying it constantly.

macos store download

Attachments

screenshot1.png 1260379 bytes

Femtoblogger gained support for Passkeys and Atom.

You can create an account with passkey authentication. If you have an existing account, you can add another passkey. The username/password authentication is still supported, but not encouraged. Notably missing is the ability to set or change a password once you have an account; I'll get to that.

Atom support is back for all you feed readers. Another Codex win. I asked it to put Atom support back and told it which endpoint I wanted to use. It looked around at my models and how I do things and wrote it all in one whack. It really is a bit like wishing features into existence.

Less visible, some sorts of pages are now cached internally so they don't have to be regenerated all the time. For instance, the Atom feed page caches so if your reader uses If-None-Match or If-Modified-Since you might get a 304 Not Modified and save a download. But even if you don't use those headers you will probably get a pregenerated cached copy, so that saves time.
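
The If-None-Match half of that can be sketched as a small pure function. This is not femtoblogger's actual code: `CachedFeed`, `FeedResponse`, and `respond` are hypothetical names, and splitting the header on commas is a simplification that assumes entity tags do not themselves contain commas.

```swift
import Foundation

/// A pregenerated page plus the entity tag it was generated with.
struct CachedFeed {
    let body: String
    let etag: String   // quotes included, as sent on the wire
}

enum FeedResponse: Equatable {
    case notModified                      // 304, no body sent
    case ok(body: String, etag: String)   // 200 with the cached copy
}

/// Decide between 304 and 200 for a request's If-None-Match header.
func respond(ifNoneMatch header: String?, cached: CachedFeed) -> FeedResponse {
    // If-None-Match uses weak comparison, so ignore a leading "W/".
    func opaque(_ tag: String) -> String {
        tag.hasPrefix("W/") ? String(tag.dropFirst(2)) : tag
    }
    if let header = header {
        let tags = header.split(separator: ",")
            .map { $0.trimmingCharacters(in: .whitespaces) }
        if tags.contains("*") || tags.contains(where: { opaque($0) == opaque(cached.etag) }) {
            return .notModified
        }
    }
    // No header, or no matching tag: serve the cached copy anyway.
    return .ok(body: cached.body, etag: cached.etag)
}

let feed = CachedFeed(body: "<feed>...</feed>", etag: "\"v42\"")
print(respond(ifNoneMatch: "\"v42\"", cached: feed))
print(respond(ifNoneMatch: nil, cached: feed))
```

Either way the page is not regenerated; the conditional header only decides whether the body goes over the wire.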

I had Codex help me with the parts of femtoblogger which annoyed me.

As I said earlier, I was ambivalent about Lighter. I decided to ditch Lighter and just go straight to libsqlite3. Codex and I stomped through that table by table in my model layer, with me asking it to convert a table and it doing all the work. 100% success and much faster than I would have done it by hand.

Then I decided to address the HTML generation. I'd been using Stencil and template files. I asked Codex to make me an HTML DSL so I could write my HTML in Swift and use Swift for the flow of control and computed values instead of trying to map that onto the Stencil capabilities. 100% success and much faster than I would have done by hand. Codex was reading the Stencil templates and generating the HTML DSL with interspersed Swift control flow and values, while also extending the DSL every time it found an unimplemented HTML element. I had to do a little steering on the shape of the DSL API to make it come out the way I wanted, so not an unattended change, but easily done 10 times faster than by hand and without all the tiny errors I would have introduced.
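
To give a flavor of the idea, here is a tiny result-builder sketch. It is not the DSL Codex wrote for femtoblogger; `Node`, `HTMLBuilder`, `escape`, `ul`, and `li` are hypothetical names and only two elements exist. The point is that `for` and `if` are plain Swift rather than template syntax.

```swift
/// Escape the characters that matter inside element text.
func escape(_ text: String) -> String {
    var out = ""
    for ch in text {
        switch ch {
        case "&": out += "&amp;"
        case "<": out += "&lt;"
        case ">": out += "&gt;"
        default: out.append(ch)
        }
    }
    return out
}

enum Node {
    case element(tag: String, children: [Node])
    case text(String)

    var html: String {
        switch self {
        case .text(let text):
            return escape(text)
        case .element(let tag, let children):
            return "<\(tag)>" + children.map(\.html).joined() + "</\(tag)>"
        }
    }
}

@resultBuilder
enum HTMLBuilder {
    static func buildExpression(_ node: Node) -> [Node] { [node] }
    static func buildExpression(_ text: String) -> [Node] { [.text(text)] }
    static func buildBlock(_ parts: [Node]...) -> [Node] { parts.flatMap { $0 } }
    static func buildOptional(_ part: [Node]?) -> [Node] { part ?? [] }
    static func buildEither(first: [Node]) -> [Node] { first }
    static func buildEither(second: [Node]) -> [Node] { second }
    static func buildArray(_ parts: [[Node]]) -> [Node] { parts.flatMap { $0 } }
}

// One function per HTML element; a fuller DSL would grow this list as needed.
func ul(@HTMLBuilder _ content: () -> [Node]) -> Node { .element(tag: "ul", children: content()) }
func li(@HTMLBuilder _ content: () -> [Node]) -> Node { .element(tag: "li", children: content()) }

let titles = ["Passkeys & Atom", "MMDB"]
let page = ul {
    for title in titles {
        li { title }
    }
    if titles.isEmpty {
        li { "No posts yet" }
    }
}
print(page.html)
```

Because every element is an ordinary function, a missing one is a compile error rather than a silently wrong template.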

I brought up email on a domain which hasn't had it for 15 years or so. I am instantly being barraged with spammers who actually remember the names of the people on that domain!

But I have to wonder, are these real messages and just a very persistent delivery agent? 15 years of trying to deliver the message? I'm tempted to add the users just to see.

hello, im trying to find drivers for my lacie ethernet disk mini and found you hacked it. do you have the drivers some where? thanks

If you want to look up countries for an IP address, because say all your comment spam in your blog comes from the same country, then the nice folks at MaxMind will hook you up with their GeoLite2 databases for no cost, though there is a license to pay attention to.

These files are in MMDB format which excels at encoding partitions of natural numbers. The format is documented and there are a number of libraries to access the files, but I wanted native Swift to keep my cross platform building nightmares in check.

You can check out swift-mmdb which I keep on GitHub. I haven't made a change in 4 years, but that's just because it keeps working.

The performance cost is about 9 MB for the data and 150 µs per lookup.
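
The "partition" part is a binary search tree walked one address bit at a time. Here is a minimal in-memory sketch of that walk; it is not swift-mmdb's API. `SearchTree`, `lookup`, and `ipv4` are hypothetical names, the three-node tree is invented, and the real format packs the records into bytes and resolves leaves into a data section, which is skipped here.

```swift
struct SearchTree {
    /// Each node holds two records: where to go on a 0 bit and on a 1 bit.
    /// A record less than nodes.count is another node, equal means "no data",
    /// and greater means a leaf (returned here as a simplified data id).
    let nodes: [(zero: Int, one: Int)]

    func lookup(_ address: UInt32) -> Int? {
        let nodeCount = nodes.count
        var record = 0                                   // start at the root node
        for bit in stride(from: 31, through: 0, by: -1) {
            let node = nodes[record]
            let isZero = ((address >> UInt32(bit)) & 1) == 0
            record = isZero ? node.zero : node.one
            if record == nodeCount { return nil }                 // nothing recorded here
            if record > nodeCount { return record - nodeCount }   // reached a leaf
        }
        return nil
    }
}

/// Pack four octets into one 32-bit address.
func ipv4(_ a: UInt32, _ b: UInt32, _ c: UInt32, _ d: UInt32) -> UInt32 {
    (a << 24) | (b << 16) | (c << 8) | d
}

// Invented tree: 0.0.0.0/2 maps to data id 1, 64.0.0.0/3 to data id 2, the rest to nothing.
let tree = SearchTree(nodes: [
    (zero: 1, one: 3),
    (zero: 4, one: 2),
    (zero: 5, one: 3),
])

print(tree.lookup(ipv4(10, 0, 0, 1)) as Any)
print(tree.lookup(ipv4(64, 1, 2, 3)) as Any)
print(tree.lookup(ipv4(192, 168, 0, 1)) as Any)
```

Every address lands on exactly one leaf or on "no data", and a lookup touches at most one node per address bit, which is why lookups stay cheap.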

I've finished rewriting femtoblogger again. Now it is in Swift 6 concurrency using the Hummingbird 2 web framework.

Thoughts:

  • Swift 6 is surprisingly good at identifying concurrency foot guns. Sometimes it is obtuse about corrective action, but it caught a number of cases where I would have risked uncontrolled concurrent access.
  • Hummingbird is ok, but I find it missing some features which should really be included. There are places to hook them in, and they generally write what you need in the documentation, but it would be nicer if they were included (e.g. access to the remote IP in the request context, or parsing multipart/form-data).
  • For building on Debian, the binaries are tied to the Swift Stdlib which might not be synced on your deploy machine, so I've resorted to swift build --static-swift-stdlib in my Makefile and keeping the Swift Stdlib embedded.
  • For simple packages, you can organize all the parts in a directory at build time and use dpkg-deb --build --root-owner-group to make a Debian package with minimal fuss. If you need to screw with file owners other than root then you'll have to get into a fakeroot or something. The dpkg-deb man page alludes to them adding an owner/mode manifest, so I'd hold out for that.
  • brew can install dpkg on your Mac so you can test building the package there. You'll get your macOS binary, so it isn't usable. If you used the Swift Static SDK you could build your Debian packages on the Mac, but I didn't want to mess with that.
  • I used Lighter to access the database, but I'm not sure I like it. The API has like three ways of doing everything and doesn't really talk about them, so it's kind of random exploration to get something to work. It makes a set of structs to map to your SQLite tables, which is kind of its thing, but I kept finding myself working with partial data and so bypassing that layer on insert, update, or delete. For reading I ended up making my own structs anyway, for various reasons, on about half the tables. Still, I like the strongly typed access.

So, maybe I'll write more here now that the software works again.

If you have had to do something heinous in your code to avoid triggering a Swift compiler bug, you might consider…

#if swift(>=7)
#warning("Check if Swift bug is fixed and we can remove the '@unchecked Sendable'.")
#endif

In my case, certain presentations of indirect enums which reference each other (and that's totally legal!) trip a circular reference error in the analysis. The workaround is to jiggle the order of your declarations, or dodge the analysis by marking the type as @unchecked Sendable.
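
For a rough picture of the shape involved, here is a hypothetical pair of mutually referencing indirect enums with the workaround and the reminder sitting next to it. `Expression` and `Statement` are invented for illustration, and this small pair may well compile fine on your toolchain without the workaround; it only shows where the pieces go.

```swift
// Two enums that refer to each other. Marking them @unchecked Sendable
// skips the compiler's Sendable analysis, so you take responsibility for it.
indirect enum Expression: @unchecked Sendable {
    case number(Int)
    case block(Statement)

    var value: Int {
        switch self {
        case .number(let n): return n
        case .block(let statement): return statement.value
        }
    }
}

indirect enum Statement: @unchecked Sendable {
    case evaluate(Expression)
    case sequence(Statement, Statement)

    /// A sequence evaluates to the value of its last statement.
    var value: Int {
        switch self {
        case .evaluate(let expression): return expression.value
        case .sequence(_, let last): return last.value
        }
    }
}

// Keep the reminder beside the workaround so it is found when it fires.
#if swift(>=7)
#warning("Check if Swift bug is fixed and we can remove the '@unchecked Sendable'.")
#endif

let program = Statement.sequence(
    .evaluate(.number(1)),
    .evaluate(.block(.evaluate(.number(7))))
)
print(program.value)
```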

The bug has been there since sometime in version 5 and we are up to 6 now, so it is safe to say it isn't a priority. Setting a reminder to review it when you hit Swift 7 lets you forget about it now.

I know my daily readers have been wondering what's up these past 700 or so days…

  • There will be no more OS work. I rolled a critical fumble on COVID and can no longer work with complex abstractions in my head. I was kind of hoping for a recovery, but it appears there will not be one.

  • An OS which makes limited use of system calls and primarily operates by streaming requests from user space to the OS, with responses made back to user space asynchronously, was looking extremely promising. Sadly, anyone looking to build one only gets that anecdote from my work.

  • Most of the entities which the user space communicates with are ordinary, though trusted, processes in their own right; the kernel is quite small and really about connecting things. Breaking traditional OS subsystems off into their own processes has a nice synergy with high core count processors by putting more L1/L2 cache to work.

  • Popular programming languages can be made to work in this environment, but there is a lot of friction. A language with built-in lightweight concurrency is a better fit. I was in the middle of a sort of Swift subset which fulfilled that and looked very promising.

Not so OS related…

  • I'm ok. I've been doing a lot of physical manufacturing skill work and exploring what I can create. I think I can do small software bits, but I have to finish them in one push; if I have to pick them up later, they make no sense to me.

  • So I guess I'll take a little pivot here and start documenting small things and maybe just showing stuff I make.
