
russellthehippo · 2026-04-24T11:11:36 1777029096

writes and claim/ack flow. really depends on your journal mode and synchronous mode as well.

notifs are extremely cheap, either in the old stat(2) mode or the new PRAGMA data_version (see my update on the feedback comment). Some other comments mentioned that stat(2) is about 1µs.
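If you want to check the per-call cost on your own machine, a rough sketch like the one below can help. It is only an illustration: `measure` is a hypothetical helper, not part of honker, and because it goes through Python the numbers include interpreter overhead and will read higher than the raw syscall or query cost.

```python
import os
import sqlite3
import tempfile
import timeit


def measure(n=20_000):
    """Return (stat_us, data_version_us): mean microseconds per call.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.db")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE t (x)")
        conn.commit()
        try:
            # Cost of one stat(2) call on the database file.
            stat_s = timeit.timeit(lambda: os.stat(path), number=n)
            # Cost of one PRAGMA data_version query on an open connection.
            pragma_s = timeit.timeit(
                lambda: conn.execute("PRAGMA data_version").fetchone(),
                number=n,
            )
        finally:
            conn.close()
    return stat_s / n * 1e6, pragma_s / n * 1e6


if __name__ == "__main__":
    stat_us, pragma_us = measure()
    print(f"stat: {stat_us:.2f} us/call, data_version: {pragma_us:.2f} us/call")
```

Results depend heavily on hardware, filesystem, and journal mode, so treat them as a relative comparison rather than absolute figures.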

russellthehippo · 2026-04-24T10:03:36 1777025016

[Response to feedback]

Thanks all for your feedback, responses, and discussion. I've done a PR here taking your suggestions into account:

https://github.com/russellromney/honker/pulls/1

The PR implements a three-layer polling architecture:

- PRAGMA data_version every 1ms
- stat every 100ms
- retry connection to handle blips

1. PRAGMA data_version every 1ms replaces stat-based (size, mtime) change detection. This is SQLite's own commit counter: monotonic, immune to clock skew, and it correctly handles WAL truncation and rolled-back transactions. It is a ~3µs nonblocking query. Credit to ncruces for pointing to this. The change is for correctness, not performance, since it is slightly slower than stat. tuo-lei also pointed out the truncation risk, which turned out to be more real than I thought.

Interesting note: I found in testing that the C API's SQLITE_FCNTL_DATA_VERSION does not work cross-connection. So for now honker keeps paying the cost of going through the VFS layer, which vlovich123 pointed out, and that is now an explicit tradeoff.

2. Reconnect-on-error: if the data_version query fails (disk blip, NFS hiccup, corrupted connection), honker tries to reconnect and wakes subscribers as a precaution. zbentley pointed me in this direction.

3. stat identity check every 100ms: compares (dev, ino) against startup values to detect file replacement (atomic rename, litestream restore, volume remount). data_version can't catch this because it polls through the open fd, which follows the original inode even after replacement. Credit to zbentley for the file-replacement scenarios.
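For anyone following along, the three layers above could be sketched roughly like this. It is a minimal illustration, not honker's actual code: the `Watcher` class and the `poll_version`, `poll_identity`, and `demo` names are hypothetical, and it assumes a POSIX filesystem where `os.replace` swaps in a new inode.

```python
import os
import sqlite3
import tempfile


def data_version(conn):
    # PRAGMA data_version changes when another connection commits.
    return conn.execute("PRAGMA data_version").fetchone()[0]


def file_identity(path):
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


class Watcher:
    """Hypothetical poller for illustration only."""

    def __init__(self, path):
        self.path = path
        self.conn = sqlite3.connect(path)
        self.version = data_version(self.conn)
        self.identity = file_identity(path)  # startup (dev, ino)

    def poll_version(self):
        """Fast layer: True if subscribers should be woken."""
        try:
            current = data_version(self.conn)
        except sqlite3.Error:
            # Reconnect layer: reopen and wake subscribers as a precaution.
            self.conn.close()
            self.conn = sqlite3.connect(self.path)
            self.version = data_version(self.conn)
            return True
        changed = current != self.version
        self.version = current
        return changed

    def poll_identity(self):
        """Slow layer: True if the file at `path` was replaced."""
        return file_identity(self.path) != self.identity


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE t (x)")
        writer.commit()

        watcher = Watcher(path)
        quiet = watcher.poll_version()         # nothing committed yet
        writer.execute("INSERT INTO t VALUES (1)")
        writer.commit()
        after_commit = watcher.poll_version()  # another connection committed

        # Simulate an atomic rename over the database path.
        replacement = os.path.join(tmp, "replacement.db")
        open(replacement, "wb").close()
        os.replace(replacement, path)
        replaced = watcher.poll_identity()

        watcher.conn.close()
        writer.close()
        return quiet, after_commit, replaced
```

In a real loop, `poll_version` would run on the 1ms tick and `poll_identity` on the 100ms tick; the demo only calls each once to show what they detect.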

Again - thanks for the discussion, honker got better because of it and I learned some stuff. See you round

russellthehippo · 2026-04-24T03:51:33 1777002693

Wow, thanks for the great feedback.

I actually looked at fstat, but the "check for deletions" piece, given I'm polling at 1 kHz, was the reason I decided not to use it. On older hardware this was a big issue, but it's fast enough now that I decided it wasn't a problem.

I'll ignore the malicious ones bc [out of scope declaration]. Object paranoia is an artifact of build trauma and I respect that lmao.

I've just looked into the device number and system clock issues. I think what I'll end up doing is actually a combo of ncruces's above comment and your feedback: a 1 kHz data_version and a 10 Hz stat() with version check. This gets around syscall load, avoids clock issues, avoids the WAL truncation issues that others have mentioned, and is both lighter weight and less bugabooable than my previous design.

Thanks again.

zbentley · 2026-04-24T20:05:14 1777061114

Hope it helps!

One clarification: by "check for deletions" I didn't mean that you need to read back through the filesystem; you can check for deletions for free using fstat(2)'s result. The number of hard links to a file descriptor's underlying description returned by fstat includes the "existential" hard link of the file itself, and drops to zero when the file's deleted and the open handle is an orphan:

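    import os
    import time
    from threading import Thread, Event

    f = '/tmp/foo.test'
    ev = Event()
    # Background thread: wait for the signal, then unlink the file.
    Thread(target=lambda: ev.wait() and os.unlink(f), daemon=True).start()

    with open(f, 'w+') as fh:
        # Link count while the path still exists.
        print("before delete:", os.fstat(fh.fileno()).st_nlink)
        ev.set()
        time.sleep(1)  # give the thread time to unlink
        # Link count on the now-orphaned open handle.
        print("after delete:", os.fstat(fh.fileno()).st_nlink)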

russellthehippo · 2026-04-24T20:27:24 1777062444

Ha. Great callout. Will inspect further

russellthehippo · 2026-04-23T23:15:46 1776986146

Damn it was real the whole time. I found Opus 4.7 to holistically underperform 4.6, and especially in how much wordiness there is. It's harder to work with so I just switched back to 4.6 + Kimi K2.6. Now GPT 5.5 is here and it's been excellent so far.

russellthehippo · 2026-04-23T23:11:58 1776985918

"a small proliferation" is a nice way to describe the cluster that is my side project habit. if you bump into any issues pls open a PR or drop an issue on the repo!

russellthehippo · 2026-04-23T20:39:12 1776976752

Actually need to test this. Will report back

russellthehippo · 2026-04-23T20:17:39 1776975459

10k listeners is a lot. Thundering herd issue at stat(). SQLite may not be your best choice at this scale.

russellthehippo · 2026-04-23T22:38:59 1776983939

also this is designed for a single machine. 10k listeners on one machine seems like a lot!

russellthehippo · 2026-04-23T20:08:00 1776974880

Nope! The extension just functions as a shortcut for raw SQL. Litestream edits the WAL file, but only like a normal checkpoint, so not too bad. Although I haven’t tested it directly. Probably need to.

russellthehippo · 2026-04-23T20:05:34 1776974734

The WAL file sticks around but gets truncated, so that counts as an update. I don’t have tests for this, though. Good input, thanks, I’ll make sure.

russellthehippo · 2026-04-23T20:03:51 1776974631

Breaks cross-platform: specifically, Macs silently swallow events. stat just works.

Retr0id · 2026-04-23T20:29:15 1776976155

I don't believe this to be true.

russellthehippo · 2026-04-23T20:38:34 1776976714

See comment below - Darwin silently drops same-process notifs. I could change the behavior depending on same vs cross process and platform, but I wanted "just one thing to worry about". Potentially a good optimization later. Would help reduce syscalls.

Retr0id · 2026-04-23T20:58:52 1776977932

I believe you are mistaken. If you are referring to the comment from ArielTM, that's an LLM bot regurgitating your readme.

russellthehippo · 2026-04-23T22:30:36 1776983436

Apologies for not being specific.

The specific thing I'm talking about is this: write events don't fire until the file handle is closed. [1] I didn't validate this myself btw, but my original design was certainly trying to use notify events rather than stat polling. My research (heavily AI assisted of course) led me away from that path as platforms differ in behavior and I wanted to avoid that.

[1] https://github.com/notify-rs/notify/issues/240

russellthehippo · 2026-04-24T03:55:05 1777002905

If this has been fixed somewhere or there is a better alternative I'd love to use that over polling. Current plan is to move to polling data version for speed + occasional stat for safety. Getting rid of polling was my original goal but I compromised with syscalls.

xenadu02 · 2026-04-24T01:52:31 1776995551

I have no idea why they aren't using kqueue but that works on macOS and FreeBSD. It has for years.

You want EVFILT_VNODE with NOTE_WRITE. That's hooked up to VNOP_WRITE in the kernel, the call made to the relevant filesystem to actually perform the write.
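A minimal sketch of that registration, using Python's `select.kqueue` wrapper rather than the C API. The `wait_for_write` and `demo` helpers are hypothetical names made up for illustration, and the function returns `None` on platforms without kqueue (e.g. Linux) instead of failing.

```python
import os
import select
import tempfile


def wait_for_write(path, do_write, timeout=1.0):
    """Return True if a NOTE_WRITE event arrives, False on timeout,
    or None when kqueue is unavailable on this platform."""
    if not hasattr(select, "kqueue"):
        return None  # kqueue exists on macOS and the BSDs only
    fd = os.open(path, os.O_RDONLY)
    kq = select.kqueue()
    try:
        ev = select.kevent(
            fd,
            filter=select.KQ_FILTER_VNODE,                # EVFILT_VNODE
            flags=select.KQ_EV_ADD | select.KQ_EV_CLEAR,
            fflags=select.KQ_NOTE_WRITE,                  # NOTE_WRITE
        )
        kq.control([ev], 0)  # register the filter, collect no events yet
        do_write()
        events = kq.control(None, 1, timeout)  # wait for at most one event
        return any(e.fflags & select.KQ_NOTE_WRITE for e in events)
    finally:
        kq.close()
        os.close(fd)


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name

    def append():
        # Write through a separate handle in the same process.
        with open(path, "ab") as fh:
            fh.write(b"x")

    try:
        return wait_for_write(path, append)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("NOTE_WRITE observed:", demo())
```

The watch is attached to the open descriptor, so like the data_version poll it follows the original inode; a replaced file would still need a separate identity check.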