More

danvk · 2024-11-16T14:35:46 1731767746

I've had really good luck recently running OCR over a corpus of images using gpt-4o. The most important thing I realized was that non-fancy data prep is still important, even with fancy LLMs. Cropping my images to just the text (excluding any borders) and increasing the contrast of the image helped enormously. (I wrote about this in 2015 and this post still holds up well with GPT: https://www.danvk.org/2015/01/07/finding-blocks-of-text-in-a...).

I also found that giving GPT at most a few paragraphs at a time worked better than giving it whole pages. Shorter text = less chance to hallucinate.

pbhjpbhj · 2024-11-16T14:52:10 1731768730

Have you tried doing a verification pass: so giving gpt-4o the output of the first pass, and the image, and asking if they can correct the text (or if they match, or...)?

Just curious whether repetition increases accuracy or of it hurt increases the opportunities for hallucinations?

danvk · 2024-11-17T13:00:12 1731848412

I have not, but that's a great idea!

danvk · 2024-11-09T12:15:58 1731154558

Try webdiff (I’m the author) https://pypi.org/project/webdiff/

danvk · 2024-07-20T15:07:47 1721488067

Since function parameters are immutable in Zig, is this a difference that you need to care about?

renox · 2024-07-24T21:33:11 1721856791

The parameter is immutable so if you pass a pointer, the pointer is immutable, but the pointed value isn't.. If the pointer alias another 'immutable' parameter, this parameter may not be immutable anymore..

danvk · 2024-07-18T14:46:25 1721313985

Thanks for the pointer, I hadn't run across ts-runtime-checks before. It does do something similar to what I propose in the post. The difference is that ts-runtime-checks is opt-in. If you want a type validated at runtime, you have to write `Assert<T>`. What I'm proposing is that _all_ types be validated at runtime.

opt-in makes sense if you want these checks in production. The appeal of a "check everything" debug mode is that you wouldn't have to modify your TS at all to use it.

danvk · 2024-07-18T14:40:06 1721313606

Destructuring assignment was definitely something I missed in Zig (it's even called out in the post). Here's the relevant issue/comment, which makes me think this won't happen in Zig: https://github.com/ziglang/zig/issues/3897#issuecomment-7389...

Reki · 2024-07-18T14:44:25 1721313865

Its in zig 0.12

danvk · 2024-07-18T16:21:08 1721319668

For tuples, not for named fields.

danvk · 2024-07-18T14:36:32 1721313392

Yes, the error is much clearer when you realize where the mistake is :)

There's an interesting point here, though: while comptime lets Zig unify function calls and generic type instantiation, this also creates the possibility of confusing the two and getting confusing error messages.

danvk · 2024-07-18T14:31:58 1721313118

Extensive. Try reading the full post, you'll see that those points are all mentioned.

danvk · 2024-06-20T19:05:19 1718910319

You can work around this by typing your import statement inside-out:

import {} from 'fs';

Then move your cursor back inside the {} and you'll have nice autocomplete. Works with object destructuring, too.

danvk · 2024-01-24T11:59:00 1706097540

Or 2037.

danvk · 2023-12-19T22:15:34 1703024134

What about Object.fromEntries?

See https://github.com/tc39/proposal-array-grouping for why this isn’t a method on Array.prototype.

flqn · 2023-12-19T22:18:15 1703024295

Constructs an object from kv pairs, definitely assoctiated with the "Object" prototype. Groupby is more dubious