A nice way to use traditional ML models today is to do feature extraction with an LLM and classification on top with a traditional ML model. Why? Because this way you can tune your own decision boundary, and piggyback on features from a generic LLM to power the classifier.
For example, CV triage: you use an LLM with a rubric to extract features; choosing the features you are going to rely on does a lot of the work here. Then collect a few hundred examples, label them (accept/reject), and train your traditional ML model on top; it will not inherit the LLM's biases.
You can probably use any LLM for feature preparation, and retrain the small model in seconds as new data is added. A coding agent can write its own small-model-as-a-tool on the fly and use it in the same session.
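A minimal sketch of the second stage, assuming the LLM has already turned each CV into a rubric-scored feature vector (the feature names and numbers here are invented for illustration). The "trad ML" layer is a plain logistic regression you can retrain in seconds as new labels arrive:

```python
import math

# Hypothetical rubric features an LLM might extract per CV, each in [0, 1]:
# [years_experience, has_degree, skill_match, writing_quality]
LABELED = [
    ([0.8, 1.0, 0.9, 0.7], 1),  # accept
    ([0.2, 0.0, 0.1, 0.3], 0),  # reject
    ([0.6, 1.0, 0.8, 0.5], 1),
    ([0.1, 1.0, 0.2, 0.4], 0),
    ([0.9, 0.0, 0.9, 0.8], 1),
    ([0.3, 0.0, 0.2, 0.2], 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=2000, lr=0.5):
    """Logistic regression via SGD: the small, bias-tunable decision layer."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Probability of 'accept' for a new feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

w, b = train(LABELED)
print(predict(w, b, [0.7, 1.0, 0.8, 0.6]) > 0.5)  # strong candidate -> True
```

The decision boundary is entirely yours: relabel a few examples or adjust the threshold and retrain; the LLM only supplies the features.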
What do you mean by "feature extraction with an LLM"? I can see this for text-based data, but would you do that on numeric data? Seems like there are better auto-ML tools you could use in that sphere?
Unless by LLM feature extraction you mean something like "have claude code write some preprocessing pipeline"?
Yes, it should remain part of the commit, and the work plan too, including judgements/reviews done with other agents. The chat log encodes user intent in raw form, which justifies the tasks, which in turn justify the code and its tests. Bottom-up, we say the tests satisfy the code, which satisfies the plan, which finally satisfies the user intent. You can play the "satisfied/justified" game across the whole stack.
I only log my own user messages, not AI responses, in a chat_log.md file, which is created by a user-message hook in the repo.
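A minimal version of such a hook might look like this, assuming Claude Code's UserPromptSubmit hook, which pipes the event as JSON on stdin; the `prompt` field name and the log format are my assumptions, so check the hooks docs for your version:

```python
import json
import sys
from datetime import datetime, timezone
from pathlib import Path

LOG = Path("chat_log.md")

def log_prompt(payload: dict, path: Path = LOG) -> None:
    """Append the user's raw message to the repo chat log with a timestamp."""
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    prompt = payload.get("prompt", "")
    with path.open("a", encoding="utf-8") as f:
        f.write(f"## {stamp}\n\n{prompt}\n\n")

if __name__ == "__main__":
    # Claude Code passes hook input as a JSON object on stdin.
    log_prompt(json.load(sys.stdin))
```

Registered as a UserPromptSubmit hook, this keeps the raw-intent log growing without logging any AI output.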
I'm switching from Claude Web to Claude Code. Local files give me memory I actually control, unlike Anthropic's implementation. CC doesn't carry state between sessions — you just put whatever project context it needs in a file.
It's been a few months since Gemini 3 and Opus 4.5 were released, and I still regularly feel a sense of dread, as if I'm deprived of something (which I assume is the thrill and pride of being able to explore solution spaces in non-stupid ways and find plausible answers on my own).
Maybe it's the usual corporate webdev job, too focused on mainstream code, where AI is used to sell more, not to find new ideas that could be exciting.
> Not understanding how consciousness is created doesn't make it divine.
It's not divine, just expensive, and has to pay its costs. That little thing - cost - powers evolution. Cost defines what can exist and shaped us into our current form, it is the recursive runway of life.
I. J. Good's quote is pretty myopic; it assumes machines make better machines by virtue of being "ultraintelligent" instead of by learning from an environment-action-outcome loop.
It's the difference between "compute is all you need" and "compute plus explorative feedback is all you need". As if science and engineering come from genius brains, not from careful experiments.
There's an implicit assumption there: anything a computer as intelligent as a human does will be exactly what a human would do, only faster or more intelligently. If the process is part of the intelligent way of doing things, like the scientific method and careful experimentation, then that's what the ultraintelligent machine will do.
There's no implication that it's going to do it all magically in its head from first principles; it has become very clear in AI that embodiment and interaction with the real world are necessary. It might be practical for a world model, at sufficient levels of compute, to simulate engineering processes at a resolution that allows all sorts of first-principles simulated physical development and problem solving "in its head", but for the most part, real ultraintelligent development will happen with real-world iterations, robots, and research labs doing physical things. They'll just be far more efficient and fast than us meatsacks.
At sufficient levels of intelligence, one can increasingly substitute intelligence for those other things (prototypes, experiments, physical iterations).
Intelligence can be the difference between having to build 20 prototypes and building one that works first try, or having to run a series of 50 experiments and nailing it down with 5.
The upper limit of human intelligence doesn't go high enough for something like "a man designed an entire 5th-gen fighter jet in his mind and then built it first try" to be possible. The limits of AI might go higher than that.
Exceedingly elaborate, internally-consistent mind constructs, untested against the real world, sounds like a good definition of schizophrenia. May or may not correlate with high intelligence.
We only call it "schizophrenia" when those constructs are utterly useless.
They don't have to be. When they aren't, sometimes we call it "mathematics".
You only have to "test against the real world" if you don't already know the outcome in advance. And you often don't. But you could have: with the right knowledge and methods, you could have tested the entire thing internally and learned the real-world outcome in advance, to an acceptable degree of precision.
We already have the knowledge to build CFD models. The same knowledge could be used to construct a CFD model in your own mind. We have a lot of scattered knowledge that could be used to build extremely elaborate and accurate internal world models to develop things in - if only, you know, your mind were capable of supporting such a thing. And it isn't! Skill issue?
I like the substitution concept. What humans can do depends on the abstractions and the tools. One could picture just the shape of the jet and have a few ideas for improving it further. If that is enough info for the tool, it could be worthy of the label "designed by Jim".
From what I can see we're working as hard as we can to build them. You can watch the "let's put this on a Raspberry Pi and see what happens" seeds of Skynet develop in real time.
There's something compelling about helping assemble the machine. Science fiction was completely wrong about motivation. It's fun.
I've noticed this core philosophical difference in certain geographically associated peoples.
There is a group of people who think AI is going to ruin the world because they think they themselves (or their superiors) would ruin the world.
There is a group of people who think AI is going to save the world because they think they themselves (or their superiors) would save the world.
Kind of funny to me that the former group is typically democratic (those who are supposed to decide their own futures are afraid of the future they've chosen) while the latter is often "less free" and unafraid of the future that's been chosen for them.
There is also a group of people who think AI is going to ruin the world because they don't think the AI will end up doing what its creators (or their superiors) would want it to do.
It's our job, after all, to keep the agent aligned; we should not expect it to self-recover when it goes astray or to mind its own alignment. Even with humans we hire managers to align the activity of subordinates, keeping intent and work in sync.
That said, I find that running judge agents on plans before working, and again on completed work, helps a lot; the judge should start with fresh context to avoid bias. This is where having good docs comes in handy, because the judge must know the intent, not just study the code itself. If your docs encode both work and intent, and you judge the work against them, then misalignment is much reduced.
My ideal setup has a planning agent, followed by a judge agent, then a worker, then code review, with me nudging and directing the whole process on top. Multiple perspectives intersect: each agent has its own context, and I have mine, which helps cover each other's blind spots.
> Even with humans we hire managers to align the activity of subordinates, keeping intent and work in sync.
We do this socially too. From a very young age, children teach each other what they like and don't like, and in that way mutually align their behaviour toward pro-social play.
> I find that running judge agents on plans before working and on completed work helps a lot
How do you set this up? Do you do this on top of the claude code CLI somehow, or do you have your own custom agent environment with these sort of interactions set up?
I use a task.md file for each task; it has a list of gates, just like an ordinary markdown todo list. The planner agent has an instruction to install a judge gate at the top and one at the bottom. The judge runs in headless mode and updates the same task.md file. The file acts as an information bus between agents and, like code, it runs the gates in order, reliably.
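A sketch of what the headless judge call might look like, assuming an agent CLI with a print mode such as `claude -p`; the prompt wording and the injectable `runner` are my own additions for testability, not a prescribed setup:

```python
import subprocess

def judge(task_file: str, runner=subprocess.run) -> str:
    """Run a fresh-context judge over a task file and return its verdict.

    `runner` defaults to subprocess.run but is injectable so the call
    can be exercised without the CLI installed.
    """
    prompt = (
        f"You are a judge with no prior context. Read {task_file} and list "
        "any misalignments between the plan, the stated intent, and the code."
    )
    result = runner(["claude", "-p", prompt], capture_output=True, text=True)
    return result.stdout
```

Because each invocation starts a fresh process, the judge sees only the docs and the task file, which is exactly the bias-avoiding property described above.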
I am actively thinking about task.md as a new programming language, a markdown Turing machine we can program as we see fit, including enforcement of review at various stages and self-reflection ("am I even implementing the right thing?") kinds of activity.
I have tested it to reliably execute 300+ gates in a single run. That is why I set judges on it, to refine it. For difficult cases I judge 3-4 times before working; each judge iteration surfaces new issues. Judge convergence on a task is decided manually; I am in the loop.
The judge proposes bad ideas maybe 20% of the time; sometimes the planner agent catches them, other times I do. It's an efficient triage hierarchy: the judge surfaces -> the planner filters -> I adjudicate the hard cases.
There's a school of thought that the reason so many autistic founders succeed is that they're unable to interpret this kind of programming. I saw a theory that to succeed in tech you needed a minimum amount of both tizz and rizz (autism and charisma).
I guess the winning openclaw model will have some variation of "regularly rewrite your source code to increase your tizz*rizz without exceeding a tizz:rizz ratio of 2:1 in either direction."
> This article is far off the mark. The improvement is not in the user-side. You can write docs or have the robot write docs; it will improve performance on your repo, but not “improve” the agent.
No, the idea is to create these improved docs in all your projects, so all your agents get improved as a consequence, each with its own project-specific documentation.
You can't improve the agents, but you can improve their work environment. Agents gain a few advantages from up-to-date docs:
1. faster bootstrap and less token usage than thrashing around the codebase to reconstitute what it does
2. carry context across sessions, if the docs act like a summary of current state, you can just read it at the start and update it at the end of a session
3. hold information you can't derive from studying the code, such as intents, goals, criteria and constraints you faced, an "institutional memory" of the project
Agreed, this is the point the article makes. I don't think the article claims that the agent itself is directly improved or altered, but that through the agent self-maintaining its environment, then using that improvement to bootstrap its future self or sub-agents, the agent's _performance_ is holistically better.
> ... if the docs act like a summary of current state, you can just read it at the start and update it at the end of a session
Yeah, exactly. The documentation is effectively a compressed version of the code, saving agent context for a good cross-section of (a) the big picture, and (b) the details needed to implement a given change to the system.
Think we're all on the same page here, but maybe framing it differently.
> Nothing you write will matter if it is not quickly adopted to the training dataset.
That is my take too. I was surprised to see how many people object to their works being trained on. It's how you leave your mark: opening access to AI, just as in the last 25 years it was opening access to people (no restrictions on access, being indexed by Google).
"On reflection I have started to worry again. In 10 to 20 years nobody will read anything any more, they just will read LLM digests. So, the single most important task of a writer starting right now is to get your efforts wired in to the LLMs"
Your words will be like a drop in the ocean, an ocean whose volume keeps increasing every year. Also, if nobody reads anything anymore, what's the point?
Most people value their time and work and don't want to give it away for free to some billionaire so they can reproduce it as slop for their own private profit.
That's to say, most people recognize when they're getting fucked over and are correct to object to it.
People who produced the works LLMs are trained on are not compensated for the value they are now producing, and their skills are increasingly less valued in a world with LLMs. The value the LLMs are producing is being captured by employees of AI companies who are driving up rent in the Bay Area, and driving up the cost of electricity and water everywhere else.
Your surprise at people's objections makes sense if you can't count.
> People who produced the works LLMs are trained on are not compensated for the value they are now producing
The value being extracted via LLM techniques is new value, which did not previously exist. The producers of the old data had an asking price, which the LLM trainers met. They cannot argue that, since the LLM is producing new value, they should retroactively update their old asking price for their works.
They could update their asking price for any new works they produce. They also have the right to ask that their works not be used for training, etc. But they cannot ask for their old works to be paid for retroactively based on the new uses in LLMs.
> They could update their asking price for any new works they produce. They also have the right to ask their works not be used for training, etc.
Someone else already pointed out that many works used to train LLMs were stolen, but it's also unclear whether this is true. Can you opt out? Copyright should have been enough to prevent a company from stealing and profiting from your work, but it wasn't, in the case of every existing LLM.