
Interactivity and liveness in programming deserve to be discussed far more often than they are on the front page of Hacker News, but I'm excited there are multiple ongoing threads!

I'm a very strong supporter of interactive blog posts as well. Obviously https://ciechanow.ski/ is the leader here - being able to mess with something to build intuition is huge.


Live development is still so under-explored, and is just so exciting to work with.

One of my favorite talks is "Stop Writing Dead Programs" (https://www.youtube.com/watch?v=8Ab3ArE8W3s), and it touches on a lot of what could be possible in terms of live development.

Lisp is very well-suited to live development due to code being data, but live development doesn't need to be lispy.

I built a live development extension for Love2D which lets you do graphics livecoding (both Lua and GLSL) in real time - every keystroke updating the output (if it's a valid program).

https://github.com/jasonjmcghee/livelove

Here are some (early) demos of things you can do with it:

https://gist.github.com/jasonjmcghee/9701aacce85799e0f1c7304...

So many cool things once you break down the barrier between editor and running program.
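
If it helps make the idea concrete, here's a minimal, hypothetical Python sketch of the general technique - watch a file and swap new code into the running process while keeping state. It's only an illustration of the concept; livelove itself is Lua/LSP-based and works differently:

  # Minimal live-reload loop (illustration only, not livelove's actual mechanism).
  # The watched file defines update(state); saving new code takes effect
  # immediately without restarting the process or losing `state`.
  import os
  import time

  WATCHED = "live_logic.py"   # hypothetical file you edit while this runs
  state = {"frame": 0}        # survives reloads; only the code gets swapped
  namespace = {}
  last_mtime = 0.0

  while True:
      try:
          mtime = os.path.getmtime(WATCHED)
          if mtime != last_mtime:
              code = compile(open(WATCHED).read(), WATCHED, "exec")  # rejects invalid programs
              exec(code, namespace)
              last_mtime = mtime
      except (SyntaxError, OSError):
          pass  # keep running the last good version

      if "update" in namespace:
          namespace["update"](state)  # freshly patched logic, persistent state
      state["frame"] += 1
      time.sleep(1 / 60)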

I've also asked myself what a language natively built for live development would look like, and built out a prototype - though it's definitely a sandbox, so I haven't posted it anywhere yet other than demos on Mastodon.


Jack Rusher's recent interview is well worth reading too (the "stop writing dead programs" guy).

> On the need to sustain your creative drive in the face of technological change

> https://thecreativeindependent.com/people/multi-disciplinary...

nb. I recently submitted it here: https://news.ycombinator.com/item?id=43759204


That's a great submission! I put it in the second-chance pool (https://news.ycombinator.com/pool, explained at https://news.ycombinator.com/item?id=26998308), so it will get a random placement on HN's front page.

Oh wow, just had to log in and give you a high-five for livelove because this is the first I've heard of it and it sounds like the sort of thing I absolutely need to try out.

I remember giving Love2D a go a couple of years ago with Fennel and the lack of such a thing sent me grumbling back to Common Lisp. I'd never even have thought of building that functionality in Love/Lua myself - assuming it's something that the runtime just didn't support - and it absolutely would never have occurred to me to use LSP to set it up. I've not even used it yet and it's already doing things to my brain, so thanks!


Excited to spread the brain worm. Don't hesitate to join in the fun / log issues / contribute / share how you use it!

I guess the prevailing worldview is that "recompile everything and re-run" is good enough if it takes 2 seconds. But agreed that it just "feels" different when you're doing it live in Lisp... I miss Emacs!

Recompile and hot reload, maybe. 2 seconds if you're very lucky. Many setups are much slower. I've seen some really cool projects over the last few years - things like TCC + hot reload that have really good turnaround times.

But "live" is a whole different thing. And most languages aren't built with the expectation that you'll be patching it while it's running - at least as standard operating procedure and without nuking state.

And that's a key part.

I think you should be able to build an app and develop a game without ever having to restart them.


Well, for me it's not enough, because I need to get back to where I was, repeating the same actions so it gets to the same state. With live dev I don't need this, or a history replay method. I only update the running code. Heck, I could even update the in-memory vars too if I want.

It's good that it's fast. Still not good enough!


There are similar trends in music and sound art, which can be experienced with Glicol (https://glicol.org/) as well as many other languages here:

https://github.com/toplap/awesome-livecoding

also this live coding book is free to read!

https://livecodingbook.toplap.org/


Parent comment isn't asking how data is requested from the back-end.

GP comment is (seemingly) describing keeping an entirely client side instance (data stored locally / in memory) snapshot of the back-end database.

Parent comment is asking how the two are kept in sync.

It's hard to believe it would be the method you're describing and take 25ms.

If you're doing http range requests, that suggests you're reading from a file which means object storage or disk.

I have to assume there is something getting triggered when the back end is updated to tell the client to update its instance. (Which very well could just be telling it to execute some SQL to get the new/updated information it needs.)

Or the data is entirely in memory on the back end in an in memory duckdb instance with the latest data and just needs to retrieve it / return it from memory.


agreed - I thought qwen2.5-coder was kind of the standard line of small, non-reasoning coding models right now

I saw pretty good reasoning quality with phi-4-mini. But alright - I’ll still run some tests with qwen2.5-coder and plan to add support for it next. Would be great to compare them side by side in practical shell tasks. Thanks so much for the pointer!

they made it so easy to do specdec (speculative decoding) - that alone sold it for me

Some models even have a 0.5B draft model. The speed increase is incredible.
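
For anyone who hasn't tried it, here's roughly what that looks like with Hugging Face transformers' assisted generation (just a hedged sketch - the Qwen2.5-Coder model IDs are examples of a plausible target/draft pair, and whatever runtime you use may wire this up differently):

  # Speculative decoding sketch: a small draft model proposes tokens, and the
  # large target model verifies them in one forward pass - hence the speedup.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  target_id = "Qwen/Qwen2.5-Coder-7B-Instruct"     # assumed target model
  draft_id = "Qwen/Qwen2.5-Coder-0.5B-Instruct"    # assumed 0.5B draft model

  tokenizer = AutoTokenizer.from_pretrained(target_id)
  target = AutoModelForCausalLM.from_pretrained(target_id, device_map="auto")
  draft = AutoModelForCausalLM.from_pretrained(draft_id, device_map="auto")

  prompt = "Write a bash one-liner that counts lines across all *.py files."
  inputs = tokenizer(prompt, return_tensors="pt").to(target.device)

  out = target.generate(**inputs, assistant_model=draft, max_new_tokens=128)
  print(tokenizer.decode(out[0], skip_special_tokens=True))

The draft and target need to share a tokenizer, which is why a same-family pair like this is the usual setup.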


Is this the widely used term? Do you know of any open source models fine-tuned as an "inner loop" / native agentic llm? Or what the training process looks like?

I don't see why any model couldn't be fine-tuned to work this way - i.e. tool use doesn't need to be followed by an EOS token or something - it could just wait for an output (or even continue with the knowledge there's an open request, and to take action when it comes back)


Looks like someone used a tool which generates a landing page for you (the product they used left an image that advertises that company). There's no product here / violates Show HN.


If you look at the Privacy Policy or Terms pages, they call the site ActionFigureGenerator, and they lack the styling of the homepage. There's even a leftover image from the other site on the landing page. Both sites are similar in design.

So either both sites are run by the same person and they copied their own template, or they stole the website design outright. Given the broken styling on the privacy and terms pages, I'm leaning towards stolen.


They've worked to improve this with "memories" (hash symbol to "permanently" record something - you can edit later if you want).

And there's CLAUDE.md. It's like cursorrules. You can also have it modify its own CLAUDE.md.


(even though this was posted first) - larger discussion of same post here https://news.ycombinator.com/item?id=43735550

Surprised that "controlling cost" isn't a section in this post. Here's my attempt.

---

If you get a hang of controlling costs, it's much cheaper. If you're exhausting the context window, I would not be surprised if you're seeing high cost.

Be aware of the "cache".

Tell it to read specific files (and only those!); if you don't, it'll read unnecessary files, repeatedly read sections of files, or even search through files.

Avoid letting it search - even halt it if it starts. find / rg can produce thousands of tokens of output depending on the search.

Never edit files manually during a session (that'll bust cache). THIS INCLUDES LINT.

The cache also goes away after 5-15 minutes or so (not sure) - so avoid leaving sessions open and coming back later.

Never use /compact (that'll bust the cache; if you need to, you're going back and forth too much or using too many files at once).

Don't let files get too big (it's good hygiene anyway) - it keeps the context window sizes smaller.

Have a clear goal in mind and keep sessions to as few messages as possible.

Write / generate markdown files with the needed documentation using claude.ai, save those as files in the repo, and tell it to read those files as part of a question.

I'm at about ~$0.50-0.75 for most "tasks" I give it. I'm not a super heavy user, but it definitely helps me (it's like having a super-focused, smart intern that makes dumb mistakes).

If I need to feed it a ton of docs etc. for some task, it'll be more in the few-dollar range rather than < $1. But I really only do this to try some prototype with a library Claude doesn't know about (or has outdated knowledge of). For hobby stuff, it adds up - totally.

For a company, massively worth it. Insanely cheap productivity boost (if developers are responsible / don't get lazy / don't misuse it).


If I have to be so cautious while using a tool, I might as well write the code myself, lol. I've used Claude Code extensively and it is one of the best AI IDEs. It just gets things done. The only downside is the cost. I was averaging $35-$40/day. At this cost, I'd rather just use Cursor/Windsurf.

Oh wow. Reading your comment guarantees I'll never use Claude Code.

I use Aider. It's awesome. You explicitly specify the files. You don't have to do work to limit context.


Not having to specify files is a humongous feature for me. Having to remember which file code is in is half the work once you pass a certain codebase size.

Use /context <prompt> to have aider automatically add the files based on the prompt. It's been working well for me.

That sometimes works, sometimes doesn't, and takes 10x the time. Same with Codex. I would have both and switch between them depending on which you feel will get it right.

Yeah, I tried CC out and quickly noticed it was spending $5+ for simple LLM capable tasks. I rarely break $1-2 a session using aider. Aider feels like more of a precision tool. I like having the ability to manually specify.

I do find Claude Code to be really good at exploration though - like checking out a repository I'm unfamiliar with and then asking questions about it.


Aider is a great tool. I do love it. But I find I have to do more with it to get the same output as Claude Code (no matter what LLM I used with Aider). Sure it may end up being cheaper per run, but not when my time is factored in. The flip side is I find Aider much easier to limit.

What are those extra things you have to do more of? I only have experience with Aider so I am curious what I am missing here.

With Claude Code you can at least type "/cost" at any point to see how much it's spent, and it will show you when you end a session (with Ctrl+C) too.

The output of /cost looks like this:

  > /cost 
    ⎿  Total cost: $0.1331
       Total duration (API): 1m 13.1s
       Total duration (wall): 1m 21.3s

Aider shows how much you've spent after each command :-). It shows the cost of the command as well as the session.

After switching to Aider, I realized the other tools have been playing elaborate games to choose cheaper models and to limit files and messages in context, both of which increase their bills.

>I use Aider. It's awesome.

What do you use for the model? Claude? Gemini? o3?


Currently using Sonnet 3.7, but mostly because I've been too lazy to set up an account with Google.

Get an OpenRouter account and you can play with almost all providers. I was burning money on Claude, so I tried V3 (I blocked the DeepSeek provider for being flaky - let the laypeople mock them) and the experimental and GA Gemini models.

Gemini 2.5 pro is my choice

The productivity boost can be so massive that this amount of fiddling to control costs is counterproductive.

Developers tend to seriously underestimate the opportunity cost of their own time.

Hint - it’s many multiples of your total compensation broken down to 40 hour work weeks.


The cost of the task scales with how long it takes, plus or minus.

Substitute “cost” with “time” in the above post and all of the same tips are still valuable.

I don't do much agentic LLM coding, but the speed (or lack thereof) was one of my least favorite parts. Any tricks that narrow scope, prevent reprocessing files over and over again, or avoid searching through the codebase are helpful even if you don't care about the dollar amount.


Hard agree. Whether it's 50 cents or 10 dollars per session, I'm using it to get work done for the sake of quickly completing work that aims to unblock many orders of magnitude more value. But in so far as cheaper correct sessions correlate with sessions where the problem solving was more efficient anyhow, they're fairly solid tips.

I agree, but optimisation often reveals implementation details that help you understand the limits of the current tech. It might not be worth the time, but part of engineering is optimisation and another part is deep understanding of the tech. It is sometimes worth optimising anyway if you want to take the engineering discipline to the next level within yourself.

I myself didn't think about not running linters; however, it makes obvious sense now and gives me insight into how Claude Code works, which I can use in related engineering work.


Exactly. I've been using the chat gpt desktop app not because of the model quality but because of the UX. It basically seamlessly integrates with my IDEs (intellij and vs code). Mostly I just do stuff like select a few lines, hit option+shift+1, and say something like "fix this". Nice short prompt and I get the answer relatively quickly. Option+shift+1 opens chat gpt with the open file already added to the context. It sees what lines are selected. And it also sees the output of any test runs on the consoles. So just me saying "fix this" now has a rich context that I don't need to micromanage.

Mostly I just use the 4o model instead of the newer better models because it is faster. It's good enough mostly and I prefer getting a good enough answer quickly than the perfect answer after a few minutes. Mostly what I ask is not rocket science so perfect is the enemy of good here. I rarely have to escalate to better models. The reasoning models are annoyingly slow. Especially when they go down the wrong track, which happens a lot.

And my cost is a predictable $20/month. The downside is that the scope of what I can ask is more limited. I'd like it to be able to "see" my whole code base instead of just 1 file, and for me to not have to micromanage what the model looks at. Claude can do that if you don't care about money. But if you do, you are basically micromanaging context. That sounds like monkey work that somebody should automate. And it shouldn't require an Einstein-sized artificial brain to do that.

There must be people that are experimenting with using locally running more limited AI models to do all the micromanaging that then escalate to remote models as needed. That's more or less what Apple pitched for Apple AI at some point. Sounds like a good path forward. I'd be curious to learn about coding tools that do something like that.

In terms of cost, I don't actually think it's unreasonable to spend a few hundred dollars per month on this stuff. But I question the added value over the $20 I'm spending. I don't think the improvement is 20x; more like 1.5x. And I don't like the unpredictability of this and having to think about how expensive a question is going to be.

I think a lot of the short term improvement is going to be a mix of UX and predictable cost. Currently the tools are still very clunky and a bit dumb. The competition is going to be about predictable speed, cost and quality. There's a lot of room for improvement here.


If this is true, why isn't our compensation scaling with the increases in productivity?

It usually does, just with a time delay and a strict condition that the firm you work at can actually commercialize your productivity. Apply your systems thinking skills to compensation and it will all make sense.

It's interesting that this is a problem for people because I have never spent more than about $0.50 on a task with Claude Code. I have pretty good code hygiene and I tell Claude what to do with clear instructions and guidelines, and Claude does it. I will usually go through a few revisions and then just change anything myself if I find it not quite working. It's exactly like having an eager intern.

I don't think about controlling cost because I price my time at US$40/h and virtually all models are cheaper than that (with the exception of o1 or Gemini 2.5 pro).

If I spend $2 instead of $0.50 on a session but I had to spend 6 minutes thinking about context, I haven't gained any money.


Important to remind people this is only true if you have a profitable product, otherwise you’re spending money you haven’t earned.

If your expectation is to produce the same amount of output, you could argue when paying for AI tools, you're choosing to spend money to gain free time.

4 hours coding project X or 3 hours and a short hike with your partner / friends etc


If what I'm doing doesn't have a positive expected value, the correct move isn't to use inferior dev tooling to save money, it's to stop working on it entirely.

There might be value but you might not receive any of it. Most salaried employees won't see returns.

Come on, every hobby has negative expected value. You're not doing it for the money but it still makes sense to save money.

If you do it a bit, it just becomes habit / no extra time or cognitive load.

Correlation or causation aside, the same people I see complain about cost, complain about quality.

It might indicate more tightly controlled sessions may also produce better results.

Or maybe it's just people that tend to complain about one thing, complain about another.


I assume they use a conversation, so if you compress the prompt immediately you should only break cache once, and still hit cache on subsequent prompts?

So instead of Write Hit Hit Hit

It's Write Write Hit Hit Hit
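
Rough back-of-the-envelope, assuming Anthropic-style multipliers (a cache write costs ~1.25x the base input price, a cache read ~0.1x) and, for simplicity, the same prompt size each turn:

  # Hypothetical per-token cost of the two cache patterns above.
  # Assumes write = 1.25x base input price, hit = 0.1x (Anthropic-style).
  WRITE, HIT = 1.25, 0.10          # normalized to base input price = 1.0
  tokens = 20_000                  # made-up reused context size

  write_hit_hit_hit = tokens * (WRITE + 3 * HIT)              # ~31,000 units
  write_write_hit_hit_hit = tokens * (2 * WRITE + 3 * HIT)    # ~56,000 units

  print(write_hit_hit_hit, write_write_hit_hit_hit)

So compacting early costs roughly one extra full write, rather than busting the cache on every subsequent message.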


My attempt is - do not use Claude Code at all, it is a terrible tool. It is bad at almost everything, starting with making simple edits to files.

And most of all, Claude Code is overeager to start messing with your code and run up unnecessary $$ instead of making a sensible plan.

This isn't a problem with Claude Sonnet - it is a fundamental problem with Claude Code.


I pretty much one-shotted a scraper from an old Joomla site with 200+ articles to a new WP site, including all users and assets, and converted all the PDFs to articles. It cost me like $3 in tokens.

I guess the question then is: can't VS Code Copilot do the same for a fixed $20/month? It even has access to all the SOTA models like Claude 3.7, Gemini 2.5 Pro, and GPT o3.

VS Code's agent mode in Copilot (even in the Insiders nightly) is a bit rough in my experience: lots of 500 errors, stalls, and outright failures to follow tasks (as if there's a mismatch between what the UI says it will include in context vs. what gets fed to the LLM).

I would have thought so, but somehow no. I have a cursor subscription with access to all of those models, and I still consistently get better results from claude code.

I haven't tried copilot. Mostly because I don't use VSCode, I use jetbrains ides. How do they provide Claude 3.7 for $20/mo with unlimited usage?

Copilot has a pretty good plugin for JetBrains IDEs!

Though their own AI Assistant and Junie might be equally good choices there too.


By providing a bad UI so that you don't use it so much.

was it a wget call feeding into html2pdf?

no it's a few hundred lines of python to parse weird and inconsistent HTML into json files and CSV files, and then a sync script that can call the WP API to create all the authors as needed, update the articles, and migrate the images
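
Not the actual code, but a stripped-down sketch of that kind of pipeline (assuming requests + BeautifulSoup and the standard WordPress REST API with an application password; the URLs, selectors, and credentials are placeholders):

  # Minimal Joomla -> WordPress migration sketch: scrape articles, dump to
  # JSON, then create posts via the WP REST API. Everything here is a placeholder.
  import json
  import requests
  from bs4 import BeautifulSoup

  OLD_SITE = "https://old-joomla.example.com"
  WP_API = "https://new-site.example.com/wp-json/wp/v2"
  AUTH = ("editor", "application-password-here")

  def scrape_article(url):
      soup = BeautifulSoup(requests.get(url).text, "html.parser")
      # Selectors depend on the old theme's weird and inconsistent markup.
      body = soup.select_one("article") or soup.select_one(".item-page")
      return {"title": soup.select_one("h1").get_text(strip=True),
              "content": str(body), "status": "publish"}

  def push_article(article):
      resp = requests.post(f"{WP_API}/posts", auth=AUTH, json=article)
      resp.raise_for_status()
      return resp.json()["id"]

  if __name__ == "__main__":
      urls = [f"{OLD_SITE}/articles/{i}" for i in range(1, 201)]  # placeholder listing
      articles = [scrape_article(u) for u in urls]
      json.dump(articles, open("articles.json", "w"), indent=2)
      for a in articles:
          print("created post", push_article(a))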

Plumbing to pipe shit from one sewer to another.

Yep, don't wanna spend more of my life doing that than I have to!

If I have to spend this much time thinking about any of this, congratulations, you’ve designed a product with a terrible UI.

Some tools take more effort to hold properly than others. I'm not saying there's not a lot of room for improvement - or that the UX couldn't hold the user's hand more to force things like this in some "assisted mode" - but at the end of the day, it's a thin, useful wrapper around an LLM, and LLMs require effort to use effectively.

I definitely get value out of it- more than any other tool like it that I've tried.


Think about what you would do in an unfamiliar project with no context and the ticket

"please fix the authorization bug in /api/users/:id".

You'd start by grepping the code base and trying to understand it.

Compare that to, "fix the permission in src/controllers/users.ts in the function `getById`. We need to check the user in the JWT is the same user that is being requested"


So, AIs are overeager junior developers at best, and not the magical programmer replacements they are advertised as.

Let's split the difference and call them "magical overeager junior developer replacements".

On a shorter timeline than you'd think, working with these tools won't look anything like this.

You'll be prompting, evaluating, and iterating on entirely finished pieces of software, and you'll be able to see multiple attempts at each solve at once - none of this deep-in-the-weeds bug-fixing stuff.

We're rapidly approaching a world where a lot of software will be made without an engineer hire at all - maybe not the hardest, most complex, or most novel software, but a lot of software that previously required a team of 3-15 won't have a single dev.

My current estimate is mid 2026


My current estimate is 2030, because we can barely get a JS/TS application to compile after a year of dependency updates.

Our current popular stack is quicksand.

Unless we're talking about .NET Core, Java, Django, and other stable platforms.


> So, AIs are overeager junior developers at best, and not the magical programmer replacements they are advertised as.

This may be a quick quip or a rant. But the things we say have a way of reinforcing how we think. So I suggest refining until what we say cuts to the core of the matter. The claim above is a false dichotomy. Let's put aside advertisements and hype. Trying to map between AI capabilities and human ones is complicated. There is high quality writing on this to be found. I recommend reading literature reviews on evals.


[flagged]


Don’t be a dismissive dick; that’s not appropriate for this forum. The above post is clearly trying to engage thoughtfully and offers genuinely good advice.

The above post produces some vague philosophical statements and equally vague "just google it" claims.

I’m thinking you might be a kind of person that requires very direct feedback. Your flagged comment was unkind and unhelpful. Your follow-up response seems to suggest that you were justified in being rude?

You also mischaracterize my comment two levels up. It didn’t wave you away by saying “just google it”. It said — perhaps not directly enough — that your comment was off track and gave you some ideas to consider and directions to explore.


> There is high quality writing on this to be found. I recommend reading literature reviews on evals.

This is, quite literally, "just google it".

And yes, I prefer direct feedback, not vague philosophical and pseudo-philosophical statements and vague references. I'm sure there's high quality writing to be found on this, too.


We have very different ideas of what "literal" means. You _interpreted_ what I wrote as "just Google it". I didn't say those words verbatim _nor_ do I mean that. Use a search engine if you want to find some high-quality papers. Or use Google Scholar. Or go straight to Arxiv. Or ask people on a forum.

> not vague philosophical and pseudo-philosophical statements and vague references

If you stop being so uncharitable, more people might be inclined to engage you. Try to interpret what I wrote as constructive criticism.

Shall we get back to the object level? You wrote:

> AIs are overeager junior developers at best

Again, I'm saying this isn't a good framing. I'm asking you to consider you might be wrong. You don't need to hunker down. You don't need to counter-attack. Instead, you could do more reading and research.


> We have very different ideas of what "literal" means. You _interpreted_ what I wrote as "just Google it". I didn't say those words verbatim _nor_ do I mean that. Use a search engine if you want to find some high-quality papers. Or use Google Scholar. Or go straight to Arxiv. Or ask people on a forum.

Aka "I will make some vague references to some literature, go Google it"

> Instead, you could do more reading and research.

Instead of a vague "just google it" and vague ad hominems, you could actually provide constructive feedback.


The grandparent is talking about how to control cost by focusing the tool. My response was to a comment about how that takes too much thinking.

If you give a junior an overly broad prompt, they are going to have to do a ton of searching and reading to find out what they need to do. If you give them specific instructions, including files, they are more likely to get it right.

I never said they were replacements. At best, they're tools that are incredibly effective when used on the correct type of problem with the right type of prompt.


> If you give a junior an overly broad prompt, they are going to have to do a ton of

> they're tools that are incredibly effective when used on the correct type of problem with the right type of prompt.

So, a junior developer who has to be told exactly what to do.

As for the "correct type of problem with the right type of prompt", what exactly are those?


As of April 2025. The pace is so fast that it will overtake seniors within years maybe months.

That's been said since at least 2021 (the release date for GitHub Copilot). I think you're overestimating the pace.

overtake ceo by 2026

I have been quite skeptical of using AI tools, and my experiences using them for developing software have been frustrating, but power tools usually come with a learning curve, while a "good product" with a clean, simplified interface often results in reduced capability.

Vim, Emacs, and Excel are obvious power tools that may require you to think, but they often produce unrivalled productivity for power users.

So I don't think the verdict that the product has a bad UI is fair. Natural language interfaces are such a step up from old-school APIs with countless flags and parameters.


Mh. Like, I'm deeply impressed by what these AI assistants can do by now. But the list in the parent comment is very similar to my mental checklist for pair-programming / pair-admin'ing with less experienced people.

I guess "context length" in AIs is what I intuitively tracked with people already. It can be a struggle to connect the Zabbix alert, the ticket and the situation on the system already, even if you don't track down all the zabbix code and scripts. And then we throw in Ansible configuring the thing, and then the business requriements by more, or less controlled dev-teams. And then you realize dev is controlled by impossible sales-terms.

These are scope -- or I guess context -- expansions that cause people to struggle.


It's fundamentally hard. If you have an easy solution, you can go make a easy few billion dollars.

> Never edit files manually during a session (that'll bust cache). THIS INCLUDES LINT

Yesterday I gave up and disabled my format-on-save config within VSCode. It was burning way too many tokens with unnecessary file reads after failed diffs. The LLMs still have a decent number of failed diffs, but it helps a lot.


GitHub copilot follows your context perfectly. I don't have to tell it anything about files. I tried this initially and it just screwed up the results.

> GitHub copilot follows your context perfectly. I don't have to tell it anything about files. I tried this initially and it just screwed up the results.

Just to make sure we're on the same page. There are two things in play. First, a language model's ability to know what file you are referring to. Second, an assistant's ability to make sure the right file is in the context window. In your experience, how does Claude Code compare to Copilot w.r.t (1) and (2)?

