GoatInGrey · 2026-02-16T18:07:21 1771265241

Me:

I want to get my car's transmission fixed, the mechanic shop is 50 meters away. Would it be better to walk or to drive to a location 50 meters away from my front door?

-

Opus 4.6:

<thinking> The user is asking whether to walk or drive 50 meters to a mechanic shop, but there's a key nuance here - their transmission needs fixing, so driving might not be wise.

Done </thinking>

Walk, absolutely. Driving a car with a compromised transmission even 50 meters risks making the problem worse and turning a repair into a more expensive one. Plus, 50 meters is barely a minute on foot.

GoatInGrey · 2026-02-16T17:25:12 1771262712

No, what the other commenter described is narrowly scoped delegation to LLMs paired with manual review (which sounds dreadfully soul-sucking to me), not wholesale "write feature X, write the unit tests, and review the implementation for me". The latter is vibe-coding.

krastanov · 2026-02-16T17:53:12 1771264392

Reviewing a quick translation of a test to a benchmark (or other menial coding tasks) is way less soul-sucking than doing the menial coding by yourself. Boring soul-sucking tasks are an important thankless part of OSS maintenance.

I concur it is different from what you call vibecoding.

unshavedyak · 2026-02-16T17:50:24 1771264224

Sidenote, i do that frequently. I also do varying levels of review, ie more/less vibe[1]. It is soul sucking to me.

Despite being soul sucking, I do it because A: It lets me achieve goals despite lacking energy/time for projects that don't require the level of commitment or care that i provide professionally. B: it reduces how much RSI i experience. Typing is a serious concern for me these days.

To mitigate the soul sucking i've been side projecting better review tools. Which frankly i could use for work anyway, as reviewing PRs from humans could be better too. Also in line with review tools, i think a lot of soul sucking is having to provide specificity, so i hope to be able to integrate LLMs into the review tool and speak more naturally to it. Eg i believe some IDEs (vscode? no idea) can let Claude/etc see the cursor, so you can say "this code looks incorrect" without needing to be extremely specific. A suite of tooling that improves this code sharing to Claude/etc would also reduce the inane specificity that seems to be required to make LLMs even remotely reliable for me.

[1]: though we don't seem to have a term for varying amounts of vibe. Some people consider vibe to be 100% complete ignorance of the architecture/code being built. In which case imo nothing i do is vibe, which is absurd to me but i digress.

GoatInGrey · 2026-02-16T17:20:10 1771262410

Even on codebases within the half-year age group, these LLMs often do perform nasty (read: ungodly verbose) implementations that become a maintainability nightmare. Even for the LLMs that wrote it all in the first place. I know this because we've had a steady trickle of clients and prospects expressing "challenges around maintainability and scalability" as they move toward "production readiness". Of course, they then ask if we can implement "better performing coding agents". As if improved harnessing or similar guardrails can solve what is, in my view, a deeper problem.

The practical and opportunistic response is to tell them "Tough cookies" and watch the problems steadily compound into more lucrative revenue opportunities for us. I really have no remorse for these people. Because half of them were explicitly warned against this approach upfront but were psychologically incapable of adjusting expectations or delaying LLM deployment until the technology proved itself. If you've ever had your professional opinion dismissed by the same people regarding you as the SME, you understand my pain.

I suppose I'm just venting now. While we are now extracting money from the dumbassery, the client entitlement and management of their emotions that often comes with putting out these fires never makes for a good time.

buschleague · 2026-02-16T20:12:37 1771272757

This is exactly why enforcement needs to be architectural. The "challenges around maintainability and scalability" your clients hit exist because their AI workflows had zero structural constraints. The output quality problem isn't the model, it's the lack of workflow infrastructure around it.

datsci_est_2015 · 2026-02-16T20:44:46 1771274686

Is this not just “build a better prompt” in more words?

At what point do we realize that the best way to prompt is with formal language? I.e. a programming language?

semiquaver · 2026-02-16T21:01:23 1771275683

No, the suite of linters, test suite and documentation in your codebase cannot be equated to “a better prompt” except in the sense that all feedback of any kind is part of what the model uses to make decisions about how to act.

datsci_est_2015 · 2026-02-16T21:25:34 1771277134

A properly set up and maintained codebase is the core duty of a software engineer. Sounds like the great-grandparent comment’s client needed a software engineer.

oblio · 2026-02-16T23:34:18 1771284858

What if LLMs, at the end of the day, are machines, so for now generally dumber than humans, and the best they can provide are at most statistically median implementations (and if 80% of code out there is crap, the median will be low)?

Now that's a scary thought that basically goes against "1 trillion dollars can't be wrong".

Now, LLMs are probably great range extenders, but they're not wonder weapons.

lossyalgo · 2026-02-17T00:56:47 1771289807

Also who is to say what is actually crap? Writing great code is completely dependent on context. An AI could exclusively be trained on the most beautiful and clean code in the world, yet if it chooses the wrong paradigm in the wrong context, it doesn't matter how beautiful that code is - it's still gonna be totally broken code.

GoatInGrey · 2026-02-15T21:33:35 1771191215

Functioning antitrust is really all it takes. The last significant US antitrust action was on AT&T in 1982.

GoatInGrey · 2026-02-15T21:11:06 1771189866

For vibe-writing, the vibes aren't even that good!

GoatInGrey · 2026-02-13T23:43:49 1771026229

It's the evergreen tradeoff between the short and long terms. Do I get the nugget of information I need right now but lose in a month, or do I spend the time and energy that leads to deeper understanding and years-long retention of the knowledge?

There is something about our biology that makes us learn better when we struggle. There are many concepts on this dynamic: generation effect, testing effect, spacing effect, desirable difficulties, productive failure...it all converges on the same phenomenon where the easier it is to learn, the worse we learn.

Take K-12 for instance. As computing technology is further and further integrated into education, cognitive performance decreases in a near-linear relationship. Gen Z is famously the first generation to perform worse in every cognitive measure than previous generations, for as long as we've been recording since the 19th century. An uncomfortable truth emerging from studies on electronics usage in schools is that it isn't just the phones driving this. It's more so the Duolingo effect of software overall emulating the sensation of learning without actually changing the brain state. Because the software that actually challenges you is not as engaging or enjoyable.

How you learn, and your ability to parse, infer, and derive meaning from large bodies of information, is increasingly a differentiator in both the personal and professional worlds. It's even more so the case when many of your peers are now learning through LLM-generated summaries averaging just 300 words, perhaps skimming outputs around 1,000 words in length for "important information". The immediate benefits are obvious, but the cost of outsourcing that cognitive work gets lost in the convenience.

Because remember, this isn't just about your ability to recall specific regex, follow a syntax convention, or how much code you ship in an hour. Your brain needs exercise, and deep learning is one of the most reliable ways to get it. Doubly true if you're not even writing your own class names.

What I am speaking to is not far away or hypothetical, either. Because as of 2023, one in four young adults in the United States is functionally illiterate.

https://www.the74million.org/article/many-young-adults-barel...

zozbot234 · 2026-02-14T00:25:40 1771028740

Effective learning and memorizing is actually at the narrow edge of struggling: it's neither "too easy" nor "too hard and painful". SRS systems do a very good job of tuning this: by the time a question comes back to you it will feel difficult, but you'll be able to recall the information and answer it with some effort. It's a matter of recognizing this feeling and acknowledging it as "the right kind of effort" as opposed to a hopeless task.

If you ask the AI "please quiz me about the proper understanding of issues x y z and tell me if I got it all right. iterate for anything I get seriously wrong, then provide a summary at the end and generate SRS cards for me to train on" it will generally do a remarkably good job at that.

r0b05 · 2026-02-14T16:01:45 1771084905

I agree with all of this. The brain needs exercise, just like the body.

GoatInGrey · 2026-02-10T17:44:46 1770745486

In my opinion, it's this, though I think it's a second-order effect. I believe that the issue isn't so much that women are working, but rather that there is a shortage of household labor. This labor pool is what was traditionally used for childcare needs. When you pair that labor shortage with (terrible) modern parenting standards, there just isn't enough time to raise kids without becoming a zombie.

https://pbs.twimg.com/media/G-p2hvebQAEkEBg?format=jpg&name=...

Edit: To be clear, I think there are multiple contributing factors. It's just that, in my view, the time/labor shortage is the core of the issue. Everything else feeds into it in some way. The factors eventually start stacking and problems that contribute to the time issue get exacerbated by their own contributing factors.

Economic pressures, for instance. Bad housing economics means couples work maximum hours to afford daily expenses, decreasing available household labor. It also fractures extended family systems when people have to relocate for cheaper housing or better jobs, eliminating the traditional labor-pooling arrangements for childrearing. Generally poor median household economics keep parents in constant anxiety too, which then requires time to be spent on coping routines.

Social atomization has further taken away the kind of pooled childcare labor that used to absorb overflow. Media has displaced churches, bars, parks, and bowling alleys with private screen time, shrinking social circles with scarce opportunities to rebuild them. Car-based infrastructure further reduces local community interaction and subtly dehumanizes neighbors into obstacles who steal parking and slow you down. Remote work and online shopping accelerate this deterioration. The result of all of this? Parents who already don't have extended family also don't have friends, neighbors, or community to cover childcare needs. The sort of "Hang out at the neighbor's house while I go to my book club meeting" scenario has largely gone extinct because of this.

Even if a couple does better than the average bear in these areas, and they have options, ambient paranoia bottlenecks their outsourcing of childcare anyway. Our media environment has normalized constant fear. Fear that every blade of grass conceals a potential predator, so every adult is regarded as a serious risk to your kid(s). This compounds further because it's gotten to the point where children (and teenagers) can't play outside or otherwise exist independently without supervision. This increases the time parents must spend on daily childcare needs. So not only can they not decrease the time spent, but they now have to spend even more because of it.

On top of all of this, the fraying social fabric creates an effect similar to cellular breakdown. Where those who become disconnected from the larger biological system stop acting for the collective benefit and further prioritize the self, becoming cancerous. This leads to growing numbers of extremist, anti-social individuals with poor mental health. Individuals who both compound the scarcity and isolation of parents, and justify their media-sourced fear of other adults. This is an example of the contributing factors to the contributing factors.

You get the idea.

projektfu · 2026-02-10T18:26:17 1770747977

While I think I see myself in the chart, I am not exactly sure what it says, especially the "Controlling for children under 5" and the time.

This seems like a good place for a study using matched subjects. Do 23 year olds of a certain generation spend more time with 7 year old children than another generation? Etc., etc., then you can calculate the baseline and excess for each generation.

GoatInGrey · 2026-02-07T16:48:55 1770482935

To be blunt, I think it's a form of mania that drives someone to reject human-written code in favor of LLM-generated code. Every time I read writing from this perspective that exceeds a paragraph, I quickly realize the article itself was written by an LLM. When they automate this much writing, it makes me wonder how much of their own reading they automate away too.

The below captures this perfectly. The author is trying to explain that vibe-coding their own frameworks lets them actually "understand" the code, while not noticing that the LLM-generated text they used to make this point is talking about cutting and sewing bricks.

> But I can do all of this with the experience on my back of having laid the bricks, spread the mortar, cut and sewn for twenty years. If I don’t like something, I can go in, understand it and fix it as I please, instructing once and for all my setup to do what I want next time.

Dfiesl · 2026-02-07T17:33:51 1770485631

I think the bit you quoted is a tie in with an earlier bit:

“ I can be the architect without the wearing act of laying every single brick and spreading the mortar. I can design the dress without the act of cutting and sewing each individual piece of fabric”

To me, this text doesn’t read as being entirely written by an LLM, there is definitely an air of LLM about it though, so maybe the first draft was.

daveguy · 2026-02-08T01:20:25 1770513625

That additional jumbled, incoherent mixing of metaphors doesn't make it any better.

potsandpans · 2026-02-07T18:44:45 1770489885

> Every time I read writing from this perspective that exceeds a paragraph, I quickly realize the article itself was written by an LLM.

Likewise, I think that this mentality is a modern day conversion disorder.

"Everything I don't agree with is an LLM, I have no proof it just feels good!"

cortesoft · 2026-02-07T20:34:19 1770496459

It’s the ‘woke’ of the computer world.

GeoAtreides · 2026-02-07T18:33:24 1770489204

> it's a form of mania

Correct. History is rife with examples of manias taking hold of societies. I recommend "Memoirs of Extraordinary Popular Delusions and the Madness of Crowds" by Charles Mackay[1], it's an absolutely fascinating book.

[1]https://www.gutenberg.org/ebooks/24518

lstodd · 2026-02-08T00:36:28 1770510988

> absolutely fascinating book

Indeed. Thank you for posting this link.

GoatInGrey · 2026-02-06T23:49:07 1770421747

> most people don't want to use

I'm not sure if this was intended or not, but this is a common NIMBY refrain. The argument of "This thing being advocated for that I'm fighting against isn't something people want anyway". And like walkable neighborhood architecture, extremely few Americans have access to light rail. Let alone light rail that doesn't have to yield to car traffic.

Regardless, the cost arguments fall apart once you take the total cost society pays for each system instead of only what the government pays. Because when you get the sum of road construction & maintenance, car acquisition, car maintenance, insurance, and parking, it dwarfs the cost of the local transit system. Break it down on a per-consumer basis and it gets even uglier. New York City is a good example to dive into, especially since it's the typical punching bag for "out-of-control" budgets.

Quick napkin math pins the annual MTA cost at $32-$33 billion and the total cost of the car system between $25 and $44 billion per year. Since the former serves somewhere around 5.5 million riders, and the latter only about 2 million, the MTA costs $5,300-6,600 per user annually where the car system costs $12,000–$22,500 per user annually.
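For anyone who wants to check the napkin math, here is a rough sketch of the per-user division. The totals come from the figures above; the 5.0 to 6.0 million band around "somewhere around 5.5 million" riders and the flat 2.0 million driver count are my assumptions, and `per_user_range` is just an illustrative helper name.

```python
def per_user_range(cost_low, cost_high, users_low, users_high):
    """Return (lowest, highest) annual cost per user.

    The lowest per-user cost pairs the low total with the most users;
    the highest pairs the high total with the fewest users.
    """
    return cost_low / users_high, cost_high / users_low


# MTA: $32-33 billion per year; rider band of 5.0-6.0 million is an assumption.
mta = per_user_range(32e9, 33e9, 5.0e6, 6.0e6)

# Car system: $25-44 billion per year; exactly 2.0 million drivers is an assumption.
car = per_user_range(25e9, 44e9, 2.0e6, 2.0e6)

print(f"MTA: ${mta[0]:,.0f} to ${mta[1]:,.0f} per user per year")
print(f"Car: ${car[0]:,.0f} to ${car[1]:,.0f} per user per year")
```

Under those assumptions the MTA range comes out near the quoted $5,300-6,600, and the car range lands at $12,500 to $22,000, close to the quoted figures, which presumably allow some slack in the driver count.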

scoofy · 2026-02-07T00:21:48 1770423708

You seem to be misunderstanding my point. I am a transit alternatives advocate, and have been my entire adult life.

I'm NOT saying "people don't want to ride trains."

I AM saying "people don't want to ride trains that allow 5% of the riders to smoke cigarettes on enclosed train platforms and in enclosed train cars."

You might say "what? but that's not happening."

In Chicago, yes it is: https://resphealth.org/snuff-out-smoking-on-cta/

People want transit as long as that transit reasonably meets their quality of life standards. The reason why automobiles have been so popular -- even while being wildly more expensive -- is exactly that they allow the user to adjust their travel to their optimal quality of life expectations.

Public transit advocates need to be honest with themselves that anti-social behavioral issues really matter to people. People are willing to pay more to have a more pleasant experience. When a transit system fails to meet that standard, then you'll suddenly find yourself with a transit system that people don't want to use.

xnx · 2026-02-08T16:51:39 1770569499

Cosigning all of this as a Chicago resident. Service is somehow both much worse and more expensive after COVID.

lonjil · 2026-02-07T15:35:18 1770478518

> I AM saying "people don't want to ride trains that allow 5% of the riders to smoke cigarettes on enclosed train platforms and in enclosed train cars."

Just don't allow that then?

> Public transit advocates need to be honest with themselves that anti-social behavioral issues really matter to people. People are willing to pay more to have a more pleasant experience. When a transit system fails to meet that standard, then you'll suddenly find yourself with a transit system that people don't want to use.

"we can't have good transit because a few people who call themselves transit advocates have bad opinions" is very defeatist. Weak-spined politicians find it much easier to just set money on fire than to actually solve problems, so even though most transit advocacy groups in the US emphasize quality and being less wasteful with budgets, your politicians usually prefer the worse options.

scoofy · 2026-02-07T19:38:49 1770493129

>> I AM saying "people don't want to ride trains that allow 5% of the riders to smoke cigarettes on enclosed train platforms and in enclosed train cars."

>Just don't allow that then?

>"we can't have good transit because a few people who call themselves transit advocates have bad opinions" is very defeatist.

My point here is only that this is a hard problem, not a trivial one. When the transit advocates in my area just say "transit should be free" in response to "transit pricing is a complex problem that affects system fragility" and they say "stop hating homeless people" in response to "quality of life concerns matter to keeping the system functional long term" then we're in a bad place, because the non-transit advocates literally want to get rid of the system. The last TWO Muni funding bills in SF failed.

We've built a system that can fail catastrophically, in large part, because transit advocates don't want to deal with the realities of running a functional transit system. This is why I get grumpy when people say "all this work is impressive, but I'd rather have better trains" when it's very clear why Waymo is succeeding as Muni is failing. It is exactly because Muni is mostly disconnected from market forces that we've got to this place, and the "solution" being proposed by most transit advocates is to just completely remove all market forces, which will very obviously be worse in the long run.

GoatInGrey · 2026-02-06T16:50:25 1770396625

I can concur that previous models would say "No, that isn't possible" or "No, that doesn't exist". There was one time where I asked it to update a Go module from version X.XX to version X.YY and it would refuse to do so because version X.YY "didn't exist". This was back with 3.7, if I recall, and to be clear, that version was released before its knowledge cutoff.

I wish I remembered the exact versions involved. I mostly just recall how pissed I was that it was fighting me on changing a single line in my go.mod.

setgree · 2026-02-06T16:53:23 1770396803

alas, 4.5 often hallucinates academic papers or creates false quotes. I think it's better at knowing that coding answers have deterministic output and being firm there.