Hacker News

You got my curiosity going, so I ran the experiment again. It turns out my example is still very much valid: it's still not able to do basic work with the Gio package, despite what I consider great documentation from the Gio folks.

In this case, the LLM is struggling to match the package's API versions.

https://chat.openai.com/share/d5e39e52-9140-4f74-9fef-04fc34...

I cut this one short because I'm not interested in going forty questions deep again. I have better things to do, and I've already seen the outcome.

It's clear the LLM is working as intended, which is neither for programming, nor for problem solving. It's giving the next most probable response. That's truly great for repeating and reintegrating existing solutions. But repeating is not what's needed for the "kill all humans" scenario. That scenario requires outreasoning humans. Processing power is not remotely the issue. I've never seen a theoretical solution that purports to give a machine the ability to actually reason, and because of that, I don't believe that ChatGPT, or any other LLM, even begins to approach "intelligence."

If you are having success with it, I think it's because you're helping it more than you'd like to admit, or your solutions might not be as novel as you think. Probably the former, though I must admit I've wrongly thought I was working on new stuff many times before. I think that's more common than we realize. Or at least I'd like to think so, so I can feel good about myself.

In any case, I won't be hiring ChatGPT or parrots, and for the same reason. I need developers who can follow directions, solve problems in better ways than what is readily available, and do it faster than my competitors. And I'm not even in some cutting-edge business; I'm just building integrations, specifically in the manufacturing and distribution industry. Even in something so mundane, I've hardly found ChatGPT to be truly useful for much more than a GitHub search tool, and for suggesting cocktails when my liquor selection runs scarce.

Worth $20? Sure. But it's certainly not scaring me.



Again, nobody is saying that GPT-4 is scary or that GPT-4 can kill all humans. I'm saying it contains the skills necessary to write programs in novel domains.

Just today, I asked it to translate a demo to OpenGL 4. It did this using a matrix library it had almost certainly not seen before, since I wrote it myself a few years ago and there's maybe one file using it on GitHub, and its code mostly worked fine after I gave it a copy of the documentation to read. (I had to do some debugging, but its mistakes were, in hindsight, very understandable. Stuff like not knowing that the matrix type is row-major, which is admittedly an unusual layout in OpenGL code, where column-major is the convention.)
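For what it's worth, the row-major trap is easy to show in isolation. Here's a little Go sketch (the Mat4 type and its methods are my own illustration, not the actual library in question) of how the same sixteen floats read back differently under the two conventions:

```go
package main

import "fmt"

// Mat4 is a 4x4 matrix stored as a flat array. Under a row-major
// convention, element (row, col) lives at index row*4+col; under
// column-major (what OpenGL's glUniformMatrix4fv expects when
// transpose is GL_FALSE), it lives at col*4+row. Feeding row-major
// bytes to a column-major consumer silently transposes the matrix.
type Mat4 [16]float32

func (m Mat4) AtRowMajor(row, col int) float32 { return m[row*4+col] }
func (m Mat4) AtColMajor(row, col int) float32 { return m[col*4+row] }

// Transpose converts between the two layouts.
func (m Mat4) Transpose() Mat4 {
	var t Mat4
	for r := 0; r < 4; r++ {
		for c := 0; c < 4; c++ {
			t[c*4+r] = m[r*4+c]
		}
	}
	return t
}

func main() {
	// A translation by (10, 20, 30), written row-major: the
	// translation components sit at the end of each row.
	m := Mat4{
		1, 0, 0, 10,
		0, 1, 0, 20,
		0, 0, 1, 30,
		0, 0, 0, 1,
	}
	fmt.Println(m.AtRowMajor(0, 3))             // 10: the intended read
	fmt.Println(m.AtColMajor(0, 3))             // 0: same bytes, wrong convention
	fmt.Println(m.Transpose().AtColMajor(0, 3)) // 10: correct again after converting
}
```

The practical fix is either to transpose before uploading, as above, or to pass GL_TRUE for the transpose argument of glUniformMatrix4fv — exactly the kind of detail a model can't infer without being told the library's convention.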

And I had to ask for some specific corrections after, sure, but its responses were perfectly correct. ("This shader doesn't seem to be handling alpha blending with discard." "Oh yeah sure here's how you do that.")

Yes, if you keep asking it for corrections its performance will continue to degrade. This is a known problem. Yes, it isn't good at incremental problem-solving. Again, known issue. And yes - it's not yet human level. Nobody's saying it's a replacement for a human programmer, as it stands.

It seems like you're saying "it can't think as well as a human, therefore it can't think." There are skill levels below human but above parrot! And I'd argue this thing as it stands is already much closer to human than parrot.

That said, looking at your challenge:

> create a graphical application in Go using the Gio package. It should have two buttons down the left column and a main view area where graphics, text and input can be received from the user.

I couldn't do that. What the hell does "a main view area where graphics, text and input can be received from the user" mean?? Do you want a text field? With mouse-paint support? What input is there beyond graphics and text? Like, GPT-4 is trained to just do what you tell it; it can't incrementally gather additional requirements, so you actually do have to be really specific about what you ask for. If you offered me that as a spec, I wouldn't sign it. If my company had already signed it, I'd be desperately trying to phone your designer to ask what they were smoking. This is just not a good requirements document. Garbage in...

Look, prompting an LLM is a skill. In this year 2024, we are not yet at the point where you can just throw vague requirements at an LLM and eventually get constructive dialogue culminating in working code. But I just think it's silly to say "this model doesn't have the complete skillset of a full-stack developer and designer, therefore it has no skill at all."


You're right. I had a typo.

"a main view area with graphics, text and where input can be received from the user."

But notice how you can look at that and try to reason through what I meant, realize it's not clear, and respond to that.

The above is reasoning.

"The capacity for logical, rational, and analytic thought; intelligence."

ChatGPT can't do that, and I don't mean that it can't do it well yet. I mean it's literally not doing that. There is nothing to advance or to expect advancement in. The "AI" doesn't exist. LLMs are simply not AIs, and they are not on the path to AI, although they may well play a major role if a theory of AI were ever to be conceived.


It can do it; it just isn't trained for it. This one is actually kind of depressing: as far as I'm able to tell, this sort of reasoning just doesn't appear in the finetuning set they feed it to make instruct models. The way we train these things is honestly terrible in so many ways. Anyway, that's why I think the current mode of "well, I'll give it one attempt, charge blindly ahead, ignore any mistakes I made, and then claim success" is purely temporary.

In my experience, if you feed it a metric ton of explicit guidance, you can occasionally get it into a mode where it reasons incrementally and actually notices when things are unclear. It's in there; it's just not foregrounded. They're trained not to ask questions, you see.


It's an interesting take, but I don't see that it's in the technology. If you have a way to make this tech do that, I'd say you're in a class of your own and you've got something no one else has. You should press it forward.



