
I wonder how much of that is because the way such questions are asked is so formulaic that it's possible to "guess" the response from that information.

If you were to ask a truly nonsense question, like "Which is more powerful, a GTX 1070 or an F-16?", what would the response be?



"It's difficult to compare a graphics card like the GTX 1070 and an aircraft like the F-16 as they are designed for very different purposes and operate on completely different scales.

The GTX 1070 is a powerful graphics card that is designed to render high-quality graphics for gaming, video editing, and other visual computing applications. It has impressive processing power and can handle complex graphics and visual effects with ease.

On the other hand, the F-16 is a highly advanced fighter aircraft designed for air-to-air and air-to-ground combat missions. It is a complex machine that requires a highly skilled pilot to operate and is equipped with state-of-the-art avionics, weapons systems, and engines.

In terms of raw power, the F-16 is undoubtedly more powerful than the GTX 1070, with its advanced capabilities in speed, agility, and firepower. However, it's important to note that these are two very different types of machines, and a direct comparison of their power is not really possible."


LLMs keep demonstrating that they have an internal "world model" representation of concepts and the relationships between them.

Meanwhile ostensible "expert" humans continue to regurgitate obvious falsehoods like "it's just a fancy autocomplete" or "you're being fooled by a system that just predicts the most probable next word".


I don't see why "internal model" and "predicts the most probable next word" are mutually exclusive.
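
A toy sketch (PyTorch; everything here is illustrative, not how any production LLM is built) of why the two go together: the next-token probabilities are computed from a learned internal state, so "just predicting the next word" says nothing about how rich that state is.

    import torch
    import torch.nn as nn

    class TinyLM(nn.Module):
        def __init__(self, vocab_size=100, hidden=64):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # stand-in for a transformer
            self.head = nn.Linear(hidden, vocab_size)

        def forward(self, tokens):
            x = self.embed(tokens)
            states, _ = self.rnn(x)   # internal representation of the context so far
            return self.head(states)  # logits over the vocabulary at each position

    model = TinyLM()
    tokens = torch.randint(0, 100, (1, 10))
    logits = model(tokens)
    next_token = logits[0, -1].argmax()  # "the most probable next word"

Whatever structure lives in `states` is the candidate "world model"; the argmax at the end is the "fancy autocomplete". Same network, same forward pass.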


Who said they're mutually exclusive?


The internal "world model" consists of things like "GPUs are faster than CPUs", "military hardware beats non-military things", etc. When you ask a comparison question it categorises the items: for the plane vs. the card you get the military-vs-nonmilitary response, for a computer vs. a GPU you get the CPU-vs-GPU response, or similar. I showed an example of how to reason about this in a post below.

Anyway, while you can call that a "world model", it is very brittle, since it's just a bunch of heuristics. If you reverse engineer the heuristic it uses, it is really easy to find cases where it fails completely. For example, if you ask it to compare a modern integrated GPU to an old dedicated GPU, it will say that the old dedicated GPU is faster even though it is ancient and slow, because when people talk about dedicated vs. integrated GPUs they almost always say that the dedicated GPU is faster.
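
To make that concrete, here is a deliberately crude sketch (pure illustration of the "bag of heuristics" claim, not of how an LLM actually works inside) of category-level rules and the failure mode described above:

    # Category-level rules of thumb that ignore the specific items entirely.
    HEURISTICS = {
        ("dedicated_gpu", "integrated_gpu"): "the dedicated GPU is faster",
        ("military", "consumer"): "the military hardware is more powerful",
        ("gpu", "cpu"): "the GPU is faster for parallel workloads",
    }

    def compare(category_a, category_b):
        for pair, verdict in HEURISTICS.items():
            if {category_a, category_b} == set(pair):
                return verdict
        return "not comparable"

    # Fails exactly as described: an ancient dedicated card "beats" a modern iGPU.
    print(compare("integrated_gpu", "dedicated_gpu"))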


That world model is an emergent phenomenon arising out of training; the models are not explicitly trained to build one.


I’d phrase it the other way around: learning a world model is necessary in order to make such good token predictions.


It doesn't have a model; it is a model, and it's a model of language. It has never interacted with the world, only with text. There are patterns and relationships among words that can describe the real world; that's why we can use it to communicate complex ideas. The problem is that the map is not the territory.


You've never interacted with the world, only with the sense impressions that were triggered in your nervous system. All world models are inferred.


Bad argument. My sense impressions consist of far more than just text input. We don't know whether consciousness consists of one model, multiple models, or something different from a statistical model that makes use of one or more of them. I also don't consist of only a nervous system; I have an endocrine system that has very pronounced effects on my behaviour and my interpretation of input.

There's a giant gap here that people keep glossing over when they treat LLMs as almost-human.


Sense impressions are mediated by fairly simple neuronal firing mechanisms, somewhat different from, but similar in concept to, the wires we use to encode language tokens.

I'm not really trying to make a point here about consciousness, although I was interested in the article - it seems to imply that GPT-3 performs better at some tasks when it has a train of thought / stream of consciousness.

People keep saying that there's a giant gap, but I still don't really know what they think it is, and there seem to be a lot of points of similarity. It's very clear that these large language models are, despite their problems, much more capable than was expected by nearly everyone not very many years ago.


To me, it comes down to whether or not you think the sum total of a human being is language. If your life consists almost entirely of communication-based activities, then it's easy to see why one would think that is the case, but if you spend any appreciable amount of time doing work that deals with physical objects and their interactions with the environment, you find that language is only one tool in the box and isn't capable of replacing the others. That's not even bothering with the metaphysics of consciousness or subjective experience.


> Bad argument. My sense impressions consist of far more than just text input.

Multimodal models are being trained right now using text, images, video, and audio. Eventually you can add data from pressure, heat, and acceleration sensors and motors (a sense of touch). We can add further "senses" on top of that - data from RADAR/LIDAR, magnetometers, multispectrum vision, and radiation sensors if desired.
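
Roughly, the usual recipe looks like the sketch below (PyTorch; all sizes and encoders are made up for illustration): each modality gets its own encoder that projects raw input into the same embedding space the language model already consumes, so sensor readings become just more "tokens" in the sequence.

    import torch
    import torch.nn as nn

    D = 512  # shared embedding width

    text_encoder     = nn.Embedding(32000, D)       # token ids -> embeddings
    image_encoder    = nn.Linear(3 * 224 * 224, D)  # stand-in for a vision backbone
    pressure_encoder = nn.Linear(16, D)             # e.g. 16 pressure/touch sensors

    tokens   = text_encoder(torch.randint(0, 32000, (1, 8)))
    image    = image_encoder(torch.rand(1, 1, 3 * 224 * 224))
    pressure = pressure_encoder(torch.rand(1, 1, 16))

    # One mixed "sentence" of words and sensor readings, fed to a shared model.
    sequence = torch.cat([tokens, image, pressure], dim=1)  # shape (1, 10, 512)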

AI will come to know our world very well.


Why wonder?

"It's not appropriate to compare a graphics card (GTX 1070) to a military fighter jet (F-16) as they are completely different things designed for entirely different purposes.

A GTX 1070 is a high-end graphics card designed for gaming and other graphics-intensive tasks. It has a lot of processing power and can handle complex visual computations quickly and efficiently.

On the other hand, an F-16 is a military fighter jet designed for air-to-air and air-to-ground combat. It is equipped with advanced weapons systems, avionics, and other technologies that allow it to perform a wide range of military operations, including surveillance, reconnaissance, and combat missions.

In short, while a GTX 1070 is a powerful graphics card, it is not designed for the same purposes as an F-16, which is a highly advanced and specialized military aircraft."



