
> When is a language untyped

There's the formal definition of type and there's the colloquial definition of type. A type in the context of the article means a formal type: it is a syntactic classifier that is part of the static semantics of the language. The properties you describe (string/number exceptions) are part of the dynamics of the language, which is a separate concept. Dynamically typed languages quite often use boxed values with attached runtime tags, which are colloquially referred to as "types" even though they have nothing to do with static types.

If you read the article in the context of "type" meaning "static type" it will make more sense.



Does that mean I can search and replace every occurrence of "type" in the article with "static type"?

Are you sure dynamic types have nothing to do with static types? What about compilers for dynamic languages which reason about type, like that if the program calls (symbol-name x), it is inferred that x must hold a string, so that a subsequent (cos x) is inconsistent? Is such a compiler reasoning about something which has nothing to do with static type?


> What about compilers for dynamic languages which reason about type, like that if the program calls (symbol-name x), it is inferred that x must hold a string, so that a subsequent (cos x) is inconsistent?

That's irrelevant; if the program containing (symbol-name x) and (cos x) works in an interpreter but not in the compiler then the compiler is broken. Note that throwing-exceptions/triggering-errors/etc. is "working", since it's a valid form of control flow which the program could intercept and carry on. Aborting immediately (like a segfault) isn't "working". Triggering undefined behaviour also isn't "working", but since most of these languages lack any real defined behaviour, it's assumed that "whatever the official interpreter does" is correct.
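The "throwing an exception is working" point can be seen in a minimal sketch (Python standing in for the Lisp example; the function name `use` is illustrative, not from any real system):

```python
import math

def use(x):
    """Mirror of the (cos x) example: a type mismatch at run time
    is an ordinary, catchable exception, not an abort."""
    try:
        return math.cos(x)      # raises TypeError unless x is numeric
    except TypeError:
        return "recovered"      # the program intercepts it and carries on
```

`use(0.0)` returns 1.0, while `use("hello")` returns "recovered"; either way the program keeps running, which is the sense of "working" meant above.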

Usually, these compilers completely acknowledge that their behaviour is incorrect WRT the interpreter. They embrace this and define a new, static language which just-so-happens to share a lot of syntax and semantics with that dynamic language. For example, HipHop for PHP could not compile PHP; it could only compile a static language which looks like PHP. Likewise for ShedSkin and Python. PyPy even gives its static language a name: RPython.

Hence, these compilers don't show that runtime tags and static types are the same/similar. Instead, they show that sometimes programmers without static types will use tags to do some things that static types are often used for.


> if the program containing (symbol-name x) and (cos x) works in an interpreter but not in the compiler then the compiler is broken.

The compiler isn't broken.

The interpreter throws an exception, which meets your own definition of "works".

The compiler gives an opinion "x has an inconsistent type". In this manner, the code also "works" in the compiler.

You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.
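As a hedged sketch of the kind of "opinion" being described (a toy pass, not any real compiler's algorithm): each use of x demands a tag, and the pass reports a conflict without refusing to run anything.

```python
def check_uses(uses):
    """Toy flow-insensitive pass: (symbol-name x) demands 'string',
    (cos x) demands 'number'; conflicting demands yield an opinion."""
    inferred = None
    for demanded in uses:
        if inferred is None:
            inferred = demanded      # first use fixes the guess
        elif inferred != demanded:
            return f"x has an inconsistent type: {inferred} vs {demanded}"
    return "ok"
```

`check_uses(["string", "number"])` reports the inconsistency; `check_uses(["string", "string"])` returns "ok". In neither case is the program prevented from running.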

Compilers for dynamic languages do not preserve all interpreted behaviors. This is usually explicitly rejected as a goal: users must accept certain conventions if they want code to behave the same in all situations.

For instance, some Common Lisp implementations re-expand the same macros during evaluation, which is friendly toward developers when they are modifying macros. But in compilation, macros are expanded once.

Usually these compilation-interpretation discrepancies are obscure; we are not talking about obscure features here. It's a basic tenet of type checking in a compiler that it will flag things that will throw type-related exceptions at run time. You cannot say that type checking compilers for dynamic languages are simply not allowed because type mismatches are well-defined behavior; that's simply outlandish.


> The compiler gives an opinion "x has an inconsistent type".

> You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.

In this example, the compiler cannot give "x" the static type of "string" and also allow the program to handle errors when "x" is used in places where strings aren't allowed. If the static types don't match, the result is undefined; that's what static typing means.

If the compiler causes an exception to be thrown, then either:

- That's just an implementation-specific quirk, which just-so-happens to be the way this compiler behaves when it hits this undefined case; it's not part of any spec/documentation and useful only to hackers trying to exploit the system.

- We accept that the compiler didn't assign "x" the static type of "string" after all; it actually assigned it "string + exception + ..." (which may be the "any" type of the dynamic language, or some sub-set of it)


The situation isn't one of static typing, but of static type checking. Check (parent (parent (parent yourpost))) where it says

> What about compilers for dynamic languages ...

A compiler for a dynamic language that checks types does not bring about static typing. The language in fact remains dynamic.

What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.

The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.

Static typing indeed means that we use the result of the static analysis to remove all traces of type from the program, and only run it when its type information is complete and free of conflicts.

The dynamic language optimizer can in fact take advantage of its findings to eliminate run-time type checks where it is safe to do so.
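A minimal sketch of what that elimination amounts to (illustrative names, not any particular implementation's output):

```python
import math

def cos_generic(x):
    # General entry point: a run-time tag check guards the call.
    if not isinstance(x, (int, float)):
        raise TypeError("cos: number required")
    return math.cos(x)

def cos_specialised(x):
    # Emitted where analysis has already proved x is numeric:
    # the tag check is gone, behaviour on that path is unchanged.
    return math.cos(x)
```

On inputs the analysis covers, the two agree; the specialised version simply skips the redundant check.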


> What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.

> The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.

What is a "conflicting type"? In the dynamic language, there is only one static type (any = int + string + float + (any -> any) + array(any) + ...), so there can't be any conflicts.

It's fine to have a compiler try to specialise the types of variables beyond that, eg. narrowing-down the type of "x" in the example to "string + float", based on how it's used (or just leave it as "any").

A "conflict" would imply that a variable is used in ways that don't match the inferred type; but the inference is based on how it's used.


PS: Static type inference, like your example of (symbol-name x) and (cos x), has been going on for at least 40 years, since ML. The article mentions OCaml, which is a descendant of ML and hence has full type inference (you can leave out all annotations and the type-checker will figure them out). Depending on the type system, this may or may not be possible. For example, types in Agda, Coq and Idris are full programs, so inferring them is undecidable.

Since dynamic languages only have one static type, static type inference is trivial. So is type-safety: there's no way to cast a value of one type to an incompatible type, since there are no other types.

Inferring tags doesn't work in general: when I say `print x` it doesn't mean that `x` must be a string; I might want Python to raise a "TypeError" exception, so that I can jump to the handler where all the real work takes place. Of course that's bad practice, but exceptions are only "errors" by convention; in the same way that False is only a failure by convention.
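The pattern described above, as a hedged Python 3 sketch (`print` itself accepts anything in Python 3, so string concatenation stands in for the tag-sensitive operation; `process` is a made-up name):

```python
def process(x):
    try:
        _ = "" + x          # succeeds only when x is a str
        return "string path"
    except TypeError:
        # Deliberately using the exception as control flow:
        # "the handler where all the real work takes place".
        return "handler path"
```

A tag-inference pass that concluded "x must be a string" from the concatenation would be wrong about this program's intent: the non-string path is the one that matters.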


Are you sure dynamic types have nothing to do with static types?

"Static type" does not really mean "class of values." That is a common use of static types, but they can also be used for things like tracking what resources are available/released or describing what exceptions a function might raise.


Types are usually understood as proofs: in this case, a proof about whether the tag of x is suitable for passing to cos without causing an error. (If you like, read "dynamic type" for "tag".)

So the proof and the tag are related in that one is about the other, but that does not mean they are the same thing.


Indeed, just like a basket with three oranges in it, the numeral 3 written on a piece of paper, and a die with the three-dot face turned up, are not the same thing. They just potentially serve as representations of the same thing. Of course something at compile time isn't the same thing as at run time. They exist at different times in different spaces. Compile time might be on an x86 build machine, and run-time on an embedded ARM device.


I didn't say that types are tags at a different time, I said they are different things.

And only concrete types are about representation. Abstract, universal, existential and uninhabited types are all useful and meaningful concepts without any particular relation to representation.


But a concrete value, like a symbol or string, has a concrete type, and that better agree with what its tag information says (if it has it).

Sure, expressions like "for all x: function from x to x" may not have concrete instantiations: there is no thing that is type tagged that way. It could be, of course; there is no limitation on what constitutes a type tag. Any compile time information, whether it is a universally quantified expression or whatever, can be represented at run-time also. In practice, that isn't done.


Quite sure, in a dynamically typed language there is only a single static type inhabited by all values. If the compiler is reasoning about classes of values at runtime then it does indeed have nothing to do with static typing.


In the run time image for a dynamic language, there may be a "cell" or "value" representation which holds anything. That doesn't mean this is the "type" of that language.

It may certainly be a static type in another language; that language which is used to implement an interpreter or virtual machine for the dynamic language. E.g. a Lisp interpreter in C has some "value" typedef which can hold a number, string, symbol, ...

That type is a C type, though, not a Lisp type.

A language can't be confused with the one it compiles into, or which is used to write an interpreter for it.

Ultimately, everything is interpreted by a processor, through some machine language representation which uses typeless words.


I can't even begin to parse what you're trying to say or how it relates to my last comment.


If the compiler is applying reasoning before the program is run then it is by definition performing static type checking.


But what it's reasoning about coincides with the dynamic type which will exist at run time, so that means we cannot say the two have nothing to do with each other. It's the same knowledge, just known at different times, right?

Just because we have stuffed some knowledge into tags that exist at run time (and chosen suitable machine representations to make that possible) doesn't mean it's a different class of knowledge.


Types are propositions for which the text of your program is the proof. Type theory arose as an alternative to set theory in order to deal with Russell's paradox[0]. Types have been around for a lot longer than computers have. They have nothing at all to do with a running program.

[0] https://en.wikipedia.org/wiki/Russell%27s_paradox


What is missing from the above articulation of a view is that the run-time state of the program is also text. (It's made up of the symbols "0" and "1", at one level.) There can be propositions about that text.


There can be propositions about that text.

Yes, however, those propositions are not types.


Even those propositions, which are like, for example, ∀x: x -> x? (For all x, a function from x to x)?

Who decides these things? Some "world type authority"?


Such a proposition pertains to a mathematical function. A running computer program does not contain mathematical functions, merely representations of them.

It is akin to the difference between the theory of gravity and a falling rock. You can describe the rock with all manner of equations, but you will not find any equations inside the rock.


I think I almost understand your position now.

A running computer program does not contain mathematical functions, merely representations of them.

A printed computer program text does contain mathematical functions, and not merely representations of them.

Because of this difference, logical propositions about structures in the program text are types, whereas those about structures in the running program are not.

Furthermore, this has something to do with gravity and rocks: the running program is analogous to a falling rock, but the non-running program is like ... something else.

Somehow, compilers are exempt: they are running programs, which work with types. The source of this exemption is that they operate on other programs; a proposition inside a compiler is a representation about the program being compiled, not about anything in the compiler. Running programs may not entertain propositions about their own typing, because typing is purely prior to run-time. Type does not exist at run-time, period. The propositions of run-time typing are not type, but some kind of system of data representations that mimics type.



