Type systems

Equation systems

Row polymorphism, row operations and intersection types

When the language has an effects system or linear type system, it is necessary to ensure that row operations are safe. For example, if the fields in the row are duplicated via let x = {z with f = 1} and y = {z with f = 2} in …, it is necessary to ensure that none of these fields were constrained by the linear type system.

I think it would be best to implement the type system of ROSE, and try to add row maps ("transformation" above) and first-class labels.

functions with explicit environment

The environment is: * a stack of lexical scopes with local variables (let or function arguments). * lexical scopes and references introduced by hygienic macros, which exist on a sort of parallel stack for each macro invocation. Macro-defining macros are more complex to understand (see http://www.cs.utah.edu/plt/scope-sets/ for more) * identifiers introduced by opening modules (globally via open X, locally via let open X in expr, or X.expr)

Non-macro lexical scopes and module interfaces can be thought of as row operations (usually in a model where shadowing is allowed).

Uses of macros: * Inlining. This is better done by an "inline" directive. * Abstraction of repetitive boilerplate that cannot be expressed using function or other basic language abstractions. Better solution would be to not have second-class language constructs that cannot be abstracted over. * Optimizations on the generated code. * Changing the order of evaluation (lazy vs. eager) * The creation of new binding forms

macro expansion = one level of partial expansion

Gensym

Metaprogramming facilities (macros or environment-manipulating functions) may need to introduce fresh identifiers for temporary variables or for behind-the-scenes communication between two uses of a macro to track information across these invocations.

Traditionally, LISP programs use gensym to create a fresh (preferably uninterned) symbol. We need a way to give a type to a gensym-like function.

Nominal typing provides a way to create a unique identifier that can be used for that purpose. However, for macros that produce macros, one needs a way to generate fresh identifiers on the fly, instead of having a fixed set of pre-declared unique identifiers.

It is therefore desirable to be able to express in row constraints the fact that a certain label will e.g. be absent from arguments and present in a returned row, indicating that this label is fresh (and therefore that it is trivially absent from arguments). This is very similar to how existential types allow hiding information to unprivileged parts of the code.

Uninhabited types

Detecting uninhabited types can be difficult depending on the type system.

Uninhabited types can cause spurious "pattern-matching does not cover all cases" errors (since some cases may be impossible but not detected as such)

Ex Falso Quodlibet might be a problem in some type systems. If the type system assumes that types are always inhabited, but is unable to always detect uninhabited types, then there is a problem.

Object-oriented and subtyping

typeclasses vs. modules

Modules: allow ensuring that a specific implementation is used for an operation (e.g. the comparison function for comparable pairs of integers, for which there are several implementations). On the other hand, with typeclasses, one cannot easily mix pairs of integers sorted with different sorting functions within the same program.

Typeclasses: more natural way to describe overloading than using functors.

Recursion

overloading + renaming + split & merge identifiers

rank-N polymorphism

lift foralls to typeclass constraints

A workaround which supports some use cases of rank-N polymorphism consists in moving all forall type variables to the beginning of the type (prenex form), and specify via type constraints what concrete types must be unifiable with these. For example, f g = (g 1) + (g "hi") can be assigned the type (forall a . a → int) → int. A constrained prenex form with subtyping would be forall t . (t ≤ int → int, t ≤ string → int) ⇒ t → int. Other constraints than can be used to allow only a limited form of subtyping that compares e.g. if a function type can be unified with another (a simple check that does not unify type variables within both sides).

intersection types

Another workaround that supports some uses cases of rank-N polymorphism is the use of intersection types.

duplication of variables

In some cases (probably low-rank polymorphism and / or value restriction) an expression is assigned a type which is a unification variable (e.g. empty_list : 'a list as opposed to empty_list : forall 'a . 'a list). This means that all occurrences of that variable must unify with the same type.

There is a problem when the expression is used in contexts where it must unify with distinct types, e.g. let x = empty_list in ((1 :: x), ("hi" :: x)). Either the overall expression will not typecheck, or it may cause spurious equality constraints to appear.

In that case, a (common?) workaround is to duplicate (immutable) identifiers, so that each use is typechecked as if it came from an independent declaration. This means that occurrences are not forced to have the same type. This strategy can cause a blow-up in the size of the program to effectively typecheck, but there are mitigations for that.

propagation of type errors: inconsistant constraint cycles, where to put the error?

modules

TODO: write about modules

variance

Properties of type systems that people often prove

Other