Type systems

Equation systems

In Algorithm W, the traditional Hindley-Milner (HM) inference algorithm, constraint generation and solving are interleaved.
A better approach is to gather constraints in one pass, and then solve the resulting system of equations.
This allows for a more modular description of the type system, as some new language features can simply describe their constraints without changing the inference algorithm (see https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/jfp-outsidein.pdf).
Constraints should be bidirectional as much as possible: knowing one side allows determining the other, and vice versa.
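The two-pass approach can be sketched on a toy lambda calculus. This is only an illustration under assumptions: the expression encoding, the "?n" naming of unification variables, and the function names gather, solve and infer are all hypothetical, and there is no let-polymorphism.

```python
from itertools import count

# Hypothetical mini-language: ("int", n), ("var", x), ("lam", x, body), ("app", f, a).
# Types: "int", ("->", arg, res), or a unification variable such as "?0".

def gather(expr, env, fresh, constraints):
    """Pass 1: return the type of expr, appending equations to constraints."""
    tag = expr[0]
    if tag == "int":
        return "int"
    if tag == "var":
        return env[expr[1]]
    if tag == "lam":
        arg = f"?{next(fresh)}"
        res = gather(expr[2], {**env, expr[1]: arg}, fresh, constraints)
        return ("->", arg, res)
    if tag == "app":
        fun = gather(expr[1], env, fresh, constraints)
        arg = gather(expr[2], env, fresh, constraints)
        res = f"?{next(fresh)}"
        constraints.append((fun, ("->", arg, res)))  # fun = arg -> res
        return res
    raise ValueError(f"unknown expression: {expr!r}")

def resolve(t, subst):
    """Apply the substitution to a type, recursively."""
    if isinstance(t, tuple):
        return ("->", resolve(t[1], subst), resolve(t[2], subst))
    if t in subst:
        return resolve(subst[t], subst)
    return t

def occurs(v, t):
    """True if variable v appears inside type t."""
    if isinstance(t, tuple):
        return occurs(v, t[1]) or occurs(v, t[2])
    return v == t

def solve(constraints):
    """Pass 2: unify every equation and return the resulting substitution."""
    subst = {}
    todo = list(constraints)
    while todo:
        a, b = todo.pop()
        a, b = resolve(a, subst), resolve(b, subst)
        if a == b:
            continue
        if isinstance(a, str) and a.startswith("?"):
            if occurs(a, b):
                raise TypeError("infinite type")
            subst[a] = b
        elif isinstance(b, str) and b.startswith("?"):
            todo.append((b, a))  # equations are symmetric: retry flipped
        elif isinstance(a, tuple) and isinstance(b, tuple):
            todo += [(a[1], b[1]), (a[2], b[2])]
        else:
            raise TypeError(f"cannot unify {a} with {b}")
    return subst

def infer(expr):
    constraints = []
    t = gather(expr, {}, count(), constraints)
    return resolve(t, solve(constraints))
```

In this sketch, a new language feature would only add a case to gather; solve stays untouched.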

Row polymorphism, row operations and intersection types

Abstracting extensible data types (ROSE language), a general framework in which most row systems fall (excellent paper):
- youtube video: https://www.youtube.com/watch?v=5rDfyB2udKA
- operations which are not supported by their framework:
- application of a record of functions to a record of arguments ("transformation" below)
  - TODO: send the authors an e-mail asking if they see major issues / pitfalls when adding this as an extension on top of ROSE.
- first-class labels and sets of labels (can be emulated to some point by lambdas which do the selection)
- type-indexed records (records whose labels are type literals, it allows to build things like static dispatch / typeclasses as record lookup operations)
Variations on Variants: an implementation of extensible variants on top of Haskell's type system
CTRex (https://wiki.haskell.org/CTRex): another implementation of extensible variants on top of Haskell's type system
fclabels: first-class labels
λTIR: type-indexed records. Labels can (only) be types (but not type schemes, i.e. no \forall or constraints). Custom labels that are disjoint from all others can be created via newtype.
tinybang: a language kernel with extensible records. They present a technique to close functions stored in fields of the record so that these functions may refer to the record instance self. Their technique allows later adding extra fields and methods to the object, and re-sealing it so that new methods have access to the new fields. They also provide a complete type inference algorithm (there are no type annotations in the language).
Castagna uses union and intersection types, with quantified (forall) type variables but without constraints, to encode row-polymorphic variants.
Ur by Chlipala: has row maps ("transformation" below). Needs quite a lot of annotations for functions manipulating rows, but apparently user code requires few annotations.
principal typings with rank-2 intersection types
non-exhaustive list of value-level features
- records, variants
- records and variants are dual, so we only need to consider rows (which appear in sum and product types) which are mappings from labels to types
- records = case-functions (collection of pattern-matching clauses with independent result types)
- the dual: constructors = field accessors
- empty record, empty case-function (its type is 'a . bottom -> 'a)
- singleton record, constructor
- concatenation of two records, unification of the results of the branches of an if
- projection (.l), update (with existing_l = v), addition (with new_l = v), deletion (without l)
- renaming (addition + deletion)
- some systems allow or impose shadowing (updating a field merely shadows it, deleting a field unshadows)
- transformation: given a dependent function associating labels as follows f -> g | h -> i, and a record with labels {f,h}, uniformly mapping the function on the record gives a new record {g,i}
- first-class labels: λ r l -> r.l takes a record and a label, and accesses the given label in the record
- introspection (label to string, ad-hoc iteration over all the fields of a record).
- Some introspection operations are ad-hoc (they allow a function to not handle all fields uniformly, even if it does not mention these fields in its type), and this ad-hoc-ness should appear in the type (i.e. this function could do anything, and its correctness cannot be assessed by looking at its type and writing black-box unit tests).
- Some introspection-like operations can be uniform (not ad-hoc)
- …
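A rough, untyped sketch of the value-level operations listed above, with records modelled as Python dicts. All function names are hypothetical. The "transformation" follows the reading "apply a record of functions to a record of arguments with the same labels", which is an assumption about the intended meaning. No shadowing is modelled: adding an existing label or deleting a missing one is an error.

```python
def project(record, label):
    """r.l"""
    return record[label]

def update(record, label, value):
    """{r with existing_l = v}: the label must already be present."""
    if label not in record:
        raise KeyError(f"cannot update missing label {label!r}")
    return {**record, label: value}

def add(record, label, value):
    """{r with new_l = v}: the label must be absent."""
    if label in record:
        raise KeyError(f"label {label!r} already present")
    return {**record, label: value}

def delete(record, label):
    """{r without l}: the label must be present."""
    if label not in record:
        raise KeyError(f"cannot delete missing label {label!r}")
    return {k: v for k, v in record.items() if k != label}

def rename(record, old, new):
    """Renaming is deletion followed by addition."""
    return add(delete(record, old), new, record[old])

def transform(functions, record):
    """Apply a record of functions, field by field, to a record of arguments."""
    if functions.keys() != record.keys():
        raise KeyError("both records must have exactly the same labels")
    return {label: functions[label](record[label]) for label in record}

point = {"x": 1, "y": 2}
renamed = rename(point, "x", "z")
mapped = transform({"x": str, "y": lambda n: n + 1}, point)
```

A typed version would have to express, in the type of each function, the presence or absence conditions that are checked dynamically here.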
non-exhaustive list of type-level features
- polymorphism over rows (a Row kind, often with constraints on the row)
- polymorphism over labels (a Label kind)
- contains constraint or type with some fixed labels and a row for the rest
- lacks constraint (ensure that some labels are absent from the row)
- disjoint constraint (to ensure that concatenation does not cause duplicate labels)
- included-in constraint
- …
- rows can be interpreted as substitutions (from their labels to the associated type), and some operations (like the "transformation" above) require applying such a substitution to another row
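The row constraints above can be illustrated on closed rows, modelled as dicts from labels to type names. This is a simplification: with row variables these checks become constraints to solve rather than booleans to compute. The names lacks, disjoint, included_in, concat and extend are hypothetical.

```python
def lacks(row, *labels):
    """Lacks constraint: none of the given labels occurs in the row."""
    return all(label not in row for label in labels)

def disjoint(r1, r2):
    """Disjoint constraint: the two rows share no label."""
    return not (r1.keys() & r2.keys())

def included_in(r1, r2):
    """Included-in constraint: every field of r1 occurs in r2 at the same type."""
    return all(label in r2 and r2[label] == t for label, t in r1.items())

def concat(r1, r2):
    """Row concatenation, guarded by the disjoint constraint."""
    if not disjoint(r1, r2):
        raise TypeError(f"duplicate labels: {sorted(r1.keys() & r2.keys())}")
    return {**r1, **r2}

def extend(row, label, t):
    """Row extension, guarded by the lacks constraint."""
    if not lacks(row, label):
        raise TypeError(f"label {label!r} already present")
    return {label: t, **row}

point2d = {"x": "int", "y": "int"}
colour = {"r": "int", "g": "int", "b": "int"}
coloured_point = concat(point2d, colour)
```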

When the language has an effects system or linear type system, it is necessary to ensure that row operations are safe. For example, if the fields in the row are duplicated via let x = {z with f = 1} and y = {z with f = 2} in …, it is necessary to ensure that none of these fields were constrained by the linear type system.

I think it would be best to implement the type system of ROSE, and try to add row maps ("transformation" above) and first-class labels.

functions with explicit environment

The environment is:
- a stack of lexical scopes with local variables (let or function arguments).
- lexical scopes and references introduced by hygienic macros, which exist on a sort of parallel stack for each macro invocation. Macro-defining macros are more complex to understand (see http://www.cs.utah.edu/plt/scope-sets/ for more).
- identifiers introduced by opening modules (globally via open X, locally via let open X in expr, or X.expr).

Non-macro lexical scopes and module interfaces can be thought of as row operations (usually in a model where shadowing is allowed).
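As a small illustration of scopes as shadowing row operations, the sketch below models the environment with collections.ChainMap from the Python standard library. The helper names let_bind, open_module and leave_scope are hypothetical, and macro scopes are not modelled at all.

```python
from collections import ChainMap

def let_bind(env, name, value):
    """Enter a new lexical scope that binds (or shadows) one name."""
    return env.new_child({name: value})

def open_module(env, module):
    """Enter a scope that brings every export of a module into view."""
    return env.new_child(dict(module))

def leave_scope(env):
    """Drop the innermost scope; shadowed bindings become visible again."""
    return env.parents

top = ChainMap({"x": 1, "y": 2})
inner = let_bind(top, "x", 10)                    # let x = 10 in ...
opened = open_module(inner, {"y": 20, "z": 30})   # let open M in ...
```

Entering a scope behaves like a record addition that is allowed to shadow, and leaving it behaves like the deletion that unshadows.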

Uses of macros:
- Inlining. This is better done by an "inline" directive.
- Abstraction of repetitive boilerplate that cannot be expressed using functions or other basic language abstractions. A better solution would be to not have second-class language constructs that cannot be abstracted over.
- Optimizations on the generated code.
- Changing the order of evaluation (lazy vs. eager).
- The creation of new binding forms.

macro expansion = one level of partial expansion

Gensym

Metaprogramming facilities (macros or environment-manipulating functions) may need to introduce fresh identifiers for temporary variables or for behind-the-scenes communication between two uses of a macro to track information across these invocations.

Traditionally, LISP programs use gensym to create a fresh (preferably uninterned) symbol. We need a way to give a type to a gensym-like function.

Nominal typing provides a way to create a unique identifier that can be used for that purpose. However, for macros that produce macros, one needs a way to generate fresh identifiers on the fly, instead of having a fixed set of pre-declared unique identifiers.

It is therefore desirable to be able to express in row constraints the fact that a certain label will e.g. be absent from arguments and present in a returned row, indicating that this label is fresh (and therefore that it is trivially absent from arguments). This is very similar to how existential types allow hiding information from unprivileged parts of the code.
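A dynamic sketch of that freshness property, assuming labels are objects compared by identity (much like uninterned symbols). The names Label, gensym and with_fresh_field are hypothetical, and nothing here is checked statically; the point is only to show the invariant a row constraint would have to capture.

```python
from itertools import count

_ids = count()

class Label:
    """A label compared by identity, not by spelling."""
    def __init__(self, hint):
        self.hint = hint
        self.id = next(_ids)

    def __repr__(self):
        return f"{self.hint}#{self.id}"

def gensym(hint="g"):
    """Return a label that is distinct from every label created before."""
    return Label(hint)

def with_fresh_field(record, value, hint="tmp"):
    """Return (label, extended record); the label cannot occur in the argument."""
    label = gensym(hint)
    if label in record:  # cannot happen: the label did not exist before this call
        raise AssertionError("fresh label already present")
    return label, {**record, label: value}

original = {"x": 1}
label, extended = with_fresh_field(original, 42)
```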

Uninhabited types

Detecting uninhabited types can be difficult depending on the type system.

Uninhabited types can cause spurious "pattern-matching does not cover all cases" errors (since some cases may be impossible but not detected as such)

Ex Falso Quodlibet might be a problem in some type systems. If the type system assumes that types are always inhabited, but is unable to always detect uninhabited types, then there is a problem.
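For a very simple type language (no recursion, no polymorphism, no refinements), inhabitation is easy to decide, and it tells the exhaustiveness checker which cases can be skipped. The sketch below assumes a hypothetical encoding of types as tagged tuples; richer type systems are exactly where this check stops being straightforward.

```python
EMPTY = ("empty",)   # no values at all
UNIT = ("unit",)
INT = ("int",)

def inhabited(t):
    """Decide whether a (non-recursive, first-order) type has at least one value."""
    tag = t[0]
    if tag == "empty":
        return False
    if tag in ("unit", "int"):
        return True
    if tag == "record":   # a product needs a value for every field
        return all(inhabited(field) for field in t[1].values())
    if tag == "variant":  # a sum needs a value for at least one constructor
        return any(inhabited(payload) for payload in t[1].values())
    raise ValueError(f"unknown type: {t!r}")

def required_cases(variant):
    """Constructors that a pattern match on this variant actually has to cover."""
    return sorted(name for name, payload in variant[1].items() if inhabited(payload))

result = ("variant", {"Ok": INT, "Err": EMPTY})
```

With this information, a match on result that only handles Ok does not need to be reported as non-exhaustive.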

Object-oriented and subtyping

Object-oriented programming uses inheritance to implement code-sharing, subtyping and message-passing (implicit self argument to methods). These aspects should be orthogonal.
mixins do only code-sharing
subtyping is only concerned with whether two types offer a common API, in which case that API is their common supertype (nominal subtyping can restrict that by enforcing e.g. that supertypes must be declared as such, and then using the transitive closure of that subtype relationship).
message-passing: when calling the method on the receiver, the receiver itself should be passed, usually as the first argument.
static dispatch decides which case of an overloaded function is called at compile-time, based on the static type information available.
dynamic dispatch decides which implementation of a method is called at run-time, based on the dynamic type of the object.
Single dispatch: at run-time, the instance of the receiver is used to determine the implementation of the method. E.g. in x.m, if x has dynamic type Foo then the implementation of m in Foo is used, even if the static type of x is a supertype of Foo (e.g. via a type annotation)
Multiple dispatch: Some languages allow the implementation of m to be determined by the dynamic type of the receiver and one or more arguments, e.g. 1.add(2.0) will select the implementation add(self: Int, x: Float), but 1.add(2) would select add(self: Int, x: Int).
Typeclasses implement a form of single dispatch based on the static type of the arguments (and a form of static multiple-dispatch via multi-parameter typeclasses)
When a module declares a new case for a single- or multiple-dispatch method, simply importing that module may affect which function is called by x.f(y …). Without the import, this call can be handled by a general case, and the import can make a more specific case available. This can be solved by enforcing that new cases are declared in the same module as one of the types involved in their signature (or in the same module that declares the multimethod itself, for nominal declarations). See explanation in the context of Racket's multimethods.
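A brief sketch of single versus multiple dynamic dispatch. The single-dispatch half uses functools.singledispatch from the Python standard library, which selects an implementation from the run-time type of the first argument. The multiple-dispatch half is a hand-rolled, hypothetical registry (add_case, add) that only matches exact types and ignores subtyping.

```python
from functools import singledispatch

# Single dispatch: only the dynamic type of the first argument matters.
@singledispatch
def describe(x):
    return "something"          # general case

@describe.register
def _(x: int):
    return "an int"

@describe.register
def _(x: float):
    return "a float"

# Multiple dispatch: the dynamic types of all arguments select the case.
_add_cases = {}

def add_case(*types):
    """Register one case of the hypothetical multimethod add."""
    def decorator(fn):
        _add_cases[types] = fn
        return fn
    return decorator

def add(x, y):
    fn = _add_cases.get((type(x), type(y)))
    if fn is None:
        raise TypeError(f"no case of add for {type(x).__name__}, {type(y).__name__}")
    return fn(x, y)

@add_case(int, int)
def _add_int_int(x, y):
    return "add(self: Int, x: Int)"

@add_case(int, float)
def _add_int_float(x, y):
    return "add(self: Int, x: Float)"
```

Both registries are global and mutable, so importing a module that registers a new case changes which function an existing call reaches. That is the import-sensitivity issue described above.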

typeclasses vs. modules

Modules: allow ensuring that a specific implementation is used for an operation (e.g. the comparison function for comparable pairs of integers, for which there are several implementations). On the other hand, with typeclasses, one cannot easily mix pairs of integers sorted with different sorting functions within the same program.

Typeclasses: more natural way to describe overloading than using functors.
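The trade-off can be mimicked dynamically. In the sketch below, a "module" is an explicit dictionary of operations chosen by the caller, and a "typeclass" is a global table with one instance per type. Both encodings and all names are hypothetical; they only mirror the explicit versus implicit selection of an implementation.

```python
# Module style: the caller names the implementation explicitly.
by_first = {"key": lambda pair: pair[0]}
by_sum = {"key": lambda pair: pair[0] + pair[1]}

def sort_with(order, pairs):
    return sorted(pairs, key=order["key"])

# Typeclass style: one global instance per type, selected implicitly.
instances = {tuple: by_first}

def sort_implicit(values):
    order = instances[type(values[0])]
    return sorted(values, key=order["key"])

pairs = [(3, 0), (1, 1), (2, 5)]
```

Using by_sum through sort_implicit would require replacing the single tuple instance for the whole program (or wrapping the pairs in a new type), whereas sort_with lets both orderings coexist.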

Recursion

polymorphic recursion: when a type 'a t contains in its definition 'b t, where 'b depends on 'a but is not 'a, e.g. type 'a t = Tree of 'a | Depth of ('a * 'a) t. A function that recurses over such a type calls itself at a different type instance than the one it was called with.
recursion, via let rec or something similar to the Y combinator, causes the unification of a type variable corresponding to the parameter of the function, and the type of an expression in the body (the argument passed to the recursive call). The type of this expression will likely depend on the parameter.
- If the expression has the same type as the parameter, nothing special to do.
- If the types can be unified to a third, more specific type, then it can be used but will reject some correct programs which call the function using a more general type.
- Otherwise, a recursive type can be constructed using e.g. the μ type combinator.
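The nested type 'a t = Tree of 'a | Depth of ('a * 'a) t can be traversed as sketched below. Python does not check any of this, so the types appear only in comments. The tuple encoding and the name leaves are hypothetical. In OCaml, for instance, the corresponding function needs an explicit polymorphic annotation, because the recursive call is made at type ('a * 'a) t rather than 'a t.

```python
# Encoding: ("Tree", value) or ("Depth", nested), where nested holds pairs.

def leaves(t):
    """leaves : 'a t -> 'a list"""
    tag, payload = t
    if tag == "Tree":
        return [payload]
    if tag == "Depth":
        # Recursive call at type ('a * 'a) t: it returns a list of pairs,
        # which is then flattened into a list of 'a.
        return [x for pair in leaves(payload) for x in pair]
    raise ValueError(f"unknown constructor: {tag!r}")

# An int t with two levels of nesting: the payload is a pair of pairs.
nested = ("Depth", ("Depth", ("Tree", ((1, 2), (3, 4)))))
```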

overloading + renaming + split & merge identifiers

A useful feature of an IDE is the ability to rename identifiers.
Renaming a single static or dynamic dispatch case will cause all other cases for that method to be renamed. Via subtyping (interfaces), this can cause a lot of identifiers to be renamed in loosely related classes or modules.
With static dispatch, it is possible to rename the identifier for one case and leave the others unchanged.
- This requires that function calls never supply an argument that overlaps multiple cases (relying on dynamic dispatch to route the call to the appropriate case).
- This can also benefit from the possibility to extract a specific meaning (case) from an overloaded identifier, or merge the meanings (cases) of two distinct identifiers. See https://docs.racket-lang.org/polysemy/index.html (from https://github.com/jsmaniac/polysemy).
With structural subtyping, all occurrences of a name have to be renamed.
- Suggestion: use structural subtyping with pre-declared names. {a:int,b:string} is a subtype of {a:int} as long as the two occurrences of a refer to the same field declaration. The IDE can automate the declaration of these fields.
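A sketch of the pre-declared names suggestion, assuming the usual width-subtyping direction in which the record type with more fields is the subtype. Field declarations are objects compared by identity, so two fields that happen to be spelled the same stay unrelated, and renaming one declaration does not touch the other. The names Field and is_subtype are hypothetical.

```python
class Field:
    """A pre-declared field; identity, not spelling, decides equality."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Field({self.name!r})"

def is_subtype(sub, sup):
    """Width subtyping on closed record types (dicts from Field to type name):
    sub must provide every field of sup, at the same type."""
    return all(field in sub and sub[field] == t for field, t in sup.items())

a = Field("a")
b = Field("b")
other_a = Field("a")   # same spelling, different declaration

wide = {a: "int", b: "string"}
narrow = {a: "int"}
unrelated = {other_a: "int", b: "string"}

# Renaming a declaration is a local change: it affects neither the record
# types that mention it nor the other declaration spelled "a".
a.name = "id"
```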

rank-N polymorphism

Naturally arises from higher-order functions that take an argument f and apply it to a variety of values.
Might naturally arise from functions that construct module-like things (e.g. mk f g h = { f = f; g = g; h = h } returns a struct, if only rank-1 polymorphism is allowed then the types of f, g and h become weakly polymorphic).
System F (?) with rank-1 polymorphism has decidable inference
System F (?) with rank-2 polymorphism has decidable inference (but the algorithm is apparently complex)
System F (?) with rank-3 polymorphism or higher does not have decidable inference

lift foralls to typeclass constraints

A workaround which supports some use cases of rank-N polymorphism consists in moving all forall type variables to the beginning of the type (prenex form), and specifying via type constraints what concrete types must be unifiable with these. For example, f g = (g 1) + (g "hi") can be assigned the type (forall a . a → int) → int. A constrained prenex form with subtyping would be forall t . (t ≤ int → int, t ≤ string → int) ⇒ t → int. Constraints other than ≤ can be used to allow only a limited form of subtyping that compares e.g. whether a function type can be unified with another (a simple check that does not unify type variables within both sides).

intersection types

Another workaround that supports some use cases of rank-N polymorphism is the use of intersection types.
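For the function f g = (g 1) + (g "hi"), an intersection type for g could be (int → int) ∧ (string → int). The sketch below represents an intersection as a list of (argument, result) arrows and checks each application against it. The encoding and the names apply_type and type_of_f are hypothetical, and there is no subtyping or polymorphism.

```python
def apply_type(fun_type, arg_type):
    """Result type of applying a function of intersection type to an argument.
    fun_type is a list of (argument type, result type) arrows."""
    results = {res for arg, res in fun_type if arg == arg_type}
    if not results:
        raise TypeError(f"no arrow accepts {arg_type}")
    if len(results) > 1:
        raise TypeError(f"ambiguous result for {arg_type}")
    return results.pop()

def type_of_f(g_type):
    """Type check the body of f g = (g 1) + (g "hi") for a given type of g."""
    left = apply_type(g_type, "int")       # g 1
    right = apply_type(g_type, "string")   # g "hi"
    if (left, right) != ("int", "int"):
        raise TypeError("+ expects two ints")
    return "int"

def f(g):
    """The run-time counterpart of the function being typed."""
    return g(1) + g("hi")

g_type = [("int", "int"), ("string", "int")]
```

Each use of g picks the arrow it needs, so no quantifier has to appear to the left of an arrow in the type of f.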

duplication of variables

In some cases (probably low-rank polymorphism and / or value restriction) an expression is assigned a type which is a unification variable (e.g. empty_list : 'a list as opposed to empty_list : forall 'a . 'a list). This means that all occurrences of that variable must unify with the same type.

There is a problem when the expression is used in contexts where it must unify with distinct types, e.g. let x = empty_list in ((1 :: x), ("hi" :: x)). Either the overall expression will not typecheck, or it may cause spurious equality constraints to appear.

In that case, a (common?) workaround is to duplicate (immutable) identifiers, so that each use is typechecked as if it came from an independent declaration. This means that occurrences are not forced to have the same type. This strategy can cause a blow-up in the size of the program to effectively typecheck, but there are mitigations for that.
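The effect of duplication can be reproduced with a toy unifier. In shared, every use of x refers to the same unification variable, so the second use fails. In duplicated, each use gets its own copy of the declaration's type. The type encoding and all names are hypothetical, and the unifier omits the occurs check for brevity.

```python
from itertools import count

def resolve(t, subst):
    """Follow variable bindings and rebuild compound types."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(x, subst) for x in t[1:])
    return t

def unify(a, b, subst):
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return
    if isinstance(a, str) and a.startswith("?"):
        subst[a] = b
    elif isinstance(b, str) and b.startswith("?"):
        subst[b] = a
    elif (isinstance(a, tuple) and isinstance(b, tuple)
          and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            unify(x, y, subst)
    else:
        raise TypeError(f"cannot unify {a} with {b}")

def cons(elem_type, list_type, subst):
    """Type of elem :: lst, which forces lst to be a list of elem."""
    unify(list_type, ("list", elem_type), subst)
    return resolve(list_type, subst)

def shared():
    """let x = empty_list in ((1 :: x), ("hi" :: x)) with x : '?a list."""
    subst = {}
    x = ("list", "?a")
    return cons("int", x, subst), cons("string", x, subst)

def duplicated():
    """Same expression, but each occurrence of x is typed independently."""
    subst = {}
    fresh = count()

    def use_x():
        return ("list", f"?a{next(fresh)}")

    return cons("int", use_x(), subst), cons("string", use_x(), subst)
```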

propagation of type errors: inconsistent constraint cycles, where to put the error?

When a type error arises because of a cycle / strongly-connected component of inconsistent constraints, the error can point at any of the constraints. That choice should not depend on the constraint gathering or resolution algorithms, as these are implementation details, and are designed to maximize efficiency or completeness, not to maximize the understandability of type errors.
Instead, pick the best candidate for the error according to some metric. For example we could select the constraint that seems the furthest in the data flow graph, therefore trying to get the error where a run-time error would occur (via an abstraction, the data flow graph).
type-level debugger: see where this value (type / constraint) comes from, from the point of view of the type system, see where it flows (i.e. interactively navigate the constraints graph).
People who work on contracts have designed a theory of blame, which tries to assign the blame to the right party of a contract (e.g. if the argument to a function is bad then blame the caller, if the callee function returns a bad value then blame the callee) https://jackw.io/papers/oopsla-root-cause-of-blame.pdf

modules

TODO: write about modules

variance

Covariant: τ ≤ π implies τ' ≤ π', where τ' is determined by τ in some way (and π' is determined by π). Example: α → β ≤ δ → γ implies β ≤ γ (function types are covariant in their return type).
Contravariant: τ ≤ π implies τ' ≥ π', e.g. α → β ≤ δ → γ implies α ≥ δ (function types are contravariant in their argument type).
Invariant: τ ≤ π implies τ' = π'.

Invariance often occurs because a type variable occurs both in a covariant and a contravariant position.

Richer constraint, polymorphism and subtyping systems are able to avoid placing an equality constraint in some of these cases.
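The three cases can be checked with a small subtyping function. The sketch assumes a hypothetical base relation int ≤ float, function types written ("->", argument, result), and a mutable cell type ("ref", content) as the example of a type that is both read (covariant) and written (contravariant).

```python
BASE = {("int", "float")}   # assumed base subtyping: int ≤ float

def is_subtype(a, b):
    if a == b or (a, b) in BASE:
        return True
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[0] == b[0] == "->":
            # Contravariant in the argument, covariant in the result.
            return is_subtype(b[1], a[1]) and is_subtype(a[2], b[2])
        if a[0] == b[0] == "ref":
            # Needs both directions, i.e. the contents must be equivalent.
            return is_subtype(a[1], b[1]) and is_subtype(b[1], a[1])
    return False

accepts_float_returns_int = ("->", "float", "int")
accepts_int_returns_float = ("->", "int", "float")
```

A function that accepts any float and returns an int can stand in wherever a function from int to float is expected, but not the other way round.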

Properties of type systems that people often prove

TODO: check that these definitions are accurate
subject reduction: single-step evaluation does not change the type of the term: (Γ |- e : τ) ⇒ (Γ |- eval₁(e) : τ)
progress: a well-typed program is never stuck: either it has finished, or it can take another evaluation step (possibly producing a run-time error as specified by the semantics)
soundness: the result of the evaluation (full reduction) has the expected type: (Γ |- e : τ) ⇒ (Γ |- eval(e) : τ)
completeness of an inference algorithm: if there exists an annotation of the program that makes it pass the typechecker, the type inference will find it
decidable inference: there is a complete inference algorithm for the given type system and expression language
principal type property: given Γ and e, there is a type scheme τ such that Γ |- e : τ and any other possible type τ' (such that Γ |- e : τ') can be obtained by instantiating the type scheme, i.e. τ' = σ(τ), for some reasonable definition of what σ can be.
principal typing property: like principal type property, but e alone is used to infer both Γ and τ. Allows for bottom-up inference
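Subject reduction and progress can be observed (not proved) on a toy language with integers, booleans, addition and conditionals. In the sketch below, check_run evaluates a well-typed term step by step and asserts after each step that the type is unchanged. The expression encoding and all names are hypothetical.

```python
def typeof(e):
    tag = e[0]
    if tag == "int":
        return "int"
    if tag == "bool":
        return "bool"
    if tag == "add":
        if typeof(e[1]) == typeof(e[2]) == "int":
            return "int"
        raise TypeError("add expects two ints")
    if tag == "if":
        if typeof(e[1]) != "bool":
            raise TypeError("condition must be a bool")
        t = typeof(e[2])
        if typeof(e[3]) != t:
            raise TypeError("branches must have the same type")
        return t
    raise ValueError(f"unknown expression: {e!r}")

def is_value(e):
    return e[0] in ("int", "bool")

def step(e):
    """Single-step evaluation (eval₁); only called on well-typed non-values."""
    tag = e[0]
    if tag == "add":
        if not is_value(e[1]):
            return ("add", step(e[1]), e[2])
        if not is_value(e[2]):
            return ("add", e[1], step(e[2]))
        return ("int", e[1][1] + e[2][1])
    if tag == "if":
        if not is_value(e[1]):
            return ("if", step(e[1]), e[2], e[3])
        return e[2] if e[1][1] else e[3]
    raise ValueError("values do not step")

def check_run(e):
    """Evaluate e fully, checking type preservation after every step."""
    t = typeof(e)
    trace = [e]
    while not is_value(e):
        e = step(e)              # progress: a well-typed non-value can step
        assert typeof(e) == t    # subject reduction: the type is preserved
        trace.append(e)
    return e, t, trace

program = ("if", ("bool", True), ("add", ("int", 1), ("int", 2)), ("int", 0))
```

Running such a check on many terms is only a test, not a proof; the properties themselves are statements about all well-typed terms.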

Other

Capabilities: ensuring (at run-time or statically) that some code can only perform a restricted set of operations (e.g. cannot access the filesystem, can inspect and alter its own behaviour as long as it does not grant itself extra capabilities, etc.).
When serializing and deserializing data, how can we ensure that no properties are broken in the serialized version? E.g. ensure that one cannot modify the serialized code to grant extra capabilities (e.g. save a run-time process to disk, tweak the saved image to pretend the application has a writable file descriptor for /etc/passwd, and restart the application from the fake "saved" image) or break invariants that are not expressed directly in the type (e.g. dangling logical references in a graph-like structure where references are represented using strings).
Contracts
Effects
Linear types
Units of measure