Logo

dev-resources.site

for different kinds of informations.

Create Your Own Programming Language 7: More Types

Published at
6/29/2023
Categories
programminglanguage
compilers
interpreters
javascript
Author
jasonsbarr
Author
10 person written this
jasonsbarr
open
Create Your Own Programming Language 7: More Types

Hello, again! It's time for another installment in the Create Your Own Programming Language series.

This time we're not going to add any significant new compiler features. Instead, we're going to add some interesting features to the type checker, including TypeScript-like union and intersection types.

As always, if you haven't read the previous article on adding functions to Wanda you should do that before continuing.

Ready? Ok, let's go.

What All We're Doing Today

Most of the work today is going to be concentrated on the type checker. We're going to add singleton types (which require us to implement constant declarations), tuple types, union types, intersection types, and type ascriptions using the :as keyword as an operator.

Singleton Types

Singleton types are types that have one and only one value. TypeScript calls them "literal types."

For example, you don't have a value of type string; you have a value of type "hello".

A singleton type is a subtype of its base, so "hello" is a subtype of string.

Singleton types require the values that have them not to change, so in order to have singleton type we're going to have to implement constant declarations.

Constants will only be constant in the type checker, though; ConstantDeclaration nodes will desugar to VariableDeclaration nodes so we won't need to add anything to the emitter.

Singleton types aren't terribly interesting by themselves, but you can do interesting things with them in conjunction with union types, below.

Tuple Types

Tuples are indexed collections that can hold values of different types, unlike a list or vector which is supposed to only hold 1 type (or its subtypes).

Tuples are going to be vectors under the hood (like TypeScript tuples, which are JavaScript arrays).

A tuple type could be something like [number, number, string].

Union Types

Union types are types where a value can have one of a number of specified types.

A union type is like logical OR for types. You can have simple unions like string || number or unions of objects.

A common pattern in TypeScript is to have unions of object types with a common field on all object types that indicates which type is being used based on its value. This is called a discriminated union. Here's an example:

type Coordinate =
  | { type: "cartesian"; x: number; y: number }
  | { type: "polar"; angle: number; magnitude: number };
Enter fullscreen mode Exit fullscreen mode

Note that the 2 object types have type properties that are singleton types: "cartesian" for the first, and "polar" for the second. Not string types.

We're going to use || to set apart members of a union instead of | for symmetry with intersection types, for which we'll use && to avoid confusion with the & rest parameter operator.

In a later article we'll implement variant types, which are a kind of union type that also works like an enum in some ways.

To understand union types and how they work better, check out the article on union types in Jake Donham's Reconstructing TypeScript series.

Intersection Types

If union types are logical OR for types, intersection types are logical AND. A value of an intersection type must have all the properties of all members of the intersection.

A common use of intersection types is to add fields to an existing type. Here's an example in TypeScript:

type Coord2D = { x: number; y: number };

type Coord3D = Coord2D & { z: number };
Enter fullscreen mode Exit fullscreen mode

An object of type Coord3D must have x, y, and z properties, all numbers.

To learn more about intersection types, once again you can check out the relevant article in Jake Donham's Reconstructing TypeScript series.

Type Ascriptions

Type ascriptions tell the type checker to treat an expression like a particular type. You use them when you have more information than the type checker about what type an expression is so you can get your program to correctly type check when it otherwise might not.

It's kind of an escape hatch when you need to give extra information to your type checker.

Here's what that looks like in TypeScript: let's say you have a union type AST that includes types for VariableDeclaration, NumberLiteral, and LambdaExpression AST nodes. You have a visitor class with a dispatch method based on what type of node it receives.

The dispatch method receives an argument of type AST, which will be the node's type inside that method. Your node processing methods expect arguments of their particular node type. A type ascription tells the type checker "This node, which is part of the AST union, is actually of type LambdaExpression. So you pass the node to the dispatched method like this:

enum ASTTypes {
  NumberLiteral,
  VariableDeclaration,
  LambdaExpression,
}

type NumberLiteral = { type: ASTTypes.NumberLiteral, value: number };
type VariableDeclaration = {
  type: ASTTypes.VariableDeclaration;
  name: string;
  value: NumberLiteral | LambdaExpression;
};
type LambdaExpression = {
  type: ASTTypes.LambdaExpression;
  args: string[];
  body: AST;
};

type AST = NumberLiteral | VariableDeclaration | LambdaExpression;

class Visitor {
  visit(node: AST): AST {
    switch (node.kind) {
      // other cases
      case ASTypes.LambdaExpression:
        return this.visitLambdaExpression(node as LambdaExpression);
    }
  }

  visitLambdaExpression(node: LambdaExpression): LambdaExpression {
    // method body
  }
}
Enter fullscreen mode Exit fullscreen mode

Our type ascriptions will do the same thing, although obviously we don't have classes in the language yet.

Now, in TypeScript, you can't use a type ascription to say something like 5 as string. The type checker will still complain about that, as it should. Ours will too.

Changes to Shared Types

First, let's clean up something that's been bugging me about our in-language Exception types.

A couple of articles ago I added a :dict property to the base Exception class (as well as RuntimeException, though I later realized that's redundant because RuntimeException inherits from Exception), but I made the mistake of not making it a metafield. This caused it to show up in dumps when exceptions were thrown in debugging, which is annoying. Metafields are non-enumerable and so shouldn't show up in dumps by default.

Let's import addMetaField into src/shared/exceptions.js and fix that:

import { addMetaField } from "../runtime/object.js";

export class Exception extends Error {
  constructor(msg) {
    super(msg);

    addMetaField(this, "dict", { message: msg });
  }
}
Enter fullscreen mode Exit fullscreen mode

You can also remove the :dict property defined in the constructor of RuntimeException if you have it there.

With that out of the way, now we can move on to the changes needed to implement the new features.

Changes to The Lexer

We need to make a very minor change to the lexer for intersection type annotations.

We're going to add support for an && token that will be used in intersection annotations.

First, in src/lexer/TokenTypes.js, add the new token to the TokenTypes enum:

export const TokenTypes = {
  // other types...
  AmpAmp: "AmpAmp",
};
Enter fullscreen mode Exit fullscreen mode

Now in src/lexer/Lexer.js we need to tokenize the new token type, which requires modifying the else if case for isAmp(ch):

      // other cases...
      } else if (isAmp(ch)) {
        const { pos, line, col, file } = this.input;
        if (isAmp(this.input.lookahead(1))) {
          // skip over both ampersands
          this.input.next();
          this.input.next();
          tokens.push(
            Token.new(TokenTypes.AmpAmp, "&&", SrcLoc.new(pos, line, col, file))
          );
        } else {
          this.input.next(); // skip over punc
          tokens.push(
            Token.new(TokenTypes.Amp, ch, SrcLoc.new(pos, line, col, file))
          );
        }
      } // else...
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the lexer. Now we need to make changes to the reader so we can read type ascriptions with :as.

Changes to The Reader

Remember a couple of articles ago when I said we weren't adding binary expressions to a Lisp-like language? Turns out I lied. Type ascriptions are a binary expression and I sort of forgot about them.

We're going to make use of the Pratt algorithm once more and add the :as operator. Since the dot was a token type and :as is the value on a keyword token, we're going to need to change our PREC object and getPrec function to use the token's value property.

Here's the new PREC object in src/reader/read.js:

const PREC = {
  ".": 90,
  ":as": 50,
};
Enter fullscreen mode Exit fullscreen mode

And getPrec:

const getPrec = (token) => PREC[token?.value] ?? 0;
Enter fullscreen mode Exit fullscreen mode

As you can see, the dot and :as operators have different precedences so our Pratt algorithm might actually have some work to do besides checking if the precedence value isn't 0.

Now let's create a new readBinary function and replace the call to readMemberExpression in readExpr with it:

const readBinary = (reader, left) => {
  const tok = reader.peek();

  switch (tok.value) {
    case ".":
      return readMemberExpression(reader, left);
    case ":as":
      return readAsExpression(reader, left);
    default:
      throw new SyntaxException(tok.value, tok.srcloc);
  }
};

// readForm...

const readExpr = (reader, bp = 0) => {
  let left = readForm(reader);
  let tok = reader.peek();
  let prec = getPrec(tok);

  while (bp < prec) {
    left = readBinary(reader, left);
    tok = reader.peek();
    prec = getPrec(tok);
  }

  return left;
};
Enter fullscreen mode Exit fullscreen mode

Finally, here's readAsExpression:

const readAsExpression = (reader, left) => {
  const tok = reader.peek();
  reader.expect(":as", tok.value);
  const prec = getPrec(tok);

  // skip :as keyword
  reader.skip();
  const typeAnnotation = readExpr(reader, prec);

  return {
    type: "AsExpression",
    expression: left,
    typeAnnotation,
    srcloc: left.srcloc,
  };
};
Enter fullscreen mode Exit fullscreen mode

That should be the last bit of actual parsing we need to do in the reader for quite some time. I can picture my Twitter friend Shriram Krishnamurthi (a well-known figure in programming language research circles who has gracefully been very helpful in answering my probably asinine questions about languages and language development) rolling his eyes at me and pretending this article doesn't exist. 😂😂😂

Changes to The Parser

We need to add new nodes for AsExpression and ConstantDeclaration, plus we need to parse union and intersection type annotations. We're also going to add a utility function to check if a node is a primitive that will come in handy when type checking with singleton types.

First thing's first: we need to add a bunch of members to the TATypes enum at the top of parseTypeAnnotation:

export const TATypes = {
  // other members...
  Tuple: "Tuple",
  Singleton: "Singleton",
  Intersection: "Intersection",
  Union: "Union",
  Never: "Never",
  Unknown: "Unknown",
};
Enter fullscreen mode Exit fullscreen mode

Never and Unknown are types that come along with union and intersection types. I'll explain those when we get to them in the type checker.

The Type Annotation Parser

The most substantial change is to the type annotation parser, so let's start with that.

Parsing singleton and tuple annotations is relatively straightforward.

Singleton annotations are just primitive literals, like "hello", true, or 7.

Tuple annotations are a comma-separated list of types inside square brackets, e.g. [number, number, string].

The reader will give singleton annotations to the parser as literal forms and tuple annotations as VectorLiteral forms.

For the former, let's create a function parseSingletonAnnotation in src/parser/parseTypeAnnotation.js:

const parseSingletonAnnotation = (annot) => ({
  kind: TATypes.Singleton,
  token: annot,
});
Enter fullscreen mode Exit fullscreen mode

Then, above the section beginning with if (annot.typpe === TokenTypes.Symbol), check on annot.type and call parseSingletonAnnotation if the form is a Token that represents a literal:

  // previous cases
  if (
    annot.type === TokenTypes.Number ||
    annot.type === TokenTypes.String ||
    annot.type === TokenTypes.Boolean ||
    annot.type === TokenTypes.String
  ) {
    return parseSingletonAnnotation(annot);
  }
  // if containing switch case for symbol tokens
Enter fullscreen mode Exit fullscreen mode

While we're doing things with parseTypeAnnotation, let's also refactor the guts for parsing a function annotation into its own function, called parseFunctionAnnotation:

const parseFunctionAnnotation = (annotation) => {
  // filter out arrow
  annotation = annotation.filter((item) => item.value !== "->");

  // get return type
  const retType = parseTypeAnnotation(annotation.pop());
  // get param types and if it's variadic
  let params = [];
  let variadic = false;

  for (let item of annotation) {
    if (item.type === TokenTypes.Amp) {
      variadic = true;
      continue;
    } else {
      params.push(parseTypeAnnotation(item));
    }
  }

  return { kind: TATypes.Function, params, retType, variadic };
};
Enter fullscreen mode Exit fullscreen mode

Now replace everything in the block statement for if (hasArrow) with:

      return parseFunctionAnnotation(annotation);
Enter fullscreen mode Exit fullscreen mode

For parseTypeAnnotation, you should be left with this (note that we're now also handling never and unknown in the function):

export const parseTypePrimitive = (annotation) => {
  if (annotation instanceof Cons) {
    // is function or generic annotation
    // flatten Cons to array
    annotation = [...annotation];

    // if it has an arrow, it's a function annotation
    const hasArrow = annotation.reduce((hasArrow, item) => {
      if (item.type === TokenTypes.Symbol && item.value === "->") {
        return true;
      }
      return hasArrow;
    }, false);

    if (hasArrow) {
      // is function annotation
      return parseFunctionAnnotation(annotation);
    }
  }

  let annot;
  if (Array.isArray(annotation)) {
    // is generic annotation
    // get container type
    annot = annotation[0];
  } else {
    // is simple annotation
    annot = annotation;
  }

  if (annot.type === "RecordLiteral") {
    return parseObjectAnnotation(annot);
  }

  if (annot.type === "VectorLiteral") {
    return parseTupleAnnotation(annot);
  }

  if (annot.type === TokenTypes.Nil) {
    return { kind: TATypes.NilLiteral };
  }

  if (
    annot.type === TokenTypes.Number ||
    annot.type === TokenTypes.String ||
    annot.type === TokenTypes.Boolean ||
    annot.type === TokenTypes.String
  ) {
    return parseSingletonAnnotation(annot);
  }

  if (annot.type === TokenTypes.Symbol) {
    switch (annot.value) {
      case "any":
        return { kind: TATypes.AnyLiteral };
      case "number":
        return { kind: TATypes.NumberLiteral };
      case "string":
        return { kind: TATypes.StringLiteral };
      case "boolean":
        return { kind: TATypes.BooleanLiteral };
      case "keyword":
        return { kind: TATypes.KeywordLiteral };
      case "never":
        return { kind: TATypes.Never };
      case "unknown":
        return { kind: TATypes.Unknown };
      case "list":
        // annotation is array with listType as 2nd member
        return parseListAnnotation(annotation[1]);
      case "vector":
        return parseVectorAnnotation(annotation[1]);
      default:
        // must be a named type
        return { kind: TATypes.Symbol, name: annot.value };
    }
  }

  throw new Exception(
    `Unknown type annotation kind ${JSON.stringify(annot, null, 2)}`
  );
};
Enter fullscreen mode Exit fullscreen mode

Now for the most drastic change to the type annotation parser: rename the function parseTypeAnnotation to parseTypePrimitive and get rid of the export keyword before the const keyword for the function expression assignment.

Parsing Compound Type Annotations

Now stub out a new function named parseTypeAnnotation and for now just keep it simple while we figure out how to parse compound type annotations (i.e. union and intersection annotations):

export const parseTypeAnnotation = (annotation) => {
  return parseTypePrimitive(annotation);
};
Enter fullscreen mode Exit fullscreen mode

We need to be able to figure out if an annotation is simple, in which case it's handled by parseTypePrimitive, or compound. A compound type annotation is contained within a cons list and contains multiple simple annotations separated by either a || or && token.

So the first thing we need to do is check for one of those tokens in the annotation.

We'll do something similar for checking if a simple type annotation has an arrow: flatten the annotation to an array and use reduce to see if it has at least one || or && token (the same type annotation will never have both – a type is either intersection or union, but not both).

Then if it has one of those tokens we get the first annotation in the list and parse it. Then we choose the type of the annotation based on which token it is and continue parsing annotations in a loop until we reach the end of the list. Then we return the compound annotation.

Here's the new parseTypeAnnotation:

export const parseTypeAnnotation = (annotation) => {
  const annot = annotation instanceof Cons ? [...annotation] : null;

  const isCompound =
    Array.isArray(annot) &&
    annot.reduce((isCompound, item) => {
      if (
        (item.type === TokenTypes.Symbol && item.value === "||") ||
        item.type === TokenTypes.AmpAmp
      ) {
        return true;
      }
      return isCompound;
    }, false);

  if (isCompound) {
    const first = parseTypeAnnotation(annot[0]);
    let sep = annot[1];
    const kind =
      sep.type === TokenTypes.AmpAmp ? TATypes.Intersection : TATypes.Union;
    const types = [first];

    let i = 2;
    while (sep) {
      types.push(parseTypeAnnotation(annot[i]));
      i++;
      sep = annot[i];
    }

    return { kind, types };
  }

  return parseTypePrimitive(annotation);
};
Enter fullscreen mode Exit fullscreen mode

Adding New Nodes to The Parser

Now that we have parsing the new type annotations out of the way, let's move on to the regular parser. We need to add new nodes for and handle parsing :as expressions and constant declarations.

First, in src/parser/ast.js, we add new members to the ASTTypes enum:

export const ASTTypes = {
  // other members...
  AsExpression: "AsExpression",
  ConstantDeclaration: "ConstantDeclaration",
};
Enter fullscreen mode Exit fullscreen mode

Now add new node constructors:

export const AST = {
  // other constructors...
  AsExpression(expression, type, srcloc) {
    return {
      kind: ASTTypes.AsExpression,
      expression,
      type,
      srcloc,
    };
  },

  ConstantDeclaration(lhv, expression, srcloc, typeAnnotation = null) {
    return {
      kind: ASTTypes.ConstantDeclaration,
      lhv,
      expression,
      srcloc,
      typeAnnotation,
    };
  },
};
Enter fullscreen mode Exit fullscreen mode

In src/parser/parse.js, add a new case to the switch statement in parseComplexForm to handle :as expressions:

    // other cases...
    case "AsExpression": {
      const expression = parseExpr(form.expression);
      const type = parseTypeAnnotation(form.typeAnnotation);
      return AST.AsExpression(expression, type, form.srcloc);
    }
    // default case
Enter fullscreen mode Exit fullscreen mode

Constant declarations, like function declarations, will begin with the def symbol. Since we have 2 forms that use the same symbol, we'll need a new function to disambiguate called parseMaybeFunctionDeclaration.

We call it from the "def" case in the switch statement in parseList:

    // other cases...
    case "def":
      return parseMaybeFunctionDeclaration(form);
    // case "fn"...
Enter fullscreen mode Exit fullscreen mode

Here's parseMaybeFunctionDeclaration:

const parseMaybeFunctionDeclaration = (form) => {
  const length = [...form].length;

  if (length === 3) {
    // (def <name> <expression>) or
    // (def (<name>: <type>) <expression>)
    return parseConstantDeclaration(form);
  }

  return parseFunctionDeclaration(form);
};
Enter fullscreen mode Exit fullscreen mode

Function declarations have length 4 because there's the def keyword, the function name, the function arguments, and the function body. Constant declarations just have the def keyword, the constant name, and the expression to initialize the constant's value.

parseConstantDeclaration delegates most of the work to parseVariableDeclaration since the 2 are exactly the same apart from the value produced for node.kind by the node constructor:

const parseConstantDeclaration = (form) => {
  const varDecl = parseVariableDeclaration(form);
  return AST.ConstantDeclaration(
    varDecl.lhv,
    varDecl.expression,
    varDecl.srcloc,
    varDecl.typeAnnotation
  );
};
Enter fullscreen mode Exit fullscreen mode

Adding a Helper Function for Primitive Nodes

The last thing we need to do with the parser is add a utility function to tell us if an AST node is for a primitive literal. This will come in handy in the type checker when we check singleton types.

Add the isPrimitive function in src/parser/utils.js:

import { ASTTypes } from "./ast.js";

export const isPrimitive = (node) =>
  node.kind === ASTTypes.NumberLiteral ||
  node.kind === ASTTypes.StringLiteral ||
  node.kind === ASTTypes.BooleanLiteral ||
  node.kind === ASTTypes.KeywordLiteral ||
  node.kind === ASTTypes.NilLiteral;
Enter fullscreen mode Exit fullscreen mode

That's it for changes to the parser! Now for the bulk of the work, which takes place in the type checker.

Changes to The Type Checker

First, add the new cases to the TypeTypes enum in src/typechecker/types.js:

export const TypeTypes = {
  // other values...
  Tuple: "Tuple",
  Singleton: "Singleton",
  Never: "Never",
  Union: "Union",
  Unknown: "Unknown",
  Intersection: "Intersection",
};
Enter fullscreen mode Exit fullscreen mode

Now add type constructors for the simpler types in src/typechecker/constructors.js. Union and intersection types will take a bit more work, as I'll explain below.

// other constructors...
export const tuple = (types, constant = false) => ({
  kind: TypeTypes.Tuple,
  types,
  constant,
});

export const singleton = (base, value) => ({
  kind: TypeTypes.Singleton,
  base,
  value,
  constant: true,
});

export const never = { kind: TypeTypes.Never };

export const unknown = { kind: TypeTypes.Unknown };
Enter fullscreen mode Exit fullscreen mode

Validators are simple in src/typechecker/validators.js:

export const isTuple = (type) => {
  return type.kind === TypeTypes.Tuple;
};

export const isSingleton = (type) => {
  return type.kind === TypeTypes.Singleton;
};

export const isNever = (type) => {
  return type.kind === TypeTypes.Never;
};

export const isUnion = (type) => {
  return type.kind === TypeTypes.Union;
};

export const isUnknown = (type) => {
  return type.kind === TypeTypes.Unknown;
};

export const isIntersection = (type) => {
  return type.kind === TypeTypes.Intersection;
};
Enter fullscreen mode Exit fullscreen mode

Constructing Union Types

With unions, you can have different types that are equivalent. For example (in TypeScript):

'red' | 'blue' | 'green'
'blue' | 'red' | 'green'
'green' | 'red' | 'blue'
Enter fullscreen mode Exit fullscreen mode

all contain the same values. If a union arm is a nested union, the arms of the inner union can be lifted up to the outer union (if a value satisfies the nested union, then it satisfies one of its arms); so for example:

'red' | ('green' | 'blue')
('red' | 'green') | 'blue'
'red' | 'green' | 'blue'
Enter fullscreen mode Exit fullscreen mode

also contain the same values.

If one arm of a union is a subtype of another, it can be removed (all values that satisfy it also satisfy the other arm); so for example:

{ type: 'cartesian', x: number, y: number } | { x: number, y: number }
{ x: number, y: number } | never
{ x: number, y: number }
Enter fullscreen mode Exit fullscreen mode

all contain the same values.

Finally, an object type with properties of union type supports the same operations as a union of object types; so for example:

{ foo: 1 | 2, bar: 3 | 4 }
{ foo: 1 | 2, bar: 3 } | { foo: 1 | 2, bar: 4 }
{ foo: 1, bar: 3 | 4 } | { foo: 2, bar: 3 | 4 }
Enter fullscreen mode Exit fullscreen mode

all contain the same types.

To simplify the type checker and make its output more readable, we normalize union types in the constructor. To do that, we flatten nested unions and remove redundant arms. If there are no types left in the union, we return the never type.

Here's src/typechecker/union.js:

const collapseSubtypes = (ts) => {
  /**
   * An arm is kept if for every arm (except itself) it's not a subtype of the
   * other arm or it's equivalent to the other arm and this is the first
   * equivalent arm.
   */
  return ts.filter((t1, i1) => {
    return ts.every(
      (t2, i2) =>
        i1 === i2 || !isSubtype(t1, t2) || (isSubtype(t2, t1) && i1 < i2)
    );
  });
};

const flatten = (ts) => {
  return [].concat(...ts.map((t) => (isUnion(t) ? t.types : t)));
};

export const union = (...types) => {
  types = flatten(types);
  types = collapseSubtypes(types);

  if (types.length === 0) return never;
  if (types.length === 1) return types[0];
  return { kind: TypeTypes.Union, types };
};
Enter fullscreen mode Exit fullscreen mode

For more information about what's going on here, check out Jake Donham's post on union types.

Constructing Intersection Types

Like union types, intersection types give us lots of ways to describe the same collection of values. The order of types in an intersection doesn't matter:

{ x: number } & { y: number } & { z: number }
{ y: number } & { x: number } & { z: number }
{ z: number } & { x: number } & { y: number }
Enter fullscreen mode Exit fullscreen mode

all contain the same values.

If a part of an intersection is a nested intersection, the parts of the inner intersection can be lifted up to the outer intersection, so for example:

{ x: number } & ({ y: number } & { z: number })
({ x: number } & { y: number }) & { z: number }
{ x: number } & { y: number } & { z: number }
Enter fullscreen mode Exit fullscreen mode

all contain the same values.

If one part of an intersection is a supertype of another, it can be removed; so for example:

{ type: 'cartesian', x: number, y: number } & { x: number, y: number }
{ type: 'cartesian', x: number, y: number } & unknown
{ type: 'cartesian', x: number, y: number }
Enter fullscreen mode Exit fullscreen mode

all contain the same values.

As we do for union types, we normalize intersection types in the type checker. Normalizing intersections is more complicated than normalizing unions. We want to detect and eliminate empty intersections (this will be important for narrowing), but unions make it more difficult.

In addition to flattening nested intersections and removing redundant parts (as we do for unions), we also distribute intersections over unions and eliminate empty intersections. As with unions, we don't distribute intersections over object or function types (or objects / functions over intersections).

Here's a helper function for distributing over unions (in src/typechecker/union.js):

export const distributeUnion = (ts) => {
  let accum = [];

  const dist = (ts, i) => {
    if (i === ts.length) {
      accum.push(ts);
    } else {
      const ti = ts[i];

      if (isUnion(ti)) {
        for (let t of ti.types) {
          const ts2 = ts.slice(0, i).concat(t, ts.slice(i + 1));
          dist(ts2, i + 1);
        }
      } else {
        dist(ts, i + 1);
      }
    }
  };

  dist(ts, 0);
  return accum;
};
Enter fullscreen mode Exit fullscreen mode

This function takes a list of types (which may contain unions) and returns roughly the Cartesian product over the unions.

Next we need a function to check if 2 types overlap, that is if their intersection is not empty. In src/typechecker/intersection.js:

const overlaps = (x, y) => {
  if (isNever(x) || isNever(y)) return false;
  if (isUnknown(x) || isUnknown(y)) return true;

  if (isUnion(x)) {
    return x.types.some((x) => overlaps(x, y));
  } else if (isUnion(y)) {
    return y.types.some((y) => overlaps(x, y));
  }

  if (isIntersection(x)) {
    return x.types.every((x) => overlaps(x, y));
  } else if (isIntersection(y)) {
    return y.types.every((y) => overlaps(x, y));
  }

  if (isSingleton(x) && isSingleton(y)) return x.value === y.value;
  if (isSingleton(x)) return x.base === y.kind;
  if (isSingleton(y)) return y.base === x.kind;

  if (isObject(x) && isObject(y)) {
    return x.properties.every(({ name, type: xType }) => {
      const yType = propType(y, name);
      if (!yType) return true;
      else return overlaps(xType, yType);
    });
  }

  return x.type === y.type;
};
Enter fullscreen mode Exit fullscreen mode

Now, to normalize intersections we flatten nested intersections, distribute the intersection over unions, and check for emptiness/remove redundant parts. Continuing in intersection.js:

const collapseSupertypes = (ts) => {
  return ts.filter((t1, i1) =>
    ts.every(
      (t2, i2) =>
        i1 === i2 || !isSubtype(t2, t1) || (isSubtype(t1, t2) && i1 < i2)
    )
  );
};

const flatten = (ts) =>
  [].concat(...ts.map((t) => (isIntersection(t) ? t.types : t)));

const intersectionNoUnion = (ts) => {
  if (ts.some((t1, i1) => ts.some((t2, i2) => i1 < i2 && !overlaps(t1, t2)))) {
    return never;
  }

  ts = collapseSupertypes(ts);

  if (ts.length === 0) return unknown;
  if (ts.length === 1) return ts[0];

  return { type: TypeTypes.Intersection, types: ts };
};

export const intersection = (...ts) => {
  ts = flatten(ts);
  ts = distributeUnion(ts).map((ts) => intersectionNoUnion(ts));

  return union(...ts);
};
Enter fullscreen mode Exit fullscreen mode

Again, for a better understanding see Jake Donham's post on intersection types.

Mapping Types

Whereas before we could just infer using simple types, now we need a map function to account for unions and intersections in our inference. In src/typechecker/map.js:

export const map = (t, fn) => {
  if (isUnion(t)) {
    return union(...t.types.map((t) => map(t, fn)));
  } else if (isIntersection(t)) {
    /** @type {import("./types.js").Type[]} */
    let ts = [];
    let error = null;

    for (let tt of t.types) {
      try {
        ts.push(fn(tt));
      } catch (e) {
        if (!error) error = e;
      }
    }

    if (ts.length === 0) {
      throw error;
    } else {
      return intersection(...ts);
    }
  } else {
    return fn(t);
  }
};
Enter fullscreen mode Exit fullscreen mode

Now let's add union, intersection, and map to our Type module in src/typechecker/Type.js:

import { TypeTypes } from "./types.js";
import * as Validators from "./validators.js";
import * as Constructors from "./constructors.js";
import { typeToString } from "./typeToString.js";
import { union } from "./union.js";
import { map } from "./map.js";
import { intersection } from "./intersection.js";

export const Type = {
  Type: TypeTypes,
  ...Constructors,
  ...Validators,
  toString: typeToString,
  union,
  map,
  intersection,
};
Enter fullscreen mode Exit fullscreen mode

Updated fromTypeAnnotation

Here's the updated fromTypeAnnotation function in src/typechecker/fromTypeAnnotation.js with cases for singleton, tuple, union, intersection, unknown, and never types:

export const fromTypeAnnotation = (
  typeAnnotation,
  typeEnv = TypeEnvironment.new()
) => {
  switch (typeAnnotation.kind) {
    case TATypes.AnyLiteral:
      return Type.any;
    case TATypes.NumberLiteral:
      return Type.number;
    case TATypes.StringLiteral:
      return Type.string;
    case TATypes.BooleanLiteral:
      return Type.boolean;
    case TATypes.KeywordLiteral:
      return Type.keyword;
    case TATypes.NilLiteral:
      return Type.nil;
    case TATypes.Never:
      return Type.never;
    case TATypes.Unknown:
      return Type.unknown;
    case TATypes.Symbol: {
      const name = typeAnnotation.name;
      const type = typeEnv.getType(name);

      if (!type) {
        throw new Exception(
          `Type ${name} not found in current type environment`
        );
      }

      return type;
    }
    case TATypes.List: {
      const listType = fromTypeAnnotation(typeAnnotation.listType, typeEnv);
      return Type.list(listType);
    }
    case TATypes.Vector: {
      const vectorType = fromTypeAnnotation(typeAnnotation.vectorType, typeEnv);
      return Type.vector(vectorType);
    }
    case TATypes.Object: {
      const propTypes = typeAnnotation.properties.map((prop) => ({
        name: prop.name,
        type: fromTypeAnnotation(prop.propType, typeEnv),
      }));
      return Type.object(propTypes);
    }
    case TATypes.Function: {
      const paramTypes = typeAnnotation.params.map((p) =>
        fromTypeAnnotation(p, typeEnv)
      );
      const retType = fromTypeAnnotation(typeAnnotation.retType, typeEnv);
      return Type.functionType(paramTypes, retType, typeAnnotation.variadic);
    }
    case TATypes.Tuple: {
      /** @type {import("./types.js").Type[]} */
      let types = [];

      for (let mem of typeAnnotation.types) {
        types.push(fromTypeAnnotation(mem, typeEnv));
      }

      return Type.tuple(types);
    }
    case TATypes.Singleton: {
      const tType = typeAnnotation.token.type;
      const base =
        tType === TokenTypes.Number
          ? "Number"
          : tType === TokenTypes.String
          ? "String"
          : tType === TokenTypes.Boolean
          ? "Boolean"
          : TokenTypes.Keyword
          ? "Keyword"
          : fail(
              `Invalid token type ${tType} when parsing type annotation ${JSON.stringify(
                typeAnnotation,
                null,
                2
              )}`
            );
      const value = typeAnnotation.token.value;

      return Type.singleton(base, value);
    }
    case TATypes.Union:
      return Type.union(
        ...typeAnnotation.types.map((t) => fromTypeAnnotation(t, typeEnv))
      );
    case TATypes.Intersection: {
      return Type.intersection(
        ...typeAnnotation.types.map((t) => fromTypeAnnotation(t, typeEnv))
      );
    }
    default:
      throw new Exception(
        `Type not found for type annotation ${JSON.parse(
          typeAnnotation,
          null,
          2
        )}`
      );
  }
};
Enter fullscreen mode Exit fullscreen mode

Converting Types to String

Now we need cases for the new types in the switch statement in typeToString in src/typechecker/typeToString.js:

    // other cases...
    case TypeTypes.Tuple:
      return `[${type.types.map(typeToString).join(", ")}]`;
    case TypeTypes.Singleton:
      return type.value;
    case TypeTypes.Never:
      return "never";
    case TypeTypes.Union:
      return `(${type.types.map(typeToString).join(" | ")})`;
    case TypeTypes.Unknown:
      return "unknown";
    case TypeTypes.Intersection:
      return `(${type.types.map(typeToString).join(" && ")})`;
    // default case
Enter fullscreen mode Exit fullscreen mode

Unifying Types

Now that we have unknown as a universal supertype, we need to modify the unification functions in src/typechecker/unify.js:

import { Type } from "./Type.js";
import { isSubtype } from "./isSubtype.js";

export const unify = (type1, type2) => {
  if (isSubtype(type1, type2)) {
    return type2;
  } else if (isSubtype(type2, type1)) {
    return type1;
  }

  return Type.unknown;
};

export const unifyAll = (...types) => {
  return types.reduce((unified, type) => {
    if (Type.isUnknown(unified) || Type.isAny(unified)) return unified;

    return unify(unified, type);
  });
};
Enter fullscreen mode Exit fullscreen mode

Type Inference

Due to adding constants and singleton types, we actually need to make a change to nearly every single infer* function to accommodate constants.

We add a constant parameter to the infer function that defaults to false and pass it into the inference subfunctions in src/typechecker/infer.js:

export const infer = (ast, env, constant = false) => {
  switch (ast.kind) {
    case ASTTypes.NumberLiteral:
      return inferNumber(ast, constant);
    case ASTTypes.StringLiteral:
      return inferString(ast, constant);
    case ASTTypes.BooleanLiteral:
      return inferBoolean(ast, constant);
    case ASTTypes.KeywordLiteral:
      return inferKeyword(ast, constant);
    case ASTTypes.NilLiteral:
      return inferNil();
    case ASTTypes.Symbol:
      return inferSymbol(ast, env, constant);
    case ASTTypes.CallExpression:
      return inferCallExpression(ast, env, constant);
    case ASTTypes.VariableDeclaration:
      return inferVariableDeclaration(ast, env, constant);
    case ASTTypes.SetExpression:
      return inferSetExpression(ast, env, constant);
    case ASTTypes.DoExpression:
      return inferDoExpression(ast, env, constant);
    case ASTTypes.TypeAlias:
      return inferTypeAlias(ast, env, constant);
    case ASTTypes.VectorLiteral:
      return inferVectorLiteral(ast, env, constant);
    case ASTTypes.RecordLiteral:
      return inferRecordLiteral(ast, env, constant);
    case ASTTypes.MemberExpression:
      return inferMemberExpression(ast, env, constant);
    case ASTTypes.LambdaExpression:
      return inferFunction(ast, env, constant);
    case ASTTypes.FunctionDeclaration:
      return inferFunction(ast, env, constant);
    case ASTTypes.ConstantDeclaration:
      return inferVariableDeclaration(ast, env, true);
    case ASTTypes.AsExpression:
      return inferAsExpression(ast, env, constant);
    default:
      throw new Exception(`No type inferred for AST node type ${ast.kind}`);
  }
};
Enter fullscreen mode Exit fullscreen mode

Note the new cases at the bottom of the switch statement for constant declarations (which just call inferVariableDeclaration with constant set to true) and the call to inferAsExpression. Here's that function:

const inferAsExpression = (node, env, constant) => {
  const type = fromTypeAnnotation(node.type);
  const exprType = infer(node.expression, env, constant);

  if (env.checkingOn) {
    if (!isSubtype(exprType, type)) {
      throw new TypeException(
        `${Type.toString(
          exprType
        )} is not a valid subtype of the given type ${Type.toString(
          type
        )} in :as expression`,
        node.srcloc
      );
    }
  }

  return type;
};
Enter fullscreen mode Exit fullscreen mode

We'll need to modify the header for every subfunction except inferNil (which does not change).

We also need to modify the inference functions for primitives to handle singleton types. If constant is true, they now return singleton types:

const inferNumber = (ast, constant) =>
  constant ? Type.singleton("Number", ast.value) : Type.number;

const inferString = (ast, constant) =>
  constant ? Type.singleton("String", ast.value) : Type.string;

const inferBoolean = (ast, constant) =>
  constant ? Type.singleton("Boolean", ast.value) : Type.boolean;

const inferKeyword = (ast, constant) =>
  constant ? Type.singleton("Keyword", ast.value) : Type.keyword;
Enter fullscreen mode Exit fullscreen mode

We also need to modify every recursive call to the infer function to make sure we pass in the constant parameter value.

Now let's modify inferCallExpression and inferMemberExpression to use the new Type.map function since those types could be unions or intersections:

const inferCallExpression = (node, env, constant) => {
  let func = infer(node.func, env, constant);

  if (Type.isAny(func)) {
    return Type.any;
  } else if (Type.isUndefined(func) || Type.isUndefined(func.ret)) {
    // this should only happen during first typechecker pass
    return Type.undefinedType;
  } else if (Type.isTypeAlias(func)) {
    func = getAliasBase(func.name, env);
  }

  return Type.map(func, (func) => {
    if (!func.variadic && node.args.length > func.params.length) {
      throw new TypeException(
        `Too many arguments for function: ${node.args.length} given; ${func.params.length} expected`,
        node.srcloc
      );
    }

    // handle partially applied functions
    if (
      node.args.length <
      (func.variadic ? func.params.length - 1 : func.params.length)
    ) {
      // is partially applied
      const params = func.params.slice(0, node.args.length);

      if (env.checkingOn) {
        checkArgTypes(node, params, env, func, constant);
      }

      const newParams = func.params.slice(node.args.length);
      return Type.functionType(newParams, func.ret, func.variadic);
    }

    if (env.checkingOn) {
      checkArgTypes(node, func.params, env, func, constant);
    }

    return func.ret;
  });
};

// ...

const inferMemberExpression = (node, env, constant) => {
  const prop = node.property;
  let object = infer(node.object, env, constant);

  if (Type.isTypeAlias(object)) {
    object = getAliasBase(object.name, env);
  }

  return Type.map(object, (object) => {
    if (!Type.isObject(object)) {
      if (env.checkingOn) {
        throw new TypeException(
          `Member expression expects object type; ${Type.toString(
            object
          )} given`,
          node.srcloc
        );
      } else {
        return Type.any;
      }
    }

    const type = propType(object, prop.name);

    if (!type && env.checkingOn) {
      throw new TypeException(
        `Property ${prop.name} not found on object of type ${Type.toString(
          object
        )}`,
        node.srcloc
      );
    }

    return type ?? Type.any;
  });
};
Enter fullscreen mode Exit fullscreen mode

Let's also change inferVectorLiteral so that an empty vector has the vectorType of never instead of any:

const inferVectorLiteral = (node, env, constant) => {
  if (node.members.length === 0) {
    return Type.vector(Type.never);
  }

  const types = node.members.map((m) => infer(m, env, constant));
  const unified = unifyAll(...types);

  return Type.vector(unified, constant);
};
Enter fullscreen mode Exit fullscreen mode

We're also going to add a check to inferFunction to make sure that if a function is annotated with the never type it doesn't return something else as its inferred type:

const inferFunction = (node, env, constant) => {
  const params = node.params.map((p) => {
    if (p.typeAnnotation) {
      env.checkingOn = true;
    }
    const type = p.typeAnnotation
      ? fromTypeAnnotation(p.typeAnnotation, env)
      : Type.any;
    env.set(p.name, type);
    return type;
  });

  if (node.retType) {
    env.checkingOn = true;
  }

  const retType = node.retType
    ? fromTypeAnnotation(node.retType, env)
    : Type.any;
  let inferredRetType;

  for (let expr of node.body) {
    // type of last expression "wins"
    inferredRetType = infer(expr, env, constant);
  }

  if (env.checkingOn && !isSubtype(inferredRetType, retType)) {
    throw new TypeException(
      `Inferred return type ${Type.toString(
        inferredRetType
      )} is not a subtype of annotated return type ${Type.toString(retType)}`,
      node.srcloc
    );
  } else if (env.checkingOn && Type.isNever(retType)) {
    if (!Type.isNever(inferredRetType)) {
      throw new TypeException(
        `Function with return type never cannot return inferred type of ${Type.toString(
          inferredRetType
        )}`,
        node.srcloc
      );
    }
  }

  return Type.functionType(
    params,
    Type.isAny(retType) || Type.isUndefined(retType)
      ? inferredRetType
      : retType,
    node.variadic
  );
};
Enter fullscreen mode Exit fullscreen mode

Now go back and double check to make sure you didn't miss any function type signatures with the constant parameter and recursive calls to infer with the new constant parameter.

Subtyping Singleton, Tuple, Union, and Intersection Types

A singleton type is a subtype of its base type, whereas a type is only a subtype of a singleton type if the values match.

For a tuple type, make sure the types are the same length and that every type of type2 is a subtype of the corresponding type from type1.

A type is a subtype of a union type if it matches one arm of the union, and it's a subtype of an intersection type if it matches every intersection.

Here are the new cases in src/typechecker/isSubtype.js:

  // after checking function types
  if (Type.isTuple(type1) && Type.isTuple(type2)) {
    return type1.types.every((a, i) => isSubtype(a, type2.types[i]));
  }

  if (Type.isSingleton(type1)) {
    if (Type.isSingleton(type2)) return type1.value === type2.value;
    else return isSubtype(type1.base, type2);
  }

  if (Type.isUnion(type1)) {
    return type1.types.every((t1) => isSubtype(t1, type2));
  }

  if (Type.isUnion(type2)) {
    return type2.types.some((t2) => isSubtype(type1, t2));
  }

  if (Type.isIntersection(type1)) {
    return type1.types.some((a) => isSubtype(a, type2));
  }

  if (Type.isIntersection(type2)) {
    return type2.types.every((b) => isSubtype(type1, b));
  }
  // default return false
Enter fullscreen mode Exit fullscreen mode

Checking Singleton, Tuple, and Union Types

We don't need to add anything to our check function for intersection types, but we do for singletons, tuples, and unions.

With singleton types, we need to check if the literal value matches the singleton's value. We also need to make this check for properties of an object type.

In check in src/typechecker/check.js:

  if (Type.isSingleton(type) && isPrimitive(ast) && ast.value === type.value) {
    // primitive matches singleton type
    return;
  }
Enter fullscreen mode Exit fullscreen mode

And the new checkObject:

const checkObject = (ast, type, env) => {
  const astProps = ast.properties.map((prop) => ({
    name: prop.key.name,
    expr: prop.value,
  }));

  type.properties.forEach(({ name }) => {
    const astProp = astProps.find(({ name: astName }) => astName === name);

    if (!astProp) {
      throw new TypeException(
        `Property ${name} not found on record literal`,
        ast.srcloc
      );
    }
  });

  astProps.forEach(({ name, expr }) => {
    const pType = propType(type, name);

    if (!pType) {
      throw new TypeException(
        `Property ${name} not found on object of type ${Type.toString(type)}`,
        ast.srcloc
      );
    }

    if (
      Type.isSingleton(pType) &&
      isPrimitive(expr) &&
      pType.value === expr.value
    ) {
      // continue
    } else {
      check(expr, pType, env);
    }
  });
};
Enter fullscreen mode Exit fullscreen mode

We need to check each member of a tuple against the corresponding node of the VectorLiteral it's matched against.

In check:

  if (ast.kind === ASTTypes.VectorLiteral && Type.isTuple(type)) {
    return checkTuple(ast, type, env);
  }
Enter fullscreen mode Exit fullscreen mode

And checkTuple:

const checkTuple = (ast, type, env) => {
  let i = 0;
  for (let t of type.types) {
    check(ast.members[i], t, env);
    i++;
  }
};
Enter fullscreen mode Exit fullscreen mode

Finally, for unions, we need to check against each arm of the union. If nothing matches, we need to construct an error message based on the whole union type and throw that as a TypeException. See Jake Donham's post on union types for why it's good to do this.

In check:

  if (Type.isUnion(type)) {
    return checkUnion(ast, type, env);
  }
Enter fullscreen mode Exit fullscreen mode

And in checkUnion:

const checkUnion = (ast, type, env) => {
  for (let t of type.types) {
    try {
      check(ast, t, env);
      return;
    } catch (_) {
      // do nothing
    }
  }
  // Nothing matched, so construct error message based on the whole union
  const inferredType = infer(ast, env);

  if (!isSubtype(inferredType, type)) {
    throw new TypeException(
      `Type ${Type.toString(
        inferredType
      )} is not a valid subtype of ${Type.toString(type)}`,
      ast.srcloc
    );
  }
};
Enter fullscreen mode Exit fullscreen mode

A Slight Change to The Type Environment

I forgot when creating the type environment class to make sure checking propagates to child environments, so in the extend method we want to make sure that if checkingOn is set for the parent it's also set for the child in src/typechecker/TypeEnvironment.js:

  extend(name) {
    let env = new TypeEnvironment(this, { name });
    env.checkingOn = this.checkingOn;
    return env;
  }
Enter fullscreen mode Exit fullscreen mode

The TypeChecker

First, I found a bug in the TypeChecker class in src/typechecker/TypeChecker.js between articles. In the checkVariableDeclaration method, change the return statement to this:

    return {
      ...node,
      expression: this.checkNode(node.expression, env),
      type,
    };
Enter fullscreen mode Exit fullscreen mode

This fixes it so any type annotation given is still present during the second pass of the type checker. Previously it was being discarded prior to the second check. Whoops!

I also found a second bug that was causing an error when processing type aliases on the second pass because I had named the type annotation property on the AST node type instead of something else. Since the AST node is already defined, I decided just to check if it's the 2nd pass and, if so, return the value of the type property which is the type defined for the alias:

  checkTypeAlias(node, env) {
    const type = isSecondPass ? node.type : fromTypeAnnotation(node.type, env);
    env.setType(node.name, type);
    return { ...node, type };
  }
Enter fullscreen mode Exit fullscreen mode

Now for the current changes to the TypeChecker: in the checkNode method, we need to add cases for the new ConstantDeclaration and AsExpression nodes to the switch statement:

      // other cases...
      case ASTTypes.ConstantDeclaration:
        return this.checkConstantDeclaration(node, env);
      case ASTTypes.AsExpression:
        return this.checkAsExpression(node, env);
      // default case
Enter fullscreen mode Exit fullscreen mode

Now let's make a quick edit to checkVariableDeclaration so that it takes a constant parameter that defaults to false and passes that into the infer function:

  checkVariableDeclaration(node, env, constant = false) {
    let type;

    if (node.typeAnnotation) {
      type = fromTypeAnnotation(node.typeAnnotation, env);
      check(node.expression, type, env);
      env.checkingOn = true;
    } else {
      type = infer(node.expression, env, constant);
    }
  // rest of method body
  }
Enter fullscreen mode Exit fullscreen mode

Tuples can also be destructured in variable assignment, so we have to account for that too. We're also going to throw an error if someone tries to assign a value to a never type, though hopefully no one would ever try to do that in a variable declaration because... why would you do that?:

    // continuing from above
    if (env.checkingOn && Type.isNever(type)) {
      throw new TypeException(
        `Type never cannot be assigned a value`,
        node.srcloc
      );
    }

    if (node.lhv.kind === ASTTypes.Symbol) {
      env.set(node.lhv.name, type);
    } else if (node.lhv.kind === ASTTypes.VectorPattern) {
      if (!Type.isVector(type) && !Type.isList(type) && !Type.isTuple(type)) {
        throw new TypeException(
          `Vector pattern destructuring must take a vector, list, or tuple type`,
          node.srcloc
        );
      } else {
        let i = 0;
        for (let mem of node.lhv.members) {
          if (Type.isVector(type) || Type.isList(type)) {
            if (node.lhv.rest && i === node.lhv.members.length - 1) {
              env.set(mem.name, type);
            } else {
              env.set(
                mem.name,
                type.vectorType ? type.vectorType : type.listType
              );
            }
          } else {
            // is tuple type
            if (node.lhv.rest && i === node.lhv.members.length - 1) {
              // is rest variable
              env.set(mem.name, type.types.slice(i));
            } else {
              env.set(mem.name, type.types[i]);
            }
          }
          i++;
        }
      }
    } // continue with RecordPattern case
Enter fullscreen mode Exit fullscreen mode

Now let's add the checkConstantDeclaration method:

  checkConstantDeclaration(node, env) {
    return this.checkVariableDeclaration(node, env, true);
  }
Enter fullscreen mode Exit fullscreen mode

And a method for checkAsExpression:

  checkAsExpression(node, env) {
    env.checkingOn = true;
    const type = infer(node, env);
    return { ...node.expression, type };
  }
Enter fullscreen mode Exit fullscreen mode

We also need to modify checkSetExpression to make sure we're not trying to set a value for a constant, which is illegal:

  checkSetExpression(node, env) {
    if (node.lhv.kind !== ASTTypes.Symbol) {
      throw new TypeException(
        `Cannot use destructuring with set! assignment`,
        node.srcloc
      );
    }

    const nameType = env.get(node.lhv.name);

    if (Type.isSingleton(nameType)) {
      const exprType =
        node.expression.kind === ASTTypes.Symbol
          ? env.get(node.expression.name)
          : infer(node.expression, env);
      if (
        (Type.isSingleton(exprType) && exprType.value !== nameType.value) ||
        (isPrimitive(node.expression) &&
          node.expression.value !== nameType.value)
      ) {
        throw new TypeException(
          `Cannot assign different value to variable of singleton type ${Type.toString(
            nameType
          )}`,
          node.expression.srcloc
        );
      }
    }

    if (nameType.constant) {
      throw new TypeException(
        `Cannot assign to constant value ${node.lhv.name}`,
        node.srcloc
      );
    }

    if (env.checkingOn) {
      check(node.expression, nameType, env);
      return {
        kind: node.kind,
        lhv: node.lhv,
        expression: this.checkNode(node.expression, env),
        srcloc: node.srcloc,
        type: nameType,
      };
    }

    return {
      kind: node.kind,
      lhv: node.lhv,
      expression: this.checkNode(node.expression, env),
      srcloc: node.srcloc,
      type: infer(node, env),
    };
  }
Enter fullscreen mode Exit fullscreen mode

Changes to The Visitor

Now we need to add methods to the Visitor class for constant declarations and as expressions in src/visitor/Visitor.js. In the switch expression in the visit method:

      // other cases...
      case ASTTypes.ConstantDeclaration:
        return this.visitConstantDeclaration(node);
      case ASTTypes.AsExpression:
        return this.visitAsExpression(node);
      // default case
Enter fullscreen mode Exit fullscreen mode

Here's the visitConstantDeclaration method:

  visitConstantDeclaration(node) {
    const lhv = this.visit(node.lhv);
    const expression = this.visit(node.expression);
    return { ...node, lhv, expression };
  }
Enter fullscreen mode Exit fullscreen mode

And here's visitAsExpression:

  visitAsExpression(node) {
    const expression = this.visit(node.expression);
    return { ...node, expression };
  }
Enter fullscreen mode Exit fullscreen mode

Changes to The Desugarer

Since constant declarations desugar to variable declarations for the emitter, we need a method for that in the Desugarer class. It's pretty simple in src/desugarer/Desugarer.js:

  visitConstantDeclaration(node) {
    return { ...node, kind: ASTTypes.VariableDeclaration };
  }
Enter fullscreen mode Exit fullscreen mode

In visitFunctionDeclaration I've also added a name property to the AST node output, since a function declaration will always have a name. This lets us recover the names for our functions since passing the function into rt.makeFunction in the emitter was knocking out the function name.

Just add this line somewhere below the definition of lambda in the visitFunctionDeclaration method:

lambda.name = node.name.name;
Enter fullscreen mode Exit fullscreen mode

Changes to The Runtime

There's a new version of makeFunction in src/runtime/makeFunction.js to handle adding names to functions:

import objectHash from "object-hash";
import { curryN } from "ramda";
import { makeWandaValue } from "./conversion.js";
import { addMetaField } from "./object.js";
import { parseContract } from "./parseContract.js";

export const makeFunction = (func, { contract = "", name = "" } = {}) => {
  let fn = curryN(func.length, (...args) => {
    const val = makeWandaValue(func(...args));

    if (typeof val === "function") {
      return makeFunction(val);
    }

    return val;
  });
  const hash = objectHash(fn);

  addMetaField(fn, "wanda", true);
  addMetaField(fn, "arity", func.length);
  addMetaField(fn, "name", name || hash);

  if (contract !== "") {
    Object.defineProperty(fn, "contract", {
      enumerable: false,
      configurable: false,
      writable: false,
      value: parseContract(contract),
    });
  }

  Object.defineProperty(fn, "name", {
    enumerable: false,
    configurable: false,
    writable: false,
    value: name || hash,
  });

  return fn;
};
Enter fullscreen mode Exit fullscreen mode

I've added some metadata in makeObject in src/runtime/object.js:

export const makeObject = (obj) => {
  let newObj = {};
  addMetaField(newObj, "dict", obj);
  addMetaField(newObj, "constructor", function (...args) {
    return new obj.constructor(...args);
  });
  addMetaField(newObj, "type", "object");
  addMetaField(
    newObj[makeKeyword("constructor")],
    "name",
    obj.constructor?.name ?? "WandaObject"
  );

  // to allow destructuring
  for (let [k, v] of Object.entries(obj)) {
    newObj[makeSymbol(k)] = v;
  }

  return newObj;
};
Enter fullscreen mode Exit fullscreen mode

Finally, make sure you have imported the isNil function into src/runtime/makeRuntime.js and have it added to the runtime object:

import { isNullish as isNil } from "../shared/utils.js";

export const makeRuntime = () => {
  return {
    // other values...
    isNil,
  };
};
Enter fullscreen mode Exit fullscreen mode

Changes to The Emitter

There's only one change to the emitter: handling re-adding the function name when it exists on the lambda node we're receiving from the desugarer in src/emitter/Emitter.js by passing it as a parameter to rt.makeFunction:

  emitLambdaExpression(node, ns) {
    const funcNs = ns.extend("Lambda");
    let params = [];
    let i = 0;

    for (let p of node.params) {
      funcNs.set(p.name.name, makeSymbol(p.name));

      if (node.variadic && i === node.params.length - 1) {
        params.push(`...${this.emit(p.name, funcNs)}`);
      } else {
        params.push(this.emit(p.name, funcNs));
      }
      i++;
    }

    let code = `rt.makeFunction((${params.join(", ")}) => {\n`;

    let j = 0;
    for (let expr of node.body) {
      if (j === node.body.length - 1) {
        code += `return ${this.emit(expr, funcNs)};`;
      } else {
        code += this.emit(expr, funcNs) + ";\n";
      }
      j++;
    }

    code += "\n}";
    code += `${node.name ? `, { name: "${node.name}" }` : ""})`;

    return code;
  }
Enter fullscreen mode Exit fullscreen mode

Changes to Shared Types

I've added an isList function to src/shared/cons.js to check for if a Cons cell is a valid list (that is, it ends in null):

export const isList = (obj) => {
  if (obj != null && obj.constructor?.name !== "Cons") {
    return false;
  } else if (obj == null) {
    return true;
  }

  // only option left is cons
  return isList(obj.cdr);
};
Enter fullscreen mode Exit fullscreen mode

I've used it in the printer, which you'll see below.

Changes to The Printer

In src/printer/printString.js I've rewritten printList using the isList function and some recursive magic to detect if the list is a proper list or degenerate. Add isList to your import from shared/cons.js and here's the new printList function:

const printList = (list) => {
  const printListElems = (next, str = "") => {
    if (next == null) {
      return str;
    } else if (next.constructor?.name === "Cons") {
      return printListElems(next.cdr, str + printString(next.car) + " ");
    } else {
      return str + ". " + printString(next);
    }
  };

  let prStr = printListElems(list);
  // if it's a list, get rid of the extra space at the end
  prStr = isList(list) ? prStr.slice(0, -1) : prStr;
  return "(" + prStr + ")";
};
Enter fullscreen mode Exit fullscreen mode

We also need to print AST nodes for ConstantDeclaration and AsExpression nodes. First, add the cases to the switch statement in the print method:

      // other cases...
      case ASTTypes.ConstantDeclaration:
        return this.printConstantDeclaration(node, indent);
      case ASTTypes.AsExpression:
        return this.printAsExpression(node, indent);
      // default case
Enter fullscreen mode Exit fullscreen mode

Here's printConstantDeclaration:

  printConstantDeclaration(node, indent) {
    let prStr = `${prIndent(indent)}ConstantDeclaration:\n`;
    prStr += `${this.print(node.lhv, indent + 2)}\n`;
    prStr += `${this.print(node.expression, indent + 2)}`;
    return prStr;
  }
Enter fullscreen mode Exit fullscreen mode

And here's printAsExpression:

  printAsExpression(node, indent) {
    return this.print(node.expression, indent);
  }
Enter fullscreen mode Exit fullscreen mode

Updating The Core Library

Finally, I've added names to all the functions in our core library. I've also fixed the append function for vectors, since there was a bug in it earlier.

Here's the entire updated version of lib/js/core.js:

import equal from "fast-deep-equal/es6/index.js";
import { makeModule } from "../../src/runtime/Module.js";
import { Cons, cons } from "../../src/shared/cons.js";
import { print } from "../../src/printer/print.js";
import { println } from "../../src/printer/println.js";
import { printString } from "../../src/printer/printString.js";
import { isTruthy, makeKeyword } from "../../src/runtime/utils.js";
import { Exception } from "../../src/shared/exceptions.js";
import { hasMetaField } from "../../src/runtime/object.js";

export const theModule = makeModule("Core", (rt, ns) => {
  const isList = (obj) => {
    if (!rt.isNil(obj) && !(obj instanceof Cons)) {
      return false;
    } else if (rt.isNil(obj)) {
      return true;
    }

    // only option left is cons
    return isList(obj.cdr);
  };

  return {
    print: rt.makeFunction(print, {
      contract: "(any -> string)",
      name: "print",
    }),
    println: rt.makeFunction(println, {
      contract: "(any -> string)",
      name: "println",
    }),
    cons: rt.makeFunction(cons, {
      contract: "(any, any -> (list any))",
      name: "cons",
    }),
    car: rt.makeFunction((pair) => pair.car, {
      contract: "((list any) -> any)",
      name: "car",
    }),
    cdr: rt.makeFunction((pair) => pair.cdr, {
      contract: "((list any) -> any)",
      name: "cdr",
    }),
    string: rt.makeFunction(printString, {
      contract: "(any -> string)",
      name: "string",
    }),
    number: rt.makeFunction(Number, {
      contract: "(string -> number)",
      name: "number",
    }),
    boolean: rt.makeFunction((val) => isTruthy(val), {
      contract: "(any -> boolean)",
      name: "boolean",
    }),
    keyword: rt.makeFunction((str) => Symbol.for(":" + str), {
      contract: "(string -> keyword)",
      name: "keyword",
    }),
    "+": rt.makeFunction(
      (a, b, ...nums) => nums.reduce((sum, n) => sum + n, a + b),
      { contract: "(number, number, &(vector number) -> number)", name: "+" }
    ),
    "-": rt.makeFunction(
      (a, b, ...nums) => nums.reduce((diff, n) => diff - n, a - b),
      { contract: "(number, number, &(vector number) -> number)", name: "-" }
    ),
    "*": rt.makeFunction(
      (a, b, ...nums) => nums.reduce((prod, n) => prod * n, a * b),
      { contract: "(number, number, &(vector number) -> number)", name: "*" }
    ),
    "/": rt.makeFunction(
      (a, b, ...nums) => nums.reduce((quot, n) => quot / n, a / b),
      { contract: "(number, number, &(vector number) -> number)", name: "/" }
    ),
    "%": rt.makeFunction(
      (a, b, ...nums) => nums.reduce((quot, n) => quot % n, a % b),
      { contract: "(number, number, &(vector number) -> number)", name: "%" }
    ),
    "=": rt.makeFunction((a, b) => a === b, {
      contract: "(number, number -> boolean)",
      name: "=",
    }),
    ">": rt.makeFunction((a, b) => a > b, {
      contract: "(number, number -> boolean)",
      name: ">",
    }),
    ">=": rt.makeFunction((a, b) => a >= b, {
      contract: "(number, number -> boolean)",
      name: ">=",
    }),
    "<": rt.makeFunction((a, b) => a < b, {
      contract: "(number, number -> boolean)",
      name: "<",
    }),
    "<=": rt.makeFunction((a, b) => a <= b, {
      contract: "(number, number -> boolean)",
      name: "<=",
    }),
    not: rt.makeFunction((x) => !x, {
      contract: "(any -> boolean)",
      name: "not",
    }),
    list: rt.makeFunction((...args) => Cons.from(args), {
      contract: "(&(vector any) -> (list any))",
      name: "list",
    }),
    length: rt.makeFunction(
      (obj) => {
        if (obj instanceof Cons) {
          let i = 0;
          for (let _ of obj) {
            i++;
          }
          return i;
        }

        return obj.length;
      },
      { contract: "((list any) -> number)", name: "length" }
    ),
    get: rt.makeFunction(
      (n, obj) => {
        const value = obj.get(n);

        if (value === undefined) {
          throw new Exception(`Value for index ${n} not found on object`);
        }

        return value;
      },
      { contract: "((list any) -> any)", name: "get" }
    ),
    "list?": rt.makeFunction(isList, {
      contract: "((list any) -> boolean)",
      name: "list?",
    }),
    "pair?": rt.makeFunction((obj) => obj instanceof Cons, {
      contract: "((list any) -> boolean)",
      name: "pair?",
    }),
    "number?": rt.makeFunction((obj) => typeof obj === "number", {
      contract: "(any -> boolean)",
      name: "number?",
    }),
    "string?": rt.makeFunction((obj) => typeof obj === "string", {
      contract: "(any -> boolean)",
      name: "string?",
    }),
    "boolean?": rt.makeFunction((obj) => typeof obj === "boolean", {
      contract: "(any -> boolean)",
      name: "boolean?",
    }),
    "nil?": rt.makeFunction((obj) => obj == null, {
      contract: "(any -> boolean)",
      name: "nil?",
    }),
    "keyword?": rt.makeFunction(
      (obj) => typeof obj === "symbol" && obj.description.startsWith(":"),
      { contract: "(any -> boolean)", name: "keyword?" }
    ),
    "equal?": rt.makeFunction(
      (a, b) => {
        if (rt.hasDict(a) && rt.hasDict(b)) {
          return equal(rt.getMetaField(a, "dict"), rt.getMetaField(b, "dict"));
        }
        return equal(a, b);
      },
      { contract: "(any, any) -> boolean", name: "equal?" }
    ),
    "is?": rt.makeFunction((a, b) => Object.is(a, b), {
      contract: "(any, any -> boolean)",
      name: "is?",
    }),
    append: rt.makeFunction(
      (obj1, obj2) => {
        if (typeof obj1 === "string" && typeof obj2 === "string") {
          return obj1 + obj2;
        } else if (obj1 instanceof Cons) {
          return Cons.from(obj1).append(obj2);
        } else if (Array.isArray(obj1)) {
          let v = [...obj1];
          v.push(obj2);
          return v;
        } else {
          rt.failRuntime(`Value of type ${typeof obj1} cannot be appended`);
        }
      },
      { contract: "(any, any -> any)", name: "append" }
    ),
    with: rt.makeFunction(
      (rec1, rec2) => {
        const newDict = Object.assign(
          {},
          rt.getMetaField(rec1, "dict"),
          rt.getMetaField(rec2, "dict")
        );
        return rt.makeObject(newDict);
      },
      { contract: "(any, any -> any)", name: "with" }
    ),
    prop: rt.makeFunction((prop, obj) => rt.getField(obj, prop), {
      contract: "(string, any -> any)",
      name: "prop",
    }),
    each: rt.makeFunction(
      (fn, lst) => {
        for (let item of lst) {
          fn(item);
        }
      },
      {
        contract: "((any -> nil), (list any) -> nil)",
        name: "each",
      }
    ),
    map: rt.makeFunction(
      (fn, lst) => {
        let mapped = [];
        for (let item of lst) {
          mapped.push(fn(item));
        }
        return Cons.from(mapped);
      },
      {
        contract: "((any -> any), (list any) -> (list any))",
        name: "map",
      }
    ),
    filter: rt.makeFunction(
      (fn, lst) => {
        let filtered = [];
        for (let item of lst) {
          if (rt.isTruthy(fn(item))) {
            filtered.push(item);
          }
        }
        return Cons.from(filtered);
      },
      {
        contract: "((any -> boolean), (list any) -> (list any))",
        name: "filter",
      }
    ),
    fold: rt.makeFunction(
      (fn, init, lst) => {
        let acc = init;
        for (let item of lst) {
          acc = fn(acc, item);
        }
        return acc;
      },
      {
        contract: "((any, any -> any), any, (list any) -> any)",
        name: "fold",
      }
    ),
    "fold-r": rt.makeFunction(
      (fn, init, lst) => {
        let acc = init;
        const reversed = [...lst].reverse();
        for (let item of reversed) {
          acc = fn(acc, item);
        }
        return acc;
      },
      {
        contract: "((any, any -> any), any, (list any) -> any)",
        name: "fold-r",
      }
    ),
    typeof: rt.makeFunction((obj) => {
      if (obj == null) {
        return "nil";
      }

      if (hasMetaField(obj, "type")) {
        return obj[makeKeyword("type")];
      }

      if (obj instanceof Cons) {
        return isList(obj) ? "list" : "pair";
      }

      if (Array.isArray(obj)) {
        return "vector";
      }

      return typeof obj;
    }),
  };
});
Enter fullscreen mode Exit fullscreen mode

Trying Out The New Functionality

Here's an example of how to use the new type. First, enter the wanda command, then at the prompt:

(type Coord ({ type: "cartesian", x: number, y: number } || { type: "polar", angle: number, magnitude: number }))
(var (c: Coord) { type: "cartesian", x: 42, y: 17.5 })
Enter fullscreen mode Exit fullscreen mode

Here's an example of an intersection type:

(type Point2D { x: number, y: number })
(type Point3D (Point2D && { z: number }))
Enter fullscreen mode Exit fullscreen mode

Have fun with it!

Conclusion

I hope that was as much fun for you as it was for me! Now we've greatly increased the expressivity of our type system.

As always, you can check out the code for this post at the relevant tag at the GitHub repo.

Next time we're going to add branching with if expressions, which will also allow us to perform type narrowing in the checker so we can do better things with union types. For instance, if you have a function that takes the type number || string you can define branches of the function that work with it in each case in a type safe manner.

I'll see you there!

compilers Article's
30 articles in total
Favicon
How to create simple tool for compile the Linux Kernel
Favicon
Unraveling Undefined Behavior: Performance Optimizations in Modern Compilers
Favicon
Video — Deep dive: Compiling deep learning models, from XLA to PyTorch 2
Favicon
The current state of Lithia after 2 years
Favicon
Verificando e Gerando Expressões
Favicon
Expressões encadeadas e agrupamento
Favicon
Improving Compiler Performance with Profile Guided Optimization
Favicon
Understanding Interpreters and Compilers in Programming
Favicon
Create Your Own Programming Language 9: Iteration
Favicon
Crafting Interpreters
Favicon
Create Your Own Programming Language 8: Conditionals
Favicon
Create Your Own Programming Language 7: More Types
Favicon
Create Your Own Programming Language 6: Functions
Favicon
Create Your Own Programming Language 4: Variables and Types
Favicon
Create Your Own Programming Language 3: Call Expressions
Favicon
How To Create Your Own Programming Language
Favicon
Create Your Own Programming Language 1: Numbers
Favicon
An alternative to "Distinguishing an Interpreter from a Compiler"
Favicon
On What Lexers Do
Favicon
Compilers Could Be Way More Fun
Favicon
Incremental compilation for Crystal - Part 3
Favicon
Incremental compilation for Crystal - Part 1
Favicon
Incremental compilation for Crystal - Part 2
Favicon
Crafting a Compiler in Rust: Lexical Analysis
Favicon
Crafting a Compiler in Rust: Introduction
Favicon
What is an ELF file?
Favicon
Static vs dynamic linking
Favicon
Speeding up ReScript compilation using interface files
Favicon
🕶 What it takes to build a Static Analysis tool
Favicon
A Compiler optimization area

Featured ones: