Matanuska ADR 002 - Architecture

Published at
12/18/2024
Categories
basic
typescript
interpreters
architecture
Author
jfhbrook

This article is a repost of an ADR from Matanuska BASIC, my attempt to write a BASIC interpreter in TypeScript.

Context

In Writing Interactive Compilers & Interpreters (WIC&I), PJ Brown outlines a general architecture for a BASIC-like interpreter. The architecture overall has aged well, and will be implemented in Matanuska BASIC with a few adjustments.

Decision

The following architecture will be implemented.

Host

WIC&I refers to this as the I/O module. I'm borrowing a page from PowerShell and calling it the Host.

This is the one component which varies based on environment or frontend - that is, Host is an interface, and ConsoleHost is the implementation for the console specifically.

Host has a lot of responsibilities:

  • Prompting/reading input
  • Writing simple output and/or logging - a responsibility it shares with a PowerShell host
  • File reading/writing and tracking file handles
  • Process spawning, stdio redirection and tracking child processes/PIDs
  • Ports - both serial and networking, as well as HTTP
  • If applicable, drawing procedures - e.g., wrapping ink or ratatui/crossterm

This is a larger surface area than most objects. However, I feel the division of responsibility is clear.
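
As a rough illustration of the interface/implementation split, here is a minimal sketch covering only prompting, output and file access. The method names and the in-memory MemoryHost are assumptions made for this example, not Matanuska's actual API; a ConsoleHost would implement the same interface against a real terminal.

```typescript
// Hypothetical sketch: method names are illustrative, not Matanuska's API.
interface Host {
  // Prompt the user and resolve with the line they typed.
  prompt(message: string): Promise<string>;
  // Write a line of simple output.
  writeLine(text: string): void;
  // File access is routed through the host as well.
  readFile(path: string): Promise<string>;
  writeFile(path: string, contents: string): Promise<void>;
}

// An in-memory host, e.g. for tests. A ConsoleHost would implement the
// same interface against a real terminal and file system.
class MemoryHost implements Host {
  readonly output: string[] = [];
  private readonly inputs: string[];
  private readonly files = new Map<string, string>();

  constructor(inputs: string[] = []) {
    this.inputs = inputs.slice();
  }

  async prompt(message: string): Promise<string> {
    this.output.push(message);
    const next = this.inputs.shift();
    if (next === undefined) throw new Error("end of input");
    return next;
  }

  writeLine(text: string): void {
    this.output.push(text);
  }

  async readFile(path: string): Promise<string> {
    const contents = this.files.get(path);
    if (contents === undefined) throw new Error(`no such file: ${path}`);
    return contents;
  }

  async writeFile(path: string, contents: string): Promise<void> {
    this.files.set(path, contents);
  }
}
```

Because the rest of the interpreter only sees Host, swapping the console for another frontend would not require touching the other components.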

Translator

This component contains the main REPL and feeds parsed lines to other components. The basic loop is:

  1. Read source code input from the prompt
  2. Use the scanner and parser to generate the AST for a line
  3. If prefixed by a line number, feed to the Editor
  4. If not prefixed by a line number, feed to the Commander (the "command module" in WIC&I) as an immediate command

The Translator may also non-interactively read directly from a file.
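
The dispatch step of that loop can be sketched as follows. This is a toy under stated assumptions: parseLine stands in for the real scanner and parser, ParsedLine stands in for the AST, and the two callbacks stand in for the Editor and the Commander. None of these names come from Matanuska itself.

```typescript
// All names here are hypothetical stand-ins for the real components.
interface ParsedLine {
  lineNo: number | null; // null for an immediate command
  source: string; // stand-in for the parsed AST of the line
}

// A toy "parser" that only splits off a leading line number, if any.
function parseLine(input: string): ParsedLine {
  const match = /^\s*(\d+)\s*(.*)$/.exec(input);
  if (match) {
    return { lineNo: parseInt(match[1], 10), source: match[2] };
  }
  return { lineNo: null, source: input.trim() };
}

// Feed each input to the editor or the commander, depending on whether
// it carries a line number. Works the same for prompt input or a file.
function translate(
  inputs: string[],
  toEditor: (line: ParsedLine) => void,
  toCommander: (line: ParsedLine) => void
): void {
  for (const input of inputs) {
    const line = parseLine(input);
    if (line.lineNo !== null) {
      toEditor(line);
    } else {
      toCommander(line);
    }
  }
}
```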

Compiler

Lines in BASIC are initially parsed without knowing the context of the rest of the program. This means that a second pass is needed to:

  1. Check that blocks are closed properly
  2. Resolve GOTOs
  3. Resolve variables (refer to Crafting Interpreters for what this entails)

There may be other needs - this list is non-exhaustive.
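
To make the second pass concrete, here is a small sketch that checks FOR/NEXT pairing and resolves GOTO targets to indexes. The statement shapes, the secondPass name and the error messages are assumptions for illustration only; variable resolution is left out.

```typescript
// Hypothetical statement shapes; a real AST would carry far more detail.
type Stmt =
  | { kind: "for" }
  | { kind: "next" }
  | { kind: "goto"; target: number; index: number | null } // index filled in here
  | { kind: "print" };

interface NumberedLine {
  lineNo: number;
  stmt: Stmt;
}

// Second pass over a whole program: returns a list of error messages and
// fills in the index of each GOTO target as a side effect.
function secondPass(lines: NumberedLine[]): string[] {
  const errors: string[] = [];

  // Map each line number to its position in the program.
  const indexOf = new Map<number, number>();
  lines.forEach((line, i) => indexOf.set(line.lineNo, i));

  const openFors: number[] = []; // line numbers of FORs not yet closed
  for (const line of lines) {
    const stmt = line.stmt;
    switch (stmt.kind) {
      case "for":
        openFors.push(line.lineNo);
        break;
      case "next":
        if (openFors.pop() === undefined) {
          errors.push(`${line.lineNo}: NEXT without FOR`);
        }
        break;
      case "goto": {
        const index = indexOf.get(stmt.target);
        if (index === undefined) {
          errors.push(`${line.lineNo}: GOTO ${stmt.target}: no such line`);
        } else {
          stmt.index = index;
        }
        break;
      }
    }
  }
  for (const lineNo of openFors) {
    errors.push(`${lineNo}: FOR without NEXT`);
  }
  return errors;
}
```

None of these checks can run while a single line is being parsed, since they all depend on lines that may not have been entered yet.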

In WIC&I, there is a corresponding component called the pre-run module. This module doesn't generate bytecode from an AST - rather, it fills in context fields in the Program that were set to null on the first pass.

Editor

In BASIC, editing programs is accomplished through the shell. If a command is prefixed with a line number, it is inserted into the program loaded in the editor.

The editor's responsibility is to take Lines and insert, update or remove them from a Program, and return the full Program when it's time to either RUN or LIST the program. Its interface is similar to a dictionary.
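
A minimal sketch of that dictionary-like interface, assuming lines are keyed by line number and stored as plain strings rather than parsed Lines. The method names are hypothetical, and treating an empty line as a deletion is an assumption borrowed from traditional BASICs rather than something stated in this ADR.

```typescript
// Hypothetical sketch: a real editor would store parsed Lines, not strings.
class Editor {
  private readonly lines = new Map<number, string>();

  // Insert or update a line. An empty body removes the line (assumed here).
  setLine(lineNo: number, source: string): void {
    if (source.trim() === "") {
      this.lines.delete(lineNo);
    } else {
      this.lines.set(lineNo, source);
    }
  }

  // Return the full program, ordered by line number, for RUN or LIST.
  program(): Array<[number, string]> {
    return Array.from(this.lines.entries()).sort((a, b) => a[0] - b[0]);
  }
}
```

Lines can be entered in any order; sorting happens only when the full program is requested.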

Recreator

BASIC typically doesn't retain the original source code in the editor - rather, the editor holds parsed and compiled bytecode. This is largely to save space. In order to LIST the program, the source code has to be recreated from that bytecode. WIC&I calls this recreating.

Note that a recreator combined with a parser is effectively a formatter.
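
The idea can be sketched against a small AST rather than bytecode. The node shapes and function names below are assumptions for illustration; the point is only that source text is regenerated from structure, so whatever spacing the user typed is replaced by a canonical form.

```typescript
// Hypothetical AST shapes for a tiny subset of BASIC.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "var"; name: string }
  | { kind: "binary"; op: "+" | "-" | "*" | "/"; left: Expr; right: Expr };

type Stmt =
  | { kind: "print"; expr: Expr }
  | { kind: "rem"; text: string }; // remarks must survive to be listed

// Recreate an expression; nested binary expressions are parenthesized.
function recreateExpr(expr: Expr, nested: boolean = false): string {
  switch (expr.kind) {
    case "num":
      return String(expr.value);
    case "var":
      return expr.name;
    case "binary": {
      const text = `${recreateExpr(expr.left, true)} ${expr.op} ${recreateExpr(expr.right, true)}`;
      return nested ? `(${text})` : text;
    }
  }
}

// Recreate one numbered line of source.
function recreateLine(lineNo: number, stmt: Stmt): string {
  switch (stmt.kind) {
    case "print":
      return `${lineNo} PRINT ${recreateExpr(stmt.expr)}`;
    case "rem":
      return `${lineNo} REM ${stmt.text}`;
  }
}
```

Parsing a line and immediately recreating it yields the same statement in normalized form, which is the formatter behavior noted above.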

AST and Bytecode

In WIC&I, a Program is the core abstraction for storing what it calls the "internal language". In a traditional BASIC, this is stored in a "reverse Polish" format - similar to a modern bytecode - but without stripping non-operating information from the source, such as comments.

The rationale for using the "reverse Polish" format over a tree is that executing it is faster - it can be done with a linear scan and stack operations, rather than doing pointer lookups through a visitor pattern. However, if non-operating information is stripped from this format, then the source code can't be recreated.
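
To illustrate the "linear scan and stack operations" point, here is a minimal evaluator for an arithmetic expression in reverse Polish form. The token type and function name are assumptions for this example, not the format WIC&I or Matanuska actually uses.

```typescript
// Hypothetical token type: numbers are operands, strings are operators.
type RpnToken = number | "+" | "-" | "*" | "/";

// Evaluate with one left-to-right scan and a value stack; no tree walking.
function evalRpn(tokens: RpnToken[]): number {
  const stack: number[] = [];
  for (const token of tokens) {
    if (typeof token === "number") {
      stack.push(token);
      continue;
    }
    // Operators pop their operands; the right operand is on top.
    const right = stack.pop();
    const left = stack.pop();
    if (left === undefined || right === undefined) {
      throw new Error("stack underflow");
    }
    switch (token) {
      case "+":
        stack.push(left + right);
        break;
      case "-":
        stack.push(left - right);
        break;
      case "*":
        stack.push(left * right);
        break;
      case "/":
        stack.push(left / right);
        break;
    }
  }
  if (stack.length !== 1) throw new Error("malformed expression");
  return stack[0];
}
```

For example, (1 + 2) * 3 is written as 1 2 + 3 * in this form, so evalRpn([1, 2, "+", 3, "*"]) returns 9.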

In Matanuska BASIC, I will be implementing both an AST and a bytecode. The Program will be the top-level node in an AST, and the output of the first pass executed in the Translator. The bytecode will be generated from a Program by the compiler when RUN is executed.

This means that there are two intermediate representations, not just one. It also means that the compiler has to do more. However, it means that the bytecode can strip non-operating information from the AST, as well as implement optimizations and use a simplified instruction set.
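
A minimal sketch of the AST-to-bytecode step, assuming a tiny AST with PRINT and REM statements and a hypothetical four-instruction set. It shows the two properties described above: remarks are kept in the AST but emit no instructions, and expressions are flattened into postfix order.

```typescript
// Hypothetical AST and instruction set, for illustration only.
type Expr =
  | { kind: "num"; value: number }
  | { kind: "binary"; op: "+" | "*"; left: Expr; right: Expr };

type Stmt =
  | { kind: "print"; expr: Expr }
  | { kind: "rem"; text: string };

type Instr =
  | { op: "push"; value: number }
  | { op: "add" }
  | { op: "mul" }
  | { op: "print" };

// Emit operands first, then the operator (postfix order).
function compileExpr(expr: Expr, out: Instr[]): void {
  if (expr.kind === "num") {
    out.push({ op: "push", value: expr.value });
    return;
  }
  compileExpr(expr.left, out);
  compileExpr(expr.right, out);
  out.push(expr.op === "+" ? { op: "add" } : { op: "mul" });
}

// Compile a whole program; remarks stay in the AST but produce no bytecode.
function compile(program: Stmt[]): Instr[] {
  const out: Instr[] = [];
  for (const stmt of program) {
    if (stmt.kind === "rem") continue;
    compileExpr(stmt.expr, out);
    out.push({ op: "print" });
  }
  return out;
}
```

Since LIST works from the AST, the remark is still available for recreating even though the bytecode has dropped it.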

Commander

The commander - called the "command module" in WIC&I - has a few cross-cutting responsibilities.

Sessions

The commander is in charge of initializing and closing sessions. This includes:

  • Initializing the Editor and Program
  • Initializing the Host
  • Managing readline functionality, including loading history
  • Running any autoexec.bas (a feature of MSX BASIC, analogous to ~/.bashrc)
  • Printing any startup messages
  • Gracefully closing resources on exit

Executing Commands

Most BASIC implementations have a number of commands which aren't implemented through the runtime. These include editor commands, as well as RUN. In these cases, the commander is in charge of taking the AST input and executing it directly.

But in the case of a runtime command, the commander is still in charge of delegating to the runtime. This makes the commander the common entry point for all command execution, such that the translator always passes parsed input to the commander.

Interrupts & Errors

In the case of interrupts, the commander is in charge of ensuring that the runtime is paused smoothly, and that execution is handed back to the translator.

In the case of interrupts caused by errors, the commander is in charge of reporting and recovery. Generally, the translator should not be doing error handling.

Runtime

This component is straightforwardly a bytecode VM.
