Logo

dev-resources.site

for different kinds of informations.

Incremental compilation for Crystal - Part 2

Published at
1/4/2023
Categories
crystal
incremental
compilers
compilation
Author
asterite
Author
8 person written this
asterite
open
Incremental compilation for Crystal - Part 2

In the last post we found out that we could build a file dependencies graph for a given project. We also saw that, depending on the code, a file could depend on one another even if it's not immediately clear that that's the case.

But... is that dependency issue really an issue?

Checking the tool against real-world projects

In this section I'll look at the dependencies graph of two medium/big projects:

  • Mint
  • The Crystal compiler

We can use this branch of the compiler to compile Mint and the compiler to get the "dependencies.dot" files.

Then we can use this program to analyze the above files. The program will read a "dependencies.dot" file, build the graph in memory, and then for each node in that graph count how many nodes are, recursively, affected by it. This answers this question: if we make a change to a file, what are all the files that we need to re-analyze?

Transitive dependencies

Before looking at that, though, note that I said "recursively". If a file A depends on B, and B depends on C, why if we make changes to C we need to re-analyze A?

Let's take a look at an example.

# foo.cr
require "bar"

def foo
  Bar.bar
end

foo

# bar.cr
require "baz"

module Bar
  def self.bar
    Baz.baz
  end
end

# baz.cr
module Baz
  def self.baz
    1
  end
end
Enter fullscreen mode Exit fullscreen mode

Here we have that foo.cr depends on bar.cr, and bar.cr depends on baz.cr.

What does foo return? It returns an integer, because foo calls Bar.bar, which in turn calls Baz.baz, which returns 1.

If we change Baz.baz to return a string, foo will start returning a string too. So, effectively, changing baz.cr affected foo.cr. And note that the program still compiles just fine.

Does this transitive dependency re-analysis happen in other languages too? I don't know. But let's imagine Crystal forces you to specify argument and return types for methods, like Java, C#, Go, Rust, and essentially every statically typed language out there.

Now we have these files:

# foo.cr
require "bar"

def foo : Int32
  Bar.bar
end

foo

# bar.cr
require "baz"

module Bar
  def self.bar : Int32
    Baz.baz
  end
end

# baz.cr
module Baz
  def self.baz : Int32
    1
  end
end
Enter fullscreen mode Exit fullscreen mode

Now if we change Baz.baz to return a String, Bar will stop compiling. We can change Bar.bar to return a String and then change foo to return a String and then it will compile again. But doing that we changed all those files, so they naturally have to be re-analyzed.

But files don't always have to be re-analyzed like this. Let's say we change Baz.baz to return a String. We can make Bar still compile and return an Int32 like this:

module Bar
  def self.bar : Int32
    Baz.baz.size # The size of the string
  end
end
Enter fullscreen mode Exit fullscreen mode

Here we ended up changing Bar because of changes made to Baz, but we didn't have to change foo. Not only that: we don't have to re-analyze foo because Bar.bar still returns an int. We need to re-compile bar.cr, but foo can still keep calling the same method.

With all this, another partial conclusion is that not having mandatory type signatures in methods means even more that a change in one file can affect files that are far away in the dependency graph.

Checking affected files counts

If we run the above mentioned program on the "dependencies.dot" file for Mint, we'll see something like this:

src/macros.cr: 1
src/mint.cr: 1
src/utils/index_html.ecr: 1
src/utils/service_worker.ecr: 1
src/constants.cr: 2
src/app_init/scaffold.cr: 1193
src/artifact_cleaner.cr: 1193
src/assets.cr: 1193
src/ast.cr: 1193
...
1059 lines more
...

Total files: 1405
Enter fullscreen mode Exit fullscreen mode

I forgot to mention that this only outputs files in "your project". That is, the most common scenario is that you'll change files in your project, not in shards or the standard library.

Also, the output is sorted by number of affected files for a given file.

In this project there are a total of 1405 files, including files from shards and the standard library. And for the vast majority of files, if we make a change to them it ends up affecting 1193 files. That means that we'll end up needing to re-analyze most of the files anyway.

Doing this with the compiler is similar:

src/compiler/crystal.cr: 1
src/compiler/crystal/tools/doc/html/_head.html: 1
src/compiler/crystal/tools/doc/html/_list_items.html: 1
src/compiler/crystal/tools/doc/html/_method_detail.html: 1
src/compiler/crystal/tools/doc/html/_method_summary.html: 1
src/compiler/crystal/tools/doc/html/_methods_inherited.html: 1
src/compiler/crystal/tools/doc/html/_other_types.html: 1
src/compiler/crystal/tools/doc/html/_sidebar.html: 1
src/compiler/crystal/tools/doc/html/js/doc.js: 1
src/compiler/crystal/tools/doc/html/main.html: 1
src/compiler/crystal/tools/doc/html/sitemap.xml: 1
src/compiler/crystal/tools/doc/html/type.html: 1
src/compiler/crystal/tools/init/template/example.cr.ecr: 1
src/compiler/crystal/tools/init/template/example_spec.cr.ecr: 1
src/compiler/crystal/tools/init/template/gitignore.ecr: 1
src/compiler/crystal/tools/init/template/license.ecr: 1
src/compiler/crystal/tools/init/template/readme.md.ecr: 1
src/compiler/crystal/tools/init/template/shard.yml.ecr: 1
src/compiler/crystal/tools/playground/views/_workbook.html.ecr: 1
src/compiler/crystal/tools/playground/views/layout.html.ecr: 1
src/crystal/compiler_rt/divmod128.cr: 1
src/crystal/compiler_rt/fixint.cr: 1
src/crystal/compiler_rt/float.cr: 1
src/crystal/compiler_rt/mul.cr: 1
src/crystal/once.cr: 1
src/crystal/system/process.cr: 1
src/http/server/handlers/static_file_handler.html: 1
src/log.cr: 1
src/log/setup.cr: 3
src/compiler/crystal/command.cr: 6
src/compiler/crystal/command/cursor.cr: 6
src/compiler/crystal/command/eval.cr: 6
src/compiler/crystal/command/repl.cr: 6
src/compiler/crystal/command/spec.cr: 6
src/compiler/crystal/command/docs.cr: 7
src/compiler/crystal/command/env.cr: 7
src/compiler/crystal/command/format.cr: 7
src/compiler/crystal/command/playground.cr: 7
src/compiler/crystal/tools/print_hierarchy.cr: 7
src/spec/cli.cr: 7
src/array.cr: 451
src/base64.cr: 451
...
444 lines more
...

Total files: 513
Enter fullscreen mode Exit fullscreen mode

Again, with a total of 513 files, making changes to most files ends up affecting 451 other files, which are almost all of them. So trying to reuse previous compilation data won't be a lot of benefit.

The end?

Is this the end of the incremental compilation exploration? Is there nothing we can do about this? No! This is just the beginning. We have to look deeper into how things depend on each other. Maybe files aren't the incremental compilation unit we are looking for? Maybe we could introduce some small changes to the language to prevent transitive dependencies? Maybe the analysis can still be reused for methods that have a specified return type?

Like before, I didn't explore any of these options yet. Maybe I will!

compilers Article's
30 articles in total
Favicon
How to create simple tool for compile the Linux Kernel
Favicon
Unraveling Undefined Behavior: Performance Optimizations in Modern Compilers
Favicon
Video — Deep dive: Compiling deep learning models, from XLA to PyTorch 2
Favicon
The current state of Lithia after 2 years
Favicon
Verificando e Gerando Expressões
Favicon
Expressões encadeadas e agrupamento
Favicon
Improving Compiler Performance with Profile Guided Optimization
Favicon
Understanding Interpreters and Compilers in Programming
Favicon
Create Your Own Programming Language 9: Iteration
Favicon
Crafting Interpreters
Favicon
Create Your Own Programming Language 8: Conditionals
Favicon
Create Your Own Programming Language 7: More Types
Favicon
Create Your Own Programming Language 6: Functions
Favicon
Create Your Own Programming Language 4: Variables and Types
Favicon
Create Your Own Programming Language 3: Call Expressions
Favicon
How To Create Your Own Programming Language
Favicon
Create Your Own Programming Language 1: Numbers
Favicon
An alternative to "Distinguishing an Interpreter from a Compiler"
Favicon
On What Lexers Do
Favicon
Compilers Could Be Way More Fun
Favicon
Incremental compilation for Crystal - Part 3
Favicon
Incremental compilation for Crystal - Part 1
Favicon
Incremental compilation for Crystal - Part 2
Favicon
Crafting a Compiler in Rust: Lexical Analysis
Favicon
Crafting a Compiler in Rust: Introduction
Favicon
What is an ELF file?
Favicon
Static vs dynamic linking
Favicon
Speeding up ReScript compilation using interface files
Favicon
🕶 What it takes to build a Static Analysis tool
Favicon
A Compiler optimization area

Featured ones: