The Wasm Component Model and idiomatic codegen
Arcjet bundles WebAssembly with our security as code SDK. This helps developers implement common security functionality like PII detection and bot detection directly in their code. Much of the logic is embedded in Wasm, which gives us a secure sandbox with near-native performance and is part of our philosophy around local-first security.
The ability to run the same code across platforms is also helpful as we build out support from JavaScript to other tech stacks, but it requires an important abstraction to translate between languages (our Wasm is compiled from Rust).
The WebAssembly Component Model is the construct that enables this, but a construct is only as good as the implementations and tooling surrounding it. For the Component Model, this is most evident in the code generation for Hosts (environments that execute WebAssembly components) and Guests (WebAssembly modules written in any language and compiled to the Component Model; Rust in our case).
The Component Model defines a language for communication between Hosts and Guests which is primarily composed of types, functions, imports and exports. It tries to define a broad language, but some types, such as variants, tuples, and resources, might not exist in a given general purpose programming language.
When a tool tries to generate code for one of these languages, the authors often need to get creative to map Component Model types to that general purpose language. For example, we use jco for generating JS bindings and this implements variants using a JavaScript object in the shape of { tag: string, value: string }. It even has a special case for the result<_, _> type where the error variant is turned into an Error and thrown.
This post explores how the Wasm Component Model enables cross-language integrations, the complexities of code generation for Hosts and Guests, and the trade-offs we make to achieve idiomatic code in languages like Go.
Host code generation for Go
At Arcjet, we have had to build a tool to generate code for Hosts written in the Go programming language. Although our SDK attempts to analyze everything locally, that is not always possible and so we have an API written in Go which augments local decisions with additional metadata.
Go has a very minimal syntax and type system by design. The language did not have generics until Go 1.18, and its generics still have significant limitations. This makes codegen from the Component Model to Go complex in various ways.
For example, we could generate a result<_, _> as:
type Result[V any] struct {
	value V
	err   error
}
However, this limits the type that can be provided in the error position. So we’d need to codegen it as:
type Result[V any, E any] struct {
	value V
	err   E
}
This works but becomes cumbersome to use with other idiomatic Go, which often uses the val, err := doSomething() convention to indicate the same semantics as the Result type we’ve defined above.
Additionally, constructing this Result is cumbersome: Result[int, string]{value: 1, err: ""}. Instead of providing the Result type, we probably want to match idiomatic patterns so that consuming our generated bindings feels natural to Go users.
Idiomatic vs Direct Mapping
Code can be generated to feel more natural to the language or it can be a more direct mapping to the Component Model types. Neither option fits 100% of use cases so it is up to the tool authors to decide which makes the most sense.
For the Arcjet tooling, we chose the idiomatic Go approach for option<_> and result<_, _> types, which map to val, ok := doSomething() and val, err := doSomething() respectively. For variants, we create an interface that each variant needs to implement, such as:
type BotConfig interface {
	isBotConfig()
}
func (AllowedBotConfig) isBotConfig() {}
func (DeniedBotConfig) isBotConfig() {}
This strikes a good balance between type safety and unnecessary wrapping. Of course, there are situations where the wrapping is required, but those can be handled as edge cases.
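As an illustration of how these mappings are consumed, the sketch below branches on a variant with a type switch and reads an option-style return with the val, ok convention. The struct fields, Describe, and FirstEntity are hypothetical and exist only for this example; they are not the shapes our generator emits.

```go
package main

import "fmt"

// BotConfig mirrors the sealed interface from the text. The fields on the
// two variants below are hypothetical and exist only for illustration.
type BotConfig interface {
	isBotConfig()
}

type AllowedBotConfig struct {
	Entities []string
}

type DeniedBotConfig struct {
	Entities []string
}

func (AllowedBotConfig) isBotConfig() {}
func (DeniedBotConfig) isBotConfig()  {}

// Describe branches on the concrete variant with a type switch.
func Describe(c BotConfig) string {
	switch v := c.(type) {
	case AllowedBotConfig:
		return fmt.Sprintf("allow %d entities", len(v.Entities))
	case DeniedBotConfig:
		return fmt.Sprintf("deny %d entities", len(v.Entities))
	default:
		// Reached for nil, or for a variant added later without
		// updating this switch.
		return "unknown variant"
	}
}

// FirstEntity shows an option<string> mapped to the (val, ok) convention.
func FirstEntity(c AllowedBotConfig) (string, bool) {
	if len(c.Entities) == 0 {
		return "", false
	}
	return c.Entities[0], true
}

func main() {
	cfg := AllowedBotConfig{Entities: []string{"GOOGLE_CRAWLER"}}
	fmt.Println(Describe(cfg))
	if name, ok := FirstEntity(cfg); ok {
		fmt.Println("first entity:", name)
	}
}
```

Because isBotConfig is unexported, only types in the same package can satisfy the interface, which keeps the set of variants closed. The compiler does not check that a type switch is exhaustive, though, so a default branch is still worth having.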
Developers may struggle with non-idiomatic patterns, leading to verbose, less maintainable code. Using established conventions makes the code feel more familiar, but does require some additional effort to implement.
We decided to take the idiomatic path to minimize friction, so that our team knows what to expect when moving around the codebase.
Calling conventions
One of the biggest decisions tooling authors need to make is the calling convention of the bindings. This includes deciding how and when imports are compiled, whether the Wasm module is compiled during setup or during instantiation, and how cleanup is handled.
In the Arcjet codebase, we chose the factory/instance pattern to optimize performance. Compiling a WebAssembly module is expensive, so we do it once in the NewBotFactory() constructor. Subsequent Instantiate() calls are then fast and cheap, allowing for high throughput in production workloads.
func NewBotFactory(
	ctx context.Context,
) (*BotFactory, error) {
	runtime := wazero.NewRuntime(ctx)
	// ... Imports are compiled here if there are any
	// Compiling the module takes a LONG time, so we want to do it once and hold
	// onto it with the Runtime
	module, err := runtime.CompileModule(ctx, wasmFileBot)
	if err != nil {
		return nil, err
	}
	return &BotFactory{runtime, module}, nil
}
Consumers construct this BotFactory once by calling NewBotFactory(ctx) and use it to create multiple instances via the Instantiate method.
func (f *BotFactory) Instantiate(ctx context.Context) (*BotInstance, error) {
	module, err := f.runtime.InstantiateModule(ctx, f.module, wazero.NewModuleConfig())
	if err != nil {
		return nil, err
	}
	return &BotInstance{module}, nil
}
Instantiation is very fast if the module has already been compiled, like we do with runtime.CompileModule() when constructing the factory.
The BotInstance has functions which were exported from the Component Model definition.
func (i *BotInstance) Detect(
	ctx context.Context,
	request string,
	options BotConfig,
) (BotResult, error) {
	// ... Lots of generated code for binding to Wazero
}
Generally, after using a BotInstance, we want to clean it up to ensure we’re not leaking memory. For this we provide the Close function.
func (i *BotInstance) Close(ctx context.Context) error {
	// Closing the module releases the resources held by this instance
	return i.module.Close(ctx)
}
If you want to clean up the entire BotFactory, that can be closed too:
func (f *BotFactory) Close(ctx context.Context) {
	f.runtime.Close(ctx)
}
We can put all these APIs together to call functions on this WebAssembly module:
ctx := context.Background()
factory, err := NewBotFactory(ctx)
if err != nil {
	panic(err)
}
defer factory.Close(ctx)
instance, err := factory.Instantiate(ctx)
if err != nil {
	panic(err)
}
defer instance.Close(ctx)
result, err := instance.Detect(
	ctx,
	request,
	AllowedBotConfig{
		Entities:         []BotEntity{"GOOGLE_CRAWLER"},
		SkipCustomDetect: true,
	},
)
if err != nil {
	panic(err)
}
fmt.Printf("%+v", result)
This pattern of factory and instance construction takes more code to use, but front-loading the compilation cost keeps request handling as efficient as possible in the hot paths of the Arcjet service, where latency matters most. The trade-off adds some complexity to initialization code, but it pays off with substantially lower overhead per request.
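The shape of this trade-off can be sketched without a Wasm runtime. In the example below, every type and function is a simplified stand-in invented for illustration (no wazero calls are made), and compileCount is a hypothetical counter that records how often the simulated expensive step runs. The package-level counter is not safe for concurrent use; it is only there to make the behavior visible.

```go
package main

import "fmt"

// compiledModule stands in for the output of an expensive compile step.
type compiledModule struct{ name string }

// Factory and Instance are simplified stand-ins for BotFactory and BotInstance.
type Factory struct{ module *compiledModule }

type Instance struct{ module *compiledModule }

// compileCount records how often the simulated compile step runs.
var compileCount int

// compile simulates the costly work that should happen only once.
func compile(name string) *compiledModule {
	compileCount++
	return &compiledModule{name: name}
}

// NewFactory pays the compilation cost up front.
func NewFactory() *Factory {
	return &Factory{module: compile("bot")}
}

// Instantiate is cheap: it only wraps the already-compiled module.
func (f *Factory) Instantiate() *Instance {
	return &Instance{module: f.module}
}

// Detect is a placeholder for an exported function on the instance.
func (i *Instance) Detect(request string) string {
	return i.module.name + " checked " + request
}

func main() {
	factory := NewFactory()
	for _, req := range []string{"GET /", "GET /login", "POST /signup"} {
		instance := factory.Instantiate() // one instance per request
		fmt.Println(instance.Detect(req))
	}
	fmt.Println("compilations:", compileCount)
}
```

Three requests are served by three instances, yet the simulated compile step runs only once, which is the property the factory/instance split is meant to guarantee.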
Trade-offs
Integrating two or more languages is always fraught with trade-offs, whether using native FFI or the Component Model.
This post discussed a few of the challenges we’ve encountered at Arcjet and the reasoning behind our decisions. If we all build on the same set of primitives, such as the Component Model and WIT, we can all leverage the same high-quality tools, such as wit-bindgen or wit-component, and build tooling to suit every use case. This is why working towards standards helps everyone.
The WebAssembly Component Model offers a powerful abstraction for cross-language integration, but translating its types into languages like Go introduces subtle design challenges. By choosing idiomatic patterns and selectively optimizing for performance - such as using a factory/instance pattern - we can provide a natural developer experience while maintaining efficiency.
As tooling around the Component Model evolves, we can look forward to more refined codegen approaches that further simplify these integrations.