dev-resources.site
for different kinds of informations.
Rethinking our REST API: Building the Golden API
At some point, every company reaches a crossroads where they need to stop and reassess the tools they've been using. For us, that moment came when we realized that the API powering our web dashboard had become unmanageable, hard to test, and didn't meet the standards we set for our codebase.
Arcjet is primarily a security as code SDK to help developers implement security functionality like bot detection, email validation, and PII detection. This communicates with our high-performance, low-latency decision gRPC API.
Our web dashboard uses a separate REST API primarily to manage site connections and review processed requests analytics, however this also includes signing up new users and managing their account, which means it’s still an important part of the product.
Screenshot of the requests analysis page of the Arcjet dashboard.
So, we decided to take on the challenge of rebuilding our API from the ground up—this time with a focus on maintainability, performance, and scalability. However, we didn’t want to embark on a huge rewrite project - that never works out well - instead, we decided to build a new foundation and then start with a single API endpoint.
In this post, I’ll discuss how we approached this.
Previously on Arcjet
When speed was our top priority, Next.js provided the most convenient solution for building API endpoints that our frontend could consume. It gave us seamless full-stack development within a single codebase, and we didn’t have to worry too much about infrastructure because we deployed on Vercel.
Our focus was on our security as code SDK and low-latency decision API so for the frontend dashboard this stack allowed us to quickly prototype features with little friction.
Our Stack: Next.js, DrizzleORM, useSWR, NextAuth
However, as our product evolved, we found that combining all our API endpoints and frontend code in the same project led to a tangled mess.
Testing our API became cumbersome (and is very difficult to do with Next.js anyway), and we needed a system that could handle both internal and external consumption. As we integrated with more platforms (like Vercel, Fly.io, and Netlify), we realized that speed of development alone wasn't enough. We needed a more robust solution.
As part of this project we also wanted to address a lingering security concern about how Vercel requires you to expose your database publicly. Unless you pay for their enterprise “secure compute”, connecting to a remote database requires it to have a public endpoint. We prefer to lock down our database so it can only be accessed via a private network. Defense in depth is important and this would be another layer of protection.
This resulted in us deciding to uncouple the frontend UI from the backend API.
Introducing "The Golden API"
What is "The Golden API"? It’s not a specific technology or framework—it’s more of an ideal set of principles that define a well-built API. While developers may have their own preferences for languages and frameworks, there are certain concepts valid across tech stacks that most can agree on for building a high-quality API.
1. Performance and Scalability
We already have experience in delivering high-performance APIs .Our Decision API is deployed close to our customers, uses Kubernetes to scale dynamically, and is optimized for low-latency responses.
We considered serverless environments and other providers, but with our existing k8s clusters already operating it made most sense to reuse the infrastructure in place: deployments through Octopus Deploy, monitoring through Grafana + Jaeger, Loki, Prometheus, etc.
After a short internal Rust vs Go bake off, we chose Go for its simplicity, speed, and how well it achieves its original goals of excellent support for building scalable network services. We also already use it for the Decision API and understand how to operate Go APIs, which finalized the decision for us.
2. Comprehensive and Clear Documentation
Switching the backend API to Go was straightforward thanks to its simplicity and the availability of great tools. But there was one catch: we were keeping the Next.js frontend and didn’t want to manually write TypeScript types or maintain separate documentation for our new API.
Enter OpenAPI—a perfect fit for our needs. OpenAPI allows us to define a contract between the frontend and backend, while also serving as our documentation. This solves the problem of maintaining a schema for both sides of the app.
3. Security and Authentication
Integrating authentication in Go wasn’t too difficult, thanks to NextAuth being relatively straightforward to mimic on the backend. NextAuth (now Auth.js) has APIs available to verify a session.
This meant we could have a TypeScript client in the frontend generated from our OpenAPI spec making fetch calls to the backend API. The credentials are automatically included in the fetch call and the backend can verify the session with NextAuth.
4. Testability
Writing any kind of tests in Go is very easy and there are plenty of examples out there, covering the topic of testing HTTP handlers.
It’s also much easier to write tests for the new Go API endpoints compared to the Next.js API, particularly because we want to test authenticated state and real database calls. We were able to easily write tests for the Gin router and spin up real integration tests against our Postgres database using Testcontainers.
Putting It All Together
Writing the OpenAPI Spec
We started by writing the OpenAPI 3.0 specification for our API. An OpenAPI-first approach promotes designing the API contract before implementation, ensuring that all stakeholders (developers, product managers, and clients) agree on the API's behavior and structure before any code is written. It encourages careful planning and results in a well-thought-out API design that is consistent and adheres to established best practices. These are the reasons why we chose to write the spec first and generate the code from it, as opposed to the other way around.
My tool of choice for this wasAPI Fiddle, which helps you quickly draft and test OpenAPI specs. However, API Fiddle only supports OpenAPI 3.1 (which we couldn’t use because many libraries hadn’t adopted it), so we stuck with version 3.0 and wrote the spec by hand.
Here’s an example of what the spec for our API looked like:
openapi: 3.0.0
info:
title: Arcjet Sites API
description: A CRUD API to manage sites.
version: 1.0.0
servers:
- url: <https://api.arcjet.com/v1>
description: Base URL for all API operations
paths:
/teams/{teamId}/sites:
get:
operationId: GetTeamSites
summary: Get a list of sites for a team
description: Returns a list of all Sites associated with a given Team.
parameters:
- name: teamId
in: path
required: true
description: The ID of the team
schema:
type: string
responses:
"200":
description: A list of sites
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/Site"
default:
description: unexpected error
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
components:
schemas:
Site:
type: object
properties:
id:
type: string
description: The ID of the site
name:
type: string
description: The name of the site
teamId:
type: string
description: The ID of the team this site belongs to
createdAt:
type: string
format: date-time
description: The timestamp when the site was created
updatedAt:
type: string
format: date-time
description: The timestamp when the site was last updated
deletedAt:
type: string
format: date-time
nullable: true
description: The timestamp when the site was deleted (if applicable)
Error:
required:
- code
- message
- details
properties:
code:
type: integer
format: int32
description: Error code
message:
type: string
description: Error message
details:
type: string
description: Details that can help resolve the issue
With the OpenAPI spec in place, we used OAPI-codegen, a tool that automatically generates Go code from the OpenAPI specification. It generates all the necessary types, handlers, and error handling structures, making the development process much smoother.
//go:generate go run github.com/oapi-codegen/oapi-codegen/v2/cmd/oapi-codegen --config=config.yaml ../../api.yaml
The output was a set of Go files, one containing the server skeleton and another with the handler implementations. Here's an example of a Go type generated for the Site
object:
// Site defines model for Site.
type Site struct {
// CreatedAt The timestamp when the site was created
CreatedAt *time.Time `json:"createdAt,omitempty"`
// DeletedAt The timestamp when the site was deleted (if applicable)
DeletedAt *time.Time `json:"deletedAt"`
// Id The ID of the site
Id *string `json:"id,omitempty"`
// Name The name of the site
Name *string `json:"name,omitempty"`
// TeamId The ID of the team this site belongs to
TeamId *string `json:"teamId,omitempty"`
// UpdatedAt The timestamp when the site was last updated
UpdatedAt *time.Time `json:"updatedAt,omitempty"`
}
With the generated code in place, we were able to implement the API handler logic, like this:
func (s Server) GetTeamSites(w http.ResponseWriter, r *http.Request, teamId string) {
ctx := r.Context()
// Check user has permission to access team resources
isAllowed, err := s.userIsAllowed(ctx, teamId)
if err != nil {
slog.ErrorContext(
ctx,
"failed to check permissions",
slogext.Err("err", err),
slog.String("teamId", teamId),
)
SendInternalServerError(ctx, w)
return
}
if !isAllowed {
SendForbidden(ctx, w)
return
}
// Retrieve sites from database
sites, err := s.repo.GetSitesForTeam(ctx, teamId)
if err != nil {
slog.ErrorContext(
ctx,
"list sites for team query returned an error",
slogext.Err("err", err),
slog.String("teamId", teamId),
)
SendInternalServerError(ctx, w)
return
}
SendOk(ctx, w, sites)
}
The Database: Moving Away from DrizzleORM
Drizzle is a great ORM for JS projects and we’d use it again, but moving the database code out of Next.js meant that we needed something similar for Go.
We chose GORM as our ORM and used the Repository Pattern to abstract database interactions. This allowed us to write clean, testable database queries.
type ApiRepo interface {
GetSitesForTeam(ctx context.Context, teamId string) ([]Site, error)
}
type apiRepo struct {
db *gorm.DB
}
func (r apiRepo) GetSitesForTeam(ctx context.Context, teamId string) ([]Site, error) {
var sites []Site
result := r.db.WithContext(ctx).Where("team_id = ?", teamId).Find(&sites)
if result.Error != nil {
return nil, ErrorNotFound
}
return sites, nil
}
Testing. Everything
Testing is critical for us. We wanted to ensure that all database calls were properly tested, so we used Testcontainers to spin up a real database for our tests, closely mirroring our production setup.
type ApiRepoTestSuite struct {
suite.Suite
container *postgres.PostgresContainer
}
func (suite *ApiRepoTestSuite) SetupSuite() {
container, err := postgres.Run(suite.Context(), "postgres:16", ...)
require.NoError(suite.T(), err)
suite.container = container
}
After setting up the test environment, we tested all CRUD operations as we would in production, ensuring that our code behaves correctly.
func (suite *ApiRepoTestSuite) Test_CRUD() {
// Connect to the container DB
db, err := gorm.Open(gormpg.Open(suite.ConnectionString()), &gorm.Config{})
// Create the subject we're testing
testRepo, err := NewApiRepo(db)
ctx := context.Background()
// Create site
siteName := "Test Site"
teamId := "teamid"
site, err := testRepo.CreateSite(ctx, siteName, teamId)
require.NoError(suite.T(), err)
//...
// Update site
updatedSiteName := "Updated Test Site"
updatedSite, err := testRepo.UpdateSite(ctx, site.ID, updatedSiteName)
require.NoError(suite.T(), err)
//...
// Get site
getSite, err := testRepo.GetSiteById(ctx, updatedSite.ID)
require.NoError(suite.T(), err)
//...
// Delete site
deleteSite, err := testRepo.DeleteSite(ctx, updatedSite.ID)
require.NoError(suite.T(), err)
//...
// Get deleted site - should not return records
getDeletedSite, err := testRepo.GetSiteById(ctx, updatedSite.ID)
require.Error(suite.T(), err)
//...
}
For testing our API handlers, we used Go's httptest
package and mocked the database interactions using Mockery. This allowed us to focus on testing the API logic without worrying about database issues.
func (suite *ApiTestSuite) SetupTest() {
testRepo := repo.NewMockApiRepo(suite.T())
suite.testRepo = testRepo
testServer := NewServer(testRepo)
suite.testServer = testServer
}
func (suite *ApiTestSuite) Test_CreateSite() {
siteName := "Test Site Name"
teamId := "teamid"
// Setup
suite.testRepo.On("CreateSite", mock.Anything, siteName, teamId).
Return(&repo.Site{
...
}, nil)
// Where the response from the request will be written to
w := httptest.NewRecorder()
requestBody, err := json.Marshal(&NewSite{
Name: siteName,
})
require.NoError(suite.T(), err)
request := httptest.NewRequest(
http.MethodPost,
fmt.Sprintf("/teams/%s/sites", teamId),
strings.NewReader(string(requestBody)))
// Run
suite.testServer.CreateSite(w, request, teamId)
// Assert
body := w.Body.String()
require.Equal()
}
Frontend Consumption: OpenAPI-Driven Frontend
Once our API was tested and deployed, we turned our attention to the frontend.
Our previous API calls were made using Next.js’ recommended fetch
API with caching built in. For more dynamic views, some components used SWR on top of fetch
so we could get type-safe, auto-reloading data fetching calls.
To consume the API on the frontend, we used the openapi-typescript library, which generates TypeScript types based on our OpenAPI schema. This made it easy to integrate the backend with our frontend without having to manually sync data models. This has Tanstack Query built in, which uses fetch
under the hood, but also syncs to our schema.
We're gradually migrating the API endpoints over to the new Go server and making small improvements along the way. If you open your browser inspector you'll see those new requests going to api.arcjet.com
Browser inspector screenshot showing an API call to the new Go backend.
The Golden Checklist
So, did we achieve the elusive Golden API? Let’s check the box:
- ✅ Performance and Scalability – The API is deployed on our existing k8s clusters which are already tuned for performance. We have detailed metrics and can scale them out as needed.
- ✅ Comprehensive and Clear Documentation – The OpenAPI spec provides a single source of truth because code is generated from it rather than the other way around. Using generated clients means it's easy for our team to work with the APIs.
- ✅ Security and Authentication – We already deploy Go in production so we can copy over our security practices. Authenticated is handled by NextAuth.
- ✅ Testability – We implemented unit testing of the handlers and integration testing using Testcontainers, a significant improvement upon our Next.js APIs.
We went further with:
- ✅ Monitoring – Deploying to the existing k8s clusters meant we inherited the tracing, logging, and metrics capabilities already set up. Adding OpenTelemetry instrumentation to Gin was trivial.
- ✅ Simplicity – Go was originally designed for APIs, and it shows. Our code is much cleaner and more maintainable.
In the end, we’re happy with the results. Our API is faster, more secure, and better tested. The transition to Go was worth it and we’re now better positioned to scale and maintain our API as our product grows.
Featured ones: