
πŸš€ V8 Engine Secrets: How We Slashed Memory Usage by 66% with TypedArrays

Published: 11/16/2024
Categories: node, jit, webperf, javascript
Author: asadk
When optimizing our DTA (Stata file format) parser and writer, we discovered several key techniques that dramatically improved performance:

1. Avoiding DataView for High-Performance Binary Operations

Original approach:

```typescript
function writeValue(view: DataView, offset: number, value: number): number {
    view.setFloat64(offset, value, true);
    return offset + 8;
}
```

Optimized approach using Uint8Array:

```typescript
// Create the shared buffer and its views once, outside the function.
// Note: Float64Array uses the host's byte order, so this assumes a
// little-endian platform (the DataView version wrote little-endian explicitly).
const sharedBuffer = new ArrayBuffer(8);
const sharedFloat64 = new Float64Array(sharedBuffer);
const sharedUint8 = new Uint8Array(sharedBuffer);

function writeValueWithUint8Array(buffer: Uint8Array, offset: number, value: number): number {
    // Write the value into the shared Float64Array...
    sharedFloat64[0] = value;

    // ...then copy its raw bytes into the target buffer.
    buffer.set(sharedUint8, offset);
    return offset + 8;
}
```

DataView operations are significantly slower because each getter/setter call performs bounds checking and explicit endianness handling.

Uint8Array reads and writes are faster because element access maps to direct memory operations that V8 can inline.
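The same shared-view trick works in the read direction. A minimal sketch (`readFloat64` is our illustrative name, not from the parser, and it assumes a little-endian host):

```typescript
// Reinterpret 8 raw bytes as a float64 through overlapping views on one
// scratch buffer, with no DataView call per read.
// Assumes a little-endian host (true on x86 and mainstream ARM).
const scratch = new ArrayBuffer(8);
const scratchF64 = new Float64Array(scratch);
const scratchU8 = new Uint8Array(scratch);

function readFloat64(buffer: Uint8Array, offset: number): number {
  // Copy the 8 source bytes into the scratch buffer...
  scratchU8.set(buffer.subarray(offset, offset + 8));
  // ...and read them back through the aliasing Float64Array view.
  return scratchF64[0];
}
```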


2. Pre-computing Common Patterns

Rather than computing missing value patterns on demand, we pre-compute them once:

Original approach:

```typescript
const MISSING_VALUES = {
  // ...
  BYTE_MISSING: 101,
  // ...
};

function writeMissingValue(
  view: DataView,
  offset: number,
  type: ColumnType,
  littleEndian: boolean
): number {
  switch (type) {
    case ColumnType.BYTE:
      view.setInt8(offset, MISSING_VALUES.BYTE_MISSING);
      return offset + 1;
    // ... remaining cases
  }
}
```

Optimized approach:

```typescript
const MISSING_PATTERNS = {
  BYTE: new Uint8Array([MISSING_VALUES.BYTE_MISSING]),
  FLOAT_NAN: (() => {
    const buf = new ArrayBuffer(4);
    new DataView(buf).setUint32(0, 0x7fc00000, true); // quiet-NaN bit pattern
    return new Uint8Array(buf);
  })(),
  DOUBLE_NAN: (() => {
    const buf = new ArrayBuffer(8);
    const view = new DataView(buf);
    view.setUint32(0, 0, true);
    view.setUint32(4, 0x7ff80000, true); // quiet-NaN bit pattern (high word)
    return new Uint8Array(buf);
  })(),
};
```


This optimization:

  • Eliminates repeated buffer allocations and bit manipulations in hot paths
  • Provides immediate access to commonly used patterns
  • Reduces cognitive load by centralizing binary pattern definitions
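To see the pay-off in the hot path, here is a minimal sketch of `writeMissingValue` rewritten around pre-computed patterns. The enum and pattern table are abbreviated, and keying the table by `ColumnType` (rather than by name) is our simplification:

```typescript
// Abbreviated column types for illustration.
enum ColumnType { BYTE, FLOAT, DOUBLE }

// Pre-computed byte patterns, built once at module load.
const MISSING_PATTERNS: Record<ColumnType, Uint8Array> = {
  [ColumnType.BYTE]: new Uint8Array([101]), // BYTE_MISSING
  [ColumnType.FLOAT]: (() => {
    const buf = new ArrayBuffer(4);
    new DataView(buf).setUint32(0, 0x7fc00000, true); // float32 quiet NaN
    return new Uint8Array(buf);
  })(),
  [ColumnType.DOUBLE]: (() => {
    const buf = new ArrayBuffer(8);
    const view = new DataView(buf);
    view.setUint32(0, 0, true);
    view.setUint32(4, 0x7ff80000, true); // float64 quiet NaN
    return new Uint8Array(buf);
  })(),
};

// The hot path is now a single memcpy-style `set`, with no per-call
// switch on DataView setters and no bit manipulation.
function writeMissingValue(buffer: Uint8Array, offset: number, type: ColumnType): number {
  const pattern = MISSING_PATTERNS[type];
  buffer.set(pattern, offset);
  return offset + pattern.length;
}
```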

3. Loop Optimization for V8's JIT Compiler

Understanding V8's optimization patterns led us to prefer simple for-loops over higher-order array methods:

```typescript
// Before: creates a closure and temporary arrays
const formats = Array(nvar)
  .fill(null)
  .map(() => ({
    type: ColumnType.DOUBLE,
    maxDecimals: 0,
  }));

// After: a simple, predictable loop that V8 can optimize
const formats = new Array(nvar);
for (let i = 0; i < nvar; i++) {
  formats[i] = {
    type: ColumnType.DOUBLE,
    maxDecimals: 0,
  };
}
```


V8's JIT compiler can better optimize simple counting loops because:

  • the iteration pattern is predictable
  • there is no closure creation or per-element function-call overhead
  • the memory-allocation pattern is more straightforward
  • linear code execution improves instruction caching

4. Shared Buffer Strategy for Maximum Efficiency

One of our most impactful optimizations was implementing a shared buffer strategy.
Instead of allocating new buffers for each operation, we maintain a single pre-allocated buffer for temporary operations:


```typescript
// Pre-allocate one shared buffer at module level and expose typed views
// over it for each numeric type. All views alias the same memory.
const SHARED_BUFFER_SIZE = 1024 * 1024; // 1 MB shared buffer
const sharedBuffer = new ArrayBuffer(SHARED_BUFFER_SIZE);

const tempBuffers = {
  float32: new Float32Array(sharedBuffer),
  float64: new Float64Array(sharedBuffer),
  uint8: new Uint8Array(sharedBuffer),
  dataView: new DataView(sharedBuffer),
};
```

This approach provides several key advantages:

  • Eliminates thousands of small buffer allocations that would otherwise trigger garbage collection
  • Improves CPU cache utilization by reusing the same memory locations
  • Reduces memory fragmentation in long-running processes
  • Provides specialized views for different numeric types without additional allocations
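As a concrete example, a whole row of doubles can be staged in the shared buffer and flushed with a single copy. This is a minimal sketch (`writeDoubleRow` is our illustrative name; it assumes the row fits in the 1 MB buffer and a little-endian host):

```typescript
// Module-level shared buffer, as in the snippet above.
const SHARED_BUFFER_SIZE = 1024 * 1024; // 1 MB
const sharedBuffer = new ArrayBuffer(SHARED_BUFFER_SIZE);
const tempBuffers = {
  float64: new Float64Array(sharedBuffer),
  uint8: new Uint8Array(sharedBuffer),
};

function writeDoubleRow(out: Uint8Array, offset: number, row: number[]): number {
  // Stage the values in the shared Float64Array (no allocation)...
  for (let i = 0; i < row.length; i++) {
    tempBuffers.float64[i] = row[i];
  }
  // ...then copy the raw bytes out in one call.
  const byteLength = row.length * 8;
  out.set(tempBuffers.uint8.subarray(0, byteLength), offset);
  return offset + byteLength;
}
```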

Key Improvements πŸ“ˆ

Large Files (Best Results)

  • ⚑ ~43% faster conversion time (from 3939ms to 2231ms)
  • πŸ’Ύ 34% reduction in peak heap usage (13189MB to 8682MB)
  • πŸ”„ 52% increase in rows/second processing (5935 to 9030 rows/sec)

Medium Files

  • ⚑ ~41% faster conversion time (260ms to 154ms)
  • πŸ’Ύ 33% reduction in peak memory usage (1000MB to 673MB)
  • πŸ”„ 46% boost in rows/second processing (7139 to 10453 rows/sec)

Small Files

  • ⚑ ~44% faster conversion time (16.47ms to 9.14ms)
  • πŸ’Ύ 28% reduction in peak heap usage (85MB to 60MB)
  • πŸ”„ 42% increase in rows/second processing (8351 to 11930 rows/sec)
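The post doesn't show the measurement harness; a minimal sketch of how timing and rows/sec figures like these can be collected (`benchmark` is our illustrative helper, and peak-heap numbers would additionally sample `process.memoryUsage().heapUsed` in Node.js):

```typescript
// Time a conversion function and derive a rows/sec rate from wall time.
// Only wall-clock timing is shown; heap sampling is omitted for brevity.
function benchmark(rows: number, fn: () => void): { elapsedMs: number; rowsPerSec: number } {
  const start = performance.now();
  fn();
  const elapsedMs = performance.now() - start;
  return { elapsedMs, rowsPerSec: rows / (elapsedMs / 1000) };
}
```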

Conclusion

The key learning was that understanding V8's optimization strategies and leveraging typed arrays with shared buffers can dramatically improve performance when processing binary data. While some optimizations made the code slightly more complex, the performance benefits justified the trade-offs for our high-throughput use case.

Remember: reserve optimizations for data-heavy critical paths; everywhere else, favor clean, maintainable code.


πŸ™Œ Thank You, Contributors!

These optimizations were a team effort. A big shoutout to everyone who helped level up our code.


πŸ§‘β€πŸ’» Maintainer

β€’ GitHub
β€’ Website
β€’ Twitter

