dev-resources.site
How Pinterest uses Kafka for Long-Term Data Storage
Published at
1/15/2025
Categories
programming
devops
career
learning
Author
the_infinity
I spent hours diving into this so you don’t have to!
Here is what I learned:
- Pinterest doesn't store all data on Kafka brokers forever.
- Older data is moved to remote storage such as Amazon S3.
- They built a tool called Segment Uploader to automate this process.
- The Segment Uploader periodically transfers older data from Kafka brokers to remote storage.
- Segment Uploader runs as a sidecar alongside the Kafka broker.
- They also developed a specialized Consumer Library to fetch data intelligently.
- The library fetches old data directly from remote storage and new data from Kafka brokers.
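To make the flow above concrete, here is a minimal sketch of the idea, not Pinterest's actual code. Every name in it (`Segment`, `TieredLog`, `upload_old_segments`, `read`) is hypothetical, and plain in-memory lists stand in for the broker's disk and for S3. The real Segment Uploader runs periodically as a sidecar; here a single method call plays that role. The order in which the reader checks the two tiers is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A contiguous run of records, like one Kafka log segment (hypothetical model)."""
    base_offset: int      # offset of the first record in this segment
    records: list[str]    # record payloads, in offset order

    @property
    def end_offset(self) -> int:
        # Exclusive upper bound of the offsets held by this segment.
        return self.base_offset + len(self.records)


@dataclass
class TieredLog:
    """Toy model of one partition split across broker and remote storage."""
    broker: list[Segment] = field(default_factory=list)  # stands in for broker disk
    remote: list[Segment] = field(default_factory=list)  # stands in for S3

    def upload_old_segments(self, keep_on_broker: int) -> int:
        """Move all but the newest `keep_on_broker` segments to remote storage.

        Plays the role of the uploader sidecar; returns how many segments moved.
        """
        cutoff = max(len(self.broker) - keep_on_broker, 0)
        moved, self.broker = self.broker[:cutoff], self.broker[cutoff:]
        self.remote.extend(moved)
        return len(moved)

    def read(self, offset: int) -> tuple[str, str]:
        """Return (source, record), checking the broker first, then remote storage.

        Plays the role of the consumer library: new data comes from the broker,
        old data comes from remote storage.
        """
        for source, segments in (("broker", self.broker), ("remote", self.remote)):
            for seg in segments:
                if seg.base_offset <= offset < seg.end_offset:
                    return source, seg.records[offset - seg.base_offset]
        raise KeyError(f"offset {offset} not found in either tier")


log = TieredLog(broker=[
    Segment(0, ["a", "b"]),
    Segment(2, ["c", "d"]),
    Segment(4, ["e"]),
])
log.upload_old_segments(keep_on_broker=1)  # the two oldest segments move to remote
print(log.read(1))  # ('remote', 'b')
print(log.read(4))  # ('broker', 'e')
```

The point of the sketch is that the caller of `read` never has to know where a record lives. That is the job the Consumer Library does for Pinterest's applications.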
By combining Kafka’s real-time capabilities with cost-efficient remote storage, Pinterest ensures scalability, reliability, and efficient long-term data management.
PS: I recently published an article in my free newsletter covering this case study in depth, with visuals: https://designsystemsweekly.substack.com/p/how-pinterest-leverages-kafka-for