Google Pub/Sub

Published 11/20/2024 · Author: samnash · Category: pubsub

Introduction: The Power of Event-Driven Architectures

Have you ever wondered how large-scale systems like Google, Netflix, or Spotify manage to process millions of real-time events, such as sending notifications, updating dashboards, or recommending personalized content, all without missing a beat? The secret lies in event-driven architectures, and at the heart of such systems is a powerful tool: Google Cloud Pub/Sub.

In a world where businesses demand seamless integration and instant processing of data, traditional point-to-point messaging systems often fall short. Enter Google Pub/Sub, a fully managed messaging service designed to decouple applications, handle massive data streams, and enable real-time communication with unparalleled reliability and scalability.

Why Pub/Sub is Essential for Modern Systems
Modern applications require more than just static workflows; they thrive on dynamic, event-driven processes that can adapt and respond to real-time data. Here are some key reasons why Google Pub/Sub has become indispensable:

  • Real-Time Processing: From IoT devices generating millions of sensor readings to financial systems detecting fraudulent transactions, Pub/Sub ensures these events are captured and processed in real time.

  • Scalability: Pub/Sub seamlessly scales to handle millions of messages per second, supporting the ever-growing demands of modern businesses.

  • Decoupling of Services: It simplifies architectures by allowing independent components to communicate asynchronously, improving maintainability and flexibility.

  • Global Reach: Built on Google's global infrastructure, Pub/Sub offers low-latency message delivery across regions.

  • Integration-Friendly: It integrates seamlessly with other Google Cloud services like BigQuery, Dataflow, Cloud Functions, and Cloud Storage, making it a cornerstone for building robust, interconnected systems.

Google Cloud Pub/Sub is a fully managed messaging service designed for real-time communication, enabling independent applications to send and receive messages seamlessly. It's a powerful tool for building scalable and reliable systems.

In this tutorial, we'll delve into the core concepts of Pub/Sub, explore practical examples using the command-line interface (CLI) and Python SDK, and discuss various use cases.

Key Concepts

  • Topic: A named resource to which messages are published.
  • Subscription: A named resource that receives messages from a specific topic.
  • Publisher: An application that sends messages to a topic.
  • Subscriber: An application that receives messages from a subscription.
  • Message: The core unit of communication in Pub/Sub, consisting of data (payload) and attributes (key-value metadata).
  • Acknowledgment (Ack): Subscribers must confirm receipt of a message by acknowledging it. Unacknowledged messages are redelivered to ensure reliable processing.
  • Push vs. Pull Subscriptions
    • Pull: Subscribers explicitly fetch messages.
    • Push: Pub/Sub pushes messages to an HTTP(S) endpoint.
  • Dead Letter Queue (DLQ): A separate topic for routing undeliverable messages after a defined number of delivery attempts. Useful for troubleshooting.
  • Retention Policy: Controls how long unacknowledged messages are retained (default: 7 days); acknowledged messages can optionally be retained as well.
  • Filters: Enable subscribers to receive only messages whose attributes match specific criteria (see the sketch after this list).
  • Ordering Keys: Ensures messages with the same ordering key are delivered in the order they were published.
  • Exactly Once Delivery: When enabled on a subscription, guarantees that a message is not redelivered once it has been successfully acknowledged.
  • Flow Control: Helps subscribers manage the rate of message delivery to prevent being overwhelmed (see the sketch in the Subscribing section).
  • IAM Permissions and Roles: Pub/Sub uses Google Cloud IAM to control access. Key roles:
    • Publisher: roles/pubsub.publisher
    • Subscriber: roles/pubsub.subscriber
    • Viewer: roles/pubsub.viewer
  • Message Deduplication: Prevents duplicate messages caused by retries or network issues.
  • Message Expiration (TTL): Automatically drops messages from a subscription after a specified time-to-live, reducing storage costs.
  • Backlog: The collection of unacknowledged messages in a subscription, enabling message replay if required.
  • Schema Registry: Defines structured message formats (e.g., Avro, Protocol Buffers), ensuring consistent data validation.
  • Batching: Groups multiple messages together for efficient publishing and delivery, improving performance.
  • Regional Endpoints: Publish messages to region-specific endpoints for compliance and reduced latency.
  • Cross-Project Topics and Subscriptions: Share Pub/Sub resources across projects for multi-project architectures.
  • Monitoring and Metrics: Integrated with Cloud Monitoring, providing insights like throughput, acknowledgment latency, and backlog size.
  • Snapshot: Captures the state of a subscription at a specific time, enabling message replay from that point.
  • Message Encryption: Messages are encrypted in transit and at rest by default, with the option to use Customer-Managed Encryption Keys (CMEK).
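
To make attributes and filters concrete, here is a minimal sketch using the Python SDK. The project ID, topic, and subscription names (your-project-id, events, error-events-sub) are placeholders; only messages whose severity attribute equals "ERROR" reach the filtered subscription:

```python
from google.cloud import pubsub_v1

project_id = "your-project-id"  # placeholder

publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()
topic_path = publisher.topic_path(project_id, "events")
subscription_path = subscriber.subscription_path(project_id, "error-events-sub")

publisher.create_topic(request={"name": topic_path})

# The filter is evaluated against message attributes, not the payload
subscriber.create_subscription(
    request={
        "name": subscription_path,
        "topic": topic_path,
        "filter": 'attributes.severity = "ERROR"',
    }
)

# Attributes are passed as keyword arguments; values must be strings
publisher.publish(topic_path, b"disk almost full", severity="ERROR")  # delivered
publisher.publish(topic_path, b"user logged in", severity="INFO")     # filtered out
```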

Getting Started

Prerequisites

  • Set Up Your Google Cloud Project:
    • Create a new Google Cloud project or select an existing one.
    • Enable the Pub/Sub API.
  • Install the Google Cloud SDK:
    • Download and install the SDK from the official website.
    • Authenticate your Google Cloud account:

```bash
# Using the CLI
gcloud auth login

# The Python SDK uses Application Default Credentials
gcloud auth application-default login
```

Creating a Topic and Subscription

```bash
# Create a topic using the CLI
gcloud pubsub topics create my-topic

# Create a subscription using the CLI
gcloud pubsub subscriptions create my-subscription --topic my-topic
```

```python
# Using the Python SDK (google-cloud-pubsub >= 2.0 request style)
from google.cloud import pubsub_v1

# Create a Publisher client and the topic
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-project-id", "my-topic")
publisher.create_topic(request={"name": topic_path})

# Create a Subscriber client and the subscription
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("your-project-id", "my-subscription")
subscriber.create_subscription(
    request={"name": subscription_path, "topic": topic_path}
)
```

Publishing Messages

```bash
# Using the CLI
gcloud pubsub topics publish my-topic \
  --message "Hello, Pub/Sub!"
```

```python
# Using the Python SDK
# publish() returns a future that resolves to the server-assigned message ID
message = "Hello, Pub/Sub!"
future = publisher.publish(topic_path, message.encode("utf-8"))
print(future.result())
```
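
The publisher client batches messages by default, which is what makes high-throughput publishing efficient (see Batching under Key Concepts). The thresholds can be tuned; a minimal sketch with illustrative values, not recommendations:

```python
from google.cloud import pubsub_v1

# A batch is sent as soon as any one of these thresholds is reached
batch_settings = pubsub_v1.types.BatchSettings(
    max_messages=100,       # 100 messages...
    max_bytes=1024 * 1024,  # ...or 1 MiB of data...
    max_latency=0.05,       # ...or 50 ms, whichever comes first
)
publisher = pubsub_v1.PublisherClient(batch_settings=batch_settings)
```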

Subscribing to Messages

```bash
# Using the CLI
gcloud pubsub subscriptions pull my-subscription --auto-ack
```

```python
# Using the Python SDK
import time

def callback(message):
    print(f"Received message: {message.data}")
    message.ack()

# subscribe() is non-blocking; it delivers messages on a background thread
subscriber.subscribe(subscription_path, callback=callback)

# Keep the main thread alive so the subscriber keeps receiving
while True:
    time.sleep(60)
```
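
If a subscriber cannot keep up, flow control (see Key Concepts) caps how much the client holds in memory at once; excess messages simply stay in the backlog. A sketch with illustrative limits:

```python
from google.cloud import pubsub_v1

# Pause delivery once either limit on outstanding (unacked) messages is hit
flow_control = pubsub_v1.types.FlowControl(
    max_messages=500,             # at most 500 outstanding messages
    max_bytes=100 * 1024 * 1024,  # or 100 MiB of outstanding data
)
subscriber.subscribe(subscription_path, callback=callback, flow_control=flow_control)
```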

Use Cases with Practical Examples

Triggers with Cloud Functions

Use Case: Automatically trigger a process whenever a new message is published to a Pub/Sub topic. For example, sending an email notification when a new user registers.

Solution: Use Cloud Functions to listen to Pub/Sub topics and handle messages.

Steps:

Prerequisites:

  • Enable the APIs run.googleapis.com, cloudbuild.googleapis.com, and eventarc.googleapis.com.
  • Grant the roles/logging.logWriter role to the default compute service account:

```bash
gcloud projects add-iam-policy-binding $GCP_PROJECT \
  --member="serviceAccount:<PROJECT_NUMBER>-compute@developer.gserviceaccount.com" \
  --role="roles/logging.logWriter"
```
  1. Create a Pub/Sub topic:

```bash
gcloud pubsub topics create user-registration
```


  2. Write a Cloud Function:

```python
import base64

def send_email_notification(event, context):
    # Pub/Sub delivers the message payload base64-encoded in event['data']
    message = base64.b64decode(event['data']).decode('utf-8')
    print(f"Sending email notification for: {message}")
```
    
  3. Deploy the Cloud Function:

```bash
gcloud functions deploy sendEmailNotification \
  --runtime python39 \
  --trigger-topic user-registration \
  --entry-point send_email_notification
```

  4. Publish a message to test:

```bash
gcloud pubsub topics publish user-registration --message "New User: John Doe"
```

Result: The Cloud Function logs the email notification process.
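
Before deploying, you can sanity-check the handler locally by hand-building the event envelope the Pub/Sub trigger delivers; this sketch assumes the function above is importable in the same session:

```python
import base64

# Simulate the Pub/Sub trigger payload: data arrives base64-encoded
event = {"data": base64.b64encode(b"New User: John Doe").decode("utf-8")}
send_email_notification(event, None)
# Prints: Sending email notification for: New User: John Doe
```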

Streaming Data to BigQuery

Use Case: Process and analyze large volumes of event data, such as IoT sensor readings or transaction logs, by streaming messages from Pub/Sub to BigQuery.

Solution: Use a pre-built Dataflow template to ingest data into BigQuery.

Steps:

  1. Create a Pub/Sub topic and subscription:

```bash
gcloud pubsub topics create sensor-data
gcloud pubsub subscriptions create sensor-data-sub --topic=sensor-data
```


  2. Enable the BigQuery API and create a BigQuery dataset and table using the schema below (schema.json):

```json
[
  {
    "name": "temperature",
    "type": "FLOAT",
    "mode": "NULLABLE"
  },
  {
    "name": "humidity",
    "type": "FLOAT",
    "mode": "NULLABLE"
  }
]
```

```bash
bq mk sensor_dataset
bq mk --table sensor_dataset.sensor_table schema.json
```


  3. Enable the Dataflow API and run the Dataflow job:

```bash
# Enable the Dataflow API (or enable it in the console:
# https://console.developers.google.com/apis/api/dataflow.googleapis.com/overview?project=your-project-id)
gcloud services enable dataflow.googleapis.com

# Run the Dataflow job from the Pub/Sub-to-BigQuery template
gcloud dataflow jobs run pubsub-to-bigquery \
  --gcs-location gs://dataflow-templates-us-central1/latest/PubSub_to_BigQuery \
  --region us-central1 \
  --parameters inputTopic=projects/your-project-id/topics/sensor-data,outputTableSpec=your-project-id:sensor_dataset.sensor_table
```


  4. Publish messages to the topic:

```bash
gcloud pubsub topics publish sensor-data --message '{"temperature": 25, "humidity": 60}'
```
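The same readings can be published from the Python SDK; a small sketch (the project ID is a placeholder, and the JSON keys must match the schema above):

```python
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("your-project-id", "sensor-data")

# Keys must match the BigQuery schema (temperature, humidity)
reading = {"temperature": 25.0, "humidity": 60.0}
future = publisher.publish(topic_path, json.dumps(reading).encode("utf-8"))
print(f"Published message {future.result()}")
```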

Result: The message data should appear in the BigQuery table for further analysis.
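
To confirm the pipeline end to end, you can query the table with the BigQuery client library; a sketch using the dataset and table names from the steps above:

```python
from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")
query = "SELECT temperature, humidity FROM sensor_dataset.sensor_table LIMIT 10"
for row in client.query(query).result():
    print(row.temperature, row.humidity)
```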


Logging Events with Sinks

Use Case: Centralize and process logs by routing Cloud Logging events to a Pub/Sub topic for downstream processing, like alerting or archiving.

Solution: Create a logging sink targeting a Pub/Sub topic.

Steps:

  1. Create a Pub/Sub topic:

```bash
gcloud pubsub topics create log-events
```

  2. Create a logging sink that targets the topic:

```bash
gcloud logging sinks create log-sink \
  pubsub.googleapis.com/projects/your-project-id/topics/log-events
```

  3. Grant the sink's writer identity (the service account printed when the sink is created) permission to publish:

```bash
gcloud pubsub topics add-iam-policy-binding log-events \
  --member=serviceAccount:service-account-email \
  --role=roles/pubsub.publisher
```
  4. Create a subscription and a subscriber to process the logs:

```python
import time
from google.cloud import pubsub_v1

# Assumes a subscription created with:
#   gcloud pubsub subscriptions create log-events-sub --topic=log-events
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path('your-project-id', 'log-events-sub')

def callback(message):
    print(f"Received log: {message.data.decode('utf-8')}")
    message.ack()

subscriber.subscribe(subscription_path, callback=callback)
print("Listening for log events...")

# Keep the process alive so the background subscriber keeps receiving
while True:
    time.sleep(60)
```

Result: Log messages flow into Pub/Sub and can be processed by your custom subscriber.

Streaming Messages to Cloud Storage

Use Case: Archive real-time data, such as application usage metrics or user activity logs, to Cloud Storage for long-term storage.

Solution: Use Dataflow to write Pub/Sub messages to a Cloud Storage bucket.

Steps:

  1. Create a Pub/Sub topic and subscription:

```bash
gcloud pubsub topics create app-metrics
gcloud pubsub subscriptions create app-metrics-sub --topic=app-metrics
```

  2. Create a Cloud Storage bucket:

```bash
gsutil mb gs://your-bucket-name
```

  3. Run the Dataflow job (using Google's classic Cloud_PubSub_to_GCS_Text template):

```bash
gcloud dataflow jobs run pubsub-to-gcs \
  --gcs-location gs://dataflow-templates-us-central1/latest/Cloud_PubSub_to_GCS_Text \
  --region us-central1 \
  --parameters inputTopic=projects/your-project-id/topics/app-metrics,outputDirectory=gs://your-bucket-name/data/,outputFilenamePrefix=metrics,outputFileSuffix=.json
```

  4. Publish messages:

```bash
gcloud pubsub topics publish app-metrics --message '{"event": "page_view", "user": "123"}'
```

Result: Messages are written as JSON files in the specified Cloud Storage bucket.
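
To verify the archive, you can list the objects Dataflow wrote with the Cloud Storage client library; a sketch using the bucket name and data/ prefix from the steps above:

```python
from google.cloud import storage

client = storage.Client(project="your-project-id")
for blob in client.list_blobs("your-bucket-name", prefix="data/"):
    print(blob.name, blob.size)
```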

Conclusion

Google Pub/Sub is a versatile tool for building scalable and reliable applications. By understanding the core concepts and leveraging the CLI and Python SDK, you can effectively utilize Pub/Sub to solve a variety of real-world problems.
