
Event Sourcing with Memphis.dev: A Beginner’s Guide

Idan Asulin · December 09, 2023 · 6 min read

Introduction

In the realm of modern software development, managing and maintaining data integrity is paramount. Traditional approaches often involve updating the state of an application directly within a database. However, as systems grow in complexity, ensuring data consistency and traceability becomes more challenging. This is where Event Sourcing, coupled with a powerful distributed streaming platform like Memphis.dev, emerges as a robust solution and a natural way to structure a system’s data.

What is Event Sourcing?

At its core, Event Sourcing is a design pattern that captures every change or event that occurs in a system as an immutable and sequentially ordered log of events. Instead of persisting the current state of an entity, Event Sourcing stores a sequence of state-changing events. These events serve as a single source of truth for the system’s state at any given point in time.

Understanding Event Sourcing in Action

Imagine a banking application that tracks an account’s balance. Instead of only storing the current balance in a database, Event Sourcing captures all the events that alter the balance. Deposits, withdrawals, or any adjustments are recorded as individual events in chronological order.

Let’s break down how this might work:

  1. Event Creation: When a deposit of $100 occurs in the account, an event such as FundsDeposited is created with relevant metadata (timestamp, amount, account number).
  2. Event Storage: These events are then appended to an immutable log, forming a sequential history of transactions specific to that account.
  3. State Reconstruction: To obtain the current state of an account, the application replays these events sequentially, applying each one in order to derive the current balance. This lets the system rebuild state as of any point in time (see the sketch below).
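
Here is a minimal sketch of step 3 in plain Python: the balance is never stored directly, it is derived by replaying the event history in order. The event names and fields follow the illustrative example above.

def rebuild_balance(events):
    # Fold over the event history in order to derive the current state
    balance = 0
    for event in events:
        if event["type"] == "FundsDeposited":
            balance += event["amount"]
        elif event["type"] == "FundsWithdrawn":
            balance -= event["amount"]
    return balance

history = [
    {"type": "FundsDeposited", "amount": 100, "account": "12345", "timestamp": "2023-12-01T10:00:00Z"},
    {"type": "FundsWithdrawn", "amount": 30, "account": "12345", "timestamp": "2023-12-02T09:30:00Z"},
]
print(rebuild_balance(history))  # 70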

Leveraging Memphis in Event Sourcing

Memphis.dev, an open-source distributed event streaming platform, is a natural fit for implementing Event Sourcing thanks to the following features:

  1. Scalability and Fault Tolerance: Memphis’ distributed nature allows for horizontal scalability and ensures fault tolerance by replicating data across multiple brokers (nodes).
  2. Ordered and Immutable Event Logs: Memphis’ log-based architecture aligns seamlessly with the principles of Event Sourcing. It maintains ordered, immutable logs, preserving the sequence of events.
  3. Real-time Event Processing: Memphis Functions offers a serverless framework built into the Memphis platform for handling high-throughput, real-time event streams. Applications can process events as they occur, enabling near real-time reactions to changes in state.
  4. Managing Schemas: One of the major challenges in event sourcing is maintaining schemas across different events to avoid upstream breaks and client crashes. Memphis addresses this with built-in schema management (Schemaverse), which lets you attach a schema to a station and enforce it on every produced event, as sketched below.
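
Here is a hedged sketch of attaching a schema to a station with the Memphis Python SDK. It assumes the SDK’s enforce_schema helper and a schema named account-events-schema that was already created (for example, through the Memphis UI); the host and credentials are placeholders.

import asyncio
from memphis import Memphis

async def main():
    memphis = Memphis()
    try:
        await memphis.connect(host="my.memphis.dev",
                              username="<application type username>",
                              password="<password>")
        # Enforce a previously created schema on the station so that
        # non-conforming events are rejected at produce time
        await memphis.enforce_schema(name="account-events-schema",
                                     station_name="account-events")
    finally:
        await memphis.close()

asyncio.run(main())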

Benefits of Event Sourcing with Memphis

  • Temporal Queries and Auditing: By retaining a complete history of events, it becomes possible to perform temporal queries and reconstruct past states, aiding in auditing and compliance.
  • Flexibility and Scalability: As the system grows, Event Sourcing with Memphis allows for easy scalability, as new consumers can independently process the event log.
  • Fault Tolerance and Recovery: In the event of failures, the ability to rebuild state from events ensures resiliency and quick recovery.

Let’s see what it looks like in code

Events occur and are pushed, in their order of creation, into a Memphis station (Memphis’ equivalent of a topic).

Event Log:

# A simple in-memory, append-only event log
class EventLog:
    def __init__(self):
        self.events = []

    def append_event(self, event):
        self.events.append(event)

    def get_events(self):
        return self.events

Memphis Producer:

from __future__ import annotations
import json
from memphis import Memphis, MemphisError, MemphisConnectError, MemphisHeaderError, MemphisSchemaError

class MemphisEventProducer:
    def __init__(self, host="my.memphis.dev"):
        self.host = host
        self.memphis = Memphis()

    async def connect(self):
        # Connect with an application-type user created in Memphis
        await self.memphis.connect(host=self.host,
                                   username="<application type username>",
                                   password="<password>")

    async def send_event(self, topic, event):
        try:
            # Serialize the event and append it to the station; nonblocking=False
            # waits for the broker's ack, preserving strict ordering
            await self.memphis.produce(station_name=topic,
                                       producer_name='prod_py',
                                       message=bytearray(json.dumps(event), 'utf-8'),
                                       nonblocking=False)
        except (MemphisError, MemphisConnectError, MemphisHeaderError, MemphisSchemaError) as e:
            print(e)

    async def close(self):
        await self.memphis.close()

Usage:

import asyncio

async def main():
    # Initialize Event Log
    event_log = EventLog()

    # Initialize Memphis Producer and connect
    producer = MemphisEventProducer()
    await producer.connect()

    # Append events to the event log and produce them to Memphis
    events_to_publish = [
        {"type": "Deposit", "amount": 100},
        {"type": "Withdrawal", "amount": 50},
        # Add more events as needed
    ]

    for event in events_to_publish:
        event_log.append_event(event)
        await producer.send_event('account-events', event)

    await producer.close()

asyncio.run(main())
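
Consuming works the same way in reverse. The sketch below, which follows the Memphis Python SDK’s consumer() and consume() API, replays events from the station to rebuild the account balance; the consumer group name and the sleep-based lifetime are illustrative assumptions.

import asyncio
import json
from memphis import Memphis, MemphisError, MemphisConnectError

async def main():
    memphis = Memphis()
    try:
        await memphis.connect(host="my.memphis.dev",
                              username="<application type username>",
                              password="<password>")
        consumer = await memphis.consumer(station_name="account-events",
                                          consumer_name="balance-projector",
                                          consumer_group="projections")
        balance = 0

        async def msg_handler(msgs, error, context):
            nonlocal balance
            if error:
                print(error)
                return
            for msg in msgs:
                event = json.loads(msg.get_data())
                # Apply each event in arrival order to rebuild the state
                if event["type"] == "Deposit":
                    balance += event["amount"]
                elif event["type"] == "Withdrawal":
                    balance -= event["amount"]
                await msg.ack()
            print("Current balance:", balance)

        consumer.consume(msg_handler)
        # Keep the event loop alive while the handler processes incoming events
        await asyncio.sleep(10)
    except (MemphisError, MemphisConnectError) as e:
        print(e)
    finally:
        await memphis.close()

asyncio.run(main())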


Criteria for choosing the right event streaming platform for the job

When implementing Event Sourcing with a message broker, several key features are crucial for a streamlined and efficient system:

1. Persistent Message Storage:

  • Durability: Messages should be reliably stored even in the event of failures. This ensures that no events are lost and the event log remains intact.

2. Ordered and Immutable Event Logs:

  • Sequential Order: Preserving the order of events is critical for accurate state reconstruction. Events must be processed in the same sequence they were produced.
  • Immutability: Once an event is stored, it should not be altered. This guarantees the integrity and consistency of the event log.

3. Scalability and Performance:

  • Horizontal Scalability: The message broker should support horizontal scaling to accommodate increased event volume without sacrificing performance.
  • Low Latency: Minimizing message delivery time ensures near real-time processing of events, enabling quick reactions to state changes.

4. Fault Tolerance and High Availability:

  • Redundancy: Ensuring data redundancy across multiple nodes or partitions prevents data loss in the event of node failures.
  • High Availability: Continuous availability of the message broker is essential to maintain system functionality.

5. Consumer Flexibility and State Rebuilding:

  • Consumer Groups: Support for consumer groups allows multiple consumers to independently process the same set of events, aiding in parallel processing and scalability.
  • State Rebuilding: The broker should facilitate easy rebuilding of the application state by replaying events, enabling historical data retrieval.

6. Retention Policies and Archiving:

  • Retention Policies: Configurable retention policies allow managing the duration or size of stored messages, ensuring efficient storage management (see the sketch below).
  • Archiving: Ability to archive or offload older events to long-term storage for compliance or historical analysis purposes.
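
As an example of such a policy, here is a hedged sketch of creating a station with a seven-day, disk-backed retention window using the Memphis Python SDK’s station helper; the host, credentials, and retention value are illustrative assumptions.

import asyncio
from memphis import Memphis
from memphis.types import Retention, Storage

async def main():
    memphis = Memphis()
    try:
        await memphis.connect(host="my.memphis.dev",
                              username="<application type username>",
                              password="<password>")
        # Keep events on disk for seven days; tune to your audit requirements
        await memphis.station(name="account-events",
                              retention_type=Retention.MAX_MESSAGE_AGE_SECONDS,
                              retention_value=7 * 24 * 60 * 60,
                              storage_type=Storage.DISK)
    finally:
        await memphis.close()

asyncio.run(main())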

7. Monitoring and Management:

  • Metrics and Monitoring: Providing insights into message throughput, latency, and system health helps in monitoring and optimizing system performance.
  • Admin Tools: Easy-to-use administrative tools for managing topics, partitions, and configurations streamline system management.

8. Security and Compliance:

  • Encryption and Authentication: Support for encryption and authentication mechanisms ensures the confidentiality and integrity of transmitted events.
  • Compliance Standards: Adherence to compliance standards (such as GDPR, SOC2) ensures that sensitive data is handled appropriately.

9. Seamless Integration and Ecosystem Support:

  • Compatibility and Integrations: Seamless integration with various programming languages and frameworks, along with support for diverse ecosystems, enhances usability.
  • Ecosystem Tools: Availability of connectors, libraries, and frameworks that facilitate Event Sourcing simplifies implementation and reduces development efforts.

Choosing a message broker that aligns with these critical features is essential for implementing robust Event Sourcing, ensuring data integrity, scalability, and resilience within your application architecture.


Event Sourcing using a Database vs a Message Broker (Streaming Platform)

Use Case Complexity: For simpler applications, or where scalability isn’t a primary concern, a database might suffice. For distributed systems that need high reliability, high scalability, and real-time processing, a message broker is more suitable.

Replay: Event streaming platforms and message brokers store events in FIFO order, one after another as they first appear. That nature makes it easy for the consumer on the other side to follow the natural flow of events and replay the entire “scene.” In a database this is not the case: additional fields, such as timestamps or sequence numbers, must be added to order the data by time, and extra logic is needed to determine the latest state of an entity.
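
To make that concrete, here is a minimal sketch of the database approach using sqlite3; the table and column names are illustrative. Replay only works because an explicit sequence column is added and ordered on, whereas a log-based broker provides this ordering inherently.

import sqlite3

conn = sqlite3.connect(":memory:")
# Without an explicit sequence column, a table gives no ordering guarantee
conn.execute("CREATE TABLE events (seq INTEGER PRIMARY KEY AUTOINCREMENT, type TEXT, amount INTEGER)")
conn.execute("INSERT INTO events (type, amount) VALUES ('Deposit', 100)")
conn.execute("INSERT INTO events (type, amount) VALUES ('Withdrawal', 50)")

# Replay requires ordering logic that a log-based broker provides for free
for event_type, amount in conn.execute("SELECT type, amount FROM events ORDER BY seq"):
    print(event_type, amount)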


Continue your learning: read how and why event sourcing outgrows the database.