Report Memphis.dev Cloud Performance And Load Report

Beta Build Highly Customizable Real-time Pipelines

Blog Ingesting Webhooks From Stripe – The Better Way

Newsletter Get the latest data engineering news-subscribe now!

Memphis

How to increase data quality with Memphis Schemaverse

Idan Asulin December 12, 2022 3 min read

Contents

What is Memphis?

Memphis{dev} is an open-source real-time data processing platform that provides end-to-end support for in-app streaming use cases using Memphis distributed message broker.

“It took me three minutes to do in Memphis what took me a week and a half in Kafka״
Memphis{dev} enables building next-generation applications that require large volumes of streamed and enriched data, modern protocols, zero ops, rapid development, extreme cost reduction, and a significantly lower amount of dev time for data-oriented developers and data engineers.

Four critical pillars –

Performance – Enhanced cache usage
Resiliency – Never lose a message and 99.95% uptime
Observability – Out-of-the-box observability that makes sense and reduces troubleshooting time
Developer Experience – Modularity, inline processing, schema management, gitops. Made by developers for anyone, not just developers.

Memphis key features (current and future) –

A distributed cloud-native message broker
Written in Golang
Low-code UI and CLI for both engineers, ops, and product teams
Out-of-the-box monitoring and notification center
DLQ
Real-time message tracing
Schema management, validation, and transformation
Connectors framework made by memphis team and the community *
Serverless stream enrichment *
Enhanced security and auditing
Object tagging

Data contracts

In a federated data platform, which responsibilities are distributed between stakeholders, teams, and sources, it’s harder to control establish a single standard. This is where the data contracts concept comes into play. Why do data contracts matter? Because (a) they provide insights into who owns what data products. (b) they support setting standards and managing your data pipelines with confidence. They also provide crucial information on what data is being consumed, by whom, and for what purpose. Bottom line: data contracts are essential for robust data management!

The very basic building block to control and ensure the quality of data that flows through your organization between the different owners is by defining well-written schemas and data models.

Why?

Data pipelines are constantly breaking and creating data quality AND usability issues.
There is a communication chasm between service implementers, data engineers, and data consumers.
There are multiple approaches to solving these issues and data engineers are still very much pioneers exploring the frontier of future best practices.

Formats
A quick overview of the most popular formats

Protobuf.
The modern. Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.
Avro
The popular. Apache Avro™ is the leading serialization format for record data, and first choice for streaming data pipelines. It offers excellent schema evolution.
JSON
The simplest. JSON Schema is a vocabulary that allows you to annotate and validate JSON documents.

Serialization process
Serialization converts a data object—a combination of code and data—into a series of bytes that saves the object’s state in an easily transmittable form.
The opposite process is called deserialization.
Serialization is basically represented as a function, each data format with its own implementation, and in case the structure and content of a given data do not match the defined schema in the .proto/.avro/.json struct, the serialization process fails, and therefore, the message will not be sent to the broker.
That process establishes the message’s initial validation before it reaches the broker itself, using the client cache to store the schema locally.

Memphis Schemaverse

Memphis Schemaverse provides a robust schema store and schema management layer on top of memphis broker without a standalone compute or dedicated resources. With a unique and modern UI and programmatic approach, technical and non-technical users can create and define different schemas, attach the schema to multiple stations and choose if the schema should be enforced or not.
Memphis’ low-code approach removes the serialization part as it is embedded within the producer library.
Schemaverse supports versioning, GitOps methodologies, and schema evolution.

Also includes –

Great UI and programmatic approach
Embed within the broker
Zero-trust enforcement
Versioning
Out-of-the-box monitoring
Import & Export schemas
Low/no-code validation and serialization
No configuration needed
Native support in Python, Go, Node.js

Getting Started

Step 1: Create a new schema (Currently only available through Memphis GUI)

Step 2: Attach Schema
Head to your station, and on the top-left corner, click on “+ Attach schema”

Step 3: Code example in node.js
Memphis abstracts the need for external serialization functions and embeds them within the SDK.

Producer (Protobuf example)

<br />

Consumer (Requires .proto file to decode messages)
<br />

Memphis.dev Cloud Performance and Load testing

Shay Bratslavsky Jan 02, 2024 5 min read

Ingesting Webhooks From Stripe – The Better Way

Shoham Roditi Dec 30, 2023 4 min read

Comparing Webhooks and Event Consumption: A Comparative Analysis

Idan Asulin Dec 11, 2023 5 min read

Related Articles

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.