An event stream that enters Kafka persists as a topic, which is what Kafka considers to be a materialized event stream. Simply put, a topic is any event stream at rest. A topic groups related events and stores them durably. For the sake of understanding, you can think of a topic in Kafka as a folder in a file system or a table in a database.
In Kafka, topics decouple producers from consumers. Producers push messages into a topic, and consumers pull messages from it, so neither side needs to know about the other. A topic can have any number of producers and consumers.
Topics in Kafka are divided into partitions. You can think of a partition as the smallest unit of storage, holding a subset of all the records that belong to a particular topic. Logically, each partition is a single append-only log (on disk, Kafka stores it as a series of segment files): new records are written to the end, and existing records are immutable.
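To make the partition idea concrete, here is a minimal in-memory sketch of a topic split into append-only partitions. It is an illustration only, not Kafka's implementation or client API: the `Topic` and `Partition` classes are hypothetical, and CRC32 merely stands in for the hash Kafka's own partitioner uses.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only log: records are only ever added at the end."""
    records: list = field(default_factory=list)

    def append(self, value: bytes) -> int:
        self.records.append(value)
        return len(self.records) - 1  # offset assigned to the new record

    def read(self, offset: int, max_records: int = 10) -> list:
        # Reading never removes anything; consumers just track an offset.
        return self.records[offset:offset + max_records]


class Topic:
    """Hypothetical toy topic: a named group of partitions."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Assumption: CRC32 is a stand-in hash; the same key always
        # maps to the same partition, which preserves per-key ordering.
        return zlib.crc32(key) % len(self.partitions)

    def produce(self, key: bytes, value: bytes) -> tuple:
        index = self.partition_for(key)
        return index, self.partitions[index].append(value)

    def consume(self, partition: int, offset: int, max_records: int = 10) -> list:
        return self.partitions[partition].read(offset, max_records)
```

A producer calls `produce(key, value)` and gets back the partition index and offset; a consumer calls `consume(partition, offset)` and advances its own offset. The two sides share nothing but the topic, which is the decoupling described above.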
Just as Kafka has topics and RabbitMQ has queues, Memphis has stations. A station is a distributed unit that stores messages and gives applications a powerful, easy-to-use message queue. Think of stations as virtual entities that reside on a kind of file known as a “stream”, which stores the data. Stream files are kept either on non-volatile storage or in the broker’s memory, depending on how each station is configured.
Every station has a retention policy that defines how and when messages are removed from it. Retention can be based on the total size of the stored messages, on how long messages have been stored, or on the number of stored messages. Stations are also distributed across one or more Memphis brokers, depending on the number of replicas configured for the station, and data is mirrored across those replicas in a RAID-1 fashion.
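The three retention criteria can be sketched as follows. This is a toy model under stated assumptions, not the Memphis SDK: `RetentionPolicy` and `Station` are hypothetical names, messages live in an in-memory deque, and allowing several limits at once is a choice made for this sketch rather than documented Memphis behavior.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetentionPolicy:
    """Hypothetical limits; None means the limit is not applied."""
    max_bytes: Optional[int] = None
    max_age_seconds: Optional[float] = None
    max_messages: Optional[int] = None


class Station:
    """Hypothetical toy station that evicts the oldest messages first."""

    def __init__(self, policy: RetentionPolicy) -> None:
        self.policy = policy
        self.messages = deque()  # (timestamp, payload), oldest first
        self.total_bytes = 0

    def publish(self, payload: bytes, now: float) -> None:
        self.messages.append((now, payload))
        self.total_bytes += len(payload)
        self.enforce(now)

    def _over_limit(self, now: float) -> bool:
        if not self.messages:
            return False
        p = self.policy
        if p.max_messages is not None and len(self.messages) > p.max_messages:
            return True
        if p.max_bytes is not None and self.total_bytes > p.max_bytes:
            return True
        oldest_timestamp = self.messages[0][0]
        return (p.max_age_seconds is not None
                and now - oldest_timestamp > p.max_age_seconds)

    def enforce(self, now: float) -> None:
        # Drop from the head of the queue until every limit is satisfied.
        while self._over_limit(now):
            _, payload = self.messages.popleft()
            self.total_bytes -= len(payload)
```

With `RetentionPolicy(max_messages=2)`, for instance, publishing a third message evicts the first. In a real deployment the broker would also run the age check periodically instead of only on publish.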