Data Formats (Serialization)

Protocol Buffers and Descriptors

A Descriptor defines the serialization format of a record’s native data. Descriptors are defined for a Topic within a Store. This allows records within a Store to use different serialization formats, while still giving each stream of records through the Store a clear schema definition. Currently, DeltaStream uses Descriptors to support data in ProtoBuf format. For more information on how to import and create Descriptors in DeltaStream, see "Working with ProtoBuf Serialized Data and DeltaStream Descriptors" and CREATE DESCRIPTOR_SOURCE.

JSON

JSON serialization is assumed for a Topic if no Schema Registry is defined for the corresponding Store and no Descriptor is defined for the Topic. If a Schema Registry is present in the Store and a Descriptor is also defined for a Topic, that Topic is serialized using the Descriptor.
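
The precedence described above can be sketched as a small function. This is only an illustration of the rule, not DeltaStream's implementation: the `Store` and `Topic` classes, their fields, and `resolve_format` are hypothetical names. The case where only a Schema Registry is present is an assumption here; it is mapped to Avro because Avro is the registry-backed format this page covers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Topic:
    name: str
    # Hypothetical field: name of the Descriptor defined for this Topic, if any.
    descriptor: Optional[str] = None


@dataclass
class Store:
    name: str
    # Hypothetical field: name of the Schema Registry attached to this Store, if any.
    schema_registry: Optional[str] = None


def resolve_format(store: Store, topic: Topic) -> str:
    """Return the serialization format assumed for a Topic in a Store."""
    # A Descriptor on the Topic takes precedence, even when the Store
    # also has a Schema Registry.
    if topic.descriptor is not None:
        return "PROTOBUF"
    # Assumption: with only a Schema Registry, the Topic uses a
    # registry-backed format such as Avro.
    if store.schema_registry is not None:
        return "AVRO"
    # Neither a Schema Registry nor a Descriptor is defined: JSON is assumed.
    return "JSON"


if __name__ == "__main__":
    store = Store("kafka_store", schema_registry="my_registry")
    print(resolve_format(store, Topic("pageviews", descriptor="pageviews_msg")))
    print(resolve_format(Store("plain_store"), Topic("clicks")))
```

Checking the Descriptor first mirrors the rule that a Descriptor wins whenever both are present.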

Avro and Schema Registry

A Schema Registry is a service that manages message schemas in streaming stores such as Apache Kafka. Message schemas are used to serialize and deserialize the messages stored in Topics. In DeltaStream, a Schema Registry represents a Schema Registry service and can be used to fetch and store schemas for Topics in that service. A Store can use one Schema Registry at a time, but a Schema Registry can be used by multiple Stores. When a Schema Registry is attached to a Store, any Topics in that Store whose serialization format requires a Schema Registry use the Store’s associated Schema Registry to fetch the schemas and metadata needed to marshal and unmarshal data events. The schema for an event, fetched from the Schema Registry, should not be confused with the Schema that belongs to a Database and organizes Relations.
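
The relationship between Stores and Schema Registries can be pictured with a minimal data model. This is a sketch under assumptions, not DeltaStream's API: `RegistryRef`, `StoreConfig`, `attach`, and the example URL are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegistryRef:
    # Hypothetical representation of a Schema Registry service.
    name: str
    url: str


@dataclass
class StoreConfig:
    name: str
    # A Store holds at most one Schema Registry at a time.
    schema_registry: Optional[RegistryRef] = None

    def attach(self, registry: RegistryRef) -> None:
        # Attaching a registry replaces any previously attached one,
        # so the "one at a time" constraint always holds.
        self.schema_registry = registry


if __name__ == "__main__":
    shared = RegistryRef("shared_registry", "https://registry.example.com")
    other = RegistryRef("other_registry", "https://other.example.com")

    store_a = StoreConfig("store_a")
    store_b = StoreConfig("store_b")

    # The same Schema Registry can be used by multiple Stores.
    store_a.attach(shared)
    store_b.attach(shared)

    # Switching store_a to another registry leaves store_b untouched.
    store_a.attach(other)
    print(store_a.schema_registry.name, store_b.schema_registry.name)
```

Keeping a single optional reference on the Store, rather than a list, encodes the one-registry-per-Store rule directly in the data structure.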
