In DeltaStream, Stores provide a layer of abstraction around raw streaming data, but before we can process that data in queries, we need to describe its structure. Relations define the metadata and data format that describe the structure of the data in its native format.
Understanding the Data
Let’s say we have defined an Apache Kafka Store, and it has a few Topics:
Let’s assume all Topics are in JSON format. See CREATE STORE and UPDATE TOPIC for using other serialization formats, and refer to each Relation’s DDL statement, e.g. CREATE STREAM, for details about data formats.
We can inspect the Topics to understand what kind of data we have. Here is the ds_pageviews Topic:
Now that we know what the data looks like in our Topics, we can attach a structure to them for reference in queries. In our example, ds_pageviews is a continuous stream of immutable page events from our users, so we define a Stream for it:
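As a minimal sketch, assuming the ds_pageviews records carry viewtime, userid, and pageid fields, a Stream definition might look like the following. The Stream name pageviews and the column names and types are illustrative assumptions, not the actual Topic contents; only the Topic name and the JSON format come from the text above. See CREATE STREAM for the authoritative syntax and the full list of properties.

```sql
-- Hypothetical Stream over the ds_pageviews Topic.
-- Column names and types are assumed for illustration.
CREATE STREAM pageviews (
  viewtime BIGINT,   -- assumed: event timestamp in epoch milliseconds
  userid   VARCHAR,  -- assumed: identifier of the user who viewed the page
  pageid   VARCHAR   -- assumed: identifier of the page that was viewed
) WITH (
  'topic' = 'ds_pageviews',  -- the Topic this Stream reads from
  'value.format' = 'JSON'    -- the Topics are assumed to be JSON-encoded
);
```

Each record in the Topic is treated as an independent, append-only event, which matches the immutable nature of page views.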
Since the ds_users Topic hosts user information that changes over time, we define a Changelog for it to capture ongoing changes to each userid:
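A Changelog declares a primary key so that each new record is interpreted as an update to the row with the same key. The sketch below assumes a Changelog named users_log with a small set of illustrative columns; only the Topic name ds_users and the userid key come from the text above. Depending on how the Topic's keys are serialized, additional key-related properties may be needed, so treat this as a starting point and consult CREATE CHANGELOG for the exact options.

```sql
-- Hypothetical Changelog over the ds_users Topic.
-- Columns other than userid are assumed for illustration.
CREATE CHANGELOG users_log (
  registertime BIGINT,   -- assumed: registration timestamp in epoch milliseconds
  userid       VARCHAR,  -- the key whose changes are tracked
  regionid     VARCHAR,  -- assumed: region the user belongs to
  PRIMARY KEY (userid)   -- records with the same userid are treated as updates
) WITH (
  'topic' = 'ds_users',    -- the Topic this Changelog reads from
  'value.format' = 'JSON'  -- the Topics are assumed to be JSON-encoded
);
```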
For certain applications, it may be more useful to have access to a snapshot of the resulting data. See Materialized View and CREATE MATERIALIZED VIEW AS for more information on how to create a view for the data.
Once Relations are defined, they can be listed through their Database and Schema:
Once Relations are defined for Topics, they become available in DeltaStream as consumable entities. For example, they can be used in interactive queries:
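For instance, assuming a Stream named pageviews has been defined over the ds_pageviews Topic with userid and pageid columns (the Relation and column names here are hypothetical), an interactive query could look like this:

```sql
-- Continuously print matching events from the hypothetical pageviews Stream.
-- The filter value 'User_1' is an assumed example, not real data.
SELECT userid, pageid
FROM pageviews
WHERE userid = 'User_1';
```

Because the underlying data is a continuous stream, a query like this keeps emitting results as new records arrive rather than returning a fixed result set.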