SDF Beta2 update

Sehyo Chang

Contributor, InfinyOn

SDF Beta2 is here! This release brings a number of exciting new features and improvements to SDF.

SDF Quickstart

Separation of Run and Deploy

Prior to Beta2, the --ephemeral flag on the sdf run command was used to run an SDF dataflow without deploying it. In Beta2, a new sdf deploy command deploys dataflows, and sdf run is now used only for running dataflows locally (without deploying). The --ephemeral flag is deprecated and will be removed in a future release.
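
For reference, the split looks like this (only the commands mentioned above; no additional flags are assumed):

# Before Beta2 (now deprecated): run without deploying
sdf run --ephemeral

# Beta2: run the dataflow locally, without deploying
sdf run

# Beta2: deploy the dataflow
sdf deploy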

Custom Serialization and Deserialization

SDF natively supports mapping JSON data into SDF objects. However, in some cases, you may want to override the default mapping behavior. For example, suppose you have the following JSON data representing a temperature reading:

{
  "id": "A123",
  "location": "New York",
  "temperature": 25.0
}

However, your data may use field names that differ from those defined by your SDF types. For example, instead of id, your SDF type may use device-id:

  device:
    type: object
    properties:
      device-id:
        type: string
      location:
        type: string
      temperature:
        type: string

Prior to Beta2, JSON data had to be mapped to the SDF object manually. In Beta2, custom serialization and deserialization can be defined using the new serialize and deserialize properties in the type schema, including rename mapping. For example, you can use the schema to remap id to device-id as follows:

  device:
    type: object
    properties:
      device-id:
        type: string
        deserialize:
          rename: id
      location:
        type: string
      temperature:
        type: string
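
A serialize property is mentioned alongside deserialize. A minimal sketch of the device-id property, assuming rename is also honored symmetrically on output so the field is written back out as id (the exact serialize syntax is an assumption; see the reference linked below):

      device-id:
        type: string
        deserialize:
          rename: id
        serialize:
          rename: id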

For more information, see SDF Custom Serialization and Deserialization.

Operator Logging everywhere

Prior to Beta2, operator logs were dumped to a local file and were not accessible for non-ephemeral dataflows. In Beta2, operator logs are stored in Fluvio topics and work with both ephemeral (run) and non-ephemeral (deploy) dataflows. This allows you to access operator logs from dataflows running in the cloud as well as those running locally.
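
Because the logs live in Fluvio topics, they can also be read with the standard Fluvio CLI. A minimal sketch, assuming a hypothetical topic name sdf-operator-logs (the actual topic name depends on your deployment; check fluvio topic list):

# List topics to find the one holding operator logs (name is deployment-specific)
fluvio topic list

# Stream log records from the beginning of the (hypothetical) log topic
fluvio consume sdf-operator-logs -B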

In addition, you can now filter logs by operator name or dataflow. This is useful when you have multiple operators running in the same dataflow.

Deployment improvements

In Beta2, we have made improvements to dataflow deployments. Dataflows can now be stopped and restarted without having to delete and recreate them, which is useful when you want to temporarily stop a dataflow for maintenance or debugging. In addition, dataflow operations now use the fully qualified dataflow name (for example, myorg/my-dataflow@0.1.0) instead of just the dataflow name, to avoid conflicts with other dataflows. For more information, see SDF Deployment.
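
The fully qualified name is presumably derived from the dataflow's metadata. A minimal sketch of the relevant dataflow.yaml section with illustrative values (the apiVersion and names below are placeholders, not taken from this post):

apiVersion: 0.5.0          # placeholder; use the apiVersion your SDF release expects
meta:
  name: my-dataflow        # illustrative dataflow name
  version: 0.1.0
  namespace: myorg
# fully qualified name: myorg/my-dataflow@0.1.0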

For a full list of changes, see SDF What’s New.