Stateful Data Flow Beta Build composable event-driven data pipelines in minutes.

Get Started Now

Dataflow - Inline Definitions

Inline Dataflows are dataflow.yaml files that include everything necessary to run a data pipeline. Inline dataflows are useful for trying out various features of the product. Deploying an inline dataflow is simple:

  1. Download (or create) a dataflow file
  2. Run the dataflow

While inline dataflows are a breeze to get started with, maintaining code in yaml is not always ideal. For complex projects, we recommend using Composable Dataflows.

 

Create a Dataflow

Let’s create a simple dataflow to split a sentence into words and count the words. This is the same example we used in the Composable Dataflows section, so you can examine both sections for a comparission.

 
1. Create the Dataflow file

Ceate a dataflow file in the directory split-sentence directory:

$ mkdir -p split-sentence-inline
$ cd split-sentence-inline

Create the dataflow.yaml and add the following content:

#dataflow.yanl
apiVersion: 0.4.0

meta:
  name: split-sentence-inline
  version: 0.1.0
  namespace: example

config:
  converter: raw

topics:
  sentence:
    schema:
      value:
        type: string
        converter: raw
  words:
    schema:
      value:
        type: string
        converter: raw

services:
  sentence-words:
    sources:
      - type: topic
        id: sentence

    transforms:
      - operator: flat-map
        run: |
          fn sentence_to_words(sentence: String) -> Result<Vec<String>, String> {
            Ok(sentence.split_whitespace().map(String::from).collect())
          }                
      - operator: map
        run: |
          pub fn augment_count(word: String) -> Result<String, String> {
            Ok(format!("{}({})", word, word.chars().count()))
          }                  

    sinks:
      - type: topic
        id: words
 
2. Run the Dataflow

Use sdf command line tool to run the dataflow:

$ sdf run --ui

Use --ui to generate the graphical representation and run the Studio.

 
3. Test the Dataflow
  1. Produce sentences to in sentence topic:
$ fluvio produce sentence
Hello world
Hi there

Consume from words to retrieve the result:

$ fluvio consume words -Bd
Hello(5)
world(5)
Hi(2)
there(5)
 
4. Show State

The dataflow collects runtime metrics that you can inspect in the runtime terminal.

Check the sentence-to-words counters:

>> show state sentence-words/sentence-to-words/metrics --table
 Key    Window  succeeded  failed
 stats  *       2          0

Check the augment-count counters:

>> show state sentence-words/augment-count/metrics --table
 Key    Window  succeeded  failed
 stats  *       4          0

Congratulations! You’ve successfully built and run a composable dataflow! The project is available for download in [github].

 
5. Clean-up

Exit sdf terminal and remove the topics:

fluvio topic delete sentence
fluvio topic delete words
 

References