Streams - Docs

A stream defines the schema, content and volume of one data set within a simulation. At a minimum, a stream must have:

a unique name (within its namespace)
a JSON Schema

Currently, you can only add and update streams via your project config.

Schema

Every stream defines a JSON Schema. In the config, it is specified under the schema keyword. When a stream is included in a simulation, rngo guarantees that the data it generates for that stream is valid against this schema.

See JSON Schema for all details.
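
As a minimal sketch, a stream with a filled-in schema might look like the following. The property names (id, email, created_at) are hypothetical, and the example assumes rngo accepts these standard JSON Schema keywords written as YAML; check the JSON Schema page for what is actually supported.

```yaml
streams:
  users:
    schema:
      # Standard JSON Schema keywords, written as YAML
      type: object
      properties:
        id:
          type: integer
        email:
          type: string
          format: email
        created_at:
          type: string
          format: date-time
      required:
        - id
        - email
```

Under the validity guarantee described above, every generated users event would be an object with an integer id and a string email.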

Rate

Use the rate keyword to specify how often the stream should produce new events, expressed in hertz (events per second). For example, this stream produces events at a rate of roughly 1 event every 10 seconds:

streams:
  users:
    rate: 0.1
    schema:
      #...

rngo builds in variance, so the observed rate over any sub-interval of the simulation may be higher or lower than the configured one.

The value is an expression, so to make the rate increase over time, you could do something like this:

streams:
  users:
    rate: 0.1 + (0.0001 * sim.offset)
    schema:
      #...

The expression is sampled periodically over the course of the simulation, so the rate will change in steps.

The evaluated rate is always adjusted so that it is greater than or equal to zero and less than 1000 events per second.
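
To make the arithmetic concrete, here is the increasing-rate expression from above annotated with the values it would evaluate to. This assumes sim.offset is measured in seconds since the start of the simulation, which is not stated on this page. The signups stream is a hypothetical example that also assumes subtraction is supported in expressions.

```yaml
streams:
  users:
    # Assuming sim.offset is in seconds:
    #   offset 0     -> 0.1 + 0.0001 * 0     = 0.1 events/s
    #   offset 1000  -> 0.1 + 0.0001 * 1000  = 0.2 events/s
    #   offset 9000  -> 0.1 + 0.0001 * 9000  = 1.0 events/s
    rate: 0.1 + (0.0001 * sim.offset)
    schema:
      #...
  signups:  # hypothetical stream with a decaying rate
    #   offset 500   -> 1 - 0.001 * 500   = 0.5 events/s
    #   offset 2000  -> 1 - 0.001 * 2000  = -1, adjusted up to 0
    rate: 1 - (0.001 * sim.offset)
    schema:
      #...
```

Because the expression is sampled periodically, the actual rate would move between such values in steps rather than continuously.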

Outputs

The outputs key configures how a stream outputs its data.

Each value in the outputs object either directly defines an output format or references a system.

Direct Outputs

You can directly specify a stream's output format like this:

streams:
  users:
    outputs:
      JSON:
        format: json
    schema:
      #...

When run to a file sink, the users stream will output one or more JSON files under /streams/users/JSON/. See Outputs for all configuration options.

If outputs is not specified for a stream, the following output configuration will be used by default:

outputs:
  default:
    format: json

System Outputs

You can also direct a stream's output to a system like this:

systems:
  db:
    type: postgres
streams:
  orders:
    outputs:
      database:
        system: db
        parameters:
          table: ORDER
    schema:
      #...

The orders stream will output CSV, because that is the format defined by the postgres system type. The system's import script will know to look for the CSV file(s) at /streams/orders/database/.

You may override system parameters in the associated object. For the above example, the import script will attempt to load the data into the ORDER table.

As shorthand, you can use the system name as the output key and omit the system property. This is a more concise way to write the above:

systems:
  db:
    type: postgres
streams:
  orders:
    outputs:
      db:
        parameters:
          table: ORDER
    schema:
      #...
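
Since each key under outputs is its own entry, a stream can presumably declare a direct output and a system output side by side. The sketch below assumes that mixing is allowed and that output directories follow a /streams/<stream>/<output key>/ pattern, as the direct output example suggests; neither is confirmed on this page.

```yaml
systems:
  db:
    type: postgres
streams:
  orders:
    outputs:
      # Direct output: JSON files, expected under /streams/orders/JSON/
      JSON:
        format: json
      # System output (shorthand): CSV for the db system,
      # expected under /streams/orders/db/
      db:
        parameters:
          table: ORDER
    schema:
      #...
```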