Tutorial
Outputs and Systems
By default, a stream's data is output to a JSON file, but this can be changed by explicitly specifying an output for the stream. Update the config file to output the data as CSV:
streams:
  users:
    outputs:
      - format: csv
    schema:
      type: object
      # etc
Rerun the simulation. The new file will now be 001.csv and its contents will be in CSV format:
id,full_name,created_at
"1","Micaela Batz","2024-05-28T05:16:38.635516+00:00"
"2","Everette Krajcik","2024-05-29T00:50:52.439516+00:00"
...
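If you want a quick sanity check of a generated file, a one-line awk pass can confirm that every row has the same number of fields as the header. This is a rough check only, since it doesn't account for commas inside quoted values:

```shell
# Write a small sample in the same shape as the generated file, then check
# that every row's field count matches the header's (naive: ignores commas
# inside quoted values).
printf '%s\n' 'id,full_name,created_at' \
  '"1","Micaela Batz","2024-05-28T05:16:38.635516+00:00"' > sample.csv
awk -F, 'NR == 1 { n = NF } NF != n { bad = 1 } END { exit bad }' sample.csv \
  && echo "field counts consistent"
```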
Next we'll specify a Postgres system, but first let's set things up. Install Docker if you haven't already, and add this compose.yml:
version: '3.8'
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: pw
      POSTGRES_USER: rngo_tutorial
      POSTGRES_DB: rngo_tutorial
    ports:
      - "5431:5432"
    volumes:
      - ./db:/docker-entrypoint-initdb.d/
Add db/1-users.sql with a table that corresponds to the shape of the users stream:
CREATE TABLE users (
  id bigserial PRIMARY KEY,
  full_name text NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);
Run docker compose up -d and update the config:
systems:
  db:
    output:
      format: csv
    scripts:
      import: |
        PGPASSWORD="$RNGO_DB_PASSWORD" \
        psql \
          -h $RNGO_DB_HOST \
          -p $RNGO_DB_PORT \
          -U $RNGO_DB_USER \
          -d $RNGO_DB_DATABASE \
          -c "TRUNCATE {{table}} CASCADE;" \
          -c "\\COPY {{table}} FROM {{dataFile}} CSV HEADER;"

streams:
  users:
    systems:
      db:
        parameters:
          table: users
    schema:
      type: object
      # etc
We've added a simple system called "db" that defines how to import data into a Postgres database. It asks for the data to be output in CSV format, and defines a bash import script that copies the CSV into a table.
The script is a template: stream-specific parameters may be interpolated via mustache syntax. In this case, we're referencing {{table}}, which is defined in the stream, and {{dataFile}}, which is the path to the CSV file and is provided by rngo.
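To make the expansion concrete, here's a rough illustration (not rngo's actual implementation) of what resolving those parameters amounts to, using sed as a stand-in for the mustache substitution; the file path is just an example:

```shell
# Stand-in for rngo's template resolution: substitute {{table}} and
# {{dataFile}} into the psql command. The path below is a made-up example.
template='\COPY {{table}} FROM {{dataFile}} CSV HEADER;'
resolved=$(printf '%s' "$template" \
  | sed -e 's|{{table}}|users|' -e 's|{{dataFile}}|/tmp/001.csv|')
printf '%s\n' "$resolved"
```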
The script also references system-specific environment variables that must be available when the sim command is run. Add a .env file for this:
RNGO_DB_HOST=localhost
RNGO_DB_PORT=5431
RNGO_DB_USER=rngo_tutorial
RNGO_DB_PASSWORD=pw
RNGO_DB_DATABASE=rngo_tutorial
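If you want those same variables in your own shell (handy when poking at the database by hand), POSIX allexport mode will load the file. This manual step is only for debugging; the .env file above is what the sim command consumes:

```shell
# Load .env into the current shell for manual debugging. A trimmed-down
# example file is written here so the snippet is self-contained.
cat > .env <<'EOF'
RNGO_DB_HOST=localhost
RNGO_DB_PORT=5431
EOF
set -a   # allexport: every variable assigned from here on is exported
. ./.env
set +a
printf '%s\n' "${RNGO_DB_HOST}:${RNGO_DB_PORT}"
```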
Now run rngo sim again and then run:
PGPASSWORD=pw psql \
  -c "SELECT * FROM users" \
  -h localhost \
  -p 5431 \
  -U rngo_tutorial \
  rngo_tutorial
The result should be something like:
 id |    full_name     |          created_at
----+------------------+-------------------------------
  1 | Micaela Batz     | 2024-05-28 05:16:38.635516+00
  2 | Everette Krajcik | 2024-05-29 00:50:52.439516+00
...
 29 | Jacey Dicki      | 2024-06-24 21:33:51.897516+00
 30 | Agnes Brakus     | 2024-06-26 02:20:23.631516+00
You'll see that the (fully-resolved) import script is part of the downloaded data at .rngo/simulations/last/import.sh; rngo sim runs this script as its last step.
You can customize systems to meet your needs, but rngo provides a default Postgres system definition that is equivalent to the one above. Update the config to reference the default:
systems:
  db:
    type: postgres

streams:
  users:
    systems:
      db: {}
    schema:
      type: object
      # etc