# rngo Effect Inference Skill

Your job is to create a configuration file for each **effect** inferred from the ***system*** context.

## Overview

An **effect** models any interaction with a system, such as a database insert.

Effects are defined in YAML files in the `.rngo/effects/` directory, with one file per effect named `{EFFECT_KEY}.yml`. Each file's extension MUST BE `.yml` and should NEVER BE `.yaml`.

## Effect Components

### 1. Schema

Every effect MUST have a `schema` that defines the structure and content of generated data.

#### JSON Schema Types

**string** - Random string generation
```yaml
type: string
minLength: 5
maxLength: 20
```

**integer** - Random integers
```yaml
type: integer
minimum: 1
maximum: 100
```

**number** - Random numbers (including decimals)
```yaml
type: number
minimum: 0.0
maximum: 1.0
```

**boolean** - Random boolean values
```yaml
type: boolean
```

**null** - Always null. Note: `null` must be quoted so that YAML reads it as a string
```yaml
type: 'null'
```

**const** - Constant value
```yaml
type: const
const: fixed value
```

**enum** - Random selection from list
```yaml
type: enum
enum:
  - value1
  - value2
  - value3
```

**array** - Arrays with item schema
```yaml
type: array
items:
  type: string
minItems: 1
maxItems: 10
```

**object** - Objects with properties
```yaml
type: object
properties:
  name:
    type: string
  age:
    type: integer
```

#### Core Types

**function** - CEL expression evaluation
```yaml
type: function
expression: "username + '@' + domain"
variables:
  username:
    type: string
    maxLength: 10
  domain:
    type: enum
    enum:
      - example.com
      - example.org
```

**reference** - Sample a value generated by another effect
```yaml
type: reference
effect: users.create
unique: false # true = no replacement, false = with replacement
```

**select** - Weighted random selection from multiple options
```yaml
type: select
options:
  - weight: 1
    stream:
      type: integer
      minimum: 1
      maximum: 100
  - weight: 4
    stream:
      type: string
      minLength: 5
      maxLength: 10
```

**constant** - Constant stream
```yaml
type: constant
value: static value
```

**util.nullable** - Nullable wrapper for other types
```yaml
type: util.nullable
schema:
  type: string
```

#### Domain Types

**person.name** - Full name generation
```yaml
type: person.name
```

**person.givenName** - First name
```yaml
type: person.givenName
```

**person.familyName** - Last name
```yaml
type: person.familyName
```

**internet.email** - Email address
```yaml
type: internet.email
```

**phone.number** - Phone number
```yaml
type: phone.number
```

**address.address1** - Street address line 1
```yaml
type: address.address1
```

**address.address2** - Street address line 2
```yaml
type: address.address2
```

**address.postalCode** - Postal/ZIP code
```yaml
type: address.postalCode
```

**address.state** - State/province
```yaml
type: address.state
```

**id.integer** - Auto-incrementing integer ID
```yaml
type: id.integer
```

**time.now** - Current simulation time
```yaml
type: time.now
```

**time.offset** - Simulation time offset
```yaml
type: time.offset
```

**content.lorem** - Lorem ipsum text
```yaml
type: content.lorem
```

### 2. Trigger

`trigger` defines how frequently the effect generates values. The value is in Hertz, but time unit helpers should be used whenever possible to improve readability.

**Static trigger with time units:**

10 events per day:
```yaml
trigger: 10 / day
```

100 events per hour:
```yaml
trigger: 100 / hour
```

1 event every 30 seconds:
```yaml
trigger: 1 / seconds(30)
```

1 event every 2 hours:
```yaml
trigger: 1 / hours(2)
```

**Time unit constants** (values in seconds):
- `second` = 1
- `minute` = 60
- `hour` = 3600
- `day` = 86400
- `week` = 604800
- `month` = 2419200 (28 days, fixed)
- `year` = 31536000 (365 days, fixed)

**Time unit functions** (for multiplication):
- `seconds(n)`, `minutes(n)`, `hours(n)`, `days(n)`, `weeks(n)`, `months(n)`, `years(n)`
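
As a rough check on the arithmetic, each helper expression reduces to a plain rate in Hz. The sketch below restates the examples above as separate YAML documents with the approximate Hz value in a comment. It assumes the constants and functions expand to exactly the second counts listed above.

```yaml
# Each document is an independent fragment; the comment shows the equivalent rate.
trigger: 10 / day          # 10 / 86400   ≈ 0.000116 Hz
---
trigger: 100 / hour        # 100 / 3600   ≈ 0.0278 Hz
---
trigger: 1 / seconds(30)   # 1 / 30       ≈ 0.0333 Hz
---
trigger: 1 / hours(2)      # 1 / 7200     ≈ 0.000139 Hz
---
trigger: 1 / month         # 1 / 2419200  ≈ 0.000000413 Hz (fixed 28-day month)
```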

**Static numeric trigger:**
```yaml
trigger: '5' # 5 values per second
```

**Dynamic trigger expression:**
```yaml
trigger: (10 / day) + (0.0001 * offset) # Increases over time
```

The trigger expression is sampled periodically and can reference `offset` (simulation time in milliseconds). Rates are clamped between 0 and 1000 Hz.
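
To get a feel for how quickly a dynamic expression grows, it can help to evaluate it by hand at a few offsets. The worked values below are a sketch that assumes `offset` is in milliseconds, as stated above, and that the expression is evaluated as ordinary arithmetic before clamping.

```yaml
# The linear term grows by 0.0001 Hz per millisecond, i.e. 0.1 Hz per simulated second.
trigger: (10 / day) + (0.0001 * offset)
# offset = 0          -> 0.000116 + 0    ≈ 0.000116 Hz
# offset = 60000      -> 0.000116 + 6    ≈ 6 Hz        (after 1 minute)
# offset = 3600000    -> 0.000116 + 360  ≈ 360 Hz      (after 1 hour)
# offset >= 10000000  -> 1000 Hz or more, clamped to 1000 Hz (after about 2.8 hours)
```

Under these assumptions the example coefficient hits the 1000 Hz clamp within a few simulated hours. A much smaller coefficient gives slower growth; for instance, `0.000000001 * offset` would add roughly 0.0864 Hz per simulated day.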

### 3. System

`system` associates an effect with a system defined in `.rngo/systems/`.

```yaml
system: mydb
```

### 4. Format

`format` sets effect-specific properties for the system's format. For example, for a SQL system, you should set the table like this:

```yaml
format:
  table: USERS
```

## Complete Effect Examples

### User Effect (Standalone)
```yaml
# .rngo/effects/users.create.yml
rate: 1 / day
system: db
format:
  table: users
schema:
  type: object
  properties:
    id:
      type: id.integer
    name:
      type: person.name
    email:
      type: internet.email
    createdAt:
      type: time.now
```

### Post Effect (with References)
```yaml
# .rngo/effects/posts.create.yml
rate: 0.5 / day
system: db
format:
  table: POSTS
schema:
  type: object
  properties:
    id:
      type: id.integer
    title:
      type: string
      minLength: 10
      maxLength: 100
    content:
      type: content.lorem
    authorId:
      type: function
      expression: author.id
      variables:
        author:
          type: reference
          effect: users.create
    createdAt:
      type: time.now
```

### Order Effect
```yaml
# .rngo/effects/orders.create.yml
rate: '2'
system: db
format:
  table: ORDERS
schema:
  type: object
  properties:
    id:
      type: id.integer
    userId:
      type: function
      expression: user.id
      variables:
        user:
          type: reference
          effect: users.create
    total:
      type: number
      minimum: 10.00
      maximum: 1000.00
    status:
      type: enum
      enum:
        - pending
        - completed
        - cancelled
```

### Database Table Effect
```yaml
# .rngo/effects/customers.create.yml
rate: 1 / day
system: db
format:
  table: CUSTOMERS
schema:
  type: object
  properties:
    id:
      type: id.integer
    name:
      type: person.name
    email:
      type: internet.email
```

## Inference Best Practices

### 1. Analyze Database Schema
- Use `object` for tables and collections - property names MUST EXACTLY match table column names
- Map columns to `object` properties with sub-schemas matching the column type
- Use `id.integer` for primary keys
- Use `reference` for foreign keys
- Use `util.nullable` for nullable columns
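
The mapping rules above can be sketched on a small table. The `comments` table, the `posts.create` effect it references, and the system name `db` are hypothetical and only serve to illustrate the column-to-property mapping.

```yaml
# Hypothetical table from a schema dump:
#   CREATE TABLE comments (
#     id INTEGER PRIMARY KEY,
#     post_id INTEGER NOT NULL REFERENCES posts(id),
#     body TEXT NOT NULL,
#     edited_at TEXT
#   );
#
# .rngo/effects/comments.create.yml
system: db
format:
  table: comments
schema:
  type: object
  properties:
    id:                       # primary key
      type: id.integer
    post_id:                  # foreign key; property name matches the column exactly
      type: function
      expression: post.id
      variables:
        post:
          type: reference
          effect: posts.create
    body:
      type: content.lorem
    edited_at:                # nullable column
      type: util.nullable
      schema:
        type: time.now
```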

### 2. Determine Entity Relationships
- Use `reference` type for foreign key relationships
- Wrap references in `function` to extract specific fields
- Set `unique: true` for one-to-one relationships
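
A one-to-one relationship might look like the following sketch. The `profiles` table and its columns are hypothetical; the example assumes a `users.create` effect exists and relies on `unique: true` sampling without replacement, as described for the `reference` type.

```yaml
# .rngo/effects/profiles.create.yml (hypothetical one-to-one with users)
system: db
format:
  table: profiles
schema:
  type: object
  properties:
    id:
      type: id.integer
    user_id:
      type: function
      expression: user.id     # extract only the id field from the sampled user
      variables:
        user:
          type: reference
          effect: users.create
          unique: true        # no replacement, so each user is picked at most once
    bio:
      type: content.lorem
```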

### 3. Choose Appropriate Rates
- High-frequency entities (events, logs): `rate: '100 / hour'` to `rate: '1000 / hour'`
- Medium-frequency (posts, orders): `rate: '10 / hour'` to `rate: '50 / hour'`
- Low-frequency (users, products): `rate: '1 / hour'` to `rate: '10 / day'`
- Use dynamic rates for growing datasets: `rate: '(10 / day) + (0.0001 * offset)'`
- Use time unit functions for precise intervals: `rate: '1 / minutes(5)'` (once every 5 minutes)

### 4. Always Specify System
- Use `system: NAME` where NAME is the name of the system from whose context this effect was inferred.

### 5. Use Domain Types
- Prefer domain types (`person.name`, `internet.email`) over generic types
- Domain types generate realistic, properly formatted data
- Use `id.integer` for all ID fields
- Use `time.now` for timestamps

### 6. Use Functions Only When Better Alternatives Do Not Exist
- Use `function` to compute derived values
- Use `function` to extract fields from `reference` types
- Access fields with dot notation: `user.profile.age`
- CEL operation support:
  - arithmetic: `-`, `+`, `*`, `/`
  - relations: `==`, `!=`, `<`, `<=`, `>`, `>=`
  - logic: `!`, `&&`, `||`
  - string concatenation: `+`
- ONLY USE variables declared in the function's `variables`, and ONLY USE the operations listed here
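
As an illustration of a derived value, the sketch below computes a line-item total from two variables using only the arithmetic operators listed above. The names `unitPriceCents` and `quantity` are hypothetical; both are integers so the multiplication stays within a single numeric type.

```yaml
# Hypothetical derived field: total price in cents for one order line.
type: function
expression: unitPriceCents * quantity
variables:
  unitPriceCents:
    type: integer
    minimum: 100
    maximum: 10000
  quantity:
    type: integer
    minimum: 1
    maximum: 5
```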

### 7. Handle Null/Optional Values
- Use `util.nullable` wrapper for optional fields
- Use `enum` with null as an option
- Consider minimum/maximum constraints
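
Both approaches to optional values can be sketched in one fragment. The column names here are hypothetical, and how often each form actually yields null is not specified in this document.

```yaml
type: object
properties:
  middle_name:              # optional column wrapped in util.nullable
    type: util.nullable
    schema:
      type: person.givenName
  referral_source:          # enum with null as one of the options
    type: enum
    enum:
      - search
      - friend
      - null                # unquoted, so YAML parses this as a null value
```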

### 8. File Naming
- Use `[SCOPE].[ACTION].yml` format, where scope is the name of the table, collection or other entity that the effect applies to, e.g. `users.create.yml`
- If there is no obvious target, use simply `[ACTION].yml`, e.g. `sign-out.yml`
- If there are multiple systems, use `[SYSTEM].[SCOPE].[ACTION].yml`, e.g. `mysql.posts.create.yml`
- Use table / collection / etc. as the scope when possible
- Use lower case for everything and kebab-case for multi-word parts, e.g. `order-items.create-empty.yml`

## Validation Rules

1. Every effect MUST have a `schema` field
2. The `schema` MUST have a `type` field
3. Every effect MUST have a `system` field
4. The `rate` field, if present, must be a string (expression)
5. References must point to existing effects
6. Function expressions must be valid CEL syntax
7. Domain types must be from the supported list

## System Inference

Every effect MUST be inferred from a system context. For example, if this **system** is defined in `.rngo/systems/db.yml`:

```yaml
# .rngo/systems/db.yml
rate: 1 / day
format:
  type: sql
import:
  command: sqlite3 db.sqlite
infer:
  context:
    description: sqlite schema dump
    command: sqlite3 db.sqlite '.schema'
```

You should write a `create` effect for every table in the schema dump, each with a `system` field set to `db`. For example:

```yaml
# .rngo/effects/users.create.yml
system: db
action: create
format:
  table: users
schema:
  type: object
  properties:
    id:
      type: id.integer
    email:
      type: internet.email
    name:
      type: person.name
```

## Output Format

Write each effect to: `.rngo/effects/{EFFECT_KEY}.yml`

Use YAML format with proper indentation (2 spaces).

Always validate:
- Schema structure is correct
- `system` is set, and `format` properties match that system's format
- References point to valid effects
- Rates are reasonable for the effect type