Migrating to v0.5.0

The "where does this run, where does its state live, and which file describes that" question has been spread across six config files (profiles.yaml, backends.yaml, pipelines.yaml, sources.yaml, runners.yaml, sparks.yaml) and four flags (--on, --sw-on, --sw-profile, --sw-target). v0.5.0 collapses that to two files (.sparkwing/sparkwing.yaml and ~/.config/sparkwing/profiles.yaml), two flags (--profile and --target), and a new pipeline trigger verb. There are no shims and no deprecation runway: this is a hard-cut release. Plan on touching every repo that has a .sparkwing/ directory and every ~/.config/sparkwing/profiles.yaml once.

If you only run sparkwing locally with no shared state, the migration is a five-minute file rename. If your team uses S3 or a controller, expect ~30 minutes to flatten the YAML and re-issue tokens through the profile model.

Single .sparkwing/sparkwing.yaml per repo

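The project-level .sparkwing/ directory loses five files (pipelines.yaml, backends.yaml, runners.yaml, sources.yaml, sparks.yaml) and gains one (sparkwing.yaml). The content is largely the same, regrouped under top-level keys; the inline comments in the example mark the fields that were renamed. backends.yaml is the exception: its backend specs move into profiles.yaml (see "Profiles absorb all backend specs").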

Before:

.sparkwing/
  pipelines.yaml
  backends.yaml
  runners.yaml
  sources.yaml
  sparks.yaml

After:

.sparkwing/
  sparkwing.yaml

# .sparkwing/sparkwing.yaml

# Optional: which profile this repo expects when --profile is unset.
profile: shared-team

pipelines:
  - name: release
    entrypoint: Release
    on: { push: { branches: [main] } }
    targets: { prod: { runners: [my-pool], source: prod-secrets } }

runners:
  local: { type: local }
  my-pool:
    type: kubernetes
    profile: prod        # was `controller:`; now points at a profile by name
    labels: [arch=arm64]

sources:
  default: laptop-dotenv
  entries:
    laptop-dotenv: { type: file, path: .env }
    prod-secrets:  { type: profile, profile: prod }  # was type: remote-controller

sparks:
  - { name: x, source: github.com/sparks/x, version: v0.3.1 }

Why: Five files with overlapping cross-references (runners.yaml's controller: pointed at profiles.yaml; sources.yaml's type: remote-controller pointed at the same; per-pipeline targets.<name>.backend overlapped backends.yaml) made it hard to teach the model and impossible to grep the team's deployment intent in one place. Flattening to one file makes that intent visible.

Hard error on the old files: sparkwing refuses to run while a legacy file is still present in .sparkwing/, so a half-migrated repo fails loudly instead of silently ignoring config:

.sparkwing/pipelines.yaml, .sparkwing/backends.yaml are no longer read in v0.5.0; combine this project's YAML into .sparkwing/sparkwing.yaml -- see https://sparkwing.dev/docs/migration-guide/v0.5.0 for the layout

Delete each named file once its content lives under the matching sparkwing.yaml section. A repo with only sparkwing.yaml (or no .sparkwing/ directory at all) does not trigger the error.
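A pre-flight check for leftover legacy files can be sketched as below. This is a minimal illustration, not part of sparkwing: find_legacy_files is a hypothetical helper, and the only facts it relies on are the five file names listed in this guide.

```python
from pathlib import Path

# File names taken from the list of removed files in this guide.
LEGACY_FILES = (
    "pipelines.yaml",
    "backends.yaml",
    "runners.yaml",
    "sources.yaml",
    "sparks.yaml",
)


def find_legacy_files(repo_root):
    """Return the legacy config files still present under <repo_root>/.sparkwing/.

    Hypothetical helper: it only looks at file names and never parses YAML.
    """
    config_dir = Path(repo_root) / ".sparkwing"
    if not config_dir.is_dir():
        return []  # no project-local config at all, so nothing to migrate
    return [
        config_dir / name
        for name in LEGACY_FILES
        if (config_dir / name).is_file()
    ]
```

Running find_legacy_files(".") in CI before upgrading and failing the job on a non-empty result would surface half-migrated repos earlier than the runtime error does.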

Edge cases:

  • A repo with no project-local config still runs -- sparkwing.yaml is optional. Without it, pipelines come from the registered Go code in .sparkwing/jobs/, and profile resolution falls through to ~/.config/sparkwing/profiles.yaml's default:.
  • The profile: top-level field is just a hint. --profile X always wins. There is no sticky context state, so the only way a command "auto-switches" by cwd is via this committed hint.
  • Per-target backend overrides (the rare target.backend block, once in pipelines.yaml) live under pipelines[].targets.<name>.backend in sparkwing.yaml; they layer on top of the resolved profile's surfaces, semantics unchanged.

Profiles absorb all backend specs

~/.config/sparkwing/profiles.yaml was a thin connection-bundle file -- URL, token, log_store, artifact_store. It now owns the full backend triple (state, cache, logs), so a profile fully describes "where do my runs go and what auth do I need to get there."

Before:

# profiles.yaml
default: laptop
profiles:
  laptop: {}                              # implied local SQLite + filesystem
  prod:
    controller: https://api.example.dev
    token: swu_xxx
    log_store: s3://shared/logs           # optional convenience
    artifact_store: s3://shared/cache

# backends.yaml (deleted in v0.5.0)
defaults:
  state: { type: sqlite }
  cache: { type: filesystem, path: ~/.cache/sparkwing }
  logs:  { type: filesystem, path: ~/.cache/sparkwing/logs }
environments:
  gha:
    detect: { env_var: GITHUB_ACTIONS, equals: "true" }
    state: { type: s3, bucket: team, prefix: state }

After:

# profiles.yaml (the only "where" config)
default: laptop
profiles:
  laptop:
    state: { type: sqlite }
    cache: { type: filesystem, path: ~/.cache/sparkwing }
    logs:  { type: filesystem, path: ~/.cache/sparkwing/logs }

  shared-team:
    state: { type: s3, bucket: team, prefix: state }
    cache: { type: s3, bucket: team, prefix: cache }
    logs:  { type: s3, bucket: team, prefix: logs }

  prod:
    controller: https://api.example.dev
    token: swu_xxx
    # state/cache/logs are implied by controller; reads/writes go through it.

Why: Two surfaces saying "where state lives" (one per-profile, one per-project) was a tug-of-war with no clear precedence. Profile becomes the single addressable noun, and the project file only hints which profile to use.

Environment auto-detection that used to live in backends.yaml (gha, kubernetes) moves to profiles.yaml under a per-profile detect: block. A profile with detect: becomes the auto-selected profile when its env condition matches, ahead of the project hint:

profiles:
  gha:
    detect: { env_var: GITHUB_ACTIONS, equals: "true" }
    state: { type: s3, bucket: team-ci, prefix: state }
    # ...
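Taken together, the rules in this guide suggest a resolution order of --profile flag, then a matching detect: block, then the project hint, then the default: key of profiles.yaml. The sketch below models that order on plain dictionaries. It is an illustration under assumptions, not sparkwing's implementation: resolve_profile is a hypothetical name, the tie-break between several matching detect: blocks (first in file order) is a guess, and of the source labels only "project" appears in this guide.

```python
import os


def resolve_profile(flag, profiles_file, project_hint, env=None):
    """Return (profile_name, source) following the precedence described above.

    flag          -- value of --profile, or None when unset
    profiles_file -- parsed ~/.config/sparkwing/profiles.yaml as a dict
    project_hint  -- top-level profile: from .sparkwing/sparkwing.yaml, or None
    env           -- environment mapping; defaults to os.environ
    """
    env = os.environ if env is None else env
    profiles = profiles_file.get("profiles", {})

    # 1. An explicit --profile always wins.
    if flag:
        return flag, "flag"

    # 2. A profile whose detect: condition matches the environment.
    #    Assumption: the first match in file order wins.
    for name, spec in profiles.items():
        detect = (spec or {}).get("detect")
        if detect and env.get(detect["env_var"]) == detect["equals"]:
            return name, "detect"

    # 3. The profile: hint committed in the project file.
    if project_hint:
        return project_hint, "project"

    # 4. Fall through to the default: key of profiles.yaml.
    return profiles_file.get("default"), "default"
```

The sparkwing profile command described later in this guide reports the real resolution chain, so it is the authority when this sketch and observed behavior disagree.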

--profile is the only "where" flag

--on and --sw-on are replaced by --profile, which addresses storage and dispatch. --sw-target is renamed to --target, the pipeline-internal deployment selector. Its semantics are unchanged; it only leaves the --sw- namespace, because the distinction between sparkwing-owned and pipeline-typed flags now lives in the flag's purpose, not its prefix.

--profile and --target are orthogonal: --profile picks the addressable storage (and, for pipeline trigger, the controller endpoint); --target picks which deployment environment inside the pipeline definition the run acts on (its runner / secret bindings). A multi-target pipeline still needs --target to disambiguate.

Before:

sparkwing run release --on prod
sparkwing run release --sw-target prod
sparkwing runs list --on prod

After:

sparkwing run release --profile prod          # local execution, state via prod
sparkwing run release --target prod           # pick the pipeline's prod target
sparkwing pipeline trigger release --profile prod  # submit to prod's controller
sparkwing runs list --profile prod

Why: --on and --sw-target did different things -- one addressed a storage/dispatch target, the other picked a pipeline's deployment environment -- but the names suggested they were variants of each other. Splitting them by purpose makes the model legible: --profile is exclusively "what addressable storage/dispatch target am I talking to," and --target is exclusively "which deployment environment within this pipeline." They compose: sparkwing run release --profile shared-team --target prod runs the pipeline's prod target with state in shared-team.

sparkwing pipeline trigger for remote execution

sparkwing run --on prod used to be the way to submit a trigger row to a remote controller. v0.5.0 splits the verb so the execution model is visible in the command:

Before:

sparkwing run release --on prod              # remote dispatch (cluster runs it)

After:

sparkwing run release --profile prod         # local execution, state to prod
sparkwing pipeline trigger release --profile prod   # submit to prod controller

Why: sparkwing run had two meanings -- "execute here" and "ask the cluster to execute" -- depending on whether --on was set. Splitting the verbs makes the model legible at the verb name, and frees --profile to mean only "where does state live." pipeline trigger requires a profile that has controller: set; passing a controller-less profile errors with a clear message.

Default behavior: sparkwing pipeline trigger follows the remote run until it reaches a terminal state. When the profile defines a logs URL, it streams the full logs; otherwise it shows node-status updates from the controller. --detach skips the follow and returns once the trigger is registered.
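When updating scripts and CI definitions, the flag renames are mechanical but the verb split is not: a run that used --on may need to become pipeline trigger. The sketch below applies the renames and leaves the verb decision to a human. It is a hypothetical helper, not a sparkwing tool. It assumes --sw-on behaved like --on on the run verb, and it deliberately leaves --sw-profile untouched because this guide does not spell out its replacement.

```python
import re

# Renames stated in this guide.
FLAG_RENAMES = {"--on": "--profile", "--sw-on": "--profile", "--sw-target": "--target"}

# Match an old flag only as a whole token (followed by space, "=", or end of line).
_OLD_FLAG = re.compile(r"(?<!\S)(--on|--sw-on|--sw-target|--sw-profile)(?=[\s=]|$)")
_RUN_VERB = re.compile(r"\bsparkwing\s+run\b")


def rewrite_command(line):
    """Return (new_line, warnings) for one shell line that may invoke sparkwing."""
    warnings = []
    if "sparkwing" not in line:
        return line, warnings

    found = set(_OLD_FLAG.findall(line))

    # `run --on` used to dispatch remotely; after the rename it executes locally.
    if _RUN_VERB.search(line) and found & {"--on", "--sw-on"}:
        warnings.append(
            "run with --on used to dispatch remotely; switch the verb to "
            "'pipeline trigger' if the cluster should still execute it"
        )
    if "--sw-profile" in found:
        warnings.append("--sw-profile left unchanged; review manually")

    new_line = _OLD_FLAG.sub(
        lambda m: FLAG_RENAMES.get(m.group(1), m.group(1)), line
    )
    return new_line, warnings
```

Treat the warnings as a review list: each flagged line needs a decision between local execution with remote state (run --profile) and remote execution (pipeline trigger --profile).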

Dual-write state when local execution writes to a profile

sparkwing run X --profile prod from a laptop now writes state to both the local SQLite store and prod's backend in parallel. The remote is canonical; the local mirror is a free byproduct of the laptop already having the data. This means sparkwing runs list (no flag) on the laptop sees the run after the fact even with no network.

Before: state went only to whatever backends.yaml resolved to. Listing local runs after a --on prod dispatch returned nothing.

After: local SQLite always gets the data when the laptop is the executor. Disable per-profile with mirror_local: false for automated workers that fire-and-forget thousands of runs:

profiles:
  ci-fire-and-forget:
    state: { type: s3, ... }
    mirror_local: false

Why: "I ran this locally yesterday with --profile prod, I should still see it on my laptop" is the expected mental model. Remote unreachability shouldn't lose laptop visibility.
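The dual-write idea can be illustrated with two in-memory stores written in parallel. This is a toy model under stated assumptions, not sparkwing's code: MemoryStore and write_state are hypothetical names, and how sparkwing surfaces a failed remote write is not described in this guide, so the sketch simply reports each store's outcome.

```python
from concurrent.futures import ThreadPoolExecutor


class MemoryStore:
    """Stand-in for a state backend (SQLite, S3, or a controller in practice)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail  # simulate an unreachable backend
        self.runs = {}

    def put(self, run_id, record):
        if self.fail:
            raise ConnectionError(f"{self.name} unreachable")
        self.runs[run_id] = record


def write_state(run_id, record, remote, local=None):
    """Write to the remote (canonical) store and, if given, the local mirror.

    Passing local=None models mirror_local: false.
    Returns {store_name: exception_or_None} so the caller decides what is fatal.
    """
    stores = [remote] + ([local] if local is not None else [])
    with ThreadPoolExecutor(max_workers=len(stores)) as pool:
        futures = {s.name: pool.submit(s.put, run_id, record) for s in stores}
    # Leaving the `with` block waits for every write to finish.
    return {name: future.exception() for name, future in futures.items()}
```

In this model an unreachable remote does not stop the local write, which matches the stated goal that remote unreachability should not lose laptop visibility.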

Run start emits the resolved profile

Every run's run_start envelope now carries the resolved profile and effective backends, so JSON consumers can tell at a glance where state and logs landed:

{
  "event": "run_start",
  "attrs": {
    "profile": { "name": "prod", "source": "project", "mirror_local": true },
    "backends": {
      "state": "controller://prod",
      "logs":  "controller://prod",
      "cache": "controller://prod"
    }
    ...
  }
}

Pretty mode prints the same information as a header block. The new sparkwing profile command prints the resolved profile and the resolution chain without running anything, which answers "where would this command go right now?" Pass --profile NAME to see the hypothetical resolution; -o json emits the same effective + considered shape for agents.
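A JSON consumer might pull the profile and backends out of the run_start envelope as sketched below. The field names come from the example envelope above; describe_run_start is a hypothetical helper, and the assumption that each event arrives as one JSON object per line is mine, not something this guide states.

```python
import json


def describe_run_start(line):
    """Summarize a run_start envelope; return None for any other event.

    Assumes `line` holds exactly one JSON object.
    """
    event = json.loads(line)
    if event.get("event") != "run_start":
        return None

    attrs = event.get("attrs", {})
    profile = attrs.get("profile", {})
    backends = attrs.get("backends", {})

    # Report only the surfaces that are actually present in the envelope.
    surfaces = [
        f"{surface}={backends[surface]}"
        for surface in ("state", "logs", "cache")
        if surface in backends
    ]
    summary = f"profile={profile.get('name')} (source={profile.get('source')})"
    return " ".join([summary] + surfaces)
```

Reading fields with .get keeps the consumer tolerant of attributes that the envelope elides or that later releases add.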

Audit-stream events for spawned children

A spawning node used to emit a single pipeline_await_spawned audit event when its child run started. v0.5.0 splits this into two structured events on the parent's stream:

  • child_run_start { run_id, pipeline, ... } -- when the child run begins.
  • child_run_finish { run_id, status, duration_ms } -- when the child reaches a terminal state (success / failed / cancelled / timeout).

Migration for audit-stream consumers (log forwarders, dashboards, alerting parsers): rewire anything keyed on pipeline_await_spawned onto the new pair. The child run_id appears on both events, so existing parent-to-child join logic still works after the key rename. Consumers that never read pipeline_await_spawned need no changes.
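A consumer that used to key on pipeline_await_spawned could join the new pair by run_id roughly as follows. This is a sketch, assuming the event name sits under an "event" key (borrowed from the run_start envelope) and the documented fields sit at the top level of each event; pair_child_runs is a hypothetical name.

```python
def pair_child_runs(events):
    """Join child_run_start / child_run_finish events by run_id.

    `events` is an iterable of dicts from the parent's audit stream.
    Returns {run_id: {"pipeline", "status", "duration_ms"}}; status stays
    None for children that have started but not yet finished.
    """
    children = {}
    for ev in events:
        kind = ev.get("event")
        if kind == "child_run_start":
            children[ev["run_id"]] = {
                "pipeline": ev.get("pipeline"),
                "status": None,
                "duration_ms": None,
            }
        elif kind == "child_run_finish":
            # Tolerate a finish whose start was missed (for example, a late subscriber).
            child = children.setdefault(
                ev["run_id"],
                {"pipeline": None, "status": None, "duration_ms": None},
            )
            child["status"] = ev.get("status")
            child["duration_ms"] = ev.get("duration_ms")
        # Any other event kind is ignored.
    return children
```

Entries whose status is still None after the parent finishes are the children worth alerting on, since the parent never observed their terminal state.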

Why: the single event only marked the spawn point and gave no visibility into the child's outcome from the parent's perspective. The split lets consumers correlate parent-to-child without inlining the child's full stream, and lets them attribute parent failures to specific child runs with terminal status and duration in hand.