SOAR design — config-as-code lanes over the reliable AppKey API
Design for the Google SecOps SOAR surface of secopsctl, ported from real
operational SOAR usage. Two facts drive it:
1 · The AppKey legacy API is the backbone, not a quarantine. The modern v1alpha SOAR methods are new, return 500s intermittently, and cover little; the AppKey /api/external/v1 surface is reliable and by far the most complete, so it is what the operator-facing reconcile engine runs on (soar/legacy/, ~90 files, a durable SDK). Only the genuinely transitional legacyPlaybooks:legacy* bridge is “delete when the native API ships.”
2 · Every surface is exactly one lane: reconcile (per-object CUD), imperative (per-entity verbs, no file), or raw (batch/bundle/selector passthrough). The engine + lane model is product-neutral and lives in ARCHITECTURE.md; this doc is SOAR specifics. Live status per surface is in CATALOG.md.
All identifiers here are placeholders (<tenant>, <num>, <reg>, <id>) — the
public repo stays tenant-neutral; real values come from config/env at runtime.
The operator surface — three lanes
What an operator actually drives. Every surface is classified into one lane, and the engine enforces the boundary (a batch/bundle/selector endpoint cannot register as reconcile):
| Lane | Mechanism | SOAR surfaces |
|---|---|---|
| reconcile (per-object CUD) | engine + a reconcile.Surface | 14: webhooks · environments · networks · tracking-lists · soc-roles · idp · visual-families · sla-definitions · case-stages · case-tags · close-root-causes · blacklists · playbook-categories · playbooks (bespoke, name-keyed) |
| imperative (per-entity verbs, no desired-state file) | soar case <verb> | reads: list (queue cards) · get <id> (case + its alerts); 9 mutate verbs: assign · rename · stage · tag · untag · describe · importance · close · merge; plus soar push bulk-close |
| raw (batch upserts / bundles / selector reads) | soar legacy call <op> | integrations · jobs · ontology-mapping · permissions · settings · … |
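The lane boundary in the table above can be illustrated with a small registration check. This is a minimal sketch, not the engine's actual code: the Surface struct, EndpointKind, Registry, and ErrLaneBoundary are hypothetical names invented for illustration (the real reconcile.Surface is described in ARCHITECTURE.md). It only shows the rule that a surface registers in exactly one lane and that a batch/bundle/selector endpoint is rejected from the reconcile lane.

```go
package main

import (
	"errors"
	"fmt"
)

// Lane is the single operator lane a surface belongs to.
type Lane int

const (
	Reconcile  Lane = iota // per-object CUD driven by desired-state files
	Imperative             // per-entity verbs, no file
	Raw                    // batch/bundle/selector passthrough
)

// EndpointKind is a hypothetical classification of the underlying endpoint.
type EndpointKind int

const (
	PerObject EndpointKind = iota
	Batch
	Bundle
	Selector
)

// Surface is a hypothetical registration record, not the real reconcile.Surface.
type Surface struct {
	Name string
	Lane Lane
	Kind EndpointKind
}

// ErrLaneBoundary is returned when a non-per-object endpoint asks for the reconcile lane.
var ErrLaneBoundary = errors.New("batch/bundle/selector endpoint cannot register as reconcile")

// Registry holds at most one lane assignment per surface name.
type Registry struct {
	surfaces map[string]Surface
}

func NewRegistry() *Registry {
	return &Registry{surfaces: map[string]Surface{}}
}

// Register enforces one lane per surface and the reconcile boundary.
func (r *Registry) Register(s Surface) error {
	if _, dup := r.surfaces[s.Name]; dup {
		return fmt.Errorf("surface %q is already registered in a lane", s.Name)
	}
	if s.Lane == Reconcile && s.Kind != PerObject {
		return fmt.Errorf("surface %q: %w", s.Name, ErrLaneBoundary)
	}
	r.surfaces[s.Name] = s
	return nil
}

func main() {
	r := NewRegistry()
	// A per-object surface is accepted in the reconcile lane.
	fmt.Println(r.Register(Surface{Name: "webhooks", Lane: Reconcile, Kind: PerObject}))
	// A batch endpoint asking for the reconcile lane is rejected.
	fmt.Println(r.Register(Surface{Name: "integrations", Lane: Reconcile, Kind: Batch}))
	// The same batch endpoint is fine in the raw lane.
	fmt.Println(r.Register(Surface{Name: "integrations", Lane: Raw, Kind: Batch}))
}
```

Rejecting at registration time, rather than at push time, means a misclassified surface fails before any operator command can reach it.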
Commands:
- soar pull <surface>: live → files
- soar push <surface> [--prune]: files → live; additive by default, dry-run unless --yes
- soar case list/get: read
- soar case <verb>: guarded mutate
- soar legacy call <op>: raw passthrough
The operational loop is soar case list → review → soar case get <id> → act
(a mutate verb or soar push bulk-close). Per-surface identity, capabilities (NoDelete /
WholeBodyWrite / PruneEligible) and read/write validation are in
CATALOG.md — today only webhooks is PruneEligible; every other
surface is additive/NoDelete by design.
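The push semantics described above (additive by default, deletes only with --prune on a PruneEligible surface, dry-run unless --yes) can be sketched as a plan-then-apply step. This is an illustrative sketch under assumptions: Plan, Apply, and Op are hypothetical names, objects are reduced to a key → content string, and the capability flags are modeled as plain booleans named after the ones listed in CATALOG.md.

```go
package main

import (
	"fmt"
	"sort"
)

// Capabilities mirrors the per-surface flags named in CATALOG.md (modeled here as booleans).
type Capabilities struct {
	NoDelete       bool
	WholeBodyWrite bool
	PruneEligible  bool
}

// Op is one planned change against the live tenant.
type Op struct {
	Verb string // "create", "update" or "delete"
	Key  string
}

// Plan diffs desired (files) against live. Deletes are planned only when the
// operator passed prune AND the surface is PruneEligible and not NoDelete.
func Plan(caps Capabilities, desired, live map[string]string, prune bool) []Op {
	var ops []Op
	for k, want := range desired {
		have, ok := live[k]
		switch {
		case !ok:
			ops = append(ops, Op{Verb: "create", Key: k})
		case have != want:
			ops = append(ops, Op{Verb: "update", Key: k})
		}
	}
	if prune && caps.PruneEligible && !caps.NoDelete {
		for k := range live {
			if _, keep := desired[k]; !keep {
				ops = append(ops, Op{Verb: "delete", Key: k})
			}
		}
	}
	// Sort by key so the plan is stable across runs.
	sort.Slice(ops, func(i, j int) bool { return ops[i].Key < ops[j].Key })
	return ops
}

// Apply prints the plan in dry-run mode and calls exec only when yes is set.
func Apply(ops []Op, yes bool, exec func(Op) error) error {
	for _, op := range ops {
		if !yes {
			fmt.Printf("dry-run: would %s %s\n", op.Verb, op.Key)
			continue
		}
		if err := exec(op); err != nil {
			return fmt.Errorf("%s %s: %w", op.Verb, op.Key, err)
		}
	}
	return nil
}

func main() {
	caps := Capabilities{PruneEligible: true} // today only webhooks has this flag
	desired := map[string]string{"hook-a": "v2", "hook-b": "v1"}
	live := map[string]string{"hook-a": "v1", "hook-c": "v1"}

	ops := Plan(caps, desired, live, true)
	// Without yes, nothing is executed; the plan is only printed.
	_ = Apply(ops, false, func(Op) error { return nil })
}
```

Keeping the delete decision inside Plan means a surface that is NoDelete by design never produces a delete op, regardless of which flags the operator passes.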
The transport tiers (under the hood)
The lane table above is what an operator drives; this is the transport reality the
lanes ride on. SOAR uses one host (https://<tenant>.siemplify-soar.com) with
one AppKey and no ADC, across three tiers:
| Tier | Surface | Transport | Lifecycle |
|---|---|---|---|
| Modern ⚠ | v1alpha native: integrations · connectors · jobs · alertGroupingRules · moduleSettings · cases | /v1alpha/projects/<num>/…/instances/<id> + ?format=camel + x-goog-api-version + updateMask | pull/patch-only · 500s intermittently; not the build path today |
| Bridge 🟠 | legacyPlaybooks:legacy* (list/get/save/attach/stats) | v1alpha host, legacy op names | the one genuine quarantine: delete when native v1alpha playbook CRUD ships |
| Legacy AppKey ✅ | Siemplify external API: the broad, reliable surface the reconcile engine runs on | /api/external/v1/… (offset paging) | durable backbone, not slated for removal |
Which tier to trust: the Legacy AppKey tier is reliable and complete and backs the engine; the Modern v1alpha tier is new, pull/patch-only, and 500s intermittently. Only the Bridge tier is genuinely delete-when-native.
Plus one legacy SIEM pair on the Chronicle side (ADC auth, not SOAR):
legacy:legacyFindRawLogs and legacy:legacyBatchGetCases (the SOAR-integer-id
⇄ SIEM-uuid bridge).
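The two transport conventions in the tier table differ mainly in request decoration and paging. The sketch below shows both under stated assumptions: NewV1AlphaRequest, FetchAllOffset, FetchAllToken, and PageRequest are hypothetical names, authentication is left out entirely, offset pages are assumed to start at 0 and to end on a short page, and the resource path and field names in main are placeholders. Only the ?format=camel parameter, the x-goog-api-version header, updateMask, and the requestedPage/pageSize and pageToken/nextPageToken field names come from this document.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// NewV1AlphaRequest decorates a Modern-tier request with ?format=camel, the
// x-goog-api-version header and, when given, an updateMask.
func NewV1AlphaRequest(method, host, resourcePath string, updateMask []string) (*http.Request, error) {
	q := url.Values{"format": {"camel"}}
	if len(updateMask) > 0 {
		q.Set("updateMask", strings.Join(updateMask, ","))
	}
	req, err := http.NewRequest(method, host+resourcePath+"?"+q.Encode(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("x-goog-api-version", "v1alpha")
	return req, nil
}

// PageRequest is the offset-paging part of a Legacy AppKey request body.
type PageRequest struct {
	RequestedPage int `json:"requestedPage"`
	PageSize      int `json:"pageSize"`
}

// FetchAllOffset walks offset pages until a page comes back short.
// Assumption: pages are 0-based and a short page marks the end.
func FetchAllOffset(pageSize int, fetch func(PageRequest) ([]string, error)) ([]string, error) {
	if pageSize <= 0 {
		return nil, fmt.Errorf("pageSize must be positive, got %d", pageSize)
	}
	var all []string
	for page := 0; ; page++ {
		items, err := fetch(PageRequest{RequestedPage: page, PageSize: pageSize})
		if err != nil {
			return nil, err
		}
		all = append(all, items...)
		if len(items) < pageSize {
			return all, nil
		}
	}
}

// FetchAllToken walks Google-style pages until nextPageToken is empty.
func FetchAllToken(fetch func(pageToken string) (items []string, next string, err error)) ([]string, error) {
	var all []string
	token := ""
	for {
		items, next, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, items...)
		if next == "" {
			return all, nil
		}
		token = next
	}
}

func main() {
	// Placeholder host, path and mask fields; real values come from config.
	req, err := NewV1AlphaRequest(http.MethodPatch, "https://soar.example.invalid",
		"/v1alpha/placeholder/jobInstances/1", []string{"fieldA", "fieldB"})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String(), req.Header.Get("x-goog-api-version"))
}
```

Passing the page fetcher as a function keeps both loops testable without a live tenant, which matters given that the Modern tier returns intermittent 500s.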
Package layout
danny.vn/secops/
│
├── auth/ OAuth(ADC) + APIKey/SOARAppKey ← unchanged
├── config/ + soar_url (tenant SOAR host) ← small add
│
├── chronicle/ (SIEM · v1alpha · MODERN, ADC)
│ └── legacy.go 🗑 QUARANTINE FILE ── legacyFindRawLogs, legacyBatchGetCases
│ (SOAR int-id ⇄ SIEM uuid map). Delete when v1alpha equivalents land.
│
└── soar/ (host=https://<tenant>.siemplify-soar.com · AppKey, NO ADC)
│
│ internal/transport/ shared, durable plumbing (AppKey + host) — transport.go
│ • Transport.V1Alpha() → /v1alpha/projects/<num>/locations/<reg>/instances/<id>/…
│ auto: ?format=camel · x-goog-api-version · updateMask · {items,nextPageToken}
│ • Transport.External() → /api/external/v1/… offset paging {requestedPage,pageSize}
│
├── MODERN — v1alpha native (pull/patch only · flaky) ─────────────────────
│ client.go SOAR client
│ integrations.go integrations · connectors · jobs (discovery)
│ connectors.go connectorInstances GET · PATCH(updateMask) · :fetchLatestDefinition
│ jobs.go jobInstances GET · PATCH(updateMask)
│ grouping.go alertGroupingRules · moduleSettings(:batchUpdate)
│ cases.go cases (v1alpha listing)
│
└── soar/legacy/ ── DURABLE AppKey SDK (~90 files) — backs the reconcile engine ──
─ reliable Siemplify external API (/api/external/v1); NOT a quarantine ─
cases · connectors · jobs · settings · ontology · webhooks ·
environments · networks · blacklists · soc-roles · …
─ BRIDGE (the one delete-when-native piece): v1alpha host, legacy op names ─
legacyPlaybooks:legacy{List,Get,GetByName,Save,Attach,Stats}
gotchas baked in (see below)
dependency rule: soar(modern) → soar/internal/transport ← soar/legacy
(modern never imports legacy; both share the transport)
Wire shapes actually sent — modeled as types
legacy/cases.go CaseQueueRequest{ SortBy, RequestedPage, PageSize, Statuses[] } // 1=OPEN 2=CLOSED
BulkCloseRequest{ CasesIDs[], CloseReason, RootCause, CloseComment, DynamicParameters[] }
└ CloseReason enum: 0 NotMalicious · 1 Malicious · 2 Maintenance · 3 Inconclusive
connectors/jobs Parameters map[string]string // EVERYTHING is a string ("true","100")
└ secrets read back as "***…" → pass through unchanged on PATCH (never re-send a real secret)
transport (v1alpha) every request: ?format=camel + header x-goog-api-version: v1alpha + PATCH ?updateMask=a,b
bridge/playbooks coercePlaybookTypes(): id/priority/version/*UnixTimeInMs int→str (top-level, trigger, each step)
validatePlaybookName(): allow [A-Za-z0-9 _-], reject . ( ) [ ] : /
playbook save mints a NEW UUID → never cache it; re-resolve by display name
save = whole-body replace (not a patch): read → modify same body → save
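The all-strings Parameters shape and the masked-secret rule above can be sketched as a small patch builder. This is a hypothetical helper, not the SDK's code: BuildPatch, stringify, and isMasked are illustrative names, the "***" prefix test is an assumption about how masking looks on the wire, and refusing to edit a masked key is one possible policy for honoring "never re-send a real secret".

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// isMasked reports whether a value read back from the API is a masked secret.
// Assumption: masked values start with "***".
func isMasked(v string) bool {
	return strings.HasPrefix(v, "***")
}

// stringify renders a typed value in the all-strings wire form ("true", "100").
func stringify(v any) (string, error) {
	switch t := v.(type) {
	case string:
		return t, nil
	case bool:
		return strconv.FormatBool(t), nil
	case int:
		return strconv.Itoa(t), nil
	default:
		return "", fmt.Errorf("unsupported parameter type %T", v)
	}
}

// BuildPatch copies the live parameters (masked secrets included, verbatim),
// applies the edits as strings, and returns the new map plus the changed keys.
// Edits that target a masked secret are refused rather than sent.
func BuildPatch(live map[string]string, edits map[string]any) (map[string]string, []string, error) {
	out := make(map[string]string, len(live))
	for k, v := range live {
		out[k] = v
	}
	var changed []string
	for k, v := range edits {
		if isMasked(live[k]) {
			return nil, nil, fmt.Errorf("parameter %q is a masked secret; refusing to overwrite it", k)
		}
		s, err := stringify(v)
		if err != nil {
			return nil, nil, fmt.Errorf("parameter %q: %w", k, err)
		}
		if out[k] != s {
			out[k] = s
			changed = append(changed, k)
		}
	}
	sort.Strings(changed)
	return out, changed, nil
}

func main() {
	// Hypothetical parameter names and values.
	live := map[string]string{"apiKey": "***abc", "maxAlerts": "50", "verifySSL": "false"}
	params, changed, err := BuildPatch(live, map[string]any{"maxAlerts": 100, "verifySSL": true})
	if err != nil {
		panic(err)
	}
	fmt.Println(params, changed)
}
```

The returned list of changed keys is a natural input for an updateMask on the v1alpha PATCH path, though how the mask maps onto individual parameters is not specified here.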
SOAR-specific gotchas to encode
- Playbook UUID rotation: every save mints a new identifier; the one you sent goes stale. Always re-resolve by display name after a save.
- Playbook type coercion: GET returns ints, save requires strings for id/priority/version/*UnixTimeInMs (top-level, trigger, each step); templateName must be "", never null.
- Playbook name charset: letters/digits/space/-/_ only; reject . ( ) [ ] : /
- Dual case IDs: SOAR uses integer IDs, SIEM uses UUIDs; map via legacy:legacyBatchGetCases (soarPlatformInfo.caseId).
- Parameters are always strings on connectors/jobs (even bool/int); secrets read back masked (***…) and must be passed through unchanged on PATCH.
- Integration clones: integrations can appear twice (Name + Name__<uuid>); use the un-suffixed one for live edits.
- Two paginations: legacy is offset (requestedPage/pageSize); v1alpha is Google-style (pageToken/nextPageToken).
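The playbook type-coercion and name-charset gotchas above can be sketched as follows. The function names coercePlaybookTypes and validatePlaybookName match the ones listed under the wire shapes, but the bodies are an illustrative guess rather than the SDK's implementation, and the JSON keys "trigger" and "steps" are assumptions about the playbook body layout. Numbers are decoded with UseNumber so large IDs and millisecond timestamps are converted to strings without passing through float64.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// playbookNameRE allows letters, digits, space, '-' and '_' only.
var playbookNameRE = regexp.MustCompile(`^[A-Za-z0-9 _-]+$`)

// validatePlaybookName rejects names containing . ( ) [ ] : / or anything else outside the charset.
func validatePlaybookName(name string) error {
	if !playbookNameRE.MatchString(name) {
		return fmt.Errorf("invalid playbook name %q: only letters, digits, space, '-' and '_' are allowed", name)
	}
	return nil
}

// needsString reports whether a field must be a string on save.
func needsString(key string) bool {
	switch key {
	case "id", "priority", "version":
		return true
	}
	return strings.HasSuffix(key, "UnixTimeInMs")
}

// coerceObject rewrites numeric id/priority/version/*UnixTimeInMs fields of one object to strings.
func coerceObject(obj map[string]any) {
	for k, v := range obj {
		if n, ok := v.(json.Number); ok && needsString(k) {
			obj[k] = n.String()
		}
	}
}

// coercePlaybookTypes converts a playbook body as returned by GET into the shape save accepts.
// Assumption: the trigger lives under "trigger" and the steps under "steps".
func coercePlaybookTypes(raw []byte) ([]byte, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber() // keep integers exact instead of decoding to float64
	var pb map[string]any
	if err := dec.Decode(&pb); err != nil {
		return nil, fmt.Errorf("decode playbook: %w", err)
	}
	coerceObject(pb)
	if trig, ok := pb["trigger"].(map[string]any); ok {
		coerceObject(trig)
	}
	if steps, ok := pb["steps"].([]any); ok {
		for _, s := range steps {
			if step, ok := s.(map[string]any); ok {
				coerceObject(step)
			}
		}
	}
	// templateName must be "", never null.
	if v, present := pb["templateName"]; present && v == nil {
		pb["templateName"] = ""
	}
	return json.Marshal(pb)
}

func main() {
	// Hypothetical playbook body; real bodies carry many more fields.
	body := []byte(`{"id":7,"name":"Triage","templateName":null,"steps":[{"id":9,"version":1}]}`)
	out, err := coercePlaybookTypes(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println(validatePlaybookName("Phish Triage_v2"), validatePlaybookName("Bad.Name"))
}
```

Because a save is a whole-body replace, the coercion works on the full decoded body and leaves every other field untouched.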
Repo touchpoints beyond the SDK
- config: add soar_url (tenant SOAR host); reuse project_number/region/customer_id for the v1alpha path. AppKey via auth.SOARAppKey / SECOPS_SOAR_APP_KEY (no ADC).
- internal/mirror + internal/cli: pull soar (cases · playbooks · connectors · jobs · grouping → YAML/JSON snapshots) and guarded push (bulk-close · connector-patch · job-patch · playbook-save) under the same dry-run / LIVE-DEPLOY guard as rules.
- leak guard: SOAR snapshots carry masked secrets + integer case IDs; the pre-commit scanner already covers AppKey/secret patterns.
Out of scope (per project decision)
No SentinelOne, no Teams/chat notifications — out of this repo’s scope.