
SIEM design — two planes: config-as-code + operational query/act

Design for the Google SecOps SIEM surface of secopsctl. SIEM splits into two planes that need fundamentally different models. All identifiers here are placeholders; the public repo stays tenant-neutral.

The split. Config is desired state — rules, lists, tables, parsers, feeds, dashboards. It’s detection-as-code: pull → review in git diff → push, reconciled. Operational is live data — events, alerts, cases. You don’t reconcile a case from a file; you query a subset and act on it, the way a SOC analyst triages. Two planes, two models, one CLI.

secopsctl
├── CONFIG plane  (desired state, files, git)        pull → diff → push  (reconcile)
│     rules · reference_lists · data_tables · feeds · parsers · dashboards · curated …
│
└── OPERATIONAL plane  (live data, no files)          query → review → act
      events (read-only) · alerts (triage) · cases (triage) · entities/iocs (enrich)

Auth note: the SIEM API needs ADC / SECOPS_ACCESS_TOKEN (the SOAR AppKey does NOT work here). The official v1alpha REST surface HTTP-500s intermittently (Google is still building SecOps); surface a clean error on 500, never retry forever (see CLAUDE.md).


Plane 1 — Config as code (reconcile)

The user’s framing: “rules, curated rules, dashboards, etc. — similar things.” They are. This plane reuses the proven product-neutral reconcile engine (internal/mirror/reconcile, already shipped for SOAR + SIEM reference_lists): identity + canonical diff + redaction + additive/--prune guard.
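As a rough illustration of "identity + canonical diff + additive/--prune guard", the sketch below plans actions from two maps keyed by identity. The Action type and the plan function are hypothetical names for this example; the real engine in internal/mirror/reconcile may be shaped quite differently, and redaction is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is one step of a reconcile plan (hypothetical type).
type Action struct {
	Op string // "create", "update", or "delete"
	ID string
}

// plan compares desired state (files) with live state (tenant). Both maps are
// keyed by object identity and hold a canonical body. Deletes are emitted
// only when prune is true; the default is additive.
func plan(desired, live map[string]string, prune bool) []Action {
	var out []Action
	for id, body := range desired {
		cur, ok := live[id]
		switch {
		case !ok:
			out = append(out, Action{Op: "create", ID: id})
		case cur != body:
			out = append(out, Action{Op: "update", ID: id})
		}
	}
	if prune {
		for id := range live {
			if _, ok := desired[id]; !ok {
				out = append(out, Action{Op: "delete", ID: id})
			}
		}
	}
	// Each ID appears at most once, so sorting by ID gives a stable plan.
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	desired := map[string]string{"list_a": "v2", "list_b": "v1"}
	live := map[string]string{"list_a": "v1", "list_c": "v1"}
	fmt.Println(plan(desired, live, false)) // additive: list_c is left alone
	fmt.Println(plan(desired, live, true))  // with prune: list_c is deleted
}
```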

| Surface | Shape | Plan |
|---|---|---|
| rules (YARA-L) | source .yaral + deployment state machine | bespoke (two-resource; keep push rules-create/disable, extend with update/enable/retrohunt); not a single canonical body |
| reference_lists | typed, .txt+.yaml | done (engine, NoDelete) |
| data_tables | .csv+.yaml, rows via a separate API | done + live-validated (engine, push data_tables): columns immutable after create (update rejects a column change); rows = wholesale ReplaceDataTableRows; not prune-eligible (whole-table delete is high-blast). Gated write smoke TestLiveReconcileDataTableWriteSmoke passed on the tenant |
| feeds | typed, secrets in settings | engine surface; redact on pull, overlay on update; resolve the assetNamespace(read) vs namespace(write) mismatch with a live smoke first |
| parsers | versioned/immutable (create new, no update) | engine surface, Create+Delete only (no update) |
| dashboards (native) | typed, charts as JSON | engine surface, full CUD |
| curated / curated_rules | Google-managed (read-mostly) | read + enable/disable/alerting toggle + exclusions only; not full CUD |
| watchlists, rule_exclusions, forwarders, log_pipelines | typed | engine surfaces where per-object CUD fits |

Discipline (same as SOAR, proven): workflow-spec the shape → verify SDK signatures by hand → wire as a reconcile.Surface → live read-validate → gated write-smoke on an inert throwaway. The SIEM write-smoke harness lives in internal/mirror/reconcile_smoke_siem_test.go, gated by SECOPS_SIEM_SMOKE (read round-trip of every SIEM surface) and SECOPS_SIEM_SMOKE_WRITE (the create/update/delete cycle). No surface is trusted for --yes until its write loop is live-validated.


Plane 2 — Operational query/act (the SOC workflow)

This is the part that needs the new design. Events/alerts/cases are live security data, not desired state. The loop is query → review → act on each or a subset — exactly how an analyst triages. The SDK is largely already built; this plane is about the operator model and safety, not new API code.

The three act surfaces (and one read-only)

| Domain | Query (read) | Act (mutate) | Mutability |
|---|---|---|---|
| events (UDM) | SearchUDM / NLSearch / GetStats / FindUDMFieldValues | none | immutable telemetry: read-only, never mutate |
| alerts | GetAlerts (list) · GetAlert · ListDetections · SearchRuleAlerts | UpdateAlert · BulkUpdateAlerts (status / verdict / priority / reason / comment) | per-item + subset |
| cases (SIEM, UUID) | ListCases / SearchCases (filter) · GetCase · GetCases | PatchCase (+etag/updateMask) · MergeCases · BulkClose/Assign/AddTag/ChangePriority/ChangeStage/Reopen | per-item + subset |
| entities / IoCs | SummarizeEntity · ListIoCs · FetchAssociatedInvestigations | none | enrichment: read-only |

SIEM cases and SOAR cases are TWO VIEWS OF THE SAME CASE — not two systems. Google SecOps = Chronicle (SIEM) + Siemplify (SOAR) merged; cases originate in the SOAR case-management engine and are surfaced natively in the SIEM. The proof is in the wire: a case carries both ids and legacyBatchGetCases returns soarPlatformInfo.caseId — the bridge between them.

|  | SOAR case (Siemplify) | SIEM case (Chronicle/SecOps) |
|---|---|---|
| id | integer (e.g. 234) | UUID (resource name) |
| api · auth | /api/external/v1/cases · AppKey | v1beta cases + v1alpha legacy:legacyListCases · ADC |
| role | the engine: alerts→case, playbooks, tasks, queue, stages, close | the first-class SecOps cases surface (get/list/patch/merge/bulk); the convergence layer |
| CLI | soar case list/get (read) · soar case <verb> (act) | cases … |
| today | mature, reliable | newer, flaky (v1beta 500 / v1alpha 404 observed) |

They stay separate command trees (different id, api, reliability), bridged by soarPlatformInfo.caseId only when correlating. The SIEM-UUID api is the forward-looking unified one; the SOAR-integer api is what works today.

The query model

Every list/search command shares: a filter, a time window, a limit, pagination, and an output format.

secopsctl query udm '<udm filter>' [--hours N | --from TS --to TS] [--limit N] [--json]   # events (exists)
secopsctl search nl  '<question>'   [--hours N] [--limit N] [--json]                       # NL → UDM → search
secopsctl stats      '<query>'      [--hours N]                                            # aggregations
secopsctl alerts list   [--filter EXPR] [--hours N] [--state OPEN|CLOSED] [--limit N] [--json]
secopsctl cases  list   [--filter EXPR] [--status …] [--priority …] [--limit N] [--json]
secopsctl entity summarize <ip|domain|hash|user> <value> [--hours N]
secopsctl iocs   list   [--prioritized] [--hours N] [--limit N] [--json]

The act model — single + subset, safe by construction

Two ways to act, mirroring how SOC consoles work (open one, or select rows → bulk action). Both are guarded exactly like push: LIVE banner, dry-run by default, real apply needs --yes.

1. Per-item — unambiguous, low blast radius:

secopsctl alerts update <id> --verdict FALSE_POSITIVE --priority LOW [--comment "…"]
secopsctl cases  comment <id> "triaged: benign"
secopsctl cases  assign  <id> --user <analyst>
secopsctl cases  close   <id> --reason NOT_MALICIOUS --root-cause "…"

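2. Subset (bulk): the dangerous one. Two selection paths, safest first: --ids (an explicit, reviewed set, inline or from @file), then --filter (dry-run first, count + sample shown, capped by --limit).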

Guard summary (one rule): no operational mutation runs without an explicit --yes; any --filter-driven bulk shows the count + sample first and is --limit-capped. A live-data mutation is treated as a production deploy, same as a config push.
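The guard rule can be sketched as a single pure function that every mutating command calls before touching live data. Selection, Decision, and guard are hypothetical names; refusing (rather than truncating) when a filter matches more than --limit is an assumption made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Selection is how a command picked its targets (hypothetical type).
type Selection struct {
	IDs        []string // explicit --ids, or ids resolved from --filter
	FromFilter bool     // true when the ids came from a --filter query
}

// Decision tells the command what to show and whether to apply.
type Decision struct {
	Apply  bool     // false means dry-run preview only
	Count  int      // number of items that would be touched
	Sample []string // first few ids, shown to the operator before acting
}

// guard enforces: dry-run unless yes is set, and filter-driven selections
// must fit under a positive limit. Assumption: over-limit is an error.
func guard(sel Selection, yes bool, limit, sampleSize int) (Decision, error) {
	n := len(sel.IDs)
	if n == 0 {
		return Decision{}, errors.New("selection is empty; nothing to act on")
	}
	if sel.FromFilter {
		if limit <= 0 {
			return Decision{}, errors.New("--filter requires a positive --limit")
		}
		if n > limit {
			return Decision{}, fmt.Errorf("filter matched %d items, above --limit %d; narrow the filter", n, limit)
		}
	}
	if sampleSize < 0 {
		sampleSize = 0
	}
	if sampleSize > n {
		sampleSize = n
	}
	return Decision{Apply: yes, Count: n, Sample: sel.IDs[:sampleSize]}, nil
}

func main() {
	sel := Selection{IDs: []string{"c1", "c2", "c3"}, FromFilter: true}

	d, err := guard(sel, false, 100, 2) // no --yes: preview only
	fmt.Printf("%+v %v\n", d, err)

	_, err = guard(sel, true, 2, 2) // --yes, but the filter exceeds the cap
	fmt.Println(err)
}
```

Keeping the guard free of I/O means the same function can back per-item verbs (a one-element selection) and bulk verbs, and it can be unit-tested without a tenant.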

Command tree

Designed shape. Built today: query udm, cases list/get/search. Everything else here (incl. alerts …, cases <verb>, cases bulk) is the planned model, not yet wired — authoritative per-command status is in CATALOG.md.

secopsctl query udm | search nl | stats | iocs list | entity summarize     # read
secopsctl alerts  list | get | update | bulk <close|verdict|priority|comment>
secopsctl cases   list | get | search | comment | assign | tag | priority | stage | close | reopen | merge
                  + cases bulk <close|assign|tag|priority|stage|reopen>     # subset (--ids/--filter, guarded)

Cross-cutting

First implementation wave — SIEM cases (operational)

Two case paths, one case. The SIEM-native cases collection below (v1beta) is new and returns intermittent 5xx; the reliable, complete path for case operations is the SOAR AppKey API: soar case <verb> plus the soar case list/get reads (ListCaseCards / GetCaseFullDetails, which also returns the case’s alerts). The SOAR reads are wired (soar case list/get), completing the reliable query → review → act loop; the SIEM-native cases/alerts commands here are the unified view for when that collection stabilizes.

Decided: the subset-act model is both paths (reviewed --ids preferred, --filter gated dry-run-first + --limit-capped), and the first wave is case management: the full triage lifecycle.

# query
secopsctl cases list   [--filter EXPR] [--status …] [--priority …] [--limit 100] [--json]   # ListCases / ListCasesOpts
secopsctl cases search '<expr>' [--hours N] [--json]                                         # SearchCases
secopsctl cases get <uuid> [--expand alerts|events]                                          # GetCase

# per-item act (guarded: dry-run default, --yes to apply)
secopsctl cases comment  <uuid> "<text>"
secopsctl cases assign   <uuid> --user <analyst>
secopsctl cases tag      <uuid> --tag <t>
secopsctl cases priority <uuid> --priority <…>
secopsctl cases stage    <uuid> --stage <…>
secopsctl cases close    <uuid> --reason <…> --root-cause "<…>"
secopsctl cases reopen   <uuid> --comment "<…>"
secopsctl cases merge    --into <uuid> --ids <a,b,c>

# subset act (guarded; --ids reviewed-set OR --filter dry-run-first + --limit cap)
secopsctl cases bulk <close|assign|tag|priority|stage|reopen> [--ids 1,2|@file | --filter EXPR] … [--yes] [--limit N]
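One way the --ids argument (inline list or @file) might be resolved into a reviewed set. parseIDs is a hypothetical helper, and the file format (ids separated by commas, whitespace, or newlines) is an assumption; the design only states the 1,2|@file syntax.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// parseIDs turns "a,b,c" or "@path" into a de-duplicated, ordered id list.
// readFile is injected so the function can be tested without a real file.
func parseIDs(arg string, readFile func(string) ([]byte, error)) ([]string, error) {
	raw := arg
	if strings.HasPrefix(arg, "@") {
		b, err := readFile(strings.TrimPrefix(arg, "@"))
		if err != nil {
			return nil, fmt.Errorf("read id file: %w", err)
		}
		raw = string(b)
	}
	// Assumed separators: commas, spaces, tabs, and line breaks.
	fields := strings.FieldsFunc(raw, func(r rune) bool {
		return r == ',' || r == '\n' || r == '\r' || r == ' ' || r == '\t'
	})
	seen := make(map[string]bool, len(fields))
	var ids []string
	for _, f := range fields {
		if !seen[f] {
			seen[f] = true
			ids = append(ids, f)
		}
	}
	if len(ids) == 0 {
		return nil, errors.New("--ids resolved to an empty set")
	}
	return ids, nil
}

func main() {
	ids, err := parseIDs("a,b,a", os.ReadFile)
	fmt.Println(ids, err)
}
```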

Wiring: single-field edits go through PatchCase (etag + updateMask; round-trip the etag, surface conflicts); merge → MergeCases; every bulk verb → the matching Bulk* SDK method. Reuse the liveBanner + dry-run/--yes guard from push. Each cases <verb> shares the casesOps plumbing; cases bulk adds the --ids/--filter + --limit selection on top.

Build discipline (same as SOAR, and the gate is real here): SIEM needs a token (SECOPS_ACCESS_TOKEN; ADC is restricted) and the v1alpha surface 500s — so the query/read layer + --dry-run previews are built and validated first (safe), and no --yes bulk close/assign is trusted until a live smoke closes→reopens a single throwaway-safe case (or runs against a non-prod instance). Until a token is available, this wave ships read + dry-run only.

Non-goals