# OAGS v0.1 — Core Specification

Status: working draft. Keywords MUST / SHOULD / MAY per RFC 2119.
Custodian: OMPU Research Collective. Canonical context (future): `https://oags.dev/context/v0.1`.

OAGS is a **profile over plain JSON**. The wire format is ordinary JSON. Everything heavier
(JSON-LD, signatures, canonicalization, RDF mapping) is an *opt-in upgrade*, never a default tax.
Design priority order: **(1) simple & intuitive to author, (2) honest about what's missing,
(3) navigable by an agent, (4) compact, (5) linked-data-bridgeable.**

---

## 1. The scene payload

An OAGS scene is a JSON object. **Six keys are REQUIRED** (the conformance floor):

| key | type | meaning |
|---|---|---|
| `oags` | string | version, e.g. `"0.1"`. First key a cold parser branches on; self-identifies the payload with no network round-trip. |
| `graph_id` | string | stable identity of the whole graph this scene slices. **Opaque token** — MAY be an IRI, but consumers MUST NOT dereference it. |
| `entry` | object | `{ block_id, depth }` — where the agent enters and how deep this slice went. |
| `blocks` | array | the typed nodes (§2). |
| `edges` | array | the typed relations (§3). MAY be `[]`, but the key MUST be present. |
| `declared_losses` | array | the confession surface (§4). MAY be `[]` (a meaningful claim — see §4.4). |

**Strongly recommended** (present them unless you have a reason not to): `negative_space` (§5),
`provenance` (§6), `expand` (§7), `schema`. **Optional**: `site`, `lenses` (§8), `as_of` (§9),
`canon`/`sig` (§10), `conforms_to`, `@context` (§11), and reserved hooks (§12).

> Why floor-6 and not 7: required honesty only works where the empty value is itself a claim.
> `declared_losses: []` means "I claim I withheld nothing within `entry.depth`." `negative_space: []`
> cannot carry that meaning (see §5.2), so forcing it invites theater. It is strongly recommended,
> not required.
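
As a rough illustration of the conformance floor, the sketch below checks the six required keys and their JSON types. The helper name `floor_errors`, the error strings, and the assumption that `entry` always carries both `block_id` and `depth` are illustrative choices, not part of OAGS.

```python
# Hypothetical floor check; names and messages are illustrative only.
REQUIRED_KEYS = {
    "oags": str,
    "graph_id": str,
    "entry": dict,
    "blocks": list,
    "edges": list,
    "declared_losses": list,
}

def floor_errors(scene):
    """Return a list of problems; an empty list means the six-key floor is met."""
    if not isinstance(scene, dict):
        return ["scene is not a JSON object"]
    errors = []
    for key, expected in REQUIRED_KEYS.items():
        if key not in scene:
            errors.append(f"missing required key: {key}")
        elif not isinstance(scene[key], expected):
            errors.append(f"{key} should be {expected.__name__}")
    entry = scene.get("entry")
    if isinstance(entry, dict):
        # Assumption: both entry fields are expected to be present.
        for field in ("block_id", "depth"):
            if field not in entry:
                errors.append(f"entry.{field} is missing")
    return errors

scene = {
    "oags": "0.1",
    "graph_id": "example-graph",
    "entry": {"block_id": "sleep-1", "depth": 1},
    "blocks": [{"block_id": "sleep-1"}],
    "edges": [],
    "declared_losses": [],
}
print(floor_errors(scene))  # []
```

Note that `edges: []` and `declared_losses: []` pass, while a scene that omits either key does not: presence is what the floor tests.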

---

## 2. Blocks (typed nodes)

```json
{ "block_id": "sleep-1", "block_class": "meta", "label": "holds", "...": "domain fields" }
```
- `block_id` (REQUIRED) — unique within the scene.
- `block_class` (OPTIONAL) — OMPU vocab `{empirical, empirical_candidate, theoretical, speculative,
  scar, meta, lens}` or any string; unknown values MUST be preserved, not rejected.
- any other fields are the block's payload. A block MAY carry an optional opaque `attachment` (§12).

## 3. Edges (typed relations) — plain JSON, no RDF-star

```json
{ "from": "sleep-1", "to": "sleep-2", "op": "precedes", "lens": "time", "polarity": "+" }
```
- `from`, `to` (REQUIRED) — block_ids.
- `op` (OPTIONAL) — the relation type, semantically (`precedes`, `supports`, `make`, …). Carries the
  edge "type" so v0.1 needs no RDF-star / RDF 1.2 triple-term dependency.
- `lens` (OPTIONAL) — id into `lenses` (§8): which interpretive frame this edge lives under.
- `polarity` (OPTIONAL) — `"+"` | `"-"` | `"0"`, the edge's semantic sign. **Named `polarity`, not
  `sign`,** to avoid collision with the cryptographic `sig` (§10).
- `id`, `type` (OPTIONAL, reserved) — present only when an edge must be individually addressed
  (per-edge provenance, signature, or a `hidden_edges` target).
- **Anonymous-edge identity** (for the future RDF mapping): an edge without an explicit `id` is
  identified as `graph_id + "#edge-" + base64url(sha256(JCS(edge-object-without-id)))`. Hashing the
  *whole* edge object (not the `{from,to,op}` tuple) avoids collisions when two edges differ only by
  `lens`/`polarity`/metadata.
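
A minimal sketch of the anonymous-edge identity rule, under two stated assumptions: `jcs_approx` only approximates JCS (RFC 8785) and agrees with it only for simple payloads, and the base64url token is emitted without `=` padding because the text above does not say either way. Both function names are hypothetical.

```python
import base64
import hashlib
import json

def jcs_approx(obj):
    # Assumption: stand-in for RFC 8785. Sorted keys and compact separators match
    # JCS only for simple payloads (basic-plane keys, strings, small integers).
    # A real implementation should use a dedicated JCS library.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def anonymous_edge_id(graph_id, edge):
    # Hash the whole edge object except its explicit id, as the rule describes.
    body = {k: v for k, v in edge.items() if k != "id"}
    digest = hashlib.sha256(jcs_approx(body)).digest()
    # Assumption: unpadded base64url.
    token = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{graph_id}#edge-{token}"

plus = {"from": "sleep-1", "to": "sleep-2", "op": "precedes", "lens": "time", "polarity": "+"}
minus = dict(plus, polarity="-")
# Edges that differ only in polarity get distinct identities.
print(anonymous_edge_id("example-graph", plus) != anonymous_edge_id("example-graph", minus))  # True
```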

---

## 4. `declared_losses` — the differentiator

Each entry is a **JSON object** (bare strings are NOT allowed except as `note` text). Two
orthogonal axes — **scope** (what shape of thing is missing) and **reason** (why):

```json
{
  "scope": "depth_limited",
  "where": "@graph",
  "count": 39,
  "recoverable": true,
  "expand_via": { "rel": "deeper", "from": "sleep-1", "depth": 3 }
}
```

### 4.1 `scope` — CLOSED enum (REQUIRED)
`depth_limited` · `truncated` · `hidden_edges` · `omitted_nodes` · `sampled`
(`sampled` is distinct from `omitted_nodes`: a sample is not topologically complete — different
trust math.) The closed set keeps an agent's partial-view logic deterministic.

### 4.2 `reason` — OPEN enum (OPTIONAL; REQUIRED only when scope does not self-explain)
`depth_limited` and `sampled` are self-explaining (the consumer set the bound) → `reason` MAY be
omitted. For the other scopes (`truncated`, `hidden_edges`, `omitted_nodes`), `reason` is REQUIRED.
Base terms: `policy` · `rights` · `rate_limit` · `timeout` ·
`stale` · `expiry` · `redacted_by_policy` · `compact_profile` · `encoding`. The enum is **open**:
custom reasons use an `x_` prefix. **A consumer that meets an unknown reason MUST treat it as
"a loss exists, advisory" and surface it — never drop it.**
> `encoding` means *lost to the codec/serialization itself* (another codec could recover it). Do
> NOT use it for editorial choices like "one word per block" — that is `policy` / `compact_profile`,
> recoverable only via `expand_via`.

### 4.3 Other fields (all OPTIONAL)
- `where` — a block_id, an edge id, or a reserved region word with an `@` sigil: `@graph`, `@schema`,
  `@blocks`, `@edges`. Block_ids MUST NOT start with `@`.
- `count` — integer or `"unknown"` — how much is missing (feeds partial-view math).
- `recoverable` — **`true`** (follow `expand_via`) | **`false`** (gone — do not retry/expand) |
  **`"unknown"`** (a string, as in `count`). This is the field agents branch on first. `null`/absent ≠ `false`.
- `expand_via` — `{ rel, from?, depth?, ... }` binding into the `expand` template (§7), or `null`.
- `note` — free-text human gloss. **Agents MUST NOT derive behavior from `note`.**
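
One way an agent might triage a single loss entry, sketched under assumptions: the function name `triage_loss` and the three action labels are hypothetical, and treating "recoverable but no `expand_via`" as undetermined is a design choice of this sketch rather than something the text prescribes. The sketch never reads `note`.

```python
BASE_REASONS = {
    "policy", "rights", "rate_limit", "timeout", "stale",
    "expiry", "redacted_by_policy", "compact_profile", "encoding",
}

def triage_loss(loss):
    """Return (action, advisories) for one declared_losses entry."""
    advisories = []
    reason = loss.get("reason")
    if reason is not None and reason not in BASE_REASONS:
        # Unknown or x_-prefixed reason: surface it, never drop it.
        advisories.append(f"unrecognized reason {reason!r}: a loss exists, advisory")
    recoverable = loss.get("recoverable")
    if recoverable is True and loss.get("expand_via"):
        action = "expand"        # follow expand_via
    elif recoverable is False:
        action = "stop"          # gone: do not retry or expand
    else:
        action = "undetermined"  # "unknown", null, or absent is not the same as false
    return action, advisories

print(triage_loss({"scope": "truncated", "reason": "rights", "recoverable": False}))
# ('stop', [])
```

The branch order mirrors the text: `recoverable` decides the action, while an unrecognized `reason` only adds an advisory and never suppresses the entry.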

### 4.4 Empty set
`declared_losses: []` is a **positive, signable assertion of completeness within `entry.depth`** —
not "we forgot to fill it." A conforming producer MUST emit a loss whenever it truncates,
depth-limits, hides edges, samples, or omits nodes relative to the fuller graph it holds.

### 4.5 Build-state is NOT a loss
An unpublished schema, a TODO, a missing optional field — these are **conformance warnings**, raised
by the validator, NOT `declared_losses` entries. `declared_losses` describes missing *graph*, never
missing *spec work*.

---

## 5. `negative_space` — absence by design (strongly recommended, structured-only)

```json
{ "kind": "by_design", "claim": "the cat itself is not a node", "where": "@graph", "machine_readable": true }
```
Structured objects only; bare strings are not allowed (this is the second honesty field, so it must be
machine-readable). `kind` + `claim` REQUIRED; `where`, `machine_readable` OPTIONAL.

### 5.1 The boundary test (declared_losses vs negative_space)
**"Is there a fuller version of the producer's OWN representation where this would be present?"**
- **Yes** → `declared_losses` (omitted *in transit* — a claim about the transmission).
- **No** → `negative_space` (it genuinely isn't part of the graph — a claim about the territory).

The same fact MUST NOT be scattered across both.

### 5.2 Empty-set asymmetry (normative)
`negative_space: []` is **NOT** a completeness claim — it only means "no by-design absences
enumerated." To make the strong claim, include a sentinel:
`{ "kind": "asserted_complete", "claim": "graph is complete as modeled" }`.
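
The asymmetry between the two empty sets can be sketched as a pair of predicates. Both function names are hypothetical; the only inputs are the `asserted_complete` sentinel and the `declared_losses: []` reading described in this document.

```python
def asserts_completeness(scene):
    """True only when negative_space carries the explicit asserted_complete sentinel."""
    entries = scene.get("negative_space") or []
    return any(
        isinstance(entry, dict) and entry.get("kind") == "asserted_complete"
        for entry in entries
    )

def withheld_nothing(scene):
    """An empty declared_losses array is itself the claim; a missing key is not."""
    return scene.get("declared_losses") == []

scene = {"declared_losses": [], "negative_space": []}
print(withheld_nothing(scene))      # True
print(asserts_completeness(scene))  # False
```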

---

## 6. `provenance` — PROV-O mini-subset
Flat object with ~4–6 PROV-O terms: `wasGeneratedBy`, `wasDerivedFrom`, `wasAttributedTo`,
`generatedAtTime` (and `specializationOf` for versioned/projected entities). Scenes that travel
off-origin SHOULD use the split form `{ claim: {…about the graph}, document: {…about this served
slice: served_at, by, count, returned} }` (the nanopub claim-vs-publication split).

## 7. `expand` — the follow handle (RFC 6570 URI template)
```json
"expand": "https://catconstant.com/sleeps{?from,depth,view,loss,cursor,limit}"
```
**Reserved variables** (consumers know how to fill them): `from`, `depth`, `view`, `loss`, `cursor`,
`limit`. Each `declared_losses[].expand_via` binds into this template. An object form
`{ template, params[], rels{} }` is allowed for HATEOAS publishers but the bare template is the
baseline.
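
A minimal sketch of filling the bare template from an `expand_via` binding. It implements only the form-style query operator `{?a,b,c}` of RFC 6570, which is all the example template uses; the function name `expand_query_template` is hypothetical, and a full URI Template library would be the safer choice for anything beyond this shape.

```python
import re
from urllib.parse import quote

def expand_query_template(template, values):
    """Expand only the form-style query operator {?a,b,c} of RFC 6570."""
    def fill(match):
        pairs = []
        for name in match.group(1).split(","):
            value = values.get(name)
            if value is None:
                continue  # undefined variables are skipped
            pairs.append(f"{name}={quote(str(value), safe='')}")
        return "?" + "&".join(pairs) if pairs else ""
    return re.sub(r"\{\?([^}]*)\}", fill, template)

TEMPLATE = "https://catconstant.com/sleeps{?from,depth,view,loss,cursor,limit}"
expand_via = {"rel": "deeper", "from": "sleep-1", "depth": 3}
# `rel` is not a template variable, so it is ignored during expansion.
print(expand_query_template(TEMPLATE, expand_via))
# https://catconstant.com/sleeps?from=sleep-1&depth=3
```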

## 8. `lenses`
`{ lens_id: "human gloss" }`. The **`lens_id` is the machine-actionable token**; the definition is a
human gloss — **agents MUST NOT derive behavior from the definition string.** Structured/computable
lenses are reserved for v0.2.

## 9. `as_of` (recommended for live graphs)
Top-level timestamp of the **graph state** this slice reflects — distinct from the byte-generation
time in `provenance.document.served_at`. A cached scene can have fresh bytes over a stale `as_of`.
An optional `graph_version`/`etag` lets a cold agent ask "changed since?" without re-diffing.

---

## 10. Trust — `canon` + `sig` (both optional; the off-origin story)

- `canon` is **OPTIONAL**; if a `sig` is present, `canon` is **REQUIRED**. The v0.1 default profile
  is `"jcs-v1"` = JCS canonical JSON (RFC 8785) over the full payload with the `sig` field excluded.
  A self-contained object form `{ alg, scope, sig_excluded }` is allowed off-registry. Undefined
  bare labels are deprecated (the live `slot0-v1` placeholder MUST die).
- `sig` is **OPTIONAL but strongly recommended** for scenes crossing a cache / mirror / aggregator /
  bus: a detached or embedded **JWS** (RFC 7515) over the `canon` form, covering `declared_losses`.
  Far lighter than W3C Data Integrity / mandatory RDFC canonicalization (which we deliberately do
  NOT require — it is isomorphism-hard with a Dataset-Poisoning DoS surface).

> **A signature proves the publisher MADE this claim over these exact bytes — it does NOT prove the
> claim is true, complete, or that nothing was silently dropped.** A signed scene with a fabricated
> or empty `declared_losses` is cryptographically valid and substantively dishonest. `declared_losses`
> is a *speech-act*, enforced socially (reputation + falsifiability), not a cryptographic guarantee.
> Consumers MUST treat unsigned or absent losses as advisory. This paragraph is normative and MUST
> appear in any derived spec.
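
To make the `canon`/`sig` relationship concrete, here is a hedged sketch that signs the canonical form with a detached JWS. Several things are assumptions: the canonicalizer only approximates JCS for simple payloads, `sig` is stored as a compact-serialization string, and HS256 with a shared key stands in for whatever algorithm a publisher would actually choose (an asymmetric one is more plausible off-origin). All function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def canon_bytes(scene):
    # Assumption: approximates "jcs-v1" (full payload, sig excluded); this is not
    # a complete RFC 8785 implementation.
    unsigned = {k: v for k, v in scene.items() if k != "sig"}
    text = json.dumps(unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")

def sign_detached(scene, key):
    # Assumption: HS256 with a shared key, chosen only to stay within the stdlib.
    header = b64url(json.dumps({"alg": "HS256"}, separators=(",", ":")).encode("utf-8"))
    signing_input = f"{header}.{b64url(canon_bytes(scene))}".encode("ascii")
    mac = hmac.new(key, signing_input, hashlib.sha256).digest()
    return f"{header}..{b64url(mac)}"  # detached: the payload segment is left empty

def verify_detached(scene, key):
    sig = scene.get("sig")
    if not isinstance(sig, str) or scene.get("canon") != "jcs-v1":
        return False  # no sig, or a sig without the canon profile this sketch handles
    expected = sign_detached(scene, key)
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("ascii"))

key = b"demo-shared-key"
scene = {"oags": "0.1", "canon": "jcs-v1", "declared_losses": [{"scope": "truncated", "reason": "rights"}]}
scene["sig"] = sign_detached(scene, key)
print(verify_detached(scene, key))                             # True
print(verify_detached(dict(scene, declared_losses=[]), key))   # False
```

The second check shows what the signature does cover: stripping `declared_losses` in transit invalidates it. It says nothing about whether the original list was honest.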

---

## 11. Discovery & JSON-LD compat
- **In-band (cheap):** the `oags` key self-identifies a scene found anywhere (bus, cache, paste).
- **Site index:** `GET /.well-known/oags` (RFC 8615 well-known suffix; collision-free vs
  `agent-card.json`, `api-catalog`). Lists graph_ids + the schema/context URLs. Experimental until
  an IANA registration is filed.
- **Schema pointer:** the `schema` field carries the JSON-Schema/profile URL. A site claiming OAGS
  support MUST publish a non-null schema; an individual *payload* MAY leave `schema` absent (the
  `oags` version covers the cheap path).
- **JSON-LD:** OPTIONAL — inline `@context`, or HTTP `Link: rel="http://www.w3.org/ns/json-ld#context"`,
  or `Link: rel="alternate"; type="application/ld+json"`. Never required, never a core key.
- `llms.txt` MAY mention the OAGS endpoint but MUST NOT be the discovery mechanism (it is not a graph
  carrier).

## 12. Extensibility — so v0.2 sits on v0.1 without glitches
This is a **load-bearing requirement** (more fields / edges / node-types are expected later):
- **Unknown top-level keys, unknown `op`, unknown `block_class`, unknown `reason` MUST be preserved
  and tolerated, never rejected** (A2A "ignore unrecognized fields" rule). Producers MAY emit them.
- Custom vocabulary uses an `x_` prefix; consumers surface unknown `x_` terms, never silently drop.
- **Reserved (named now, shape may tighten in v0.2):** `attachment` on blocks/edges (opaque latent
  payload ref `{model, projection, dims, codec, version}`); `entry.selector_profile`; edge `id`/`type`;
  a top-level `warnings` array for conformance/build-state notes (so it never pollutes
  `declared_losses`).
- The `oags` version string gates the parse grammar; minor versions are additive-only.
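
A small sketch of the preserve-and-surface rule for an intermediary that re-serializes scenes. The set `UNDERSTOOD_OPS` describes a hypothetical consumer (the spec's `op` list is open), `KNOWN_TOP_LEVEL` is collected from the keys named in this document, and `passthrough` is an illustrative name.

```python
import json

KNOWN_TOP_LEVEL = {
    "oags", "graph_id", "entry", "blocks", "edges", "declared_losses",
    "negative_space", "provenance", "expand", "schema", "site", "lenses",
    "as_of", "canon", "sig", "conforms_to", "@context", "warnings",
}
UNDERSTOOD_OPS = {"precedes", "supports", "make"}  # what this hypothetical consumer knows

def passthrough(scene_text):
    """Re-serialize a scene unchanged and report what was not recognized."""
    scene = json.loads(scene_text)
    notices = [
        f"unrecognized top-level key: {key}"
        for key in sorted(scene) if key not in KNOWN_TOP_LEVEL
    ]
    for edge in scene.get("edges", []):
        op = edge.get("op")
        if op is not None and op not in UNDERSTOOD_OPS:
            notices.append(f"unrecognized op: {op}")
    # Nothing is deleted: unknown keys and terms survive the round trip.
    return json.dumps(scene, ensure_ascii=False), notices

text = '{"oags": "0.1", "x_mood": "calm", "edges": [{"from": "a", "to": "b", "op": "x_orbits"}]}'
out, notices = passthrough(text)
print(json.loads(out) == json.loads(text))  # True
print(notices)
# ['unrecognized top-level key: x_mood', 'unrecognized op: x_orbits']
```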

## 13. RDF / linked-data bridge (promise that must ship, not rot)
The mapping (block→node, edge→reified statement via the anonymous-edge IRI of §3, `provenance`→PROV-O,
`@context`→JSON-LD) MUST ship as a dereferenceable `@context` URL + a conformance fixture
(scene → expected N-Quads) tested in CI. A promised-but-unshipped bridge is a painted door; until the
fixture passes, OAGS is plain JSON with a linked-data *intention*.

---

## 14. Conformance checklist (v0.1)
A scene conforms if: the 6 required keys are present and well-typed; `declared_losses` entries are
objects with a closed-enum `scope`; `negative_space` (if present) is structured; no `declared_losses`
entry encodes build-state; if `sig` is present, `canon` is too; unknown extension fields are
preserved. A **web surface** additionally MUST serve `/.well-known/oags` and a non-null `schema`.
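
The per-entry items of the checklist could be sketched as below. This covers only the mechanically checkable rules (loss entries are objects with a closed-enum `scope`, `negative_space` is structured, `sig` implies `canon`); whether an entry encodes build-state needs human or validator judgment and is not attempted. The name `checklist_errors` and the messages are hypothetical.

```python
CLOSED_SCOPES = {"depth_limited", "truncated", "hidden_edges", "omitted_nodes", "sampled"}

def checklist_errors(scene):
    """Return problems found in the per-entry checklist items; empty means none found."""
    errors = []
    for i, loss in enumerate(scene.get("declared_losses", [])):
        if not isinstance(loss, dict):
            errors.append(f"declared_losses[{i}] is not an object")
            continue
        scope = loss.get("scope")
        if not isinstance(scope, str) or scope not in CLOSED_SCOPES:
            errors.append(f"declared_losses[{i}].scope is outside the closed enum")
    for i, item in enumerate(scene.get("negative_space", [])):
        if not isinstance(item, dict) or "kind" not in item or "claim" not in item:
            errors.append(f"negative_space[{i}] needs an object with kind and claim")
    if "sig" in scene and "canon" not in scene:
        errors.append("sig is present but canon is missing")
    return errors

scene = {
    "declared_losses": [{"scope": "depth_limited", "where": "@graph", "count": 39}],
    "negative_space": [{"kind": "by_design", "claim": "the cat itself is not a node"}],
}
print(checklist_errors(scene))  # []
```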
