# Crystallise AI Backend API Reference

Stateless AI processing service for Evidence Mapper integration

Base URL: **http://localhost:8005**  |  Routes are served under both `/v1/*` and `/*`

This reference covers every HTTP endpoint on the Crystallise AI backend. The service provides three AI capabilities for systematic literature reviews —
**AI Screening** (scoring studies against eligibility criteria),
**AutoIndexer** (extracting structured fields from title/abstract), and
**Criteria AI** (building/refining eligibility criteria and analysing research questions).
Every endpoint is stateless: Evidence Mapper owns all persistent data. Start with [Health](#health) to verify the service is reachable, then [Configuration](#config) for service defaults, then the three capability sections.

#### Service Auth

`X-API-Key: <key>` or `Authorization: Bearer <key>` on all non-public requests.

#### OpenAI Key

`X-OpenAI-API-Key: sk-...` per-request passthrough. Falls back to server env var.
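
As a rough illustration of the two header conventions above, a client might assemble its headers like this. The helper name `build_headers` and its parameters are hypothetical; only the header names come from this reference.

```python
from typing import Dict, Optional


def build_headers(api_key: str, openai_key: Optional[str] = None,
                  use_bearer: bool = False) -> Dict[str, str]:
    """Assemble headers for a non-public endpoint (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Alternative service-auth form: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["X-API-Key"] = api_key
    # Optional per-request passthrough. When omitted, the server is
    # documented to fall back to its own environment variable.
    if openai_key:
        headers["X-OpenAI-API-Key"] = openai_key
    return headers
```

Passing `openai_key=None` simply leaves the passthrough header out, which matches the documented fallback behaviour.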

#### Async Jobs

Screening and Indexer batches: `POST /jobs`, poll `GET /jobs/{id}`.

#### Sync Jobs

Criteria AI and `POST /indexer/run`: one request, one response. No polling needed.

#### Mock Mode

Every mutating endpoint accepts `"mock": true` — canned data, no OpenAI call, no key needed.

### Contents

- [Health](#health)
- [Configuration](#config)
- [AI Screening](#screening)
- [AutoIndexer](#indexer)
- [Criteria AI](#criteria)
- [Error Responses](#errors)
- [Shared Types](#types)

## Health

Public endpoints for uptime monitoring. `/health` is a liveness probe (is the process running?); `/health/ready` also verifies database connectivity and OpenAI-key presence. Neither requires authentication.

### `GET /health` — Liveness check (no auth)

Returns `{ "status": "ok" }`. No checks — "am I running?" only.

Response codes: `200`

### `GET /health/ready` — Readiness probe (no auth)

Response 200 — healthy

```
{"status": "ready", "checks": {"database": "ok", "openai_key": "configured"}}
```

Response 503 — degraded

```
{"status": "degraded", "checks": {"database": "error: connection refused", "openai_key": "missing"}}
```

Response codes: `200`, `503`
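
A monitoring script could reduce the readiness body to a list of failing checks. This is a minimal sketch: the function name `failing_checks` is hypothetical, the healthy values `ok` and `configured` are taken from the 200 example above, and treating any other value as a failure is an assumption.

```python
from typing import Any, Dict, List

# Values shown in the healthy (200) example; anything else counts as failing here.
HEALTHY_VALUES = {"ok", "configured"}


def failing_checks(status_code: int, body: Dict[str, Any]) -> List[str]:
    """Return the names of readiness checks that did not pass."""
    checks = body.get("checks", {})
    failed = [name for name, value in checks.items() if value not in HEALTHY_VALUES]
    # A non-200 response without a recognisable failing check is still a failure.
    if status_code != 200 and not failed:
        failed.append("unknown")
    return failed
```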

## Configuration

Inspect and (carefully) update the runtime model, temperature, and prompt settings of each AI service. The `GET` endpoints list service configs and the centralised prompt registry; the single `PUT` is for admin-level tuning. Everything here is metadata: no user data flows through.

### `GET /v1/config/services` — List service configurations

Response 200 — array of `ServiceConfigResponse`

| Field | Type | Description |
| --- | --- | --- |
| service\_id | string | e.g. `screening`, `extraction`, `criteria` |
| model | string |  |
| system\_prompt | string | Current prompt (may be inherited from registry) |
| prompt\_template\_id | string |  |
| temperature | float |  |
| max\_output\_tokens | integer |  |
| extra | dict | Service-specific knobs |

Response codes: `200`, `401`

### `GET /v1/config/services/{service_id}` — Get service config

Returns a single `ServiceConfigResponse`. Unknown `service_id` returns a default config rather than 404.

Response codes: `200`, `401`

### `PUT /v1/config/services/{service_id}` — Update service config (partial)

Request Body — all fields optional, patch-style

| Field | Type | Required |
| --- | --- | --- |
| model | string | No |
| system\_prompt | string | No |
| prompt\_template\_id | string | No |
| temperature | float | No |
| max\_output\_tokens | integer | No |
| extra | dict | No |

Response 200

The updated `ServiceConfigResponse` (same fields as `GET`).

Example

```
curl -s -X PUT http://localhost:8005/v1/config/services/screening \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-mini", "temperature": 0.2}'
# →
{
  "service_id": "screening",
  "model": "gpt-5-mini",
  "temperature": 0.2,
  "max_output_tokens": 8192,
  ...
}
```

Response codes: `200`, `400`, `401`, `500`
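
Because the `PUT` body is patch-style, a client would normally send only the fields it wants to change. The sketch below builds such a body; `build_config_patch` is a hypothetical helper, and it assumes that omitted fields are left unchanged by the server.

```python
from typing import Any, Dict, Optional


def build_config_patch(model: Optional[str] = None,
                       system_prompt: Optional[str] = None,
                       prompt_template_id: Optional[str] = None,
                       temperature: Optional[float] = None,
                       max_output_tokens: Optional[int] = None,
                       extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a request body containing only the fields that were set."""
    candidates = {
        "model": model,
        "system_prompt": system_prompt,
        "prompt_template_id": prompt_template_id,
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
        "extra": extra,
    }
    # Compare against None (not truthiness) so temperature=0.0 is still sent.
    return {key: value for key, value in candidates.items() if value is not None}
```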

### `GET /v1/config/prompts` — List AI prompt definitions

Response 200 — array of prompt metadata

| Field | Type | Description |
| --- | --- | --- |
| name | string | e.g. `criteria.question_analysis` |
| service | string | `screening`, `criteria`, `indexer` |
| description | string | One-line purpose |
| has\_variables | boolean | Whether the prompt has templated parameters |
| system\_or\_user | string | `system`, `user`, or `both` |

Response codes: `200`, `401`

## AI Screening

Title/abstract screening pipeline in four stages — scoring each paper against eligibility criteria across multiple AI "repetitions", generating human-readable reasoning, grouping that reasoning into thematic clusters, and assigning each paper to a cluster. Designed for batch runs: submit a job, poll for results. Typical throughput: hundreds to low-thousands of papers per run.

### Core

### `POST /v1/screening/jobs` — Start async screening job

Request Body

| Field | Type | Required |
| --- | --- | --- |
| papers | dict[] | Yes — each with `id`, `title`, `abstract` |
| criteria | dict[] | No |
| questions | string[] | No |
| model | string | No (default: `gpt-5-nano`) |
| repetitions | integer | No (default: 5) |
| threshold | float | No (default: 1.0) |
| clusters\_type | "include" \| "exclude" | No |
| project\_id | integer | No — opaque correlation key from EM |
| mock | boolean | No — deterministic scoring, no OpenAI call |
| max\_estimated\_cost\_usd | float | No — job rejected with 400 if estimate exceeds this |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string | UUID for polling |
| status | string | `"pending"` initially |
| progress | float | 0.0 initially; 0–1 once running |
| stage | string | Empty initially; `"labelling"`, `"reasoning"`, `"clustering"`, `"assignment"` while running |

Example (success)

```
# Request
curl -s -X POST http://localhost:8005/v1/screening/jobs \
  -H "X-API-Key: dev-key" \
  -H "Content-Type: application/json" \
  -d '{
    "papers": [
      {"id": "p1", "title": "RCT of drug X in adults", "abstract": "Randomized trial..."}
    ],
    "criteria": [{"name": "Population", "type": "include", "value": "Adults"}],
    "questions": ["Is drug X effective?"],
    "mock": true
  }'

# Response
{"job_id": "abc-123", "status": "pending", "progress": 0.0, "stage": ""}
```

Example (error — cost cap exceeded)

```
{"detail": "Estimated cost $1.23 exceeds max_estimated_cost_usd=$0.50"}
```

4-stage pipeline: labelling → reasoning → clustering → assignment. Poll `GET /v1/screening/jobs/{job_id}` for progress.

Response codes: `200`, `400`, `401`, `500`
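
Before submitting a large batch, a client may want to catch malformed papers locally. The following is a sketch under that assumption: `build_screening_job` is a hypothetical helper, and the checks only mirror the request table above. The server remains the source of truth for validation.

```python
from typing import Any, Dict, List

REQUIRED_PAPER_KEYS = ("id", "title", "abstract")


def build_screening_job(papers: List[Dict[str, Any]], **options: Any) -> Dict[str, Any]:
    """Check papers locally and return a body for POST /v1/screening/jobs."""
    for index, paper in enumerate(papers):
        missing = [key for key in REQUIRED_PAPER_KEYS if key not in paper]
        if missing:
            raise ValueError(f"paper {index} is missing: {', '.join(missing)}")
    clusters_type = options.get("clusters_type")
    if clusters_type is not None and clusters_type not in ("include", "exclude"):
        raise ValueError("clusters_type must be 'include' or 'exclude'")
    body: Dict[str, Any] = {"papers": papers}
    # Send only the options the caller set, so server defaults apply otherwise.
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```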

### `GET /v1/screening/jobs/{job_id}` — Poll job status + results

Response 200

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string |  |
| status | string | `"pending"`, `"running"`, `"completed"`, `"failed"` |
| progress | float | 0–1 |
| stage | string | Current pipeline stage |
| results | dict[] | Per-paper scores + reasoning (when completed) |
| clusters | dict[] | Reason clusters (when completed) |
| error | string | Error message (when failed) |
| error\_category | string | Same taxonomy as [`error_code`](#errors) |
| error\_retryable | boolean |  |
| duration\_ms | integer | Wall-clock duration |
| estimated\_cost\_usd | float | Final cost (approximate) |
| stage\_timings | dict | Per-stage duration in ms |

Example

```
curl -s http://localhost:8005/v1/screening/jobs/abc-123 -H "X-API-Key: dev-key"
# →
{
  "job_id": "abc-123",
  "status": "completed",
  "progress": 1.0,
  "stage": "assignment",
  "results": [{"id": "p1", "final_score": 4.2, "cluster_id": 1, "reasoning": "..."}],
  "clusters": [{"cluster_id": 1, "label": "Eligible RCTs", "count": 1}],
  "duration_ms": 3421,
  "estimated_cost_usd": 0.002
}
```

Response codes: `200`, `401`, `404` (unknown `job_id`)
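
The submit-then-poll flow can be sketched as a small loop. Everything here is illustrative: `poll_job` is a hypothetical helper, `fetch_job` is a caller-supplied callable that performs `GET /v1/screening/jobs/{job_id}` and returns the parsed JSON, and the interval and timeout defaults are assumptions rather than service recommendations.

```python
import time
from typing import Any, Callable, Dict

# Terminal statuses, per the status field in the table above.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(fetch_job: Callable[[str], Dict[str, Any]], job_id: str,
             interval_s: float = 2.0, timeout_s: float = 600.0,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the job is completed or failed, or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

Injecting `fetch_job` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without a running service. A returned job with `status == "failed"` still needs its `error_category` inspected by the caller.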

### `GET /v1/screening/jobs` — List recent jobs

Query Parameters

| Param | Type | Default |
| --- | --- | --- |
| limit | integer | 50 |

Response 200 — array of `ScreeningJobListItem`

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string |  |
| status | string |  |
| progress | float |  |
| stage | string |  |
| papers\_count | integer |  |
| model | string |  |
| project\_id | integer | From EM, if provided at create |
| duration\_ms | integer |  |
| estimated\_cost\_usd | float |  |
| created\_at | string | ISO 8601 timestamp |
| completed\_at | string | ISO 8601 timestamp (when completed) |

Response codes: `200`, `401`

### `GET /v1/screening/active-job` — Get active job for a project

Query Parameters

| Param | Type | Required |
| --- | --- | --- |
| project\_id | integer | Yes |

Returns the running/pending job for this `project_id`, or `null`. Use to dedupe: EM should not start a new job while one is running for the same project.

Response codes: `200`, `401`
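
The dedupe rule above could be applied on the client side roughly as follows. The names `ensure_job`, `get_active_job`, and `start_job` are hypothetical, and the sketch assumes the active-job response carries a `job_id` field like the other job responses.

```python
from typing import Any, Callable, Dict, Optional, Tuple


def ensure_job(project_id: int,
               get_active_job: Callable[[int], Optional[Dict[str, Any]]],
               start_job: Callable[[int], Dict[str, Any]]) -> Tuple[str, bool]:
    """Return (job_id, started_new), reusing an active job when one exists."""
    active = get_active_job(project_id)  # wraps GET /v1/screening/active-job
    if active is not None:
        return active["job_id"], False
    created = start_job(project_id)  # wraps POST /v1/screening/jobs
    return created["job_id"], True
```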

### Optional

### `POST /v1/screening/estimate` — Estimate cost before running

Request Body

| Field | Type | Required |
| --- | --- | --- |
| papers\_count | integer | Yes |
| model | string | No (default: `gpt-5-nano`) |
| repetitions | integer | No (default: 5) |
| criteria\_count | integer | No (default: 0) |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| estimated\_input\_tokens | integer | Sum across labelling + reasoning stages |
| estimated\_output\_tokens | integer |  |
| estimated\_cost\_usd | float | Based on hardcoded model pricing |
| model | string | Echoes request |
| papers\_count | integer | Echoes request |
| repetitions | integer | Echoes request |
| confidence | string | `"approximate"` |
| disclaimer | string | Expected variance (±30%) |

Example

```
curl -s -X POST http://localhost:8005/v1/screening/estimate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-nano", "papers_count": 500, "repetitions": 5, "criteria_count": 10}'
# →
{
  "estimated_input_tokens": 775000,
  "estimated_output_tokens": 75000,
  "estimated_cost_usd": 0.0625,
  "model": "gpt-5-nano",
  "papers_count": 500,
  "repetitions": 5,
  "confidence": "approximate",
  "disclaimer": "Estimate based on empirical averages. Actual cost may vary +-30%..."
}
```

**Caveat:** pricing is hardcoded in `crystallise.llm.cost.DEFAULT_PRICING_PER_1M` and may drift from OpenAI's public pricing over time. Treat `estimated_cost_usd` as approximate and cross-check against OpenAI's current rates before relying on it for budget caps.

Response codes: `200`, `400`, `401`, `500`
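
Given the stated ±30% variance, a cautious client might compare the upper end of the estimate against its budget before starting a job. This is a sketch only: `worst_case_cost` and `fits_budget` are hypothetical names, and applying the variance as a simple multiplier is an assumption.

```python
def worst_case_cost(estimated_cost_usd: float, variance: float = 0.30) -> float:
    """Upper end of the estimate, using the variance quoted in the disclaimer."""
    return estimated_cost_usd * (1.0 + variance)


def fits_budget(estimated_cost_usd: float, budget_usd: float,
                variance: float = 0.30) -> bool:
    """True when even the upper end of the estimate stays within budget."""
    return worst_case_cost(estimated_cost_usd, variance) <= budget_usd
```

For the example response above (`estimated_cost_usd` of 0.0625), the upper end is about 0.081 USD.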

## AutoIndexer

Structured data extraction from title + abstract. Define your extraction fields (by hand, or via the optional AI-suggest / AI-refine helpers), submit a batch, and receive per-paper values with *evidence spans* (the quote that justified each value) and per-field *confidence scores*. Use `POST /v1/indexer/run` for small batches synchronously or `POST /v1/indexer/jobs` for larger batches asynchronously.

### Core

### `POST /v1/indexer/run` — Synchronous extraction (small batches)

Request Body

| Field | Type | Required |
| --- | --- | --- |
| records | dict[] | Yes — each with `ID`, `Title`, `Abstract` |
| fields | IndexerField[] | Yes — see [Shared Types](#types) |
| model | string | No (default: `gpt-5-mini`) |
| project\_context | ProjectContext | No — `{description, research_questions}` |
| mode | "test" \| "sample" \| "full" | No (default: `full`). `test` processes the first 5 records, `sample` 20, `full` all. |
| max\_workers | integer | No (default: 4) |
| batch\_size | integer | No (default: 50) |
| project\_id | integer | No — opaque correlation key |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| results | dict[] | One per record with extracted field values + evidence + confidence |
| errors | string[] | Per-record error messages |
| usage | dict | Token usage: `{input_tokens, output_tokens, total_tokens, estimated_cost_usd}` |
| model\_version | string | Actual OpenAI model string returned (may include date suffix) |

Example (success)

```
curl -s -X POST http://localhost:8005/v1/indexer/run \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{
    "records": [{"ID": "p1", "Title": "RCT of drug X", "Abstract": "150 adults..."}],
    "fields": [
      {"name": "study_design", "description": "Type of study", "data_type_primary": "text"},
      {"name": "sample_size", "description": "Number of participants", "data_type_primary": "number"}
    ],
    "mode": "test"
  }'
# →
{
  "results": [{
    "ID": "p1",
    "indexing_status": "ok",
    "study_design": {"value": "RCT", "confidence": 0.95, "evidence": [...]},
    "sample_size": {"value": 150, "confidence": 0.9, "evidence": [...]}
  }],
  "errors": [],
  "usage": {"input_tokens": 320, "output_tokens": 85, "total_tokens": 405, "estimated_cost_usd": 0.0002},
  "model_version": "gpt-5-mini-2025-02-01"
}
```

Example (error — invalid field type)

```
{"detail": {"message": "...", "error_code": "validation", "retryable": false}}
```

Response codes: `200`, `400`, `401`, `429`, `500`
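
To preview how many records a given `mode` would process (for instance, to feed `record_count` into the estimate endpoint), the limits in the request table can be mirrored client-side. `record_count_for_mode` is a hypothetical helper; the limits of 5 and 20 come from the table above.

```python
from typing import Dict, Optional

# Per-mode record limits from the request table; None means no limit.
MODE_LIMITS: Dict[str, Optional[int]] = {"test": 5, "sample": 20, "full": None}


def record_count_for_mode(mode: str, total_records: int) -> int:
    """Number of records the given mode is documented to process."""
    if mode not in MODE_LIMITS:
        raise ValueError(f"unknown mode: {mode!r}")
    limit = MODE_LIMITS[mode]
    return total_records if limit is None else min(total_records, limit)
```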

### `POST /v1/indexer/jobs` — Start async indexer job

Request Body

Identical to [`POST /v1/indexer/run`](#indexer). Summarised here for completeness.

| Field | Type | Required |
| --- | --- | --- |
| records | dict[] | Yes — each with `ID`, `Title`, `Abstract` |
| fields | IndexerField[] | Yes |
| model | string | No (default: `gpt-5-mini`) |
| project\_context | ProjectContext | No |
| mode | "test" \| "sample" \| "full" | No (default: `full`) |
| project\_id | integer | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string | UUID for polling |
| status | string | `"pending"` initially |
| progress | float | 0.0 initially |

Example

```
curl -s -X POST http://localhost:8005/v1/indexer/jobs \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"records": [...], "fields": [...], "mode": "full"}'
# →
{"job_id": "xyz-789", "status": "pending", "progress": 0.0}
```

Response codes: `200`, `400`, `401`, `500`

### `GET /v1/indexer/jobs/{job_id}` — Poll job status + results

Response 200

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string |  |
| status | string | `"pending"`, `"running"`, `"completed"`, `"failed"` |
| progress | float | 0–1 |
| partial\_results | dict[] | Records processed so far |
| errors | string[] |  |
| usage | dict | Token usage to date |
| error | string | Terminal error message (when failed) |
| error\_category | string |  |
| error\_retryable | boolean |  |
| duration\_ms | integer |  |
| estimated\_cost\_usd | float |  |
| model\_version | string |  |
| created\_at | string | ISO 8601 |
| completed\_at | string | ISO 8601 |

Response codes: `200`, `401`, `404` (unknown `job_id`)

### `GET /v1/indexer/jobs` — List recent jobs

Query Parameters

| Param | Type | Default |
| --- | --- | --- |
| limit | integer | 50 |

Response 200 — array of `IndexerJobListItem`

| Field | Type | Description |
| --- | --- | --- |
| job\_id | string |  |
| status | string |  |
| progress | float |  |
| model | string |  |
| record\_count | integer |  |
| duration\_ms | integer |  |
| estimated\_cost\_usd | float |  |
| created\_at | string | ISO 8601 |
| completed\_at | string | ISO 8601 |

Response codes: `200`, `401`

### `GET /v1/indexer/active-job` — Get active job for a project

Query Parameters

| Param | Type | Required |
| --- | --- | --- |
| project\_id | integer | Yes |

Returns the running/pending indexer job for this `project_id`, or `null`.

Response codes: `200`, `401`

### Optional

### `POST /v1/indexer/estimate` — Estimate indexer cost before running

Request Body

| Field | Type | Required |
| --- | --- | --- |
| fields | IndexerField[] | Yes |
| record\_count | integer | Yes |
| model | string | No (default: `gpt-5-mini`) |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| estimated\_input\_tokens | integer |  |
| estimated\_output\_tokens | integer |  |
| estimated\_cost\_usd | float | Based on hardcoded pricing |
| confidence | string | `"approximate"` |
| disclaimer | string | ±30% expected variance |

Example

```
curl -s -X POST http://localhost:8005/v1/indexer/estimate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"fields": [{"name": "study_design", "description": "...", "data_type_primary": "text"}], "record_count": 100}'
# →
{
  "estimated_input_tokens": 32000,
  "estimated_output_tokens": 8500,
  "estimated_cost_usd": 0.0234,
  "confidence": "approximate",
  "disclaimer": "Estimate based on empirical averages..."
}
```

**Caveat:** pricing is hardcoded in `crystallise.llm.cost.DEFAULT_PRICING_PER_1M` and may drift from OpenAI's public pricing. Treat this as a rough sizing, not a bill.

Response codes: `200`, `400`, `401`, `500`

### `POST /v1/indexer/suggest-fields` — AI field suggestion from project context

Request Body

| Field | Type | Required |
| --- | --- | --- |
| project\_context | ProjectContext | No — description + research questions |
| pico | dict | No — PICOS elements from `/criteria/picos` |
| sample\_records | dict[] | No — sample papers for grounding |
| existing\_fields | string[] | No — field names already defined |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| fields | IndexerField[] | Suggested extraction fields |
| warnings | ExtractionWarning[] | Per-field risk flags (e.g. low-signal fields) |

Example

```
curl -s -X POST http://localhost:8005/v1/indexer/suggest-fields \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_context": {"description": "RCTs of exercise for depression"}, "mock": true}'
# →
{
  "fields": [
    {"name": "study_design", "description": "Type of study", "data_type_primary": "text", "examples": ["RCT", "cohort"]},
    {"name": "sample_size", "description": "Number of participants", "data_type_primary": "number"}
  ],
  "warnings": []
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

### `POST /v1/indexer/refine-fields` — AI review of field definitions

Request Body

| Field | Type | Required |
| --- | --- | --- |
| fields | IndexerField[] | Yes — current field set to review |
| project\_context | ProjectContext | No |
| sample\_records | dict[] | No — ground suggestions against real papers |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| suggestions | FieldSuggestion[] | Proposed `add`, `modify`, `remove`, or `merge` actions |

Example

```
curl -s -X POST http://localhost:8005/v1/indexer/refine-fields \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"fields": [{"name": "outcome", "description": "...", "data_type_primary": "text"}]}'
# →
{
  "suggestions": [
    {"action": "modify", "field": {"name": "primary_outcome", ...}, "rationale": "...",
     "original_field_name": "outcome"}
  ]
}
```

Response codes: `200`, `400`, `401`, `429`, `500`
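
One possible way to apply the returned suggestions to a local field list is sketched below. This is an interpretation, not documented behaviour: `apply_suggestions` is a hypothetical helper, it handles only `add`, `modify`, and `remove`, and it sets `merge` (and anything it cannot match) aside for manual review because the merge semantics are not spelled out in this reference.

```python
from typing import Any, Dict, List, Tuple

Field = Dict[str, Any]


def apply_suggestions(fields: List[Field],
                      suggestions: List[Dict[str, Any]]) -> Tuple[List[Field], List[Dict[str, Any]]]:
    """Return (updated_fields, skipped_suggestions) without mutating the inputs."""
    updated = [dict(field) for field in fields]
    skipped: List[Dict[str, Any]] = []
    for suggestion in suggestions:
        action = suggestion.get("action")
        proposed = suggestion.get("field")
        original = suggestion.get("original_field_name")
        # Position of the field the suggestion refers to, if it exists locally.
        index = next((i for i, f in enumerate(updated) if f["name"] == original), None)
        if action == "add" and proposed:
            updated.append(dict(proposed))
        elif action == "modify" and proposed and index is not None:
            updated[index] = dict(proposed)  # keeps the field's position
        elif action == "remove" and index is not None:
            del updated[index]
        else:
            skipped.append(suggestion)  # e.g. merge, or an unknown field name
    return updated, skipped
```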

### `POST /v1/indexer/group-tags` — AI-assisted value grouping

Request Body

| Field | Type | Required |
| --- | --- | --- |
| field\_name | string | Yes — field the values belong to |
| values | string[] | Yes — extracted values to cluster |
| project\_context | ProjectContext | No |
| num\_groups\_hint | integer | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| groups | TagGroup[] | Clustered buckets with labels |
| usage | dict | Token usage |

Example

```
curl -s -X POST http://localhost:8005/v1/indexer/group-tags \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"field_name": "study_design", "values": ["RCT", "randomised controlled trial", "cohort study", "case-control"]}'
# →
{
  "groups": [
    {"name": "Randomised controlled trials", "values": ["RCT", "randomised controlled trial"], "rationale": "..."},
    {"name": "Observational", "values": ["cohort study", "case-control"], "rationale": "..."}
  ],
  "usage": {"total_tokens": 150, "estimated_cost_usd": 0.0002}
}
```

Response codes: `200`, `400`, `401`, `429`, `500`
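
To relabel extracted values with their group, the `groups` array can be inverted into a lookup table. A minimal sketch, with `value_to_group` as a hypothetical helper; keeping the first group when a value appears in more than one is an arbitrary choice made here.

```python
from typing import Any, Dict, List


def value_to_group(groups: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each member value to the name of its group."""
    mapping: Dict[str, str] = {}
    for group in groups:
        for value in group.get("values", []):
            # setdefault keeps the first group if a value is listed twice.
            mapping.setdefault(value, group["name"])
    return mapping
```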

## Criteria AI

Helpers for building and refining the eligibility criteria a screening pipeline runs against. The core endpoint `/analyze-question` checks whether a single research question is PICOS-ready for a literature search; the optional endpoints generate criteria from context, extract PICOS elements, refine project descriptions, or consolidate duplicate criteria. All endpoints are synchronous — one request, one response.

### Core

### `POST /v1/criteria/analyze-question` — PICOS search-readiness check for a single research question

Request Body

| Field | Type | Required |
| --- | --- | --- |
| research\_question | string | Yes |
| model | string | No (default: `gpt-5-mini`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| status | string | `"ready"` or `"could_improve"` |
| missing\_elements | string[] | PICOS elements that are unclear or absent |
| suggestion | string | Actionable improvement or confirmation message |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/analyze-question \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"research_question": "Does exercise help depression?", "mock": true}'
# →
{
  "status": "could_improve",
  "missing_elements": [
    "Population is not specified",
    "Outcome measures are vague"
  ],
  "suggestion": "Mock mode: specify the population, intervention, and primary outcome to make the question searchable. Run without mock for real analysis."
}
```

Example (error — missing required field)

```
{"detail": [{"type": "missing", "loc": ["body", "research_question"], "msg": "Field required"}]}
```

**Demo lineage:** this endpoint mirrors the behaviour of the Streamlit research-question demo (`demo.py`) shared with NetReady earlier. It's the recommended entry point for the "is this question ready for a literature search?" flow.

Response codes: `200`, `400`, `401`, `429`, `500`

### Optional

### `POST /v1/criteria/generate` — Generate criteria from project context

Request Body

| Field | Type | Required |
| --- | --- | --- |
| project\_description | string | Yes |
| research\_questions | string[] | No |
| additional\_notes | string | No |
| existing\_criteria | dict[] | No — for deduplication |
| criterion\_type | "include" \| "exclude" | No (default: `exclude`) |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| criteria | CriterionResponse[] | Generated criteria — see [Shared Types](#types) |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/generate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_description": "RCTs of exercise for depression in adults", "mock": true}'
# →
{
  "criteria": [
    {"category": "Study Design", "text": "Review articles, systematic reviews, meta-analyses", "criterion_type": "exclude", "description": "..."},
    {"category": "Publication Type", "text": "Conference abstracts without full publication", "criterion_type": "exclude", "description": "..."}
  ]
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

### `POST /v1/criteria/picos` — Extract PICOS elements from description

Request Body

| Field | Type | Required |
| --- | --- | --- |
| project\_description | string | Yes |
| research\_questions | string[] | No |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| elements | dict | Keys: `population`, `intervention`, `comparison`, `outcome`, `study_design` |
| gap\_flags | string[] | Missing or ambiguous elements |
| contraindications | dict[] | Potential conflicts between elements |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/picos \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_description": "RCTs of metformin vs placebo in adults with type 2 diabetes", "mock": true}'
# →
{
  "elements": {
    "population": "Adults with the condition described in the project",
    "intervention": "The primary intervention or exposure under review",
    "comparison": "Standard of care, placebo, or no intervention",
    "outcome": "Primary clinical outcomes, efficacy, and safety measures",
    "study_design": "Study designs relevant to the research question"
  },
  "gap_flags": ["Mock mode: PICOS elements are placeholders — run without mock for real extraction"],
  "contraindications": []
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

### `POST /v1/criteria/refine-context` — Improve project description for screening

Request Body

| Field | Type | Required |
| --- | --- | --- |
| description | string | Yes |
| research\_questions | string[] | No |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| refined\_description | string | Improved, more specific project description |
| refined\_research\_questions | string[] | Questions rewritten for search precision |
| explanation | string | Why these refinements were made |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/refine-context \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"description": "Review of drug X", "research_questions": ["Is drug X effective?"], "mock": true}'
# →
{
  "refined_description": "Review of drug X\n\n[Refined for clarity and specificity in systematic review screening.]",
  "refined_research_questions": ["Is drug X effective? [refined for precision]"],
  "explanation": "Mock mode: minor refinements applied as placeholders. Run without mock for real AI refinement."
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

### `POST /v1/criteria/refine` — Refine criteria from conflict patterns

Request Body

| Field | Type | Required |
| --- | --- | --- |
| current\_criteria | dict[] | Yes — the active criteria set |
| conflicts | dict[] | No — AI-vs-human disagreement records |
| project\_description | string | No |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| criteria | CriterionResponse[] | Refined criteria derived from the conflict patterns |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/refine \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{
    "current_criteria": [{"category": "Population", "text": "Adults only"}],
    "conflicts": [{"paper_title": "Study A", "decision_a": "include", "decision_b": "exclude"}],
    "mock": true
  }'
# →
{
  "criteria": [
    {"category": "Study Design", "text": "Exclude retrospective observational studies without a control arm", "criterion_type": "exclude", "confidence": 0.72, "rationale": "Derived from 1 reviewer conflict(s) on study design."},
    {"category": "Outcome Reporting", "text": "Exclude studies that do not report the primary outcome quantitatively", "criterion_type": "exclude", "confidence": 0.65, "rationale": "Pattern across 1 conflict(s) flagged insufficient outcome data."}
  ]
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

### `POST /v1/criteria/consolidate` — Detect duplicates and propose merges

Request Body

| Field | Type | Required |
| --- | --- | --- |
| criteria | dict[] | Yes — criteria to analyse |
| project\_description | string | No |
| research\_questions | string[] | No |
| model | string | No (default: `gpt-4.1`) |
| mock | boolean | No |

Response 200

| Field | Type | Description |
| --- | --- | --- |
| duplicate\_groups | DuplicateGroup[] | Groups of criteria with overlapping scope — see [Shared Types](#types) |
| consolidation\_proposals | ConsolidationProposal[] | Proposed merged criteria |
| warnings | string[] | Low-confidence rejections or notes |

Example (success, mock)

```
curl -s -X POST http://localhost:8005/v1/criteria/consolidate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"criteria": [{"id": 1, "category": "Population", "text": "Adults 18+"}, {"id": 2, "category": "Population", "text": "Adult participants over 18"}], "mock": true}'
# →
{
  "duplicate_groups": [],
  "consolidation_proposals": [],
  "warnings": ["Mock mode: no consolidation performed"]
}
```

Response codes: `200`, `400`, `401`, `429`, `500`

## Error Responses

The same status codes and body shapes apply everywhere — read this section once and cross-reference from each endpoint. Classified LLM errors carry a structured `error_code` and a `retryable` flag so client code can decide whether to back off, surface the message, or abort. Async jobs additionally report terminal errors *inside* the job response rather than as HTTP error codes.

| HTTP | `error_code` | Retryable | When you see it |
| --- | --- | --- | --- |
| 400 | validation | no | Malformed request body, missing required field, Pydantic validation failed |
| 401 | auth | no | Missing/invalid `X-API-Key`, or invalid `X-OpenAI-API-Key` |
| 404 | — | no | Resource not found (e.g. unknown `job_id`) |
| 429 | rate\_limit | yes | OpenAI rate limit — retry with exponential backoff |
| 500 | unknown | — | Unexpected server error |
| 503 | — | — | `/health/ready` only, when DB or OpenAI key check fails |

Standard body — classified LLM error

```
{
  "detail": {
    "message": "Rate limit exceeded",
    "error_code": "rate_limit",
    "retryable": true
  }
}
```

Standard body — FastAPI validation / missing resource

```
{ "detail": "Field required: research_question" }
```

Async job in-body failure (screening, indexer)

```
{
  "job_id": "abc-123",
  "status": "failed",
  "error": "Invalid OpenAI key",
  "error_category": "auth",
  "error_retryable": false
}
```

Async jobs report terminal errors inside the job response (HTTP **200**), not as HTTP error codes. Poll the job and check `status === "failed"` + `error_category`.
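
Client code has to cope with several body shapes: the classified `detail` object, a plain-string `detail`, the list-style `detail` shown in the `analyze-question` example, and failed job bodies. The sketch below normalises them and computes exponential backoff delays. `classify_error` and `backoff_delays` are hypothetical helpers, the base and cap values are assumptions, and treating an unclassified `429` as retryable is inferred from the table above.

```python
from typing import Any, Dict, List, Optional, Tuple


def classify_error(status_code: int, body: Dict[str, Any]) -> Tuple[str, Optional[str], bool]:
    """Return (message, error_code, retryable) for an error or failed-job body."""
    # Failed async job: HTTP 200 with the error carried inside the job body.
    if body.get("status") == "failed":
        return (str(body.get("error", "")), body.get("error_category"),
                bool(body.get("error_retryable", False)))
    detail = body.get("detail")
    if isinstance(detail, dict):  # classified LLM error
        return (str(detail.get("message", "")), detail.get("error_code"),
                bool(detail.get("retryable", False)))
    if isinstance(detail, list):  # list of validation items
        parts = [str(item.get("msg", item)) if isinstance(item, dict) else str(item)
                 for item in detail]
        return "; ".join(parts), None, False
    message = str(detail) if detail is not None else f"HTTP {status_code}"
    return message, None, status_code == 429


def backoff_delays(attempts: int, base_s: float = 1.0, cap_s: float = 30.0) -> List[float]:
    """Exponential backoff schedule: base, 2x, 4x, ... capped at cap_s."""
    return [min(cap_s, base_s * (2 ** attempt)) for attempt in range(attempts)]
```

A caller would retry only when the third element is true, sleeping for each delay in turn, and surface the message otherwise.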

## Shared Types

Data types referenced by multiple endpoints. These mirror the Pydantic models in `api/schemas/` (source of truth) — documented once here to avoid per-endpoint repetition.

### IndexerField

| Field | Type | Description |
| --- | --- | --- |
| name | string | Field identifier (e.g. `study_design`) |
| description | string | What the AI should extract |
| data\_type\_primary | string | `text`, `number`, `yes_no`, `list_text`, `list_number` |
| data\_type\_secondary | string | Optional. Sub-type qualifier (default `NA`) |
| examples | string[] | Optional. Example values |
| examples\_mode | "guide" \| "enum" | Optional. "guide" = suggestions; "enum" = strict list |
| depth | "minimal" \| "full" | Optional. Extraction effort level |
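
A client could sanity-check field definitions against this table before sending them. This is a sketch, not the service's validator: `field_problems` is a hypothetical helper, and treating `name`, `description`, and `data_type_primary` as required is inferred from the other fields being described as optional.

```python
from typing import Any, Dict, List

ALLOWED_PRIMARY_TYPES = ("text", "number", "yes_no", "list_text", "list_number")
ALLOWED_EXAMPLES_MODES = ("guide", "enum")
ALLOWED_DEPTHS = ("minimal", "full")


def field_problems(field: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in one IndexerField dict."""
    problems: List[str] = []
    for key in ("name", "description", "data_type_primary"):
        if not field.get(key):
            problems.append(f"missing {key}")
    primary = field.get("data_type_primary")
    if primary and primary not in ALLOWED_PRIMARY_TYPES:
        problems.append(f"unknown data_type_primary: {primary}")
    mode = field.get("examples_mode")
    if mode is not None and mode not in ALLOWED_EXAMPLES_MODES:
        problems.append(f"unknown examples_mode: {mode}")
    depth = field.get("depth")
    if depth is not None and depth not in ALLOWED_DEPTHS:
        problems.append(f"unknown depth: {depth}")
    return problems
```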

### ProjectContext

| Field | Type | Description |
| --- | --- | --- |
| description | string | Free-text project description |
| research\_questions | string[] |  |

### CriterionResponse

| Field | Type | Description |
| --- | --- | --- |
| category | string | PICOS category (`Population`, `Intervention`, `Outcome`, etc.) |
| text | string | The criterion itself |
| description | string | Expanded definition |
| criterion\_type | "include" \| "exclude" |  |
| confidence | float | Optional. 0–1 AI confidence |
| rationale | string | Optional. Why this criterion was suggested |
| title\_abstract\_assessable | boolean | Whether the criterion can be decided from title/abstract alone |

### DuplicateGroup

| Field | Type | Description |
| --- | --- | --- |
| group\_type | string | e.g. `"exact"`, `"semantic"` |
| category | string | PICOS category these criteria share |
| criterion\_ids | integer[] | IDs of criteria in this group |
| recommended\_primary\_id | integer | Which criterion to keep |
| merge\_rationale | string |  |
| ai\_confidence | float | 0–1; groups below 0.75 are filtered out server-side |

### ConsolidationProposal

| Field | Type | Description |
| --- | --- | --- |
| category | string |  |
| criterion\_ids | integer[] | Criteria to merge |
| proposed\_merged\_criterion | string | New label — rejected server-side if > 10 words |
| proposed\_description | string |  |
| proposed\_type | "include" \| "exclude" |  |
| merge\_rationale | string |  |
| ai\_confidence | float | 0–1; proposals below 0.75 are filtered out server-side |
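
The two server-side filters described in this table can be mirrored locally, for example when building test fixtures. This is only a sketch: `likely_kept` is a hypothetical helper, and counting words by splitting on whitespace is an assumption about how the server measures label length.

```python
from typing import Any, Dict

MIN_CONFIDENCE = 0.75   # proposals below this are filtered out server-side
MAX_LABEL_WORDS = 10    # labels longer than this are rejected server-side


def likely_kept(proposal: Dict[str, Any]) -> bool:
    """Approximate whether a ConsolidationProposal would survive both filters."""
    if proposal.get("ai_confidence", 0.0) < MIN_CONFIDENCE:
        return False
    label = proposal.get("proposed_merged_criterion", "")
    return len(label.split()) <= MAX_LABEL_WORDS
```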

### TagGroup

| Field | Type | Description |
| --- | --- | --- |
| name | string | Group label |
| values | string[] | Member values |
| rationale | string | Optional. Why these values cluster together |

### ExtractionWarning

| Field | Type | Description |
| --- | --- | --- |
| field | string | Field name the warning applies to |
| risk\_level | "low" \| "medium" \| "high" | Default `medium` |
| reason | string | Why the field is at risk (ambiguous, hard to extract from title/abstract, etc.) |
| suggested\_fallback | string | Recommended mitigation |

### FieldSuggestion

| Field | Type | Description |
| --- | --- | --- |
| action | "add" \| "modify" \| "remove" \| "merge" |  |
| field | IndexerField | The proposed (new or revised) field |
| rationale | string |  |
| original\_field\_name | string | Optional. For `modify`/`remove`/`merge`: the field this applies to |
| target\_field\_name | string | Optional. For `merge`: the name to merge into |