Crystallise AI Backend API Reference

Stateless AI processing service for Evidence Mapper integration

Base URL: http://localhost:8005  |  All routes at /v1/* and /*

This reference covers every HTTP endpoint on the Crystallise AI backend. The service provides three AI capabilities for systematic literature reviews — AI Screening (scoring studies against eligibility criteria), AutoIndexer (extracting structured fields from title/abstract), and Criteria AI (building/refining eligibility criteria and analysing research questions). Every endpoint is stateless: Evidence Mapper owns all persistent data. Start with Health to verify the service is reachable, then Configuration for service defaults, then the three capability sections.

Service Auth

X-API-Key: <key> or Authorization: Bearer <key> on all non-public requests.

OpenAI Key

X-OpenAI-API-Key: sk-... per-request passthrough. Falls back to server env var.
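
The two header schemes above can be assembled client-side along these lines. This is a minimal Python sketch: build_headers is a hypothetical helper, not part of the service, and the Content-Type default is an assumption based on the JSON bodies used throughout this reference.

```python
from typing import Dict, Optional


def build_headers(api_key: str,
                  openai_key: Optional[str] = None,
                  use_bearer: bool = False) -> Dict[str, str]:
    """Build headers for a non-public endpoint (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Alternative service-auth form: Authorization: Bearer <key>
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        headers["X-API-Key"] = api_key
    if openai_key:
        # Per-request passthrough; when omitted, the server falls back to its env var.
        headers["X-OpenAI-API-Key"] = openai_key
    return headers
```

Sending only one of the two service-auth forms per request keeps it unambiguous which credential the server evaluated.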

Async Jobs

Screening and Indexer batches: POST /jobs, poll GET /jobs/{id}.

Sync Jobs

Criteria AI and POST /indexer/run: one request, one response. No polling needed.

Mock Mode

Every mutating endpoint accepts "mock": true — canned data, no OpenAI call, no key needed.

Health

Public endpoints for uptime monitoring. /health is a liveness probe (is the process running?); /health/ready also verifies database connectivity and OpenAI-key presence. Neither requires authentication.

GET /health Liveness check (no auth)

Returns { "status": "ok" }. No checks — "am I running?" only.

Response codes: 200
GET /health/ready Readiness probe (no auth)
# 200 (all checks pass)
{"status": "ready", "checks": {"database": "ok", "openai_key": "configured"}}
# 503 (at least one check failed)
{"status": "degraded", "checks": {"database": "error: connection refused", "openai_key": "missing"}}
Response codes: 200, 503
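
A monitoring client might reduce the readiness body to a pass/fail flag plus the names of the failing checks. The Python sketch below assumes the only healthy check values are "ok" and "configured", as in the 200 example; summarise_readiness is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple

# Assumption: these are the only healthy values, taken from the 200 example.
HEALTHY_VALUES = {"ok", "configured"}


def summarise_readiness(status_code: int, body: Dict[str, Any]) -> Tuple[bool, List[str]]:
    """Return (ready, failing_check_names) for a /health/ready response."""
    checks = body.get("checks", {})
    failing = sorted(name for name, value in checks.items()
                     if value not in HEALTHY_VALUES)
    ready = status_code == 200 and body.get("status") == "ready" and not failing
    return ready, failing
```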

Configuration

Inspect and (carefully) update runtime model, temperature, and prompt settings for each AI service. The GET endpoints list service configs and the centralised prompt registry; the single PUT is for admin-level tuning. Everything here is metadata: no user data flows through.

GET /v1/config/services List service configurations
Field               Type     Description
service_id          string   e.g. screening, extraction, criteria
model               string
system_prompt       string   Current prompt (may be inherited from registry)
prompt_template_id  string
temperature         float
max_output_tokens   integer
extra               dict     Service-specific knobs
Response codes: 200, 401
GET /v1/config/services/{service_id} Get service config

Returns a single ServiceConfigResponse. Unknown service_id returns a default config rather than 404.

Response codes: 200, 401
PUT /v1/config/services/{service_id} Update service config (partial)
Field               Type     Required
model               string   No
system_prompt       string   No
prompt_template_id  string   No
temperature         float    No
max_output_tokens   integer  No
extra               dict     No

Returns the updated ServiceConfigResponse (same fields as GET).

curl -s -X PUT http://localhost:8005/v1/config/services/screening \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-mini", "temperature": 0.2}'
# →
{
  "service_id": "screening",
  "model": "gpt-5-mini",
  "temperature": 0.2,
  "max_output_tokens": 8192,
  ...
}
Response codes: 200, 400, 401, 500
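
Because the PUT is partial, a client can send only the fields it wants to change. The Python sketch below assumes that omitted fields keep their current server-side values, which is what the example above suggests (max_output_tokens comes back unchanged); build_config_update is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def build_config_update(model: Optional[str] = None,
                        system_prompt: Optional[str] = None,
                        prompt_template_id: Optional[str] = None,
                        temperature: Optional[float] = None,
                        max_output_tokens: Optional[int] = None,
                        extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a partial PUT body containing only the fields that were set."""
    candidate = {
        "model": model,
        "system_prompt": system_prompt,
        "prompt_template_id": prompt_template_id,
        "temperature": temperature,
        "max_output_tokens": max_output_tokens,
        "extra": extra,
    }
    # Compare against None so that legitimate falsy values (temperature 0.0) survive.
    return {key: value for key, value in candidate.items() if value is not None}
```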
GET /v1/config/prompts List AI prompt definitions
Field           Type     Description
name            string   e.g. criteria.question_analysis
service         string   screening, criteria, indexer
description     string   One-line purpose
has_variables   boolean  Whether the prompt has templated parameters
system_or_user  string   system, user, or both
Response codes: 200, 401

AI Screening

Title/abstract screening pipeline in four stages — scoring each paper against eligibility criteria across multiple AI "repetitions", generating human-readable reasoning, grouping that reasoning into thematic clusters, and assigning each paper to a cluster. Designed for batch runs: submit a job, poll for results. Typical throughput: hundreds to low-thousands of papers per run.

Core

POST /v1/screening/jobs Start async screening job
Request body
Field                   Type                   Required
papers                  dict[]                 Yes; each with id, title, abstract
criteria                dict[]                 No
questions               string[]               No
model                   string                 No (default: gpt-5-nano)
repetitions             integer                No (default: 5)
threshold               float                  No (default: 1.0)
clusters_type           "include" | "exclude"  No
project_id              integer                No; opaque correlation key from EM
mock                    boolean                No; deterministic scoring, no OpenAI call
max_estimated_cost_usd  float                  No; job rejected with 400 if estimate exceeds this

Response
Field     Type    Description
job_id    string  UUID for polling
status    string  "pending" initially
progress  float   0.0 initially; 0–1 once running
stage     string  Empty initially; "labelling", "reasoning", "clustering", "assignment" while running
# Request
curl -s -X POST http://localhost:8005/v1/screening/jobs \
  -H "X-API-Key: dev-key" \
  -H "Content-Type: application/json" \
  -d '{
    "papers": [
      {"id": "p1", "title": "RCT of drug X in adults", "abstract": "Randomized trial..."}
    ],
    "criteria": [{"name": "Population", "type": "include", "value": "Adults"}],
    "questions": ["Is drug X effective?"],
    "mock": true
  }'

# Response (200)
{"job_id": "abc-123", "status": "pending", "progress": 0.0, "stage": ""}

# Response (400, estimate exceeds max_estimated_cost_usd)
{"detail": "Estimated cost $1.23 exceeds max_estimated_cost_usd=$0.50"}

4-stage pipeline: labelling → reasoning → clustering → assignment. Poll GET /v1/screening/jobs/{job_id} for progress.
Response codes: 200, 400, 401, 500
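
Checking the request body before submission can avoid a round trip that ends in a 400. The Python sketch below assembles a body from the documented fields and verifies that every paper carries id, title, and abstract. build_screening_job is a hypothetical client-side helper, and whether the server enforces exactly the same check is an assumption.

```python
from typing import Any, Dict, List, Optional

REQUIRED_PAPER_KEYS = ("id", "title", "abstract")


def build_screening_job(papers: List[Dict[str, Any]],
                        criteria: Optional[List[Dict[str, Any]]] = None,
                        questions: Optional[List[str]] = None,
                        mock: bool = False,
                        max_estimated_cost_usd: Optional[float] = None) -> Dict[str, Any]:
    """Assemble a body for POST /v1/screening/jobs (hypothetical helper)."""
    for index, paper in enumerate(papers):
        missing = [key for key in REQUIRED_PAPER_KEYS if key not in paper]
        if missing:
            raise ValueError(f"papers[{index}] is missing: {', '.join(missing)}")
    payload: Dict[str, Any] = {"papers": papers}
    # Optional fields are only included when the caller supplied them.
    if criteria is not None:
        payload["criteria"] = criteria
    if questions is not None:
        payload["questions"] = questions
    if mock:
        payload["mock"] = True
    if max_estimated_cost_usd is not None:
        payload["max_estimated_cost_usd"] = max_estimated_cost_usd
    return payload
```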
GET /v1/screening/jobs/{job_id} Poll job status + results
Field               Type     Description
job_id              string
status              string   "pending", "running", "completed", "failed"
progress            float    0–1
stage               string   Current pipeline stage
results             dict[]   Per-paper scores + reasoning (when completed)
clusters            dict[]   Reason clusters (when completed)
error               string   Error message (when failed)
error_category      string   Same taxonomy as error_code
error_retryable     boolean
duration_ms         integer  Wall-clock duration
estimated_cost_usd  float    Final cost (approximate)
stage_timings       dict     Per-stage duration in ms
curl -s http://localhost:8005/v1/screening/jobs/abc-123 -H "X-API-Key: dev-key"
# →
{
  "job_id": "abc-123",
  "status": "completed",
  "progress": 1.0,
  "stage": "assignment",
  "results": [{"id": "p1", "final_score": 4.2, "cluster_id": 1, "reasoning": "..."}],
  "clusters": [{"cluster_id": 1, "label": "Eligible RCTs", "count": 1}],
  "duration_ms": 3421,
  "estimated_cost_usd": 0.002
}
Response codes: 200, 401, 404 (unknown job_id)
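
The submit-then-poll flow can be sketched as a small loop that stops on either terminal status. In this Python sketch the HTTP call is injected as fetch_job so the loop stays independent of any HTTP library. poll_job, the 2-second interval, and the 10-minute timeout are illustrative assumptions, not service recommendations.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(fetch_job: Callable[[str], Dict[str, Any]],
             job_id: str,
             interval_s: float = 2.0,
             timeout_s: float = 600.0,
             sleep: Callable[[float], Any] = time.sleep,
             clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]:
    """Call fetch_job(job_id) until the job completes or fails, then return it."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r} after {timeout_s}s")
        sleep(interval_s)
```

A caller would pass a function that issues GET /v1/screening/jobs/{job_id} and returns the parsed JSON. A job that comes back with status "failed" is returned rather than raised, so the caller can inspect error_category and error_retryable.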
GET /v1/screening/jobs List recent jobs
Query parameters
Param  Type     Default
limit  integer  50

Response
Field               Type     Description
job_id              string
status              string
progress            float
stage               string
papers_count        integer
model               string
project_id          integer  From EM, if provided at create
duration_ms         integer
estimated_cost_usd  float
created_at          string   ISO 8601 timestamp
completed_at        string   ISO 8601 timestamp (when completed)
Response codes: 200, 401
GET /v1/screening/active-job Get active job for a project
Param       Type     Required
project_id  integer  Yes

Returns the running/pending job for this project_id, or null. Use to dedupe: EM should not start a new job while one is running for the same project.

Response codes: 200, 401
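
The dedupe rule above reduces to a single predicate on the active-job response. A minimal Python sketch, assuming the endpoint's JSON null is parsed to None and that a returned job carries the usual status field; can_start_new_job is a hypothetical helper.

```python
from typing import Any, Dict, Optional

ACTIVE_STATUSES = {"pending", "running"}


def can_start_new_job(active_job: Optional[Dict[str, Any]]) -> bool:
    """True when no pending or running job exists for the project."""
    if active_job is None:
        return True
    return active_job.get("status") not in ACTIVE_STATUSES
```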

Optional

POST /v1/screening/estimate Estimate cost before running
Request body
Field           Type     Required
papers_count    integer  Yes
model           string   No (default: gpt-5-nano)
repetitions     integer  No (default: 5)
criteria_count  integer  No (default: 0)

Response
Field                    Type     Description
estimated_input_tokens   integer  Sum across labelling + reasoning stages
estimated_output_tokens  integer
estimated_cost_usd       float    Based on hardcoded model pricing
model                    string   Echoes request
papers_count             integer  Echoes request
repetitions              integer  Echoes request
confidence               string   "approximate"
disclaimer               string   Expected variance (±30%)
curl -s -X POST http://localhost:8005/v1/screening/estimate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"model": "gpt-5-nano", "papers_count": 500, "repetitions": 5, "criteria_count": 10}'
# →
{
  "estimated_input_tokens": 775000,
  "estimated_output_tokens": 75000,
  "estimated_cost_usd": 0.0625,
  "model": "gpt-5-nano",
  "papers_count": 500,
  "repetitions": 5,
  "confidence": "approximate",
  "disclaimer": "Estimate based on empirical averages. Actual cost may vary +-30%..."
}
Caveat: pricing is hardcoded in crystallise.llm.cost.DEFAULT_PRICING_PER_1M and may drift from OpenAI's public pricing over time. Treat estimated_cost_usd as approximate and cross-check against OpenAI's current rates before relying on it for budget caps.
Response codes: 200, 400, 401, 500
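
Since the estimate carries roughly ±30% variance, a budget check can compare the upper bound, not the point estimate, against the cap. The Python sketch below does that arithmetic. cost_bounds and within_budget are hypothetical helpers, and testing the upper bound is a conservative design choice rather than documented service behaviour.

```python
from typing import Tuple


def cost_bounds(estimated_cost_usd: float, variance: float = 0.30) -> Tuple[float, float]:
    """Return (low, high) around the estimate for the stated variance."""
    return (estimated_cost_usd * (1.0 - variance),
            estimated_cost_usd * (1.0 + variance))


def within_budget(estimated_cost_usd: float, budget_usd: float,
                  variance: float = 0.30) -> bool:
    """True when even the high end of the estimate fits within the budget."""
    _, high = cost_bounds(estimated_cost_usd, variance)
    return high <= budget_usd
```

For the example response above (estimated_cost_usd 0.0625), the bounds work out to about 0.044 and 0.081 USD, so a 0.07 cap would fail this check even though the point estimate fits.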

AutoIndexer

Structured data extraction from title + abstract. Define your extraction fields (by hand, or via the optional AI-suggest / AI-refine helpers), submit a batch, and receive per-paper values with evidence spans (the quote that justified each value) and per-field confidence scores. Use POST /run for small batches synchronously or POST /jobs for larger batches asynchronously.

Core

POST /v1/indexer/run Synchronous extraction (small batches)
Request body
Field            Type                        Required
records          dict[]                      Yes; each with ID, Title, Abstract
fields           IndexerField[]              Yes; see Shared Types
model            string                      No (default: gpt-5-mini)
project_context  ProjectContext              No; {description, research_questions}
mode             "test" | "sample" | "full"  No (default: full). test processes first 5 records, sample 20, full all.
max_workers      integer                     No (default: 4)
batch_size       integer                     No (default: 50)
project_id       integer                     No; opaque correlation key

Response
Field          Type      Description
results        dict[]    One per record with extracted field values + evidence + confidence
errors         string[]  Per-record error messages
usage          dict      Token usage: {input_tokens, output_tokens, total_tokens, estimated_cost_usd}
model_version  string    Actual OpenAI model string returned (may include date suffix)
curl -s -X POST http://localhost:8005/v1/indexer/run \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{
    "records": [{"ID": "p1", "Title": "RCT of drug X", "Abstract": "150 adults..."}],
    "fields": [
      {"name": "study_design", "description": "Type of study", "data_type_primary": "text"},
      {"name": "sample_size", "description": "Number of participants", "data_type_primary": "number"}
    ],
    "mode": "test"
  }'
# →
{
  "results": [{
    "ID": "p1",
    "indexing_status": "ok",
    "study_design": {"value": "RCT", "confidence": 0.95, "evidence": [...]},
    "sample_size": {"value": 150, "confidence": 0.9, "evidence": [...]}
  }],
  "errors": [],
  "usage": {"input_tokens": 320, "output_tokens": 85, "total_tokens": 405, "estimated_cost_usd": 0.0002},
  "model_version": "gpt-5-mini-2025-02-01"
}
# 400 (validation error)
{"detail": {"message": "...", "error_code": "validation", "retryable": false}}
Response codes: 200, 400, 401, 429, 500
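
Each extracted field arrives as a {value, confidence, evidence} object, so a consumer may want to keep only the values it trusts. The Python sketch below flattens one result entry, following the shape of the example response. confident_values is a hypothetical helper, the 0.8 cut-off is an arbitrary illustration, and treating ID and indexing_status as the only bookkeeping keys is an assumption.

```python
from typing import Any, Dict

# Assumption: the only non-field keys are those seen in the example response.
NON_FIELD_KEYS = {"ID", "indexing_status"}


def confident_values(result: Dict[str, Any], min_confidence: float = 0.8) -> Dict[str, Any]:
    """Return {field_name: value} for fields at or above min_confidence."""
    values: Dict[str, Any] = {}
    for key, cell in result.items():
        if key in NON_FIELD_KEYS or not isinstance(cell, dict):
            continue
        # A field without a confidence score is treated as 0.0 and dropped.
        if cell.get("confidence", 0.0) >= min_confidence:
            values[key] = cell.get("value")
    return values
```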
POST /v1/indexer/jobs Start async indexer job

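Accepts the same request fields as POST /v1/indexer/run, but returns immediately with a job handle instead of the extraction results. The fields are summarised here for completeness.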

Request body
Field            Type                        Required
records          dict[]                      Yes; each with ID, Title, Abstract
fields           IndexerField[]              Yes
model            string                      No (default: gpt-5-mini)
project_context  ProjectContext              No
mode             "test" | "sample" | "full"  No (default: full)
project_id       integer                     No

Response
Field     Type    Description
job_id    string  UUID for polling
status    string  "pending" initially
progress  float   0.0 initially
curl -s -X POST http://localhost:8005/v1/indexer/jobs \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"records": [...], "fields": [...], "mode": "full"}'
# →
{"job_id": "xyz-789", "status": "pending", "progress": 0.0}
Response codes: 200, 400, 401, 500
GET /v1/indexer/jobs/{job_id} Poll job status + results
Field               Type      Description
job_id              string
status              string    "pending", "running", "completed", "failed"
progress            float     0–1
partial_results     dict[]    Records processed so far
errors              string[]
usage               dict      Token usage to date
error               string    Terminal error message (when failed)
error_category      string
error_retryable     boolean
duration_ms         integer
estimated_cost_usd  float
model_version       string
created_at          string    ISO 8601
completed_at        string    ISO 8601
Response codes: 200, 401, 404 (unknown job_id)
GET /v1/indexer/jobs List recent jobs
Query parameters
Param  Type     Default
limit  integer  50

Response
Field               Type     Description
job_id              string
status              string
progress            float
model               string
record_count        integer
duration_ms         integer
estimated_cost_usd  float
created_at          string   ISO 8601
completed_at        string   ISO 8601
Response codes: 200, 401
GET /v1/indexer/active-job Get active job for a project
Param       Type     Required
project_id  integer  Yes

Returns the running/pending indexer job for this project_id, or null.

Response codes: 200, 401

Optional

POST /v1/indexer/estimate Estimate indexer cost before running
Request body
Field         Type            Required
fields        IndexerField[]  Yes
record_count  integer         Yes
model         string          No (default: gpt-5-mini)

Response
Field                    Type     Description
estimated_input_tokens   integer
estimated_output_tokens  integer
estimated_cost_usd       float    Based on hardcoded pricing
confidence               string   "approximate"
disclaimer               string   ±30% expected variance
curl -s -X POST http://localhost:8005/v1/indexer/estimate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"fields": [{"name": "study_design", "description": "...", "data_type_primary": "text"}], "record_count": 100}'
# →
{
  "estimated_input_tokens": 32000,
  "estimated_output_tokens": 8500,
  "estimated_cost_usd": 0.0234,
  "confidence": "approximate",
  "disclaimer": "Estimate based on empirical averages..."
}
Caveat: pricing is hardcoded in crystallise.llm.cost.DEFAULT_PRICING_PER_1M and may drift from OpenAI's public pricing. Treat this as a rough sizing, not a bill.
Response codes: 200, 400, 401, 500
POST /v1/indexer/suggest-fields AI field suggestion from project context
Request body
Field            Type            Required
project_context  ProjectContext  No; description + research questions
pico             dict            No; PICOS elements from /criteria/picos
sample_records   dict[]          No; sample papers for grounding
existing_fields  string[]        No; field names already defined
model            string          No (default: gpt-4.1)
mock             boolean         No

Response
Field     Type                 Description
fields    IndexerField[]       Suggested extraction fields
warnings  ExtractionWarning[]  Per-field risk flags (e.g. low-signal fields)
curl -s -X POST http://localhost:8005/v1/indexer/suggest-fields \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_context": {"description": "RCTs of exercise for depression"}, "mock": true}'
# →
{
  "fields": [
    {"name": "study_design", "description": "Type of study", "data_type_primary": "text", "examples": ["RCT", "cohort"]},
    {"name": "sample_size", "description": "Number of participants", "data_type_primary": "number"}
  ],
  "warnings": []
}
Response codes: 200, 400, 401, 429, 500
POST /v1/indexer/refine-fields AI review of field definitions
Request body
Field            Type            Required
fields           IndexerField[]  Yes; current field set to review
project_context  ProjectContext  No
sample_records   dict[]          No; ground suggestions against real papers

Response
Field        Type               Description
suggestions  FieldSuggestion[]  Proposed add, modify, remove, or merge actions
curl -s -X POST http://localhost:8005/v1/indexer/refine-fields \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"fields": [{"name": "outcome", "description": "...", "data_type_primary": "text"}]}'
# →
{
  "suggestions": [
    {"action": "modify", "field": {"name": "primary_outcome", ...}, "rationale": "...",
     "original_field_name": "outcome"}
  ]
}
Response codes: 200, 400, 401, 429, 500
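
One way a client could apply accepted suggestions to its local field list is sketched below in Python. apply_suggestions is a hypothetical helper. It handles add, modify, and remove using original_field_name as the target, and deliberately skips merge, whose exact semantics this reference does not spell out.

```python
from typing import Any, Dict, List


def apply_suggestions(fields: List[Dict[str, Any]],
                      suggestions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return a new field list with add/modify/remove suggestions applied."""
    result = [dict(field) for field in fields]  # copy so the input is not mutated
    for suggestion in suggestions:
        action = suggestion["action"]
        target = suggestion.get("original_field_name")
        if action == "add":
            result.append(dict(suggestion["field"]))
        elif action == "modify":
            result = [dict(suggestion["field"]) if f["name"] == target else f
                      for f in result]
        elif action == "remove":
            result = [f for f in result if f["name"] != target]
        # "merge" is left for manual review (semantics assumed unknown here).
    return result
```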
POST /v1/indexer/group-tags AI-assisted value grouping
Request body
Field            Type            Required
field_name       string          Yes; field the values belong to
values           string[]        Yes; extracted values to cluster
project_context  ProjectContext  No
num_groups_hint  integer         No

Response
Field   Type        Description
groups  TagGroup[]  Clustered buckets with labels
usage   dict        Token usage
curl -s -X POST http://localhost:8005/v1/indexer/group-tags \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"field_name": "study_design", "values": ["RCT", "randomised controlled trial", "cohort study", "case-control"]}'
# →
{
  "groups": [
    {"name": "Randomised controlled trials", "values": ["RCT", "randomised controlled trial"], "rationale": "..."},
    {"name": "Observational", "values": ["cohort study", "case-control"], "rationale": "..."}
  ],
  "usage": {"total_tokens": 150, "estimated_cost_usd": 0.0002}
}
Response codes: 200, 400, 401, 429, 500
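
To relabel extracted values with their group, the groups array can be inverted into a value-to-group lookup. A minimal Python sketch: build_value_lookup is a hypothetical helper, and letting the first group win when a value appears in more than one group is an assumption.

```python
from typing import Any, Dict, List


def build_value_lookup(groups: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each raw value to the name of the group that contains it."""
    lookup: Dict[str, str] = {}
    for group in groups:
        for value in group["values"]:
            # setdefault keeps the first assignment if a value repeats.
            lookup.setdefault(value, group["name"])
    return lookup
```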

Criteria AI

Helpers for building and refining the eligibility criteria a screening pipeline runs against. The core endpoint /analyze-question checks whether a single research question is PICOS-ready for a literature search; the optional endpoints generate criteria from context, extract PICOS elements, refine project descriptions, or consolidate duplicate criteria. All endpoints are synchronous — one request, one response.

Core

POST /v1/criteria/analyze-question PICOS search-readiness check for a single research question
Request body
Field              Type     Required
research_question  string   Yes
model              string   No (default: gpt-5-mini)
mock               boolean  No

Response
Field             Type      Description
status            string    "ready" or "could_improve"
missing_elements  string[]  PICOS elements that are unclear or absent
suggestion        string    Actionable improvement or confirmation message
curl -s -X POST http://localhost:8005/v1/criteria/analyze-question \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"research_question": "Does exercise help depression?", "mock": true}'
# →
{
  "status": "could_improve",
  "missing_elements": [
    "Population is not specified",
    "Outcome measures are vague"
  ],
  "suggestion": "Mock mode: specify the population, intervention, and primary outcome to make the question searchable. Run without mock for real analysis."
}
# 400 (research_question missing)
{"detail": [{"type": "missing", "loc": ["body", "research_question"], "msg": "Field required"}]}
Demo lineage: this endpoint mirrors the behaviour of the Streamlit research-question demo (demo.py) shared with NetReady earlier. It's the recommended entry point for the "is this question ready for a literature search?" flow.
Response codes: 200, 400, 401, 429, 500

Optional

POST /v1/criteria/generate Generate criteria from project context
Request body
Field                Type                   Required
project_description  string                 Yes
research_questions   string[]               No
additional_notes     string                 No
existing_criteria    dict[]                 No; for deduplication
criterion_type       "include" | "exclude"  No (default: exclude)
model                string                 No (default: gpt-4.1)
mock                 boolean                No

Response
Field     Type                 Description
criteria  CriterionResponse[]  Generated criteria; see Shared Types
curl -s -X POST http://localhost:8005/v1/criteria/generate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_description": "RCTs of exercise for depression in adults", "mock": true}'
# →
{
  "criteria": [
    {"category": "Study Design", "text": "Review articles, systematic reviews, meta-analyses", "criterion_type": "exclude", "description": "..."},
    {"category": "Publication Type", "text": "Conference abstracts without full publication", "criterion_type": "exclude", "description": "..."}
  ]
}
Response codes: 200, 400, 401, 429, 500
POST /v1/criteria/picos Extract PICOS elements from description
Request body
Field                Type      Required
project_description  string    Yes
research_questions   string[]  No
model                string    No (default: gpt-4.1)
mock                 boolean   No

Response
Field              Type      Description
elements           dict      Keys: population, intervention, comparison, outcome, study_design
gap_flags          string[]  Missing or ambiguous elements
contraindications  dict[]    Potential conflicts between elements
curl -s -X POST http://localhost:8005/v1/criteria/picos \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"project_description": "RCTs of metformin vs placebo in adults with type 2 diabetes", "mock": true}'
# →
{
  "elements": {
    "population": "Adults with the condition described in the project",
    "intervention": "The primary intervention or exposure under review",
    "comparison": "Standard of care, placebo, or no intervention",
    "outcome": "Primary clinical outcomes, efficacy, and safety measures",
    "study_design": "Study designs relevant to the research question"
  },
  "gap_flags": ["Mock mode: PICOS elements are placeholders — run without mock for real extraction"],
  "contraindications": []
}
Response codes: 200, 400, 401, 429, 500
POST /v1/criteria/refine-context Improve project description for screening
Request body
Field               Type      Required
description         string    Yes
research_questions  string[]  No
model               string    No (default: gpt-4.1)
mock                boolean   No

Response
Field                       Type      Description
refined_description         string    Improved, more specific project description
refined_research_questions  string[]  Questions rewritten for search precision
explanation                 string    Why these refinements were made
curl -s -X POST http://localhost:8005/v1/criteria/refine-context \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"description": "Review of drug X", "research_questions": ["Is drug X effective?"], "mock": true}'
# →
{
  "refined_description": "Review of drug X\n\n[Refined for clarity and specificity in systematic review screening.]",
  "refined_research_questions": ["Is drug X effective? [refined for precision]"],
  "explanation": "Mock mode: minor refinements applied as placeholders. Run without mock for real AI refinement."
}
Response codes: 200, 400, 401, 429, 500
POST /v1/criteria/refine Refine criteria from conflict patterns
Request body
Field                Type     Required
current_criteria     dict[]   Yes; the active criteria set
conflicts            dict[]   No; AI-vs-human disagreement records
project_description  string   No
model                string   No (default: gpt-4.1)
mock                 boolean  No

Response
Field     Type                 Description
criteria  CriterionResponse[]  Refined criteria derived from the conflict patterns
curl -s -X POST http://localhost:8005/v1/criteria/refine \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{
    "current_criteria": [{"category": "Population", "text": "Adults only"}],
    "conflicts": [{"paper_title": "Study A", "decision_a": "include", "decision_b": "exclude"}],
    "mock": true
  }'
# →
{
  "criteria": [
    {"category": "Study Design", "text": "Exclude retrospective observational studies without a control arm", "criterion_type": "exclude", "confidence": 0.72, "rationale": "Derived from 1 reviewer conflict(s) on study design."},
    {"category": "Outcome Reporting", "text": "Exclude studies that do not report the primary outcome quantitatively", "criterion_type": "exclude", "confidence": 0.65, "rationale": "Pattern across 1 conflict(s) flagged insufficient outcome data."}
  ]
}
Response codes: 200, 400, 401, 429, 500
POST /v1/criteria/consolidate Detect duplicates and propose merges
Request body
Field                Type      Required
criteria             dict[]    Yes; criteria to analyse
project_description  string    No
research_questions   string[]  No
model                string    No (default: gpt-4.1)
mock                 boolean   No

Response
Field                    Type                     Description
duplicate_groups         DuplicateGroup[]         Groups of criteria with overlapping scope; see Shared Types
consolidation_proposals  ConsolidationProposal[]  Proposed merged criteria
warnings                 string[]                 Low-confidence rejections or notes
curl -s -X POST http://localhost:8005/v1/criteria/consolidate \
  -H "X-API-Key: dev-key" -H "Content-Type: application/json" \
  -d '{"criteria": [{"id": 1, "category": "Population", "text": "Adults 18+"}, {"id": 2, "category": "Population", "text": "Adult participants over 18"}], "mock": true}'
# →
{
  "duplicate_groups": [],
  "consolidation_proposals": [],
  "warnings": ["Mock mode: no consolidation performed"]
}
Response codes: 200, 400, 401, 429, 500
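
Because the service is stateless, applying an accepted proposal is left to the caller. The Python sketch below shows one way to do it on a local criteria list shaped like the request example ({id, category, text}). apply_proposal is a hypothetical helper, and the new_id argument is an assumption standing in for whatever ID Evidence Mapper would allocate.

```python
from typing import Any, Dict, List


def apply_proposal(criteria: List[Dict[str, Any]],
                   proposal: Dict[str, Any],
                   new_id: int) -> List[Dict[str, Any]]:
    """Replace the criteria named in the proposal with one merged criterion."""
    merged_ids = set(proposal["criterion_ids"])
    kept = [c for c in criteria if c["id"] not in merged_ids]
    merged = {
        "id": new_id,  # assumption: the caller allocates IDs
        "category": proposal["category"],
        "text": proposal["proposed_merged_criterion"],
        "description": proposal["proposed_description"],
        "criterion_type": proposal["proposed_type"],
    }
    return kept + [merged]
```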

Error Responses

The same status codes and body shapes apply everywhere — read this section once and cross-reference from each endpoint. Classified LLM errors carry a structured error_code and a retryable flag so client code can decide whether to back off, surface the message, or abort. Async jobs additionally report terminal errors inside the job response rather than as HTTP error codes.

HTTP  error_code  Retryable  When you see it
400   validation  no         Malformed request body, missing required field, Pydantic validation failed
401   auth        no         Missing/invalid X-API-Key, or invalid X-OpenAI-API-Key
404   -           no         Resource not found (e.g. unknown job_id)
429   rate_limit  yes        OpenAI rate limit; retry with exponential backoff
500   unknown     -          Unexpected server error
503   -           -          /health/ready only, when DB or OpenAI key check fails
Standard body — classified LLM error
{
  "detail": {
    "message": "Rate limit exceeded",
    "error_code": "rate_limit",
    "retryable": true
  }
}
Standard body — FastAPI validation / missing resource
{ "detail": "Field required: research_question" }
Async job in-body failure (screening, indexer)
{
  "job_id": "abc-123",
  "status": "failed",
  "error": "Invalid OpenAI key",
  "error_category": "auth",
  "error_retryable": false
}
Async jobs report terminal errors inside the job response (HTTP 200), not as HTTP error codes. Poll the job and check status === "failed" + error_category.
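
Client code has to cope with both body shapes: the structured detail object and the plain detail value. The Python sketch below normalises an HTTP error into (error_code, retryable, message) and produces an exponential backoff schedule for retryable cases. classify_error and backoff_delays are hypothetical helpers, the status-code fallback is an assumption used when no error_code is present, and the delay parameters are illustrative.

```python
from typing import Any, Dict, Iterator, Tuple

# Assumption: used only when the body carries no structured error_code.
FALLBACK_CODES = {400: "validation", 401: "auth", 429: "rate_limit"}


def classify_error(status_code: int, body: Dict[str, Any]) -> Tuple[str, bool, str]:
    """Return (error_code, retryable, message) for an HTTP error response."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        # Classified LLM error: trust the structured fields.
        return (detail.get("error_code", "unknown"),
                bool(detail.get("retryable", False)),
                str(detail.get("message", "")))
    # Plain detail value: fall back on the status code alone.
    return FALLBACK_CODES.get(status_code, "unknown"), status_code == 429, str(detail)


def backoff_delays(base_s: float = 1.0, factor: float = 2.0,
                   max_s: float = 30.0, attempts: int = 5) -> Iterator[float]:
    """Yield capped exponential delays: base, base*factor, ... up to max_s."""
    delay = base_s
    for _ in range(attempts):
        yield min(delay, max_s)
        delay *= factor
```

A caller would sleep for each yielded delay between retries only while the classified error is retryable, and surface the message otherwise. For failed async jobs the same decision can be made from error_category and error_retryable in the job body.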

Shared Types

Data types referenced by multiple endpoints. These mirror the Pydantic models in api/schemas/ (source of truth) — documented once here to avoid per-endpoint repetition.

IndexerField

Field                Type                Description
name                 string              Field identifier (e.g. study_design)
description          string              What the AI should extract
data_type_primary    string              text, number, yes_no, list_text, list_number
data_type_secondary  string              Sub-type qualifier (default NA). Optional.
examples             string[]            Example values. Optional.
examples_mode        "guide" | "enum"    "guide" = suggestions; "enum" = strict list. Optional.
depth                "minimal" | "full"  Extraction effort level. Optional.

ProjectContext

Field               Type      Description
description         string    Free-text project description
research_questions  string[]

CriterionResponse

Field                      Type                   Description
category                   string                 PICOS category (Population, Intervention, Outcome, etc.)
text                       string                 The criterion itself
description                string                 Expanded definition
criterion_type             "include" | "exclude"
confidence                 float                  0–1 AI confidence. Optional.
rationale                  string                 Why this criterion was suggested. Optional.
title_abstract_assessable  boolean                Whether the criterion can be decided from title/abstract alone

DuplicateGroup

Field                   Type       Description
group_type              string     e.g. "exact", "semantic"
category                string     PICOS category these criteria share
criterion_ids           integer[]  IDs of criteria in this group
recommended_primary_id  integer    Which criterion to keep
merge_rationale         string
ai_confidence           float      0–1; groups below 0.75 are filtered out server-side

ConsolidationProposal

Field                      Type                   Description
category                   string
criterion_ids              integer[]              Criteria to merge
proposed_merged_criterion  string                 New label; rejected server-side if > 10 words
proposed_description       string
proposed_type              "include" | "exclude"
merge_rationale            string
ai_confidence              float                  0–1; proposals below 0.75 are filtered out server-side

TagGroup

Field      Type      Description
name       string    Group label
values     string[]  Member values
rationale  string    Why these cluster together. Optional.

ExtractionWarning

Field               Type                       Description
field               string                     Field name the warning applies to
risk_level          "low" | "medium" | "high"  Default medium
reason              string                     Why the field is at risk (ambiguous, hard to extract from title/abstract, etc.)
suggested_fallback  string                     Recommended mitigation

FieldSuggestion

Field                Type                                   Description
action               "add" | "modify" | "remove" | "merge"
field                IndexerField                           The proposed (new or revised) field
rationale            string
original_field_name  string                                 For modify/remove/merge: which field this applies to. Optional.
target_field_name    string                                 For merge: the name to merge into. Optional.