# Crystallise AI Backend — Glossary

Systematic-review terminology used by the API and accompanying docs.

## Contents

- [AutoIndexer](#autoindexer)
- [Clustering (screening)](#clustering-screening)
- [Confidence (extraction)](#confidence-extraction)
- [Criteria (eligibility)](#criteria-eligibility)
- [Evidence span](#evidence-span)
- [Exclusion criterion](#exclusion-criterion)
- [Gap flag (PICOS)](#gap-flag-picos)
- [Inclusion criterion](#inclusion-criterion)
- [Labelling (screening)](#labelling-screening)
- [Mock mode](#mock-mode)
- [PICO / PICOS](#pico-picos)
- [Reasoning (screening)](#reasoning-screening)
- [Repetitions](#repetitions)
- [Screening](#screening)
- [Search-readiness](#search-readiness)
- [Sensitivity (SR)](#sensitivity-sr)
- [Specificity (SR)](#specificity-sr)
- [Stateless compute](#stateless-compute)
- [Study design](#study-design)
- [Systematic review](#systematic-review)
- [Threshold (screening)](#threshold-screening)

## Terms

### AutoIndexer

Structured field extraction from a paper's title and abstract, driven by an OpenAI function call that returns a typed JSON object. Each extracted field carries an [evidence span](#evidence-span) quoting the source text and a [confidence](#confidence-extraction) score so a human reviewer can audit the result.

### Clustering (screening)

Stage 3 of the 4-stage [screening](#screening) pipeline. It groups the free-text [reasoning](#reasoning-screening) produced in stage 2 into thematic buckets so a reviewer can see *why* papers were scored similarly rather than only seeing the scores.

### Confidence (extraction)

The model's self-reported 0–1 score attached to each extracted field value by [AutoIndexer](#autoindexer). Treat it as a signal for human review, not as a calibrated probability — two fields with equal "0.85" confidence may not be equally reliable.
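As a minimal sketch of using confidence as a review signal, the snippet below routes low-scoring fields to a human. The `ExtractedField` shape, the field names, and the `0.7` cutoff are illustrative assumptions, not the API's actual schema or a recommended value.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """Hypothetical shape of one extracted field (not the API's real schema)."""
    name: str
    value: str
    evidence_span: str
    confidence: float  # self-reported 0-1 score, not a calibrated probability


def needs_review(fields: list[ExtractedField], cutoff: float = 0.7) -> list[str]:
    """Return the names of fields whose confidence falls below the cutoff."""
    return [f.name for f in fields if f.confidence < cutoff]


fields = [
    ExtractedField("population", "adults with type 2 diabetes",
                   "adults with type 2 diabetes", 0.92),
    ExtractedField("outcome", "mortality", "all-cause mortality", 0.55),
]
print(needs_review(fields))  # ['outcome']
```

Because the score is uncalibrated, a cutoff like this is best tuned against a sample of manually checked extractions rather than read as a probability.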

### Criteria (eligibility)

The include/exclude rules a screener applies to decide whether a study belongs in the review. Split into [inclusion criteria](#inclusion-criterion) and [exclusion criteria](#exclusion-criterion), usually derived from the project's [PICOS](#pico-picos).

### Evidence span

A verbatim quote taken from the title or abstract that supported an extracted value. Returned alongside each field by [AutoIndexer](#autoindexer) so reviewers can verify the extraction without re-reading the source.
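Since an evidence span is meant to be a verbatim quote, a caller can check it mechanically before showing it to a reviewer. The helper below is a sketch under that assumption; the function names are hypothetical, and the whitespace and case normalisation is a design choice of this example, not documented API behaviour.

```python
def _normalise(text: str) -> str:
    """Collapse whitespace runs and lowercase, so line breaks do not cause misses."""
    return " ".join(text.split()).lower()


def span_is_verbatim(span: str, title: str, abstract: str) -> bool:
    """True if the span occurs in the title or abstract after normalisation."""
    needle = _normalise(span)
    if not needle:
        return False  # an empty span supports nothing
    return needle in _normalise(title) or needle in _normalise(abstract)


title = "Metformin and mortality in adults"
abstract = "We enrolled adults with\n type 2 diabetes."
print(span_is_verbatim("adults with type 2 diabetes", title, abstract))  # True
print(span_is_verbatim("children with asthma", title, abstract))         # False
```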

### Exclusion criterion

A rule that disqualifies a study from the review, for example "non-English language" or "animal model only". A single matched exclusion criterion is enough to drop a paper regardless of how well it matches the [inclusion criteria](#inclusion-criterion).

### Gap flag (PICOS)

An element of [PICOS](#pico-picos) that is missing or ambiguous in the project description, returned by `POST /criteria/picos`. Gap flags tell the caller which dimensions still need user input before the question is ready to drive a literature search.
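To illustrate the idea, here is a rough client-side sketch that lists PICOS elements left empty in a question. The dictionary keys and the `find_gaps` helper are assumptions made for this example; the real response shape of `POST /criteria/picos` is not shown here, and this sketch only detects missing elements, not ambiguous ones.

```python
from typing import Optional

# Assumed key names for the five PICOS elements (illustrative only).
PICOS_ELEMENTS = ("population", "intervention", "comparator", "outcome", "study_design")


def find_gaps(picos: dict[str, Optional[str]]) -> list[str]:
    """Return the PICOS elements that are absent, None, or blank."""
    return [e for e in PICOS_ELEMENTS if not (picos.get(e) or "").strip()]


question = {
    "population": "adults with type 2 diabetes",
    "intervention": "metformin",
    "outcome": "mortality",
    "study_design": "  ",  # blank counts as a gap
}
print(find_gaps(question))  # ['comparator', 'study_design']
```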

### Inclusion criterion

A rule that a study must satisfy to be considered in scope, for example "adult human participants" or "reports mortality outcomes". A study typically has to meet every inclusion criterion *and* match no [exclusion criterion](#exclusion-criterion).
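The usual combination rule for inclusion and exclusion criteria can be sketched as a single boolean check. The `is_eligible` function and its input shape (criterion text mapped to a per-paper verdict) are hypothetical, chosen only to make the logic concrete.

```python
def is_eligible(inclusion_met: dict[str, bool], exclusion_matched: dict[str, bool]) -> bool:
    """Eligible only if every inclusion criterion is met and no exclusion criterion matches."""
    return all(inclusion_met.values()) and not any(exclusion_matched.values())


inclusion = {"adult human participants": True, "reports mortality outcomes": True}
exclusion = {"non-English language": False, "animal model only": False}
print(is_eligible(inclusion, exclusion))  # True

# A single matched exclusion criterion overrides a perfect inclusion match.
exclusion["animal model only"] = True
print(is_eligible(inclusion, exclusion))  # False
```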

### Labelling (screening)

Stage 1 of the [screening](#screening) pipeline. Each paper is scored on a 1–5 relevance scale across N [repetitions](#repetitions), and the per-paper `mean_score` becomes the primary signal used by downstream stages and by the [include threshold](#threshold-screening).
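As a sketch of how a per-paper `mean_score` could be derived from repeated labels, assuming each repetition yields one integer score from 1 to 5 (the helper itself is hypothetical, not the service's implementation):

```python
def mean_score(scores: list[int]) -> float:
    """Average the 1-5 relevance scores from all repetitions for one paper."""
    if not scores:
        raise ValueError("at least one repetition is required")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("scores must be on the 1-5 scale")
    return sum(scores) / len(scores)


print(round(mean_score([4, 5, 4]), 2))  # 4.33
```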

### Mock mode

Setting `"mock": true` in a request body returns a canned response shaped like the real one, without calling OpenAI. Useful for wiring up the integration, writing tests, or reproducing bugs without spending tokens.
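A minimal sketch of toggling mock mode when building a request body. Only the `mock` flag comes from this glossary; the `build_request` helper and the `description` field are placeholders for whatever the target endpoint actually expects.

```python
import json


def build_request(payload: dict, mock: bool = False) -> str:
    """Serialise a request body, adding "mock": true when mock mode is wanted."""
    body = dict(payload)  # copy so the caller's dict is not mutated
    if mock:
        body["mock"] = True
    return json.dumps(body)


# "description" is a hypothetical field name used only for illustration.
print(build_request({"description": "metformin in adults"}, mock=True))
# {"description": "metformin in adults", "mock": true}
```

Keeping the flag in one helper makes it easy to switch a whole test suite between canned and live responses.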

### PICO / PICOS

The standard framework for specifying a clinical research question: **P**opulation, **I**ntervention, **C**omparator, **O**utcome, and — in the PICOS variant — **S**tudy design. The API uses PICOS throughout: it structures [eligibility criteria](#criteria-eligibility), drives [gap flags](#gap-flag-picos), and feeds [search-readiness](#search-readiness).

### Reasoning (screening)

Stage 2 of the [screening](#screening) pipeline. It produces a short human-readable explanation of why a paper received its [labelling](#labelling-screening) score, and those explanations are what the [clustering](#clustering-screening) stage groups.

### Repetitions

The number of independent AI calls made per paper during [labelling](#labelling-screening). More repetitions give a more stable `mean_score` but cost proportionally more tokens; three to five is typical.
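One way to see the stability gain is the standard error of the mean, which shrinks as repetitions are added. The sketch below treats the repetitions as independent draws, which is an assumption of this example; `standard_error` is a hypothetical helper, not part of the API.

```python
from math import sqrt
from statistics import stdev


def standard_error(scores: list[int]) -> float:
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    if len(scores) < 2:
        raise ValueError("need at least two repetitions to estimate spread")
    return stdev(scores) / sqrt(len(scores))


three = [3, 4, 5]
six = [3, 4, 5, 3, 4, 5]  # same spread of scores, twice as many calls
print(standard_error(six) < standard_error(three))  # True
```

Doubling the repetitions doubles the token cost, while the uncertainty in `mean_score` falls only with the square root of the count, which is why small counts are usually enough.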

### Screening

Title/abstract eligibility assessment — deciding which studies to include in a review from a larger candidate set. In this API it is a 4-stage pipeline: [labelling](#labelling-screening), [reasoning](#reasoning-screening), [clustering](#clustering-screening), and final selection against a [threshold](#threshold-screening).

### Search-readiness

Whether a research question has enough [PICOS](#pico-picos) specificity to feed a literature search without returning noise. A question that is "search-ready" has concrete terms for each PICOS element; one that isn't will come back with [gap flags](#gap-flag-picos).

### Sensitivity (SR)

In systematic-review terminology, the recall of a search: the proportion of truly relevant studies that the search actually captured. A sensitive search errs toward retrieving too much rather than missing anything.

### Specificity (SR)

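In systematic-review search terminology, how well a search keeps irrelevant studies out of the result set. This glossary uses the term in the loose sense of precision: the proportion of retrieved studies that are actually relevant. (Strictly, specificity is the proportion of irrelevant studies that the search correctly left out.) A specific search errs toward a clean result set at the risk of missing borderline papers.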
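The recall and precision ratios described in the two entries above reduce to simple divisions. This sketch uses hypothetical function names and made-up counts purely to show the trade-off.

```python
def sensitivity(relevant_retrieved: int, relevant_total: int) -> float:
    """Recall: share of all truly relevant studies that the search captured."""
    if relevant_total <= 0:
        raise ValueError("relevant_total must be positive")
    return relevant_retrieved / relevant_total


def precision(relevant_retrieved: int, retrieved_total: int) -> float:
    """Share of retrieved studies that are actually relevant."""
    if retrieved_total <= 0:
        raise ValueError("retrieved_total must be positive")
    return relevant_retrieved / retrieved_total


# Illustrative counts: 100 relevant studies exist, the search returns 1800
# records, and 90 of those are relevant.
print(sensitivity(90, 100))  # 0.9
print(precision(90, 1800))   # 0.05
```

A broad search like this one misses little but leaves a large screening workload, which is the trade-off the two terms describe.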

### Stateless compute

This service holds no user data beyond transient job state — inputs come in on the request, outputs go back on the response, and nothing persists after the job completes. The Evidence Mapper application owns all durable storage of projects, papers, and results.

### Study design

The methodological type of a study: randomised controlled trial (RCT), cohort, case-control, cross-sectional, case series, [systematic review](#systematic-review), and so on. It's the "S" in [PICOS](#pico-picos) and is commonly used as an [inclusion](#inclusion-criterion) or [exclusion](#exclusion-criterion) criterion.

### Systematic review

A literature review conducted with an explicit, reproducible methodology — pre-specified question, search strategy, eligibility criteria, and extraction protocol. The endpoints in this API are building blocks for that workflow, not a finished review product.

### Threshold (screening)

The `mean_score` cutoff above which a paper is treated as "include" after [labelling](#labelling-screening). The default is `1.0`. Raising it makes screening stricter (higher [specificity](#specificity-sr)); lowering it makes screening more permissive (higher [sensitivity](#sensitivity-sr)).
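A sketch of applying the cutoff, assuming "above" means a strict comparison. The `select_included` helper, the paper IDs, and the thresholds shown are illustrative and not taken from the service.

```python
def select_included(mean_scores: dict[str, float], threshold: float) -> list[str]:
    """Return the IDs of papers whose mean_score is strictly above the threshold."""
    return sorted(pid for pid, score in mean_scores.items() if score > threshold)


scores = {"p1": 4.3, "p2": 2.0, "p3": 3.0}
print(select_included(scores, threshold=3.0))  # ['p1']
print(select_included(scores, threshold=2.5))  # ['p1', 'p3']
```

Lowering the threshold from 3.0 to 2.5 admits the borderline paper `p3`, which is the permissive direction described above.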

**See also:** [API Reference](api-reference.md) for how these terms map to HTTP endpoints; [Backend Guide](backend-guide.md) for how they're implemented.
