Crystallise AI Backend — Troubleshooting

Top confusion points and their fixes. Check here before filing a ticket.

Authentication & Keys
Jobs & Polling
Costs & Models
Local Dev & Docker
Unexpected Responses

Authentication & Keys

"401 Unauthorized but my API key is correct"

Diagnosis

Two different keys flow through this service. X-API-Key authenticates your call to the Crystallise backend; X-OpenAI-API-Key is a per-request passthrough to OpenAI. A 401 on an otherwise well-formed request usually means these got swapped — your OpenAI key is being treated as the service API key.

Fix

Inspect your request headers and confirm X-API-Key matches an entry in CRYSTALLISE_API_KEYS (or is any non-empty string in dev mode). If you also need per-user OpenAI billing, set X-OpenAI-API-Key separately. See the Authentication section for header examples.
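
To make the header split concrete, here is a minimal sketch of a client-side helper that keeps the two keys under their own header names. The function name build_headers and the sample values in the usage note are illustrative only and are not part of the service.

```python
from typing import Dict, Optional


def build_headers(service_key: str, openai_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers with each key under its own header name."""
    if not service_key:
        raise ValueError("X-API-Key must be a non-empty string")
    # The service key authenticates the call to the Crystallise backend.
    headers = {"X-API-Key": service_key}
    if openai_key:
        # Optional passthrough, only needed for per-caller OpenAI billing.
        headers["X-OpenAI-API-Key"] = openai_key
    return headers
```

Calling build_headers("my-service-key", "sk-...") yields both headers; omitting the second argument sends only X-API-Key.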

"Job finishes with `status: failed`, `error_category: auth`"

Diagnosis

The service-level X-API-Key passed the gate — otherwise the POST would have 401'd synchronously — but the downstream OpenAI call rejected its key. That key came either from the per-request X-OpenAI-API-Key header or from the server-side CRYSTALLISE_OPENAI_API_KEY environment variable. One of those is invalid, revoked, or out of quota.

Fix

If you sent X-OpenAI-API-Key, mint a fresh key in the OpenAI dashboard and retry.
If you rely on the server env var, have the operator verify CRYSTALLISE_OPENAI_API_KEY is set and still valid.
See Backend Guide § LLM for how these two sources are resolved at request time.

"Dev mode (empty `CRYSTALLISE_API_KEYS`) accepts any key — is that safe?"

Diagnosis

When CRYSTALLISE_API_KEYS is unset or empty, the auth middleware falls through and accepts any non-empty X-API-Key value. That is deliberate for local development so integrators don't need credentials to run the stack, but it means anyone who can reach the port can submit jobs that spend OpenAI credits.

Fix

For any deployment beyond a developer laptop, set CRYSTALLISE_API_KEYS to a comma-separated list of strong random strings and restart the service. Treat the list as a shared secret — it gates all access to the API.
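
One way to produce suitable values is Python's standard secrets module. This is a sketch, not a prescribed procedure: the helper name generate_api_keys, the key count, and the 32-byte length are assumptions chosen for illustration.

```python
import secrets


def generate_api_keys(count: int = 2, nbytes: int = 32) -> str:
    """Return `count` random URL-safe tokens joined by commas."""
    # token_urlsafe only emits letters, digits, '-' and '_', so a token
    # can never contain the comma used as the list separator.
    return ",".join(secrets.token_urlsafe(nbytes) for _ in range(count))


if __name__ == "__main__":
    print(f"CRYSTALLISE_API_KEYS={generate_api_keys()}")
```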

"How do I rotate the service API key?"

Diagnosis

The service has no per-user key database; CRYSTALLISE_API_KEYS is the single source of truth and is read at process start. Rotation therefore means editing that env var and cycling the process, not issuing revocations in a dashboard.

Fix

Add the new key to CRYSTALLISE_API_KEYS alongside the old one and restart.
Roll out the new value in the client's X-API-Key header.
Once all clients are migrated, remove the old key from CRYSTALLISE_API_KEYS and restart again.
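
The overlap window in the steps above can be sketched as two successive values of the env var. Both helpers are hypothetical, and parse_keys assumes the server splits the list on commas and ignores surrounding whitespace, which the source does not spell out.

```python
from typing import List, Set


def parse_keys(raw: str) -> Set[str]:
    """Assumed parsing: comma-separated, whitespace-trimmed, empties dropped."""
    return {key.strip() for key in raw.split(",") if key.strip()}


def rotation_stages(old_key: str, new_key: str) -> List[str]:
    """Return the env var value for each restart during a rotation."""
    return [
        f"{old_key},{new_key}",  # first restart: both keys accepted
        new_key,                 # second restart: old key retired
    ]
```

During the first stage clients using either key keep working, which is what lets the header change roll out without downtime.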

"Can I send the OpenAI key once and have it remembered?"

Diagnosis

No. The service is intentionally stateless and never persists per-request keys — that's why a job's X-OpenAI-API-Key only affects that job. There is no login, no session, no server-side store of caller credentials.

Fix

Either send X-OpenAI-API-Key on every request that needs per-caller OpenAI billing, or configure the server-side CRYSTALLISE_OPENAI_API_KEY env var as a shared fallback. The per-request header always wins when both are present.
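
The precedence rule can be sketched as follows. This is an illustration of the described behaviour, not the server's actual code; resolve_openai_key is a hypothetical name, and the exact-match dictionary lookup is a simplification (real HTTP header names are case-insensitive).

```python
import os
from typing import Mapping, Optional


def resolve_openai_key(
    headers: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the OpenAI key for one request: header first, env var second."""
    env = os.environ if env is None else env
    header_key = headers.get("X-OpenAI-API-Key")
    if header_key:
        # The per-request header wins whenever it is present.
        return header_key
    # Fall back to the shared server-side key; None if neither is set.
    return env.get("CRYSTALLISE_OPENAI_API_KEY") or None
```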

Jobs & Polling

"Job is stuck in `status: pending` forever"

Diagnosis

Screening jobs run as a background task kicked off at POST time. "Forever pending" means that task either never started (the kickoff raised before it scheduled) or crashed without updating the job record. The in-memory job store still holds the pending row because no writer ever moved it forward.

Fix

Check the server logs for an exception traced to the job id. If the process has restarted since submission, the job is gone — resubmit. Otherwise cancel via DELETE and resubmit; see API Reference § Screening.

"Results are empty but `status: completed`"

Diagnosis

Completion just means the pipeline ran cleanly end-to-end; it doesn't guarantee any paper survived filtering. A strict threshold combined with a narrow cluster type can eliminate every candidate, and you'll see an empty result set rather than an error.

Fix

Inspect the clusters array in the response to see how papers were bucketed before filtering. Lower the threshold (default 1.0) or widen the cluster configuration and resubmit. The threshold glossary entry explains the sensitivity/specificity trade-off.

"Job returns `status: failed`, `error_category: server_restart`"

Diagnosis

The server restarted while your job was in flight. Because job state lives in process memory, any partial progress was lost and the job is marked failed with this category so clients can distinguish it from a real pipeline error. It is not retriable in place — there is no row to resume from.

Fix

Resubmit the same request. If you see this category repeatedly, the service is probably crash-looping — have the operator check logs and CRYSTALLISE_OPENAI_API_KEY wiring before you retry further.
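
A client can encode this advice as a bounded resubmit rule. The sketch below is an assumption about client-side policy: should_resubmit and the default of three attempts are illustrative, while the status and error_category fields come from the job response described above.

```python
from typing import Mapping


def should_resubmit(job: Mapping, attempts_so_far: int, max_attempts: int = 3) -> bool:
    """Resubmit only for server_restart failures, and only a few times."""
    restarted = (
        job.get("status") == "failed"
        and job.get("error_category") == "server_restart"
    )
    # Past the limit, assume a crash loop and escalate to the operator.
    return restarted and attempts_so_far < max_attempts
```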

"How often should I poll?"

Diagnosis

There is no push notification; clients discover completion by polling GET on the job id. Poll too fast and you burn request budget on a status field that hasn't changed; poll too slow and your end-to-end latency is dominated by sleep, not by inference.

Fix

Every 1–2 seconds is a sensible default. Don't poll more often than once per second: the server does no request coalescing, so shorter intervals add nothing but load. For very long jobs you can back off to a longer interval after the first minute.
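
A polling loop along these lines might look like the sketch below. The function poll_job, its parameters, and the doubling backoff are assumptions for illustration; fetch_status stands in for whatever performs the GET on the job id and returns the parsed response.

```python
import time
from typing import Callable, Dict

ACTIVE_STATUSES = {"pending", "running"}


def poll_job(
    fetch_status: Callable[[], Dict],
    *,
    interval: float = 1.0,
    max_interval: float = 10.0,
    backoff_after: float = 60.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll until the job leaves pending/running, backing off on long jobs."""
    start = clock()
    delay = max(interval, 1.0)  # never poll faster than once per second
    while True:
        job = fetch_status()
        if job.get("status") not in ACTIVE_STATUSES:
            return job
        elapsed = clock() - start
        if elapsed >= timeout:
            raise TimeoutError(f"job still active after {elapsed:.0f}s")
        if elapsed >= backoff_after:
            # After the first minute, double the delay up to max_interval.
            delay = min(delay * 2, max_interval)
        sleep(delay)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real waiting.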

"I see `409 Conflict` on POST /v1/screening/jobs"

Diagnosis

The service enforces one active job per project_id to keep cost and concurrency bounded. If a project already has a job in pending or running, a new POST with the same project_id is rejected with 409 rather than queued.

Fix

Poll the existing job to completion, or DELETE it if you want to abandon that run.
Alternatively, drop project_id entirely for a one-off submission — the uniqueness check only fires when the field is present.
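
The uniqueness rule can be modelled in a few lines. This is a hypothetical in-memory model of the behaviour described above, not the server's implementation; would_conflict and the job dictionaries are illustrative.

```python
from typing import Iterable, Mapping, Optional

ACTIVE_STATUSES = {"pending", "running"}


def would_conflict(existing_jobs: Iterable[Mapping], project_id: Optional[str]) -> bool:
    """True if a new POST with this project_id would be rejected with 409."""
    if project_id is None:
        # The check only fires when project_id is present.
        return False
    return any(
        job.get("project_id") == project_id and job.get("status") in ACTIVE_STATUSES
        for job in existing_jobs
    )
```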

Costs & Models

"`estimated_cost_usd` doesn't match my OpenAI billing"

Diagnosis

The estimate is computed from a hardcoded table (DEFAULT_PRICING_PER_1M) snapshotted from OpenAI's public rate card at build time. OpenAI adjusts those rates periodically and the table does not auto-update, so the estimate drifts from your real invoice — usually by single-digit percent, but more during a pricing change.

Fix

Treat estimated_cost_usd as a planning estimate, not a billing line or a guaranteed ceiling: it can come in below your invoice if rates have risen since the table was snapshotted. Reconcile against the OpenAI dashboard for actuals. If the drift is large, the table in the server code needs refreshing; flag it to the backend team.
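
For intuition, a per-million-token estimate works roughly like the sketch below. The table layout, the model name, and the rates are placeholders invented for this example; they are not the contents of DEFAULT_PRICING_PER_1M and not real OpenAI prices.

```python
from typing import Dict, Mapping

# Hypothetical rates in USD per 1M tokens, for illustration only.
EXAMPLE_PRICING_PER_1M: Dict[str, Dict[str, float]] = {
    "example-model": {"input": 0.50, "output": 2.00},
}


def estimate_cost_usd(
    model: str,
    input_tokens: int,
    output_tokens: int,
    pricing: Mapping[str, Mapping[str, float]] = EXAMPLE_PRICING_PER_1M,
) -> float:
    """Estimate cost from token counts and a static price table."""
    rates = pricing[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000
```

Because the rates are frozen in the table, any change to the real rate card shifts the invoice but not the estimate, which is the drift described above.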

"How do I cap spend on a job?"

Diagnosis

The screening request accepts an optional max_estimated_cost_usd. The server computes the estimate up-front and, if it exceeds your cap, rejects the request with a 400 before any OpenAI call is made. The cap is a preflight check, not a runtime kill switch.

Fix

Set max_estimated_cost_usd in the POST body to your budget. On rejection, raise the cap, narrow the paper set, or switch to a cheaper model (see "Which model should I pick for screening?"). See API Reference § Screening for the field.
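
The preflight nature of the cap can be sketched like this. The exception class and function name are hypothetical; only the comparison (reject when the estimate exceeds the cap, before any OpenAI call) follows the description above.

```python
from typing import Optional


class CostCapExceeded(ValueError):
    """Raised in this sketch where the server would answer with a 400."""


def preflight_cost_check(estimate_usd: float, max_estimated_cost_usd: Optional[float]) -> None:
    """Reject up-front if the estimate exceeds the caller's cap."""
    if max_estimated_cost_usd is None:
        return  # the field is optional; no cap means no check
    if estimate_usd > max_estimated_cost_usd:
        raise CostCapExceeded(
            f"estimate ${estimate_usd:.2f} exceeds cap ${max_estimated_cost_usd:.2f}"
        )
```

Nothing in this check runs once the job has started, so it cannot stop a job whose real cost ends up above the estimate.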

"Which model should I pick for screening?"

Diagnosis

Screening is I/O-bound over many small prompts, so model choice is a straight cost/quality trade. gpt-5-nano is the default and the cheapest; gpt-5-mini improves labelling quality, especially on borderline abstracts, at roughly 4x the cost per token.

Fix

Start with gpt-5-nano and only upgrade if you see too many low-confidence or clearly-wrong labels in the reasoning output. For tight budgets, keep max_estimated_cost_usd in place as a guardrail regardless of model.

"Model X isn't supported — why?"

Diagnosis

Every request is validated against crystallise.config.model_capabilities, which encodes context window and feature support (structured outputs, function calling) for the models we've qualified. Unknown models, or models lacking a required feature for the endpoint, fail preflight with a validation error rather than being forwarded to OpenAI.

Fix

Pick a model listed in model_capabilities. If you need a newer or specialist model, have the backend team add an entry — the gate is deliberate so an incompatible model can't silently produce malformed results.
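
A capability gate of this kind might look like the sketch below. The table shape, the model names model-a and model-b, and the feature keys are assumptions made for illustration; the real structure lives in crystallise.config.model_capabilities.

```python
from typing import Dict, Iterable, Mapping

# Hypothetical capability table, for illustration only.
EXAMPLE_CAPABILITIES: Dict[str, Dict[str, bool]] = {
    "model-a": {"structured_outputs": True, "function_calling": True},
    "model-b": {"structured_outputs": False, "function_calling": True},
}


def validate_model(
    model: str,
    required_features: Iterable[str],
    capabilities: Mapping[str, Mapping[str, bool]] = EXAMPLE_CAPABILITIES,
) -> None:
    """Fail preflight for unknown models or models missing a feature."""
    entry = capabilities.get(model)
    if entry is None:
        raise ValueError(f"unknown model: {model}")
    missing = sorted(f for f in required_features if not entry.get(f, False))
    if missing:
        raise ValueError(f"{model} lacks required features: {', '.join(missing)}")
```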

Local Dev & Docker

"`ModuleNotFoundError: No module named 'fastapi'` running `pytest`"

Diagnosis

Your shell resolved pytest to a system-wide install (often /usr/bin/pytest or a pyenv shim) instead of the project's virtualenv. That interpreter has no access to the project's dependencies, so the first FastAPI import blows up.

Fix

Activate the venv: source .venv/bin/activate before running pytest.
Or bypass PATH: .venv/bin/python -m pytest.
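
If you are unsure which interpreter is in play, a quick diagnostic such as the following can help. It is a sketch using only the standard library; the helper names are illustrative.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv and differs from base_prefix.
    return sys.prefix != sys.base_prefix


def has_module(name: str) -> bool:
    """True when a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtualenv active:", in_virtualenv())
    print("fastapi importable:", has_module("fastapi"))
```

If the interpreter path is outside .venv or fastapi is not importable, the shell is using the wrong Python.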

"Port 5337 already in use on Docker Compose"

Diagnosis

The docker-compose.yml maps the Postgres container's 5432 to host port 5337 to avoid clashing with a default local Postgres on 5432. If another process (a different Postgres, a prior Compose stack still running, or a leftover bind) holds 5337, the bind fails.

Fix

Change the host-side port in docker-compose.yml (e.g. "5338:5432") and update any local connection strings. Alternatively, find the offender with ss -ltn 'sport = :5337' (or lsof -i :5337) and stop it.
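
To confirm whether something is already listening before you edit the Compose file, a small probe like this can be used. It is a sketch: port_in_use is an illustrative helper, and it only tests TCP reachability on the given host, not which process owns the port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("5337 in use:", port_in_use(5337))
```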

"Mock-mode tests pass but real calls 401"

Diagnosis

The test suite installs an auth-bypass fixture that short-circuits the X-API-Key check so unit tests don't need credentials. When you point the same client code at a real deployment, that fixture isn't there and a missing or wrong X-API-Key header 401s immediately.

Fix

Confirm your production client is setting X-API-Key to a value present in the server's CRYSTALLISE_API_KEYS. See "401 Unauthorized but my API key is correct" for the key-confusion variant and mock mode for what the fixture actually bypasses.

"How do I run only the integration (live) tests?"

Diagnosis

Unit tests live alongside the code and run in mock mode; integration tests under tests/integration/ hit a real OpenAI endpoint and are skipped by default unless an API key is present in the environment. They're separated so CI doesn't accidentally spend money.

Fix

Export a real key and run the directory explicitly: CRYSTALLISE_OPENAI_API_KEY=sk-... pytest tests/integration -v. Either CRYSTALLISE_OPENAI_API_KEY or OPENAI_API_KEY is accepted.
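
The skip-unless-key-present behaviour can be approximated with a helper like the one below. This is a sketch, not the project's actual test configuration: live_openai_key is a hypothetical name, and checking CRYSTALLISE_OPENAI_API_KEY before OPENAI_API_KEY is an assumed order.

```python
import os
from typing import Mapping, Optional


def live_openai_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first configured OpenAI key, or None if neither is set."""
    env = os.environ if env is None else env
    return env.get("CRYSTALLISE_OPENAI_API_KEY") or env.get("OPENAI_API_KEY") or None


# In an integration test module this could gate every test, for example:
#   import pytest
#   pytestmark = pytest.mark.skipif(
#       live_openai_key() is None, reason="no OpenAI key in environment"
#   )
```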

Unexpected Responses

"`/criteria/consolidate` returns empty lists but a populated `warnings` array"

Diagnosis

A server-side quality filter sits between the LLM output and the response body. Proposals that are too low-confidence, too long, or structurally malformed are dropped and logged into warnings rather than returned. A fully-empty response with warnings means every candidate failed that filter.

Fix

Read the warnings entries for the specific reason (confidence, length, schema). Usually the input criteria are too sparse to consolidate — expand them and retry. See Backend Guide § Criteria for the filter rules.

"`/indexer/run` returns per-record `indexing_status` other than `ok`"

Diagnosis

The indexer is a best-effort batch: one malformed abstract doesn't fail the whole call. Records that couldn't be extracted — typically because the abstract is missing, extremely short, or structurally broken — come back with a non-ok indexing_status and an extraction_error message, while successful records sit next to them in the same response.

Fix

Walk the errors array and each record's extraction_error to identify the bad inputs. Usually the fix is upstream in your data cleaning (fill missing abstracts, strip HTML), not in the API call. See the AutoIndexer glossary entry for what fields the extractor expects.
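
Separating good records from bad ones can be sketched as below. The indexing_status and extraction_error fields come from the response described above; the function name and the assumption that you already hold the records as a list of dictionaries are illustrative.

```python
from typing import Dict, Iterable, List, Tuple


def split_records(records: Iterable[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Partition indexer records into (ok, failed) by indexing_status."""
    ok: List[Dict] = []
    failed: List[Dict] = []
    for record in records:
        if record.get("indexing_status") == "ok":
            ok.append(record)
        else:
            failed.append(record)
    return ok, failed


def failure_reasons(failed: Iterable[Dict]) -> List[str]:
    """Collect each failed record's extraction_error for upstream cleanup."""
    return [record.get("extraction_error", "unknown") for record in failed]
```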

Still stuck? Check Backend Guide § LLM for retry semantics, or the glossary for terminology.
