# Crystallise AI Backend — PHP Recipes

Copy-paste Guzzle/Laravel recipes for the most common integration flows.

**Prerequisites:** PHP 8.2+, Guzzle 7+, Laravel 10+ for Livewire/Broadcasting recipes. Adjust namespaces and base URL (`$baseUrl`) for your app.

### Contents

1. [Minimal screening call](#r1)
2. [Polling with exponential backoff](#r2)
3. [Per-tenant OpenAI key passthrough](#r3)
4. [Error handling with `error_category` routing](#r4)
5. [Cost cap with budget reservation](#r5)
6. [Mock-mode CI test](#r6)
7. [Question analysis from a Livewire component](#r7)
8. [AutoIndexer batch with progress streaming](#r8)

## 1. Minimal screening call

The smallest wrapper that makes the async screening API feel synchronous: submit a job, poll until it reaches a terminal state, return the `results` and `clusters` arrays. Use this when you want to prototype end-to-end quickly and you're happy to block on the outcome. For anything production-grade, swap the inline polling for Recipe 2 and the error handling for Recipe 4.

Code

```
<?php
// app/Services/Crystallise/Screener.php

declare(strict_types=1);

namespace App\Services\Crystallise;

use GuzzleHttp\Client;

final class Screener
{
    public function __construct(
        private readonly Client $http,
        private readonly string $apiKey,
        private readonly string $baseUrl,
    ) {}

    /** @return array{results: array<int, array<string, mixed>>, clusters: array<int, array<string, mixed>>} */
    public function screen(array $papers, array $criteria = [], bool $mock = true): array
    {
        $create = $this->http->post("{$this->baseUrl}/v1/screening/jobs", [
            'headers' => ['X-API-Key' => $this->apiKey],
            'json' => compact('papers', 'criteria') + ['mock' => $mock],
        ]);

        $jobId = json_decode((string) $create->getBody(), true)['job_id'];

        while (true) {
            $poll = $this->http->get("{$this->baseUrl}/v1/screening/jobs/{$jobId}", [
                'headers' => ['X-API-Key' => $this->apiKey],
            ]);
            $body = json_decode((string) $poll->getBody(), true);

            if ($body['status'] === 'completed') {
                return ['results' => $body['results'] ?? [], 'clusters' => $body['clusters'] ?? []];
            }
            if ($body['status'] === 'failed') {
                throw new \RuntimeException("Screening failed: {$body['error']}");
            }
            usleep(1_500_000);
        }
    }
}
```

Usage

```
// anywhere in your app (controller, console command, job...)
use App\Services\Crystallise\Screener;
use GuzzleHttp\Client;

$screener = new Screener(
    new Client(),
    config('services.crystallise.key'),
    config('services.crystallise.url'),
);

$out = $screener->screen(
    papers: [
        ['id' => 'p1', 'title' => 'RCT of drug X', 'abstract' => 'Randomized trial...'],
    ],
    criteria: [['name' => 'Population', 'type' => 'include', 'value' => 'Adults']],
    mock: true,
);

foreach ($out['results'] as $r) {
    echo $r['id'] . ' scored ' . $r['final_score'] . PHP_EOL;
}
```

**Note:** this recipe raises a generic `RuntimeException` on failure. For real apps, route through Recipe 4's `ErrorRouter` so you can distinguish retryable from terminal failures.
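The recipes read the service key and URL through `config('services.crystallise.*')` without showing where those values come from. A minimal sketch of the matching `config/services.php` entry follows, assuming the env var names used in Recipe 6's CI command (`CRYSTALLISE_KEY`, `CRYSTALLISE_URL`); the exact key names are an assumption, so adjust them to your app.

```php
<?php
// config/services.php (fragment; merge into the array your app already returns)

return [
    // ...other third-party services...

    // Assumed layout: key names chosen to match config('services.crystallise.key')
    // and config('services.crystallise.url') as used in these recipes.
    'crystallise' => [
        'key' => env('CRYSTALLISE_KEY'),
        'url' => env('CRYSTALLISE_URL'),
    ],
];
```

Keep the URL free of a trailing slash, since the recipes append paths such as `/v1/screening/jobs` directly to it.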

## 2. Polling with exponential backoff

A reusable poller that replaces the `while (true) { ... usleep(1_500_000); }` loop in Recipe 1. It starts at 500 ms, doubles the delay after each attempt up to a 5 s cap, and gives up after 120 attempts, which is roughly ten minutes of wall-clock time. Use it for any async endpoint (screening, indexer), since both share the `status` / `error` response shape.

Code

```
<?php
// app/Services/Crystallise/JobPoller.php

declare(strict_types=1);

namespace App\Services\Crystallise;

use GuzzleHttp\Client;

final class TimeoutException extends \RuntimeException {}

final class JobPoller
{
    private const INITIAL_DELAY_US = 500_000;
    private const MAX_DELAY_US     = 5_000_000;
    private const MAX_ATTEMPTS     = 120;

    public function __construct(
        private readonly Client $http,
        private readonly string $apiKey,
        private readonly string $baseUrl,
    ) {}

    /** @return array<string, mixed> terminal job body (status in {completed, failed}) */
    public function pollUntilTerminal(string $endpoint, string $jobId): array
    {
        $delay = self::INITIAL_DELAY_US;

        for ($attempt = 0; $attempt < self::MAX_ATTEMPTS; $attempt++) {
            $response = $this->http->get("{$this->baseUrl}{$endpoint}/{$jobId}", [
                'headers' => ['X-API-Key' => $this->apiKey],
            ]);
            $body = json_decode((string) $response->getBody(), true);

            if (in_array($body['status'] ?? '', ['completed', 'failed'], true)) {
                return $body;
            }

            usleep($delay);
            $delay = min($delay * 2, self::MAX_DELAY_US);
        }

        throw new TimeoutException("Job {$jobId} did not reach a terminal state within " . self::MAX_ATTEMPTS . ' attempts.');
    }
}
```

Usage

```
// Recipe 1's Screener, rewritten to delegate polling.
// Assumes the constructor gains a `private readonly JobPoller $poller` argument.
public function screen(array $papers, array $criteria = [], bool $mock = true): array
{
    $create = $this->http->post("{$this->baseUrl}/v1/screening/jobs", [
        'headers' => ['X-API-Key' => $this->apiKey],
        'json' => compact('papers', 'criteria') + ['mock' => $mock],
    ]);
    $jobId = json_decode((string) $create->getBody(), true)['job_id'];

    $body = $this->poller->pollUntilTerminal('/v1/screening/jobs', $jobId);

    if ($body['status'] === 'failed') {
        throw new \RuntimeException("Screening failed: {$body['error']}");
    }

    return ['results' => $body['results'] ?? [], 'clusters' => $body['clusters'] ?? []];
}
```

**Note:** 120 attempts × up to 5 s each is a soft upper bound of about ten minutes, not a guarantee. Real screening jobs over thousands of papers can legitimately exceed 5 minutes and may outlast this budget, so tune the constants or run the poll loop from a queued job rather than a web request.
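To see what the constants add up to, the sleep schedule can be computed without making any requests. The sketch below mirrors the doubling-with-cap logic of `pollUntilTerminal()`; `backoffSchedule()` is a hypothetical helper for illustration, not part of the recipe, and it ignores request latency.

```php
<?php

declare(strict_types=1);

/**
 * Worst-case sleep schedule for a capped exponential backoff.
 *
 * @return list<int> delay in microseconds slept after each non-terminal poll
 */
function backoffSchedule(int $initialUs, int $maxUs, int $attempts): array
{
    $delays = [];
    $delay  = $initialUs;

    for ($i = 0; $i < $attempts; $i++) {
        $delays[] = $delay;
        $delay    = min($delay * 2, $maxUs); // double, but never exceed the cap
    }

    return $delays;
}

// Same values as JobPoller's INITIAL_DELAY_US, MAX_DELAY_US and MAX_ATTEMPTS.
$schedule     = backoffSchedule(500_000, 5_000_000, 120);
$totalSeconds = array_sum($schedule) / 1_000_000;

$firstMs = array_map(fn (int $us): int => intdiv($us, 1000), array_slice($schedule, 0, 6));

printf("first delays (ms): %s\n", implode(', ', $firstMs));
printf("worst-case sleep: %.1f s\n", $totalSeconds);
```

With these constants the delays run 500, 1000, 2000, 4000 ms and then stay at 5000 ms, for a total of 587.5 s of sleep. Halving `MAX_ATTEMPTS` roughly halves the budget, because almost every attempt sleeps for the capped 5 s.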

## 3. Per-tenant OpenAI key passthrough

The Crystallise service accepts a per-request `X-OpenAI-API-Key` header that overrides the server-side env var, so each tenant can be billed on their own OpenAI account. The service `X-API-Key` stays constant across tenants; only the OpenAI key rotates. This recipe shows a tiny resolver class that builds the header set from a `Tenant` model with a stored OpenAI credential.

Code

```
<?php
// app/Services/Crystallise/TenantOpenAIKeyResolver.php

declare(strict_types=1);

namespace App\Services\Crystallise;

use App\Models\Tenant;

final class TenantOpenAIKeyResolver
{
    public function __construct(private readonly string $serviceApiKey) {}

    /** @return array<string, string> */
    public function headers(Tenant $tenant): array
    {
        $headers = ['X-API-Key' => $this->serviceApiKey];

        if ($tenant->openai_api_key !== null && $tenant->openai_api_key !== '') {
            $headers['X-OpenAI-API-Key'] = $tenant->openai_api_key;
        }

        return $headers;
    }
}
```

Usage

```
// Recipe 1's Guzzle call, extended with per-tenant headers
$resolver = new TenantOpenAIKeyResolver(config('services.crystallise.key'));

$create = $this->http->post("{$this->baseUrl}/v1/screening/jobs", [
    'headers' => $resolver->headers($tenant),
    'json'    => ['papers' => $papers, 'criteria' => $criteria, 'mock' => false],
]);
```

**Note:** store the OpenAI key encrypted at rest (Laravel's `Crypt` facade or a cast like `'encrypted'`). The resolver above assumes `$tenant->openai_api_key` is already decrypted by your model accessor.

## 4. Error handling with `error_category` routing

The backend surfaces errors two ways: with Guzzle's default `http_errors` setting, a 4xx/5xx response throws a `RequestException` subclass (with a classified `error_code` in the body), and async jobs can complete with `status: "failed"` carrying `error_category` and `error_retryable`. A single dispatcher can unify both and return a `Retry` / `Surface` / `Abort` decision based on the category.

Code

```
<?php
// app/Services/Crystallise/ErrorDecision.php

declare(strict_types=1);

namespace App\Services\Crystallise;

enum ErrorDecision: string
{
    case Retry   = 'retry';
    case Surface = 'surface';
    case Abort   = 'abort';
}
```

```
<?php
// app/Services/Crystallise/ErrorRouter.php

declare(strict_types=1);

namespace App\Services\Crystallise;

use GuzzleHttp\Exception\RequestException;

final class ErrorRouter
{
    /** @return array{decision: ErrorDecision, category: string, message: string, retry_after_ms: int} */
    public function fromGuzzle(RequestException $e): array
    {
        $body = $e->hasResponse()
            ? json_decode((string) $e->getResponse()->getBody(), true)
            : [];
        $detail   = $body['detail'] ?? [];
        $category = is_array($detail) ? ($detail['error_code'] ?? 'unknown') : 'unknown';
        $message  = is_array($detail) ? ($detail['message'] ?? $e->getMessage()) : (string) $detail;

        return $this->route($category, $message);
    }

    /**
     * @param array<string, mixed> $jobBody terminal body from GET /v1/screening/jobs/{id} or /v1/indexer/jobs/{id}
     * @return array{decision: ErrorDecision, category: string, message: string, retry_after_ms: int}
     */
    public function fromFailedJob(array $jobBody): array
    {
        return $this->route(
            (string) ($jobBody['error_category'] ?? 'unknown'),
            (string) ($jobBody['error'] ?? 'Unknown job failure'),
        );
    }

    /** @return array{decision: ErrorDecision, category: string, message: string, retry_after_ms: int} */
    private function route(string $category, string $message): array
    {
        $decision = match ($category) {
            'rate_limit'     => ErrorDecision::Retry,
            'server_restart' => ErrorDecision::Retry,
            'auth'           => ErrorDecision::Abort,
            'validation'     => ErrorDecision::Surface,
            default          => ErrorDecision::Abort, // 'internal', 'unknown', anything else
        };

        $retryAfterMs = match ($category) {
            'rate_limit'     => 2_000,
            'server_restart' => 5_000,
            default          => 0,
        };

        // Only snake_case keys, matching the documented return shape.
        return compact('decision', 'category', 'message') + ['retry_after_ms' => $retryAfterMs];
    }
}
```

Usage

```
use App\Services\Crystallise\ErrorDecision;
use App\Services\Crystallise\ErrorRouter;
use GuzzleHttp\Exception\RequestException;

$router = new ErrorRouter();

try {
    $response = $client->post("{$baseUrl}/v1/screening/jobs", [
        'headers' => ['X-API-Key' => $apiKey],
        'json'    => $payload,
    ]);
} catch (RequestException $e) {
    $d = $router->fromGuzzle($e);
    match ($d['decision']) {
        ErrorDecision::Retry   => $this->queueRetry($payload, $d['retry_after_ms']),
        ErrorDecision::Surface => $this->flashUserError($d['message']),
        ErrorDecision::Abort   => logger()->error('crystallise.abort', $d),
    };
}
```

**Note:** the server returns `error_code` on classified HTTP failures and `error_category` inside terminal async job bodies — same taxonomy, different field name. This router normalises both.
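Failed job bodies also carry `error_retryable`, which `ErrorRouter` does not consult. The sketch below shows one way to prefer that flag when it is present and fall back to the category table otherwise. It assumes `error_retryable` is a boolean; `isRetryableJobFailure()` is a hypothetical helper, not part of the recipe.

```php
<?php

declare(strict_types=1);

/**
 * Decide whether a failed async job is worth resubmitting.
 *
 * Assumption: `error_retryable`, when present, is a boolean set by the server.
 *
 * @param array<string, mixed> $jobBody terminal job body with status "failed"
 */
function isRetryableJobFailure(array $jobBody): bool
{
    // Trust the server's explicit flag when it is there.
    if (isset($jobBody['error_retryable']) && is_bool($jobBody['error_retryable'])) {
        return $jobBody['error_retryable'];
    }

    // Fallback: the same retryable categories that ErrorRouter::route() maps to Retry.
    $category = (string) ($jobBody['error_category'] ?? 'unknown');

    return in_array($category, ['rate_limit', 'server_restart'], true);
}

// Example: the explicit flag wins over the category.
var_dump(isRetryableJobFailure([
    'status'          => 'failed',
    'error_category'  => 'rate_limit',
    'error_retryable' => false,
]));
```

Whether the flag or the category should win is a policy choice; this sketch sides with the server on the grounds that it has more context about the failure.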

## 5. Cost cap with budget reservation

Before submitting a large screening batch, call `POST /v1/screening/estimate` to get an approximate cost. If the estimate already exceeds a tenant policy ceiling, reject locally. If it's within budget, submit the job with `max_estimated_cost_usd` set so the server enforces the ceiling too — the two layers defend against drift between the client-side estimate and the job's own recalculation.

Code

```
<?php
// app/Services/Crystallise/BudgetedScreener.php

declare(strict_types=1);

namespace App\Services\Crystallise;

use App\Models\Tenant;
use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;

final class BudgetOverrunException extends \RuntimeException {}

final class BudgetedScreener
{
    public function __construct(
        private readonly Client $http,
        private readonly Screener $screener,
        private readonly string $apiKey,
        private readonly string $baseUrl,
    ) {}

    public function screenForTenant(Tenant $tenant, array $papers, array $criteria = []): array
    {
        $ceiling = (float) $tenant->screening_budget_usd;

        $estimateRes = $this->http->post("{$this->baseUrl}/v1/screening/estimate", [
            'headers' => ['X-API-Key' => $this->apiKey],
            'json'    => [
                'papers_count'   => count($papers),
                'criteria_count' => count($criteria),
                'repetitions'    => 5,
            ],
        ]);
        $estimate = (float) json_decode((string) $estimateRes->getBody(), true)['estimated_cost_usd'];

        if ($estimate > $ceiling) {
            throw new BudgetOverrunException(
                "Estimated \${$estimate} exceeds tenant ceiling \${$ceiling}."
            );
        }

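        try {
            // Caveat: Recipe 1's Screener::screen() does not send `max_estimated_cost_usd`.
            // Extend its JSON payload to forward $ceiling; otherwise only the local check above applies.
            return $this->screener->screen($papers, $criteria, mock: false);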
        } catch (RequestException $e) {
            // Defence-in-depth: server may reject with 400 if the job's own recalc overshoots.
            if ($e->getResponse()?->getStatusCode() === 400) {
                throw new BudgetOverrunException(
                    'Server rejected job: ' . (string) $e->getResponse()->getBody()
                );
            }
            throw $e;
        }
    }
}
```

Usage

```
$budgeted = new BudgetedScreener($client, $screener, $apiKey, $baseUrl);

try {
    $out = $budgeted->screenForTenant($tenant, $papers, $criteria);
} catch (BudgetOverrunException $e) {
    return back()->withErrors(['budget' => $e->getMessage()]);
}
```

**Note:** the cost estimate has a stated ±30% variance. Set tenant ceilings with headroom, and treat the estimate as guidance, not a bill.
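One way to build in that headroom is to compare the ceiling against the top of the variance band rather than the raw estimate. A minimal sketch, assuming the ±30% figure applies symmetrically to `estimated_cost_usd`; `fitsBudgetWithHeadroom()` is a hypothetical helper.

```php
<?php

declare(strict_types=1);

/**
 * True when the estimate still fits the ceiling after adding the variance band.
 *
 * @param float $variance fractional variance of the estimate (0.30 for the stated ±30%)
 */
function fitsBudgetWithHeadroom(float $estimateUsd, float $ceilingUsd, float $variance = 0.30): bool
{
    $worstCaseUsd = $estimateUsd * (1.0 + $variance);

    return $worstCaseUsd <= $ceilingUsd;
}

// A $10 estimate can plausibly cost about $13, so it fits a $15 ceiling but not a $12 one.
var_dump(fitsBudgetWithHeadroom(10.0, 15.0));
var_dump(fitsBudgetWithHeadroom(10.0, 12.0));
```

Swapping this check in for the plain `$estimate > $ceiling` comparison rejects more jobs up front, which trades some false rejections for fewer surprise overruns.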

## 6. Mock-mode CI test

Mock mode (`"mock": true`) returns canned data with no OpenAI call, which makes it a good fit for CI: no secrets, no token spend, deterministic structure. Assert only structural properties: that the job reaches `completed` (implied when `Screener::screen()` returns) and that `results` arrives with the expected keys. Never assert on score values or reasoning text; mock fixtures can change without being a bug.

Code

```
<?php
// tests/Feature/CrystalliseScreenerTest.php

declare(strict_types=1);

namespace Tests\Feature;

use App\Services\Crystallise\Screener;
use GuzzleHttp\Client;
use PHPUnit\Framework\Attributes\Test;
use Tests\TestCase;

final class CrystalliseScreenerTest extends TestCase
{
    #[Test]
    public function it_completes_a_mock_screening_end_to_end(): void
    {
        $screener = new Screener(
            new Client(),
            config('services.crystallise.key'),
            config('services.crystallise.url'),
        );

        $out = $screener->screen(
            papers: [
                ['id' => 'p1', 'title' => 'Trial', 'abstract' => 'Randomized...'],
            ],
            criteria: [],
            mock: true,
        );

        $this->assertArrayHasKey('results', $out);
        $this->assertArrayHasKey('clusters', $out);
        $this->assertNotEmpty($out['results']);
        $this->assertArrayHasKey('id', $out['results'][0]);
        $this->assertArrayHasKey('final_score', $out['results'][0]);
    }
}
```

Usage

```
# Run with a live test instance pointed at a mock-only key:
CRYSTALLISE_URL=https://api-staging.example.com \
CRYSTALLISE_KEY=ci-dev-key \
  php artisan test --filter=CrystalliseScreenerTest
```

**Note:** we test against a live service with `mock: true` rather than Laravel's HTTP fake, because the point is to catch breaking changes in request/response shape — a fake would just mirror your own assumptions. If your CI can't reach the service, fall back to `Http::fake()` and test only that your payloads are well-formed.

## 7. Question analysis from a Livewire component

A Livewire v3 component that calls `POST /v1/criteria/analyze-question` on the blur of a textarea and renders the response inline. The endpoint is synchronous (one request, one response) so no job polling is needed — it fits naturally into a reactive UI where users iterate on their research question until `status === "ready"`.

Code

```
<?php
// app/Livewire/ResearchQuestionAnalyzer.php

declare(strict_types=1);

namespace App\Livewire;

use GuzzleHttp\Client;
use Livewire\Component;

final class ResearchQuestionAnalyzer extends Component
{
    public string $question = '';
    public ?string $status = null;
    /** @var string[] */
    public array $missingElements = [];
    public ?string $suggestion = null;

    public function updatedQuestion(): void
    {
        if (trim($this->question) === '') {
            $this->reset(['status', 'missingElements', 'suggestion']);
            return;
        }

        $response = (new Client())->post(
            config('services.crystallise.url') . '/v1/criteria/analyze-question',
            [
                'headers' => ['X-API-Key' => config('services.crystallise.key')],
                'json'    => ['research_question' => $this->question, 'mock' => app()->environment('local')],
            ],
        );

        $body = json_decode((string) $response->getBody(), true);
        $this->status           = $body['status'] ?? null;
        $this->missingElements  = $body['missing_elements'] ?? [];
        $this->suggestion       = $body['suggestion'] ?? null;
    }

    public function render()
    {
        return view('livewire.research-question-analyzer');
    }
}
```

```
{{-- resources/views/livewire/research-question-analyzer.blade.php --}}
<div>
  <textarea wire:model.lazy="question" rows="3" class="w-full"></textarea>

  @if ($status)
    <div class="mt-2 text-sm">
      <strong>Status:</strong> {{ $status }}
      @if (!empty($missingElements))
        <ul class="list-disc ml-6 mt-1">
          @foreach ($missingElements as $m)
            <li>{{ $m }}</li>
          @endforeach
        </ul>
      @endif
      @if ($suggestion)
        <p class="mt-1 italic">{{ $suggestion }}</p>
      @endif
    </div>
  @endif
</div>
```

Usage

```
<!-- in any Blade view -->
<livewire:research-question-analyzer />
```

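**Note:** `updatedQuestion()` fires whenever Livewire syncs a new value for `$question`. `wire:model.lazy` defers that sync until the textarea's `change` event, which fires on blur after an edit and matches the "finish typing, then analyse" UX; Livewire v3 also offers `wire:model.blur` for the same purpose. For a debounced live-analysis flow, swap in `wire:model.live.debounce.500ms`.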

## 8. AutoIndexer batch with progress streaming

Kick off a long-running indexer job from a queued Laravel job, poll for progress, and broadcast incremental updates to the browser over a websocket channel. The queued job is the durability boundary (restarts resume from the last polled state if you persist `job_id`); the broadcast is only a UX convenience. The recipe uses `partial_results` from `GET /v1/indexer/jobs/{id}` to stream records as they complete.

Code

```
<?php
// app/Events/IndexerProgress.php

declare(strict_types=1);

namespace App\Events;

use Illuminate\Broadcasting\Channel;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;

final class IndexerProgress implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets;

    public function __construct(
        public readonly int $tenantId,
        public readonly string $jobId,
        public readonly float $progress,
        public readonly array $partialResults,
    ) {}

    public function broadcastOn(): Channel
    {
        return new Channel("tenant.{$this->tenantId}.indexer");
    }
}
```

```
<?php
// app/Jobs/ProcessIndexerBatch.php

declare(strict_types=1);

namespace App\Jobs;

use App\Events\IndexerProgress;
use GuzzleHttp\Client;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

final class ProcessIndexerBatch implements ShouldQueue
{
    use Dispatchable, Queueable;

    public function __construct(
        public readonly int $tenantId,
        public readonly array $records,
        public readonly array $fields,
    ) {}

    // Polls inline rather than via JobPoller, because every poll also broadcasts progress.
    // (JobPoller's string constructor arguments also cannot be auto-resolved by the container.)
    public function handle(Client $http): void
    {
        $apiKey  = config('services.crystallise.key');
        $baseUrl = config('services.crystallise.url');

        $create = $http->post("{$baseUrl}/v1/indexer/jobs", [
            'headers' => ['X-API-Key' => $apiKey],
            'json'    => ['records' => $this->records, 'fields' => $this->fields, 'mode' => 'full'],
        ]);
        $jobId = json_decode((string) $create->getBody(), true)['job_id'];

        do {
            $res = $http->get("{$baseUrl}/v1/indexer/jobs/{$jobId}", [
                'headers' => ['X-API-Key' => $apiKey],
            ]);
            $body = json_decode((string) $res->getBody(), true);

            // No ->toOthers(): a queued job has no originating socket to exclude.
            broadcast(new IndexerProgress(
                tenantId:       $this->tenantId,
                jobId:          $jobId,
                progress:       (float) ($body['progress'] ?? 0),
                partialResults: $body['partial_results'] ?? [],
            ));

            if (in_array($body['status'], ['completed', 'failed'], true)) {
                return;
            }
            sleep(2);
        } while (true);
    }
}
```

Usage

```
// Dispatch from a controller
ProcessIndexerBatch::dispatch($tenant->id, $records, $fields);

// Browser-side: listen on the same channel (Laravel Echo + Pusher)
Echo.channel(`tenant.${tenantId}.indexer`).listen('IndexerProgress', (e) => {
    updateProgressBar(e.progress);
    appendRows(e.partialResults);
});
```

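**Note:** this recipe shows the integration seam, not a complete Pusher/Reverb setup; configure `config/broadcasting.php` and the JS `Echo` client per the Laravel docs. The example broadcasts on a public `Channel` for brevity. For tenant data, prefer a `PrivateChannel` with `Echo.private(...)` and an authorization callback in `routes/channels.php`. The poll interval (2 s) is coarse on purpose: broadcast storms are worse than slightly stale progress.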
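The polling loop in `ProcessIndexerBatch::handle()` runs until the server reports a terminal state, with no upper bound of its own. The sketch below shows the same poll-and-notify loop with a wall-clock deadline, written against plain callables so it can be exercised without Guzzle or a queue worker. `pollWithDeadline()` and its parameters are hypothetical names, and the 2 s default interval simply mirrors the recipe.

```php
<?php

declare(strict_types=1);

/**
 * Poll $fetch until the job body reports a terminal status or the deadline passes.
 *
 * @param callable(): array<string, mixed>     $fetch      returns the decoded job body
 * @param callable(array<string, mixed>): void $onProgress called once per poll (e.g. to broadcast)
 * @return array<string, mixed> the terminal job body
 */
function pollWithDeadline(
    callable $fetch,
    callable $onProgress,
    float $timeoutSeconds,
    int $intervalUs = 2_000_000,
): array {
    $deadline = microtime(true) + $timeoutSeconds;

    do {
        $body = $fetch();
        $onProgress($body); // always report, including the terminal poll

        if (in_array($body['status'] ?? '', ['completed', 'failed'], true)) {
            return $body;
        }

        usleep($intervalUs);
    } while (microtime(true) < $deadline);

    throw new RuntimeException('Indexer job did not finish before the deadline.');
}

// Usage with canned bodies standing in for GET /v1/indexer/jobs/{id}.
$bodies = [
    ['status' => 'pending', 'progress' => 0.5],
    ['status' => 'completed', 'progress' => 1.0],
];
$seen = [];

$final = pollWithDeadline(
    fetch: function () use (&$bodies): array {
        return array_shift($bodies);
    },
    onProgress: function (array $body) use (&$seen): void {
        $seen[] = $body['progress'];
    },
    timeoutSeconds: 5.0,
    intervalUs: 0,
);
```

Inside the job, `$fetch` would wrap the Guzzle `get()` call and `$onProgress` would wrap `broadcast(new IndexerProgress(...))`. Keep the deadline below the queue worker's own timeout so the job fails with this exception rather than being killed mid-poll.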

**Where to go next:** [Playbook](playbook.md) for the conceptual flow, [API Reference](api-reference.md) for full endpoint details, [Troubleshooting](troubleshooting.md) for common errors.
