p1_extraction_runner¶

UC1-P1 extraction runner — execute the FNOL extraction prompt and parse output.

class prompt_risk.uc.uc1.p1_extraction_runner.P1ExtractionUserPromptData(*, source: str, narrative: str)[source]¶

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

class prompt_risk.uc.uc1.p1_extraction_runner.P1ExtractionOutput(*, date_of_loss: str, time_of_loss: str, location: str, line_of_business_hint: str, parties_involved: list[str], damage_description: str, injury_indicator: Literal['none', 'minor', 'moderate', 'severe', 'fatal'], police_report: str, evidence_available: list[str], estimated_severity: Literal['low', 'medium', 'high'])[source]¶

Structured output for the P1 FNOL extraction prompt.

Each field mirrors the JSON schema specified in the system prompt. Pydantic validators enforce that the model returns values within the expected formats and enumerations. When validation fails, the retry loop in run() feeds the error back to the model so it can self-correct — see run() for details.

model_config: ClassVar[ConfigDict] = {}¶: Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

prompt_risk.uc.uc1.p1_extraction_runner.MAX_RETRIES = 3¶

Maximum number of converse API calls per run() invocation.

LLM output is non-deterministic — even with a well-crafted prompt, the model may occasionally return values that violate the output schema (e.g. a date in MM/DD/YYYY instead of YYYY-MM-DD, or a severity string outside the allowed enum). Rather than failing immediately, we feed the Pydantic validation error back to the model as a follow-up user message so it can self-correct. Three attempts strikes a balance between resilience and cost: most fixable errors resolve on the second try, and a third guards against edge cases without runaway API spend.

prompt_risk.uc.uc1.p1_extraction_runner.run_p1_extraction(client: BedrockRuntimeClient, data: P1ExtractionUserPromptData, prompt_version: str = '01', model_id: str = 'us.amazon.nova-2-lite-v1:0') → P1ExtractionOutput[source]¶

Execute the P1 extraction prompt and return validated output.

System prompt caching — The system prompt is static (no Jinja variables) by design. This lets us place a cachePoint after it so that Bedrock caches the prefix across calls. When the same system prompt is reused — whether across retries within a single run_p1_extraction() call or across independent invocations — subsequent requests hit the cache and skip redundant input processing, reducing both latency and cost.

Why the user prompt is NOT cached — The user prompt contains the per-request FNOL narrative and is different for every claim. Caching it would incur a cache-write cost on every call with virtually zero chance of a cache hit, making it a net loss. During retries the user prompt is already present in the messages history, so the model sees it without any extra caching mechanism.

Retry on validation failure — LLM output is non-deterministic. When Pydantic validation fails (e.g. wrong date format, invalid enum value), we append the model’s raw reply as an assistant message and the validation error as a user message, then call the API again. This gives the model concrete feedback on what went wrong so it can self-correct. We allow up to MAX_RETRIES attempts; if all fail, the last exception is re-raised.