p1_extraction_runner¶
UC1-P1 extraction runner — execute the FNOL extraction prompt and parse output.
- class prompt_risk.uc.uc1.p1_extraction_runner.P1ExtractionUserPromptData(*, source: str, narrative: str)[source]¶
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class prompt_risk.uc.uc1.p1_extraction_runner.P1ExtractionOutput(*, date_of_loss: str, time_of_loss: str, location: str, line_of_business_hint: str, parties_involved: list[str], damage_description: str, injury_indicator: Literal['none', 'minor', 'moderate', 'severe', 'fatal'], police_report: str, evidence_available: list[str], estimated_severity: Literal['low', 'medium', 'high'])[source]¶
Structured output for the P1 FNOL extraction prompt.
Each field mirrors the JSON schema specified in the system prompt. Pydantic validators enforce that the model returns values within the expected formats and enumerations. When validation fails, the retry loop in
run()feeds the error back to the model so it can self-correct — seerun()for details.- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- prompt_risk.uc.uc1.p1_extraction_runner.MAX_RETRIES = 3¶
Maximum number of converse API calls per
run()invocation.LLM output is non-deterministic — even with a well-crafted prompt, the model may occasionally return values that violate the output schema (e.g. a date in
MM/DD/YYYYinstead ofYYYY-MM-DD, or a severity string outside the allowed enum). Rather than failing immediately, we feed the Pydantic validation error back to the model as a follow-up user message so it can self-correct. Three attempts strikes a balance between resilience and cost: most fixable errors resolve on the second try, and a third guards against edge cases without runaway API spend.
- prompt_risk.uc.uc1.p1_extraction_runner.run_p1_extraction(client: BedrockRuntimeClient, data: P1ExtractionUserPromptData, prompt_version: str = '01', model_id: str = 'us.amazon.nova-2-lite-v1:0') P1ExtractionOutput[source]¶
Execute the P1 extraction prompt and return validated output.
System prompt caching — The system prompt is static (no Jinja variables) by design. This lets us place a
cachePointafter it so that Bedrock caches the prefix across calls. When the same system prompt is reused — whether across retries within a singlerun_p1_extraction()call or across independent invocations — subsequent requests hit the cache and skip redundant input processing, reducing both latency and cost.Why the user prompt is NOT cached — The user prompt contains the per-request FNOL narrative and is different for every claim. Caching it would incur a cache-write cost on every call with virtually zero chance of a cache hit, making it a net loss. During retries the user prompt is already present in the
messageshistory, so the model sees it without any extra caching mechanism.Retry on validation failure — LLM output is non-deterministic. When Pydantic validation fails (e.g. wrong date format, invalid enum value), we append the model’s raw reply as an
assistantmessage and the validation error as ausermessage, then call the API again. This gives the model concrete feedback on what went wrong so it can self-correct. We allow up toMAX_RETRIESattempts; if all fail, the last exception is re-raised.