Commit Graph

13 Commits

Author SHA1 Message Date
Jonathan Barazany 32cb6f0cd9 fix: guard short-circuit against providers with native agentic loop
- Skip short-circuit for providers that have a BaseAnthropicMessagesConfig
  (bedrock, vertex_ai, azure_ai, anthropic) — they use the agentic loop
  which includes a follow-up LLM synthesis step. Short-circuiting would
  return raw search text instead of an LLM-synthesized answer.
- Add fallback to litellm.get_llm_provider() for custom_llm_provider
  derivation when litellm_params is overwritten by kwargs.
- Add test for bedrock guard.

Addresses Greptile review comments #3 and #4.
2026-03-20 01:07:20 +02:00
Jonathan Barazany 141ad04955 refactor: reuse get_last_user_message, fix UUID convention, move import
- Replace hand-rolled _extract_search_query with existing
  get_last_user_message from common_utils
- Use full UUID (str(uuid.uuid4())) to match codebase convention
- Move uuid import to module level per CLAUDE.md
2026-03-19 19:56:42 +02:00
Jonathan Barazany 3b129260f5 fix: use original_stream for short-circuit, propagate derived provider
Addresses Greptile review feedback:
- Save original stream flag before pre-request hooks convert it, so
  streaming callers get SSE events instead of a plain dict
- Propagate custom_llm_provider derived inside _execute_pre_request_hooks
  when it was not explicitly passed by the caller
- Add tests covering both scenarios
2026-03-19 19:52:16 +02:00
Jonathan Barazany b5a775d54e style: fix Black formatting in test file 2026-03-19 19:47:13 +02:00
Jonathan Barazany 6d0763b8ba fix: short-circuit websearch for non-Anthropic providers (github_copilot)
For providers like github_copilot that don't natively support web search,
Claude Code's search sub-conversations were falling through to the adapter
path which strips the web_search tool and has no stream reconversion.

Instead of routing search requests through the full LLM pipeline, detect
web-search-only requests early (all tools are web_search, simple prompt)
and execute the search directly via Tavily/Perplexity, returning a
synthetic Anthropic response. No adapter, no backend LLM call needed.

Fixes #21733
2026-03-19 19:28:05 +02:00
giulio-leone 7b0ed0ff91 fix: replace sk-fake with safe test key to avoid secret scanner
Replace 'sk-fake' with 'fake-key-for-testing' in websearch interception
tests to prevent false-positive secret scanner triggers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-05 18:29:28 +01:00
giulio-leone 12691dcce3 fix: WebSearch interception fails with thinking enabled + SpendLimit constraint 2026-03-04 22:44:52 +01:00
michelligabriele 4630793fb0 fix(websearch_interception): preserve thinking blocks in agentic loop follow-up messages
When extended thinking is enabled, the websearch interception agentic loop
builds a follow-up assistant message with only tool_use blocks. Anthropic's
API requires assistant messages to start with thinking/redacted_thinking
blocks when thinking is enabled, causing a 400 Bad Request.

Extract thinking blocks from the model's initial response, thread them
through the agentic loop, and prepend them to the follow-up assistant
message — matching the pattern used by anthropic_messages_pt in factory.py.

Fixes the error: "Expected 'thinking' or 'redacted_thinking', but found
'tool_use'"
2026-02-19 21:51:00 +01:00
michelligabriele 053ee4826f fix(websearch_interception): fix pre_call_deployment_hook not triggering via proxy router (#21433)
* fix(websearch_interception): fix pre_call_deployment_hook not triggering via proxy router

Fix provider lookup (check top-level kwargs + fallback to get_llm_provider),
return full kwargs dict instead of partial, and use OpenAI-format tool definition.

* remove unnecessary inline import
2026-02-19 06:38:45 -08:00
Sameer Kankute 4e94ecb08d Add tests for WebSearch interception with chat completions API 2026-02-09 13:41:29 +05:30
mpcusack-altos 88f8f49e1d fix(websearch_interception): filter internal kwargs before follow-up request (#19577)
The websearch interception handler was passing internal flags like
`_websearch_interception_converted_stream` to the follow-up LLM request.
This caused "Extra inputs are not permitted" errors from providers like
Bedrock that use strict Pydantic validation.

Fix: Filter out all kwargs starting with `_websearch_interception` prefix
before making the follow-up anthropic_messages.acreate() call.
2026-01-22 10:42:20 -08:00
John Greek aa4b0e0149 Fix duplicate test_handler.py filenames causing pytest collection errors (#19385) 2026-01-21 08:47:50 -08:00
Ishaan Jaff 104283ae8f [Feat] Claude Code - Add Websearch support using LiteLLM /search (using web search interception hook) (#19263)
* init WebSearchInterceptionLogger

* test_websearch_interception_real_call

* init async_should_run_agentic_completion

* async_should_run_agentic_loop

* async_run_agentic_loop

* refactor folder

* fix organization

* WebSearchTransformation

* WebSearchInterceptionLogger

* _call_agentic_completion_hooks

* WebSearch Interception Architecture

* test_websearch_interception_real_call

* add streaming

* add transform_request for streaming

* get_llm_provider

* test fix

* fix info

* init from config.yaml

* fixes

* test handler

* fix _is_streaming_response

* async_run_agentic_loop

* mypy fix
2026-01-16 21:10:05 -08:00