* add: generic object pool & tests
Introduced a reusable object pool that can be applied across the codebase.
Note: memory growth is bounded by eviction settings rather than a hard cap.
A hard cap could reduce performance, so eviction is the preferred safeguard.
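
A minimal sketch of the idea, not the pool added in this commit: objects are
created on demand with no hard cap, and idle objects are evicted once they
have been unused for longer than a TTL. The names `ObjectPool`, `idle_ttl`,
`clock`, and `idle_count` are hypothetical and chosen only for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Generic, Tuple, TypeVar

T = TypeVar("T")


class ObjectPool(Generic[T]):
    """Hypothetical pool: no hard cap, idle objects expire after idle_ttl seconds."""

    def __init__(
        self,
        factory: Callable[[], T],
        idle_ttl: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._factory = factory
        self._idle_ttl = idle_ttl
        self._clock = clock
        # (release_time, object) pairs, oldest release on the left.
        self._idle: Deque[Tuple[float, T]] = deque()

    def acquire(self) -> T:
        self._evict()
        if self._idle:
            # Reuse the most recently released object.
            return self._idle.pop()[1]
        # Nothing idle: create a new object instead of blocking on a cap.
        return self._factory()

    def release(self, obj: T) -> None:
        self._idle.append((self._clock(), obj))

    def idle_count(self) -> int:
        return len(self._idle)

    def _evict(self) -> None:
        # Drop objects that have been idle longer than the TTL.
        cutoff = self._clock() - self._idle_ttl
        while self._idle and self._idle[0][0] < cutoff:
            self._idle.popleft()
```

Under this design a burst of requests never waits on the pool; memory is
reclaimed later, when the extra objects go unused.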
* fix: simpler tests
* fix: avoid redundant __init__ calls on hot path
Previously, imports on the request hot path caused __init__ to run
on every request. This change ensures initialization happens once,
reducing CPU overhead.
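
The changed code is not shown here. One common way to get "initialize once"
behaviour is sketched below: the costly object is built on first use and
cached, so the request handler never triggers __init__ again.
`ExpensiveHelper`, `get_helper`, and `handle_request` are hypothetical names.

```python
from functools import lru_cache


class ExpensiveHelper:
    """Hypothetical stand-in for an object whose __init__ is costly."""

    init_calls = 0  # counts constructions, for demonstration only

    def __init__(self) -> None:
        ExpensiveHelper.init_calls += 1


@lru_cache(maxsize=1)
def get_helper() -> ExpensiveHelper:
    # The first call constructs the helper; later calls return the cached instance.
    return ExpensiveHelper()


def handle_request(payload: str) -> str:
    helper = get_helper()  # no per-request __init__
    return f"{type(helper).__name__}:{payload}"
```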
* fix: remove redundant __init__ import
The current implementation no longer requires an import at the top of the function.
* fix: move to core utils for future reuse
* test: add coverage & remove inline import
A general import-checking tool across all endpoints would be a large PR.
This commit focuses on a smaller, targeted fix for the discussed case.
* add: import check in CI
- Apply Black formatting to all Bedrock CountTokens files
- Clean up imports and remove unused variables in tests
- Fix indentation and simplify test structure
- Fix pyright type error with type ignore annotation
- All tests continue to pass after cleanup
- Add endpoint integration test in test_proxy_token_counter.py
- Add unit tests for transformation logic in bedrock/count_tokens/
- Test model extraction from request body vs endpoint path
- Test input format detection (converse vs invokeModel)
- Test request transformation from Anthropic to Bedrock format
- All tests follow existing codebase patterns and pass successfully
* fix: remove iscoroutine from hot path
* fix: replace all instances & separate concerns
1. Replaced all instances of iscoroutine with is_async_callable
2. Moved the coroutine checker into its own file
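
For illustration only: the name is_async_callable comes from this commit,
but the body below is an assumption about what such a helper might do, not
the code that was merged. It treats `async def` functions,
`functools.partial` wrappers around them, and instances with an async
`__call__` as async callables.

```python
import functools
import inspect
from typing import Any


def is_async_callable(obj: Any) -> bool:
    """Return True if calling obj would produce a coroutine (sketch)."""
    # Unwrap functools.partial layers to reach the underlying callable.
    while isinstance(obj, functools.partial):
        obj = obj.func
    if inspect.iscoroutinefunction(obj):
        return True
    if inspect.isclass(obj):
        # Calling a class constructs an instance; it does not return a coroutine.
        return False
    # Instances whose class defines `async def __call__`.
    return callable(obj) and inspect.iscoroutinefunction(type(obj).__call__)
```

Keeping the check in one helper means every call site answers the question
the same way.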
* fix: PR comment changes
* fix: missing config setting declaration
* fix: revert non-performance related changes
* fix: revert to initial implementation
* fix: remove dead const
Bedrock Guardrails - support setting bedrock runtime endpoint + Protect `/health/test_connect` to prevent users without model creation permissions from calling it
UI - allow team member to view service account keys they create + Anthropic - include cache creation tokens in prompt token total (separate out during cost tracking)
- Change truncation strategy from head-only to middle-truncation (35% start, 65% end)
- Preserve both beginning and end of long strings for better debugging context
- Apply same sanitization to response payloads when store_prompts_in_spend_logs is enabled
- Increase default MAX_STRING_LENGTH_PROMPT_IN_DB from 1000 to 2048 characters
- Update tests to verify new truncation behavior with 35%-65% split
This provides better diagnostic value by keeping the more important end context
while still maintaining storage limits.
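
The truncation described above can be sketched as follows. The 35%/65% split
and the 2048 default come from this change; the function name
`truncate_middle` and the marker text are hypothetical, and the real
sanitizer may differ in both.

```python
MAX_STRING_LENGTH_PROMPT_IN_DB = 2048  # new default stated in this change


def truncate_middle(value: str, max_length: int = MAX_STRING_LENGTH_PROMPT_IN_DB) -> str:
    """Keep the first 35% and last 65% of the character budget; drop the middle."""
    if len(value) <= max_length:
        return value
    start_chars = max_length * 35 // 100   # head budget (integer math, no float rounding)
    end_chars = max_length - start_chars   # tail budget gets the remainder
    skipped = len(value) - max_length
    marker = f"... ({skipped} chars truncated) ..."  # hypothetical marker text
    head = value[:start_chars]
    tail = value[len(value) - end_chars:]
    return f"{head}{marker}{tail}"
```

With the 2048 default this keeps the first 716 and the last 1332 characters.
In this sketch the marker is added on top of the budget, so the stored
string is slightly longer than max_length.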