12 Commits

Author SHA1 Message Date
yuneng-jiang acfaea9d25 [Fix] Reset api_base/api_key in xdist conftest to prevent cross-test leakage
test_rerank.py sets litellm.api_base = "http://localhost:4000" which leaked
to all subsequent tests on the same xdist worker, causing connection failures
across every provider (Cohere, Azure, OpenAI, etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 23:55:44 -07:00
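
The reset described in acfaea9d25 could look roughly like the sketch below. It is not the actual conftest code: `fake_litellm` is a stand-in namespace so the example runs without litellm installed, `reset_connection_globals` is a hypothetical helper name, and the sketch assumes the unset value for both attributes is `None`. In a real conftest the helper body would typically sit inside an autouse pytest fixture.

```python
import types

# Stand-in for the real `litellm` module (assumption: both globals default to None).
fake_litellm = types.SimpleNamespace(api_base=None, api_key=None)


def reset_connection_globals(module):
    """Clear connection overrides that an earlier test may have left behind."""
    module.api_base = None
    module.api_key = None


# Simulate a test that sets a module-level global and never cleans up.
fake_litellm.api_base = "http://localhost:4000"

# What the per-test reset would do before the next test starts.
reset_connection_globals(fake_litellm)
```

Resetting to a known value, rather than restoring whatever was there when the fixture started, avoids carrying forward a value that an earlier test already contaminated.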
yuneng-jiang 5db6aef834 [Fix] Restore xdist test isolation: capture true defaults and poll cooldowns
The revert of 9711e3adfe left xdist tests without proper state isolation.
Module-level assignments like `litellm.num_retries = 3` in 12+ test files
pollute shared globals, and the fixture was saving/restoring contaminated
values instead of resetting to true defaults.

- Capture true litellm defaults at conftest import time and reset before
  each test (local_testing + llm_translation)
- Make llm_translation/conftest.py xdist-safe (skip reload under xdist,
  add state isolation)
- Replace asyncio.sleep(2) with polling in cooldown handler tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 23:33:21 -07:00
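
A minimal sketch of the "capture true defaults at import time" idea from 5db6aef834, under stated assumptions: `fake_litellm` stands in for the real module, and `TRACKED`, `TRUE_DEFAULTS`, and `reset_to_defaults` are hypothetical names, not the ones used in the repository's conftest files.

```python
import copy
import types

# Stand-in for the real `litellm` module so the sketch is self-contained.
fake_litellm = types.SimpleNamespace(num_retries=None, callbacks=[])

# Hypothetical list of globals that tests are known to mutate.
TRACKED = ("num_retries", "callbacks")

# Snapshot taken once, when this module is imported, before any test runs.
TRUE_DEFAULTS = {name: copy.deepcopy(getattr(fake_litellm, name)) for name in TRACKED}


def reset_to_defaults(module):
    """Restore every tracked global to its import-time value."""
    for name, value in TRUE_DEFAULTS.items():
        # Deep-copy so a test that mutates a list cannot alter the snapshot.
        setattr(module, name, copy.deepcopy(value))


# Simulate module-level pollution from a test file.
fake_litellm.num_retries = 3
fake_litellm.callbacks.append("leaked_hook")

reset_to_defaults(fake_litellm)
```

The difference from a save/restore fixture is where the reference values come from: a snapshot taken once up front cannot be contaminated by a test module that assigned a global before the fixture first ran.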
yuneng-jiang b4f7d11a82 Revert "Fix xdist test isolation: capture true defaults and poll instead of sleep"
This reverts commit 9711e3adfe.
2026-03-15 22:57:39 -07:00
yuneng-jiang 9711e3adfe Fix xdist test isolation: capture true defaults and poll instead of sleep
The conftest fixtures were saving/restoring the current (potentially
contaminated) values of litellm globals like num_retries instead of
resetting to true defaults. Under xdist, module-level assignments
(e.g. `litellm.num_retries = 3` in 12+ test files) pollute the
shared module state and leak across tests in the same worker.

- Capture true litellm defaults at conftest import time and reset
  before each test (local_testing + llm_translation)
- Make llm_translation/conftest.py xdist-safe (skip reload, add
  state isolation)
- Replace asyncio.sleep(2) with polling in cooldown handler tests
- Add @pytest.mark.flaky to tests making real API calls under xdist

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 22:27:26 -07:00
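
Replacing a fixed `asyncio.sleep(2)` with polling, as 9711e3adfe describes, can be sketched as follows. `wait_until` is a hypothetical helper written for illustration, and the `state` dictionary stands in for whatever the cooldown handler tests actually inspect.

```python
import asyncio
import time


async def wait_until(condition, timeout=2.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        await asyncio.sleep(interval)
    # One last check so a condition met right at the deadline still counts.
    return condition()


async def demo():
    state = {"cooled_down": False}

    async def background():
        # Stand-in for the asynchronous work the test is waiting on.
        await asyncio.sleep(0.1)
        state["cooled_down"] = True

    task = asyncio.create_task(background())
    ok = await wait_until(lambda: state["cooled_down"])
    await task
    return ok


result = asyncio.run(demo())
```

Polling returns as soon as the condition holds, so the test is both faster in the common case and less sensitive to slow CI machines than a fixed sleep.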
yuneng-jiang 1a00dd4dbb Fix router test isolation for xdist and rebalance proxy unit tests
Router tests: expand conftest save/restore to cover all globals mutated
by router tests (default_fallbacks, tag_budget_config, request_timeout,
enable_azure_ad_token_refresh, num_retries_per_request, model_cost,
token_counter). These were leaking between tests that share an xdist worker.

Proxy tests: move test_proxy_utils.py (169 parametrized) and
test_proxy_server.py (72 parametrized) from part2 to part1, balancing
~370 vs ~360 tests (was ~129 vs ~600).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 21:36:56 -07:00
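
The rebalancing figures in 1a00dd4dbb can be checked with simple arithmetic. The counts below are the approximate numbers quoted in the commit message; the variable names are only for illustration.

```python
# Approximate test counts before the move, as quoted in the commit message.
part1_before = 129
part2_before = 600

# Parametrized tests moved from part2 to part1.
moved = 169 + 72  # test_proxy_utils.py + test_proxy_server.py

part1_after = part1_before + moved
part2_after = part2_before - moved

print(part1_after, part2_after)
```

The results land close to the "~370 vs ~360" split stated in the message.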
yuneng-jiang 968d7a3eca Fix test isolation: save/restore pre_call_rules and post_call_rules
test_post_call_rule_streaming in test_rules.py sets
litellm.post_call_rules but never cleans up. Since
pytest_collection_modifyitems sorts tests by name across modules,
the leaked rule causes failures in test_streaming.py,
test_register_model.py, and test_sagemaker.py.

Add pre_call_rules and post_call_rules to the isolate_litellm_state
fixture's save/restore and clear lists.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 13:28:14 -07:00
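
The save/restore-and-clear pattern that 968d7a3eca extends to `pre_call_rules` and `post_call_rules` might be sketched like this. `fake_litellm` is a stand-in namespace and `isolated_rules` is a hypothetical context manager; the real `isolate_litellm_state` fixture covers more globals and may be structured differently.

```python
import contextlib
import types

# Stand-in for the real `litellm` module so the sketch runs on its own.
fake_litellm = types.SimpleNamespace(pre_call_rules=[], post_call_rules=[])

RULE_ATTRS = ("pre_call_rules", "post_call_rules")


@contextlib.contextmanager
def isolated_rules(module):
    """Save the rule lists, give the test empty ones, then restore on exit."""
    saved = {name: list(getattr(module, name)) for name in RULE_ATTRS}
    for name in RULE_ATTRS:
        setattr(module, name, [])
    try:
        yield module
    finally:
        for name, value in saved.items():
            setattr(module, name, value)


# Simulate a test that installs a rule and never removes it.
with isolated_rules(fake_litellm):
    fake_litellm.post_call_rules = [lambda text: False]

leaked = len(fake_litellm.post_call_rules)
```

Because restoration happens in `finally`, the rule lists are put back even when the test body raises.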
yuneng-jiang 023654d9ad Fix flaky CI tests: mock timeout race, update deprecated model, fix callback leak
- test_hanging_request_azure: mock httpx.AsyncClient.send to simulate slow
  response instead of racing real network latency against a 10ms timeout.
  The old non-existent deployment (gpt-4o-new-test) returned 404 faster
  than the timeout, causing NotFoundError instead of APITimeoutError.
- test_completion_together_ai_llama: update model from deprecated
  Meta-Llama-3.1-8B-Instruct-Turbo to Llama-3.2-3B-Instruct-Turbo
  (Together AI removed the old model from serverless).
- conftest.py: clear litellm.callbacks list before each test to prevent
  proxy hooks (SkillsInjectionHook, VirtualKeyModelMaxBudgetLimiter)
  from leaking across tests via Router initialization.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 23:45:58 -07:00
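
The first item in 023654d9ad replaces a race against real network latency with a deliberately slow mock. The sketch below shows the idea using only the standard library: `slow_send` is a hypothetical stand-in for a patched `httpx.AsyncClient.send`, and `call_with_timeout` is an illustrative wrapper, not litellm's actual timeout handling.

```python
import asyncio


async def slow_send(*args, **kwargs):
    """Stand-in for a mocked transport call that always outlasts the timeout."""
    await asyncio.sleep(1.0)
    return "response"


async def call_with_timeout(send, timeout):
    """Run `send` under a timeout and report which path was taken."""
    try:
        return await asyncio.wait_for(send(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"


# 10 ms timeout against a 1 s mocked response: the timeout path always wins.
outcome = asyncio.run(call_with_timeout(slow_send, timeout=0.01))
```

With the response time under the test's control, the timeout branch is exercised on every run instead of depending on how quickly a remote endpoint happens to answer.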
yuneng-jiang f838bea85b Optimize CI: parallelize router and guardrails test jobs, fix test isolation
- Router testing: add CircleCI parallelism=4 with timing-based test splitting
- Guardrails testing: add pytest-xdist -n 4, suppress DEBUG logs with LITELLM_LOG=WARNING
- Rewrite conftest.py in both test dirs for xdist compatibility (save/restore pattern)
- Fix module-level Router instances in test_router_fallback_handlers, test_router_custom_routing, test_acooldowns_router

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:54:44 -07:00
Ishaan Jaff b9132968b2 [Perf] Improvements for Async Success Handler (Logging Callbacks) - Approx +130 RPS (#13905)
* [Performance] Reduce Significant CPU overhead from litellm_logging.py (#13895)

* fix: litellm.configured_cold_storage_logger

* fix Session Management - Non-OpenAI Models docs

* ruff fix

* test fix

* create LoggingWorker

* add GLOBAL_LOGGING_WORKER for async task handling

* fix logging tests

* add conftest

* fix conftest

* test fix location of encode bedrock runtime modelid arn

* fix conftest.py

* tuning LoggingWorker

* conftest.py

* fix conftest batches/

* test_async_chat_azure

* event_loop

* test_bedrock_streaming_passthrough_test2

* fix GLOBAL_LOGGING_WORKER

* logging worker

* add flush for global logging worker

* Revert "fix GLOBAL_LOGGING_WORKER"

This reverts commit d254f508f48935652f054777652938ad71976cce.

* fix conftest clear_queue

* fix conftest clear_queue

* setup_and_teardown for llm translation

* docs AWS_REGION

* test_async_chat_azure

* change test DIR

* run ci/cd again

* use 1 job for litellm_router_unit_testing

* fix space

* fix litellm_router_unit_testing

* test_aaarouter_dynamic_cooldown_message_retry_time

* litellm_router_unit_testing

* conftest.py clearing qu

* fixes litellm_router_unit_testing

* fixes clear_queue

* fix router_unit_tests

* remove conftest

* add back conftest for router

* fix event loop test

* test fix

* fixes for LoggingWorker

* ruff fix
2025-08-23 13:13:23 -07:00
Krish Dholakia 9f2053e4af ci(conftest.py): reset conftest.py for local_testing/ (#6657)
* ci(conftest.py): reset conftest.py for local_testing/

check if that speeds up testing

* fix: fix import

* fix(conftest.py): fix import to check if hasattr

* fix(conftest.py): ignore proxy reload if doesn't exist
2024-11-08 19:14:16 +05:30
Krish Dholakia 27e18358ab fix(pattern_match_deployments.py): default to user input if unable to… (#6632)
* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards

* test: fix test

* test: reset test name

* test: update conftest to reload proxy server module between tests

* ci(config.yml): move langfuse out of local_testing

reduce ci/cd time

* ci(config.yml): cleanup langfuse ci/cd tests

* fix: update test to not use global proxy_server app module

* ci: move caching to a separate test pipeline

speed up ci pipeline

* test: update conftest to check if proxy_server attr exists before reloading

* build(conftest.py): don't block on inability to reload proxy_server

* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well

* fix(encrypt_decrypt_utils.py): use function to get salt key

* test: mark flaky test

* test: handle anthropic overloaded errors

* refactor: create separate ci/cd pipeline for proxy unit tests

make ci/cd faster

* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs

* ci(config.yml): generate prisma binaries for proxy unit tests

* test: readd vertex_key.json

* ci(config.yml): remove `-s` from proxy_unit_test cmd

speed up test

* ci: remove any 'debug' logging flag

speed up ci pipeline

* test: fix test

* test(test_braintrust.py): rerun

* test: add delay for braintrust test
2024-11-08 00:55:57 +05:30
Krrish Dholakia 3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00