Commit Graph

27 Commits

Author SHA1 Message Date
Jugal D. Bhatt aea0605eed [LLM Translation] Fix Realtime API endpoint for no intent (#13476)
* fix intent params

* Add responses

* fix unrelated test

* test fix - fireworks API endpoint is down

* test fix fireworks ai is having an active outage

* test_completion_cost_databricks

* dbrx fix test API currently not responding

* Update OpenAI Realtime handler to use the correct endpoint and include all query parameters. Adjusted error messages for missing API base and key. Updated health check URL construction to pass model as a query parameter.

* Enhance OpenAI Realtime handler tests to ensure model parameter inclusion in WebSocket URL. Added new tests to verify correct URL construction with model and additional parameters, preventing 'missing_model' errors. Updated existing tests for consistency.

* Remove debug print statements for API base and key in OpenAIRealtime handler to clean up the code.

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2025-08-14 16:24:14 -07:00
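Note on aea0605eed: the message says the Realtime handler now includes all query parameters in the WebSocket URL, with `model` among them, so that 'missing_model' errors are avoided. A minimal sketch of that URL construction follows. The helper name `build_realtime_url`, the `/v1/realtime` path, and the http-to-ws scheme mapping are assumptions for illustration, not the handler's actual code.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_realtime_url(api_base: str, model: str, **extra_params: str) -> str:
    """Hypothetical helper: build a WebSocket URL carrying `model` as a query parameter."""
    if not api_base:
        raise ValueError("api_base is required")
    parts = urlsplit(api_base)
    # Map http(s) schemes onto their WebSocket equivalents; leave ws/wss untouched.
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme, parts.scheme)
    # Assumed path: the commit message only says "the correct endpoint".
    path = parts.path.rstrip("/") + "/v1/realtime"
    # `model` always goes first, followed by any additional query parameters.
    query = urlencode({"model": model, **extra_params})
    return urlunsplit((scheme, parts.netloc, path, query, ""))


if __name__ == "__main__":
    print(build_realtime_url("https://api.example.com", "example-model"))
```

Building the query with `urlencode` rather than string concatenation keeps every parameter escaped and makes it hard to drop `model` by accident.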
Ishaan Jaff eeed03a78f test fix: gcp deprecated gemini-1.5-flash 2025-08-06 08:43:45 -07:00
Ishaan Jaff 59f3771799 test_text_completion_stream - hf 2025-07-03 16:00:51 -07:00
Ishaan Jaff 5630147e80 Revert "Revert "fix tests (#12286)""
This reverts commit 12f157513b.
2025-07-03 12:08:27 -07:00
Ishaan Jaff 12f157513b Revert "fix tests (#12286)"
This reverts commit 99ce3a24cc.
2025-07-03 12:04:23 -07:00
célina 99ce3a24cc fix tests (#12286) 2025-07-03 10:57:19 -07:00
Ishaan Jaff 355e6118d8 def test_text_completion_stream(): 2025-06-14 16:46:09 -07:00
Krish Dholakia 711601e22a Add key-level multi-instance tpm/rpm/max parallel request limiting (#10458)
* fix: initial commit of v2 parallel request limiter hook

enables multi-instance rate limiting to work

* fix: subsequent commit with additional refactors

* fix(parallel_request_limiter_v2.py): cleanup initial call hook

simplify it

* fix(parallel_request_limiter_v2.py): working v2 parallel request limiter

* fix: more updates - still not passing testing

* fix(test_parallel_request_limiter_v2.py): update test + add conftest

* fix: fix ruff checks

* fix(parallel_request_limiter_v2.py): use the pull-via-pattern method to load keys the instance hasn't seen yet

Fixes an issue where Redis syncing did not pull a key until the instance had seen it

* test: update testing to cover tpm and rpm

* fix(parallel_request_limiter_v2.py): fix ruff errors

* fix(proxy/hooks/__init__.py): feature flag export

* fix(proxy/hooks/__init__.py): fix linting error

* ci(config.yml): add tests/enterprise to ci/cd

* fix: fix ruff check

* test: update testing
2025-04-30 21:32:31 -07:00
Krish Dholakia 2508ca71cb Handle fireworks ai tool calling response (#10130)
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly

Fixes https://github.com/BerriAI/litellm/issues/7209

* fix(utils.py): handle none type in message

* fix: fix model name in test

* fix(utils.py): fix validate check for openai messages

* fix: fix model returned

* fix(main.py): fix text completion routing

* test: update testing

* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Ishaan Jaff 198922b26f test fixes for vertex mistral, this model was deprecated on vertex 2025-04-16 20:51:45 -07:00
Ishaan Jaff c38146e180 test fix 2025-04-16 20:13:31 -07:00
Ishaan Jaff cf801f9642 test fix vertex_ai/codestral 2025-04-16 20:01:36 -07:00
Krrish Dholakia 22faf7d232 fix(ollama/completions/transformation.py): pass prompt untemplated on /completions request
Fixes https://github.com/BerriAI/litellm/issues/6900
2025-03-17 18:35:44 -07:00
Ishaan Jaff 970e9c7507 huggingface/mistralai/Mistral-7B-Instruct-v0.3 2025-01-13 18:42:36 -08:00
Krrish Dholakia aa7f416b7f test: update hf test to check if client closed 2024-12-12 11:34:50 -08:00
Krish Dholakia 350cfc36f7 Litellm merge pr (#7161)
* build: merge branch

* test: fix openai naming

* fix(main.py): fix openai renaming

* style: ignore function length for config factory

* fix(sagemaker/): fix routing logic

* fix: fix imports

* fix: fix override
2024-12-10 22:49:26 -08:00
Ishaan Jaff 128eeb4997 handle vertex ServiceUnavailableError for codestral 2024-11-17 18:45:58 -08:00
Ishaan Jaff e1ca95672a vertex_ai/codestral@2405 is very unstable - handle their instability in our tests 2024-11-17 18:17:14 -08:00
Ishaan Jaff 585b54e70c handle codestral@2405 instability 2024-11-17 17:55:19 -08:00
Ishaan Jaff 401531a8c9 fix test_completion_codestral_fim_api_stream 2024-11-16 20:02:27 -08:00
Krrish Dholakia ca09f4afec test: cleanup codestral tests - backend api unavailable 2024-10-23 22:19:57 -07:00
Ishaan Jaff 182adec7d0 def test_text_completion_with_echo(stream): (#6401)
test
2024-10-23 23:27:19 +05:30
Krish Dholakia 6729c9ca7f LiteLLM Minor Fixes & Improvements (10/07/2024) (#6101)
* fix(utils.py): support dropping temperature param for azure o1 models

* fix(main.py): handle azure o1 streaming requests

o1 doesn't support streaming, fake it to ensure code works as expected

* feat(utils.py): expose `hosted_vllm/` endpoint, with tool handling for vllm

Fixes https://github.com/BerriAI/litellm/issues/6088

* refactor(internal_user_endpoints.py): cleanup unused params + update docstring

Closes https://github.com/BerriAI/litellm/issues/6100

* fix(main.py): expose custom image generation api support

Fixes https://github.com/BerriAI/litellm/issues/6097

* fix: fix linting errors

* docs(custom_llm_server.md): add docs on custom api for image gen calls

* fix(types/utils.py): handle dict type

* fix(types/utils.py): fix linting errors
2024-10-07 22:17:22 -07:00
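Note on 6729c9ca7f: the message says streaming is faked for azure o1 because o1 does not support it. One way to fake a stream is to fetch the complete response first and then yield it in pieces, so that callers written against a streaming interface keep working. The sketch below assumes plain-text chunks and a hypothetical `fake_stream` helper; the actual chunk objects used by the library are not shown in this log.

```python
from typing import Iterator


def fake_stream(full_text: str, chunk_size: int = 16) -> Iterator[str]:
    """Hypothetical helper: yield an already-complete response in fixed-size pieces."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(full_text), chunk_size):
        # Each slice stands in for one streamed chunk.
        yield full_text[start:start + chunk_size]


if __name__ == "__main__":
    for piece in fake_stream("a complete, non-streamed response", chunk_size=8):
        print(repr(piece))
```

Joining the yielded pieces reproduces the original text, so a consumer sees the same content it would have received from a non-streaming call.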
Krish Dholakia 14165d3648 LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023)
* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks path

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
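Note on 14165d3648: the message mentions handling "list of int / list of list of int" prompts for text completion calls. A minimal sketch of telling those prompt shapes apart follows. The function name `classify_prompt` and the returned labels are hypothetical, and the real handling in the library may differ.

```python
from typing import List, Union

# Prompt shapes named in the commit message, plus plain text and batches of text.
Prompt = Union[str, List[str], List[int], List[List[int]]]


def classify_prompt(prompt: Prompt) -> str:
    """Hypothetical helper: label the shape of a text completion prompt."""
    if isinstance(prompt, str):
        return "text"
    if isinstance(prompt, list) and prompt:
        if all(isinstance(item, str) for item in prompt):
            return "text_batch"
        if all(isinstance(item, int) for item in prompt):
            return "tokens"  # a single pre-tokenized prompt
        if all(
            isinstance(item, list) and all(isinstance(tok, int) for tok in item)
            for item in prompt
        ):
            return "token_batch"  # several pre-tokenized prompts
    raise TypeError("unsupported prompt type")


if __name__ == "__main__":
    print(classify_prompt([1, 2, 3]))
    print(classify_prompt([[1, 2], [3]]))
```

Classifying the shape up front lets the caller decide once whether the prompt needs decoding, batching, or neither.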
Ishaan Jaff 045ecf3ffb (feat proxy slack alerting) - allow opting in to getting key / internal user alerts (#5990)
* define all slack alert types

* use correct type hints for alert type

* use correct defaults on slack alerting

* add readme for slack alerting

* fix linting error

* update readme

* docs all alert types

* update slack alerting docs

* fix slack alerting docs

* handle new testing dir structure

* fix config for testing

* fix testing folder related imports

* fix /tests import errors

* fix import stream_chunk_testdata

* docs alert types

* fix test test_langfuse_trace_id

* fix type checks for slack alerting

* fix outage alerting test slack
2024-10-01 10:49:22 -07:00
Krrish Dholakia 5ad01e59f6 refactor: fix imports 2024-09-28 21:08:14 -07:00
Krrish Dholakia 3560f0ef2c refactor: move all testing to top-level of repo
Closes https://github.com/BerriAI/litellm/issues/486
2024-09-28 21:08:14 -07:00