litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-06-18 05:28:02 +00:00

Author	SHA1	Message	Date
yuneng-jiang	7c667b8797	fix(helm): drop main- prefix from default image tag (#28710 ) * fix(helm): drop main- prefix from default image tag The default image tag in the deployment + migrations-job templates was `main-{{ .Chart.AppVersion }}`. The current release pipeline publishes content tags without the `main-` prefix (e.g. `v1.85.1` / `1.85.1`, `v1.86.0-rc.1` / `1.86.0-rc.1`), so the rendered ref points at a tag that does not exist on GHCR or DockerHub and installs fail with ImagePullBackOff. - templates/deployment.yaml, templates/migrations-job.yaml: render `.Chart.AppVersion` directly instead of `main-<AppVersion>`. - Chart.yaml: bump stale `appVersion: v1.80.12` (not on either registry) to `v1.85.1` so local-checkout installs also resolve. - values.yaml: update the commented tag-override hint to match. * fix(helm): use :latest in tag override example, not pinned version Per review: ghcr.io/berriai/litellm-database:latest is a floating alias for the most recent stable (same digest as :main-stable), maintained by the release pipeline's UPDATE_LATEST advance step. Better example than a pinned version that goes stale.	2026-05-23 15:57:38 -07:00
Sameer Kankute	36c494fdd2	Litellm oss staging (#28161 ) * fix(opentelemetry): JSON-serialize dict metadata fields for OTEL span attributes (#27451) (#27455) Squash-merged by litellm-agent from Anai-Guo's PR. * feat(dashscope): add embeddings and reranks(qwen3-rerank) support via OpenAI-compatible endpoint (#27508) Squash-merged by litellm-agent from yimao's PR. * fix(vertex_ai/gemini): raise BadRequestError when image_url or url fi… (#24550) Squash-merged by litellm-agent from krisxia0506's PR. * fix(vertex_ai): raise error on mid-stream 429/error chunks instead of silently swallowing (#23711) Squash-merged by litellm-agent from krisxia0506's PR. * fix: raise BadRequestError for file content blocks missing 'file' sub… (#24503) Squash-merged by litellm-agent from krisxia0506's PR. * Fix Gemini MIME detection for extensionless GCS URIs (#27278) Squash-merged by litellm-agent from krisxia0506's PR. * fix(vertex_ai/partner_models): drop unused vertexai SDK gate from count_tokens (closes #28084) (#28107) Squash-merged by litellm-agent from voidborne-d's PR. * feat(chart): add support for autoscaling behavior in HPA (#27990) Squash-merged by litellm-agent from FabrizioCafolla's PR. * feat(proxy): add blocked flag to models for pause/resume from the UI (#27927) Squash-merged by litellm-agent from Cyberfilo's PR. * fix: pass socket timeouts to Redis cluster clients (#27920) Squash-merged by litellm-agent from tomdee's PR. * Fix/cache token (#28009) Squash-merged by litellm-agent from escon1004's PR. * fix(deepseek): forward reasoning_content in multi-turn thinking mode conversations (#28080) Squash-merged by litellm-agent from Divyansh8321's PR. 
* fix(guardrails): return HTTP 400 instead of 500 for blocked requests (#27617) * fix: reset org and tag budgets (#27326) * reset org budgets * reset tag budgets --------- Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain> * fix(ui): omit allowed_routes from key edit save when unchanged (#27553) * fix(ui): omit allowed_routes from key edit save when unchanged When a team admin opens Edit Settings on a key with key_type=AI APIs and saves without changing anything, the UI re-sends the existing allowed_routes value, which the backend's _check_allowed_routes_caller_permission gate rejects for non-proxy-admins (LIT-2681). Strip allowed_routes from the patch in handleSubmit when it deep-equals the original keyData.allowed_routes. The backend treats absence as "leave alone," so no-op saves now succeed for non-admins. Admins explicitly editing the field still send the new value. * fix(ui): order-insensitive allowed_routes diff + cover null-original case Address Greptile review: - Switch the "is allowed_routes unchanged" check to a Set-based comparison so a server-side reorder of the array doesn't register as a user edit and re-trigger LIT-2681. - Add two regression tests: (1) keyData.allowed_routes is null and the form is untouched — patch should strip the field; (2) server returned routes in a different order than the user originally entered — patch should still recognize the value as unchanged. * chore(ui): strip ticket refs and tighten comments in key edit fix - Remove internal-tracker references from in-code comments - Tighten the WHY comment in handleSubmit to two lines - Drop redundant test-block comments — test names already describe the case * fix(ui): annotate Set<string> generic in allowed_routes diff to fix tsc * fix(guardrails): return HTTP 400 instead of 500 for guardrail-blocked requests GuardrailRaisedException and BlockedPiiEntityError both lacked a status_code attribute. 
When these exceptions reached the proxy exception handler (getattr(e, 'status_code', 500)), the fallback defaulted to HTTP 500 — making intentional guardrail blocks indistinguishable from server errors and causing unnecessary client retries. Changes: - Add status_code=400 (keyword-only) to GuardrailRaisedException - Add status_code=400 (keyword-only) to BlockedPiiEntityError - Update _is_guardrail_intervention() to recognize both exceptions so downstream loggers record 'guardrail_intervened' instead of 'guardrail_failed_to_respond' - Add 6 unit tests for default/custom status codes and getattr pattern - Strengthen existing blocked-action test with status_code assertion Fixes #24348 --------- Co-authored-by: Michael-RZ-Berri <michael@berri.ai> Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain> Co-authored-by: ryan-crabbe-berri <ryan@berri.ai> Co-authored-by: Krrish Dholakia <krrish+github@berri.ai> * fix(router/proxy): address Greptile P1+P2 review comments on PR #28161 - router: raise ServiceUnavailableError (503) instead of RouterRateLimitErrorBasic (429) when a specifically-addressed deployment is administratively blocked; 429 misleads retry-enabled clients into spinning forever against a paused model - proxy_server: compute get_fully_blocked_model_names() once before both branches in model_list() instead of duplicating the call in each branch - deepseek: upgrade silent debug log to warning when injecting placeholder reasoning_content so callers are clearly notified of degraded multi-turn quality - tests: update two blocked-deployment assertions to expect ServiceUnavailableError Co-authored-by: Cursor <cursoragent@cursor.com> * fix: address bug detection findings (cache token order, mutable defaults) Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix: address bugs in async pass-through, anthropic cache token detection, rerank tests - async_get_available_deployment_for_pass_through: enforce blocked check on specific deployments - cost_calculator: 
detect anthropic-style usage by attribute presence (not truthiness) to avoid mixing OpenAI cached_tokens into anthropic normalization when read=0 - dashscope rerank tests: pass request to httpx.Response constructions for consistency Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix code qa * fix(vertex_ai/gemini): strip MIME parameters from GCS contentType GCS object metadata's contentType field can include parameters such as 'text/html; charset=utf-8'. Strip them in _apply_gemini_mime_type_aliases so downstream get_file_extension_from_mime_type sees a bare MIME type. Co-authored-by: Yassin Kortam <yassin@berri.ai> * fix(vertex_ai/gemini): clarify mime-type error message string concatenation Co-authored-by: Yassin Kortam <yassin@berri.ai> --------- Co-authored-by: Tai An <antai12232931@outlook.com> Co-authored-by: Vincent <yimao1231@gmail.com> Co-authored-by: Kris Xia <xiajiayi0506@gmail.com> Co-authored-by: d 🔹 <liusway405@gmail.com> Co-authored-by: Fabrizio Cafolla <developer@fabriziocafolla.com> Co-authored-by: Filippo Menghi <113345637+Cyberfilo@users.noreply.github.com> Co-authored-by: Tom Denham <tom@tomdee.co.uk> Co-authored-by: escon1004 <70471150+escon1004@users.noreply.github.com> Co-authored-by: Divyansh Singhal <97736786+Divyansh8321@users.noreply.github.com> Co-authored-by: robin-fiddler <robin@fiddler.ai> Co-authored-by: Michael-RZ-Berri <michael@berri.ai> Co-authored-by: Michael Riad Zaky <michaelr@Mac.localdomain> Co-authored-by: ryan-crabbe-berri <ryan@berri.ai> Co-authored-by: Krrish Dholakia <krrish+github@berri.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: Yassin Kortam <yassin@berri.ai>	2026-05-18 16:27:44 -07:00
Yassin Kortam	fa5eae8bc9	chore: remove legacy deployment artifacts and litellm-js packages (#27541 ) - Remove litellm-js/proxy and litellm-js/spend-logs TypeScript packages that provided Cloudflare Worker proxy and Node.js spend logging services, as these are no longer maintained - Remove deprecated Docker variants (Dockerfile.alpine, Dockerfile.dev, Dockerfile.custom_ui, Dockerfile.health_check, Dockerfile.ghcr_base) that have been superseded by the primary Dockerfile - Remove legacy Kubernetes manifests (kub.yaml, service.yaml) from deploy/kubernetes in favor of the Helm chart - Remove stale index.yaml Helm chart index pinned to an old version (v1.43.18) - Remove dev_config.yaml development configuration file that contained hardcoded credentials and example endpoints - Clean up ~3,500 lines of unused code and configuration to reduce repository maintenance burden Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>	2026-05-09 20:51:34 +00:00
Yassin Kortam	b5d3a5fc85	feat: add read-replica routing for Prisma DB via DATABASE_URL_READ_REPLICA (#27493 ) - Introduce RoutingPrismaWrapper that transparently routes read operations (find_*, count, group_by, query_raw, query_first) to a reader endpoint while writes remain on the writer, enabling Aurora-style reader/writer endpoint splits - Add IAMEndpoint dataclass and parse_iam_endpoint_from_url() to capture static connection fields from a reader URL so only the IAM token needs to rotate, avoiding the need for separate DATABASE_HOST_READ_REPLICA/etc. env vars - Enhance PrismaWrapper with per-instance knobs (db_url_env_var, iam_endpoint, recreate_uses_datasource, log_prefix) so writer and reader wrappers are independent: the reader writes its fresh URL to DATABASE_URL_READ_REPLICA and passes datasource override to Prisma since Prisma only auto-reads DATABASE_URL - Fix deadlock in PrismaWrapper.__getattr__: when called from inside a running event loop, schedule the token refresh as a background task instead of blocking with run_coroutine_threadsafe + future.result(), which would deadlock the loop thread waiting for a coroutine that needs the loop to run - Fix botocore crash when DATABASE_PORT is unset by defaulting to "5432" in both proxy_cli.py and PrismaWrapper.get_rds_iam_token(); passing None caused botocore to embed the literal string "None" in the presigned URL - Implement graceful reader degradation: reader connect/recreate failures are non-fatal; wrapper sets _reader_unavailable=True and silently routes reads to the writer to keep the proxy serving traffic during transient reader outages - Add PrismaClient.writer_db property so the reconnect smoke-test always validates the writer engine specifically; query_raw on the routing wrapper would route to the reader and not verify the newly-recreated writer - Expose DATABASE_URL_READ_REPLICA in Helm chart (values.yaml + deployment.yaml) via both plain value and secret key reference, and document the field in 
docker-compose.yml - Add 887-line test suite covering routing logic, IAM token refresh paths, reader degradation scenarios, datasource override behavior, and the deadlock regression Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>	2026-05-08 21:05:50 -07:00
Yassin Kortam	451ce161fc	fix: remove separate health app	2026-05-07 16:04:56 -07:00
Yassin Kortam	dbc8f5a937	helm: skip proxy startup prisma db push when migrations Job is enabled (#27200) Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>	2026-05-05 16:58:53 -07:00
Yassin Kortam	618df94433	helm: increase default probe timeouts, disable debug logging by default (#27237) Co-authored-by: Yassin Kortam <yassinkortam@g.ucla.edu>	2026-05-05 16:58:34 -07:00
CHANGE	87d7e86479	feat(helm): add tpl support to extraContainers and extraInitContainers Wrap toYaml with tpl in deployment and migration job templates so users can reference Helm values (e.g. {{ .Values.image.repository }}) inside extraContainers and extraInitContainers definitions.	2026-04-10 09:41:33 -04:00
Yuneng Jiang	5f63873dca	[Infra] Pin all Docker build dependencies to exact versions Pin every dependency across all Docker builds so upgrades are intentional. Verified by building all 3 production images and diffing pip freeze against known-good v1.83.0-nightly baselines — zero version drift. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 00:05:39 -07:00
Chesars	1be6b31e2f	merge: resolve conflicts between main and litellm_oss_staging_03_11_2026	2026-03-12 09:38:31 -03:00
RJ Duffner	0c95d415e1	Add Ability To Set minReadySeconds From values Files (#23173) * Add Ability To Set minReadySeconds From values Files * typo * uppercase Min as it comes after deployment * Don't use defaults, just omit	2026-03-11 23:29:15 +05:30
Harshit28j	3127d79da8	feat: add strategy to deployment for helmchart	2026-03-10 05:49:46 +05:30
Sean Marsh Glover	4652c73259	feat(proxy): limit concurrent health checks with health_check_concurrency (#20584) * staged first pass * black * Update litellm/proxy/health_check.py Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> * simpler * restore cached logo * fix tests for perform_health_check max_concurrency arg * implement pr suggestion * and the helm chart * add configurable resources and probes to the deployment in the helm chart * more helm chart unittests * move some background healthcheck logging to debug --------- Co-authored-by: Sean Glover <sglover@athenahealth.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>	2026-02-24 08:16:59 -08:00
Cesar Garcia	622983cf89	fix(helm): add OCI annotations so GHCR shows helm pull instead of docker pull (#20617 ) The Helm chart on GHCR displays a `docker pull` command instead of the correct `helm pull oci://` command. This is because the OCI artifact is missing the `org.opencontainers.image.source` annotation that GHCR uses to identify and properly display Helm charts. Changes: - Add OCI annotations to Chart.yaml (source + url) which Helm 3.10+ propagates to the OCI manifest on push - Install explicit Helm v3.20.0 via azure/setup-helm@v4 for reproducible builds and proper OCI annotation support - Remove deprecated HELM_EXPERIMENTAL_OCI env var (OCI is GA since Helm 3.8)	2026-02-12 19:58:16 +05:30
Pragya Sardana	b4a27712a1	Add Init Containers in the community helm chart (#19816)	2026-01-27 18:10:47 -08:00
Harshit Jain	9084c1d1bd	feat(helm): Enable PreStop hook configuration in values.yaml (#19613)	2026-01-22 19:28:52 -08:00
R.Sicart	608979c7e9	feat: add support for keda in helm chart (#19337) * feat: add support for keda in helm chart Signed-off-by: R.Sicart <roger.sicart@gmail.com> * chore: bump chart version --------- Signed-off-by: R.Sicart <roger.sicart@gmail.com>	2026-01-19 10:38:41 -08:00
Harshit Jain	3ad8fa5422	fix: mount config.yaml as single file in Helm chart (#19146)	2026-01-15 21:21:13 +05:30
Cesar Garcia	46dd420833	fix: sync Helm chart versioning with production standards and Docker versions (#18868 ) * fix: sync Helm chart versioning with production standards and Docker versions - Update Chart.yaml version from 0.4.10 to 1.0.0 (SemVer 0.x is for development, 1.0+ for production) - Update appVersion from v1.50.2 to v1.80.12 to match current Docker image version - Update workflow defaults from 0.1.0 to 1.0.0 for new chart version scheme - Maintain independent chart versioning per Helm best practices This ensures: - Helm chart follows SemVer production standards (1.x instead of 0.x) - appVersion stays synchronized with Docker/application version - Chart version remains independent for flexibility (can update chart without waiting for app releases) * fix: sync Helm chart appVersion with Docker image tags in release workflow Updates the GitHub workflow to ensure Helm chart appVersion matches the Docker image tags that are actually published: - For stable/rc releases: Uses the workflow input tag (e.g., v1.80.12) - For latest/dev releases: Uses the release_type to match main-{type} tags - Makes 'tag' input required to prevent accidental releases with wrong versions - Simplifies fallback logic by removing git-describe dependency This ensures the chart's appVersion correctly references Docker images that exist, preventing deployment failures from missing image tags. * Update ghcr_deploy.yml	2026-01-12 17:04:59 +05:30
Alexsander Hamir	1544e8f971	feat: Add line_profiler support for performance analysis and fix Windows CRLF issues in Docker builds (#18773)	2026-01-07 11:36:57 -08:00
Mehmet Can Şakiroğlu	a3503e59c2	Litellm feat helm lifecycle support (#18517) * feat(helm): add lifecycle hook support for helm * add tests	2026-01-04 00:22:50 +05:30
Krrish Dholakia	7c2478b70e	docs: replace ghcr link with docker.litellm.ai	2025-12-16 08:35:45 +05:30
expruc	2d112fc8b2	add option to include additional resources to chart (#17627)	2025-12-07 23:25:57 -08:00
Lukas de Boer	3b8a6ec888	Helm Chart: Add possibility to override command, args and add deployment labels (#17535) * Helm Chart: Add possibility to override command, args and also add deployment labels * Helm Chart: Fix helm lint issue * Helm Chart: Fix helm unit tests	2025-12-06 14:01:09 -08:00
Fabian Reinold	c173a4a275	Helm Chart: add ingress-only labels (#17348) * feat(helm): add ingress-only labels * feat(helm): add ingress configuration tests * chore(helm): bump chart version	2025-12-02 22:30:54 -08:00
Saar wintrov	777ef628d2	Enhancement(helm): ServiceMonitor template rendering (#17038) * Metadata: fix 401 when audio/transcriptions * check if str, CR fixes * Added new helmchart functionality * . * . * adding new tests	2025-11-24 20:53:02 -08:00
tushar8408	5f94b372f8	Migration job labels (#16831) * Add dynamic pod labels and annotations to migrations job * Bump chart version to 0.4.8	2025-11-19 09:53:21 -08:00
YutaSaito	645f84c02e	fix: add imagePullSecrets to migrations-job (#15681)	2025-10-18 13:56:31 -07:00
Krish Dholakia	cf3c18a420	Merge pull request #13855 from edify42/allow-no-db-url feat(helm): Allow no DATABASE_URL to be set on migration job to keep the behaviour same as deployment	2025-09-06 22:02:01 -07:00
Abhinav	b6c26c3365	helm(chart): add optional PodDisruptionBudget for litellm proxy (#14062) (#14093)	2025-09-01 12:21:44 -07:00
Const-antine	f8d1e03450	rework tests	2025-08-28 13:39:09 -04:00
Const-antine	1350336515	fix tests	2025-08-28 13:30:11 -04:00
Const-antine	d3b526041f	better formatting	2025-08-28 13:18:36 -04:00
Const-antine	730e9c90a2	fix formatting	2025-08-28 13:18:33 -04:00
Const-antine	5d973ea06e	update readme	2025-08-28 13:18:26 -04:00
Const-antine	409429ddd6	add new tests	2025-08-28 13:18:23 -04:00
Const-antine	ff4040bbe1	add functionality to mount existing configmap if needed	2025-08-28 13:18:05 -04:00
Jugal D. Bhatt	d63f5f99e9	Enhance database configuration: add support for optional endpointKey in values.yaml and update deployment/migrations job templates to conditionally source DATABASE_HOST from the secret if endpointKey is set. (#13763)	2025-08-21 14:58:50 -07:00
Ishaan Jaff	f498cf4901	Fix - Ensure Helm chart auto generated master keys follow sk-xxxx format (#13871) * docs - master key * fix - auto generate sk-xxx prefixed key * test master key fix * fix master key gen	2025-08-21 14:34:21 -07:00
Ed Kim	c88a13c58b	add unit test which confirms the removal of DATABASE_URL Signed-off-by: Ed Kim <edward.kim@lendi.com.au>	2025-08-21 21:08:18 +10:00
edward kim	418b70b38e	fixes Signed-off-by: edward kim <edward.kim@lendi.com.au>	2025-08-21 17:44:54 +10:00
edward kim	2bd3daa742	fixes the mounting of this only when deployStandalone is true Signed-off-by: edward kim <edward.kim@lendi.com.au>	2025-08-21 17:39:31 +10:00
Mattias Andersson	89f71af4cd	Add possibility to configure resources for migrations-job in Helm chart	2025-08-14 17:08:26 +02:00
unique-jakub	f58807ff6e	Add labels to migrations job template (#13343) * set labels on the migration job * update comment to retrigger the pipeline	2025-08-07 09:41:24 -07:00
Jugal D. Bhatt	7cf3b4682a	[Separate Health App] Update Helm Deployment.yaml (#13162) * add helm deployment fix * clean deployment	2025-08-01 16:50:23 -07:00
unique-jakub	3edb71e617	allow helm hooks for migrations job (#13174)	2025-07-31 21:51:07 -07:00
Marvin Huetter	d23a6e3ea4	fix: best practices suggest this to set to true (#12809) The order of the specification is important here, k8s will take the last value as truth. Push down to be sure schema update is done by migration job	2025-07-29 15:40:12 -07:00
Anton	f05ec34e11	feat: Add envVars and extraEnvVars support to Helm migrations job (#12591) - Add support for envVars (simple key-value pairs) in migrations job - Add support for extraEnvVars (complex environment variable configurations) - Include comprehensive test coverage for both envVars and extraEnvVars - Ensure backward compatibility with existing configurations - Tests verify proper rendering of environment variables in container spec	2025-07-14 22:24:13 -07:00
Victor Krylov	1d58fc5429	Add deployment annotations (#11849) * Add deployment annotations * Correct the indent and simplify if 0 annotations	2025-06-19 20:11:31 -07:00
Steven Aldinger	b8bdf98a4b	feat(helm): [BerriAI/litellm#11648] support extraContainers in migrations-job.yaml (#11649)	2025-06-11 23:16:06 -07:00

130 Commits