tiennm99/litellm

mirror of https://github.com/tiennm99/litellm.git synced 2026-07-04 17:08:48 +00:00

Ishaan Jaff 5fb8675cf8 Update issue templates

2023-08-26 16:34:47 -07:00

File | Last commit | Date
.circleci | add litellm caching | 2023-08-26 16:08:48 -07:00
.github/ISSUE_TEMPLATE | Update issue templates | 2023-08-26 16:34:47 -07:00
cookbook | edit promptlayer cookbook | 2023-08-26 13:37:08 -07:00
dist | update testing for streaming to catch empty responses | 2023-08-26 11:20:06 -07:00
docs | gpt cache docs | 2023-08-26 16:30:38 -07:00
litellm | move caching | 2023-08-26 16:16:58 -07:00
proxy-server | add missing references | 2023-08-21 21:38:44 +02:00
.DS_Store | add togetherai tutorial to docs | 2023-08-15 21:23:55 -07:00
.env.example | Added Infisical token to .env.example | 2023-08-11 13:37:44 +03:00
.gitignore | Updated the favicon | 2023-08-22 14:50:17 +03:00
.readthedocs.yaml | Update .readthedocs.yaml | 2023-07-29 12:54:38 -07:00
LICENSE | Initial commit | 2023-07-26 17:09:52 -07:00
mkdocs.yml | add missing references | 2023-08-21 21:38:44 +02:00
poetry.lock | update testing for streaming to catch empty responses | 2023-08-26 11:20:06 -07:00
pyproject.toml | bump version | 2023-08-26 12:02:39 -07:00
README.md | Update README.md | 2023-08-26 14:04:51 -07:00

README.md

🚅 LiteLLM

Call all LLM APIs using the OpenAI format [Anthropic, Huggingface, Cohere, Azure OpenAI etc.]

100+ Supported Models | Docs | Demo Website

LiteLLM manages:

Translating inputs to the provider's completion and embedding endpoints
Guaranteeing consistent output: text responses are always available at ['choices'][0]['message']['content']
Exception mapping: common exceptions across providers are mapped to the OpenAI exception types
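
As a minimal sketch of the consistent-output guarantee above, the helper below reads the text from a response at ['choices'][0]['message']['content']. The function name `extract_text` and the hand-written `sample_response` dict are illustrative assumptions, not part of LiteLLM. They only mimic the response shape described in this README, so the snippet runs without any API key.

```python
from typing import Any, Mapping


def extract_text(response: Mapping[str, Any]) -> str:
    """Return the text found at ['choices'][0]['message']['content']."""
    return response["choices"][0]["message"]["content"]


# Hypothetical response written by hand for illustration (not real model output).
sample_response = {
    "choices": [
        {"message": {"role": "assistant", "content": "I'm doing well, thanks!"}}
    ]
}

print(extract_text(sample_response))
```

Because the path is the same for every provider, the same helper should work on the result of any `completion(...)` call shown in the Usage section.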

Usage

pip install litellm

import os

from litellm import completion

# set ENV variables (replace the placeholder strings with real keys)
os.environ["OPENAI_API_KEY"] = "openai key"
os.environ["COHERE_API_KEY"] = "cohere key"
os.environ["ANTHROPIC_API_KEY"] = "anthropic key"

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion(model="command-nightly", messages=messages)

# anthropic call
response = completion(model="claude-2", messages=messages)
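
Since each call above depends on a provider-specific environment variable, it can help to check for the key before calling. The sketch below is one possible way to do that. The `REQUIRED_KEY` table and the `missing_key` helper are hypothetical names introduced here, and the model-to-variable pairs are taken only from the three examples above.

```python
import os
from typing import Mapping, Optional

# Model-to-variable pairs copied from the usage example above; this table is
# an illustrative assumption and is not provided by LiteLLM.
REQUIRED_KEY = {
    "gpt-3.5-turbo": "OPENAI_API_KEY",
    "command-nightly": "COHERE_API_KEY",
    "claude-2": "ANTHROPIC_API_KEY",
}


def missing_key(model: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the unset variable name for `model`, or None if nothing is missing.

    Models that are not in REQUIRED_KEY are treated as having no requirement.
    """
    name = REQUIRED_KEY.get(model)
    if name is not None and not environ.get(name):
        return name
    return None


# Example with an explicit mapping instead of the real environment.
print(missing_key("claude-2", {"OPENAI_API_KEY": "openai key"}))
```

Passing the environment as a parameter keeps the check easy to test without touching real credentials.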

Stable version

pip install litellm==0.1.424

LiteLLM Client - debugging & 1-click add new LLMs

Debugging Dashboard 👉 https://docs.litellm.ai/docs/debugging/hosted_debugging

Streaming

LiteLLM supports streaming the model response back. Pass stream=True to get a streaming iterator in the response. Streaming is supported for OpenAI, Azure, Anthropic, and Huggingface models.

response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion(model="claude-2", messages=messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
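
The loops above print each delta as it arrives. To rebuild the full message from a stream, the deltas can be concatenated, as in the sketch below. It assumes each delta is dict-like with an optional "content" key, as in the OpenAI streaming format. The `collect_stream` helper and the `fake_chunks` list are hypothetical and stand in for a real streaming iterator so the snippet runs offline.

```python
from typing import Any, Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping[str, Any]]) -> str:
    """Join the text pieces found at chunk['choices'][0]['delta']['content']."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        text = delta.get("content")  # assumed optional: some deltas carry no text
        if text:
            parts.append(text)
    return "".join(parts)


# Hand-written chunks for illustration only (not real model output).
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},
]

print(collect_stream(fake_chunks))  # prints "Hello, world"
```

With a real call, the iterator returned by `completion(..., stream=True)` would take the place of `fake_chunks`, provided its deltas follow the shape assumed here.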

support / talk with founders

Schedule Demo 👋
Community Discord 💭
Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

why did we build this

Need for simplicity: our code became extremely complicated as we managed and translated calls between Azure, OpenAI, and Cohere.

Description

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]

Readme MIT 1.1 GiB

Languages: Python 81%, TypeScript 12.2%, JavaScript 5.9%, HTML 0.5%, HCL 0.2%
