mirror of
https://github.com/tiennm99/danswer-docs.git
synced 2026-06-17 12:48:27 +00:00
Fix Request Model Docs (#7)
@@ -20,9 +20,12 @@ Refer to the code [here](https://github.com/danswer-ai/danswer/blob/main/backend
 - [https://medium.com/@yuhongsun96/host-a-llama-2-api-on-gpu-for-free-a5311463c183](https://medium.com/@yuhongsun96/host-a-llama-2-api-on-gpu-for-free-a5311463c183)
 - This demo uses Google Colab to access a free GPU but this is not suitable for long term deployments

-## Set Danswer to use the LLM model server
+## Set Danswer to use an LLM behind a REST API
+There is an offering from HuggingFace called "Inference Endpoints" where users can rent dedicated hardware and host
+HuggingFace compatible models behind a REST API.
+Danswer works out of the box with any text-generation HuggingFace models hosted this way.
 - INTERNAL_MODEL_VERSION=request-completion
-- GEN_AI_HOST_TYPE=colab-demo
+- GEN_AI_HOST_TYPE=huggingface
 - or reference your custom class
-- GEN_AI_ENDPOINT=<your-model-endpoint-url>
+- GEN_AI_ENDPOINT=<your-huggingface-inference-endpoint-url>

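Before pointing Danswer at an endpoint, it can help to confirm that the URL placed in `GEN_AI_ENDPOINT` answers a plain text-generation request. The sketch below is not Danswer's request code. It assumes the endpoint accepts the common text-generation shape (a JSON body with an `inputs` field, answered by a list of objects carrying `generated_text`), and the `HF_API_TOKEN` variable name and the helper function names are hypothetical, chosen only for illustration.

```python
import json
import os
import urllib.request


def build_request(endpoint, prompt, api_token=None, max_new_tokens=64):
    """Build a POST request in the assumed text-generation shape."""
    headers = {"Content-Type": "application/json"}
    if api_token:
        # Assumption: the endpoint expects a bearer token when it is protected.
        headers["Authorization"] = f"Bearer {api_token}"
    body = json.dumps(
        {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    ).encode("utf-8")
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def extract_text(response_json):
    """Pull the generated text out of the assumed response shape.

    Assumption: the reply looks like [{"generated_text": "..."}].
    Anything else yields an empty string instead of raising.
    """
    if isinstance(response_json, list) and response_json:
        first = response_json[0]
        if isinstance(first, dict):
            return first.get("generated_text", "")
    return ""


def check_endpoint(prompt="Hello"):
    """Send one prompt to the URL in GEN_AI_ENDPOINT and return the reply text."""
    endpoint = os.environ["GEN_AI_ENDPOINT"]
    token = os.environ.get("HF_API_TOKEN")  # hypothetical variable name
    request = build_request(endpoint, prompt, token)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_text(json.load(response))
```

With `GEN_AI_ENDPOINT` exported in the shell, calling `check_endpoint()` from a Python prompt should return generated text if the endpoint follows the assumed shape; an empty string or an HTTP error suggests the URL, token, or request format needs another look.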