High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Updated Jul 15, 2024 - C++
Fast Inference of MoE Models with CPU-GPU Orchestration
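The CPU-GPU orchestration behind such MoE serving can be sketched roughly as a hot/cold expert placement: frequently activated experts are pinned to limited GPU memory while rarely used ones stay on the CPU. This is a toy placement policy for illustration only, not the project's actual scheduler; the activation trace and slot count are made up.

```python
from collections import Counter

def place_experts(activations: list[int], gpu_slots: int) -> dict[int, str]:
    """Pin the most frequently activated experts to the GPU and leave
    the rest on the CPU (a toy hot/cold placement policy)."""
    counts = Counter(activations)
    hot = {expert for expert, _ in counts.most_common(gpu_slots)}
    return {expert: ("gpu" if expert in hot else "cpu") for expert in counts}

# Hypothetical expert-activation trace from routing a batch of tokens.
trace = [0, 2, 2, 5, 2, 0, 7, 2, 0]
placement = place_experts(trace, gpu_slots=2)
```

With this trace, experts 2 and 0 dominate and land on the GPU, while the cold experts 5 and 7 remain CPU-resident.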
Tool for testing different large language models without writing code.
LLM chatbot example using OpenVINO with RAG (Retrieval Augmented Generation).
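The RAG pattern such a chatbot follows can be illustrated in a few lines: retrieve the document most relevant to the query, then prepend it as context to the prompt handed to the language model. This is a minimal sketch with a toy word-overlap retriever and an assembled prompt string; it is not the repository's actual code, and a real pipeline (e.g. with OpenVINO) would use an embedding model for retrieval and a compiled LLM for generation.

```python
def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query
    (a toy stand-in for embedding-based similarity search)."""
    query_words = set(query.lower().split())
    return max(docs, key=lambda d: len(query_words & set(d.lower().split())))

def build_prompt(query: str, context: str) -> str:
    """Assemble the augmented prompt the LLM would receive."""
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

# Hypothetical document store.
docs = [
    "OpenVINO optimizes inference on Intel hardware.",
    "RAG grounds model answers in retrieved documents.",
]
query = "What does RAG do?"
prompt = build_prompt(query, retrieve(query, docs))
```

The retrieved context is what lets the model answer from the document store rather than from its parameters alone.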