The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
CLIP as a service - Embed images and sentences, and perform object recognition, visual reasoning, image classification, and reverse image search
Online inference API for NLP Transformer models - summarization, text classification, sentiment analysis, and more
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adaptation (LoRA), and gain hands-on experience with Predibase's LoRAX inference server.
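To make the KV-caching idea mentioned above concrete, here is a minimal, framework-free sketch (not taken from any of the listed projects; all names, the toy dimension `d`, and the single-head setup are illustrative assumptions). During autoregressive decoding, the key/value projections of already-generated tokens are cached, so each new step only projects the newest token instead of re-projecting the whole sequence:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                        # toy model/head dimension (assumption)
W_k = rng.standard_normal((d, d))   # key projection weights
W_v = rng.standard_normal((d, d))   # value projection weights

k_cache, v_cache = [], []    # grow by one entry per decoded token

def decode_step(x_new):
    """Project only the new token, append it to the cache,
    and attend over the full cached history."""
    k_cache.append(x_new @ W_k)
    v_cache.append(x_new @ W_v)
    K = np.stack(k_cache)            # (t, d) keys for all tokens so far
    V = np.stack(v_cache)            # (t, d) values for all tokens so far
    scores = K @ x_new / np.sqrt(d)  # new token's attention over history
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()         # softmax
    return weights @ V               # attention output for the new token

for _ in range(4):                   # decode 4 toy tokens
    out = decode_step(rng.standard_normal(d))

print(len(k_cache))  # 4 cached key vectors, one per decoded token
```

The payoff is that per-step cost stays proportional to the sequence length only in the attention lookup, not in the projections; production servers such as those covered in the course add batching, paging, and eviction on top of this basic pattern.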