MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone
ModelScope: bring the notion of Model-as-a-Service to life.
A state-of-the-art open visual language model | multimodal pre-trained model
Implementation / replication of DALL-E, OpenAI's text-to-image transformer, in PyTorch
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Open Source Routing Engine for OpenStreetMap
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4V
Chinese-English bilingual multimodal conversational language model
A Chinese version of CLIP for Chinese cross-modal retrieval and representation generation.
[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
Build LLM-empowered multi-agent applications more easily.
Represent, send, store and search multimodal data
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Mixture-of-Experts for Large Vision-Language Models
A robust, all-in-one GPT interface for Discord: ChatGPT-style conversations, image generation, AI moderation, custom indexes/knowledge bases, a YouTube summarizer, and more!
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
GPT4V-level open-source multi-modal model based on Llama3-8B
🥂 Gracefully solve hCaptcha challenges with an embedded MoE (ONNX) solution.