Issues: vllm-project/vllm
[New Model]: Codestral Mamba
new model (Requests to new models)
#6479 opened Jul 16, 2024 by K-Mistele
[Bug]: AttributeError: '_OpNamespace' '_C' object has no attribute 'rotary_embedding' / gemma-2-9b with vllm=0.5.2
bug (Something isn't working)
#6478 opened Jul 16, 2024 by choco9966
[Bug]: Gemma 27B crashes on GCP A100
bug (Something isn't working)
#6477 opened Jul 16, 2024 by noamgat
[Bug]: [vllm-openvino]: ValueError: `use_cache` was set to True but the loaded model only supports use_cache=False.
bug (Something isn't working)
#6473 opened Jul 16, 2024 by HPUedCSLearner
[Installation]: Unable to build docker image for v0.5.2
installation (Installation problems)
#6472 opened Jul 16, 2024 by rajeevbaalwan
[Feature]: Pipeline parallelism support for qwen model
feature request
#6471 opened Jul 16, 2024 by hiyforever
[Usage]: PeftModelForCausalLM is not JSON serializable
usage (How to use vllm)
#6469 opened Jul 16, 2024 by jazzisfuture
[Performance]: [Speculative Decoding] Measurement of Cost Coefficient through vLLM
performance (Performance-related issues)
#6468 opened Jul 16, 2024 by bong-furiosa
[Bug]: failed when run Qwen2-54B-A14B-GPTQ-Int4(MOE)
bug (Something isn't working)
#6465 opened Jul 16, 2024 by weiminw
unable to run vllm model deployment
bug (Something isn't working)
#6464 opened Jul 16, 2024 by riyajatar37003
[Bug]: Can't load gemma-2-9b-it with vllm 0.5.2
bug (Something isn't working)
#6462 opened Jul 16, 2024 by vlsav
[Bug]: No metrics exposed at /metrics with 0.5.2 (0.5.1 is fine), possible regression?
bug (Something isn't working)
#6461 opened Jul 16, 2024 by frittentheke
[Bug]: vLLM is unable to load Mistral on Inferentia and AWS neuron
bug (Something isn't working)
#6452 opened Jul 15, 2024 by servient-ashwin
[Bug]: Seed issue with Pipeline Parallel
bug (Something isn't working)
#6449 opened Jul 15, 2024 by andoorve
[Bug]: TypeError: 'NoneType' object is not callable when start Gemma2-27b-it
bug (Something isn't working)
#6445 opened Jul 15, 2024 by candowu
[Bug]: Severe computation errors when batching request for microsoft/Phi-3-mini-128k-instruct
bug (Something isn't working)
#6438 opened Jul 15, 2024 by lance0108
v0.5.2, v0.5.3, v0.6.0 Release Tracker
release (Related to new version release)
#6434 opened Jul 15, 2024 by simon-mo · 1 of 4 tasks
[Bug]: autogen can't work with vllm v0.5.1
bug (Something isn't working)
#6432 opened Jul 15, 2024 by tonyaw
[Bug]: illegal memory access when increase max_model_length on FP8 models
bug (Something isn't working)
#6429 opened Jul 15, 2024 by IEI-mjx
[Feature]: Apply chat template through `LLM` class
feature request, good first issue (Good for newcomers)
#6416 opened Jul 13, 2024 by robertgshaw2-neuralmagic
[Bug]: Timeout Error When Deploying Llamafied InternLM2-5-7B-Chat-1M Model via vLLM OpenAI API Server
bug (Something isn't working)
#6414 opened Jul 13, 2024 by mf-skjung
[Misc]: _run_workers_async function of DistributedGPUExecutorAsync
misc
#6400 opened Jul 13, 2024 by HMJW
[Bug]: Gemma-2 + FlashInfer: ValueError: Unsupported max_frags_z:
bug (Something isn't working)
#6395 opened Jul 12, 2024 by HanGuo97
[Bug]: Problem loading Gemma 2 27b-it
bug (Something isn't working)
#6387 opened Jul 12, 2024 by rdaiello