Issues: vllm-project/vllm
[New Model]: Codestral Mamba
new model (Requests to new models)
#6479 opened Jul 16, 2024 by K-Mistele
[Bug]: AttributeError: '_OpNamespace' '_C' object has no attribute 'rotary_embedding' / gemma-2-9b with vllm=0.5.2
bug (Something isn't working)
#6478 opened Jul 16, 2024 by choco9966
[Bug]: Gemma 27B crashes on GCP A100
bug (Something isn't working)
#6477 opened Jul 16, 2024 by noamgat
[Bug]: [vllm-openvino]: ValueError: `use_cache` was set to True but the loaded model only supports use_cache=False.
bug (Something isn't working)
#6473 opened Jul 16, 2024 by HPUedCSLearner
[Installation]: Unable to build docker image for v0.5.2
installation (Installation problems)
#6472 opened Jul 16, 2024 by rajeevbaalwan
[Feature]: Pipeline parallelism support for qwen model
feature request
#6471 opened Jul 16, 2024 by hiyforever
[Usage]: PeftModelForCausalLM is not JSON serializable
usage (How to use vllm)
#6469 opened Jul 16, 2024 by jazzisfuture
[Performance]: [Speculative Decoding] Measurement of Cost Coefficient through vLLM
performance (Performance-related issues)
#6468 opened Jul 16, 2024 by bong-furiosa
[Bug]: failed when run Qwen2-54B-A14B-GPTQ-Int4(MOE)
bug (Something isn't working)
#6465 opened Jul 16, 2024 by weiminw
unable to run vllm model deployment
bug (Something isn't working)
#6464 opened Jul 16, 2024 by riyajatar37003
[Bug]: Can't load gemma-2-9b-it with vllm 0.5.2
bug (Something isn't working)
#6462 opened Jul 16, 2024 by vlsav
[Bug]: No metrics exposed at /metrics with 0.5.2 (0.5.1 is fine), possible regression?
bug (Something isn't working)
#6461 opened Jul 16, 2024 by frittentheke
[Bug]: vLLM is unable to load Mistral on Inferentia and AWS neuron
bug (Something isn't working)
#6452 opened Jul 15, 2024 by servient-ashwin
[Bug]: Seed issue with Pipeline Parallel
bug (Something isn't working)
#6449 opened Jul 15, 2024 by andoorve
[Bug]: TypeError: 'NoneType' object is not callable when start Gemma2-27b-it
bug (Something isn't working)
#6445 opened Jul 15, 2024 by candowu
[Bug]: Severe computation errors when batching request for microsoft/Phi-3-mini-128k-instruct
bug (Something isn't working)
#6438 opened Jul 15, 2024 by lance0108
v0.5.2, v0.5.3, v0.6.0 Release Tracker
release (Related to new version release)
#6434 opened Jul 15, 2024 by simon-mo · 1 of 4 tasks
[Bug]: autogen can't work with vllm v0.5.1
bug (Something isn't working)
#6432 opened Jul 15, 2024 by tonyaw
[Bug]: illegal memory access when increase max_model_length on FP8 models
bug (Something isn't working)
#6429 opened Jul 15, 2024 by IEI-mjx
[Feature]: Apply chat template through `LLM` class
feature request, good first issue (Good for newcomers)
#6416 opened Jul 13, 2024 by robertgshaw2-neuralmagic
[Bug]: Timeout Error When Deploying Llamafied InternLM2-5-7B-Chat-1M Model via vLLM OpenAI API Server
bug (Something isn't working)
#6414 opened Jul 13, 2024 by mf-skjung
[Misc]: _run_workers_async function of DistributedGPUExecutorAsync
misc
#6400 opened Jul 13, 2024 by HMJW
[Bug]: Gemma-2 + FlashInfer: ValueError: Unsupported max_frags_z:
bug (Something isn't working)
#6395 opened Jul 12, 2024 by HanGuo97
[Bug]: Problem loading Gemma 2 27b-it
bug (Something isn't working)
#6387 opened Jul 12, 2024 by rdaiello