Issues: NVIDIA/TensorRT-LLM

Issues list

Is MPI required even multi device is disabled? [bug]
#1959 opened Jul 16, 2024 by jlewi
4 tasks
Model Performance Degraded when using BFLOAT16 LoRa Adapters [bug]
#1957 opened Jul 16, 2024 by TheCodeWrangler
1 of 4 tasks
LLAMA checkpoint ImportError: undefined symbol [bug]
#1950 opened Jul 16, 2024 by Pareek-Yash
2 of 4 tasks
Does tensorrt-llm support blip2 with fp8 quantization?? [question]
#1949 opened Jul 15, 2024 by SVT-Yang
[Feature]: FlashAttention 3 support [feature request]
#1947 opened Jul 15, 2024 by fan-niu
How to use Medusa to support non llama models? [question]
#1946 opened Jul 15, 2024 by skyCreateXian
2 of 4 tasks
How to quantize customed models, such as LVM? [question]
#1945 opened Jul 15, 2024 by XA23i
[new] discord channel for tensorrt [question]
#1943 opened Jul 13, 2024 by geraldstanje
4 tasks
Mixtral-8x7B repetitive answers [bug] [Investigating]
#1942 opened Jul 12, 2024 by BugsBuggy
2 of 4 tasks
Inquiry Regarding the Use of FP8 Type in GEMM Computations [question]
#1940 opened Jul 12, 2024 by unbelievable3513
problem with tensorrt_llm performance [bug]
#1938 opened Jul 12, 2024 by Arnold1
4 tasks
[Model Request] InternVL2.0 support [feature request]
#1934 opened Jul 11, 2024 by BasicCoder
Cannot install tensorrt_llm [bug]
#1933 opened Jul 11, 2024 by Dawn-2-Winter
1 of 4 tasks
GPU OOM Error When Quantizing Llama 3 8b [bug]
#1932 opened Jul 11, 2024 by ngockhanh5110
2 of 4 tasks
[model request] PaliGemma support [feature request]
#1931 opened Jul 11, 2024 by kitterive
failed to load whisper decoder engine with paged kv cache [bug]
#1930 opened Jul 10, 2024 by MahmoudAshraf97
3 of 4 tasks