Issues: mlc-ai/mlc-llm
[Question] How to estimate the vRAM the model takes at runtime? (question) #2666, opened Jul 16, 2024 by limin05030
[Question] Running TVM Dlight low-level optimizations ERROR (question) #2661, opened Jul 15, 2024 by ponytaill
[Bug] Android OpenCL matmul kernel: too large unroll parameter causes out of memory (bug) #2650, opened Jul 12, 2024 by shifeiwen
[Question] While running the mlc-llm app on Android, prefill is sometimes very slow (question) #2648, opened Jul 11, 2024 by tombang
[Question] How to optimize the scheduling of multimodal LLM model convolution in mlc (question) #2646, opened Jul 11, 2024 by shifeiwen
[Bug] Qwen2-1.5B-Instruct cannot run on Android phones such as the Huawei Mate50 (8gen1) (bug) #2645, opened Jul 11, 2024 by lengjing606
[Question] Are messages always truncated to the last context_length tokens? (question) #2636, opened Jul 8, 2024 by 0xLienid
[Question] Which conv-template should I choose for the model MobileLLaMA-1.4B-Chat? (question) #2634, opened Jul 8, 2024 by Giustiniano
[Feature Request] MLC LLM support for AutoAWQ quantization (feature request) #2633, opened Jul 8, 2024 by Stephen888888
[Bug] rwkv5 can't run; the error is "kv_cache not found" (bug) #2631, opened Jul 8, 2024 by xinyinan9527
[Question] Multiple LoRA support (question) #2625, opened Jul 4, 2024 by lumiere-ml
[Bug] TypeError: RWKV6Config.__init__() missing 1 required positional argument: 'model_version' (bug) #2623, opened Jul 4, 2024 by zhushuaifeifei
[Bug] error: no template named 'is_base_of_v' in namespace 'std'; did you mean 'is_base_of'? (bug) #2620, opened Jul 3, 2024 by haohenggang
[Feature Request] Support for Qualcomm Snapdragon X Elite PCs (arm64 Windows and WSL2 Linux) as a target (feature request) #2617, opened Jul 2, 2024 by Sing-Li
[Bug] Check failed: (it != type_key2index_.end()) is false: Cannot find type ObjectPath. Did you forget to register the node by TVM_REGISTER_NODE_TYPE (bug) #2602, opened Jun 24, 2024 by raj-khare
[Bug] Fine-tuned model deployed with WebLLM not working (bug) #2601, opened Jun 24, 2024 by JLKaretis
[Question] It takes too much time for the first token to be returned after a request is issued (question) #2595, opened Jun 19, 2024 by dkjung
[Question] Can you programmatically clear the KV cache? (question) #2593, opened Jun 19, 2024 by 0xLienid
[Question] How to use function calling in the MLCChat Android app? (question) #2589, opened Jun 17, 2024 by wqwz111
[Question] How to use the C++ API in a project (question) #2588, opened Jun 17, 2024 by Moxoo
[Question] Batch size of the prefill step (question) #2583, opened Jun 14, 2024 by Jack-liu1998
[Bug] FP8 quantization accuracy loss with TinyLlama-1.1B-Chat-v1.0 (bug) #2579, opened Jun 14, 2024 by razetime