Pull requests: NVIDIA/Fuser
Avoid Reduction/Broadcast domains in ValGraph scheduling
#2609
opened Jul 16, 2024 by
jacobhinkle
Allow alias analysis mark candidate on segmented fusion inputs
#2608
opened Jul 16, 2024 by
jjsjann123
Draft
Unit test to try schedule the store of the Mma output from shared to global memory using TMA.
#2605
opened Jul 16, 2024 by
protonu
[Tutorial Multidevice]: communicator, DeviceMesh, Simple pipelining
#2585
opened Jul 12, 2024 by
samnordmann
Print torch/nvfuser versions in python error repro
#2578
opened Jul 11, 2024 by
jacobhinkle
FusionExecutorCache runs a communication-only fusion.
#2575
opened Jul 11, 2024 by
wujingyue
Use _reduce_scatter_base instead of reduce_scatter when possible
#2562
opened Jul 10, 2024 by
samnordmann
[Hopper matmul] Refactor the code which schedules TMA loads for the input Operands
#2508
opened Jul 1, 2024 by
protonu