Pull requests: vllm-project/vllm
[Core] Tweaks to model runner/input builder developer APIs
ready
#6712
opened Jul 24, 2024 by
Yard1
[CI/Build] Build wheel in release model when sccache is not enabled
ready
#6710
opened Jul 23, 2024 by
zifeitong
[CI] [nightly benchmark] Do not re-download sharegpt dataset if exists
ready
#6706
opened Jul 23, 2024 by
cadedaniel
[Bugfix] Miscalculated latency lead to time_to_first_token_seconds inaccurate.
ready
#6686
opened Jul 23, 2024 by
AllenDou
[mypy] Enable following imports for some directories
ready
#6681
opened Jul 23, 2024 by
DarkLight1337
[ Kernel ] Tuned FP8 Kernels for Ada Lovelace
ready
#6677
opened Jul 23, 2024 by
varun-sundar-rabindranath
[Kernels] Add fp8 support to reshape_and_cache_flash
ready
#6667
opened Jul 23, 2024 by
Yard1
[Draft] [Speculative decoding] Use SPMD worker to reduce control plane communication
ready
#6664
opened Jul 23, 2024 by
cadedaniel
Draft
[Misc] Fix attribute error when accessing compiled_dag
ready
#6663
opened Jul 22, 2024 by
ruisearch42
[Kernel] Add dynamic asymmetric quantization kernel
#6651
opened Jul 22, 2024 by
ProExpertProg
Draft
[ CI ] Awq Marlin Integration Tests
ready
#6627
opened Jul 22, 2024 by
robertgshaw2-neuralmagic
[Frontend] Represent tokens with identifiable strings
ready
#6626
opened Jul 21, 2024 by
ezliu