Hangs were reported on Jetson Orin AGX when CUDA_SCALE_LAUNCH_QUEUES=4x is set. This reverts the previous PR (#19042) and updates the documentation to suggest setting CUDA_SCALE_LAUNCH_QUEUES=4x manually for faster throughput on multi-GPU systems.
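A minimal sketch of opting in per shell session, assuming a multi-GPU host where the hang described above has not been observed. Only the variable name and the `4x` value come from the commit message; the `echo` line is added here purely to confirm the setting.

```shell
# Opt in for the current shell session only; nothing is persisted.
export CUDA_SCALE_LAUNCH_QUEUES=4x

# Confirm the value that child processes will inherit.
echo "CUDA_SCALE_LAUNCH_QUEUES=${CUDA_SCALE_LAUNCH_QUEUES}"

# Launch the CUDA program from this same shell so it sees the variable.
# To go back to the default queue size, run: unset CUDA_SCALE_LAUNCH_QUEUES
```

Setting the variable per session rather than system-wide keeps it easy to drop on devices such as Jetson Orin AGX, where hangs were reported.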

| Name |
|---|
| android |
| backend |
| development |
| multimodal |
| ops |
| android.md |
| build-riscv64-spacemit.md |
| build-s390x.md |
| build.md |
| docker.md |
| function-calling.md |
| install.md |
| llguidance.md |
| multimodal.md |
| ops.md |
| preset.md |
| speculative.md |