FloE: On-the-Fly MoE Inference on Memory-constrained GPU
Zhou, Y., Li, Z., Zhang, J., Wang, J., Wang, Y., Xie, Z., Chen, K., & Shou, L. (2025). FloE: On-the-Fly MoE Inference on Memory-constrained GPU. arXiv. https://arxiv.org/abs/2505.05950
2023 Global Intelligent Vehicle AI Challenge: AI Large Model Retrieval Question Answering, Excellence Award (4/730)