Portfolio item number 1
Short description of portfolio item number 1
Short description of portfolio item number 2
Published in ICML, 2025
An on-the-fly MoE inference system for memory-constrained GPUs, built on the insight that sparsely activated experts contain substantial untapped redundancy.
Recommended citation: Zhou, Y., Li, Z., Zhang, J., Wang, J., Wang, Y., Xie, Z., Chen, K., & Shou, L. (2025). FloE: On-the-Fly MoE Inference on Memory-constrained GPU. arXiv. https://arxiv.org/abs/2505.05950 https://arxiv.org/pdf/2505.05950v2
This is a description of your talk, which is a markdown file that can be markdown-ified like any other post. Yay markdown!
This is a description of your conference proceedings talk. Note the different value in the type field; you can put anything in this field.
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.