PerfDojo
Automated ML library generation for heterogeneous architectures
PerfDojo frames performance optimization as a reinforcement-learning game played over a human-readable, mathematically inspired code representation whose transformations are guaranteed to preserve semantics. Built on this framework, PerfLLM combines LLMs with RL to search the transformation space across diverse CPU (x86, Arm, RISC-V) and GPU architectures. On GH200, the resulting code achieves geometric-mean speedups of 6.65× over PyTorch and 13.65× over TVM.
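For readers unfamiliar with the metric, a geometric-mean speedup is the n-th root of the product of per-benchmark speedups, which keeps one outlier kernel from dominating the summary. The following is a minimal sketch of that calculation only; the function name and the timing values are hypothetical and are not taken from the paper or from PerfDojo's code.

```python
import math


def geometric_mean_speedup(baseline_times, optimized_times):
    """Return the geometric mean of per-benchmark speedups.

    Hypothetical helper for illustration; not part of PerfDojo.
    Each speedup is baseline_time / optimized_time for one benchmark.
    """
    if not baseline_times or len(baseline_times) != len(optimized_times):
        raise ValueError("need two non-empty sequences of equal length")
    speedups = [b / o for b, o in zip(baseline_times, optimized_times)]
    # Average in log space, then exponentiate: exp(mean(log(s_i))).
    return math.exp(sum(math.log(s) for s in speedups) / len(speedups))


# Made-up timings (arbitrary units) for three imaginary kernels.
baseline = [8.0, 2.0, 4.0]
optimized = [1.0, 1.0, 1.0]
result = geometric_mean_speedup(baseline, optimized)
print(f"geometric-mean speedup: {result:.2f}x")
```

With these made-up inputs the per-kernel speedups are 8, 2, and 4, so the geometric mean is the cube root of 64, whereas the arithmetic mean would be pulled higher by the 8× kernel.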
- Role: Contributor
- Status: Published, SC 2025
- Paper: arXiv:2511.03586 · ACM DL