Band

Jun 27, 2022

Photo from Band Paper

Summary

Band is the first mobile inference platform to support multi-DNN workloads on heterogeneous mobile processors. Existing mobile deep learning frameworks such as TensorFlow Lite focus on single-DNN inference and therefore cannot fully handle multi-DNN workloads across heterogeneous processors. The limited operator support of individual accelerators further complicates the problem. Band tackles these challenges by partitioning DNNs into subgraphs, dynamically selecting schedules for those subgraphs, and accounting for fallback when a processor does not support an operator. Evaluation results show that Band outperforms TensorFlow Lite by up to 5.04× for single-app multi-DNN workloads and achieves a 3.76× higher satisfaction rate for latency-critical multi-app scenarios.
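To make the partitioning idea concrete, here is a minimal sketch under simplifying assumptions. It is not Band's actual algorithm: it treats a model as a linear chain of operators (real DNN graphs can branch), and the `Op`, `Subgraph`, and `partition` names as well as the processor labels are hypothetical. It groups consecutive operators that are supported by the same set of processors, so each resulting subgraph can be scheduled as a unit on any processor in that set.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Op:
    name: str
    supported_on: FrozenSet[str]  # hypothetical processor labels


@dataclass
class Subgraph:
    ops: List[Op]
    processors: FrozenSet[str]  # processors that can run every op in this subgraph


def partition(ops: List[Op]) -> List[Subgraph]:
    """Split a linear chain of ops into maximal runs with identical processor support."""
    subgraphs: List[Subgraph] = []
    for op in ops:
        if subgraphs and subgraphs[-1].processors == op.supported_on:
            # Same support set as the previous op: extend the current subgraph.
            subgraphs[-1].ops.append(op)
        else:
            # Support set changed: start a new subgraph at this op.
            subgraphs.append(Subgraph([op], op.supported_on))
    return subgraphs


# Toy model (assumed for illustration): one op is only supported on the CPU.
model = [
    Op("conv1", frozenset({"cpu", "gpu", "npu"})),
    Op("conv2", frozenset({"cpu", "gpu", "npu"})),
    Op("custom_nms", frozenset({"cpu"})),
    Op("dense", frozenset({"cpu", "gpu"})),
]

parts = partition(model)
for sg in parts:
    print([op.name for op in sg.ops], sorted(sg.processors))
```

In this toy model the CPU-only operator ends up in its own subgraph, which illustrates why limited operator support forces a model to be split before it can be scheduled across accelerators.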

With novel findings and extensive evaluation, Band was published at MobiSys 2022.

My Role

Our team consisted of 5 people, and we implemented and evaluated the entire platform together. In addition, I designed and implemented the subgraph partitioning algorithm, which is the core concept of Band.

Accomplishments:

Implemented Band as part of a 5-person team, gaining experience with collaborative practices such as code review and testing
Built Band on top of the well-organized TensorFlow Lite C++ codebase, deepening my understanding of system design and C++ development
Designed and implemented techniques based on system profiling
Gained an understanding of accelerator APIs such as OpenCL and NNAPI while implementing support for heterogeneous processors

On-Device AI

Changmin Jeon

My research interests lie in enhancing scene understanding with deep learning for XR systems.