Outstanding Open-source Project
Model Adaptive Compression is a technique for compressing deep learning models to improve their efficiency and reduce their computational requirements. As deep learning models grow more complex, their size and computational cost need to be optimized, particularly for resource-limited devices and systems. Here, we highlight two papers, AdaDeep and AdaSpring, which offer ideas and new insights for model compression solutions.
AdaDeep: A Usage-Driven, Automated Deep Model Compression Framework for Enabling Ubiquitous Intelligent Mobiles
Abstract: Recent breakthroughs in Deep Neural Networks (DNNs) have fueled a tremendously growing demand for bringing DNN-powered intelligence into mobile platforms. While the potential of deploying DNNs on resource-constrained platforms has been demonstrated by DNN compression techniques, the current practice suffers from two limitations: 1) only stand-alone compression schemes are investigated, even though each compression technique suits only certain types of DNN layers; and 2) most compression techniques are optimized for DNNs' inference accuracy, without explicitly considering other application-driven system performance (e.g., latency and energy cost) and the varying resource availability across platforms (e.g., storage and processing capability). To this end, we propose AdaDeep, a usage-driven, automated DNN compression framework for systematically exploring the desired trade-off between performance and resource constraints, from a holistic system level. Specifically, in a layer-wise manner, AdaDeep automatically selects the most suitable combination of compression techniques and the corresponding compression hyperparameters for a given DNN. Furthermore, AdaDeep also uncovers multiple novel combinations of compression techniques.
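To make the idea of layer-wise selection under resource constraints concrete, here is a minimal Python sketch. It is not AdaDeep's method: the layer names, technique names, and all cost numbers are hypothetical, and a brute-force search over a tiny candidate table stands in for whatever optimizer the paper uses. It only illustrates the problem shape: pick one compression technique per layer so that total latency and storage stay within budget while the estimated accuracy loss is minimized.

```python
from itertools import product

# Hypothetical per-layer candidates (illustrative numbers, not measured):
# (technique, estimated accuracy loss in %, latency in ms, storage in KB)
CANDIDATES = {
    "conv1": [("none", 0.0, 4.0, 120), ("depthwise_sep", 0.4, 1.5, 40)],
    "conv2": [("none", 0.0, 6.0, 300), ("depthwise_sep", 0.6, 2.0, 90)],
    "fc1": [("none", 0.0, 2.0, 800), ("low_rank", 0.3, 1.0, 200), ("sparse", 0.8, 0.8, 100)],
}


def select_combination(candidates, max_latency_ms, max_storage_kb):
    """Return (total_loss, {layer: technique}) for the feasible combination
    with the smallest estimated accuracy loss, or None if nothing fits."""
    layers = list(candidates)
    best = None
    # Enumerate every per-layer choice; fine for a toy table, exponential in general.
    for combo in product(*(candidates[layer] for layer in layers)):
        loss = sum(c[1] for c in combo)
        latency = sum(c[2] for c in combo)
        storage = sum(c[3] for c in combo)
        if latency > max_latency_ms or storage > max_storage_kb:
            continue  # violates a resource budget
        if best is None or loss < best[0]:
            best = (loss, {layer: c[0] for layer, c in zip(layers, combo)})
    return best


result = select_combination(CANDIDATES, max_latency_ms=6.0, max_storage_kb=400)
print(result)
```

With these made-up numbers, different layers end up with different techniques, which is the point of combining compression schemes instead of applying a single one to the whole network.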
AdaSpring: Context-adaptive and Runtime-evolutionary Deep Model Compression for Mobile Applications
Abstract: Many deep learning (e.g., DNN) powered mobile and wearable applications today continuously and unobtrusively sense their ambient surroundings to enhance all aspects of human lives. To enable robust and private mobile sensing, DNNs tend to be deployed locally on resource-constrained mobile devices via model compression. Current practice, whether hand-crafted DNN compression techniques (i.e., for optimizing DNN-related performance such as parameter size) or on-demand DNN compression methods (i.e., for optimizing hardware-dependent metrics such as latency), cannot run locally online because it requires offline retraining to ensure accuracy. Also, none of these approaches have correlated their efforts with runtime adaptive compression to account for the dynamic deployment context of mobile applications. To address these challenges, we present AdaSpring, a context-adaptive and self-evolutionary DNN compression framework. It enables runtime adaptive DNN compression locally online. Specifically, it presents the ensemble training of a retraining-free and self-evolutionary network to integrate multiple alternative DNN compression configurations (i.e., compressed architectures and weights). It then introduces a runtime search strategy to quickly search for the most suitable compression configurations and evolve the corresponding weights.
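The runtime selection step described above can be illustrated with a short Python sketch. This is an assumption-laden toy, not AdaSpring's search strategy: the `Config` class, the configuration names, and every number are hypothetical, and a simple linear scan replaces the paper's runtime search. It assumes a set of compression configurations is already available without retraining, and shows how the choice can change as the deployment context (latency and memory budgets) changes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """A hypothetical pre-trained compression configuration."""
    name: str
    accuracy: float    # estimated accuracy (illustrative)
    latency_ms: float  # estimated inference latency
    memory_mb: float   # estimated memory footprint


# Illustrative values only; a real system would measure these on the device.
CONFIGS = [
    Config("full", 0.92, 40.0, 12.0),
    Config("pruned_50", 0.90, 22.0, 6.0),
    Config("pruned_75_lowrank", 0.86, 12.0, 3.0),
]


def pick_config(configs, latency_budget_ms, memory_budget_mb):
    """Pick the most accurate configuration that fits the current budgets.
    If none fits, fall back to the fastest one (a design choice of this sketch)."""
    feasible = [
        c for c in configs
        if c.latency_ms <= latency_budget_ms and c.memory_mb <= memory_budget_mb
    ]
    if not feasible:
        return min(configs, key=lambda c: c.latency_ms)
    return max(feasible, key=lambda c: c.accuracy)


# The context changes at runtime, e.g. memory becomes scarce.
print(pick_config(CONFIGS, latency_budget_ms=50.0, memory_budget_mb=16.0).name)
print(pick_config(CONFIGS, latency_budget_ms=25.0, memory_budget_mb=4.0).name)
```

Because no retraining happens inside `pick_config`, the selection is cheap enough to rerun whenever the context changes, which is the property the abstract emphasizes.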