Administrative information

Title | Model Compression - Edge Computing
Duration | 45 mins
Module | C
Lesson Type | Lecture
Focus | Technical - Future AI
Topic | Advances in ML models through a HC lens - A Result-Oriented Study

Keywords
model compression, pruning, quantization, knowledge distillation
Learning Goals
- Understand the concept of model compression
- Explain the rationale behind the techniques of pruning, quantization and knowledge distillation
- Prepare for understanding basic implementations using a high-level framework such as TensorFlow (see the illustrative sketch below)
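
To support the last goal, a minimal magnitude-pruning sketch in TensorFlow/Keras follows. It is an illustrative example only, not part of the official lesson materials: the helper magnitude_prune and the toy model are assumptions chosen for brevity, and they simply zero out the smallest-magnitude weights of each Dense layer (unstructured pruning).

import numpy as np
import tensorflow as tf

def magnitude_prune(model: tf.keras.Model, sparsity: float = 0.5) -> None:
    """Illustrative magnitude pruning: zero the `sparsity` fraction of
    smallest-magnitude weights in every Dense layer."""
    for layer in model.layers:
        if not isinstance(layer, tf.keras.layers.Dense):
            continue
        kernel, bias = layer.get_weights()
        # Threshold below which `sparsity` of the weight magnitudes fall.
        threshold = np.quantile(np.abs(kernel), sparsity)
        mask = (np.abs(kernel) >= threshold).astype(kernel.dtype)
        layer.set_weights([kernel * mask, bias])

# Hypothetical toy model, pruned to 50% sparsity per Dense layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),
])
magnitude_prune(model, sparsity=0.5)

In practice the zeroed weights must be kept at zero during any subsequent fine-tuning (e.g. by reapplying the mask), which is what dedicated tooling such as the TensorFlow Model Optimization Toolkit automates.
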
Expected Preparation
Learning Events to be Completed Before
Obligatory for Students
- Knowledge of supervised learning theory
- Introduction to machine learning and deep learning concepts, as covered in previous lectures
Optional for Students
- Knowledge of the most common hyperparameters involved in building neural networks
References and background for students
- Knowledge distillation - Easy
- Song Han, et al. "Learning both Weights and Connections for Efficient Neural Networks". CoRR abs/1506.02626. (2015).
- Song Han, et al. "Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding." 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016. Yanzhi Wang, et al. "Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?". CoRR abs/1907.02124. (2019).
- Cheong and Daniel. "transformers.zip: Compressing Transformers with Pruning and Quantization".
- Song Han, et al. "Learning both Weights and Connections for Efficient Neural Networks". CoRR abs/1506.02626. (2015).
- Davis W. Blalock, et al. "What is the State of Neural Network Pruning?." Proceedings of Machine Learning and Systems 2020, MLSys 2020, Austin, TX, USA, March 2-4, 2020. mlsys.org, 2020.
- https://github.com/kingreza/quantization
- Song Han, et al. "Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding." 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. 2016.
- Zhi Gang Liu, et al. "Learning Low-precision Neural Networks without Straight-Through Estimator (STE)." Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019. ijcai.org, 2019.
- Peiqi Wang, et al. "HitNet: Hybrid Ternary Recurrent Neural Network." Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, December 3-8, 2018, Montréal, Canada. 2018.
- Cristian Bucila, et al. "Model compression." Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006. ACM, 2006.
- Geoffrey E. Hinton, et al. "Distilling the Knowledge in a Neural Network". CoRR abs/1503.02531. (2015).
- https://towardsdatascience.com/knowledge-distillation-simplified-dd4973dbc764
- https://www.ttic.edu/dl/dark14.pdf
- https://josehoras.github.io/knowledge-distillation/
Lesson materials
The materials of this learning event are available under CC BY-NC-SA 4.0.
Instructions for Teachers
- Provide insight into trends and why models are growing
- Give examples and reasons why it is necessary to have smaller models
- Provide an overview of the techniques, their pros and cons
- Propose pop-up quizzes
- Try to stick to the timetable
- If needed, allow more time for the question-and-answer session
The lecture can refer to model types, model evaluation, model fitting and model optimization.
Outline
Duration | Description | Concepts | Activity
0-10 min | Introduction to techniques for model compression: what it is, what it is for, when and why it is needed | Model compression | Introduction to main concepts
10-20 min | Pruning: concepts and techniques. Main approaches to pruning | Pruning | Taught session and examples
20-30 min | Quantization: concepts and techniques. Main approaches to quantization | Quantization | Taught session and examples
30-40 min | Knowledge distillation: concepts and techniques. Main approaches to knowledge distillation | Knowledge distillation | Taught session and examples
40-45 min | Conclusion, questions and answers | Summary | Conclusions
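
The quantization and knowledge-distillation sessions in the outline above can likewise be illustrated with short, hedged sketches. The first assumes post-training quantization via the TensorFlow Lite converter; the helper name, the model variable and the output file name are illustrative assumptions, not prescribed lesson code.

import tensorflow as tf

def quantize_to_tflite(model: tf.keras.Model) -> bytes:
    """Illustrative post-training quantization: convert a trained Keras model
    to TensorFlow Lite with the default (8-bit weight) optimizations."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()

# Hypothetical usage with an already-trained model:
# tflite_bytes = quantize_to_tflite(trained_model)
# with open("model_quantized.tflite", "wb") as f:
#     f.write(tflite_bytes)

The second sketch follows the softened-softmax formulation of Hinton et al. (2015) from the reference list: the student is trained on a weighted combination of the hard-label loss and the KL divergence between the temperature-scaled teacher and student distributions. The function name and the default alpha and temperature values are assumptions for illustration.

import tensorflow as tf

def distillation_loss(labels, teacher_logits, student_logits,
                      temperature: float = 4.0, alpha: float = 0.5):
    """Illustrative knowledge-distillation loss (Hinton et al., 2015)."""
    # Hard-label term: ordinary cross-entropy against the ground-truth labels.
    hard = tf.keras.losses.sparse_categorical_crossentropy(
        labels, student_logits, from_logits=True)
    # Soft-label term: KL divergence between softened teacher and student outputs.
    soft_teacher = tf.nn.softmax(teacher_logits / temperature)
    soft_student = tf.nn.softmax(student_logits / temperature)
    soft = tf.keras.losses.kl_divergence(soft_teacher, soft_student)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft
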
More information
Click here for an overview of all lesson plans of the Master Human-Centred AI
Please visit the home page of the HCAIM consortium
Acknowledgements
The Human-Centered AI Masters programme was co-financed by the Connecting Europe Facility of the European Union under Grant №CEF-TC-2020-1 Digital Skills 2020-EU-IA-0068.
The materials of this learning event are available under CC BY-NC-SA 4.0.
The HCAIM consortium consists of three excellence centres, three SMEs and four universities.