References
- BLC13
Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. 2013. URL: https://arxiv.org/abs/1308.3432, doi:10.48550/ARXIV.1308.3432.
- CWZZ17
Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. A survey of model compression and acceleration for deep neural networks. 2017. URL: https://arxiv.org/abs/1710.09282, doi:10.48550/ARXIV.1710.09282.
- CHS+16
Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. 2016. URL: https://arxiv.org/abs/1602.02830, doi:10.48550/ARXIV.1602.02830.
- CFH+22
Matteo Croci, Massimiliano Fasi, Nicholas Higham, Theo Mary, and Mantas Mikaitis. Stochastic rounding: implementation, error analysis and applications. Royal Society Open Science, 9, March 2022. doi:10.1098/rsos.211631.
- GKD+21
Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, and Kurt Keutzer. A survey of quantization methods for efficient neural network inference. 2021. arXiv:2103.13630.
- HMD15
Song Han, Huizi Mao, and William J. Dally. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. 2015. URL: https://arxiv.org/abs/1510.00149, doi:10.48550/ARXIV.1510.00149.
- HWZ+16
Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, and Yuheng Zou. Effective quantization methods for recurrent neural networks. 2016. URL: https://arxiv.org/abs/1611.10176, doi:10.48550/ARXIV.1611.10176.
- HCS+16
Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Quantized neural networks: training neural networks with low precision weights and activations. 2016. URL: https://arxiv.org/abs/1609.07061, doi:10.48550/ARXIV.1609.07061.
- Kri18
Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: a whitepaper. 2018. arXiv:1806.08342.
- LBBH98
Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. doi:10.1109/5.726791.
- LZL16
Fengfu Li, Bo Zhang, and Bin Liu. Ternary weight networks. 2016. URL: https://arxiv.org/abs/1605.04711, doi:10.48550/ARXIV.1605.04711.
- RORF16
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. 2016. URL: https://arxiv.org/abs/1603.05279, doi:10.48550/ARXIV.1603.05279.
- TA21
Koki Tsubota and Kiyoharu Aizawa. Comprehensive comparisons of uniform quantizers for deep image compression. In 2021 IEEE International Conference on Image Processing (ICIP), 2089–2093. 2021. doi:10.1109/ICIP42928.2021.9506497.
- WJZ+20
Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, and Paulius Micikevicius. Integer quantization for deep learning inference: principles and empirical evaluation. 2020. arXiv:2004.09602.