References

BLC13

Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. 2013. URL: https://arxiv.org/abs/1308.3432, doi:10.48550/ARXIV.1308.3432.

CWZZ17

Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. A survey of model compression and acceleration for deep neural networks. 2017. URL: https://arxiv.org/abs/1710.09282, doi:10.48550/ARXIV.1710.09282.

CHS+16

Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. 2016. URL: https://arxiv.org/abs/1602.02830, doi:10.48550/ARXIV.1602.02830.

CFH+22

Matteo Croci, Massimiliano Fasi, Nicholas Higham, Theo Mary, and Mantas Mikaitis. Stochastic rounding: implementation, error analysis and applications. Royal Society Open Science, 9(3):211631, March 2022. doi:10.1098/rsos.211631.

GKD+21

Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, and Kurt Keutzer. A survey of quantization methods for efficient neural network inference. 2021. URL: https://arxiv.org/abs/2103.13630, doi:10.48550/ARXIV.2103.13630.

HMD15

Song Han, Huizi Mao, and William J. Dally. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. 2015. URL: https://arxiv.org/abs/1510.00149, doi:10.48550/ARXIV.1510.00149.

HWZ+16

Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, and Yuheng Zou. Effective quantization methods for recurrent neural networks. 2016. URL: https://arxiv.org/abs/1611.10176, doi:10.48550/ARXIV.1611.10176.

HCS+16

Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Quantized neural networks: training neural networks with low precision weights and activations. 2016. URL: https://arxiv.org/abs/1609.07061, doi:10.48550/ARXIV.1609.07061.

Kri18

Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: a whitepaper. 2018. URL: https://arxiv.org/abs/1806.08342, doi:10.48550/ARXIV.1806.08342.

LBBH98

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. doi:10.1109/5.726791.

LZL16

Fengfu Li, Bo Zhang, and Bin Liu. Ternary weight networks. 2016. URL: https://arxiv.org/abs/1605.04711, doi:10.48550/ARXIV.1605.04711.

RORF16

Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. 2016. URL: https://arxiv.org/abs/1603.05279, doi:10.48550/ARXIV.1603.05279.

TA21

Koki Tsubota and Kiyoharu Aizawa. Comprehensive comparisons of uniform quantizers for deep image compression. In 2021 IEEE International Conference on Image Processing (ICIP), 2089–2093. 2021. doi:10.1109/ICIP42928.2021.9506497.

WJZ+20

Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, and Paulius Micikevicius. Integer quantization for deep learning inference: principles and empirical evaluation. 2020. URL: https://arxiv.org/abs/2004.09602, doi:10.48550/ARXIV.2004.09602.