References

BLC13

Yoshua Bengio, Nicholas Léonard, and Aaron Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. 2013. URL: https://arxiv.org/abs/1308.3432, doi:10.48550/ARXIV.1308.3432.

CWZZ17

Yu Cheng, Duo Wang, Pan Zhou, and Tao Zhang. A survey of model compression and acceleration for deep neural networks. 2017. URL: https://arxiv.org/abs/1710.09282, doi:10.48550/ARXIV.1710.09282.

CHS+16

Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. 2016. URL: https://arxiv.org/abs/1602.02830, doi:10.48550/ARXIV.1602.02830.

CFH+22

Matteo Croci, Massimiliano Fasi, Nicholas Higham, Theo Mary, and Mantas Mikaitis. Stochastic rounding: implementation, error analysis and applications. Royal Society Open Science, 9(3):211631, March 2022. doi:10.1098/rsos.211631.

GKD+21

Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, and Kurt Keutzer. A survey of quantization methods for efficient neural network inference. 2021. URL: https://arxiv.org/abs/2103.13630, doi:10.48550/ARXIV.2103.13630.

HMD15

Song Han, Huizi Mao, and William J. Dally. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. 2015. URL: https://arxiv.org/abs/1510.00149, doi:10.48550/ARXIV.1510.00149.

HWZ+16

Qinyao He, He Wen, Shuchang Zhou, Yuxin Wu, Cong Yao, Xinyu Zhou, and Yuheng Zou. Effective quantization methods for recurrent neural networks. 2016. URL: https://arxiv.org/abs/1611.10176, doi:10.48550/ARXIV.1611.10176.

HCS+16

Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Quantized neural networks: training neural networks with low precision weights and activations. 2016. URL: https://arxiv.org/abs/1609.07061, doi:10.48550/ARXIV.1609.07061.

Kri18

Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: a whitepaper. 2018. URL: https://arxiv.org/abs/1806.08342, doi:10.48550/ARXIV.1806.08342.

LBBH98

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. doi:10.1109/5.726791.

LZL16

Fengfu Li, Bo Zhang, and Bin Liu. Ternary weight networks. 2016. URL: https://arxiv.org/abs/1605.04711, doi:10.48550/ARXIV.1605.04711.

RORF16

Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, and Ali Farhadi. XNOR-Net: ImageNet classification using binary convolutional neural networks. 2016. URL: https://arxiv.org/abs/1603.05279, doi:10.48550/ARXIV.1603.05279.

TA21

Koki Tsubota and Kiyoharu Aizawa. Comprehensive comparisons of uniform quantizers for deep image compression. In 2021 IEEE International Conference on Image Processing (ICIP), 2089–2093. 2021. doi:10.1109/ICIP42928.2021.9506497.

WJZ+20

Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, and Paulius Micikevicius. Integer quantization for deep learning inference: principles and empirical evaluation. 2020. URL: https://arxiv.org/abs/2004.09602, doi:10.48550/ARXIV.2004.09602.