Outline of Reading List
>>Reading list to be continuously updated throughout the semester<<
1 Introduction to Deep Learning
Course "Text Book"
[1.0] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. "Deep Learning." MIT Press (2016). [pdf]
High-level Survey
[1.1] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444. [pdf]
2 Convolutional Neural Networks (CNNs)
LeNet: Image Classification on Handwritten Digits
[2.0] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. "Gradient-Based Learning Applied to Document Recognition." Proceedings of the IEEE, 86(11):2278-2324, 1998. [pdf] (Seminal Paper: LeNet) [AYS 1/26/22]
Image Classification on ImageNet
[2.1] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012. [pdf] [JOC 1/26/22]
[2.2] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014). [pdf] (VGG Network) [JOC 1/26/22]
[2.3] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [pdf] (GoogleNet)
[2.4] He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015). [pdf] (ResNet; see the residual-block sketch at the end of this section) [RL 1/26/22]
[2.5] Huang, G. et al. "Densely Connected Convolutional Networks." arXiv preprint arXiv:1608.06993 (2017) [pdf] (DenseNet) [RL 1/26/22]
[2.6] Hu, Jie et al. "Squeeze-and-Excitation Networks." arXiv preprint arXiv:1709.01507 (2017) [pdf]
[2.7] Howard, A. G. et al. "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications." [pdf] [RL 1/26/22]
[2.8] Tan, M. and Le, Q. V. "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." [pdf]
[2.9] Xie, Q. et al. "Self-training with Noisy Student improves ImageNet classification." [pdf] [JOC 1/26/22]
[2.10] Bojarski, M. et al. "End to End Learning for Self-Driving Cars." [pdf] [AYS 1/26/22]
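A minimal sketch of the residual block at the heart of [2.4] (PyTorch assumed; the class name and sizes are illustrative, not taken from the paper's code): the convolutional layers learn a residual F(x) and the block outputs F(x) + x, which keeps very deep networks trainable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Two 3x3 convolutions plus an identity skip connection."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # first conv + batch norm + ReLU
        out = self.bn2(self.conv2(out))        # second conv + batch norm
        return F.relu(out + x)                 # residual connection: F(x) + x

x = torch.randn(1, 64, 32, 32)
print(BasicBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```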
3 Object Detection and Segmentation
Detection (an IoU / non-maximum-suppression sketch follows this sublist):
[3.0] H. A. Rowley, S. Baluja, and T. Kanade, “Neural network-based face detection,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognition, pp. 203–208, 1996. [pdf]
[3.1] P. Viola and M. Jones, “Rapid object detection using a boosted cascade of simple features,” in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. [pdf]
[3.2] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013. [pdf]
[3.2b] Sermanet, Eigen, Zhang, Mathieu, Fergus, and LeCun, "OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks," 2014. [pdf]
[3.3] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. [pdf] (R-CNN) [SC 02/02/22]
[3.4] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. [pdf] (SPPNet)
[3.5] Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2015. [pdf] [SC 02/02/22]
[3.6] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015. [pdf] [SC 02/02/22]
[3.7] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). [pdf]
[3.8] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015). [pdf]
[3.9] Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016). [pdf]
[3.10] K. He et al. "Mask R-CNN." arXiv preprint arXiv:1703.06870 (2017). [pdf] [JW 02/02/22]
[3.11] Tsung-Yi Lin et al. "Feature Pyramid Networks for Object Detection." arXiv:1612.03144 (2017). [pdf] [JW 02/02/22]
[3.12] Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang: “CenterNet: Keypoint Triplets for Object Detection”, 2019; arXiv:1904.08189 [pdf]
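Most of the detectors above ([3.3]-[3.12]) score many overlapping candidate boxes and then prune them with non-maximum suppression. A minimal NumPy sketch, assuming boxes are [x1, y1, x2, y2] rows (not taken from any one paper's code):

```python
import numpy as np

def iou(box, boxes):
    """Intersection-over-union of one box against each row of `boxes`."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedily keep the highest-scoring box, drop boxes overlapping it too much."""
    order = scores.argsort()[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        order = order[1:][iou(boxes[i], boxes[order[1:]]) <= thresh]
    return keep
```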
Segmentation:
[3.13] J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation." In CVPR, 2015. [pdf] [NN 02/02/22]
[3.14] O. Ronneberger et al. "U-Net: Convolutional Networks for Biomedical Image Segmentation." 2015. [pdf] [JW 02/02/22]
[3.15] “Multi-Scale Context Aggregation by Dilated Convolutions.” 2016. [pdf]
[3.16] (Deeplabv2) “DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.” 2016. [pdf]
[3.17] (DeepLabv3) “Rethinking Atrous Convolution for Semantic Image Segmentation.” 2017. [pdf]
[3.18, same as 3.10] K. He et al. "Mask R-CNN." arXiv preprint arXiv:1703.06870 (2017). [pdf] [NN 02/02/22]
[3.19] (DeepLabv3+) “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.” 2018. [pdf]
[3.20] “Learning to Segment Everything.” 2018. [pdf]
4 Self-Supervised Learning
[4.0] Supervised Learning Intro
[4.1] "A Simple Framework for Contrastive Learning of Visual Representations." 2020. [https://arxiv.org/pdf/2002.05709.pdf] (SimCLR; loss sketch at the end of this section) [JW 02/09/22]
[4.2] “Adversarial Feature Learning.” 2017. [https://arxiv.org/pdf/1605.09782.pdf] [JW 02/09/22]
[4.3] “Learning Image Representations by Completing Damaged Jigsaw Puzzles.” 2018. [https://arxiv.org/pdf/1802.01880.pdf]
[4.4] “A Critical Analysis of Self-Supervision, or What We Can Learn From a Single Image.” 2020. [https://arxiv.org/pdf/1904.13132.pdf] [JW 02/09/22]
[4.5] "Momentum Contrast for Unsupervised Visual Representation Learning." [pdf] (MoCo)
[4.6] "Bootstrap your own latent: A new approach to self-supervised learning." arXiv preprint arXiv:2006.07733 (2020) [pdf] (BYOL)
[4.7] “Unsupervised Learning of Video Representations using LSTMs.” 2015 [pdf] [RS 02/09/22]
[4.8] “Tracking Emerges by Colorizing Videos.” 2018 [pdf] [ER 02/09/22]
[4.9] “Learning Correspondence from the Cycle-consistency of Time.” 2019 [pdf] [ER 02/09/22]
[4.10] “Video Representation Learning by Dense Predictive Coding.” 2019 [pdf] [RS 02/09/22]
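A minimal sketch of the NT-Xent contrastive loss from [4.1] (SimCLR), assuming PyTorch and that z1, z2 are L2-normalized embeddings of two augmented views of the same batch (the names are illustrative):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    z = torch.cat([z1, z2], dim=0)         # 2N embeddings
    sim = z @ z.t() / temperature          # pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))      # a view is never its own positive
    n = z1.shape[0]
    # positives: view i pairs with view i+N, and vice versa
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

z1 = F.normalize(torch.randn(8, 128), dim=1)
z2 = F.normalize(torch.randn(8, 128), dim=1)
print(nt_xent(z1, z2))
```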
5 RNN / Sequence-to-Sequence Model
[5.0] Bengio, Yoshua, et al. "A Neural Probabilistic Language Model." JMLR (2003). [pdf]
[5.1] Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013). [pdf] (LSTM; very nice generation results that show the power of RNNs) [AL 02/16/22]
[5.2] "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation" EMNLP (2014) [pdf] (GRU, a strong variant of RNN)
[5.3] Mikolov, et al. "Distributed representations of words and phrases and their compositionality." NIPS (2013). [pdf] (word2vec) [KS 02/16/22]
[5.4] "GloVe: Global Vectors for Word Representation." EMNLP (2014). [pdf] (GloVe) [KS 02/16/22]
[5.5] Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014. [pdf] (Outstanding Work) [FB 02/16/22]
[5.6] Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014). [pdf] [FB 02/16/22]
[5.7] Ashish Vaswani, et al. "Attention Is All You Need." [pdf] (Transformer; attention sketch below) [AL 02/16/22]
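A minimal sketch of the scaled dot-product attention at the core of [5.7] (PyTorch assumed): Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

```python
import math
import torch

def attention(q, k, v, mask=None):
    d_k = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    return torch.softmax(scores, dim=-1) @ v           # weighted sum of values

q = k = v = torch.randn(2, 5, 64)  # (batch, sequence length, d_k)
print(attention(q, k, v).shape)    # torch.Size([2, 5, 64])
```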
6 Pre-trained Language Modeling
[6.0] Matthew Peters, et al. "Deep Contextualized Word Representations." NAACL-HLT (2018). [pdf] (ELMo) [SA 03/02/22]
[6.1] Jeremy Howard, et al. "Universal Language Model Fine-Tuning for Text Classification" ACL (2018) [pdf] [AYM 03/02/22]
[6.2] Jacob Devlin, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." NAACL-HLT (2019). [pdf] (Milestone, Best Paper; masking sketch at the end of this section) [JW 03/02/22]
[6.3] Liu, Yinhan, et al. "RoBERTa: A robustly optimized BERT pretraining approach." arXiv preprint arXiv:1907.11692 (2019). [pdf] (RoBERTa) [SC 03/02/22]
[6.4] Radford, Alec, et al. "Improving language understanding by generative pre-training." (2018) [pdf] (GPT) [JW 03/02/22]
[6.5] Victor Sanh, et al. "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter." arXiv preprint arXiv:1910.01108 (2019). [pdf] [SC 03/02/22]
[6.6] Dong, Li, et al. "Unified language model pre-training for natural language understanding and generation." arXiv preprint arXiv:1905.03197 (2019) [pdf] (Prefix LM, UniLM1) [SA 03/02/22]
[6.7] Bao, Hangbo, et al. "UniLMv2: Pseudo-masked language models for unified language model pre-training." International Conference on Machine Learning. PMLR, 2020. [pdf] (UniLM2) [SA 03/02/22]
[6.8] Raffel, Colin, et al. "Exploring the limits of transfer learning with a unified text-to-text transformer." arXiv preprint arXiv:1910.10683 (2019) [pdf] (T5).
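A simplified sketch of the masked-language-model objective behind [6.2] (PyTorch assumed; the paper's full recipe also replaces some chosen tokens with random tokens or leaves them unchanged): mask roughly 15% of tokens and train the model to predict the originals only at the masked positions.

```python
import torch

def mask_tokens(input_ids, mask_token_id, mask_prob=0.15):
    labels = input_ids.clone()
    chosen = torch.rand(input_ids.shape) < mask_prob  # positions to mask
    labels[~chosen] = -100                # cross_entropy's default ignore_index
    masked_ids = input_ids.clone()
    masked_ids[chosen] = mask_token_id    # replace chosen tokens with [MASK]
    return masked_ids, labels
```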
7 Vision Transformer
ViT and variants
[7.0] Dosovitskiy, Alexey, et al. "An image is worth 16x16 words: Transformers for image recognition at scale." ICLR (2021). [pdf] (ViT; patch-embedding sketch at the end of this section) [SA 03/09/22]
[7.1] Touvron, Hugo, et al. "Training data-efficient image transformers & distillation through attention." International Conference on Machine Learning. PMLR, 2021 [pdf] (DeiT) [SA 03/09/22]
[7.2] Yuan, Li, et al. "Tokens-to-Token ViT: Training vision transformers from scratch on ImageNet." ICCV (2021). [pdf] [AA 03/09/22]
[7.3] Liu, Ze, et al. "Swin Transformer: Hierarchical vision transformer using shifted windows." ICCV (2021). [pdf] (ICCV 2021 Best Paper) [AA 03/09/22]
[7.4] Wu, Haiping, et al. "CvT: Introducing convolutions to vision transformers." ICCV (2021). [pdf] (Conv+Transformer) [AA 03/09/22]
ViT on other tasks
[7.5] Carion, Nicolas, et al. "End-to-end object detection with transformers." ECCV (2020). [pdf] (Transformer+detection).
[7.6] Zheng, Sixiao, et al. "Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers." CVPR (2021). [pdf] (Transformer+segmentation).
[7.7] Parmar, Niki, et al. "Image transformer." International Conference on Machine Learning. PMLR, 2018. [pdf] (Transformer+Generation).
[7.8] Arnab, Anurag, et al. "ViViT: A Video Vision Transformer." ICCV (2021). [pdf] (Transformer+Video).
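A minimal sketch of the patch embedding in [7.0] (PyTorch assumed): the image is split into 16x16 patches and each patch is linearly projected to a token, conveniently implemented as a strided convolution.

```python
import torch
import torch.nn as nn

patch_embed = nn.Conv2d(3, 768, kernel_size=16, stride=16)  # one token per patch
img = torch.randn(1, 3, 224, 224)
tokens = patch_embed(img).flatten(2).transpose(1, 2)
print(tokens.shape)  # torch.Size([1, 196, 768]): 14*14 patch tokens of width 768
```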
8 Applications of Language Modeling
[8.0] Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017 (2016) [pdf]
[8.1] Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2 (2016) [pdf]
[8.2] Jonas Gehring, et al. "Convolutional Sequence to Sequence Learning." arXiv:1705.03122 (2017). [pdf]
[8.3] Lample, et al. "Phrase-Based & Neural Unsupervised Machine Translation". arXiv:1804.07755. (2018) [pdf]
[8.4] Ye Jia, et al. "Direct Speech-to-Speech Translation with a Sequence-to-Sequence Model." arXiv preprint arXiv:1904.06037 (2019). [pdf]
[8.5] Wen, et al. "Recurrent Neural Network Language Generation for Spoken Dialogue Systems". (2019) [pdf]
[8.6] Mrksic, et al. "Multi-domain Dialog State Tracking using RNNs." arXiv:1506.07190 (2015). [pdf]
[8.7] Srinivasan, et al. "Natural Language Generation using Reinforcement Learning with External Rewards." arXiv:1911.11404 (2019). [pdf]
[8.8] Zhu, et al. "SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering". arXiv:1812.03593. (2018) [pdf]
[8.9] Xiong, et al. "Achieving Human Parity in Conversational Speech Recognition". arXiv:1610.05256 (2016). [pdf]
9 Vision-Language Learning
Non-pretraining Models
[9.0] Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images." In Computer Vision–ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010. [pdf]
[9.1] Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015. [pdf]
[9.2] Fukui, Akira, et al. "Multimodal compact bilinear pooling for visual question answering and visual grounding." EMNLP 2016 [pdf]
Pretraining Models
[9.3] Li, Liunian Harold, et al. "VisualBERT: A simple and performant baseline for vision and language." arXiv preprint arXiv:1908.03557 (2019). [pdf] [RT 04/13/22]
[9.4] Tan, Hao, and Mohit Bansal. "LXMERT: Learning cross-modality encoder representations from transformers." EMNLP (2019). [pdf] [AA 04/13/22]
[9.5] Chen, Yen-Chun, et al. "UNITER: Universal image-text representation learning." ECCV (2020). [pdf]
[9.6] Cho, Jaemin, et al. "Unifying vision-and-language tasks via text generation." ICML (2021). [pdf] [AM 04/13/22]
[9.7] Wang, Zirui, et al. "SimVLM: Simple visual language model pretraining with weak supervision." arXiv preprint arXiv:2108.10904 (2021). [pdf]
[9.8] Radford, Alec, et al. "Learning transferable visual models from natural language supervision." ICML (2021). [pdf] (CLIP).
10 Reinforcement Learning
[10.0] Mnih, Volodymyr, et al. "Playing Atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013). [pdf] (first paper named "deep reinforcement learning"; DQN loss sketch at the end of this section) [BZ 04/06/22]
[10.1] Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489. [pdf] (AlphaGo) [BZ 04/06/22]
[10.2] Pathak, Deepak, et al. “Curiosity-driven Exploration by Self-supervised Prediction” ICML 2017 [pdf] [SC 04/06/22]
[10.3] OpenAI. "Learning Dexterous In-Hand Manipulation." arXiv. [pdf]
[10.4] Byravan, Arunkumar, et al. "SE3-Nets: Learning Rigid Body Motion using Deep Neural Networks." [pdf]
[10.5] Lee, Michelle, et al. "Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks." [pdf] [SC 04/06/22]
[10.6] Shridhar, Mohit, et al. "CLIPORT: What and Where Pathways for Robotic Manipulation." [pdf]
[10.7] Xie, Annie, et al. "Learning Latent Representations to Influence Multi-Agent Interaction." [pdf]
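A minimal sketch of the DQN loss from [10.0] (PyTorch assumed; q_net and target_net are hypothetical networks mapping a batch of states to per-action Q-values): regress Q(s, a) toward the bootstrapped target r + gamma * max_a' Q_target(s', a').

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, s, a, r, s_next, done, gamma=0.99):
    # `a` holds action indices; `done` is 1.0 for terminal transitions, else 0.0
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) for taken actions
    with torch.no_grad():
        q_next = target_net(s_next).max(dim=1).values  # value of best next action
        target = r + gamma * q_next * (1 - done)       # no bootstrap past terminal states
    return F.mse_loss(q, target)
```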
11 Deep Generative Model and Applications
Deep Generative Models: Foundations
[11a.0] Kingma, D., and Welling, M. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013). [pdf] (VAE) [JW 03/30/22]
[11a.1] Goodfellow, Ian, et al. "Generative adversarial nets." 2014. [pdf] (GAN; training-step sketch at the end of this sublist) [JW 03/30/22]
[11a.2] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016). [pdf] (PixelRNN)
[11a.3] Makhzani, Alireza, et al. "Adversarial Autoencoders." arXiv:1511.05644 (2015). [pdf] [JW 03/30/22]
[11a.4] Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv:1502.04623 (2015). [pdf]
[11a.5] Bosch, Marc ten. "N-Dimensional Rigid Body Dynamics." SIGGRAPH (2020). [pdf]
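A minimal sketch of one alternating update for the GAN objective of [11a.1] (PyTorch assumed; G, D, and the optimizers are hypothetical, and D is assumed to output probabilities in (0, 1)). It uses the common non-saturating generator loss rather than the paper's exact min-max form.

```python
import torch
import torch.nn.functional as F

def gan_step(G, D, opt_g, opt_d, real, z):
    # Discriminator: push D(real) toward 1 and D(G(z)) toward 0.
    fake = G(z).detach()                 # detach so only D is updated here
    d_real, d_fake = D(real), D(fake)
    d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: push D(G(z)) toward 1 (non-saturating loss).
    d_fake = D(G(z))
    g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```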
Generative Adversarial Networks: Applications
[11b.0] "Wasserstein GAN." 2017. [pdf]
[11b.1] "Large Scale GAN Training for High Fidelity Natural Image Synthesis." 2018. [pdf]
[11b.2] "A Style-based Generator Architecture for Generative Adversarial Networks" 2018. [pdf]
[11b.3] "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks" 2017. [pdf]
[11b.4] "Conditional LSTM-GAN for Melody Generation from Lyrics." 2019. [pdf]
[11b.5] "GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction." 2019. [pdf]
[11b.6] "Image-to-Image Translation with Conditional Adversarial Networks." 2017. [pdf]
Generative Adversarial Networks: Art and Music
[11c.0] Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research. [html] (Deep Dream)
[11c.1] Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015). [pdf] (Outstanding Work)
[11c.2] "CAN: Creative Adversarial Networks" 2017. [pdf]
[11c.3] "Semantic Image Synthesis with Spatially-Adaptive Normalization" 2019. [pdf]
[11c.4] "Deep Poetry: Word-Level and Char-Level Language Models for Shakespearean Sonnet Generation" [pdf] [KS 03/30/22]
[11c.5] "BachProp: Learning to Compose Music in Multiple Styles" 2018. [pdf] [KS 03/30/22]
[11d.0] "A 'New' Rembrandt: From the Frontiers of AI And Not The Artist's Atelier" 2016. [html]
[11d.1] "Is artificial intelligence set to become art’s next medium?" 2018. [html]
[11d.2] "AI Will Enhance - Not End - Human Art" 2019. [html]
[11d.3] "An AI-Written Novella Almost Won a Literary Prize" 2016. [html] [KS 03/30/22]
[11d.4] "How AI-Generated Music Is Changing The Way Hits Are Made" 2018. [html]
[11d.5] "AI puts final notes on Beethoven's Tenth Symphony" 2019. [html]
Unsupervised Learning / Auto-Encoder
[11e.0] Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (Milestone, Andrew Ng, Google Brain Project, Cat)
[11e.1] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013). [pdf] (VAE; loss sketch at the end of this sublist)
[11e.2] Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016). [pdf] (PixelRNN)
[11e.3] Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016). [pdf] (PixelCNN)
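A minimal sketch of the VAE objective from [11e.1] (PyTorch assumed; a Bernoulli decoder over pixels is assumed): reconstruction loss plus the closed-form KL term, with the reparameterization trick making the sampling step differentiable.

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    eps = torch.randn_like(mu)
    return mu + torch.exp(0.5 * logvar) * eps   # z = mu + sigma * eps

def vae_loss(x, x_recon, mu, logvar):
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')   # -E_q[log p(x|z)]
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())  # KL(q(z|x) || N(0, I))
    return recon + kl
```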
Previous Papers
[11.1] Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016. [pdf] (iGAN)
[11.2] Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016). [pdf] (Neural Doodle)
[11.3] Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016). [pdf]
[11.4] Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016). [pdf]
[11.5] Gatys, Leon A., Alexander S. Ecker, et al. "Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016). [pdf] (control of style transfer over spatial location, colour information, and spatial scale)
[11.6] Ulyanov, Dmitry, Vadim Lebedev, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016). [pdf] (texture generation and style transfer)
12 Speech Recognition
[12.0] Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97. [pdf] (Deep nets show progress in speech recognition.)
[12.1] Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013. [pdf] (RNN)
[12.2] Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014. [pdf]
[12.3] Sak, Haşim, et al. "Fast and accurate recurrent neural network acoustic models for speech recognition." arXiv preprint arXiv:1507.06947 (2015). [pdf] (Google Speech Recognition System)
[12.4] Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in English and Mandarin." arXiv preprint arXiv:1512.02595 (2015). [pdf] (Baidu Speech Recognition System)
[12.5] W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. "Achieving Human Parity in Conversational Speech Recognition." arXiv preprint arXiv:1610.05256 (2016). [pdf] (Microsoft Speech Recognition System)
13 Deep Learning Optimization and More...
[13.0] Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012). [pdf] (Dropout)
[13.1] Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958. [pdf]
[13.2] Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015). [pdf] (An outstanding work in 2015)
[13.3] Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016). [pdf] (Update of Batch Normalization)
[13.4] Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1." [pdf] (New Model, Fast)
[13.5] Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016). [pdf] (Innovation of Training Method, Amazing Work)
[13.6] Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2Net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015). [pdf] (Modify a previously trained network to reduce training epochs)
[13.7] Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016). [pdf] (Modify a previously trained network to reduce training epochs)
[13.8] Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28 (2013): 1139-1147. [pdf] (Momentum optimizer)
[13.9] Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014). [pdf] (perhaps the most widely used optimizer; update sketch at the end of this section)
[13.10] Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016). [pdf] (Neural Optimizer, Amazing Work)
[13.11] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding." CoRR, abs/1510.00149 2 (2015). [pdf] (ICLR best paper; a new direction for making NNs run fast; DeePhi Tech startup)
[13.12] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016). [pdf] (Also a new direction for optimizing NNs; DeePhi Tech startup)
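A minimal NumPy sketch of one Adam step from [13.9]: exponential moving averages estimate the first and second moments of the gradient, and their bias-corrected ratio gives a per-parameter step size.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)          # bias correction for zero initialization
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```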
>>>>>>>>>>>>>>>>>>>>>>>> Previous Topics >>>>>>>>>>>>>>>>>>>>>>>>
15 Deep Transfer Learning / Lifelong Learning / especially for RL
[15.0] Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36. [pdf] (A Tutorial)
[15.1] Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013. [pdf] (A brief discussion about lifelong learning)
[15.2] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015). [pdf] (Godfather's Work; distillation-loss sketch at the end of this section)
[15.3] Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015). [pdf] (RL domain)
[15.4] Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015). [pdf] (RL domain)
[15.5] Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016). [pdf] (Outstanding Work, A novel idea)
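A minimal sketch of the distillation loss from [15.2] (PyTorch assumed; the logits are the raw outputs of hypothetical student and teacher networks): the student matches the teacher's temperature-softened predictions.

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # KL between softened distributions; the T^2 factor keeps gradient
    # magnitudes comparable to a standard cross-entropy term.
    return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * T * T
```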
16 One Shot Deep Learning
[16.0] Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338. [pdf] (No Deep Learning, but worth reading)
[16.1] Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition." (2015) [pdf]
[16.2] Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016). [pdf] (A basic step to one-shot learning)
[16.3] Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016). [pdf]
[16.4] Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016). [pdf] (A step to large data)
17 Neural Turing Machine
[17.0] Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural Turing machines." arXiv preprint arXiv:1410.5401 (2014). [pdf] (Basic Prototype of Future Computer)
[17.1] Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 362 (2015). [pdf]
[17.2] Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014). [pdf]
[17.3] Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015. [pdf]
[17.4] Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015. [pdf]
[17.5] Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016). [pdf]