**Department of Systems Engineering and Operations Research**

**George Mason University**

**Fall 2018**

This is a PhD course focused on developing deep learning predictive models. We will cover both practical and theoretical aspects of deep learning, with applications in engineering, finance, and artificial intelligence. It is targeted at students who have completed introductory courses in statistics and optimization. We will make extensive use of computational tools, such as the Python language, both for illustration in class and in homework problems. The class will consist of 12 lectures given by the instructor on several advanced topics in deep learning; in the remaining three lectures, students will present on topics of their choice.

**4/20/2018**: First class is on Aug 27 at 4:30pm.

**Topics**:

- Convex optimization
- Stochastic gradient descent and its variants (Adam, RMSProp, Nesterov acceleration)
- Second-order methods
- ADMM
- Regularization (l1, l2, and dropout)
- Batch normalization
- Theory of deep learning
  - Universal approximators
  - Curse of dimensionality
  - Kernel spaces
  - Topology and geometry
- Computational aspects (accelerated linear algebra, reduced-precision calculations, parallelism)
- Architectures (CNN, LSTM, MLP, VAE)
- Bayesian deep learning
- Deep reinforcement learning
- Hyperparameter selection and parameter initialization
- Generative models (GANs)
- Modeling with TensorFlow
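To give a flavor of the optimization topics above, here is a minimal, self-contained sketch (not course material) of plain gradient descent and the Adam update rule on a toy one-dimensional quadratic; the objective, function names, and hyperparameter settings are illustrative choices, not anything prescribed by the course.

```python
# Toy objective f(w) = (w - 3)^2, minimized at w = 3.
# This is an illustrative assumption, not a course assignment.

def grad(w):
    """Gradient of f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd(w=0.0, lr=0.1, steps=100):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w=0.0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Adam: bias-corrected first/second moment estimates scale the step."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g       # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction for warm-up
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (v_hat ** 0.5 + eps)
    return w
```

On this convex quadratic both methods approach the minimizer at 3; Adam's per-coordinate normalization makes it oscillate within roughly one learning rate of the optimum, which is why learning-rate schedules matter in practice.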

**Instructor**: Vadim Sokolov

**Office**: Engineering Building, Room 2242

**Email**: vsokolov(at)gmu.edu

**Tel**: 703-993-4533

**Reading list**:

- Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks (paper)
- Why does Monte Carlo Fail to Work Properly in High-Dimensional Optimization Problems? (paper)
- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (paper)
- Auto-Encoding Variational Bayes (paper)
- Learning Deep Architectures for AI (monograph)
- Representation Learning: A Review and New Perspectives (paper)
- Sequence to Sequence Learning with Neural Networks (paper)
- Twin Networks: Using the Future as a Regularizer (paper)
- Skip RNN (blog and paper)
- VAE with a VampPrior (paper)
- Don't Decay the Learning Rate, Increase the Batch Size (paper)
- Bayesian DL (blog)
- Generative Adversarial Networks (presentation)
- GANs at OpenAI (blog)
- Revisiting the Unreasonable Effectiveness of Data (blog)
- Learning the Enigma with Recurrent Neural Networks (blog)
- Tuning CNN architecture (blog)
- Security (blog)
- Unsupervised learning (blog)
- DL Tuning (blog)
- Cybersecurity (paper collection)
- LSTM (blog)
- SGD (link)
- Stanford's CS231n (course page)
- Stanford's STATS385 (course page)

**Office hours**: By appointment (Engineering Building, Room 2242)

**Location**: Enterprise Hall 77

**Times**: 4:30-7:10pm on Monday

**Grade composition**: There is no in-class examination; the grade is based entirely on class participation and homework assignments.

**Tools**:

- TF Playground (Google)
- SnakeViz (Python profiler)
- PyTorch resources (a curated list of tutorials, papers, and projects)