This site accompanies the latter half of ART.T458: Machine Learning at Tokyo Institute of Technology, which focuses on deep learning for natural language processing (NLP). Slides, interactive demos, and implementations are available below.

Lecture #1: Feedforward Neural Network (I)

Slides for lecture #1
Keywords: binary classification, Threshold Logic Units (TLUs), Single-layer Perceptron (SLP), Perceptron algorithm, sigmoid function, Stochastic Gradient Descent (SGD), Multi-layer Perceptron (MLP), Backpropagation, Computation Graph, Automatic Differentiation, Universal Approximation Theorem. Interactive demos are available for the single-layer and multi-layer perceptrons. Implementation available on GitHub and Colaboratory.
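To make the first two keywords concrete, here is a minimal sketch (not the course's reference implementation) of a single-layer perceptron with a sigmoid output trained by stochastic gradient descent on a toy binary classification task; the data and hyperparameters are made up for illustration.

```python
# Minimal sketch: sigmoid single-layer perceptron trained with SGD.
# Toy data and hyperparameters are assumptions, not course material.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs, labels y in {0, 1}.
X = np.vstack([rng.normal(-1.0, 0.7, size=(50, 2)),
               rng.normal(+1.0, 0.7, size=(50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)   # weights
b = 0.0           # bias
eta = 0.1         # learning rate

for epoch in range(20):
    for i in rng.permutation(len(X)):      # SGD: one example at a time
        p = sigmoid(X[i] @ w + b)          # predicted probability
        grad = p - y[i]                    # d(cross-entropy)/d(logit)
        w -= eta * grad * X[i]             # gradient step
        b -= eta * grad

acc = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"training accuracy: {acc:.2f}")
```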

Lecture #2: Feedforward Neural Network (II)

Slides for lecture #2
Keywords: multi-class classification, linear multi-class classifier, softmax function, Stochastic Gradient Descent (SGD), mini-batch training, loss functions, activation functions, dropout. Implementation available on GitHub and Colaboratory.
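The following is a minimal sketch (toy synthetic data, made-up hyperparameters, not the course's code) of a linear multi-class classifier with a softmax output and cross-entropy loss, trained by mini-batch SGD.

```python
# Minimal sketch: softmax classifier with mini-batch SGD on toy data.
import numpy as np

rng = np.random.default_rng(0)
K, D, N = 3, 2, 150                         # classes, features, examples

# Toy data: three Gaussian blobs, one per class.
centers = np.array([[0, 2], [-2, -1], [2, -1]], dtype=float)
X = np.vstack([rng.normal(c, 0.6, size=(N // K, D)) for c in centers])
y = np.repeat(np.arange(K), N // K)

W = np.zeros((D, K))
b = np.zeros(K)
eta, batch = 0.1, 32

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for epoch in range(50):
    idx = rng.permutation(N)
    for s in range(0, N, batch):
        B = idx[s:s + batch]
        P = softmax(X[B] @ W + b)           # (batch, K) class probabilities
        P[np.arange(len(B)), y[B]] -= 1.0   # dL/dlogits for cross-entropy
        W -= eta * X[B].T @ P / len(B)      # averaged gradient step
        b -= eta * P.mean(axis=0)

print("training accuracy:", np.mean(np.argmax(X @ W + b, axis=1) == y))
```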

Lecture #3: Word embeddings

Slides for lecture #3
Keywords: word embeddings, distributed representation, distributional hypothesis, pointwise mutual information, singular value decomposition, word2vec, word analogy, GloVe, fastText. Demos with word vectors pre-trained on English newspapers and trained on Japanese Wikipedia are available.
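As a pocket-sized illustration of the PMI + SVD route to word vectors (not the pre-trained vectors used in the demos), the sketch below builds a co-occurrence matrix from a tiny made-up corpus, takes positive PMI, and factorizes it with SVD.

```python
# Minimal sketch: PPMI co-occurrence matrix + SVD word vectors on a toy corpus.
import numpy as np

corpus = ["i like deep learning", "i like nlp", "i enjoy flying"]
tokens = [s.split() for s in corpus]
vocab = sorted({w for s in tokens for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# Co-occurrence counts within a +/-1 word window.
C = np.zeros((len(vocab), len(vocab)))
for s in tokens:
    for i, w in enumerate(s):
        for j in range(max(0, i - 1), min(len(s), i + 2)):
            if j != i:
                C[idx[w], idx[s[j]]] += 1

# Positive pointwise mutual information.
total = C.sum()
p_w = C.sum(axis=1, keepdims=True) / total
p_c = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / (p_w * p_c))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Low-dimensional word vectors from a truncated SVD.
U, S, Vt = np.linalg.svd(ppmi)
vectors = U[:, :2] * S[:2]                  # 2-dimensional embeddings
print({w: np.round(vectors[idx[w]], 2) for w in vocab})
```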

Lecture #4: DNN for structured data

Slides for lecture #4
Keywords: Recurrent Neural Networks (RNNs), vanishing and exploding gradients, Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), Recursive Neural Network, Tree-structured LSTM, Convolutional Neural Networks (CNNs). Implementation available on GitHub and Colaboratory.
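To make the LSTM gating equations concrete, here is a minimal sketch of one LSTM cell step with random weights and made-up sizes; it is an illustration, not the course's implementation.

```python
# Minimal sketch: a single LSTM cell step (forward pass only) in NumPy.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hid = 4, 3                             # input and hidden sizes

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
W = {k: rng.normal(scale=0.1, size=(d_hid, d_in + d_hid)) for k in "ifog"}
b = {k: np.zeros(d_hid) for k in "ifog"}

def lstm_step(x, h, c):
    z = np.concatenate([x, h])                 # concatenated input and state
    i = sigmoid(W["i"] @ z + b["i"])           # input gate
    f = sigmoid(W["f"] @ z + b["f"])           # forget gate
    o = sigmoid(W["o"] @ z + b["o"])           # output gate
    g = np.tanh(W["g"] @ z + b["g"])           # candidate cell state
    c = f * c + i * g                          # new cell state
    h = o * np.tanh(c)                         # new hidden state
    return h, c

# Run the cell over a random sequence of 5 input vectors.
h, c = np.zeros(d_hid), np.zeros(d_hid)
for x in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x, h, c)
print("final hidden state:", np.round(h, 3))
```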

Lecture #5: Encoder-decoder models

Slides for lecture #5
Keywords: language modeling, Recurrent Neural Network Language Model (RNNLM), encoder-decoder models, sequence-to-sequence models, attention mechanism, reading comprehension, question answering, headline generation, multi-task learning, character-based RNN, byte-pair encoding, Convolutional Sequence to Sequence (ConvS2S), Transformer, coverage.
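The sketch below illustrates the scaled dot-product attention at the heart of attention-based encoder-decoder models and the Transformer; the random toy tensors standing in for decoder queries and encoder keys/values are assumptions made purely for illustration.

```python
# Minimal sketch: scaled dot-product attention over toy tensors.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                          # model dimension
Q = rng.normal(size=(2, d))                    # 2 decoder positions (queries)
K = rng.normal(size=(5, d))                    # 5 encoder positions (keys)
V = rng.normal(size=(5, d))                    # encoder values

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Attention(Q, K, V) = softmax(Q K^T / sqrt(d)) V
weights = softmax(Q @ K.T / np.sqrt(d))        # (2, 5) attention distribution
context = weights @ V                          # (2, d) context vectors
print("attention weights:\n", np.round(weights, 2))
```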