This is a brief summary of Semi-supervised Sequence Learning (Dai and Le, NIPS 2015), a paper I read, written so I can study and organize it.

This paper showed that pretraining with unlabeled data improves the performance of text classification.

They present two approaches: a sequence autoencoder and language modeling with a recurrent neural network.

Below is the sequence autoencoder figure they used:

[Figure: the sequence autoencoder, from Dai and Le, NIPS 2015]
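
To make the idea concrete, here is a minimal sketch of a sequence autoencoder in PyTorch (my own simplification, not the authors' code): an encoder LSTM reads the token sequence, a decoder LSTM is trained to reproduce the same sequence, and the pretrained weights are later used to initialize the supervised classifier LSTM. The vocabulary size, dimensions, and the use of separate encoder/decoder LSTMs are my assumptions.

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        emb = self.embed(tokens)
        _, state = self.encoder(emb)              # read the whole sequence
        dec_out, _ = self.decoder(emb, state)     # reconstruct it (teacher forcing)
        return self.out(dec_out)                  # logits over the vocabulary

# Unsupervised pretraining: cross-entropy between the logits and the input tokens.
model = SeqAutoencoder(vocab_size=20000)
tokens = torch.randint(0, 20000, (8, 30))         # a batch of token id sequences
logits = model(tokens)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 20000), tokens.reshape(-1))
```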

The experiment most interesting to me is that they used CIFAR-10 to classify images with LSTM pretraining.

The input to the LSTM at each step is an entire row of pixels, and they predict the class of the image after reading the final row.
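
A sketch of that supervised setup (my own, with made-up hidden sizes, not the authors' code): each 32x32x3 CIFAR-10 image is fed to an LSTM as a sequence of 32 rows of 96 values (32 pixels x 3 channels), and the class logits are computed from the hidden state after the last row.

```python
import torch
import torch.nn as nn

class RowLSTMClassifier(nn.Module):
    def __init__(self, row_dim=32 * 3, hidden_dim=128, num_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(row_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, images):                    # images: (batch, 3, 32, 32)
        # Turn the image into a sequence of 32 rows, each row = 32 pixels x 3 channels.
        rows = images.permute(0, 2, 1, 3).reshape(images.size(0), 32, -1)
        _, (h_n, _) = self.lstm(rows)             # h_n: (1, batch, hidden_dim)
        return self.classifier(h_n[-1])           # class logits from the final row's state

logits = RowLSTMClassifier()(torch.randn(4, 3, 32, 32))   # -> (4, 10)
```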

In this case, for pretraining based on language modeling, they trained the LSTM to do next-row prediction given the current row.
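
Sketched below (again my own simplification, with assumed dimensions), the LM-style pretraining feeds rows 1..31 to the LSTM and asks it to predict rows 2..32; the squared-error loss here anticipates the L2 objective mentioned below.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(96, 128, batch_first=True)         # 96 = 32 pixels x 3 channels per row
next_row_head = nn.Linear(128, 96)

rows = torch.randn(4, 32, 96)                     # a batch of images as row sequences
outputs, _ = lstm(rows[:, :-1])                   # read rows 1..31
pred = next_row_head(outputs)                     # predict rows 2..32
loss = nn.MSELoss()(pred, rows[:, 1:])            # squared (L2) distance to the target rows
```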

The other method, the sequence autoencoder, autoencodes the image row by row.

And the loss function during unsupervised learning is the Euclidean L2 distance between the predicted row and the target row.
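
A corresponding sketch of the row-level sequence autoencoder under the same assumptions: an encoder LSTM reads all 32 rows, a decoder LSTM reconstructs them, and the objective is the Euclidean (L2) reconstruction error.

```python
import torch
import torch.nn as nn

encoder = nn.LSTM(96, 128, batch_first=True)
decoder = nn.LSTM(96, 128, batch_first=True)
reconstruct = nn.Linear(128, 96)

rows = torch.randn(4, 32, 96)                     # a batch of images as row sequences
_, state = encoder(rows)                          # summarize the whole image
dec_out, _ = decoder(rows, state)                 # decode the same row sequence
loss = nn.MSELoss()(reconstruct(dec_out), rows)   # Euclidean (L2) reconstruction loss
```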

For details, read Section 4.5, Object classification experiments with CIFAR-10, in their paper.

They said they demonstrated that a language model or a sequence autoencoder can help stabilize the learning in LSTM recurrent networks.

They also used word dropout.
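
I am not sure of the exact variant they used, but a common form of word dropout randomly replaces input word ids with an unknown token during training; here is a tiny sketch (the drop probability and the unk id are hypothetical choices of mine).

```python
import torch

def word_dropout(tokens, drop_prob=0.25, unk_id=0):
    # Replace each token id with unk_id with probability drop_prob (training-time noise).
    mask = torch.rand(tokens.shape, device=tokens.device) < drop_prob
    return tokens.masked_fill(mask, unk_id)

dropped = word_dropout(torch.randint(1, 20000, (8, 30)))
```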

Reference

Dai, A. M., and Le, Q. V. Semi-supervised Sequence Learning. NIPS 2015.