This is a brief summary of the paper Multi-prototype Chinese Character Embedding (Lu et al. LREC 2016), which I read and studied, written to organize my notes.

This paper is research related to character embedding.

They designed a model that trains character vectors based on the skip-gram model, with words as input. They have a reason for proposing multi-prototype character embeddings.

The reason is that characters are highly polysemous in forming words.

They make three significant changes to MSSG (the Multi-Sense Skip-Gram model) for the character embedding task.

First, they predict the sense of the current character directly from the context by using a neural model, rather than by finding the cluster center that is closest to the average context vector.
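The difference between the two sense-selection strategies can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the function names, the cosine similarity measure, and the single linear layer with softmax (`W`, `b`) are all my assumptions, since this summary does not give the exact network used in the paper.

```python
import numpy as np

def sense_by_nearest_cluster(context_vecs, cluster_centers):
    """MSSG-style selection: pick the sense whose cluster center is
    closest to the average context vector (cosine similarity assumed)."""
    avg = context_vecs.mean(axis=0)
    norms = np.linalg.norm(cluster_centers, axis=1) * np.linalg.norm(avg)
    sims = (cluster_centers @ avg) / (norms + 1e-12)
    return int(np.argmax(sims))

def sense_by_neural_model(context_vecs, W, b):
    """Direct prediction: score each sense from the average context vector
    with a linear layer followed by softmax (hypothetical architecture)."""
    avg = context_vecs.mean(axis=0)
    logits = W @ avg + b
    logits = logits - logits.max()          # for numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return int(np.argmax(probs)), probs

# Toy example: two context vectors, two candidate senses.
context_vecs = np.array([[1.0, 0.0], [1.0, 0.0]])
cluster_centers = np.array([[0.0, 1.0], [1.0, 0.0]])
W = np.array([[2.0, 0.0], [0.0, 2.0]])
b = np.zeros(2)

cluster_sense = sense_by_nearest_cluster(context_vecs, cluster_centers)
neural_sense, sense_probs = sense_by_neural_model(context_vecs, W, b)
```

In the first function the sense is decided by geometry alone, while in the second the parameters `W` and `b` would be trained together with the embeddings.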

Second, because characters are highly order-sensitive in forming words, they add position to the context by combining each character with its position, so each character at each position (relative to the current character's position) has its own embedding.

This results in a position-sensitive variation of the skip-gram model.
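To make the position-sensitive context concrete, here is a small sketch of how context tokens could be built by joining each neighboring character with its relative offset. The function name, the `char_offset` token format, and the window size are my own assumptions for illustration, not details taken from the paper.

```python
def position_tagged_contexts(chars, window=2):
    """For each target character, return its context as tokens of the form
    '<char>_<relative offset>', so the same character at different
    relative positions maps to different embedding entries."""
    pairs = []
    for i, target in enumerate(chars):
        context = []
        for offset in range(-window, window + 1):
            j = i + offset
            # Skip the target itself and positions outside the sequence.
            if offset == 0 or j < 0 or j >= len(chars):
                continue
            context.append(f"{chars[j]}_{offset:+d}")
        pairs.append((target, context))
    return pairs

pairs = position_tagged_contexts(list("中国人"), window=1)
# pairs[1] is ('国', ['中_-1', '人_+1'])
```

With this encoding, `中_-1` and `中_+1` are different vocabulary entries, which is what lets the model distinguish a character that precedes the target from the same character following it.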

Third, the number of senses per character is induced from a lexicon rather than automatically.

Their model is as follows:

(Figure: model architecture, from Lu et al. LREC 2016)

Note (Abstract): Chinese sentences are written as sequences of characters, which are elementary units of syntax and semantics. Characters are highly polysemous in forming words. They present a position-sensitive skip-gram model to learn multi-prototype Chinese character embeddings, and explore the usefulness of such character embeddings for Chinese NLP tasks. Evaluation on character similarity shows that multi-prototype embeddings are significantly better than a single-prototype baseline.

Download URL:
The paper: Multi-prototype Chinese Character Embedding. Lu, Zhang, and Ji. 2016 LREC

Reference

Paper
- LREC Version: Multi-prototype Chinese Character Embedding (Lu et al. LREC 2016)
For Information
- COLING Version: A Probabilistic Model for Learning Multi-Prototype Word Embeddings (Tian et al. COLING 2014)
