This is a brief summary of the paper *Linguistic Regularities in Sparse and Explicit Word Representations* (Levy and Goldberg, CoNLL 2014), written to organize what I read and studied.
They demonstrate that the linguistic regularities captured by neural word embeddings are also captured by the traditional sparse, explicit representation: the word-context co-occurrence matrix.
In other words, neural word embeddings have the property that words appearing in similar contexts end up close to each other in the projected space.
This property is what lets embeddings share semantic and syntactic structure and capture linguistic regularities and relational similarities.
Those same properties can also be captured by the word-context co-occurrence matrix.
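As a rough illustration (my own sketch, not code from the paper), such an explicit representation can be built by counting word-context co-occurrences in a window and reweighting them with positive PMI (PPMI). The function name, window size, and layout below are illustrative choices, not the paper's exact setup.

```python
import numpy as np

def build_ppmi_matrix(sentences, window=2):
    """Build a sparse 'explicit' representation: each row is a word,
    each column a context word, and each cell the PPMI of the pair."""
    vocab = sorted({w for sent in sentences for w in sent})
    idx = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))

    # Count co-occurrences within a symmetric context window.
    for sent in sentences:
        for i, w in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[idx[w], idx[sent[j]]] += 1

    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # P(word)
    p_c = counts.sum(axis=0, keepdims=True) / total   # P(context)
    p_wc = counts / total                             # P(word, context)

    # PMI = log P(w, c) / (P(w) P(c)); keep only positive values (PPMI).
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_wc / (p_w * p_c))
    ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    return ppmi, idx
```

Each row of `ppmi` is then a high-dimensional, sparse word vector that can be compared with cosine similarity, just like a neural embedding.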
To sum up their claim: the linguistic regularities apparent in neural embeddings are not a consequence of the embedding process, but rather are properties of the underlying distributional data that are well preserved by neural embedding.
To support this, they propose a new objective function, 3COSMUL, for evaluating linguistic regularity via analogy recovery: it combines similarities multiplicatively, which is equivalent to summing logarithms of similarities.
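Concretely, for an analogy "a is to a* as b is to ?", the standard additive objective (3COSADD) and the paper's multiplicative objective (3COSMUL) can be sketched as below. Because multiplying similarities requires non-negative values, cosines are shifted to [0, 1]; taking the logarithm of the 3COSMUL score turns the product into a sum of log-similarities, which is the logarithm mentioned above. This is a minimal sketch assuming `vectors` is a dict mapping words to numpy arrays, not the authors' implementation.

```python
import numpy as np

def cos(u, v):
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def sim(u, v):
    # Shift cosine from [-1, 1] to [0, 1] so products and logs are well-defined.
    return (cos(u, v) + 1) / 2

def analogy_3cosadd(vectors, a, a_star, b):
    """3COSADD: argmax_x cos(x, b - a + a*), i.e. plain vector arithmetic."""
    target = vectors[b] - vectors[a] + vectors[a_star]
    candidates = (w for w in vectors if w not in (a, a_star, b))
    return max(candidates, key=lambda w: cos(vectors[w], target))

def analogy_3cosmul(vectors, a, a_star, b, eps=1e-3):
    """3COSMUL: argmax_x sim(x, b) * sim(x, a*) / (sim(x, a) + eps).
    In log space: log sim(x, b) + log sim(x, a*) - log(sim(x, a) + eps)."""
    def score(w):
        x = vectors[w]
        return sim(x, vectors[b]) * sim(x, vectors[a_star]) / (sim(x, vectors[a]) + eps)
    candidates = (w for w in vectors if w not in (a, a_star, b))
    return max(candidates, key=score)
```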
Detailed experimental analysis can be found in the paper itself.
Reference
- Omer Levy and Yoav Goldberg. Linguistic Regularities in Sparse and Explicit Word Representations. CoNLL 2014.