This is a brief summary of the paper Learning Character-level Representations for Part-of-Speech Tagging (dos Santos and Zadrozny, ICML 2014), which I read and am organizing for study.
Unlike the word embeddings, which are pretrained, the character embeddings are learned from scratch. The character-level and word-level representations are concatenated; the characters supply intra-word information used to extract morphological and shape features.
To construct a word-level representation from a word's characters, they use a convolutional neural network (CNN) as follows:
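A minimal numpy sketch of this step, under assumed (not the paper's) dimensions: a window of `k` character embeddings is convolved by a filter matrix, then max-pooled over all character positions, yielding a fixed-size vector for any word length. All names (`d_chr`, `clu`, `char_level_representation`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): d_chr = char embedding dim,
# k = character window size, clu = number of convolutional units.
d_chr, k, clu = 5, 3, 10

char_vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}
W_chr = rng.normal(size=(len(char_vocab), d_chr))   # character embedding table
W0 = rng.normal(size=(clu, k * d_chr))              # convolution filter
b0 = np.zeros(clu)

def char_level_representation(word):
    """Slide a k-character window over the word's character embeddings,
    apply the filter to each window, then max-pool each unit over positions."""
    embs = np.stack([W_chr[char_vocab[c]] for c in word])
    # pad so every character sits at the center of some window
    pad = np.zeros(((k - 1) // 2, d_chr))
    embs = np.vstack([pad, embs, pad])
    windows = [embs[i:i + k].ravel() for i in range(len(word))]
    conv = np.stack([W0 @ z + b0 for z in windows])  # (len(word), clu)
    return conv.max(axis=0)                          # max over positions

r_wch = char_level_representation("tagging")
print(r_wch.shape)  # (clu,) — fixed size regardless of word length
```

The max pooling is what makes the output size independent of word length, so words of any length map to the same `clu`-dimensional space.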
The next layer takes, for each word, the concatenation of the word embedding and the character-level representation, over a successive window centered on the target word.
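The windowing step above can be sketched as follows, again with illustrative dimensions and names (`d_wrd`, `clu`, `k_wrd`, `window_input` are assumptions, not the paper's notation): each word's joint vector is the word embedding concatenated with its character-level vector, and the tagger's input for position `t` concatenates the joint vectors of the window centered there, with zero padding at sentence boundaries.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical dims: d_wrd = word embedding size, clu = char-level vector
# size, k_wrd = word-context window size.
d_wrd, clu, k_wrd = 8, 10, 3

sentence = ["the", "cat", "sat"]
# joint representation of each word: [word embedding; char-level vector]
u = {w: np.concatenate([rng.normal(size=d_wrd), rng.normal(size=clu)])
     for w in sentence}
pad = np.zeros(d_wrd + clu)  # padding vector for positions outside the sentence

def window_input(sent, t):
    """Concatenate the joint vectors of the k_wrd words centered on
    position t; out-of-range positions use the padding vector."""
    half = k_wrd // 2
    vecs = [u[sent[i]] if 0 <= i < len(sent) else pad
            for i in range(t - half, t + half + 1)]
    return np.concatenate(vecs)

x = window_input(sentence, 0)
print(x.shape)  # (k_wrd * (d_wrd + clu),)
```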
Finally, they compute a cost function for structured inference that depends on neighboring tags, scoring whole tag sequences rather than each tag independently.
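A toy sketch of that structured scoring, with assumed shapes (the real per-position scores come from the network; `A`, `sentence_score`, and `viterbi` are illustrative names): a path's score sums the network's per-tag scores and a learned tag-transition matrix, and the best path is found by Viterbi decoding.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical: 4 tags, 3-word sentence; random stand-ins for network output.
n_tags, T = 4, 3
net_scores = rng.normal(size=(T, n_tags))   # per-position tag scores
A = rng.normal(size=(n_tags, n_tags))       # A[i, j]: score of tag j following tag i

def sentence_score(tags):
    """Score of a tag path = per-position scores plus transition scores."""
    s = net_scores[0, tags[0]]
    for t in range(1, T):
        s += A[tags[t - 1], tags[t]] + net_scores[t, tags[t]]
    return s

def viterbi():
    """Best tag path under the structured (sentence-level) score."""
    delta = net_scores[0].copy()
    back = np.zeros((T, n_tags), dtype=int)
    for t in range(1, T):
        cand = delta[:, None] + A + net_scores[t]   # (prev_tag, cur_tag)
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0)
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

best = viterbi()
print(best, sentence_score(best))
```

Training then maximizes the (log-)score of the gold path relative to all paths, which is what ties each prediction to its neighboring tags.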
Note (Abstract):
Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, they propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, they produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.
Download URL:
The paper: Learning Character-level Representations for Part-of-Speech Tagging (Santos and Zadrozny., ICML 2014)
Reference
- Paper