This post summarizes my study of the paper “Are distributional representations ready for the real world? Evaluating word vectors for grounded perceptual meaning” (Lucy and Gauthier, RoboNLP-WS 2017).

Before diving in, I wondered what language grounding actually is.

So I read a blog post that introduces the concept of language grounding.

After reading that post, I think language grounding means mapping words onto the real world while taking context into account.

Here, context covers a variety of facets: other data, expectations and interpretations, individual speakers, discourse context, and so on.

Grounded learning

The authors question whether distributional representations, the modern way of representing words by compressing co-occurrence statistics into compact vectors, can accurately encode facets of conceptual meaning.

To measure how well these representations predict the perceptual and conceptual features of concrete concepts, they evaluate them on two semantic norm datasets sourced from human participants.
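One way to picture this kind of evaluation is a small probe that tries to predict a single binary feature from a word's vector. The sketch below is only an illustration, not necessarily the authors' exact setup: the 3-dimensional vectors, the has_wings labels, and the helper names train_probe, predict, and leave_one_out_accuracy are all made up for this post, and the classifier is a plain logistic regression trained with stochastic gradient descent.

```python
import math

# Toy 3-dimensional "word vectors"; the numbers are invented for illustration.
vectors = {
    "airplane": [0.9, 0.1, 0.5],
    "bird":     [0.8, 0.2, 0.4],
    "bat":      [0.7, 0.3, 0.6],
    "car":      [0.1, 0.9, 0.5],
    "table":    [0.2, 0.8, 0.4],
    "truck":    [0.1, 0.9, 0.6],
}
# Hypothetical binary labels for one feature, has_wings.
has_wings = {"airplane": 1, "bird": 1, "bat": 1, "car": 0, "table": 0, "truck": 0}


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_probe(examples, epochs=500, lr=0.5):
    """Fit a logistic regression probe with stochastic gradient descent."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 if the probe thinks the feature is present, else 0."""
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


def leave_one_out_accuracy(vectors, labels):
    """Hold out each word in turn, train on the rest, test on the held-out word."""
    words = sorted(vectors)
    correct = 0
    for held_out in words:
        train = [(vectors[w], labels[w]) for w in words if w != held_out]
        w, b = train_probe(train)
        correct += predict(w, b, vectors[held_out]) == labels[held_out]
    return correct / len(words)


print(leave_one_out_accuracy(vectors, has_wings))
```

If the vectors carry no information about a feature, a probe like this cannot do better than guessing on held-out words, which is the intuition behind asking whether a representation "encodes" a feature.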

In the end, they find that several standard word representations fail to encode many salient perceptual features of concepts.

They use two semantic feature norm datasets to test whether distributional representations capture conceptual and perceptual meaning.

  • The first dataset is McRae et al. (2005): Semantic feature production norms for a large set of living and nonliving things.

Let’s look at an example from the semantic norm data above.

A sample of the McRae norm dataset

Each concept has various features related to its perceptual and conceptual meaning.

  • The second is CSLB (Devereux et al., 2014): The Centre for Speech, Language and the Brain (CSLB) concept property norms.

As the semantic norm datasets show, a concept is a word and a feature is a characteristic that the word has.

For example, the word airplane can be described by a variety of features:

  • airplane has_wings(visual-form_and_surface)
  • airplane used_for_passengers(function)
  • airplane found_in_airports(encyclopaedic)
  • airplane is_large(visual-form_and_surface)
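To make the concept/feature idea concrete, here is a minimal sketch of how norms like these could be stored and turned into a binary concept-by-feature matrix. The airplane entries come from the list above; the bird entry and the helper names build_feature_matrix and features_by_category are my own additions for illustration and are not the datasets' real file format.

```python
# Feature norms as concept -> list of (feature, category) pairs.
norms = {
    "airplane": [
        ("has_wings", "visual-form_and_surface"),
        ("used_for_passengers", "function"),
        ("found_in_airports", "encyclopaedic"),
        ("is_large", "visual-form_and_surface"),
    ],
    # Hypothetical second concept, added only so the matrix has two rows.
    "bird": [
        ("has_wings", "visual-form_and_surface"),
    ],
}


def build_feature_matrix(norms):
    """Return (concepts, features, rows); rows[i][j] is 1 if concept i has feature j."""
    concepts = sorted(norms)
    features = sorted({name for pairs in norms.values() for name, _ in pairs})
    rows = []
    for concept in concepts:
        present = {name for name, _ in norms[concept]}
        rows.append([1 if feature in present else 0 for feature in features])
    return concepts, features, rows


def features_by_category(norms, concept):
    """Group one concept's features by their category label."""
    grouped = {}
    for name, category in norms[concept]:
        grouped.setdefault(category, []).append(name)
    return grouped


concepts, features, rows = build_feature_matrix(norms)
for concept, row in zip(concepts, rows):
    print(concept, dict(zip(features, row)))
```

Each column of such a matrix can then serve as the label for one yes/no prediction task over word vectors.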

They argue that these feature deficiencies affect word-word similarity measures and domain clustering.
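As a rough illustration of what word-word similarity means here, the sketch below compares two views of similarity: cosine similarity between word vectors and Jaccard overlap between feature sets. All vectors and feature sets are invented toy values, and the function names cosine and jaccard are my own; this is not the measure used in the paper, only a way to see how the two views can disagree.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def jaccard(a, b):
    """Share of features two concepts have in common."""
    return len(a & b) / len(a | b)


# Toy word vectors; the numbers are invented for illustration.
toy_vectors = {
    "airplane": [0.9, 0.1, 0.5],
    "bird":     [0.8, 0.2, 0.4],
    "car":      [0.1, 0.9, 0.5],
}
# Toy feature sets in the style of semantic norms (also invented).
toy_features = {
    "airplane": {"has_wings", "is_large", "used_for_passengers"},
    "bird":     {"has_wings", "has_feathers"},
    "car":      {"has_wheels", "used_for_passengers"},
}

for other in ("bird", "car"):
    print(
        "airplane vs", other,
        "cosine:", round(cosine(toy_vectors["airplane"], toy_vectors[other]), 3),
        "jaccard:", jaccard(toy_features["airplane"], toy_features[other]),
    )
```

In this toy setup the vectors place airplane much closer to bird than to car, while the feature sets rate both pairs the same, so a similarity score computed from vectors alone can hide which features are actually shared.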

Reference