Truth be Told: A Corpus-Based Study of Adjectives of Truth and Reality across Languages

This is the abstract of a TULCON14 presentation.

Anna Pyrtchenkov, Maya Blumenthal, and Lee Jiang (University of Toronto)

[Click here for the data visualization based on this research]

In studying how word meanings vary between languages, lexical semantic typology tends to focus on concrete domains and to employ methodologies that center the labelling function of language. Both tendencies constitute biases that may hinder a fuller understanding of the cross-linguistic patterning of word meanings. This paper works around these biases by using parallel translation data (the same text translated into many different languages) combined with statistical machine translation tools and dimensionality reduction techniques.
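As a rough illustration of how such a pipeline might fit together, the following sketch assumes that a word aligner has already linked each occurrence of an English adjective to its counterpart in every translation. The abstract does not specify the alignment tool, the distance measure, or the reduction technique, so the Hamming distance and classical multidimensional scaling used here are stand-ins, and the forms `a1`, `b2`, and so on are invented placeholders rather than real data.

```python
import numpy as np

# Hypothetical toy input: one row per occurrence of an English adjective in
# the parallel text, paired with the aligned form in three other languages.
# The forms are invented placeholders, not real aligner output.
occurrences = [
    ("true", ["a1", "b1", "c1"]),
    ("true", ["a1", "b1", "c2"]),
    ("real", ["a2", "b1", "c2"]),
    ("real", ["a2", "b2", "c2"]),
    ("right", ["a3", "b3", "c3"]),
]


def hamming_distances(rows):
    """Share of languages in which two occurrences are translated differently."""
    n = len(rows)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diffs = [x != y for x, y in zip(rows[i], rows[j])]
            dist[i, j] = sum(diffs) / len(diffs)
    return dist


def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix in `dims` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dims]
    # Negative eigenvalues (non-Euclidean distances) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[order], 0, None))
    return eigvecs[:, order] * scale


labels = [label for label, _ in occurrences]
dist = hamming_distances([forms for _, forms in occurrences])
coords = classical_mds(dist)

for label, (x, y) in zip(labels, coords):
    print(f"{label:>5}  {x: .3f}  {y: .3f}")
```

Plotting the resulting coordinates would give a simple map in which occurrences that are translated alike across languages sit close together. Any real analysis would need far more occurrences and languages than this toy input.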

Building on studies by Wierzbicka and Tognini-Bonelli, we study the cross-linguistic patterns for adjectives pertaining to truth and reality (English true, real, and right). Our results show a patterning that broadly aligns with these previous studies, and they add insight into the extent and nature of the cross-linguistic variation through a semantic map of the domain that displays both continua and more discrete lexical semantic boundaries.

Concerning methodology, our results show that (1) translation data can give insight into the cross-linguistic patterns of variation in abstract semantic domains, and (2) in doing so, it is important to consider the functions of words beyond the labelling function: more discourse-oriented functions play a pivotal role in describing the cross-linguistic patterns that we observe.