The Good, the Bad, and the Unknown: Morphosyllabic Sentiment Tagging of Unseen Words

Karo Moilanen and Stephen Pulman

Abstract

The omnipresence of unknown words is a problem that any NLP component needs to address in some form. While there exist many established techniques for dealing with unknown words in the realm of POS-tagging, for example, guessing unknown words' semantic properties is a less-explored area with greater challenges. In this paper, we study the semantic field of sentiment and propose five methods for assigning prior sentiment polarities to unknown words based on known sentiment carriers. Tested on 2000 cases, the methods mirror human judgements closely in three- and two-way polarity classification tasks, and reach accuracies above 63% and 81%, respectively.

Book Title
Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (ACL 2008), Short Papers
Location
Columbus, Ohio
Month
June 15–20
Pages
109–112
Year
2008