Wishful Expressions Corpora
===========================

Released April 2009
Andrew B. Goldberg (goldberg@cs.wisc.edu)

These data files contain the manually labeled Products and Politics corpora
introduced in the following paper:

    Andrew B. Goldberg, Nathanael Fillmore, David Andrzejewski, Zhiting Xu,
    Bryan Gibson and Xiaojin Zhu. May All Your Wishes Come True: A Study of
    Wishes and How to Recognize Them. Annual Conference of the North American
    Chapter of the Association for Computational Linguistics - Human Language
    Technologies (NAACL HLT 2009).

Please cite the above work if using these corpora (specifically the
wish/non-wish labels) in your own work.

The text in these corpora originally comes from two earlier data sets
prepared by other researchers. We simply subsampled the text and assigned new
labels to it. The original sources are:

* Products: Customer product reviews from Amazon.com and cnet.com, collected
  by Bing Liu and colleagues, and used in several publications. The original
  data and publications can be found at
  http://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html

* Politics: Web postings at politics.com, gathered by Mullen and Malouf and
  originally used in the following publication:

    Mullen, T. & Malouf, R. (2006) A preliminary investigation into sentiment
    analysis of informal political discourse. Proceedings of the AAAI-2006,
    Spring Symposium on Computational Approaches to Analyzing Weblogs.

To prepare the wish-labeled data, we split the text into sentences using
MxTerminator, sampled sentences for labeling, applied Penn-TreeBank-style
tokenization, down-cased, and removed punctuation. The preprocessed sentences
were then presented to annotators, who were instructed to decide whether or
not each sentence contained a wishful expression.

The following instructions were provided to the annotators:

What should be labeled as containing a "wishful expression"?

- any text about a wish being expressed by the author or someone else
- a question or speculation about someone else's wish
- more implicit wishes (e.g., using "should") are acceptable if you can
  substitute or rephrase as a statement using "i wish that..."
  (e.g., "the government should provide X")
- past regrets should be considered wishful ("i wish i had done X")

The following should *not* be considered wishful:

- definitive statements about a future activity ("i will do X" is not a wish)
- statements that use want/wish/would/etc. purely as a rhetorical device
  ("i want to start by saying..." is not a wish)

Each sentence was judged by one annotator. The results appear in the
following format:

    1st column: 0/1 label (0=non-wish, 1=wish)
    2nd column: the preprocessed text of the sentence in question

Any questions on this data should be directed to Andrew Goldberg
(goldberg@cs.wisc.edu)
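
As an illustration of the two-column format described above, the following is a minimal Python sketch of a loader. It is not part of the original release. It assumes the label and the sentence are separated by whitespace (a tab or a space), since the column separator is not stated here, and that the files can be read as UTF-8. The function names parse_wish_lines and load_wish_file are hypothetical.

```python
from typing import Iterable, List, Tuple


def parse_wish_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Parse lines of the form '<label><whitespace><sentence>'.

    Assumption: the first whitespace-delimited token is the 0/1 label
    (0=non-wish, 1=wish) and the rest of the line is the sentence text.
    """
    examples: List[Tuple[int, str]] = []
    for line_number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split once, on the first whitespace run
        label = parts[0]
        if label not in ("0", "1"):
            raise ValueError(
                f"line {line_number}: expected label 0 or 1, got {label!r}"
            )
        text = parts[1] if len(parts) > 1 else ""
        examples.append((int(label), text))
    return examples


def load_wish_file(path: str) -> List[Tuple[int, str]]:
    """Read one corpus file; the UTF-8 encoding is an assumption."""
    with open(path, encoding="utf-8") as handle:
        return parse_wish_lines(handle)
```

With made-up input lines such as "1<TAB>i wish the battery lasted longer", parse_wish_lines returns (label, text) pairs, and the fraction of wishes in a file can then be computed as sum(label for label, _ in rows) / len(rows). If the files turn out to use a different separator, only the split call needs to change.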