408 research outputs found

    Is This a Joke? Detecting Humor in Spanish Tweets

    While humor has historically been studied from psychological, cognitive and linguistic standpoints, its study from a computational perspective is an area yet to be explored in Computational Linguistics. There exist some previous works, but a characterization of humor that allows its automatic recognition and generation is far from being specified. In this work we build a crowdsourced corpus of labeled tweets, annotated according to their humor value, letting the annotators subjectively decide which are humorous. A humor classifier for Spanish tweets is assembled based on supervised learning, reaching a precision of 84% and a recall of 69%. Comment: Preprint version, without referra

    Detecting Singleton Review Spammers Using Semantic Similarity

    Online reviews have increasingly become a very important resource for consumers when making purchases. However, it is becoming more and more difficult for people to make well-informed buying decisions without being deceived by fake reviews. Prior works on the opinion spam problem mostly considered classifying fake reviews using behavioral user patterns. They focused on prolific users who write more than a couple of reviews, discarding one-time reviewers. The number of singleton reviewers, however, is expected to be high for many review websites. While behavioral patterns are effective when dealing with elite users, for one-time reviewers the review text needs to be exploited. In this paper we tackle the problem of detecting fake reviews written by the same person using multiple names, posting each review under a different name. We propose two methods to detect similar reviews and show that the results generally outperform the vectorial similarity measures used in prior works. The first method extends the semantic similarity between words to the review level. The second method is based on topic modeling and exploits the similarity of the reviews' topic distributions using two models: bag-of-words and bag-of-opinion-phrases. The experiments were conducted on reviews from three different datasets: Yelp (57K reviews), Trustpilot (9K reviews) and the Ott dataset (800 reviews). Comment: 6 pages, WWW 201

    Exploratory Analysis of Highly Heterogeneous Document Collections

    We present an effective multifaceted system for exploratory analysis of highly heterogeneous document collections. Our system is based on intelligently tagging individual documents in a purely automated fashion and exploiting these tags in a powerful faceted browsing framework. Tagging strategies employed include both unsupervised and supervised approaches based on machine learning and natural language processing. As one of our key tagging strategies, we introduce the KERA algorithm (Keyword Extraction for Reports and Articles). KERA extracts topic-representative terms from individual documents in a purely unsupervised fashion and is shown to be significantly more effective than state-of-the-art methods. Finally, we evaluate our system in its ability to help users locate documents pertaining to military critical technologies buried deep in a large heterogeneous sea of information. Comment: 9 pages; KDD 2013: 19th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

    Three-Dimensional Analysis of Wakefields Generated by Flat Electron Beams in Planar Dielectric-Loaded Structures

    An electron bunch passing through a dielectric-lined waveguide generates Čerenkov radiation that can result in a high-peak axial electric field suitable for acceleration of a subsequent bunch. Axial fields beyond a gigavolt per meter are attainable in structures with sub-mm sizes, depending on the achievement of suitable electron bunch parameters. A promising configuration consists of using a planar dielectric structure driven by flat electron bunches. In this paper we present a three-dimensional analysis of wakefields produced by flat beams in planar dielectric structures, thereby extending the work of Reference [A. Tremaine, J. Rosenzweig, and P. Schoessow, Phys. Rev. E 56, No. 6, 7204 (1997)] on the topic. In particular, we provide closed-form expressions for the normal frequencies and field amplitudes of the excited modes and benchmark these analytical results against finite-difference time-domain particle-in-cell numerical simulations. Finally, we implement a semi-analytical algorithm in a popular particle tracking program, thereby enabling start-to-end high-fidelity modeling of linear accelerators based on dielectric-lined planar waveguides. Comment: 12 pages, 2 tables, 10 figures

    Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology

    Every culture and language is unique. Our work expressly focuses on the uniqueness of culture and language in relation to human affect, specifically sentiment and emotion semantics, and how they manifest in social multimedia. We develop sets of sentiment- and emotion-polarized visual concepts by adapting semantic structures called adjective-noun pairs, originally introduced by Borth et al. (2013), but in a multilingual context. We propose a new language-dependent method for automatic discovery of these adjective-noun constructs. We show how this pipeline can be applied on a social multimedia platform for the creation of a large-scale multilingual visual sentiment concept ontology (MVSO). Unlike the flat structure in Borth et al. (2013), our unified ontology is organized hierarchically by multilingual clusters of visually detectable nouns and subclusters of emotionally biased versions of these nouns. In addition, we present an image-based prediction task to show how generalizable language-specific models are in a multilingual context. A new, publicly available dataset of >15.6K sentiment-biased visual concepts across 12 languages with language-specific detector banks, >7.36M images and their metadata is also released. Comment: 11 pages, to appear at ACM MM'1

    Commission des Communautes Europeennes: Groupe du Porte-Parole = Commission of European Communities: Spokesman Group. Spokesman Service Note to National Offices Bio No. (81) 276, 8 July 1981

    This paper presents a novel approach for multi-lingual sentiment classification in short texts. This is a challenging task, as the amount of training data in languages other than English is very limited. Previously proposed multi-lingual approaches typically require establishing a correspondence to English, for which powerful classifiers are already available. In contrast, our method does not require such supervision. We leverage large amounts of weakly-supervised data in various languages to train a multi-layer convolutional network and demonstrate the importance of pre-training such networks. We thoroughly evaluate our approach on various multi-lingual datasets, including the recent SemEval-2016 sentiment prediction benchmark (Task 4), where we achieved state-of-the-art performance. We also compare the performance of our model trained individually for each language to a variant trained for all languages at once. We show that the latter model reaches slightly worse, but still acceptable, performance when compared to the single-language model, while benefiting from better generalization properties across languages.

    Towards environments that have a sense of humor

    Humans have humorous conversations and interactions. Nowadays our real-life existence is integrated with our life in social media, videogames, mixed reality and physical environments that sense our activities and that can adapt their appearance and properties in response to our activities. There are other inhabitants in these environments, not only humans, but also virtual agents and social robots with which we interact and which decide about their participation in activities. In this paper we look at designing humor and humor opportunities in such environments, providing them with a sense of humor and making them able to recognize opportunities to generate humorous interactions or events on the fly. Opportunities, made possible by introducing incongruities, can be exploited by the environment itself, or they can be communicated to its inhabitants.

    Diamond deposition on modified silicon substrates: Making diamond atomic force microscopy tips for nanofriction experiments

    Fine-crystalline diamond particles are grown on standard Si atomic force microscopy tips using hot-filament-assisted chemical vapor deposition. To optimize the conditions for diamond deposition, a series of experiments is first carried out using silicon substrates covered by point-topped pyramids obtained by wet chemical etching. The apexes and the edges of the silicon pyramids provide favorable sites for diamond nucleation and growth. The deposited polycrystallites are investigated by means of optical microscopy, scanning electron microscopy and micro-Raman spectroscopy. The resulting diamond-terminated tips are tested in ultra-high vacuum using a contact-mode atomic force microscope on a stepped surface of sapphire, showing high stability, sharpness, and hardness.