29 research outputs found

    Scaling a Plagiarism Search Service on the BonFIRE Testbed

    The KOPI Online Plagiarism Search Portal, a nationwide plagiarism service in Hungary, is a unique, open service for web users that enables them to check for identical or similar content between their own documents and the files uploaded by other authors. As a recent result, we can also detect cross-language plagiarism, but at a much higher computational cost. The paper describes our experiment with the BonFIRE testbed to find a suitable scaling mechanism for translational plagiarism detection in a cloud federation.

    Content-based trust and bias classification via biclustering

    In this paper we improve trust, bias and factuality classification over Web data at the domain level. Unlike the majority of the literature in this area, which aims at extracting opinion and handling short text at the micro level, we aim to help a researcher or an archivist obtain a large collection that, at a high level, originates from unbiased and trustworthy sources. Our method generates features as Jensen-Shannon distances from centers in a host-term biclustering. On top of the distance features, we apply kernel methods and also combine them with baseline text classifiers. We test our method on the ECML/PKDD Discovery Challenge data set DC2010. Our method improves on the best text classification NDCG results by 3-10% for neutrality, bias and trustworthiness. The fact that the ECML/PKDD Discovery Challenge 2010 participants reached an AUC only slightly above 0.5 indicates the hardness of the task.
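
    The distance features mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each host is represented by a vector of term counts and that the biclustering has already produced one term-count vector per cluster center. The function names (`js_distance`, `distance_features`) and the toy numbers are hypothetical, chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence.

    With base-2 logarithms the value lies in [0, 1].
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


def normalize(counts):
    """Turn raw term counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def distance_features(host_term_counts, cluster_centers):
    """One feature per cluster center: JS distance from the host to that center.

    `cluster_centers` is assumed (hypothetically) to hold term-count vectors
    produced by some host-term biclustering step that is not shown here.
    """
    p = normalize(host_term_counts)
    return [js_distance(p, normalize(center)) for center in cluster_centers]


if __name__ == "__main__":
    # Toy example: one host, three terms, two cluster centers.
    host = [3, 1, 0]
    centers = [[3, 1, 0], [0, 0, 4]]
    print(distance_features(host, centers))
```

    The resulting feature vector has one entry per cluster center and could then be passed to a kernel method or combined with a text classifier, as the abstract describes.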

    Smokers’ Engagement Behavior on Facebook: Verbalizing and Visual Expressing the Smoking Cessation Process

    The “processes of change” and “motivational language” are common in the comments that smokers post on Facebook under smoking cessation support content. Smokers can combine this verbalization of the smoking cessation process with visual expression when they use comments and Facebook reactions at the same time. The aim of this study was to understand the relationship between processes of change, motivational language, and the Facebook reaction buttons. A total of 821 comments (n = 821) written by smokers in response to image-based smoking cessation support content were analyzed. The processes of change and the motivational language used in the investigated comments were identified. These linguistic categories were compared with the usage of reaction buttons. The Facebook users who used the “Haha” reaction button wrote a significantly higher proportion of sustain talk than those who used the “Like” or “Love” reaction buttons. The Facebook users who combined the comment and “Love” reaction wrote significantly more change talk than those who did not utilize these buttons. We suggest that the “Haha” reaction may be a negative indicator, the “Like” reaction may be a neutral indicator, and the “Love” reaction may be a positive engagement indicator in terms of the smoking cessation process during Facebook-based interventions. These results may show how to evaluate Facebook reactions to smoking cessation support content.

    How to Avoid Lower Priority for Smoking Cessation Support Content on Facebook: An Analysis of Engagement Bait

    Facebook demotes “engagement bait” content that prompts people to interact. As a result of this sanctioning, public health content can reach fewer Facebook users. This study aims to determine the negative effect of engagement bait and find alternative techniques. Over a three-year period, 791 smoking cessation support posts were included (n = 791). The Facebook posts were classified into “engagement bait”, “alternative techniques” and control groups. Facebook metrics were compared between the study and control groups. The reach among Facebook page fans was significantly lower in the engagement bait group than in the control group. On the other hand, the alternative techniques had a significantly lower rate of negative Facebook interactions, as well as significantly higher click rates, than the control group. This is the first study to reveal the sanctioning of engagement bait on smoking cessation support Facebook posts. “Engagement bait” content is ranked lower in the Facebook fans’ News Feed. Nevertheless, alternative techniques can circumvent the restrictions on engagement bait. At the same time, alternative techniques can stimulate the click rate and reduce the rate of negative interactions.

    Alkalmazott algoritmusok nagyméretű feladatokra = Applied algorithms for large-scale problems

    We conducted basic and applied research in the following main areas: formal mathematical methods in data mining and optimization; analysis and modeling of very large-scale data, with applications in network-related business intelligence; and matching users with content (search and recommendation). The project team covers the full innovation chain, from education (courses in algorithms, data mining, and Web information retrieval at ELTE (Eötvös University) and BME (Technical University)) through theoretical research to applications. Our two most important industrial partners are Magyar Telekom (Hungarian Telecom Group) and AEGON Hungary, for which we developed custom search engines and carried out log mining and customer analysis projects. Based on the reported results, we participated in several Digital Libraries and Security ICT projects. The international recognition of our research is shown by the invitation to act as main organizer of the major European data mining contest, the ECML/PKDD Discovery Challenge 2010; by the invitation to serve as Workshop Chair at the most prestigious World Wide Web conference; and by our members serving as senior program committee member at the WSDM (Web Search and Data Mining) conference and as program committee members at other related conferences and workshops (ICALP, AIRWeb, ESA, etc.). Our most important research results include: advances in algorithms for factoring polynomials over finite fields; a prize-winning solution to the KDD Cup 2009 task, a telco classification problem; new methods for Web spam filtering; and content-based multimedia indexing methods.