348 research outputs found

    EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

    This article introduces a new language-independent approach for creating a large-scale high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The adopted approach (demonstrated over Arabic tweets) designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users for which rich content exists. That inherently facilitates the support of multiple tasks that generally revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually-crafted queries per topic, collecting high-quality annotations via crowd-workers for relevancy and in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology on Arabic tweets resulted in EveTAR, the first freely-available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.

    Credibility in Online Social Networks: A Survey

    The importance of information credibility in society cannot be underestimated given that it is at the heart of all decision-making. Generally, more information is better; however, knowing the value of this information is essential for the decision-making processes. Information credibility defines a measure of the fitness of the information for consumption. It can also be defined in terms of reliability, which denotes the probability that a data source will appear credible to the users. A challenge in this topic is that there is a great deal of literature that has developed different credibility dimensions. In addition, information science dealing with online social networks has grown in complexity, attracting interest from researchers in information science, psychology, human–computer interaction, communication studies, and management studies, all of whom have studied the topic from different perspectives. This work attempts to provide an overall review of the credibility assessment literature over the period 2006–2017 as applied to the context of the microblogging platform Twitter. The known interpretations of credibility are examined, particularly as they relate to the Twitter environment. In addition, we investigate levels of credibility assessment features. We then discuss recent works, addressing a new taxonomy of credibility analysis and assessment techniques. Finally, a cross-referencing of the literature is performed while suggesting new topics for future studies of credibility assessment in a social media context.

    Susceptibility to Social Engineering in Social Networking Sites: The Case of Facebook

    Past research has suggested that social engineering poses the most significant security risk. Recent studies have suggested that social networking sites (SNSs) are the most common source of social engineering attacks. The risk of social engineering attacks in SNSs is associated with the difficulty of making accurate judgments regarding source credibility in the virtual environment of SNSs. In this paper, we quantitatively investigate source credibility dimensions in terms of social engineering on Facebook, as well as the source characteristics that influence Facebook users to judge an attacker as credible, thereby making them susceptible to victimization. Moreover, in order to predict users’ susceptibility to social engineering victimization based on their demographics, we investigate the effectiveness of source characteristics on different demographic groups by measuring the consent intentions and behavior responses of users to social engineering requests using a role-play experiment.

    Gender, feminism, and blogging in Egypt

    This research is focusing on blogs in Egypt. It aims at finding out how effective blogging is in promoting equality and freedom of expression between men and women. It is trying to provide an overview of whether blogs aim to challenge the prevailing gender assumption of the society trying to achieve more liberating ways for women to exist in the world or not. The blogging trend is still on the rise, and several researches are assuring that the number of blogs and bloggers will continue to increase. Results found support for these forecasts and showed that women are actively involved in blogging. In addition, previous results in addition to this research's findings prove that blogs are considered to be a way for bloggers to express themselves, especially for women.

    A mixed-methods study of exploring and explaining the impact of the use of educational blogging on Saudi EFL students' writing development

    The dominance of technology in many learners’ lives is inescapable and is an opportunity upon which educators could capitalize. Using educational blogging in language teaching, this study aimed to explore and explain the nature of the impact of the use of educational blogs on EFL students’ writing development. The study used a mixed-methods design to analyse the impact of the educational blogging. The first phase was a quasi-experimental study with an intervention and comparison group, with 90 participants in total (45 in each group). Participants were undertaking an English Language writing course during the Preparatory Year Programme at a higher education institution in Saudi Arabia. The comparison group was taught using traditional teaching methods and the intervention group was taught by using educational blogs, both individual and class blogs. Both groups had the same course materials and teaching hours. The sentence variety, syntactic complexity, vocabulary, paragraph organisation and the coherence and cohesion of student pre and post writing tests were measured in order to compare the groups. Mann-Whitney tests were used to investigate whether there was a significant difference. In the second phase, a sequential mixed-methods case study focused on the intervention group was developed to explore and explain the participants’ attitudes towards the use of educational blogs. Attitudes were measured using a closed questionnaire, and then this data was supplemented by open-ended questions, focus group discussion and semi-structured interviews designed to explain the nature of the impact of the intervention in more detail. This phase also investigated the first blog and last blog entries on the class blog using the same procedure used in investigating the pre and post tests. Statistical findings reveal that the intervention group outperformed the students in the comparison group who were given similar lessons but using traditional methods (pen and paper). Qualitative findings suggest that the use of educational blogging seems to have increased these students’ motivation to practise writing, and that this resulted in more sophisticated and syntactically complex texts after the intervention. The study supports the theory of using educational technology as a pedagogical teaching method in English classes, based on the socio-cultural and cognitive theory of social interactional learning. In so doing, it extends the relation of educational blogging affordances and writing development context, particularly in the context of HE students taking a non-English major, who might be expected to be less motivated or invested in developing their English writing skills than those students who have typically formed the sample for similar previous studies. This study is significant in investigating the pedagogical use of blogging in a new context, revealing how educational blogs can be used in a context which traditionally hinders pedagogical approaches which are collaborative or student-centered: one with large class sizes, a tradition of transmission-style teaching and limited opportunities for peer interaction. The findings suggest how and why blogging can be an effective pedagogical approach for supporting writing development in similar contexts.

    Applications of opinion mining to data journalism

    Master's dissertation, Natural Language Processing and Language Industries, Faculdade de Ciências Humanas e Sociais, Univ. do Algarve, 2013. Nowadays social media play a central role in everyday life. A huge volume of user-generated data spins around online social networks, such as Twitter, having an extraordinary impact on the media industry and on the users’ everyday life. More and more people use social networks from their computers and smartphones to share their emotions and opinions about the facts happening in the world. Natural language processing and, in particular, sentiment analysis are key technologies to make sense out of the data about news that circulates in the online social networks. The application of opinion mining to news-oriented user-generated contents, such as news-linking tweets, can provide novel views on the news audience behaviour and help to interpret the evolution of sentiments. Applying this capability in the social news-sphere permits (i) to measure the impact of news onto readers and (ii) to gather elements that contain stories. From a broad perspective, the main aim of this research is to face this challenge, that is, to explore how opinion mining (or sentiment analysis) can be adopted into the field of digital media and data-driven journalism.

    Bloggers and the Blogosphere in Lebanon & Syria: meanings and activities

    The use of blogging and its potential effects on society and politics have been widely debated but the meanings and understandings that bloggers themselves hold about the activity have not been sufficiently explored; indeed in Lebanon and Syria they have barely been investigated at all. Through interviews with bloggers, ISPs, Internet café owners and others, as well as informal online participant observation and an online questionnaire, this thesis explores the structural and cultural variables that have allowed Lebanese and Syrian bloggers to understand and use blogs in their own specific ways. The study not only recounts what bloggers say about themselves but investigates the structural variables that surround them, including government and institutional policy, censorship, impediments to Internet access, historical conditions under which blogging emerged, attitudes to the Internet, changing events and new entrants to blogging. By its comparative nature, the project reveals how the meanings that bloggers attach to their blogging activities and to their socialization with other bloggers are situated in the social and historical conditions under which blogging is practiced. The changing meanings blogging acquired for bloggers during the course of this research illustrated its shifting and relational attributes. Thus an unexpectedly complex array of interrelated factors is shown to contribute to the tool acquiring certain meanings and being used in specific ways. The research uncovers differing reasons between Lebanese and Syrian bloggers as to why they blog, what socialisation with other bloggers means to them, and what marks of differentiation such as anonymity and choice of language they use to distinguish the activity of one blogger from another. Both the Lebanese and Syrian bloggers at this point belong to a collective effort of other bloggers in their own countries, but the thesis also shows that the meanings of socialisation online, and how it is regarded, change over time.

    Building a Test Collection for Significant-Event Detection in Arabic Tweets

    With the increasing popularity of microblogging services like Twitter, researchers discovered a rich medium for tackling real-life problems like event detection. However, event detection in Twitter is often obstructed by the lack of public evaluation mechanisms such as test collections (set of tweets, labels, and queries to measure the effectiveness of an information retrieval system). The problem is more evident when non-English languages, e.g., Arabic, are concerned. With the recent surge of significant events in the Arab world, news agencies and decision makers rely on Twitter's microblogging service to obtain recent information on events. In this thesis, we address the problem of building a test collection of Arabic tweets (named EveTAR) for the task of event detection. To build EveTAR, we first adopted an adequate definition of an event, which is a significant occurrence that takes place at a certain time. An occurrence is significant if there are news articles about it. We collected Arabic tweets using Twitter's streaming API. Then, we identified a set of events from the Arabic data collection using Wikipedia's current events portal. Corresponding tweets were extracted by querying the Arabic data collection with a set of manually-constructed queries. To obtain relevance judgments for those tweets, we leveraged CrowdFlower's crowdsourcing platform. Over a period of 4 weeks, we crawled over 590M tweets, from which we identified 66 events that cover 8 different categories and gathered more than 134k relevance judgments. Each event contains an average of 779 relevant tweets. Over all events, we got an average Kappa of 0.6, which is a substantially acceptable value. EveTAR was used to evaluate three state-of-the-art event detection algorithms. The best performing algorithms achieved 0.60 in F1 measure and 0.80 in both precision and recall. We plan to make our test collection available for research, including events description, manually-crafted queries to extract potentially-relevant tweets, and all judgments per tweet. EveTAR is the first Arabic test collection built from scratch for the task of event detection. Additionally, we show in our experiments that it supports other tasks like ad-hoc search.
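
    The abstract above reports agreement between annotators as an average Kappa value. As a rough illustration only, the following Python sketch computes Cohen's kappa for two annotators who judged the same tweets. The abstract does not say which kappa variant was used, so Cohen's kappa is an assumption here, and the function name `cohen_kappa` and the toy labels are hypothetical and not taken from EveTAR.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    Assumption: exactly two annotators and nominal labels. The source
    abstract does not state which kappa variant it reports.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label frequencies,
    # summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical relevance judgments (1 = relevant, 0 = not relevant).
annotator_1 = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
annotator_2 = [1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

    With these toy labels the two annotators agree on 8 of 10 tweets, chance agreement is 0.52, and the resulting kappa is about 0.58. A per-event average like the one reported would require computing such a value for each event and taking the mean, which is again an assumption about the procedure.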

    Credibility of online political news among Egyptian youth

    This study confirms previous findings in the literature that the perceived credibility of medium, source and message has a direct influence on the perceived credibility of online political news. Each credibility factor is examined separately to identify its influence and strength on the dependent variable. Medium credibility (that of the Internet) is considered the factor that has the strongest influence on perceived credibility of online political news, followed by the source and finally the message of online political news. Consumption of online political news also has a direct influence on the perception of credibility of online political news, but to a much lesser extent. Timely coverage, freedom of speech and dynamic representation are inherent features of the Internet and are highly recognized by Egyptian youth as credibility factors. As for other credibility factors related to believability, objectivity and balance, youth are more skeptical about the credibility of online political news. Many reasons led to this doubtful view, but the most apparent in this study is the abuse of the Internet as a communication tool for online political news and the lack of regulation, which leave average citizens, who may be politically ignorant, have personal agendas, and lack any media knowledge, with uncontrolled freedom to publish and comment on political news stories.

    Understanding Bots on Social Media - An Application in Disaster Response

    Social media has become a primary platform for real-time information sharing among users. News on social media spreads faster than traditional outlets and millions of users turn to this platform to receive the latest updates on major events, especially disasters. Social media bridges the gap between the people who are affected by disasters, volunteers who offer contributions, and first responders. On the other hand, social media is a fertile ground for malicious users who purposefully disturb the relief processes facilitated on social media. These malicious users take advantage of social bots to overrun social media posts with fake images, rumors, and false information. This process causes distress and prevents actionable information from reaching the affected people. Social bots are automated accounts that are controlled by a malicious user and these bots have become prevalent on social media in recent years. In spite of existing efforts towards understanding and removing bots on social media, there are at least two drawbacks associated with the current bot detection algorithms: general-purpose bot detection methods are designed to be conservative and not label a user as a bot unless the algorithm is highly confident, and they overlook the effect of users who are manipulated by bots and (unintentionally) spread their content. This study is trifold. First, I design a Machine Learning model that uses content and context of social media posts to detect actionable ones among them; it specifically focuses on tweets in which people ask for help after major disasters. Second, I focus on bots who can be a facilitator of malicious content spreading during disasters. I propose two methods for detecting bots on social media with a focus on the recall of the detection. Third, I study the characteristics of users who spread the content of malicious actors. These features have the potential to improve methods that detect malicious content such as fake news. Dissertation/Thesis. Doctoral Dissertation, Computer Science 201