
    Solutions to Detect and Analyze Online Radicalization : A Survey

    Online Radicalization (also called Cyber-Terrorism, Extremism, Cyber-Racism or Cyber-Hate) is widespread and has become a major and growing concern to society, governments and law enforcement agencies around the world. Research shows that Internet platforms such as YouTube (a popular video sharing website), Twitter (an online micro-blogging service), Facebook (a popular social networking website), online discussion forums and the blogosphere are being misused for malicious intent. These platforms offer a low barrier to publishing content, allow anonymity, provide exposure to millions of users and enable very quick and widespread diffusion of messages. They are being used to form hate groups and racist communities, spread extremist agendas, incite anger or violence, promote radicalization, recruit members and create virtual organizations and communities. Automatic detection of online radicalization is a technically challenging problem because of the vast amount of data, unstructured and noisy user-generated content, dynamically changing content and adversarial behavior. Several solutions have been proposed in the literature to combat and counter cyber-hate and cyber-extremism. In this survey, we review solutions to detect and analyze online radicalization. We review 40 papers published at 12 venues from June 2003 to November 2011. We present a novel classification scheme to classify these papers. We analyze these techniques, perform trend analysis, discuss limitations of existing techniques and identify research gaps.

    Semantic modelling of user interests based on cross-folksonomy analysis

    The continued increase in Web usage, in particular participation in folksonomies, reveals a trend towards a more dynamic and interactive Web where individuals can organise and share resources. Tagging has emerged as the de facto standard for the organisation of such resources, providing a versatile and reactive knowledge management mechanism that users find easy to use and understand. It is common nowadays for users to have multiple profiles in various folksonomies, thus distributing their tagging activities. In this paper, we present a method for the automatic consolidation of user profiles across two popular social networking sites, and subsequent semantic modelling of their interests utilising Wikipedia as a multi-domain model. We evaluate how much can be learned from such sites, and in which domains the knowledge acquired is focussed. Results show that far richer interest profiles can be generated for users when multiple tag-clouds are combined.
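    The idea of combining tag clouds can be illustrated with a minimal sketch. It assumes each profile is simply a mapping from tag to usage count; the function name `merge_tag_clouds`, the lowercase normalisation and the sample data are hypothetical and are not taken from the paper, which additionally maps tags to Wikipedia categories.

```python
from collections import Counter


def merge_tag_clouds(*clouds):
    """Combine several tag->count mappings into one relative-frequency profile."""
    merged = Counter()
    for cloud in clouds:
        for tag, count in cloud.items():
            # Assumption: tags that differ only in case or padding are the same tag.
            merged[tag.strip().lower()] += count
    total = sum(merged.values())
    if total == 0:
        return {}
    # Relative weight of each tag in the combined profile.
    return {tag: count / total for tag, count in merged.items()}


# Hypothetical tag clouds for one user on two different sites.
site_a = {"Python": 3, "semantic-web": 1}
site_b = {"python": 1, "photography": 3}
profile = merge_tag_clouds(site_a, site_b)
```

    In this toy case the merged profile contains an interest ("photography") that the first site alone would not reveal, which is the effect the abstract describes.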

    Blogs as Infrastructure for Scholarly Communication.

    This project systematically analyzes digital humanities blogs as an infrastructure for scholarly communication. This exploratory research maps the discourses of a scholarly community to understand the infrastructural dynamics of blogs and the Open Web. The text contents of 106,804 individual blog posts from a corpus of 396 blogs were analyzed using a mix of computational and qualitative methods. Analysis uses an experimental methodology (trace ethnography) combined with unsupervised machine learning (topic modeling) to perform an interpretive analysis at scale. Methodological findings show topic modeling can be integrated with qualitative and interpretive analysis. Special attention must be paid to data fitness, or the shape and re-shaping practices involved with preparing data for machine learning algorithms. Quantitative analysis of computationally generated topics indicates that while the community writes about diverse subject matter, individual scholars focus their attention on only a couple of topics. Four categories of informal scholarly communication emerged from the qualitative analysis: quasi-academic, para-academic, meta-academic, and extra-academic. The quasi- and para-academic categories represent discourse with scholarly value within the digital humanities community, but do not necessarily have an obvious path into formal publication and preservation. A conceptual model, the (in)visible college, is introduced for situating scholarly communication on blogs and the Open Web. An (in)visible college is a kind of scholarly communication that is informal, yet visible at scale. This combination of factors opens up a new space for the study of scholarly communities and communication. While (in)visible colleges are programmatically observable, care must be taken with any effort to count and measure knowledge work in these spaces.
This is the first systematic, data-driven analysis of the digital humanities and lays the groundwork for subsequent social studies of digital humanities.
PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/111592/1/mcburton_1.pd

    Weblog and short text feature extraction and impact on categorisation

    The characterisation and categorisation of weblogs and other short texts have become an important research theme in areas such as topic and trend detection and pattern recognition. The value of analysing and characterising short texts lies in identifying the features that distinguish them, thereby improving the input to the classification process. In this research work, we analyse a large number of text features and establish which combinations are useful to discriminate between the different genres of short text. Having identified the most promising features, we then confirm our findings by performing the categorisation task using three approaches: the Gaussian and SVM classifiers and the K-means clustering algorithm. Several hundred combinations of features were analysed in order to identify the best combinations, and the results confirmed the observations made. The novel aspect of our work is the detection of the best combination of individual metrics which are identified as potential features to be used for the categorisation process.
    The research work of the third author is partially funded by the WIQ-EI (IRSES grant n. 269180) and DIANA APPLICATIONS (TIN2012-38603-C02-01), and done in the framework of the VLC/Campus Microcluster on Multimodal Interaction in Intelligent Systems.
    Perez Tellez, F.; Cardiff, J.; Rosso, P.; Pinto Avendaño, DE. (2014). Weblog and short text feature extraction and impact on categorisation. Journal of Intelligent and Fuzzy Systems. 27(5):2529-2544. https://doi.org/10.3233/IFS-141227
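    As a rough illustration of analysing feature combinations, the sketch below computes a few surface features for a short text and enumerates the candidate feature subsets that a classifier could then be scored on. The three features and the names `extract_features` and `feature_subsets` are assumptions made for this example; the abstract does not list the actual metrics used, and the scoring step with a Gaussian, SVM or K-means model is deliberately left out.

```python
from itertools import combinations


def extract_features(text):
    """Compute a few simple surface features of a short text (illustrative only)."""
    words = text.split()
    n_chars = max(len(text), 1)  # avoid division by zero on empty input
    return {
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "punct_ratio": sum(not c.isalnum() and not c.isspace() for c in text) / n_chars,
        "upper_ratio": sum(c.isupper() for c in text) / n_chars,
    }


def feature_subsets(names, max_size):
    """Yield every combination of feature names with 1..max_size members."""
    for size in range(1, max_size + 1):
        yield from combinations(sorted(names), size)


feats = extract_features("OMG!! gr8 day")
# Each subset would be handed to a classifier and scored; that step is omitted here.
subsets = list(feature_subsets(feats, 2))
```

    With many more features than three, the number of subsets grows quickly, which is consistent with the several hundred combinations mentioned in the abstract.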

    Technology in the 21st Century: New Challenges and Opportunities

    Although big data, big data analytics (BDA) and business intelligence have attracted growing attention from both academics and practitioners, a lack of clarity persists about how BDA has been applied in business and management domains. In reflecting on Professor Ayre's contributions, we want to extend his ideas on technological change by incorporating the discourses around big data, BDA and business intelligence. With this in mind, we integrate the burgeoning but disjointed streams of research on big data, BDA and business intelligence to develop unified frameworks. Our review takes both technical and managerial perspectives to explore the complex nature of big data, the techniques used in big data analytics and the utilisation of big data in the business and management community. Advanced analytics techniques appear pivotal in bridging big data and business intelligence. The study of advanced analytics techniques and their applications in big data analytics led to the identification of promising avenues for future research.

    Sentiment Analysis Meets Semantic Analysis: Constructing Insight Knowledge Bases

    Numerous Web 2.0 applications collect user opinions and other user-generated content in the form of product reviews, discussion boards, and blogs, which are often captured as unstructured data. Text mining techniques are important for analyzing users' opinions (sentiment analysis) and identifying topics of interest (semantic analysis). However, little work has been carried out that combines semantics with users' sentiments. This research proposes a Sentiment-Semantic Framework that incorporates results from both semantic and sentiment analysis to construct a knowledge base of insights gained from integrating the information extracted from each type of analysis. To evaluate the framework, a prototype is developed and applied to two different domains (e-commerce and politics) and the resulting insight knowledge bases constructed.
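    One simple way to picture the integration of the two analyses is to group sentiment scores by the topic they refer to. The sketch below assumes that upstream steps have already produced (topic, sentiment score) pairs; the function name `build_insight_base`, the score range and the sample records are hypothetical and do not reproduce the framework proposed in the paper.

```python
from collections import defaultdict


def build_insight_base(records):
    """Group sentiment scores by topic; report mean sentiment and mention count."""
    scores = defaultdict(list)
    for topic, sentiment in records:
        scores[topic].append(sentiment)
    return {
        topic: {"mean_sentiment": sum(vals) / len(vals), "mentions": len(vals)}
        for topic, vals in scores.items()
    }


# Hypothetical output of separate semantic (topic) and sentiment (score) steps.
records = [("battery", -1.0), ("battery", -0.5), ("screen", 1.0)]
kb = build_insight_base(records)
```

    Here `kb["battery"]` summarises two negative mentions, an insight that neither the topic list nor the overall sentiment would show on its own.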

    Destination image online analyzed through user generated content: a systematic literature review

    Destination Image is a concept that has been studied for a long time in tourism research. The question of how a destination is perceived by tourists and potential new guests is an important insight, especially for local tourism managers, in order to evaluate the implemented strategies and to plan further tactics. Over the last two decades, driven by drastic digitalization, tourism research has increasingly examined the Destination Image online. This creates new challenges in the selection of sources and methods and in data collection. The aim of the present study was to systematically capture the approaches used to analyze the online Destination Image through User Generated Content, using studies from the last ten years. Therefore, a Systematic Literature Review of primary research from academic databases was conducted. As a summary of the findings, a conceptual model was developed, based on the insights of the studies in the dataset, to provide guidance for the preparation phase of future online Destination Image research. In short, the main findings are: TripAdvisor.com is the main source for online Destination Image analysis. Researchers recommend using software and programming languages to collect and analyze the data. As in earlier Destination Image studies, the main methods applied in online Destination Image analysis are quantitative content analysis, qualitative content analysis and sentiment analysis, in combination with the examination of cognitive and affective factors, co-occurrence analysis, and correlation analysis. The present study has several limitations: the loss of detailed information due to reducing the studies to comparable key parameters, the absence of Anglo-American studies due to the database selection, and the lack of quality testing of the studies included.
    Destination Image is a concept that has long been studied in tourism research.
The question of how a destination is seen by tourists and potential new guests is an important perspective, especially for the region's tourism managers, in order to evaluate implemented strategies and plan new tactics. Over the last two decades a drastic digitalization has taken place; tourism research has adapted to this phenomenon and is now increasingly studying the destination image online. This change has created new challenges in the selection of sources and methods and in data collection. The aim of the present work was to systematically capture the approaches used to analyze the online destination image, drawing on studies from the last ten years. To this end, primary studies from 2010-2020 in the academic databases Web of Science, ProQuest and b-on were collected using predefined search keywords. The resulting set of articles was then subjected to an eligibility assessment, as recommended by Moher et al. (2009). This means that studies that did not meet the predefined criteria were excluded. The inclusion criteria were: the academic work had to be a primary reference from a scientific journal, written in English, and the analyzed sample had to originate from communication on online social media. The remaining 35 articles were then transferred to a database using a coding matrix. The coding matrix was designed to capture the key parameters of each primary study in a standardized and therefore comparable way. General information, such as the year, location and journal of publication, was recorded, as well as topic-specific information, such as the field of tourism researched and the media analyzed, together with categories covering the methodology, the tools used and the results obtained.
The resulting database was then used to derive statements about the methodological approach used in analyzing the online destination image. As a summary of the results, a conceptual model was developed, based on the insights obtained from the set of articles that formed the dataset for analysis, to provide a guide for the preparation phase of future research on the online destination image. In short, the main conclusions are: TripAdvisor.com is the main source for online destination image analysis. Researchers recommend using software and programming languages to collect and analyze the data. As in earlier Destination Image studies, the main methods applied in online destination image analysis are quantitative content analysis, qualitative content analysis and sentiment analysis, in combination with the analysis of cognitive and affective factors, co-occurrence analysis and correlation analysis. The present study has several limitations: the loss of detailed information due to reducing the studies to comparable key parameters, the absence of Anglo-American studies due to the database selection, and the lack of quality testing of the included studies.
(TurExperience - Tourist experiences' impacts on the destination image: searching for new opportunities to the Algarve)

    What’s Happening Around the World? A Survey and Framework on Event Detection Techniques on Twitter

    © 2019, Springer Nature B.V. In the last few years, Twitter has become a popular platform for sharing opinions, experiences, news, and views in real time. Twitter presents an interesting opportunity for detecting events happening around the world. Tweets, the content published on Twitter, are short and pose diverse challenges for detecting and interpreting event-related information. This article provides insights into ongoing research and helps in understanding recent research trends and techniques used for event detection using Twitter data. We classify techniques and methodologies according to event types, orientation of content, event detection tasks, their evaluation, and common practices. We highlight the limitations of existing techniques and accordingly propose solutions to address the shortcomings. We propose a framework called EDoT based on the research trends, common practices, and techniques used for detecting events on Twitter. EDoT can serve as a guideline for developing event detection methods, especially for researchers who are new to this area. We also describe and compare data collection techniques, the effectiveness and shortcomings of various Twitter and non-Twitter-based features, and discuss various evaluation measures and benchmarking methodologies. Finally, we discuss the trends, limitations, and future directions for detecting events on Twitter.