
    Performance analysis of the Oráculo framework for data collection from Twitter / Análise do desempenho do Framework Oráculo para coletas no Twitter

    Online social networks are important social spaces for human interaction, with far-reaching applications in communication, entertainment, advertising, social campaigning and community empowerment. Shared data have become a research source for several studies seeking to analyze user interactions in these networks. Because of the large volume of data produced, text mining techniques are required for analyzing the collected data efficiently. One of the challenges of the text mining process is the lack of direct access to data from online social networks, which requires the use of specialized tools for collecting data. The present study conducts a performance analysis of the Oráculo Application Development Framework as a tool for collecting and mining texts shared on the social network Twitter. In this framework, different algorithms and techniques were applied to circumvent the limitations imposed by the Twitter API. Performance tests were conducted comparing the Oráculo and DMI-TCAT algorithms. The results show that Oráculo collects more tweets than DMI-TCAT for the algorithms and scenarios analyzed

    Online social network data as sociometric markers

    Data from online social networks carry enormous potential for psychological research, yet their use and the ethical implications thereof are currently hotly debated. The present work aims to outline in detail the unique information richness of this data type and, in doing so, to support researchers when deciding on ethically appropriate ways of collecting, storing, publishing, and sharing data from online sources. Focusing on the very nature of social networks, their structural characteristics and depth of information, a detailed and accessible account of the challenges associated with data management and data storage is provided. In particular, the general non-anonymity of network data sets is discussed, and an approach is developed to quantify the level of uniqueness that a particular online network bestows upon the individual maintaining it. Using graph enumeration techniques, it can be shown that comparatively sparse information on a network is suitable as a sociometric marker that allows for the identification of an individual from the global population of online users. The impossibility of anonymizing specific types of network data carries implications for ethical guidelines and research practice. At the same time, network uniqueness opens up opportunities for novel research in psychology

    Consistent Image Decoding from Multiple Lossy Versions

    With the recent development of tools for data sharing in social networks and peer to peer networks, the same information is often stored in different nodes. Peer-to-peer protocols usually allow one user to collect portions of the same file from different nodes in the network, substantially improving the rate at which data are received by the end user. In some cases, however, the same multimedia document is available in different lossy versions on the network nodes. In such situations, one may be interested in collecting all available versions of the same document and jointly decoding them to obtain a better reconstruction of the original. In this paper we study some methods to jointly decode different versions of the same image. We compare different uses of the method of Projections Onto Convex Sets (POCS) with some Convex Optimization techniques in order to reconstruct an image for which JPEG and JPEG2000 lossy versions are available

    Social information landscapes: automated mapping of large multimodal, longitudinal social networks

    Purpose – This article presents a Big Data solution as a methodological approach to the automated collection, cleaning, collation and mapping of multimodal, longitudinal datasets from social media. The article constructs Social Information Landscapes. Design/methodology/approach – The research presented here adopts a Big Data methodological approach for mapping user-generated contents in social media. The methodology and algorithms presented are generic, and can be applied to diverse types of social media or user-generated contents involving user interactions, such as within blogs, comments in product pages and other forms of media, so long as the formal data structure proposed here can be constructed. Findings – The limited, sequential presentation of content listings within social media and Web 2.0 pages, as viewed on Web browsers or on mobile devices, does not necessarily reveal or make obvious a little-known property of the medium: that all participants, from content producers, to consumers, to followers and subscribers, including the contents they produce or subscribe to, are intrinsically connected in a hidden but massive network. Such networks, when mapped, could be quantitatively analysed using social network analysis (e.g., centralities), and the semantics and sentiments could equally reveal valuable information with appropriate analytics. What is difficult, however, is the traditional approach of collecting, cleaning, collating and mapping such datasets into a sufficiently large sample of data that could yield important insights into the community structure and the direction and polarity of interaction on diverse topics. This research solves this particular strand of the problem. Research limitations/implications – The automated mapping of extremely large networks involving hundreds of thousands to millions of nodes over a long period of time could possibly assist in proving or even disproving theories.
The goal of this article is to demonstrate the feasibility of using automated approaches for acquiring massive, connected datasets for academic inquiry in the social sciences. Practical implications – The methods and the Big Data architecture presented in this article have great practical value for individuals and institutions with low budgets. The integrated software-hardware architecture uses open source software, and the social information landscapes mapping algorithms are not difficult to implement. Originality/value – The majority of research in the literature uses a traditional approach for collecting social network data. The traditional approach is slow, tedious and does not yield a large enough sample for the data to be significant for analysis. Whilst the traditional approach collects only a small percentage of data, the original methods presented could possibly collect entire datasets in social media owing to their scalability and automated mapping techniques

    Performance-Aware High-Performance Computing for Remote Sensing Big Data Analytics

    The incredible increase in the volume of data emerging along with recent technological developments has made analysis processes that use traditional approaches more difficult for many organizations. In particular, applications that require timely processing of big data, such as satellite imagery, sensor data, bank operations, web servers, and social networks, require efficient mechanisms for collecting, storing, processing, and analyzing these data. At this point, big data analytics, which comprises data mining, machine learning, statistics, and similar techniques, helps organizations manage the data end to end. In this chapter, we introduce a novel high-performance computing system on the geo-distributed private cloud for remote sensing applications, which takes advantage of network topology, exploits utilization and workloads of CPU, storage, and memory resources in a distributed fashion, and optimizes resource allocation for realizing big data analytics efficiently

    Social forecasting: a literature review of research promoted by the United States National Security System to model human behavior

    The development of new information and communication technologies increased the volume of information flows within society. For the security forces, this phenomenon presents new opportunities for collecting, processing and analyzing a vast and diverse amount of data, and at the same time it requires new organizational and individual competences to deal with the new forms and huge volumes of information. Our study aimed to outline the research areas funded by the US defense and intelligence agencies with respect to social forecasting. Based on bibliometric techniques, we clustered 2688 articles funded by US defense or intelligence agencies in five research areas: a) Complex networks, b) Social networks, c) Human reasoning, d) Optimization algorithms, and e) Neuroscience. After that, we qualitatively analyzed the most cited papers in each area. Our analysis identified that the research areas are compatible with the US intelligence doctrine. In addition, we considered that the research areas could be incorporated in the work of security forces provided that basic training is offered. The basic training would not only enhance capabilities of law enforcement agencies but also help safeguard against (unwitting) biases and mistakes in the analysis of data

    Introducing the Email Knowledge Extraction with Social Network Analysis (EKESNA) tool for discovering an organisation’s expertise network

    Manually collating social network analysis (SNA) data can be a tedious and time-consuming process, but automating it could save time and improve efficiency in utilising the organisation’s knowledge network. In this paper the authors propose Email Knowledge Extraction with Social Network Analysis (EKESNA), a system for automating the continuous discovery and collation of an organisation’s social network, as well as its expertise network. The research adopted a systems development methodology, which comprised four stages. The first reviewed the approaches for collecting SNA data. The second involved carrying out SNA using traditional techniques. In the third stage, the EKESNA tool was developed, piloted in-house, and a trial was run at the same organisation used in stage two. The final stage evaluated the SNA data collected using both approaches. The knowledge networks obtained from the traditional social network gathering method and from EKESNA revealed similarities in the analysis of the members of the organisation who were central to the flow of information and knowledge. When compared with the traditional method of conducting knowledge network analysis, the EKESNA tool allows for continuous collection as new topics are discussed and new members are introduced into the organisation. The tool’s continuous discovery of the organisation’s knowledge network affords members up-to-date data to inform business process reengineering. The data collected is continuously evolving, meaning organisations can integrate networks around core processes, ensure integration (post-merger or reorganisation), improve the strategic decision making in top leadership networks (recruiting the right people for a particular project based on expertise and connectivity), or even identify/facilitate potential communities of practice. Analysis was limited by the sample size of both collection methods; emphasis was therefore placed on centrality measures.
Email knowledge extraction and email social network systems are not new concepts; however, this paper presents EKESNA, which is a novel system as it combines both concepts in a way that also allows for the continuous discovery, visualisation, and analysis of networks around specified topics of interest within an organisation, linking conversations to specific expert knowledge