Extracting textual overlays from social media videos using neural networks

Author: Bielski Adam
Cyrta Paweł
Słucki Adam
Trzcinski Tomasz
Publication date: 01/05/2018

Textual overlays are often used in social media videos because people who watch them without sound would otherwise miss essential information conveyed in the audio stream. Extracting these overlays can therefore serve as an important metadata source, e.g. for content classification or retrieval tasks. In this work, we present a robust method for extracting textual overlays from videos that builds upon multiple neural network architectures. The proposed solution relies on several processing steps: keyframe extraction, text detection and text recognition. The main component of our system, the text recognition module, is inspired by a convolutional recurrent neural network architecture, and we improve its performance using a synthetically generated dataset of over 600,000 images with text, prepared by the authors specifically for this task. We also develop a filtering method that uses Levenshtein distance to reduce the number of overlapping text phrases and further boosts the system's performance. The final accuracy of our solution reaches over 80% and is on par with state-of-the-art methods.
Comment: International Conference on Computer Vision and Graphics (ICCVG) 201
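The abstract does not say how the Levenshtein-based filter is built, so the following is only a minimal sketch of one plausible reading: recognized phrases whose normalized edit distance to an already kept phrase falls below a threshold are treated as overlapping and dropped. The function names and the `max_ratio` threshold are assumptions made for illustration, not the authors' implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def filter_near_duplicates(phrases, max_ratio=0.2):
    """Keep a phrase only if no already kept phrase is within max_ratio
    normalized edit distance of it (max_ratio is a hypothetical threshold)."""
    kept = []
    for phrase in phrases:
        is_duplicate = False
        for other in kept:
            longest = max(len(phrase), len(other), 1)
            if levenshtein(phrase, other) / longest <= max_ratio:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(phrase)
    return kept


if __name__ == "__main__":
    # Two OCR readings of the same overlay plus one distinct phrase.
    readings = ["Breaking news today", "Breaking news todav", "Weather update"]
    print(filter_near_duplicates(readings))
```

Normalizing by the longer phrase makes the threshold independent of phrase length. The pairwise comparison is quadratic in the number of phrases, which is likely acceptable for the handful of overlays found in a short video.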

arXiv.org e-Print Archive

Crossref

Signed Network Modeling Based on Structural Balance Theory

Author: Beigi Ghazaleh
Derr Tyler
Grover Aditya
Ludwig Mark
You Jiaxuan
Publication date: 11/12/2018

The modeling of networks, specifically with generative models, has been shown to provide a plethora of information about the underlying network structures, as well as many other benefits behind their construction. Recently there has been a considerable increase in interest in better understanding and modeling networks, but the vast majority of this work has addressed unsigned networks. However, many networks can have both positive and negative links (signed networks), especially in online social media, and they inherently have properties not found in unsigned networks due to the added complexity. Specifically, the positive-to-negative link ratio and the distribution of signed triangles are properties that are unique to signed networks and need to be explicitly modeled. This is because their underlying dynamics are not random, but governed by social theories such as Structural Balance Theory, which loosely states that users in social networks prefer triadic relations that involve less tension. Therefore, we propose a model for signed networks based on Structural Balance Theory and the unsigned Transitive Chung-Lu model. Our model introduces two parameters that help maintain the positive link ratio and the proportion of balanced triangles. Empirical experiments on three real-world signed networks demonstrate the importance of designing models specific to signed networks, based on social theories, to better preserve signed network properties when generating synthetic networks.
Comment: CIKM 2018: https://dl.acm.org/citation.cfm?id=327174
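The two properties named in the abstract, the positive link ratio and the proportion of balanced triangles, can be measured on a small signed graph as sketched below. This is only an illustration of the quantities, not the proposed generative model. It assumes the common formulation of structural balance in which a triangle is balanced when the product of its three edge signs is positive; the function name and the edge-list format are hypothetical.

```python
from itertools import combinations


def signed_network_stats(signed_edges):
    """Return (positive_link_ratio, balanced_triangle_ratio).

    signed_edges: iterable of (u, v, sign) for an undirected graph,
    where sign is +1 or -1 (an assumed input format).
    """
    sign = {frozenset((u, v)): s for u, v, s in signed_edges}
    if not sign:
        return 0.0, 0.0
    positive_ratio = sum(1 for s in sign.values() if s > 0) / len(sign)

    nodes = sorted({n for edge in sign for n in edge})
    balanced = total = 0
    # Brute-force triangle enumeration; fine for small illustrative graphs.
    for a, b, c in combinations(nodes, 3):
        keys = (frozenset((a, b)), frozenset((b, c)), frozenset((a, c)))
        if all(k in sign for k in keys):
            total += 1
            # Balanced: an even number of negative edges in the triangle.
            if sign[keys[0]] * sign[keys[1]] * sign[keys[2]] > 0:
                balanced += 1
    balanced_ratio = balanced / total if total else 0.0
    return positive_ratio, balanced_ratio


if __name__ == "__main__":
    edges = [
        ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
        ("c", "d", -1), ("b", "d", -1), ("a", "d", 1),
    ]
    print(signed_network_stats(edges))
```

A generative model of the kind described would be judged by how closely synthetic networks reproduce these two ratios as measured on the real network.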

arXiv.org e-Print Archive

Crossref

Synthetic Data Generation using Benerator Tool

Author: Ayala-Rivera Vanessa
Cerqueus Thomas
McDonagh Patrick
Murphy Liam
Publication date: 13/11/2013

Datasets with different characteristics are needed by the research community for experimental purposes. However, real data may be difficult to obtain due to privacy concerns. Moreover, real data may not have the specific characteristics needed to verify new approaches under certain conditions. Given these limitations, synthetic data is a viable alternative to complement real data. In this report, we describe the process followed to generate synthetic data using Benerator, a publicly available tool. The results show that the synthetic data preserves a high level of accuracy compared to the original data. The generated datasets correspond to microdata containing records with social, economic and demographic data that mimic the distribution of aggregated statistics from the 2011 Irish Census data.
Comment: 12 pages, 5 figures, 10 reference

arXiv.org e-Print Archive

CiteSeerX

Better Safe Than Sorry: An Adversarial Approach to Improve Social Bot Detection

Author: Cresci Stefano
Petrocchi Marinella
Spognardi Angelo
Tognazzi Stefano
Publication date: 01/01/2019

The arms race between spambots and spambot detectors is made of several cycles (or generations): a new wave of spambots is created (and new spam is spread), new spambot filters are derived, and old spambots mutate (or evolve) into new species. Recently, with the diffusion of the adversarial learning approach, a new practice is emerging: deliberately manipulating target samples in order to build stronger detection models. Here, we manipulate generations of Twitter social bots to obtain, and study, their possible future evolutions, with the aim of eventually deriving more effective detection techniques. In detail, we propose and experiment with a novel genetic algorithm for the synthesis of online accounts. The algorithm allows the creation of synthetic evolved versions of current state-of-the-art social bots. Results demonstrate that the synthetic bots do escape current detection techniques. However, they provide all the elements needed to improve such techniques, making possible a proactive approach to the design of social bot detection systems.
Comment: This is the pre-final version of a paper accepted @ 11th ACM Conference on Web Science, June 30-July 3, 2019, Boston, U

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio della ricerca- Università di Roma La Sapienza