97,299 research outputs found
A framework for interrogating social media images to reveal an emergent archive of war
The visual image has long been central to how war is seen, contested and legitimised, remembered and forgotten. Archives are pivotal to these ends as is their ownership and access, from state and other official repositories through to the countless photographs scattered and hidden from a collective understanding of what war looks like in individual collections and dusty attics. With the advent and rapid development of social media, however, the amateur and the professional, the illicit and the sanctioned, the personal and the official, and the past and the present, all seem to inhabit the same connected and chaotic space. However, to even begin to render intelligible the complexity, scale and volume of what war looks like in social media archives is a considerable task, given the limitations of any traditional human-based method of collection and analysis. We thus propose the production of a series of ‘snapshots’, using computer-aided extraction and identification techniques to try to offer an experimental way in to conceiving a new imaginary of war. We were particularly interested in testing to see if twentieth century wars, obviously initially captured via pre-digital means, had become more ‘settled’ over time in terms of their remediated presence today through their visual representations and connections on social media, compared with wars fought in digital media ecologies (i.e. those fought and initially represented amidst the volume and pervasiveness of social media images). To this end, we developed a framework for automatically extracting and analysing war images that appear in social media, using both the features of the images themselves, and the text and metadata associated with each image. The framework utilises a workflow comprising four core stages: (1) information retrieval, (2) data pre-processing, (3) feature extraction, and (4) machine learning. Our corpus was drawn from the social media platforms Facebook and Flickr
Extracting textual overlays from social media videos using neural networks
Textual overlays are often used in social media videos as people who watch
them without the sound would otherwise miss essential information conveyed in
the audio stream. This is why extraction of those overlays can serve as an
important meta-data source, e.g. for content classification or retrieval tasks.
In this work, we present a robust method for extracting textual overlays from
videos that builds on multiple neural network architectures. The proposed
solution relies on several processing steps: keyframe extraction, text
detection and text recognition. The main component of our system, i.e. the text
recognition module, is inspired by a convolutional recurrent neural network
architecture, and we improve its performance using a synthetically generated
dataset of over 600,000 images with text, prepared by the authors specifically for
this task. We also develop a filtering method that reduces the number of
overlapping text phrases using Levenshtein distance and further boosts the system's
performance. The final accuracy of our solution exceeds 80% and is on
par with state-of-the-art methods.
Comment: International Conference on Computer Vision and Graphics (ICCVG) 201
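The filtering step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify how the distance is thresholded, so the normalised-distance rule, the `max_ratio` parameter, and the function names below are assumptions made for illustration, not the authors' implementation.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def filter_near_duplicates(phrases, max_ratio=0.2):
    """Keep a phrase only if no already-kept phrase is within max_ratio
    of it, measured as edit distance divided by the longer length.
    max_ratio is a hypothetical threshold chosen for this sketch."""
    kept = []
    for phrase in phrases:
        is_duplicate = False
        for other in kept:
            longest = max(len(phrase), len(other)) or 1
            if levenshtein(phrase, other) / longest <= max_ratio:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(phrase)
    return kept
```

For example, overlay text recognised on neighbouring keyframes often differs by a single misread character; under these assumptions, `filter_near_duplicates(["BREAKING NEWS", "BREAKING NEW5", "Weather today"])` keeps only the first and last phrases.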
Exploring the Power of Topic Modeling Techniques in Analyzing Customer Reviews: A Comparative Analysis
The exponential growth of online social network platforms and applications
has led to a staggering volume of user-generated textual content, including
comments and reviews. Consequently, users often face difficulties in extracting
valuable insights or relevant information from such content. To address this
challenge, machine learning and natural language processing algorithms have
been deployed to analyze the vast amount of textual data available online. In
recent years, topic modeling techniques have gained significant popularity in
this domain. In this study, we comprehensively examine and compare six
frequently used topic modeling methods specifically applied to customer
reviews. The methods under investigation are latent semantic analysis (LSA),
latent Dirichlet allocation (LDA), non-negative matrix factorization (NMF),
pachinko allocation model (PAM), Top2Vec, and BERTopic. By practically
demonstrating their benefits in detecting important topics, we aim to highlight
their efficacy in real-world scenarios. To evaluate the performance of these
topic modeling methods, we carefully select two textual datasets. The
evaluation is based on standard statistical evaluation metrics such as the topic
coherence score. Our findings reveal that BERTopic consistently yields more
meaningful extracted topics and achieves favorable results.
Comment: 13 page
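The abstract names the topic coherence score without saying which variant is used. As one common possibility, the sketch below computes the UMass coherence of a single topic from document co-occurrence counts; the choice of UMass, the function name, and the toy data are assumptions for illustration only.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic.

    top_words: the topic's top words, most probable first.
    documents: an iterable of tokenised documents (lists of words).
    Sums log((D(w_i, w_j) + 1) / D(w_j)) over all pairs with j < i,
    where D counts the documents containing the given word(s).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            d_j = doc_freq(top_words[j])
            if d_j == 0:
                continue  # word never occurs; skip the pair to avoid dividing by zero
            score += math.log((doc_freq(top_words[i], top_words[j]) + 1) / d_j)
    return score
```

Scores are at most slightly above zero and usually negative; values closer to zero indicate that a topic's top words tend to appear in the same documents. In a toy review corpus, a topic such as `["battery", "life"]` would score higher than `["battery", "screen"]` if the latter pair never co-occurs.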
Probing the topological properties of complex networks modeling short written texts
In recent years, graph theory has been widely employed to probe several
language properties. More specifically, the so-called word adjacency model has
been proven useful for tackling several practical problems, especially those
relying on textual stylistic analysis. The most common approach to treating texts
as networks has simply considered either large pieces of text or entire books.
This approach has certainly worked well -- many informative discoveries have
been made this way -- but it raises an uncomfortable question: could there be
important topological patterns in small pieces of text? To address this
problem, the topological properties of subtexts sampled from entire books were
probed. Statistical analyses performed on a dataset comprising 50 novels
revealed that most of the traditional topological measurements are stable for
short subtexts. When the performance of the authorship recognition task was
analyzed, it was found that a proper sampling yields a discriminability similar
to the one found with full texts. Surprisingly, the support vector machine
classification based on the characterization of short texts outperformed the
one performed with entire books. These findings suggest that a local
topological analysis of large documents might improve their global
characterization. Most importantly, it was verified, as a proof of principle,
that short texts can be analyzed with the methods and concepts of complex
networks. As a consequence, the techniques described here can be extended in a
straightforward fashion to analyze texts as time-varying complex networks
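To make the word adjacency model concrete, here is a minimal sketch that links consecutive words in an undirected graph, computes two simple topological measurements, and splits a text into fixed-size subtexts. The abstract does not give preprocessing or sampling details, so the undirected simplification, the non-overlapping chunking, and all function names are assumptions for illustration.

```python
from collections import defaultdict


def word_adjacency_network(tokens):
    """Undirected graph linking every pair of consecutive, distinct words."""
    neighbours = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:
            neighbours[left].add(right)
            neighbours[right].add(left)
    return neighbours


def average_degree(neighbours):
    """Mean number of distinct neighbours per word."""
    if not neighbours:
        return 0.0
    return sum(len(n) for n in neighbours.values()) / len(neighbours)


def clustering_coefficient(neighbours, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    adjacent = list(neighbours.get(node, ()))
    k = len(adjacent)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if adjacent[j] in neighbours[adjacent[i]])
    return 2.0 * links / (k * (k - 1))


def subtexts(tokens, size):
    """Consecutive non-overlapping chunks of `size` tokens (short tail dropped)."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]
```

Building one network per chunk returned by `subtexts` and comparing the measurements across chunks gives a rough analogue of the stability analysis the abstract describes.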
- …