217,554 research outputs found
Context-Aware Embeddings for Automatic Art Analysis
Automatic art analysis aims to classify and retrieve artistic representations
from a collection of images by using computer vision and machine learning
techniques. In this work, we propose to enhance visual representations from
neural networks with contextual artistic information. Whereas visual
representations are able to capture information about the content and the style
of an artwork, our proposed context-aware embeddings additionally encode
relationships between different artistic attributes, such as author, school, or
historical period. We design two different approaches for using context in
automatic art analysis. In the first one, contextual data is obtained through a
multi-task learning model, in which several attributes are trained together to
find visual relationships between elements. In the second approach, context is
obtained through an art-specific knowledge graph, which encodes relationships
between artistic attributes. An exhaustive evaluation of both of our models in
several art analysis problems, such as author identification, type
classification, or cross-modal retrieval, show that performance is improved by
up to 7.3% in art classification and 37.24% in retrieval when context-aware
embeddings are used
Authors\u27 Writing Styles Based Authorship Identification System Using the Text Representation Vector
© 2019 IEEE. Text mining is one of the main and typical tasks of machine learning (ML). Authorship identification (AI) is a standard research subject in text mining and natural language processing (NLP) that has undergone a remarkable evolution these last years. We need to identify/determine the actual author of anonymous texts given on the basis of a set of writing samples. Standard text classification often focuses on many handcrafted features such as dictionaries, knowledge bases, and different stylometric characteristics, which often leads to remarkable dimensionality. Unlike traditional approaches, this paper suggests an authorship identification approach based on automatic feature engineering using word2vec word embeddings, taking into account each author\u27s writing style. This system includes two learning phases, the first stage aims to generate the semantic representation of each author by using word2vec to learn and extract the most relevant characteristics of the raw document. The second stage is to apply the multilayer perceptron (MLP) classifier to fix the classification rules using the backpropagation learning algorithm. Experiments show that MLP classifier with word2vec model earns an accuracy of 95.83% for an English corpus, suggesting that the word2vec word embedding model can evidently enhance the identification accuracy compared to other classical models such as n-gram frequencies and bag of words
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform
Integrated process of images and acceleration measurements for damage detection
The use of mobile robots and UAV to catch unthinkable images together with on-site global automated acceleration measurements easy achievable by wireless sensors, able of remote data transfer, have strongly enhanced the capability of defect and damage evaluation in bridges. A sequential procedure is, here, proposed for damage monitoring and bridge condition assessment based on both: digital image processing for survey and defect evaluation and structural identification based on acceleration measurements. A steel bridge has been simultaneously inspected by UAV to acquire images using visible light, or infrared radiation, and monitored through a wireless sensor network (WSN) measuring structural vibrations. First, image processing has been used to construct a geometrical model and to quantify corrosion extension. Then, the consistent structural model has been updated based on the modal quantities identified using the acceleration measurements acquired by the deployed WSN. © 2017 The Authors. Published by Elsevier Ltd
Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
- âŠ