154 research outputs found
Robust optimal design of quantum electronic devices
We consider the optimal design of a sequence of quantum barriers in order to manufacture a nanoscale electronic device whose transmission coefficient depends linearly on the bias voltage. The technique presented here is easily adaptable to other response characteristics. The transmission coefficient is computed using the Wentzel-Kramers-Brillouin (WKB) method, so we can explicitly compute the gradient of the objective function. In contrast with earlier treatments, manufacturing uncertainties are incorporated in the model through random variables and the optimal design problem is formulated in a probabilistic setting. As a measure of robustness, a weighted sum of the expectation and the variance of a least-squares performance metric is considered. Several simulations illustrate the proposed approach.
Ociel Morales was supported by CONACyT (Consejo Nacional de Ciencia y Tecnología, Mexico) (Grant no. 726714), under the program Movilidad en el Extranjero (291062). Francisco Periago was supported by Ministerio de Economía y Competitividad (Spain) (Projects DPI2016-77538-R and MTM2017-83740-P) and Fundación Séneca (Agencia de Ciencia y Tecnología de la Región de Murcia (Spain)) (19274/PI/14). José A. Vallejo was supported by CONACyT Project CB-179115.
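The robustness measure described in the abstract above could be written along the following lines. This is only a sketch: the symbols $x$ (design variables), $\xi$ (random manufacturing perturbations), $T$ (transmission coefficient), $T_{\mathrm{lin}}$ (target linear response), the voltage range $[V_{\min}, V_{\max}]$ and the weight $\alpha \ge 0$ are notation assumed here, not taken from the paper.

```latex
\[
  J(x) = \mathbb{E}\big[ F(x,\xi) \big] + \alpha \, \mathrm{Var}\big[ F(x,\xi) \big],
  \qquad
  F(x,\xi) = \int_{V_{\min}}^{V_{\max}}
             \big( T(V; x, \xi) - T_{\mathrm{lin}}(V) \big)^{2} \, dV .
\]
```

With $\alpha = 0$ this reduces to minimizing the expected least-squares misfit; larger values of $\alpha$ penalize designs whose performance varies strongly under manufacturing perturbations.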
InstanceRank: Bringing order to datasets
In this paper we present InstanceRank, a ranking algorithm that reflects the relevance of the instances
within a dataset. InstanceRank applies a solution similar to that used by PageRank, the web page ranking
algorithm of the Google search engine. We also present ISR, an instance selection technique that uses
InstanceRank. This algorithm chooses the most representative instances from a learning database. Experiments
show that the ISR algorithm, with InstanceRank as the ranking criterion, achieves accuracy similar
to that of other instance reduction techniques while noticeably reducing the size of the instance set.
Funding: Ministerio de Educación y Ciencia HUM2007-66607-C04-0
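As a rough illustration of a PageRank-style ranking over instances, the sketch below runs standard power-iteration PageRank on a similarity graph. The abstract does not say how InstanceRank builds its graph, so the `similarity_graph` helper, its weight formula, and the toy points are all hypothetical choices made for this example.

```python
def pagerank(weights, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank on a dense weighted adjacency matrix.

    weights[i][j] is the weight of the edge i -> j (0 means no edge).
    """
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if out_sums[i] == 0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_sums[i]
                for i in range(n)
                if out_sums[i] > 0
            )
            new_rank.append(
                (1 - damping) / n + damping * (incoming + dangling / n)
            )
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


def similarity_graph(points):
    """Hypothetical instance graph: weight = 1 / (1 + squared distance)."""
    n = len(points)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                graph[i][j] = 1.0 / (1.0 + d2)
    return graph


# Three tightly clustered instances and one far-away instance (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
scores = pagerank(similarity_graph(points))
```

In this toy setup the isolated fourth instance receives the lowest score, which is the kind of signal an instance selection step could use to decide which instances to keep.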
PolaritySpam: Propagating Content-based Information Through a Web-Graph to Detect Web Spam
Spam web pages have become a problem for Information Retrieval systems
due to the negative effects that this phenomenon can cause in their results. In this work
we tackle the problem of detecting these pages with a propagation algorithm that, taking
as input a web graph, chooses a set of spam and not-spam web pages in order to spread
their spam likelihood over the rest of the network. Thus we take advantage of the links
between pages to obtain a ranking of pages according to their relevance and their spam
likelihood. Our intuition is to give a high reputation to those pages related to
relevant ones, and a high spam likelihood to the pages linked to spam web pages.
We introduce the novelty of including the content of the web pages in the computation of
an a priori estimation of the spam likelihood of the pages, and propagate this information.
Our graph-based algorithm computes two scores for each node in the graph. Intuitively,
these values represent how bad or good (spam-like or not) a web page is, according to its
textual content and its relations in the graph. The experimental results show that our
method outperforms other techniques for spam detection.
Funding: Ministerio de Educación y Ciencia HUM2007-66607-C04-0
WIRS. Un Algoritmo de Reducción de Instancias Basado en Ranking (WIRS: A Ranking-Based Instance Reduction Algorithm)
This paper presents the WIRS algorithm, an instance
reduction technique that aims to select the most representative instances from a learning database.
Techniques of this kind are used to obtain smaller databases
on which the nearest-neighbor algorithm can be applied
at a lower computational cost and without excessive loss of accuracy. The
WIRS algorithm is an adaptation of the WITS algorithm in which
the typicality criterion has been replaced by a ranking criterion when computing
the ordering of the instances required to apply WITS. To compute the
ranking we use a solution similar to that employed by PageRank, the
web page relevance algorithm of the Google search engine.
The experiments show that using the ranking as the
ordering criterion yields results comparable to those of the original
version of WITS, and even improves on them for some of
the databases used.
Funding: Ministerio de Educación y Ciencia TIN 2004-07246-C03-0
‘Long autonomy or long delay?’ The importance of domain in opinion mining
Nowadays, people not only navigate the web but also contribute content to the Internet. Among
other things, they write their thoughts and opinions in review sites, forums, social networks, blogs and
other websites. These opinions constitute a valuable resource for businesses, governments and consumers.
In recent years, some researchers have proposed opinion extraction systems, mostly domain-independent
ones, to automatically extract structured representations of opinions contained in those texts. In
this work, we tackle this task in a domain-oriented approach, defining a set of domain-specific resources
which capture valuable knowledge about how people express opinions on a given domain. These
resources are automatically induced from a set of annotated documents. Some experiments were carried
out on three different domains (user-generated reviews of headphones, hotels and cars), comparing our
approach to other state-of-the-art, domain-independent techniques. The results confirm the importance
of the domain in order to build accurate opinion extraction systems. Some experiments on the influence
of the dataset size and an example of aggregation and visualization of the extracted opinions are also
shown.
MCFS: Min-cut-based feature-selection
In this paper, MCFS (Min-Cut-based feature-selection) is presented, which is a feature-selection algorithm based on the representation of the features in a dataset by means of a directed graph. The main contribution of our work is to show the usefulness of a general graph-processing technique in the feature-selection problem for classification datasets. The vertices of the graphs used herein are the features together with two special-purpose vertices (one of which denotes high correlation to the feature class of the dataset, and the other denotes a low correlation to the feature class). The edges are functions of the correlations among the features and also between the features and the classes. A classic max-flow min-cut algorithm is applied to this graph. The cut returned by this algorithm provides the selected features. We have compared the results of our proposal with well-known feature-selection techniques. Our algorithm obtains results statistically similar to those achieved by the other techniques in terms of number of features selected, while additionally significantly improving the accuracy.
Funding: Ministerio de Ciencia, Innovación y Universidades RTI2018-098062-A-I00; Ministerio de Economía y Competitividad TIN2017-82113-C2-1-
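The graph construction described in the MCFS abstract can be illustrated with a small max-flow min-cut sketch. The abstract does not give the edge functions, so the capacities used here (class correlation toward a `HIGH` vertex, one minus that toward a `LOW` vertex, and feature-to-feature correlation between features), the vertex names, and the toy numbers are all assumptions for this example. The max-flow routine is a plain Edmonds-Karp implementation, not necessarily the algorithm used in the paper.

```python
from collections import deque


def min_cut_source_side(capacity, source, sink):
    """Edmonds-Karp max-flow; returns the nodes on the source side of a min cut."""
    # Residual capacities, including zero-capacity reverse edges.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0.0)

    def bfs_parents():
        # Breadth-first search over edges that still have spare capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    while True:
        parents = bfs_parents()
        if sink not in parents:
            # No augmenting path left: reachable nodes form the source side.
            return set(parents)
        # Walk back from the sink to collect the augmenting path.
        path, v = [], sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck


# Hypothetical toy values: absolute correlation of each feature with the class
# and between pairs of features.
class_corr = {"f1": 0.9, "f2": 0.8, "f3": 0.1}
feature_corr = {("f1", "f2"): 0.3, ("f2", "f3"): 0.05, ("f1", "f3"): 0.05}

graph = {"HIGH": {}, "LOW": {}}
for f, c in class_corr.items():
    graph["HIGH"][f] = c                      # pull toward "high correlation"
    graph.setdefault(f, {})["LOW"] = 1.0 - c  # pull toward "low correlation"
for (a, b), c in feature_corr.items():
    graph[a][b] = c
    graph[b][a] = c

# Under this assumed construction, features left on the HIGH side are selected.
selected = min_cut_source_side(graph, "HIGH", "LOW") - {"HIGH"}
```

With these toy numbers the cut keeps `f1` and `f2` on the `HIGH` side and leaves the weakly correlated `f3` on the `LOW` side.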
Propagation of trust and distrust for the detection of trolls in a social network
Trust and Reputation Systems constitute an essential part of many social networks due to
the great expansion of these on-line communities in the past few years. As a consequence
of this growth, some users try to disturb the normal atmosphere of these communities, or
even to take advantage of them in order to obtain some kind of benefits. Therefore, the concept
of trust is a key point in the performance of on-line systems such as on-line marketplaces,
review aggregators, social news sites, and forums. In this work we propose a
method to compute a ranking of the users in a social network, regarding their trustworthiness.
The aim of our method is to prevent malicious users from illicitly gaining high reputation
in the network by demoting them in the ranking of users. We propose a novel
system intended to propagate both positive and negative opinions of the users through a
network, in such a way that the opinions from each user about others influence their global
trust score. Our proposal has been evaluated in different challenging situations. The experiments
include the generation of random graphs, the use of a real-world dataset extracted
from a social news site, and a combination of both a real dataset and generation techniques,
in order to test our proposals in different environments. The results show that
our method performs well in every situation, showing the propagation of trust and distrust
to be a reliable mechanism in a Trust and Reputation System.
A comparative study of classifier combination applied to NLP tasks
The paper is devoted to a comparative study of classifier combination methods, which have been successfully
applied to multiple tasks including Natural Language Processing (NLP) tasks. There is a variety of classifier
combination techniques and the major difficulty is to choose one that is the best fit for a particular
task. In our study we explored the performance of a number of combination methods such as voting,
Bayesian merging, behavior knowledge space, bagging, stacking, feature sub-spacing and cascading, for
the part-of-speech tagging task using nine corpora in five languages. The results show that some methods
that, currently, are not very popular could demonstrate much better performance. In addition, we learned
how the corpus size and quality influence the performance of the combination methods. We also provide the
results of applying the classifier combination methods to other NLP tasks, such as named entity recognition
and chunking. We believe that our study is the most exhaustive comparison made with combination
methods applied to NLP tasks so far.
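Of the combination methods listed in the abstract above, voting is the simplest to show in code. The sketch below is a minimal plurality vote over per-token tags from several taggers; the function name, the tie-breaking rule, and the toy tag sequences are assumptions for illustration, not details from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier tag sequences by plurality vote at each token.

    predictions: list of equal-length tag sequences, one per classifier.
    Ties are broken in favour of the earliest classifier in the list.
    """
    combined = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        # First tag, in classifier order, that reaches the top count.
        combined.append(next(t for t in token_tags if counts[t] == best))
    return combined


# Toy outputs of three hypothetical part-of-speech taggers on a 3-token sentence.
tagger_outputs = [
    ["DET", "NOUN", "VERB"],
    ["DET", "VERB", "VERB"],
    ["DET", "NOUN", "NOUN"],
]
combined_tags = majority_vote(tagger_outputs)
```

Each tagger errs on a different token here, so the vote recovers the tag that two of the three taggers agree on at every position.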
A Knowledge-Rich Approach to Feature-Based Opinion Extraction from Product Reviews
Feature-based opinion extraction is a task related to information
extraction, which consists of extracting structured
opinions on features of some object from reviews or other
subjective textual sources. Over the last few years, this problem
has been studied by some researchers, generally in an
unsupervised, domain-independent manner. As opposed to
that, in this work we propose a redefinition of the problem
from a more practical point of view, and describe a domain-specific,
resource-based opinion extraction system. We focus
on the description and generation of those resources, and
briefly report the extraction system architecture and a few
initial experiments. The results suggest that domain-specific
knowledge is a valuable resource in order to build precise
opinion extraction systems.