93 research outputs found
ALEC: Active learning with ensemble of classifiers for clinical diagnosis of coronary artery disease
Invasive angiography is the reference standard for coronary artery disease (CAD) diagnosis but is expensive and
associated with certain risks. Machine learning (ML) using clinical and noninvasive imaging parameters can be
used for CAD diagnosis to avoid the side effects and cost of angiography. However, ML methods require labeled
samples for efficient training. The labeled data scarcity and high labeling costs can be mitigated by active
learning. This is achieved through selective query of challenging samples for labeling. To the best of our
knowledge, active learning has not been used for CAD diagnosis yet. An Active Learning with Ensemble of
Classifiers (ALEC) method is proposed for CAD diagnosis, consisting of four classifiers. Three of these classifiers
determine whether a patient’s three main coronary arteries are stenotic or not. The fourth classifier predicts
whether the patient has CAD or not. ALEC is first trained using labeled samples. For each unlabeled sample, if the
outputs of the classifiers are consistent, the sample along with its predicted label is added to the pool of labeled
samples. Inconsistent samples are manually labeled by medical experts before being added to the pool. The
training is performed once more using the samples labeled so far. The interleaved phases of labeling and training
are repeated until all samples are labeled. Compared with 19 other active learning algorithms, ALEC combined
with a support vector machine classifier attained superior performance with 97.01% accuracy. Our method is
justified mathematically as well. We also comprehensively analyze the CAD dataset used in this paper. As part of
dataset analysis, features pairwise correlation is computed. The top 15 features contributing to CAD and stenosis
of the three main coronary arteries are determined. The relationship between stenosis of the main arteries is
presented using conditional probabilities. The effect of considering the number of stenotic arteries on sample
discrimination is investigated. The discrimination power over dataset samples is visualized, assuming each of the
three main coronary arteries as a sample label and considering the two remaining arteries as sample features
Análisis de técnicas de aprendizaje automático en el sector de la viticultura
Este Trabajo Fin de Grado ofrece contribuciones relevantes al estado del arte de la investigación relacionada con la tecnología en el sector de la viticultura.
En primer lugar, se presenta una exhaustiva visión de las técnicas de Inteligencia Artificial empleadas en los últimos años en el ámbito de la vinificación a partir del estudio de artículos que inciden en las técnicas empleadas y cómo estas ayudan a mejorar diversos aspectos, como puede ser la calidad del vino o incluso factores relacionados con la producción o cantidad del vino producido. A partir de estos datos, podemos ofrecer un recorrido documentado sobre las inclinaciones actuales de emplear este gran recurso, la Inteligencia Artificial. Este estudio se centra en las técnicas de Aprendizaje Automático que se pueden integrar en la gestión y procesos de vinificación de viñedos actuales para brindar resultados relevantes y útiles para la industria.
Por otra parte, el segundo componente del trabajo destaca la importancia de las Bases de Datos empleadas, ofreciendo ejemplos y unas breves pinceladas sobre características importantes que influyen a la hora de afrontar un estudio con muestras de vino.
Este documento concluye ofreciendo una interpretación de las nuevas tendencias que se adoptarán en el futuro cercano para mejorar un sector enormemente influyente en nuestro país y a nivel mundial.This Final Project offers relevant contributions to the state of the art’s research related to technology in the viticulture sector.
On the one hand, an exhaustive vision of Artificial Intelligence techniques used in recent years in the field of winemaking is presented. In order to meet that goal, the study of articles that affect the techniques used and how they help to improve various aspects -such as wine quality or factors related to the production or quantity of the wine produced- are used. From these data, we can offer a documented tour of the current inclinations to use this great resource, Artificial Intelligence. This study focuses on Machine Learning techniques that can be integrated into current vineyard management and winemaking processes to deliver industry-relevant and useful results.
On the other hand, the second component of the current work highlights the importance of the databases used, offering examples and a few brief notes on important characteristics that influence when facing a study with wine samples.
This document concludes offering an interpretation of the new trends that will be adopted in the near future to improve a greatly influential sector in our country.Departamento de Teoría de la Señal y Comunicaciones e Ingeniería TelemáticaGrado en Ingeniería de Tecnologías de Telecomunicació
Recent Advances in Social Data and Artificial Intelligence 2019
The importance and usefulness of subjects and topics involving social data and artificial intelligence are becoming widely recognized. This book contains invited review, expository, and original research articles dealing with, and presenting state-of-the-art accounts pf, the recent advances in the subjects of social data and artificial intelligence, and potentially their links to Cyberspace
Detecting New, Informative Propositions in Social Media
The ever growing quantity of online text produced makes it increasingly challenging to find new important or useful information. This is especially so when topics of potential interest are not known a-priori, such as in “breaking news stories”. This thesis examines techniques for detecting the emergence of new, interesting information in Social Media. It sets the investigation in the context of a hypothetical knowledge discovery and acquisition system, and addresses two objectives. The first objective addressed is the detection of new topics. The second is filtering of non-informative text from Social Media. A rolling time-slicing approach is proposed for discovery, in which daily frequencies of nouns, named entities, and multiword expressions are compared to their expected daily frequencies, as estimated from previous days using a Poisson model. Trending features, those showing a significant surge in use, in Social Media are potentially interesting. Features that have not shown a similar recent surge in News are selected as indicative of new information. It is demonstrated that surges in nouns and news entities can be detected that predict corresponding surges in mainstream news. Co-occurring trending features are used to create clusters of potentially topic-related documents. Those formed from co-occurrences of named entities are shown to be the most topically coherent.
Machine learning based filtering models are proposed for finding informative text in Social Media. News/Non-News and Dialogue Act models are explored using the News annotated Redites corpus of Twitter messages. A simple 5-act Dialogue scheme, used to annotate a small sample thereof, is presented. For both News/Non-News and Informative/Non-Informative classification tasks, using non-lexical message features produces more discriminative and robust classification models than using message terms alone. The
combination of all investigated features yield the most accurate models
Advances in Data Mining Knowledge Discovery and Applications
Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is highlighting frontier fields and implementations of the knowledge discovery and data mining. It seems to be same things are repeated again. But in general, same approach and techniques may help us in different fields and expertise areas. This book presents knowledge discovery and data mining applications in two different sections. As known that, data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas. In this book, most of the areas are covered with different data mining applications. The eighteen chapters have been classified in two parts: Knowledge Discovery and Data Mining Applications
A constraint-based hypergraph partitioning approach to coreference resolution
The objectives of this thesis are focused on research in machine learning for
coreference resolution. Coreference resolution is a natural language processing
task that consists of determining the expressions in a discourse that mention or
refer to the same entity.
The main contributions of this thesis are (i) a new approach to coreference
resolution based on constraint satisfaction, using a hypergraph to represent
the problem and solving it by relaxation labeling; and (ii) research towards
improving coreference resolution performance using world knowledge extracted
from Wikipedia.
The developed approach is able to use entity-mention classi cation model
with more expressiveness than the pair-based ones, and overcome the weaknesses
of previous approaches in the state of the art such as linking contradictions,
classi cations without context and lack of information evaluating pairs. Furthermore,
the approach allows the incorporation of new information by adding
constraints, and a research has been done in order to use world knowledge to
improve performances.
RelaxCor, the implementation of the approach, achieved results in the
state of the art, and participated in international competitions: SemEval-2010
and CoNLL-2011. RelaxCor achieved second position in CoNLL-2011.La resolució de correferències és una tasca de processament del llenguatge natural que consisteix en determinar les expressions
d'un discurs que es refereixen a la mateixa entitat del mon real. La tasca té un efecte directe en la minería de textos així com en
moltes tasques de llenguatge natural que requereixin interpretació del discurs com resumidors, responedors de preguntes o
traducció automàtica. Resoldre les correferències és essencial si es vol poder “entendre” un text o un discurs.
Els objectius d'aquesta tesi es centren en la recerca en resolució de correferències amb aprenentatge automàtic. Concretament,
els objectius de la recerca es centren en els següents camps:
+ Models de classificació: Els models de classificació més comuns a l'estat de l'art estan basats en la classificació independent de
parelles de mencions. Més recentment han aparegut models que classifiquen grups de mencions. Un dels objectius de la tesi és
incorporar el model entity-mention a l'aproximació desenvolupada.
+ Representació del problema: Encara no hi ha una representació definitiva del problema. En aquesta tesi es presenta una
representació en hypergraf.
+ Algorismes de resolució. Depenent de la representació del problema i del model de classificació, els algorismes de ressolució
poden ser molt diversos. Un dels objectius d'aquesta tesi és trobar un algorisme de resolució capaç d'utilitzar els models de
classificació en la representació d'hypergraf.
+ Representació del coneixement: Per poder administrar coneixement de diverses fonts, cal una representació simbòlica i
expressiva d'aquest coneixement. En aquesta tesi es proposa l'ús de restriccions.
+ Incorporació de coneixement del mon: Algunes correferències no es poden resoldre només amb informació lingüística. Sovint
cal sentit comú i coneixement del mon per poder resoldre coreferències. En aquesta tesi es proposa un mètode per extreure
coneixement del mon de Wikipedia i incorporar-lo al sistem de resolució.
Les contribucions principals d'aquesta tesi son (i) una nova aproximació al problema de resolució de correferències basada en
satisfacció de restriccions, fent servir un hypergraf per representar el problema, i resolent-ho amb l'algorisme relaxation labeling; i
(ii) una recerca per millorar els resultats afegint informació del mon extreta de la Wikipedia.
L'aproximació presentada pot fer servir els models mention-pair i entity-mention de forma combinada evitant així els problemes
que es troben moltes altres aproximacions de l'estat de l'art com per exemple: contradiccions de classificacions independents,
falta de context i falta d'informació. A més a més, l'aproximació presentada permet incorporar informació afegint restriccions i s'ha
fet recerca per aconseguir afegir informació del mon que millori els resultats.
RelaxCor, el sistema que ha estat implementat durant la tesi per experimentar amb l'aproximació proposada, ha aconseguit uns
resultats comparables als millors que hi ha a l'estat de l'art. S'ha participat a les competicions internacionals SemEval-2010 i
CoNLL-2011. RelaxCor va obtenir la segona posició al CoNLL-2010
Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and the existence of a changing environment, while Computational Intelligence (CI) offers solutions to complicated problems as well as reverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically-inspired technologies such as the intellect of swarm as part of evolutionary computation and encompassing wider areas such as image processing, data collection, and natural language processing. This book aims to discuss the usage of CI for optimal solving of various applications proving its wide reach and relevance. Bounding of optimization methods and data mining strategies make a strong and reliable prediction tool for handling real-life applications
Recommended from our members
Automated detection of reflection in texts. A machine learning based approach
Promoting reflective thinking is an important educational goal. A common educational practice is to provide opportunities for learners to express their reflective thoughts in writing. The analysis of such text with regard to reflection is mainly a manual task that employs the principles of content analysis.
Considering the amount of text produced by online learning systems, tools that automatically analyse text with regard to reflection would greatly benefit research and practice.
Previous research has explored the potential of dictionary-based approaches that automatically map keywords to categories associated with reflection. Other automated methods use manually constructed rules to gauge insight from text. Machine learning has shown potential for classifying text with regard to reflection-related constructs. However, not much is known of whether machine learning can be used to reliably analyse text with regard to the categories of reflective writing models.
This thesis investigates the reliability of machine learning algorithms to detect reflective thinking in text. In particular, it studies whether text segments from student writings can be analysed automatically to detect the presence (or absence) of reflective writing model categories.
A synthesis of the models of reflective writing is performed to determine the categories frequently used to analyse reflective writing. For each of these categories, several machine learning algorithms are evaluated with regard to their ability to reliably detect reflective writing categories.
The evaluation finds that many of the categories can be predicted reliably. The automated method, however, does not achieve the same level of reliability as humans do
- …