Exploring Patterns of Epigenetic Information With Data Mining Techniques
Data mining, part of the Knowledge Discovery in Databases (KDD) process, is the extraction of patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide, high-throughput approaches, generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information that are mitotically and/or meiotically heritable, determining gene expression, cellular differentiation and cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Either together or individually, both defects can result in losing control over cell growth and thus in cancer development. Data mining techniques can then be used to extract such patterns. This work reviews some of the most important applications of data mining to epigenetics. Funding: Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo, 209RT-0366; Galicia, Consellería de Economía e Industria, 10SIN105004PR; Instituto de Salud Carlos III, RD07/0067/000
What is behind a summary-evaluation decision?
Research in psychology has reported that, among the variety of available assessment methodologies, summary evaluation offers a particularly adequate context for inferring text comprehension and topic understanding. However, the grades obtained with this methodology are hard to quantify objectively. We therefore carried out an empirical study to analyze the decisions underlying human summary-grading behavior. The task consisted of expert evaluation of summaries produced in critically relevant contexts of summarization development, and the resulting data were modeled by means of Bayesian networks using an application called Elvira, which allows the predictive power (if any) of the resulting variables to be observed graphically. Thus, in this article, we analyze summary-evaluation decision making in a computational framework
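The kind of inference a Bayesian network performs over grading decisions can be illustrated with a minimal sketch. This is not the Elvira model from the study: the variables (coherence, coverage, grade) and all probability values below are invented for illustration, and the network is reduced to a naive two-feature structure evaluated by enumeration.

```python
# Hypothetical conditional probability tables (not from the study).
# P(grade)
p_grade = {"pass": 0.6, "fail": 0.4}
# P(coherence | grade) and P(coverage | grade) -- illustrative numbers only
p_coherence = {"pass": {"high": 0.8, "low": 0.2},
               "fail": {"high": 0.3, "low": 0.7}}
p_coverage = {"pass": {"high": 0.7, "low": 0.3},
              "fail": {"high": 0.2, "low": 0.8}}

def posterior(coherence, coverage):
    """P(grade | coherence, coverage) under the naive structure above."""
    joint = {g: p_grade[g] * p_coherence[g][coherence] * p_coverage[g][coverage]
             for g in p_grade}
    z = sum(joint.values())                  # normalize over grades
    return {g: v / z for g, v in joint.items()}

print(posterior("high", "low"))   # high coherence, low coverage
```

Observing how each posterior shifts as evidence changes is the "predictive power" question the abstract describes, here answered by hand rather than graphically.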
Distribution-Based Categorization of Classifier Transfer Learning
Transfer Learning (TL) aims to transfer knowledge acquired in one problem, the source problem, onto another problem, the target problem, dispensing with the bottom-up construction of the target model. Due to its relevance, TL has gained significant interest in the Machine Learning community, since it paves the way to devising intelligent learning models that can easily be tailored to many different applications. As is natural in a fast-evolving area, a wide variety of TL methods, settings and nomenclature have been proposed so far. However, many works have reported different names for the same concepts, and this mixture of concepts and terminology obscures the TL field, hindering its proper consideration. In this paper we present a review of the literature on the majority of classification TL methods, together with a distribution-based categorization of TL under a common nomenclature suitable for classification problems. Under this perspective, three main TL categories are presented, discussed and illustrated with examples
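One common TL setting the abstract alludes to, reusing a source model instead of building the target model bottom-up, can be sketched as parameter transfer. This is a generic illustration, not one of the paper's categories by name: a logistic-regression classifier is trained on an abundant source problem, and its weights initialize fine-tuning on a small, related target problem. The data and the relatedness of the two decision boundaries are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_logreg(X, y, w, steps=500, lr=0.1):
    """Plain gradient-descent logistic regression, starting from weights w."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Source problem: plenty of labeled data.
Xs = rng.normal(size=(500, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(float)
# Target problem: a related but shifted boundary, very few labels.
Xt = rng.normal(size=(20, 2))
yt = (Xt[:, 0] + 0.8 * Xt[:, 1] > 0).astype(float)

# Bottom-up source model, then fine-tuning from the transferred weights.
w_source = train_logreg(Xs, ys, np.zeros(2))
w_target = train_logreg(Xt, yt, w_source.copy(), steps=100)
```

Starting the target optimization from `w_source` rather than from zeros is what "dispensing with the bottom-up construction of the target model" amounts to in this simple instance.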
Overview of recent advances in Health care technology and its impact on health care delivery
Recent advances in technology, such as Machine Learning (ML), Artificial Intelligence (AI), robotics, the Internet of Things (IoT), blockchain technologies, Big Data analytics, cloud computing, Natural Language Processing and mobile applications, are making a huge impact on the day-to-day lives of human beings. These technologies help us save resources, time and cost while increasing accuracy and efficiency. The biomedical domain has also started embracing them in the areas of diagnosis, surgery and therapeutics, and they have further applications in pattern recognition and expert systems. This paper provides an overview of recent advances in technology and their impact on the biomedical domain
The Role of Data Quality and Heterogeneity on the Calibration of Neural Networks
Neural networks have been widely studied and used in recent years due to their high classification accuracy and training efficiency. With increasing network depth, however, the models become worse calibrated, meaning their output probabilities no longer reflect the true probabilities. On the other hand, in many applications, such as medical diagnosis, facial recognition and self-driving cars, calibrated output probabilities are of critical importance. Understanding the causes of deep neural network miscalibration is therefore of much concern. The influence of model structure on output calibration has been explored; however, the impact of training-dataset quality and heterogeneity, such as dataset size and label noise, remains unclear. In this thesis, the impact of data quality and heterogeneity on output calibration is investigated theoretically and experimentally. Afterwards, the shortcomings of calibration methods that use a single global parameter are discussed. To overcome the calibration issues resulting from dataset heterogeneity, we propose an improved calibration technique that gives better performance
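The "single global parameter" family of methods the thesis critiques is typified by temperature scaling: one scalar T divides all logits before the softmax, fitted to minimize negative log-likelihood on held-out data. The sketch below uses synthetic overconfident logits (correct-class logits inflated, then 30% label noise) rather than a real network, and grid search rather than a proper optimizer; both are simplifying assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Negative log-likelihood of the true labels at temperature T."""
    p = softmax(logits, T)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

def fit_temperature(logits, labels):
    """Grid-search the single global temperature minimizing validation NLL."""
    grid = np.linspace(0.5, 5.0, 91)
    return grid[np.argmin([nll(logits, labels, T) for T in grid])]

# Synthetic overconfident model: confident logits, but 30% noisy labels,
# so stated confidence exceeds actual accuracy.
rng = np.random.default_rng(1)
labels = rng.integers(0, 3, size=400)
logits = rng.normal(scale=0.5, size=(400, 3))
logits[np.arange(400), labels] += 3.0
flip = rng.random(400) < 0.3
labels[flip] = (labels[flip] + 1) % 3

T = fit_temperature(logits, labels)   # a T above 1 softens the probabilities
```

Because T is a single global scalar, it rescales every example identically, which is exactly why such methods struggle on heterogeneous datasets where different subpopulations are miscalibrated by different amounts.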
Cognitive Spam Recognition Using Hadoop and Multicast-Update
In today's world of exponentially growing technology, spam is a very common issue faced by users on the internet. Spam not only hinders the performance of a network, but also wastes space and time, causes general irritation, and presents a multitude of dangers: viruses, malware, spyware and consequent system failure, identity theft, and other cybercriminal activity. In this context, cognition provides a method to help improve the performance of the distributed system. It enables the system to learn what it is supposed to do for different input types as different classifications are made over time, and this learning helps it increase its accuracy as time passes. Each system on its own can only do so much learning, because of the limited sample set of inputs that it gets to process. In a network, however, we can make sure that every system knows the different kinds of inputs available and learns what it is supposed to do with a better success rate. Thus, distributing and combining this cognition across different components of the network leads to an overall improvement in the performance of the system. In this paper, we describe a method to make machines cognitively label spam using Machine Learning and the Naive Bayesian approach. We also present two possible methods of implementation, using a MapReduce framework (Hadoop) and using messages coupled with a multicast-send based network, each with its own subtypes, and the pros and cons of each. We finally present a comparative analysis of the two main methods and provide a basic idea of the usefulness of the two in various scenarios
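The Naive Bayesian classification step the paper distributes can be shown on a single machine. This is a toy sketch with an invented four-message corpus, not the paper's system: word counts per class plus Laplace smoothing give log-probability scores, and the distributed variants would ship these counts around via MapReduce or multicast updates.

```python
import math
from collections import Counter

# Toy labeled corpus (invented for illustration).
train = [("win cash prize now", "spam"),
         ("cheap prize win win", "spam"),
         ("meeting agenda for monday", "ham"),
         ("project review meeting notes", "ham")]

counts = {"spam": Counter(), "ham": Counter()}   # per-class word counts
docs = Counter()                                 # per-class document counts
for text, label in train:
    docs[label] += 1
    counts[label].update(text.split())

vocab = set(counts["spam"]) | set(counts["ham"])

def classify(text):
    """Pick the class maximizing log P(class) + sum of log P(word | class)."""
    scores = {}
    for label in counts:
        total = sum(counts[label].values())
        score = math.log(docs[label] / sum(docs.values()))
        for w in text.split():
            # Laplace smoothing keeps unseen words from zeroing the product.
            score += math.log((counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("win a cash prize"))   # -> spam
```

In the Hadoop variant, the map phase would emit (word, class) count pairs and the reduce phase would aggregate them into `counts`; the multicast variant would instead broadcast count updates to peers so every node's tables stay current.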
Learning Discrete-Time Markov Chains Under Concept Drift
Learning under concept drift is a novel and promising research area aimed at designing learning algorithms able to deal with nonstationary data-generating processes. In this research field, most of the literature focuses on learning nonstationary probabilistic frameworks, while some extensions to learning graphs and signals under concept drift exist. For the first time in the literature, this paper addresses the problem of learning discrete-time Markov chains (DTMCs) under concept drift. More specifically, following a hybrid active/passive approach, this paper introduces both a family of change-detection mechanisms (CDMs), differing in their required assumptions and performance, for detecting changes in DTMCs, and an adaptive learning algorithm able to deal with DTMCs under concept drift. The effectiveness of both the proposed CDMs and the adaptive learning algorithm has been extensively tested in synthetic experiments and on real data sets