13 research outputs found

Recommended from our members

Entropy Based Feature Selection For Multi-Relational Naïve Bayesian Classifier

Author: Modi Nilesh K
Vaghela Vimalkumar B
Vandra Kalpesh H
Publication venue: CSUSB ScholarWorks
Publication date: 01/01/2014

In industry today, data are commonly stored in relational structures. The usual approach to mining such data is to join several relations into a single relation using foreign-key links, which is known as flattening. Flattening can cause problems such as long processing time, data redundancy, and statistical skew in the data. Hence, the critical issue is how to mine data directly across multiple relations. The approach that addresses this issue is called multi-relational data mining (MRDM). A further issue is that irrelevant or redundant attributes in a relation may not contribute to classification accuracy. Thus, feature selection is an essential data pre-processing step in multi-relational data mining. By filtering out irrelevant or redundant features from relations before mining, we improve classification accuracy, achieve good time performance, and improve the comprehensibility of the models. We propose an entropy-based feature selection method for the Multi-relational Naïve Bayesian Classifier. We use the InfoDist method and Pearson’s correlation to filter out irrelevant and redundant features from the multi-relational database and thereby enhance classification accuracy. We evaluated our algorithm on the PKDD financial dataset and achieved better accuracy than existing feature selection methods.
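
The filtering idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes that InfoDist denotes the information distance H(X|Y) + H(Y|X), works on a single already-discretized table rather than multiple relations, and uses hypothetical names (`info_dist`, `select_features`) and thresholds (`max_dist`, `max_corr`) chosen only for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_dist(xs, ys):
    """Information distance H(X|Y) + H(Y|X) = 2*H(X,Y) - H(X) - H(Y).

    Assumption: this is what "InfoDist" refers to; small values mean
    the two variables carry nearly the same information.
    """
    joint = entropy(list(zip(xs, ys)))
    return 2 * joint - entropy(xs) - entropy(ys)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for constant inputs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(features, labels, max_dist, max_corr):
    """Keep features close to the class label and not redundant with each other.

    features: dict mapping feature name -> list of numeric, discretized values.
    max_dist and max_corr are hypothetical thresholds, not values from the paper.
    """
    # Visit the most label-relevant features first (smallest distance).
    ranked = sorted(features, key=lambda name: info_dist(features[name], labels))
    selected = []
    for name in ranked:
        if info_dist(features[name], labels) > max_dist:
            continue  # irrelevant: shares too little information with the label
        if any(abs(pearson(features[name], features[s])) >= max_corr
               for s in selected):
            continue  # redundant: strongly correlated with a kept feature
        selected.append(name)
    return selected
```

In this sketch a feature is dropped either because it is far from the class label in information distance or because it is strongly correlated with a feature that was already kept, mirroring the "irrelevant or redundant" distinction made in the abstract.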

CSUSB ScholarWorks

Prediction of DNA-binding propensity of proteins by the ball-histogram method using automatic template search

Author: A Bhattacharyya
A Szabóová
A Szilágyi
Andrea Szabóová
CJC Burges
CO Pabo
DH Ohlendorf
DW Hosmer
EW Stawiski
Filip Železný
G Nimrod
J Moreland
Jakub Tolar
L Breiman
N Bhardwaj
N Lavrač
Ondřej Kuželka
R Caruana
R Sathyapriya
S Ahmad
S Jones
S Jones
T Cathomen
T Hastie
Y Mandel-Gutfreund
Y Tsuchiya
Publication venue: BioMed Central
Publication date: 01/01/2012

We contribute a novel, ball-histogram approach to DNA-binding propensity prediction of proteins. Unlike state-of-the-art methods based on constructing an ad-hoc set of features describing physicochemical properties of the proteins, the ball-histogram technique enables a systematic, Monte-Carlo exploration of the spatial distribution of amino acids complying with automatically selected properties. This exploration yields a model for the prediction of DNA-binding propensity. We validate our method in prediction experiments, improving on state-of-the-art accuracies. Moreover, our method also provides interpretable features involving spatial distributions of selected amino acids.
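
The Monte-Carlo exploration described in this abstract can be pictured with a rough Python sketch. The details are assumptions made for illustration, not the published procedure: residues are given as (position, set of properties) pairs, ball centers are drawn uniformly from the bounding box of the protein, and the function name `ball_histogram` and its parameters are hypothetical.

```python
import math
import random
from collections import Counter


def ball_histogram(residues, template, radius, n_samples, seed=0):
    """Estimate how often sampled balls contain given counts of residue types.

    residues: list of ((x, y, z), set_of_property_names) pairs.
    template: tuple of property names to count inside each ball.
    Returns a dict mapping a tuple of counts (one per template property)
    to the fraction of sampled balls with exactly those counts.
    """
    rng = random.Random(seed)
    coords = [pos for pos, _ in residues]
    lo = [min(c[i] for c in coords) for i in range(3)]
    hi = [max(c[i] for c in coords) for i in range(3)]

    counts = Counter()
    for _ in range(n_samples):
        # Assumption: ball centers are sampled uniformly from the bounding box.
        center = tuple(rng.uniform(lo[i], hi[i]) for i in range(3))
        inside = [props for pos, props in residues
                  if math.dist(pos, center) <= radius]
        # One count per template property, e.g. (num positive, num polar).
        key = tuple(sum(1 for props in inside if p in props) for p in template)
        counts[key] += 1
    return {key: c / n_samples for key, c in counts.items()}
```

Each histogram bin could then serve as one feature for a downstream classifier, which is one way to read the abstract's remark that the resulting features are interpretable in terms of spatial distributions of selected amino acids.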

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Studying the Functional Genomics of Stress Responses in Loblolly Pine With the Expresso Microarray Experiment Management System

Author: Aharoni
Alexandre
Alscher
Bard
Boris I. Chevone
Brachat
Brown
Callis
Chang
Chen
Cho
Chu
Claverie
Costa
Costa
Craig A. Struble
Cushman
Daniels
Dawei Chen
Degenhardt
Donahue
Dong
Dzeroski
Eisen
Epstein
Flach
Fraley
Gallant
Gang
Garofalakis
Gasch
Geisler
Gilchrest
Golub
Gracey
Greller
Hilsenbeck
Hong
Jain
Jelinsky
Jordan
Kannan
Kawasaki
Khan
Lavrac
Lazzeroni
Lee
Lenwood S. Heath
Leonel van Zyl
Lev-Yadun
May
Monni
Muggleton
Muggleton
Mullineaux
Naren Ramakrishnan
Perou
Reymond
Rial
Ronald R. Sederoff
Ross W. Whetten
Ruan
Ruth Grene
Scandalios
Schaffer
Schnaider
Seki
Sherlock
Shinozaki
Shinozaki
Smyth
Somerville
Srinivasan
Sullivan
Uno
Vapnik
Vincent Y. Jouenne
Wang
Wang
White
Wu
Yang
Zhu
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2002

Conception, design, and implementation of cDNA microarray experiments present a variety of bioinformatics challenges for biologists and computational scientists. The multiple stages of data acquisition and analysis have motivated the design of Expresso, a system for microarray experiment management. Salient aspects of Expresso include support for clone replication and randomized placement; automatic gridding, extraction of expression data from each spot, and quality monitoring; flexible methods of combining data from individual spots into information about clones and functional categories; and the use of inductive logic programming for higher-level data analysis and mining. The development of Expresso is occurring in parallel with several generations of microarray experiments aimed at elucidating genomic responses to drought stress in loblolly pine seedlings. The current experimental design incorporates 384 pine cDNAs replicated and randomly placed in two specific microarray layouts. We describe the design of Expresso as well as results of analysis with Expresso that suggest the importance of molecular chaperones and membrane transport proteins in mechanisms conferring successful adaptation to long-term drought stress.

epublications@Marquette

Crossref

Directory of Open Access Journals

PubMed Central

Compositional Mining of Multi-Relational Biological Datasets

Author: Jin Ying
Murali T.M.
Ramakrishnan Naren
Publication date: 01/01/2007

High-throughput biological screens are yielding ever-growing streams of information about multiple aspects of cellular activity. As more and more categories of datasets come online, there is a corresponding multitude of ways in which inferences can be chained across them, motivating the need for compositional data mining algorithms. In this paper, we argue that such compositional data mining can be effectively realized by functionally cascading redescription mining and biclustering algorithms as primitives. Both these primitives mirror shifts of vocabulary that can be composed in arbitrary ways to create rich chains of inferences. Given a relational database and its schema, we show how the schema can be automatically compiled into a compositional data mining program, and how different domains in the schema can be related through logical sequences of biclustering and redescription invocations. This feature allows us to rapidly prototype new data mining applications, yielding greater understanding of scientific datasets. We describe two applications of compositional data mining: (i) matching terms across categories of the Gene Ontology and (ii) understanding the molecular mechanisms underlying stress response in human cells.

Computer Science Technical Reports @Virginia Tech

CiteSeerX

Hierarchical Relational Learning

Author: Luís Filipe Cruz Queijo
Publication date: 02/12/2021

Repositório Aberto da Universidade do Porto

Compositional mining of multirelational biological datasets

Author: Agrawal R.
Ball C.
Bayardo R.
Benjamini Y.
Matzke M.
Michalski R.
Murali T.
Naren Ramakrishnan
Parida L.
T. M. Murali
Ying Jin
Zaki M.
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Naive Bayesian Classification of Structured Data

Author: Nicolas Lachiche
Peter A. Flach
Publication venue: 'Springer Science and Business Media LLC'

Crossref

An Extended Transformation Approach to Inductive Logic Programming

Author: Nada Lavrac
Peter A. Flach
Publication date: 01/01/2000

In this paper we show how this limitation can be overcome by systematic first-order feature construction using a particular individual-centered feature bias. The approach can be applied in any domain where there is a clear notion of individual. We also show how to improve upon exhaustive first-order feature construction by using a relevancy filter. The proposed approach is illustrated on the "trains" and "mutagenesis" ILP domains.

CiteSeerX

Crossref

Explore Bristol Research