17 research outputs found

A Novel Approach in Feature Selection Method for Text Document Classification

Author: S.W. Mohod, Dr. C.A. Dhote
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/07/2015

In this paper, a novel approach is proposed for extracting eminent features for a classifier. Instead of the traditional feature selection techniques used for text document classification, we introduce a new model based on the probability and overall class frequency of a term. We applied this new technique to extract features from training text documents and generate a training set for machine learning. This training set is used to automatically classify documents into their corresponding class labels and to improve classification accuracy. The results illustrate that the proposed feature selection method performs much better than traditional methods. DOI: 10.17762/ijritcc2321-8169.15075

International Journal on Recent and Innovation Trends in Computing and Communication

Chunking with Max-Margin Markov Networks

Author: Tang Buzhou, Wang Xiaolong, Wang Xuan
Publication venue: De La Salle University - Dasmarinas
Publication date: 01/01/2008

PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200

Waseda University Repository

Bayesian networks : a better than frequentist approach for parametrization, and a more accurate structural complexity measure than the number of parameters

Author: Gelly Sylvain, Teytaud Olivier
Publication venue: HAL CCSD
Publication date: 01/01/2005

We propose and justify a better-than-frequentist approach for Bayesian network (BN) parametrization, and propose a structural entropy term that quantifies the complexity of a BN more precisely than the number of parameters. Algorithms for BN learning are deduced.

INRIA a CCSD electronic archive server

HAL-Polytechnique

Active Learning of Continuous-time Bayesian Networks through Interventions

Author: Koeppl Heinz, Linzner Dominik
Publication date: 11/06/2021

We consider the problem of learning structures and parameters of Continuous-time Bayesian Networks (CTBNs) from time-course data under minimal experimental resources. In practice, the cost of generating experimental data poses a bottleneck, especially in the natural and social sciences. A popular approach to overcome this is Bayesian optimal experimental design (BOED). However, BOED becomes infeasible in high-dimensional settings, as it involves integration over all possible experimental outcomes. We propose a novel criterion for experimental design based on a variational approximation of the expected information gain. We show that for CTBNs, a semi-analytical expression for this criterion can be calculated for structure and parameter learning. By doing so, we can replace sampling over experimental outcomes by solving the CTBN master equation, for which scalable approximations exist. This alleviates the computational burden of sampling possible experimental outcomes in high dimensions. We employ this framework to recommend interventional sequences. In this context, we extend the CTBN model to conditional CTBNs in order to incorporate interventions. We demonstrate the performance of our criterion on synthetic and real-world data. Comment: Accepted at ICML202
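
For orientation, the expected information gain that BOED maximizes is commonly written as below. This is the standard textbook definition, not the variational criterion proposed in the paper; the symbols (design \xi, outcome y, unknown structure or parameters \theta) are notation assumed here for illustration.

```latex
\mathrm{EIG}(\xi)
  = \mathbb{E}_{p(y \mid \xi)}
    \left[ D_{\mathrm{KL}}\big( p(\theta \mid y, \xi) \,\|\, p(\theta) \big) \right]
  = \int p(y \mid \xi) \int p(\theta \mid y, \xi)
    \log \frac{p(\theta \mid y, \xi)}{p(\theta)} \, d\theta \, dy
```

The outer integral over y is the integration over all possible experimental outcomes that the abstract identifies as the bottleneck in high-dimensional settings.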

arXiv.org e-Print Archive

TUbiblio

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

B-splines in EMD and Graph Theory in Pattern Recognition

Author: Wu Qin
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2011

With the development of science and technology, a large amount of data is waiting for further scientific exploration. We can always build good mathematical models based on the given data to analyze and solve real-life problems. In this work, we propose three types of mathematical models for different applications. In chapter 1, we use B-spline-based Empirical Mode Decomposition (EMD) to analyze nonlinear and non-stationary signal data. A new idea about the boundary extension is introduced and applied to the EMD algorithm. Instead of the traditional mirror extension on the boundary, we propose a ratio extension on the boundary. In chapter 2, we propose a weighted directed multigraph for text pattern recognition. We set up a weighted directed multigraph model using the distances between the keywords as the weights of arcs. We then developed a keyword-frequency-distance-based algorithm which utilizes not only the frequency information of keywords but also their ordering information. In chapter 3, we propose a centrality-guided clustering method. Unlike traditional methods, which choose the center of a cluster randomly, we start clustering from a LEADER, a vertex with the highest centrality score, and a new member is added to an existing community if the new vertex meets some criteria and the new community with the new vertex maintains a certain density. In chapter 4, we define a new graph optimization problem called the postman tour with minimum route-pair cost. We model the DNA sequence assembly problem as the postman tour with minimum route-pair cost problem.
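
The weighted directed multigraph described for chapter 2 can be illustrated with a minimal sketch. The abstract does not give the construction, so everything here is an assumption: the function name `build_keyword_multigraph`, the whitespace tokenization, the use of token-position gaps as distances, and the optional `max_gap` cutoff are all hypothetical.

```python
from collections import defaultdict


def build_keyword_multigraph(tokens, keywords, max_gap=None):
    """Hypothetical sketch: arcs between keyword occurrences, weighted by distance.

    Returns a dict mapping (source, target) to a list of arc weights.
    Several weights under one key represent parallel arcs of the multigraph.
    """
    # Keep only the positions where a keyword occurs, in reading order.
    positions = [(i, t) for i, t in enumerate(tokens) if t in keywords]
    arcs = defaultdict(list)
    for a in range(len(positions)):
        i, u = positions[a]
        for b in range(a + 1, len(positions)):
            j, v = positions[b]
            gap = j - i  # assumed distance: number of token positions apart
            if max_gap is not None and gap > max_gap:
                break  # later occurrences are even farther away
            # Arc direction follows reading order, which preserves ordering
            # information; a repeated keyword yields a self-arc.
            arcs[(u, v)].append(gap)
    return dict(arcs)


tokens = "graph theory helps pattern recognition and graph clustering".split()
graph = build_keyword_multigraph(tokens, {"graph", "pattern", "clustering"})
print(graph[("graph", "clustering")])
```

The number of parallel arcs between two keywords reflects how often they co-occur, while the arc direction and weights record their order and spacing, which matches the frequency and ordering information the abstract mentions.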

The Research Repository @ WVU (West Virginia University)