
    On Horizontal and Vertical Separation in Hierarchical Text Classification

    Hierarchy is a common and effective way of organizing data and representing their relationships at different levels of abstraction. However, hierarchical data dependencies cause difficulties in the estimation of "separable" models that can distinguish between the entities in the hierarchy. Extracting separable models of hierarchical entities requires us to take their relative position into account and to consider the different types of dependencies in the hierarchy. In this paper, we present an investigation of the effect of separability in text-based entity classification and argue that in hierarchical classification, a separation property should be established between entities not only in the same layer, but also in different layers. Our main findings are as follows. First, we analyse the importance of separability of the data representation for classification and, based on that, introduce a "Strong Separation Principle" for optimizing the expected effectiveness of classifier decisions based on the separation property. Second, we present Hierarchical Significant Words Language Models (HSWLM), which capture all, and only, the essential features of hierarchical entities according to their relative position in the hierarchy, resulting in horizontally and vertically separable models. Third, we validate our claims on real-world data and demonstrate how HSWLM improves classification accuracy and how it provides models that are transferable over time. Although the discussion in this paper focuses on the classification problem, the models are applicable to any information access task on data that has, or can be mapped to, a hierarchical structure. Comment: Full paper (10 pages) accepted for publication in the proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR'16).
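    Below is a minimal, illustrative sketch of the vertical-separation idea: per-class unigram models keep only the terms that are more characteristic of the class than of the root (general) layer. It is not the authors' HSWLM estimation procedure; the maximum-likelihood models, the ratio threshold, and the toy corpus are assumptions made for illustration, and horizontal separation between sibling classes would be handled analogously.

from collections import Counter

def term_probs(texts):
    # Maximum-likelihood unigram model over a list of token lists.
    counts = Counter(t for text in texts for t in text)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def separable_class_model(class_docs, all_docs, general_ratio=1.0):
    # Keep only terms that are more likely in the class than at the root,
    # i.e. drop terms better explained by the general (ancestor) layer.
    p_class = term_probs(class_docs)
    p_root = term_probs(all_docs)
    return {t: p for t, p in p_class.items()
            if p > general_ratio * p_root.get(t, 0.0)}

docs = {
    "sports": [["match", "team", "goal"], ["team", "coach", "win"]],
    "politics": [["vote", "election", "team"], ["election", "policy"]],
}
all_docs = [d for ds in docs.values() for d in ds]
models = {c: separable_class_model(ds, all_docs) for c, ds in docs.items()}
print(models["sports"])    # surviving, class-characteristic terms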

    Feature extraction from electroencephalograms for Bayesian assessment of newborn brain maturity

    We explored feature extraction techniques for Bayesian assessment of the EEG maturity of newborns, given that the continuity of the EEG is the most important feature for assessing brain development. Continuity is associated with EEG “stationarity”, which we propose to evaluate with adaptive segmentation of the EEG into pseudo-stationary intervals. The histograms of these intervals are then used as new features for the assessment of EEG maturity. In our experiments, we used Bayesian model averaging over decision trees to differentiate two age groups, each of which included 110 EEG recordings. The use of the proposed EEG features has shown, on average, a 6% increase in the accuracy of age differentiation.
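    The following is a minimal sketch of the kind of histogram feature described above: a signal is adaptively split into pseudo-stationary intervals wherever the short-window variance changes sharply, and the normalised histogram of interval lengths becomes the feature vector. The variance-change criterion, window size and histogram bins are assumptions for the example, not the authors' exact segmentation procedure.

import numpy as np

def adaptive_segments(x, win=64, ratio=2.0):
    # Split x at points where the short-window variance changes sharply.
    bounds = [0]
    prev_var = np.var(x[:win]) + 1e-12
    for start in range(win, len(x) - win, win):
        var = np.var(x[start:start + win]) + 1e-12
        if var / prev_var > ratio or prev_var / var > ratio:
            bounds.append(start)
        prev_var = var
    bounds.append(len(x))
    return np.diff(bounds)            # lengths of pseudo-stationary intervals

def histogram_features(x, bins=(0, 128, 256, 512, 1024, np.inf)):
    lengths = adaptive_segments(x)
    hist, _ = np.histogram(lengths, bins=bins)
    return hist / max(hist.sum(), 1)  # normalised histogram as feature vector

rng = np.random.default_rng(0)
eeg = np.concatenate([rng.normal(0, 1, 800), rng.normal(0, 4, 800)])
print(histogram_features(eeg))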

    Improving feature selection algorithms using normalised feature histograms

    The proposed feature selection method builds a histogram of the most stable features from random subsets of a training set and ranks the features using classifier-based cross-validation. This approach reduces the instability of features obtained by conventional feature selection methods that occurs with variation in the training data and selection criteria. Classification results on four microarray and three image datasets, using three major feature selection criteria and a naive Bayes classifier, show considerable improvement over benchmark results.
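    A minimal sketch of such a feature histogram is shown below: each random subset of the training data votes for its top-k features under a relevance criterion, and the normalised vote counts rank the features. The ANOVA F-score criterion, subset fraction and k are assumptions for the example rather than the paper's exact algorithm, and the final classifier-based cross-validation step is omitted.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import f_classif

def feature_histogram(X, y, n_subsets=50, subset_frac=0.7, k=10, seed=0):
    rng = np.random.default_rng(seed)
    votes = np.zeros(X.shape[1])
    n = int(subset_frac * len(y))
    for _ in range(n_subsets):
        idx = rng.choice(len(y), size=n, replace=False)
        scores, _ = f_classif(X[idx], y[idx])   # per-feature relevance scores
        votes[np.argsort(scores)[-k:]] += 1     # the top-k features get a vote
    return votes / n_subsets                    # normalised feature histogram

X, y = make_classification(n_samples=300, n_features=40, n_informative=5,
                           random_state=0)
hist = feature_histogram(X, y)
print(np.argsort(hist)[::-1][:10])              # most stable features across subsets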

    Effect of dietary supplement of sugar beet, neem leaf, linseed and coriander on growth performance and carcass trait of Vanaraja chicken

    Aim: This study was planned to investigate the effect of sugar beet, neem leaf, linseed and coriander on growth parameters such as feed intake, body weight gain, feed conversion ratio (FCR), performance index (PI), and carcass characteristics in broiler birds. Materials and Methods: The experiment was conducted for a period of 42 days on the Vanaraja strain of broiler birds. Different dietary supplements, namely sugar beet meal, neem leaf meal, linseed meal and coriander seed meal, were used in the basal diet. All 150 day-old male chicks were individually weighed and distributed into five groups of 30 birds each. Each group was further sub-divided into triplicates of 10 birds each. Group T1 served as the control and groups T2, T3, T4 and T5 as treatment groups. Birds in the T1 group were fed the basal ration only, whereas the T2, T3, T4 and T5 groups were fed the basal ration mixed with 2.5% sugar beet meal, neem leaf meal, linseed meal, and coriander seed meal, respectively. Results: Broilers supplemented with herbs/spices showed improvement in growth attributes and carcass characteristics. Broilers fed herbs at the rate of 2.5% had higher feed intake, except for the sugar beet and coriander seed meal fed groups. Body weight and weight gain were also significantly (p<0.05) higher than in the control. Both FCR and PI were improved in the supplemented groups in comparison to the control. Dressing percentage was not significantly (p>0.05) affected. The average giblet percentage of all supplemented groups was significantly (p<0.05) higher than the control and was highest in the neem leaf meal fed group. The average by-product percentage was highest in the linseed fed group. Conclusion: The various herbs, i.e. sugar beet, neem leaf, linseed and coriander seed meals, affected growth performance, and carcass traits showed a positive inclination toward the supplemented groups in broilers. The exact mode of action of these herbs/spices is still not clear; however, one or more of the active compounds present in these supplements may be responsible.
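    For reference, the two growth metrics named above can be computed as in the minimal sketch below. FCR is feed intake divided by weight gain; the performance index formula used here (weight gain divided by FCR) and all numbers are assumptions for illustration, not the paper's definitions or data.

def feed_conversion_ratio(feed_intake_g, weight_gain_g):
    # FCR: grams of feed consumed per gram of body weight gained.
    return feed_intake_g / weight_gain_g

def performance_index(weight_gain_g, fcr):
    # Hypothetical PI: higher gain at a lower FCR gives a higher index.
    return weight_gain_g / fcr

feed_intake, weight_gain = 3200.0, 1500.0   # hypothetical 42-day group means (g)
fcr = feed_conversion_ratio(feed_intake, weight_gain)
print(f"FCR = {fcr:.2f}, PI = {performance_index(weight_gain, fcr):.1f}")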

    Feature learning in feature-sample networks using multi-objective optimization

    Data and knowledge representation are fundamental concepts in machine learning. The quality of the representation directly impacts the performance of the learning model. Feature learning transforms or enhances raw data into structures that can be effectively exploited by those models. In recent years, several works have used complex networks for data representation and analysis. However, no feature learning method has been proposed for this category of techniques. Here, we present an unsupervised feature learning mechanism that works on datasets with binary features. First, the dataset is mapped into a feature-sample network. Then, a multi-objective optimization process selects a set of new vertices to produce an enhanced version of the network. The new features depend on a nonlinear function of a combination of preexisting features. Effectively, the process projects the input data into a higher-dimensional space. To solve the optimization problem, we design two metaheuristics based on the lexicographic genetic algorithm and the improved strength Pareto evolutionary algorithm (SPEA2). We show that the enhanced network contains more information and can be exploited to improve the performance of machine learning methods. The advantages and disadvantages of each optimization strategy are discussed. Comment: 7 pages, 4 figures.
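    The sketch below illustrates the flavour of the enhancement step: new binary features are generated as nonlinear combinations (AND/OR/XOR) of existing features and the most useful candidates are kept. The mutual-information score and greedy selection stand in for the paper's feature-sample network construction and multi-objective GA/SPEA2 search, and are assumptions made for the example.

import numpy as np
from itertools import combinations
from sklearn.metrics import mutual_info_score

def candidate_features(X):
    # Yield new binary columns that combine pairs of existing features.
    for i, j in combinations(range(X.shape[1]), 2):
        yield ("AND", i, j), X[:, i] & X[:, j]
        yield ("OR", i, j), X[:, i] | X[:, j]
        yield ("XOR", i, j), X[:, i] ^ X[:, j]

def enhance(X, y, n_new=5):
    # Greedily keep the candidates most informative about the labels.
    scored = sorted(candidate_features(X),
                    key=lambda c: mutual_info_score(y, c[1]), reverse=True)
    names, cols = zip(*scored[:n_new])
    return names, np.column_stack([X] + list(cols))

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6))
y = X[:, 0] ^ X[:, 1]                # the label depends on an XOR interaction
names, X_new = enhance(X, y)
print(names[0], X_new.shape)         # the XOR(0, 1) feature should rank first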

    Feature-Aware Verification

    A software product line is a set of software products that are distinguished in terms of features (i.e., end-user-visible units of behavior). Feature interactions (situations in which the combination of features leads to emergent and possibly critical behavior) are a major source of failures in software product lines. We explore how feature-aware verification can improve the automatic detection of feature interactions in software product lines. Feature-aware verification uses product-line verification techniques and supports the specification of feature properties along with the features in separate and composable units. It integrates the technique of variability encoding to verify a product line without generating and checking a possibly exponential number of feature combinations. We developed the tool suite SPLverifier for feature-aware verification, which is based on standard model-checking technology. We applied it to an e-mail system that incorporates domain knowledge of AT&T. We found that feature interactions can be detected automatically based on specifications that have only feature-local knowledge, and that variability encoding significantly improves verification performance when proving the absence of interactions. Comment: 12 pages, 9 figures, 1 table.
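    As a loose illustration of the idea behind variability encoding, the sketch below turns features into ordinary propositional variables so that a single solver query ranges over all configurations instead of checking each product separately. It uses the z3-solver package and a toy e-mail-style property; the feature model and the confidentiality property are assumptions made for illustration, and this is not how SPLverifier works internally.

from z3 import Bools, Solver, Implies, Not, sat

encrypt, forward, verify = Bools("ENCRYPT FORWARD VERIFY")

# Toy feature model: VERIFY requires ENCRYPT.
feature_model = Implies(verify, encrypt)

# Toy interaction property: forwarded mail stays confidential only if
# ENCRYPT is enabled whenever FORWARD is enabled.
confidentiality = Implies(forward, encrypt)

s = Solver()
s.add(feature_model, Not(confidentiality))   # search for a violating configuration
if s.check() == sat:
    print("feature interaction found:", s.model())
else:
    print("property holds in every product")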