
13,564 research outputs found

Recommended from our members

Hierarchical classification for multiple, distributed web databases

Authors: Yang Hui; Zhang Minjie
Publication date: 01/01/2004

The proliferation of online information resources increases the importance of effective and efficient distributed searching. Our research aims to provide an alternative hierarchical categorization and search capability based on a Bayesian network learning algorithm. Our proposed approach, which is grounded in automatic textual analysis of the subject content of online web databases, attempts to address the database selection problem by first classifying web databases into a hierarchy of topic categories. The experimental results reported demonstrate that such a classification approach not only effectively reduces the class search space, but also helps to significantly improve classification accuracy.

Open Research Online (The Open University)

White Rose Research Online

Discriminating word senses with tourist walks in complex networks

Authors: Amancio Diego R.; Silva Thiago C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 17/06/2013

Patterns of topological arrangement are widely used by both animal and human brains in the learning process. Nevertheless, automatic learning techniques frequently overlook these patterns. In this paper, we apply a learning technique based on the structural organization of the data in the attribute space to the problem of discriminating the senses of 10 polysemous words. Using two types of characterization of meanings, namely semantic and topological approaches, we have observed significant accuracy rates in identifying the suitable meanings with both techniques. Most importantly, we have found that the characterization based on the deterministic tourist walk improves disambiguation compared with the discrimination achieved by traditional complex network measurements such as assortativity and clustering coefficient. To our knowledge, this is the first time that such a deterministic walk has been applied to this kind of problem. Therefore, our finding suggests that the tourist walk characterization may be useful in other related applications.

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Rationale in Development Chat Messages: An Exploratory Study

Authors: Alkadhi Rana; Bruegge Bernd; Guzman Emitza; Lata Teodora
Publication date: 27/04/2017

Chat messages of development teams play an increasingly significant role in software development, having replaced emails in some cases. Chat messages contain information about discussed issues, considered alternatives and argumentation leading to the decisions made during software development. These elements, defined as rationale, are invaluable during software evolution for documenting and reusing development knowledge. Rationale is also essential for coping with changes and for effective maintenance of the software system. However, exploiting the rationale hidden in the chat messages is challenging due to the high volume of unstructured messages covering a wide range of topics. This work presents the results of an exploratory study examining the frequency of rationale in chat messages, the completeness of the available rationale and the potential of automatic techniques for rationale extraction. For this purpose, we apply content analysis and machine learning techniques to more than 8,700 chat messages from three software development projects. Our results show that chat messages are a rich source of rationale and that machine learning is a promising technique for detecting rationale and identifying different rationale elements.

Comment: 11 pages, 6 figures. The 14th International Conference on Mining Software Repositories (MSR'17)

arXiv.org e-Print Archive

Crossref

Feature extraction and classification of spam emails

Authors: Hassan Muhammad Ali; Mtetwa Nhamo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 02/05/2019

Crossref

ResearchOnline@GCU

Naive bayes multi-label classification approach for high-voltage condition monitoring

Authors: Boreham Philip; Mitiche Imene; Morison Gordon; Nesbitt Alan; Stewart Brian G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/01/2019

This paper addresses, for the first time, the multi-label classification of High-Voltage (HV) discharges captured using the Electromagnetic Interference (EMI) method for HV machines. The approach involves feature extraction from EMI time signals, emitted during the discharge events, by means of 1D-Local Binary Pattern (LBP) and 1D-Histogram of Oriented Gradients (HOG) techniques. Their combination provides a feature vector that is passed to a naive Bayes classifier designed to identify the labels of two or more discharge sources contained within a single signal. The performance of this novel approach is measured using various metrics, including average precision, accuracy, specificity and Hamming loss. Results demonstrate successful performance that is in line with similar applications in other fields such as biology and image processing. This first attempt at multi-label classification of EMI discharge sources opens a new research topic in HV condition monitoring.

Crossref

University of Strathclyde Institutional Repository

ResearchOnline@GCU