21,228 research outputs found

    Classifiers With a Reject Option for Early Time-Series Classification

    Early classification of time-series data in a dynamic environment is a challenging problem of great importance in signal processing. This paper proposes a classifier architecture with a reject option that can make decisions online, without waiting for the entire time-series signal to be present. The main idea is to classify an odor/gas signal with acceptable accuracy as early as possible. Instead of using the posterior probability of a classifier, the proposed method uses the "agreement" of an ensemble to decide whether to accept or reject the candidate label. The introduced algorithm is applied to the biochemistry problem of odor classification to build a novel electronic nose called Forefront-Nose. Experimental results from a wind tunnel test-bed facility confirm the robustness of the Forefront-Nose compared to standard classifiers from both earliness and recognition perspectives.
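
The agreement-based reject option described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function `classify_early`, the toy `threshold_member` classifiers, the `min_agreement` parameter, and the labels `gas_A`/`gas_B` are all hypothetical names chosen for illustration. The sketch assumes each ensemble member can label a signal prefix, and that a label is accepted once a sufficient fraction of members agree.

```python
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical type: an ensemble member maps a signal prefix to a label.
Classifier = Callable[[Sequence[float]], str]


def classify_early(
    signal: Sequence[float],
    ensemble: Sequence[Classifier],
    min_agreement: float = 0.8,
) -> Tuple[Optional[str], int]:
    """Return (label, samples_used); label is None if every prefix was rejected."""
    for t in range(1, len(signal) + 1):
        prefix = signal[:t]
        votes = Counter(member(prefix) for member in ensemble)
        label, count = votes.most_common(1)[0]
        # Accept only when enough members agree; otherwise reject and wait
        # for the next sample.
        if count / len(ensemble) >= min_agreement:
            return label, t
    return None, len(signal)


def threshold_member(threshold: float) -> Classifier:
    """Toy member (assumption): labels a prefix by comparing its mean to a threshold."""
    def predict(prefix: Sequence[float]) -> str:
        mean = sum(prefix) / len(prefix)
        return "gas_A" if mean > threshold else "gas_B"
    return predict


ensemble = [threshold_member(th) for th in (0.3, 0.4, 0.5, 0.6, 0.7)]
signal = [0.5, 0.55, 0.8, 0.9, 0.95]

label, used = classify_early(signal, ensemble, min_agreement=0.8)
print(label, used)  # gas_A 3
```

With this toy data the first two prefixes are rejected because only three of five members agree, and the label is accepted at the third sample. Raising `min_agreement` trades earliness for confidence: at 1.0 the same signal is only accepted after all five samples.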

    Modern Approaches to Uncertain Database Exploration from Categorizing Data to Advanced Mining Solutions

    In today's digitized era, the ubiquity of data from diverse sources has introduced unique challenges in database management, notably the issue of data uncertainty. Uncertainty in databases can arise from various factors: sensor inaccuracies, human input errors, or inherent vagueness in data interpretation. Addressing these challenges, this research delves into modern approaches to uncertain database exploration. The paper begins by exploring methods for categorizing data based on certainty levels, emphasizing why distinguishing between certain and uncertain data matters and the mechanisms for doing so. The discussion then turns to mining solutions that enhance the utility of uncertain databases. By integrating state-of-the-art techniques with traditional database management principles, this study aims to bolster the reliability, efficiency, and versatility of data mining in uncertain contexts. The implications of these methods, both theoretical and in real-world applications, hold the potential to redefine how uncertain data is perceived and utilized in diverse sectors, from healthcare to finance.

    Learning-based Rule-Extraction from Support Vector Machines

    In recent years, support vector machines (SVMs) have shown good performance in a number of application areas, including text classification. However, the success of SVMs comes at a cost: an inability to explain the process by which a learning result was reached and why a decision is being made. Rule-extraction from SVMs is important for the acceptance of this machine learning technology, especially for applications such as medical diagnosis, where it is crucial for users to understand how the system makes a decision. In this paper, a novel approach for rule-extraction from support vector machines is presented. This approach handles rule-extraction as a learning task, which proceeds in two steps. The first step is to use the labeled patterns from a data set to train an SVM. The second step is to use the generated model to predict the label (class) for an extended data set or a different, unlabeled data set. The resulting patterns are then used to train a decision tree learning system and to extract the corresponding rule sets. The output rule sets are verified against available knowledge for the domain problem (e.g., from a medical expert) and against other classification techniques to assure the correctness and validity of the rules.
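
The pipeline in this abstract (let a trained model label unlabeled data, then learn an interpretable tree from those labels) can be sketched roughly as follows. This is an illustration under strong assumptions, not the paper's method: `black_box` is a fixed linear decision function standing in for a trained SVM, `fit_stump` is a hypothetical depth-1 decision tree learner written here for self-containment, and the grid of points is made-up unlabeled data.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def black_box(x: Point) -> int:
    """Stand-in for a trained SVM (assumption): a fixed linear decision function."""
    return 1 if 2.0 * x[0] + 0.1 * x[1] - 1.0 > 0 else 0


def fit_stump(points: List[Point], labels: List[int]) -> Tuple[int, float, int, int]:
    """Fit a depth-1 decision tree: (feature, threshold, left_label, right_label)."""
    best = None
    for f in range(len(points[0])):
        values = sorted({p[f] for p in points})
        # Candidate thresholds are midpoints between consecutive feature values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] <= thr]
            right = [y for p, y in zip(points, labels) if p[f] > thr]
            # Each side predicts its majority label (ties go to class 1).
            left_label = int(sum(left) * 2 >= len(left))
            right_label = int(sum(right) * 2 >= len(right))
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, thr, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    _, f, thr, left_label, right_label = best
    return f, thr, left_label, right_label


# Step 1 (assumed already done): black_box plays the role of the trained SVM.
# Step 2: label an unlabeled data set with the model's predictions.
points = [(i / 10, j) for i in range(11) for j in (0.0, 1.0)]
labels = [black_box(p) for p in points]

# Step 3: learn an interpretable model from the model-labeled patterns.
f, thr, left_label, right_label = fit_stump(points, labels)
rule = f"IF x{f} <= {thr:.2f} THEN class {left_label} ELSE class {right_label}"

# Fidelity: how often the extracted rule reproduces the black box's output.
agree = sum((left_label if p[f] <= thr else right_label) == y
            for p, y in zip(points, labels))
fidelity = agree / len(points)
print(rule)
```

A single split cannot reproduce the black box exactly here, so the fidelity is slightly below 1; a deeper tree would yield more rules and higher fidelity. The verification against domain knowledge that the abstract mentions is not modeled in this sketch.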

    Named Entity Recognition in Twitter using Images and Text

    Named Entity Recognition (NER) is an important subtask of information extraction that seeks to locate and recognise named entities. Despite recent achievements, we still face limitations in correctly detecting and classifying entities, particularly in short and noisy text such as Twitter posts. An important drawback of most NER approaches is their high dependency on hand-crafted features and domain-specific knowledge, which are necessary to achieve state-of-the-art results. Thus, devising models to deal with such linguistically complex contexts is still challenging. In this paper, we propose a novel multi-level architecture that does not rely on any specific linguistic resource or encoded rule. Unlike traditional approaches, we use features extracted from images and text to classify named entities. Experimental tests against state-of-the-art NER for Twitter on the Ritter dataset present competitive results (0.59 F-measure), indicating that this approach may lead towards better NER models.
    Comment: The 3rd International Workshop on Natural Language Processing for Informal Text (NLPIT 2017), 8 pages