
    Hybrid Approaches for Classification Under Information Acquisition Cost Constraint

    The practical use of classification systems may be limited because current classification systems do not allow decision makers to incorporate a cost constraint. For example, in several financial applications (loan approval, credit scoring, etc.) an applicant is asked to submit a processing fee with the application (Mookerjee and Mannino 1997). The processing fee may be used to validate the information entered in the application. From an economic standpoint, it is important that the cost of validating the information not exceed the processing fee. Traditional classification systems do not allow the decision maker to incorporate an information acquisition cost constraint. We term the problem of designing a classification system in which information acquisition costs are considered the problem of classification with information acquisition cost constraint (CIACC). The CIACC problem is NP-hard and is very difficult to solve to optimality.
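    To make the budget idea concrete, here is a minimal illustrative sketch (not the paper's algorithm, which solves the NP-hard problem more carefully): a greedy heuristic that selects which pieces of applicant information to validate, ranked by estimated benefit per unit cost, without exceeding the processing-fee budget. The benefit values, costs, and budget below are hypothetical.

```python
# Hedged sketch: greedy knapsack-style selection of information items to
# validate under a total acquisition-cost budget (illustrative only).

def select_features(values, costs, budget):
    """values[i]: estimated benefit of validating item i;
    costs[i]: its acquisition cost; budget: total fee available."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / costs[i], reverse=True)
    chosen, spent = [], 0.0
    for i in order:
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return chosen, spent

# Toy example: three checks with benefits 5, 4, 3 and costs 4, 2, 3,
# and a processing fee (budget) of 5.
picked, cost = select_features([5.0, 4.0, 3.0], [4.0, 2.0, 3.0], 5.0)
```

A greedy ratio rule like this is only an approximation; the abstract's point is precisely that the exact CIACC problem is NP-hard.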

    Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission

    By deploying machine-learning algorithms at the network edge, edge learning can leverage the enormous real-time data generated by billions of mobile devices to train AI models, which enable intelligent mobile applications. In this emerging research area, one key direction is to efficiently utilize radio resources for wireless data acquisition so as to minimize the latency of executing a learning task at an edge server. Along this direction, we consider the specific problem of the retransmission decision in each communication round, which must ensure both the reliability and the quantity of the training data so as to accelerate model convergence. To solve the problem, a new retransmission protocol called data-importance aware automatic-repeat-request (importance ARQ) is proposed. Unlike classic ARQ, which focuses merely on reliability, importance ARQ selectively retransmits a data sample based on its uncertainty, which helps learning and can be measured using the model under training. Underpinning the proposed protocol is an elegant communication-learning relation derived between two corresponding metrics, i.e., signal-to-noise ratio (SNR) and data uncertainty. This relation facilitates the design of a simple threshold-based policy for importance ARQ. The policy is first derived for the classic classifier model of the support vector machine (SVM), where the uncertainty of a data sample is measured by its distance to the decision boundary. The policy is then extended to the more complex model of convolutional neural networks (CNNs), where data uncertainty is measured by entropy. Extensive experiments have been conducted for both the SVM and CNN using real datasets with balanced and imbalanced distributions.
    Experimental results demonstrate that importance ARQ effectively copes with channel fading and noise in wireless data acquisition to achieve faster model convergence than the conventional channel-aware ARQ.
    Comment: This is an updated version: 1) extension to general classifiers; 2) consideration of imbalanced classification in the experiments. Submitted to IEEE Journal for possible publication.

    A Machine learning approach to POS tagging

    We have applied inductive learning of statistical decision trees and relaxation labelling to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (Part-of-Speech Tagging). The learning process is supervised and obtains a language model oriented to resolving POS ambiguities. This model consists of a set of statistical decision trees expressing the distribution of tags and words in some relevant contexts. The acquired language models are complete enough to be used directly as sets of POS disambiguation rules, and include more complex contextual information than the simple collections of n-grams usually used in statistical taggers. We have implemented a quite simple and fast tagger that has been tested and evaluated on the Wall Street Journal (WSJ) corpus with remarkable accuracy. However, better results can be obtained by translating the trees into rules to feed a flexible tagger based on relaxation labelling. In this direction we describe a tagger which is able to use information of any kind (n-grams, automatically acquired constraints, linguistically motivated manually written constraints, etc.), and in particular to incorporate the machine-learned decision trees. Simultaneously, we address the problem of tagging when only a small amount of training material is available, which is crucial in any process of constructing an annotated corpus from scratch. We show that quite high accuracy can be achieved with our system in this situation.
    Postprint (published version)
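    The core mechanism — a decision tree that asks questions about the context to resolve a word's POS ambiguity — can be illustrated with a toy sketch. The ambiguity classes, tags, and context questions below are invented for illustration; the paper's trees are learned statistically from the WSJ corpus, not hand-written.

```python
# Toy sketch (not the paper's system): resolve a POS ambiguity with a
# hand-written decision "tree" over the previous tag, mimicking the idea of
# tag distributions conditioned on relevant contexts.

def disambiguate(word, prev_tag, ambiguity_classes):
    """Pick a tag for `word` given the previous tag. `ambiguity_classes`
    maps an ambiguous word to its possible tags."""
    tags = ambiguity_classes.get(word, ['NN'])
    if len(tags) == 1:
        return tags[0]
    # Decision-tree questions on the left context:
    if prev_tag == 'DT' and 'NN' in tags:            # after a determiner -> noun
        return 'NN'
    if prev_tag in ('PRP', 'NN') and 'MD' in tags:   # after a subject -> modal
        return 'MD'
    return tags[0]

# Hypothetical ambiguity class: "can" may be a modal (MD) or a noun (NN).
classes = {'can': ['MD', 'NN']}
# "I can ..." -> modal; "the can ..." -> noun
```

A learned tree plays the same role, but its questions and leaf tag distributions are induced from annotated data rather than written by hand.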

    Synergy-Based Hand Pose Sensing: Optimal Glove Design

    In this paper we study the problem of improving the performance of human hand pose sensing devices by exploiting knowledge of how humans most frequently use their hands in grasping tasks. In a companion paper we studied the problem of maximizing the reconstruction accuracy of the hand pose from partial and noisy data provided by any given pose sensing device (a sensorized "glove"), taking into account statistical a priori information. In this paper we consider the dual problem of how to design pose sensing devices, i.e. how and where to place sensors on a glove, to get maximum information about the actual hand posture. We study the continuous case, in which individual sensing elements in the glove measure a linear combination of joint angles; the discrete case, in which each measurement corresponds to a single joint angle; and the most general hybrid case, in which both continuous and discrete sensing elements are available. The objective is to provide, for given a priori information and a fixed number of measurements, the optimal design minimizing the reconstruction error on average. Solutions relying on the geometrical synergy definition as well as gradient flow-based techniques are provided. Simulations of reconstruction performance show the effectiveness of the proposed optimal design.
    Comment: Submitted to International Journal of Robotics Research 201
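    The discrete case admits a simple intuition that can be sketched in a few lines. Assuming a Gaussian prior over joint angles with a diagonal covariance (a simplification; the paper works with full synergy covariances and gradient flows), each noiseless single-joint measurement removes that joint's prior variance from the average reconstruction error, so a natural heuristic is to sensorize the highest-variance joints. The variances below are toy numbers.

```python
# Hedged sketch of the discrete sensor-placement problem: pick k joints to
# measure so as to minimize the residual average reconstruction error,
# assuming independent (diagonal-covariance) Gaussian priors per joint.

def choose_joints(prior_variances, k):
    """Return the indices of the k highest-variance joints to sensorize."""
    order = sorted(range(len(prior_variances)),
                   key=lambda i: prior_variances[i], reverse=True)
    return sorted(order[:k])

def expected_error(prior_variances, measured):
    # Residual MSE: prior variance left on the unmeasured joints.
    return sum(v for i, v in enumerate(prior_variances) if i not in measured)

variances = [0.9, 0.1, 0.5, 0.3]                   # toy prior, e.g. from grasp data
picked = choose_joints(variances, 2)               # -> [0, 2]
residual = expected_error(variances, set(picked))  # -> 0.4
```

With correlated synergies the picture changes: a single continuous sensor measuring a well-chosen linear combination of joints can be worth more than any individual joint, which is exactly why the paper treats the continuous and hybrid cases separately.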

    Towards Cancer Hybrid Automata

    This paper introduces Cancer Hybrid Automata (CHAs), a formalism to model the progression of cancers through discrete phenotypes. The classification of cancer progression using discrete states like stages and hallmarks has become common in the biology literature, but primarily as an organizing principle, and not as an executable formalism. The precise computational model developed here aims to exploit this untapped potential, namely, through automatic verification of progression models (e.g., consistency, causal connections, etc.), classification of unreachable or unstable states, and computer-generated (individualized or universal) therapy plans. The paper builds on a phenomenological approach, and as such does not need to assume a model for the biochemistry of the underlying natural progression. Rather, it abstractly models transition timings between states as well as the effects of drugs and clinical tests, and thus allows formalization of temporal statements about the progression as well as notions of timed therapies. The model proposed here is ultimately based on hybrid automata, and we show how existing controller synthesis algorithms can be generalized to CHA models, so that therapies can be generated automatically. Throughout this paper we use cancer hallmarks to represent the discrete states through which cancer progresses, but other notions of discretely or continuously varying state formalisms could also be used to derive similar therapies.
    Comment: In Proceedings HSB 2012, arXiv:1208.315
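    The executable flavor of the formalism can be illustrated with a toy discrete skeleton of such an automaton. The states, drugs, and transitions below are entirely hypothetical (the paper's CHAs are timed hybrid automata with clocks and controller synthesis, not this reachability check): progression edges are labeled with the drug, if any, that blocks them, and a therapy is evaluated by which phenotypes remain reachable.

```python
# Toy sketch of a discrete progression model in the spirit of a CHA.
# All state and drug names are hypothetical illustrations.

class ProgressionModel:
    def __init__(self):
        # Hallmark-style discrete states; each edge maps to the drug
        # that disables it (None = no known blocker).
        self.transitions = {
            ('normal', 'proliferative'): None,
            ('proliferative', 'angiogenic'): 'drugA',
            ('angiogenic', 'metastatic'): 'drugB',
        }

    def reachable(self, start, therapy):
        """States reachable from `start` given the set of administered drugs."""
        seen, frontier = {start}, [start]
        while frontier:
            s = frontier.pop()
            for (a, b), blocker in self.transitions.items():
                if a == s and b not in seen and blocker not in therapy:
                    seen.add(b)
                    frontier.append(b)
        return seen

m = ProgressionModel()
# Administering drugA blocks progression past the proliferative state.
```

Controller synthesis on a full CHA additionally reasons about transition timings and test observations, so a generated therapy prescribes not just which drugs but when to apply them.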