12 research outputs found

    Regularising Non-linear Models Using Feature Side-information

    Very often features come with their own vectorial descriptions which provide detailed information about their properties. We refer to these vectorial descriptions as feature side-information. In the standard learning scenario, input is represented as a vector of features and the feature side-information is most often ignored or used only for feature selection prior to model fitting. We believe that feature side-information, which carries information about the features' intrinsic properties, will help improve model prediction if used properly during the learning process. In this paper, we propose a framework that allows for the incorporation of the feature side-information during the learning of very general model families to improve the prediction performance. We control the structures of the learned models so that they reflect feature similarities as defined by the side-information. We perform experiments on a number of benchmark datasets which show significant predictive performance gains, over a number of baselines, as a result of the exploitation of the side-information. Comment: 11 page with appendi
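
    The idea of making a model reflect feature similarities can be illustrated with a much simpler device than the one the abstract describes. The sketch below is not the paper's framework: it assumes a linear model, a Gaussian similarity over side-information vectors, and a pairwise penalty on weight differences. The names `similarity`, `similarity_penalty` and `bandwidth` are hypothetical.

```python
import math


def similarity(z_a, z_b, bandwidth=1.0):
    """Gaussian similarity between two side-information vectors (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_a, z_b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def similarity_penalty(weights, side_info, bandwidth=1.0):
    """Sum over feature pairs of S_ij * (w_i - w_j)^2.

    Features with similar side-information are pushed towards similar
    weights; dissimilar features are left almost unconstrained.
    """
    total = 0.0
    n = len(weights)
    for i in range(n):
        for j in range(i + 1, n):
            s = similarity(side_info[i], side_info[j], bandwidth)
            total += s * (weights[i] - weights[j]) ** 2
    return total


# Two features with identical side-information but different weights
# are penalised; a third, very different feature contributes almost nothing.
penalty = similarity_penalty(
    weights=[1.0, 3.0, 10.0],
    side_info=[[0.0, 0.0], [0.0, 0.0], [50.0, 50.0]],
)
```

    Under these assumptions the penalty would be added to the training loss with a tunable coefficient; extending the idea to non-linear models is the subject of the paper itself.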

    KLOSURE: Closing in on open–ended patient questionnaires with text mining

    Background: Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients' perceptions about their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients are asked to self-assess five outcomes: pain, other symptoms, activities of daily living, sport and recreation activities, and quality of life. We developed KLOG as a 10-item open-ended version of the KOOS questionnaire in an attempt to obtain deeper insight into patients' opinions including their unmet needs. However, the open-ended nature of the questionnaire incurs analytical overhead associated with the interpretation of responses. The goal of this study was to automate such analysis. We implemented KLOSURE as a system for mining free-text responses to the KLOG questionnaire. It consists of two subsystems, one concerned with feature extraction and the other one concerned with classification of feature vectors. Feature extraction is performed by a set of four modules whose main functionalities are linguistic pre-processing, sentiment analysis, named entity recognition and lexicon lookup respectively. Outputs produced by each module are combined into feature vectors. The structure of feature vectors will vary across the KLOG questions. Finally, Weka, a machine learning workbench, was used for classification of feature vectors. Results: The precision of the system varied between 62.8% and 95.3%, whereas the recall varied from 58.3% to 87.6% across the 10 questions. The overall performance in terms of F-measure varied between 59.0% and 91.3% with an average of 74.4% and a standard deviation of 8.8. Conclusions: We demonstrated the feasibility of mining open-ended patient questionnaires. By automatically mapping free text answers onto a Likert scale, we can effectively measure the progress of rehabilitation over time. In comparison to traditional closed-ended questionnaires, our approach offers much richer information that can be utilised to support clinical decision making. In conclusion, we demonstrated how text mining can be used to combine the benefits of qualitative and quantitative analysis of patient experiences
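
    For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall and the balanced F-measure are computed from classification counts. The function name and the counts are made up for illustration and are not taken from the study.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Return (precision, recall, F1) from raw counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a single questionnaire item.
p, r, f = precision_recall_f1(true_pos=8, false_pos=2, false_neg=4)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

    The F-measure is the harmonic mean of precision and recall, so it always lies between the two values, as it does in the ranges quoted above.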

    Closing in on open-ended patient questionnaires with text mining

    Knee injury and Osteoarthritis Outcome Score (KOOS) is an instrument used to quantify patients' perceptions about their knee condition and associated problems. It is administered as a 42-item closed-ended questionnaire in which patients are asked to self-assess five outcomes: pain, other symptoms, activities of daily living, sport and recreation activities, and quality of life. We developed KLOG as a 10-item open-ended version of the KOOS questionnaire in an attempt to obtain deeper insight into patients' opinions including their unmet needs. However, the open-ended nature of the questionnaire incurs analytical overhead associated with the interpretation of responses. The goal of this study was to automate such analysis. To that end, we implemented KLOSURE as a system for mining free-text responses to the KLOG questionnaire. The precision of the system varied between 64.8% and 95.3%, whereas the recall varied from 61.3% to 87.8% across the 10 questions

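    Feature selection by integrating domain knowledge through decision-making methods in predictive modelling of time series with supervised machine learning (Izbor atributa integracijom znanja o domenu primenom metoda odlučivanja kod prediktivnog modelovanja vremenskih serija nadgledanim mašinskim učenjem)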

    The aim of the research presented in this doctoral dissertation is to develop a feature selection methodology that integrates domain-specific knowledge by applying mathematical methods of decision-making, in order to improve the feature selection process and the precision of supervised machine learning methods for predictive modelling of time series. To integrate domain-specific knowledge, a multi-criteria decision-making method is used, namely the analytic hierarchy process, which has proven successful in numerous studies to date. This approach was selected because it allows a set of factors to be selected based on their relevance, even when the criteria conflict with one another. For predicting the movement of time series, the possibility of integrating feature relevance into support vector machines to improve their prediction accuracy was studied. The proposed methodology was applied as a feature selection method for the predictive modelling of the movement of financial time series. Unlike existing approaches, where the feature selection method is based on a quantitative analysis of the input values, the proposed methodology carries out a qualitative evaluation of the attributes in relation to the prediction domain and represents a means of integrating a priori knowledge of the prediction domain
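
    The multi-criteria weighting step described above can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's procedure: it uses the common row geometric-mean approximation of the priority vector from a pairwise comparison matrix, and the feature names and judgements are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Approximate priority weights from a reciprocal pairwise comparison matrix.

    Uses the row geometric-mean method, a standard approximation of the
    principal eigenvector; weights are normalised to sum to 1.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def select_top_k(names, weights, k):
    """Keep the k features with the highest expert-derived weights."""
    ranked = sorted(zip(names, weights), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical expert judgements: entry [i][j] says how much more relevant
# feature i is than feature j (and [j][i] holds the reciprocal).
features = ["volume", "moving_average", "day_of_week"]
judgements = [
    [1.0, 0.5, 2.0],
    [2.0, 1.0, 4.0],
    [0.5, 0.25, 1.0],
]
weights = ahp_weights(judgements)
selected = select_top_k(features, weights, k=2)
```

    The resulting weights could serve either to discard low-relevance features or, as the abstract suggests, to inform the learner about feature relevance; how this is done for support vector machines is not specified in the abstract.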