79 research outputs found

    Wavelet Features for Recognition of First Episode of Schizophrenia from MRI Brain Images

    Machine learning methods are increasingly used in various fields of medicine, contributing to early diagnosis and better quality of care. These outputs are particularly desirable in the case of neuropsychiatric disorders, such as schizophrenia, because of their potential to create a new gold standard in the diagnosis and differentiation of particular disorders. This paper presents a scheme for automated classification from magnetic resonance images based on a multiresolution representation in the wavelet domain. An implementation of the proposed algorithm, using a support vector machine classifier, is introduced and tested on a dataset containing 104 patients with first-episode schizophrenia and healthy volunteers. Optimal parameters for the different phases of the algorithm are sought, and the quality of classification is estimated by robust cross-validation techniques. Values of accuracy, sensitivity and specificity over 71% are achieved.

    The predictive ability of corporate narrative disclosures: Australian evidence

    The main objective of this study is to contribute to the academic literature by investigating the relationship between narrative disclosures and corporate performance based on Australian evidence. The research design takes as its starting point the content analysis of discretionary narrative disclosures conducted by Smith and Taffler (2000), and extends their research by combining thematic content analysis and syntactic content analysis. This study focuses on the discretionary disclosures (the Chairman's Statement) of Australian manufacturing companies. Based on the Earnings per Share (EPS) movement between 2008 and 2009, 64 sample companies are classified into two groups: good performers and poor performers. This study is grounded in signalling theory and agency theory, and links with the impression management strategy. Based on two branches of impression management (rationalisation and enhancement), six groups of variables are collected to examine narrative disclosures from both quantity (what to disclose) and quality (how to disclose) perspectives. Manual coding and two computer-based software programs are employed in this study. This study finds that the word-based and theme-based variables based on discretionary disclosures are significantly correlated with corporate performance. Moreover, word-based variables can successfully classify companies as good or poor performers with an accuracy of 86%. However, there is no significant relationship between corporate performance and report size, use of long words (as a proxy for jargon), Flesch readability score, or persuasive language. The main value of this study is to build a classification model based on Australian evidence for continuing companies, since most prior research focuses on UK, US and New Zealand companies and is based on a healthy/failed distinction.

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    The book offers a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using Color and Depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing and a multidisciplinary topic to analyze, so it requires familiarity with other topics of Artificial Intelligence (AI) such as machine learning, digital image processing, psychology and more. It is therefore a great opportunity to write a book that covers all of these topics for readers ranging from beginners to professionals in the field of AI, and even for those without a background in AI. Our goal is to provide a standalone introduction to FMER analysis in the form of theoretical descriptions for readers with no background in image processing, together with reproducible MATLAB practical examples. We also describe the basic definitions for FMER analysis and the MATLAB libraries used in the text, which helps the reader apply the experiments in real-world applications. We believe that this book is suitable for students, researchers, and professionals alike who need to develop practical skills along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with key stages such as color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expressions recognition, feature extraction and dimensionality reduction. Comment: This is the second edition of the book

    Adaptive Phishing Detection System using Machine Learning

    Despite the availability of toolbars and studies in phishing, the number of phishing attacks has been increasing in the past years. It remains a challenge to develop robust phishing detection systems due to the continuous change of attack models. We attempt to address this by designing an adaptive phishing detection system with the ability to continually learn and detect phishing robustly. In the first work, we demonstrate a systematic way to develop a novel phishing detection approach using a compression algorithm. We also propose the use of compression ratio as a novel machine learning feature, which significantly improves machine learning based phishing detection over previous studies. Our proposed method outperforms the best-performing HTML-based features in past studies, with a true positive rate of 80.04%. In the following work, we propose a feature-free method using Normalised Compression Distance (NCD), a metric which computes the similarity of two websites by compressing them, eliminating the need to perform any feature extraction. This method examines the HTML of webpages and computes their similarity with known phishing websites. Our approach is feasible to deploy in real systems with a processing time of roughly 0.3 seconds, and significantly outperforms previous methods in detecting phishing websites, with an AUC score of 98.68%, a G-mean score of 94.47%, and a high true positive rate (TPR) of around 90%, while maintaining a low false positive rate (FPR) of 0.58%. We also discuss the implications of the automation offered by AutoML frameworks for the role of human experts and data scientists in the domain of phishing detection. Our work investigates whether models built using AutoML frameworks can outperform the results achieved by human data scientists on phishing datasets, and analyses the relationship between the performances and various data complexity measures. Many challenges remain for building a real-world phishing detection system using AutoML frameworks: they currently support only supervised classification problems, which leads to the need for labelled data, and AutoML-based models cannot be updated incrementally. This indicates that experts with knowledge in the domain of phishing and cybersecurity are still essential in phishing detection.
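
    The Normalised Compression Distance mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The choice of zlib as the compressor, the function names and the two sample pages are assumptions made for illustration only; the abstract does not state which compressor or implementation the authors used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised Compression Distance between two byte strings.

    Values near 0 suggest very similar inputs; values near 1 suggest
    the inputs share little compressible structure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical HTML snippets, not taken from any real dataset.
page = (
    b"<html><body><form action='login.php' method='post'>"
    b"<input name='user' type='text'><input name='pass' type='password'>"
    b"<input type='submit' value='Sign in to your account'></form></body></html>"
)
other = (
    b"<html><body><h1>Weekly gardening notes</h1>"
    b"<p>Tomatoes need six hours of sunlight and regular watering.</p>"
    b"<p>Prune the lower leaves once the first fruit appears.</p></body></html>"
)

print("same page:     ", ncd(page, page))
print("different page:", ncd(page, other))
```

    In a detection setting along the lines the abstract describes, a page's HTML would be compared against known phishing pages and a small distance treated as evidence of similarity; any decision threshold would have to be chosen empirically.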

    Feature Space Modeling for Accurate and Efficient Learning From Non-Stationary Data

    A non-stationary dataset is one whose statistical properties, such as the mean, variance, correlation, and probability distribution, change over a specific interval of time. By contrast, a stationary dataset is one whose statistical properties remain constant over time. Apart from its volatile statistical properties, non-stationary data poses other challenges, such as time and memory management under limited computational resources, largely because recent advances in data collection technologies generate a variety of data at an alarming pace and volume. Additionally, when the collected data is complex, managing the complexity that emerges from its dimensionality and heterogeneity can pose another challenge for effective computational learning. The problem is to enable accurate and efficient learning from non-stationary data in a continuous fashion over time while simultaneously managing the critical challenges of time, memory, concept change, and complexity. Feature space modeling is one of the most effective solutions to this problem. For non-stationary data, selecting relevant features is even more critical than for stationary data, because reducing the feature dimension makes the best use of computational resources and lets data mining algorithms achieve higher accuracy and efficiency. In this dissertation, we investigated a variety of feature space modeling techniques to improve the overall performance of data mining algorithms. In particular, we built a Relief-based feature subset selection method in combination with data complexity analysis to improve classification performance on ovarian cancer image data collected in a non-stationary batch mode. We also collected time series health sensor data in a streaming environment and deployed a feature space transformation using Singular Value Decomposition (SVD). This reduced the dimensionality of the feature space, resulting in better accuracy and efficiency for the Density Ratio Estimation method in identifying potential change points in data over time. We also built an unsupervised feature space model using matrix factorization and Lasso Regression, which was successfully deployed in conjunction with Relative Density Ratio Estimation to address botnet attacks in a non-stationary environment. The Relief-based feature model improved the accuracy of the Fuzzy Forest classifier by 16%. For the change detection framework, we observed a 9% improvement in accuracy for PCA feature transformation. With the unsupervised feature selection model, at 2% and 5% malicious traffic ratios, the proposed botnet detection framework exhibited on average 20% better accuracy than One Class Support Vector Machine (OSVM) and on average 25% better accuracy than Autoencoder. All these results demonstrate the effectiveness of these feature space models. The fundamental theme that repeats itself in this dissertation is modeling an efficient feature space to improve both the accuracy and the efficiency of selected data mining models. Every contribution in this dissertation has subsequently been employed to capitalize on those advantages to solve real-world problems. Our work bridges concepts from multiple disciplines in effective and surprising ways, leading to new insights, new frameworks, and ultimately to a cross-production of diverse fields like mathematics, statistics, and data mining.

    Content-based image retrieval-- a small sample learning approach.

    Tao Dacheng. Thesis (M.Phil.)--Chinese University of Hong Kong, 2004. Includes bibliographical references (leaves 70-75). Abstracts in English and Chinese.

    Chapter 1: Introduction
        1.1 Content-based Image Retrieval
        1.2 SVM based RF in CBIR
        1.3 DA based RF in CBIR
        1.4 Existing CBIR Engines
        1.5 Practical Applications of CBIR
        1.6 Organization of this thesis
    Chapter 2: Statistical Learning Theory and Support Vector Machine
        2.1 The Recognition Problem
        2.2 Regularization
        2.3 The VC Dimension
        2.4 Structure Risk Minimization
        2.5 Support Vector Machine
        2.6 Kernel Space
    Chapter 3: Discriminant Analysis
        3.1 PCA
        3.2 KPCA
        3.3 LDA
        3.4 BDA
        3.5 KBDA
    Chapter 4: Random Sampling Based SVM
        4.1 Asymmetric Bagging SVM
        4.2 Random Subspace Method SVM
        4.3 Asymmetric Bagging RSM SVM
        4.4 Aggregation Model
        4.5 Dissimilarity Measure
        4.6 Computational Complexity Analysis
        4.7 QueryGo Image Retrieval System
        4.8 Toy Experiments
        4.9 Statistical Experimental Results
    Chapter 5: SSS Problems in KBDA RF
        5.1 DKBDA
            5.1.1 DLDA
            5.1.2 DKBDA
        5.2 NKBDA
            5.2.1 NLDA
            5.2.2 NKBDA
        5.3 FKBDA
            5.3.1 FLDA
            5.3.2 FKBDA
        5.4 Experimental Results
    Chapter 6: NDA based RF for CBIR
        6.1 NDA
        6.2 SSS Problem in NDA
            6.2.1 Regularization method
            6.2.2 Null-space method
            6.2.3 Full-space method
        6.3 Experimental results
            6.3.1 K nearest neighbor evaluation for NDA
            6.3.2 SSS problem
            6.3.3 Evaluation experiments
    Chapter 7: Medical Image Classification
        7.1 Introduction
        7.2 Region-based Co-occurrence Matrix Texture Feature
        7.3 Multi-level Feature Selection
        7.4 Experimental Results
            7.4.1 Data Set
            7.4.2 Classification Using Traditional Features
            7.4.3 Classification Using the New Features
    Chapter 8: Conclusion
    Bibliography

    Factors characterizing the academic experiences of children with mild bilateral or unilateral hearing loss

    Students with mild bilateral (MBHL) or unilateral hearing loss (UHL) are frequently overlooked in service provision under the umbrella of special education services as they are typically viewed as having insignificant disability (Brown, Holstrum, & Ringwalt, 2008). However, up to 50% of these students fail at least one grade during their K-12 experience, demonstrating a significant risk associated with this population (Bess & Tharpe, 1984, 1986; Most, 2006). Despite evidence of risk for failure, little research exists to aid in the identification of need for services, including risk factors or potential risk factors. The aim of this study is to fill that gap of evidence required to better identify students who may need interventions to prevent failure academically. In summary, this study is an analysis of family demographic and student characteristics in order to identify common traits among students with MB/UHL who are likely to be associated with failure in academic performance