Hybrid Committee Classifier for a Computerized Colonic Polyp Detection System

Author: Hara Amy K.
Li Jiang
Petrick Nicholas
Pluim Josien P.W., (Ed.)
Reinhardt Joseph M., (Ed.)
Summers Ronald M.
Yao Jianhua
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 01/01/2006

We present a hybrid committee classifier for computer-aided detection (CAD) of colonic polyps in CT colonography (CTC). The classifier combines an ensemble of support vector machines (SVMs) and neural networks (NNs) for classification, a progressive search algorithm for selecting the features used by the SVMs, and a floating search algorithm for selecting the features used by the NNs. A total of 102 quantitative features were calculated for each polyp candidate found by a prototype CAD system. Three features were selected for each of 7 SVM classifiers, which were then combined to form an SVM committee classifier. Similarly, features (between 10 and 20 per classifier) were selected for 11 NN classifiers, which were combined to form an NN committee classifier. Finally, a hybrid committee classifier was defined by combining the outputs of the SVM and NN committees. The method was tested on CTC scans (supine and prone views) of 29 patients, in terms of the partial area under the free-response receiver operating characteristic (FROC) curve (AUC). Our results showed that the hybrid committee classifier performed best on the prone scans and was comparable to the other classifiers on the supine scans.
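
The abstract does not say how member outputs are combined, so the following is only a minimal sketch that assumes plain averaging at both levels: member scores are averaged within each committee, and the two committee scores are averaged to give the hybrid score. The function names, the score range of 0 to 1, and the 0.5 threshold are illustrative assumptions, not details from the paper.

```python
from statistics import mean


def committee_score(member_scores):
    """Average the scores that the committee members give one polyp candidate.

    Averaging is an assumed combination rule; the abstract does not name one.
    """
    return mean(member_scores)


def hybrid_score(svm_scores, nn_scores):
    """Combine the SVM committee and the NN committee with equal weight (assumed)."""
    return mean([committee_score(svm_scores), committee_score(nn_scores)])


def is_polyp(svm_scores, nn_scores, threshold=0.5):
    """Flag a candidate when the hybrid score reaches a hypothetical threshold."""
    return hybrid_score(svm_scores, nn_scores) >= threshold


# Made-up scores for one candidate: 7 SVM members and 11 NN members.
svm_scores = [1.0] * 7
nn_scores = [0.0] * 11
print(hybrid_score(svm_scores, nn_scores))
```

With equal committee weights, the 7 SVMs and the 11 NNs each contribute half of the final score regardless of committee size. Averaging all 18 member scores directly would instead weight the NN committee more heavily.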

Features and Algorithms for Visual Parsing of Handwritten Mathematical Expressions

Author: Hu Lei
Publication venue: RIT Scholar Works
Publication date: 01/05/2016

Math expressions are an essential part of scientific documents. Handwritten math expression recognition can benefit human-computer interaction, especially in the education domain, and is a critical part of document recognition and analysis. Parsing the spatial arrangement of symbols is an essential part of math expression recognition. A variety of parsing techniques have been developed during the past three decades, and they fall into two groups. The first group is graph-based parsing, which selects a path or sub-graph that obeys some rule to form a possible interpretation of the given expression. The second group is grammar-driven parsing, in which grammars and related parameters are defined manually for different tasks. The time complexity of both groups of parsing techniques is high, and they often impose strict constraints to reduce the computation. The aim of this thesis is to work towards a straightforward and effective parser with as few constraints as possible. First, we propose using a line-of-sight graph to represent the layout of strokes and symbols in math expressions. It achieves a higher F-score than other graph representations and reduces the search space for parsing. Second, we modify the shape context feature with Parzen window density estimation. This feature set works well for symbol segmentation, symbol classification and symbol layout analysis. We get a higher symbol segmentation F-score than other systems on the CROHME 2014 dataset. Finally, we develop a Maximum Spanning Tree (MST) based parser using Edmonds' algorithm, which extracts an MST from the directed line-of-sight graph in two passes: first symbols are segmented, and then symbols and spatial relationships are labeled. The time complexity of our MST-based parsing is lower than that of CYK parsing with context-free grammars. Also, our MST-based parsing obtains a higher structure rate and expression rate than CYK parsing when symbol segmentation is accurate. Correct structure means that the structure of the symbol layout tree is correct, even though the labels of edges in the symbol layout tree might be wrong. The performance of our math expression recognition system with MST-based parsing is competitive on the CROHME 2012 and 2014 datasets. For future work, how to incorporate symbol classifier results and correct segmentation errors in MST-based parsing needs more research.
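
To make the MST step concrete, here is a minimal sketch of the Chu-Liu/Edmonds procedure on a small directed graph. It is not the thesis's parser: it returns only the total weight of the maximum spanning arborescence rather than the tree itself, and the function name, the integer node labels, and the toy edge weights are assumptions made for illustration.

```python
def max_arborescence_weight(n, root, edges):
    """Total weight of a maximum spanning arborescence rooted at `root`.

    n: number of nodes, labeled 0..n-1.
    edges: list of (source, target, weight) tuples for a directed graph.
    Returns None if some node cannot be reached from the root.
    """
    # Maximizing the weights is the same as minimizing their negation.
    edges = [(u, v, -w) for u, v, w in edges]
    total = 0
    while True:
        # Step 1: pick the cheapest incoming edge for every non-root node.
        best = [None] * n
        pred = [-1] * n
        for u, v, w in edges:
            if u != v and v != root and (best[v] is None or w < best[v]):
                best[v], pred[v] = w, u
        if any(best[v] is None for v in range(n) if v != root):
            return None
        best[root] = 0

        # Step 2: follow the chosen edges backwards to detect cycles.
        comp = [-1] * n   # component id assigned to nodes on a cycle
        seen = [-1] * n   # which start node last walked through this node
        n_comp = 0
        for v in range(n):
            total += best[v]
            x = v
            while x != root and seen[x] != v and comp[x] == -1:
                seen[x] = v
                x = pred[x]
            if x != root and comp[x] == -1:
                # x lies on a new cycle: give every node on it the same id.
                y = pred[x]
                while y != x:
                    comp[y] = n_comp
                    y = pred[y]
                comp[x] = n_comp
                n_comp += 1
        if n_comp == 0:
            return -total  # no cycles left: the chosen edges form the tree

        # Step 3: contract each cycle into one node and reweight the edges
        # entering it, then repeat on the smaller graph.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = n_comp
                n_comp += 1
        edges = [(comp[u], comp[v], w - best[v])
                 for u, v, w in edges if comp[u] != comp[v]]
        root = comp[root]
        n = n_comp


# Toy graph: node 0 is the root; nodes 1 and 2 form a cycle of heavy edges.
toy_edges = [(0, 1, 5), (0, 2, 1), (1, 2, 4), (2, 1, 6)]
print(max_arborescence_weight(3, 0, toy_edges))
```

In the toy graph, greedily taking the heaviest incoming edge of each node selects the cycle between nodes 1 and 2. Contraction then breaks the cycle with the edge from 0 to 1, giving the tree 0 to 1 to 2 with total weight 9.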

Predicting Academic Performance of Potential Electrical Engineering Majors

Author: Sundararaj Sanjhana
Publication date: 05/02/2018

Data analytics for education is fast becoming an important part of higher learning institutions, helping to improve student success rates and decision-making with regard to teaching methods, course selection, and student retention. The undergraduate program at Texas A&M University requires students to take a general engineering program during their freshman and sophomore years. During this period, students' academic performance, abilities and participation are assessed. Under the Entry-to-a-Major policy, departments place students in the best possible major based on their displayed capacities and in alignment with their goals. Our focus is on the Electrical Engineering department and the success rate of students with aspirations and background in this major. One approach to improving the student retention rate is to predict in advance the performance of students in specific course disciplines based on information mined from their previous records. Based on the outcome, decisions can be made in advance regarding their further enrollment in the area and the need for specific attention in certain aspects to bring students up to the benchmark. In this thesis, we put together a set of attributes related to students in the general program and with an electrical engineering aligned background. The analysis centers on building a method that explains the joint influence of attributes on our target variable and on comparing the prediction performance of our models. The primary tools used are supervised classification and ensemble learning methods. We also develop a metric-based learning framework suitable for our application that enables competitive accuracy and efficient pattern recognition from the underlying data.

Comparison of CNN-Learned vs. Handcrafted Features for Detection of Parkinson's Disease Dysgraphia in a Multilingual Dataset

Author: Brabenec Lubos
Castrillon Reinel
Drotar Péter
Faundez-Zanuy Marcos
Galaz Zoltán
Gazda Matej
Kincses Zsigmond Tamás
Mekyska Jiri
Mucha Jan
Orozco-Arroyave Juan Rafael
Rapcsak Steven
Rektorova Irena
Smekal Zdenek
Zvoncak Vojtech
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2022

Parkinson's disease dysgraphia (PDYS), one of the earliest signs of Parkinson's disease (PD), has been researched as a promising biomarker of PD and as the target of a noninvasive and inexpensive approach to monitoring the progress of the disease. However, although several approaches to supportive PDYS diagnosis have been proposed (mainly based on handcrafted features (HF) extracted from online handwriting or on deep neural networks), it remains unclear which approach provides the highest discrimination power and how these approaches can be transferred between different datasets and languages. This study aims to compare classification performance based on two types of features: features automatically extracted by a pretrained convolutional neural network (CNN) and HF designed by human experts. Both approaches are evaluated on a multilingual dataset collected from 143 PD patients and 151 healthy controls in the Czech Republic, United States, Colombia, and Hungary. The subjects performed the spiral drawing task (SDT; a language-independent task) and the sentence writing task (SWT; a language-dependent task). Models based on logistic regression and gradient boosting were trained in several scenarios, specifically single language (SL), leave one language out (LOLO), and all languages combined (ALC). We found that the HF slightly outperformed the CNN-extracted features in all considered evaluation scenarios for the SWT. In detail, the following balanced accuracy (BACC) scores were achieved: SL: 0.65 (HF), 0.58 (CNN); LOLO: 0.65 (HF), 0.57 (CNN); and ALC: 0.69 (HF), 0.66 (CNN). However, in the case of the SDT, features extracted by a CNN provided competitive results: SL: 0.66 (HF), 0.62 (CNN); LOLO: 0.56 (HF), 0.54 (CNN); and ALC: 0.60 (HF), 0.60 (CNN). In summary, regarding the SWT, the HF outperformed the CNN-extracted features by over 6% (mean BACC of 0.66 for HF and 0.60 for CNN). In the case of the SDT, both feature sets provided almost identical classification performance (mean BACC of 0.60 for HF and 0.58 for CNN). Copyright © 2022 Galaz, Drotar, Mekyska, Gazda, Mucha, Zvoncak, Smekal, Faundez-Zanuy, Castrillon, Orozco-Arroyave, Rapcsak, Kincses, Brabenec and Rektorova
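
Two ideas in this abstract lend themselves to a short sketch: balanced accuracy (the mean of sensitivity and specificity, which matters because the two groups are of unequal size) and the leave-one-language-out split. The code below assumes binary labels with 1 for PD and 0 for healthy control. The sample tuple layout, the language codes, and the function names are hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = PD, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def leave_one_language_out(samples):
    """Yield (held_out_language, train, test) for every language in the data.

    samples: list of (language, features, label) tuples (assumed layout).
    """
    languages = sorted({language for language, _, _ in samples})
    for held_out in languages:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test


# Made-up samples, two per language; feature vectors are placeholders.
samples = [
    ("cs", [0.1], 1), ("cs", [0.4], 0),
    ("en", [0.2], 1), ("en", [0.5], 0),
    ("es", [0.3], 1), ("es", [0.6], 0),
    ("hu", [0.2], 1), ("hu", [0.7], 0),
]
for held_out, train, test in leave_one_language_out(samples):
    print(held_out, len(train), len(test))

print(balanced_accuracy([1, 1, 1, 0], [1, 1, 0, 0]))
```

In each fold a model would be fitted on `train` and scored on `test` with `balanced_accuracy`, so the held-out language never influences training.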

Diabetes Prediction Using Artificial Neural Network

Author: Abu-Naser Samy S.
El_Jerjawi Nesreen Samer
Publication date: 01/01/2018

Diabetes is one of the most common diseases worldwide, and no cure for it has yet been found. Every year, caring for people with diabetes costs a great deal of money. It is therefore important that prediction be very accurate and rely on a reliable method. One such method is the use of artificial intelligence systems, in particular Artificial Neural Networks (ANN). In this paper, we used artificial neural networks to predict whether a person is diabetic or not. The criterion was to minimize the error function in neural network training. After training the ANN model, the average error function of the neural network was equal to 0.01 and the accuracy of the prediction of whether a person is diabetic or not was 87.3%.
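
The abstract gives no architecture or training details, so the following is only a minimal sketch of what minimizing an error function in neural network training can look like. It assumes one hidden layer of three sigmoid units, a mean squared error, and full-batch gradient descent. The two input features, the toy data, and all hyperparameters are invented for illustration and are not the paper's.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases for a one-hidden-layer network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w1, [0.0] * n_hidden, w2, 0.0


def forward(x, params):
    """Return the hidden activations and the output probability for input x."""
    w1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output


def mean_squared_error(data, params):
    """The error function being minimized (an assumed choice)."""
    return sum((forward(x, params)[1] - t) ** 2 for x, t in data) / len(data)


def train(data, params, lr=0.5, epochs=2000):
    """Full-batch gradient descent; returns new parameters, input is not mutated."""
    w1, b1, w2, b2 = params
    w1, b1, w2 = [row[:] for row in w1], b1[:], w2[:]
    for _ in range(epochs):
        g_w1 = [[0.0] * len(row) for row in w1]
        g_b1 = [0.0] * len(b1)
        g_w2 = [0.0] * len(w2)
        g_b2 = 0.0
        for x, t in data:
            hidden, y = forward(x, (w1, b1, w2, b2))
            # Derivative of the mean squared error w.r.t. the output pre-activation.
            d_out = 2.0 * (y - t) * y * (1.0 - y) / len(data)
            g_b2 += d_out
            for j, h in enumerate(hidden):
                g_w2[j] += d_out * h
                d_hid = d_out * w2[j] * h * (1.0 - h)
                g_b1[j] += d_hid
                for i, xi in enumerate(x):
                    g_w1[j][i] += d_hid * xi
        # Move every parameter a small step against its gradient.
        for j in range(len(w2)):
            w2[j] -= lr * g_w2[j]
            b1[j] -= lr * g_b1[j]
            for i in range(len(w1[j])):
                w1[j][i] -= lr * g_w1[j][i]
        b2 -= lr * g_b2
    return w1, b1, w2, b2


# Invented toy data: two features scaled to [0, 1]; label 1 = diabetic.
TOY_DATA = [([0.2, 0.3], 0), ([0.3, 0.2], 0), ([0.8, 0.7], 1), ([0.9, 0.6], 1)]

initial = init_params(n_in=2, n_hidden=3)
trained = train(TOY_DATA, initial)
print(mean_squared_error(TOY_DATA, initial), mean_squared_error(TOY_DATA, trained))
```

Comparing the error before and after training shows the quantity the abstract reports as the "average error function". Accuracy would be measured separately by thresholding the network output on held-out records.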

PhilPapers