
    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have tried to make computer analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment but quite often significantly increase our safety. Indeed, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth of computational power has allowed the development of more sophisticated and effective algorithms and tools. Although significant progress has been made, many issues remain, creating a need for novel approaches.

    Pattern detection and recognition using over-complete and sparse representations

    Recent research in harmonic analysis and mammalian vision systems has revealed that over-complete and sparse representations play an important role in visual information processing. The research on applying such representations to pattern recognition and detection problems has become an interesting field of study. The main contribution of this thesis is to propose two feature extraction strategies - the global strategy and the local strategy - to make use of these representations. In the global strategy, over-complete and sparse transformations are applied to the input pattern as a whole and features are extracted in the transformed domain. This strategy has been applied to the problems of rotation-invariant texture classification and script identification using the Ridgelet transform. Experimental results have shown better performance compared with the Gabor multi-channel filtering method and wavelet-based methods. The local strategy is divided into two stages. The first is to analyze the local over-complete and sparse structure: the input 2-D patterns are divided into patches, and the local over-complete and sparse structure is learned from these patches using sparse approximation techniques. The second stage concerns the application of this structure. For an object detection problem, we propose a sparsity testing technique, where a local over-complete and sparse structure is built to give sparse representations to text patterns and non-sparse representations to other patterns. Object detection is achieved by identifying patterns that can be sparsely represented by the learned structure. This technique has been applied to detect text in scene images with a recall rate of 75.23% (about 6% improvement compared with other works) and a precision rate of 67.64% (about 12% improvement).
For applications like character or shape recognition, the learned over-complete and sparse structure is combined with a Convolutional Neural Network (CNN). A second text detection method is proposed based on such a combination to further improve the accuracy of text detection in scene images (about 11% higher compared with our first method based on sparsity testing). Finally, this method has been applied to handwritten Farsi numeral recognition, obtaining a 99.22% recognition rate on the CENPARMI Database and a 99.5% recognition rate on the HODA Database. Meanwhile, an SVM with gradient features achieves recognition rates of 98.98% and 99.22% on these databases, respectively.
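The sparsity-testing idea above can be sketched in a few lines. The greedy pursuit below is a minimal illustration, not the thesis's implementation; the dictionary construction, the number of atoms `k`, and the residual tolerance are all assumptions for the sketch.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: approximate x with k atoms of D."""
    residual = x.astype(float).copy()
    support = []
    for _ in range(k):
        # pick the atom most correlated with what remains to be explained
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        # re-fit all chosen atoms jointly, then update the residual
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    return support, residual

def is_sparse(D, x, k, tol=0.1):
    """Sparsity test: the pattern passes if k atoms reconstruct it well."""
    _, residual = omp(D, x, k)
    return np.linalg.norm(residual) <= tol * np.linalg.norm(x)
```

In the detection setting, `D` would be learned from text patches, so text patterns pass the test and most background patterns do not.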

    A Survey of Algorithms Involved in the Conversion of 2-D Images to 3-D Model

    Since the advent of machine learning, deep neural networks, and computer graphics, the field of 2-D image to 3-D model conversion has made tremendous strides. As a result, many algorithms and methods for converting 2-D images to 3-D models have been developed, including structure from motion (SFM), shape from shading (SFS), multi-view stereo (MVS), and PIFu. Several strategies have been compared, and each was found to have pros and cons that make it appropriate for particular applications. For instance, SFM is useful for creating realistic 3-D models from a collection of pictures, whereas SFS is best for doing so from a single image. While PIFu can create extremely detailed 3-D models of human figures from a single image, MVS can manage complicated scenes with varied lighting and texture. The method chosen to convert 2-D images to 3-D models ultimately depends on the demands of the application.

    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, where images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with an accuracy of 90.26%. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines, where each line has 10 words on average.
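The decompose-and-match scheme above can be illustrated with a one-dimensional Haar transform. This is a minimal sketch under assumed details (Haar basis, Euclidean distance to group representatives), not the system's actual implementation.

```python
import numpy as np

def haar_fwt(signal, levels=2):
    """One-dimensional fast Haar wavelet transform (length must be 2**n)."""
    coeffs = np.asarray(signal, float).copy()
    n = len(coeffs)
    for _ in range(levels):
        approx = (coeffs[0:n:2] + coeffs[1:n:2]) / np.sqrt(2)
        detail = (coeffs[0:n:2] - coeffs[1:n:2]) / np.sqrt(2)
        coeffs[:n // 2] = approx      # low-pass half carries the shape
        coeffs[n // 2:n] = detail     # high-pass half carries the detail
        n //= 2
    return coeffs

def recognize(vector, representatives):
    """Return the label of the nearest group representative."""
    labels = list(representatives)
    dists = [np.linalg.norm(vector - representatives[l]) for l in labels]
    return labels[int(np.argmin(dists))]
```

A flattened character image would be transformed with `haar_fwt` and matched against one representative vector per shape group.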

    Teaching Introductory Programming Concepts through a Gesture-Based Interface

    Computer programming is an integral part of a technology-driven society, so there is a tremendous need to teach programming to a wider audience. One of the challenges in meeting this demand for programmers is that most traditional computer programming classes are targeted at university/college students with strong math backgrounds. To expand the computer programming workforce, we need to encourage a wider range of students to learn about programming. The goal of this research is to design and implement a gesture-driven interface to teach computer programming to young and non-traditional students. We designed our user interface based on feedback from students attending the College of Engineering summer camps at the University of Arkansas. Our system uses the Microsoft Xbox Kinect to capture the movements of new programmers as they use our system. Our software then tracks and interprets student hand movements in order to recognize specific gestures which correspond to different programming constructs, and uses this information to create and execute programs using the Google Blockly visual programming framework. We focus on various gesture recognition algorithms to interpret user data as specific gestures, including template matching, sector quantization, and supervised machine learning clustering algorithms.
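Of the recognition algorithms mentioned, template matching is the simplest to sketch. The following resample-center-scale matcher over 2-D point paths is an assumed, minimal illustration, not the project's actual code; the resampling count and distance measure are illustrative choices.

```python
import numpy as np

def normalize_path(points, n=32):
    """Resample a gesture path to n points, center it, and scale to unit size."""
    pts = np.asarray(points, float)
    # cumulative arc length, so resampled points are evenly spaced along the path
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    ts = np.linspace(0.0, t[-1], n)
    resampled = np.column_stack([np.interp(ts, t, pts[:, 0]),
                                 np.interp(ts, t, pts[:, 1])])
    resampled -= resampled.mean(axis=0)        # translate centroid to origin
    scale = np.abs(resampled).max() or 1.0     # guard against degenerate paths
    return resampled / scale

def match_gesture(path, templates, n=32):
    """Return the template label with the smallest mean point-to-point distance."""
    q = normalize_path(path, n)
    return min(templates, key=lambda k: np.linalg.norm(
        q - normalize_path(templates[k], n), axis=1).mean())
```

A real recognizer would add rotation handling and a rejection threshold for unmatched input.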

    Semantic radical consistency and character transparency effects in Chinese: an ERP study

    BACKGROUND: This event-related potential (ERP) study aims to investigate the representation and temporal dynamics of Chinese orthography-to-semantics mappings by simultaneously manipulating character transparency and semantic radical consistency. Character components, referred to as radicals, make up the building blocks used dur…

    Learning to Read Bushman: Automatic Handwriting Recognition for Bushman Languages

    The Bleek and Lloyd Collection contains notebooks that document the tradition, language and culture of the Bushman people who lived in South Africa in the late 19th century. Transcriptions of these notebooks would allow for the provision of services such as text-based search and text-to-speech. However, these notebooks are currently only available in the form of digital scans and the manual creation of transcriptions is a costly and time-consuming process. Thus, automatic methods could serve as an alternative approach to creating transcriptions of the text in the notebooks. In order to evaluate the use of automatic methods, a corpus of Bushman texts and their associated transcriptions was created. The creation of this corpus involved: the development of a custom method for encoding the Bushman script, which contains complex diacritics; the creation of a tool for creating and transcribing the texts in the notebooks; and the running of a series of workshops in which the tool was used to create the corpus. The corpus was used to evaluate the use of various techniques for automatically transcribing the texts in the corpus in order to determine which approaches were best suited to the complex Bushman script. These techniques included the use of Support Vector Machines, Artificial Neural Networks and Hidden Markov Models as machine learning algorithms, which were coupled with different descriptive features. The effect of the texts used for training the machine learning algorithms was also investigated as well as the use of a statistical language model. It was found that, for Bushman word recognition, the use of a Support Vector Machine with Histograms of Oriented Gradient features resulted in the best performance and, for Bushman text line recognition, Marti & Bunke features resulted in the best performance when used with Hidden Markov Models. 
The automatic transcription of the Bushman texts proved to be difficult, and the performance of the different recognition systems was largely affected by the complexities of the Bushman script. It was also found that the texts used in an automatic handwriting recognition system not only influence which techniques may be the most appropriate, but also play a large role in determining whether or not automatic recognition should be attempted at all.
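Histogram of Oriented Gradients features, which performed best for word recognition here, can be sketched as follows. This is a generic, minimal HOG (no block normalization), not the exact descriptor used in the study; the cell size and bin count are assumptions.

```python
import numpy as np

def hog_features(img, cell=8, bins=9):
    """Minimal histogram-of-oriented-gradients descriptor for a grayscale image."""
    img = img.astype(float)
    gy, gx = np.gradient(img)                    # per-pixel gradients
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180   # unsigned orientation in [0, 180)
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            m = mag[i:i + cell, j:j + cell].ravel()
            a = ang[i:i + cell, j:j + cell].ravel()
            # magnitude-weighted orientation histogram for this cell
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    f = np.concatenate(feats)
    norm = np.linalg.norm(f)
    return f / norm if norm else f
```

In the setup described above, such vectors would be fed to a Support Vector Machine trained on labeled word images.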

    Machine Learning and Pedometers: An Integration-Based Convolutional Neural Network for Step Counting and Detection

    This thesis explores a machine learning-based approach to step detection and counting for a pedometer. Our novelty is to analyze a window of time containing an arbitrary number of steps and to integrate the detected counts using a sliding window technique. We compare the effectiveness of this approach against classic deterministic algorithms. While classic algorithms perform well during regular gait (e.g. walking or running), they can perform significantly worse during semi-regular and irregular gaits that still contribute to a person's overall step count. These non-regular gaits can make up a significant portion of a person's daily step count, and improving their measurement can drastically improve the performance of the overall pedometer. Using data collected from 30 participants performing 3 different activities to simulate regular, semi-regular, and irregular gaits, a training and testing strategy was implemented using a sliding window algorithm over pedometer accelerometer data. Data were cut into rows representative of the sliding window, normalized according to the minimum and maximum values of the corresponding sensor-axis combination, and finally collated into specific training and holdout groups for validation purposes. Nine models were trained to predict a continuous count of steps within a given window for each fold of our five-fold validation process. These nine models correspond to each gait and sensor combination from the collected data set. Once the models were trained, they were evaluated against the holdout validation set for both run count accuracy (RCA), a measure of the pedometer's detected step count against the actual step count, and step detection accuracy (SDA), a measure of how well the algorithm can predict the time of an actual step. These are obtained through an additional post-processing step that integrates the predicted steps per window over time in order to find the total count of steps within a given training data set.
Additionally, an algorithm estimates the times when predicted steps occur by using the running count of total steps. Once testing is performed on all nine models, the process is repeated across all five folds to verify model architecture consistency throughout the entire data set. A window size test was implemented to vary the window size of the sliding window algorithm between 1 and 10 seconds to discover the effect of the sliding window size on the convolutional neural network's step count and detection performance. Again, these tests were run across five different folds to ensure an accurate average measure of each model's performance. By comparing the metrics of RCA and SDA between the machine-learning approach and other algorithms, we see that the method introduced in this thesis performs similarly to or better than both a consumer pedometer device and the three classic algorithms of peak detection, thresholding, and autocorrelation. It was found that with a window size of two seconds, this novel approach can detect steps with an overall average RCA of 0.99 and SDA of 0.88, better than any individual classic algorithm.
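The windowing, per-sensor min-max normalization, and count-integration steps described above can be sketched as follows. The helper names and the simple overlap-correction scheme are assumptions for illustration, not the thesis's code.

```python
import numpy as np

def sliding_windows(signal, size, step):
    """Cut a 1-D accelerometer stream into overlapping fixed-size windows."""
    return np.stack([signal[i:i + size]
                     for i in range(0, len(signal) - size + 1, step)])

def minmax_normalize(windows, lo, hi):
    """Scale values using the sensor-axis min/max taken from the training data."""
    return (windows - lo) / (hi - lo)

def integrate_counts(window_counts, overlap):
    """Integrate per-window step predictions into a running total.

    Each instant is covered by roughly `overlap` = size // step windows, so
    dividing before the cumulative sum avoids counting the same step once
    per covering window (an assumed, simple integration scheme)."""
    return np.cumsum(np.asarray(window_counts, float) / overlap)
```

The model's per-window count predictions would replace `window_counts`; step times can then be read off wherever the running total crosses an integer.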

    Multi-Modal Interfaces for Sensemaking of Graph-Connected Datasets

    The visualization of hypothesized evolutionary processes is often shown through phylogenetic trees. Given evolutionary data presented in one of several widely accepted formats, software exists to render these data into a tree diagram. However, software packages commonly in use by biologists today often do not provide means to dynamically adjust and customize these diagrams for studying new hypothetical relationships, or for illustration and publication purposes. Even where these options are available, they can lack intuitiveness and ease of use. The goal of our research is, thus, to investigate more natural and effective means of sensemaking of the data with different user input modalities. To this end, we experimented with different input modalities, designing and running a series of prototype studies, ultimately focusing our attention on pen-and-touch. Through several iterations of feedback and revision provided with the help of biology experts and students, we developed a pen-and-touch phylogenetic tree browsing and editing application called PhyloPen. This application expands on the capabilities of existing software with visualization techniques such as overview+detail, linked data views, and new interaction and manipulation techniques using pen-and-touch. To determine its impact on phylogenetic tree sensemaking, we conducted a within-subject comparative summative study against the most comparable and commonly used state-of-the-art mouse-based software system, Mesquite. In the study, conducted with biology majors at the University of Central Florida, each participant used both software systems on a set number of exercise tasks of the same type. Measuring effectiveness by several dependent measures, the results show PhyloPen was significantly better in terms of usefulness, satisfaction, ease of learning, ease of use, and cognitive load, and relatively the same in variation of completion time.
These results support an interaction paradigm that is superior to classic mouse-based interaction and that could potentially be applied to other communities that employ graph-based representations of their problem domains.

    Ballistics Image Processing and Analysis for Firearm Identification

    Firearm identification is an intensive and time-consuming process that requires physical interpretation of forensic ballistics evidence. As the level of violent crime involving firearms escalates, the number of firearms to be identified accumulates dramatically, and the demand for an automatic firearm identification system arises. This chapter proposes a new, analytic system for automatic firearm identification based on cartridge and projectile specimens. Not only do we present an approach for capturing and storing the surface images of spent projectiles at high resolution using a line-scan imaging technique for the projectile database, but we also present a novel and effective FFT-based analysis technique for analyzing and identifying the projectiles.
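An FFT-based comparison of surface profiles, in the spirit of the analysis technique described, might look like the following sketch; the signature length and cosine-similarity matching are assumptions, not the chapter's actual method.

```python
import numpy as np

def spectral_signature(profile, n_coeffs=16):
    """Magnitude of the lowest FFT coefficients of a 1-D surface profile.

    Subtracting the mean discards overall brightness, and using magnitudes
    makes the signature invariant to where the scan happens to start."""
    spec = np.abs(np.fft.rfft(profile - np.mean(profile)))
    sig = spec[1:n_coeffs + 1]
    norm = np.linalg.norm(sig)
    return sig / norm if norm else sig

def correlate(sig_a, sig_b):
    """Cosine similarity between two spectral signatures (1.0 = identical)."""
    return float(np.dot(sig_a, sig_b))
```

Two scans of the same projectile would yield highly correlated signatures even if the scans are offset, while markings from a different barrel would not.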