112,981 research outputs found

    Visual Analytics and Human Involvement in Machine Learning

    Full text link
    The rapidly developing AI systems and applications still require human involvement in practically all parts of the analytics process. Human decisions are largely based on visualizations, providing data scientists details of data properties and the results of analytical procedures. Different visualizations are used in the different steps of the Machine Learning (ML) process. The decision which visualization to use depends on factors, such as the data domain, the data model and the step in the ML process. In this chapter, we describe the seven steps in the ML process and review different visualization techniques that are relevant for the different steps for different types of data, models and purposes

    Learn on Source, Refine on Target:A Model Transfer Learning Framework with Random Forests

    Full text link
    We propose novel model transfer-learning methods that refine a decision forest model M learned within a "source" domain using a training set sampled from a "target" domain, assumed to be a variation of the source. We present two random forest transfer algorithms. The first algorithm searches greedily for locally optimal modifications of each tree structure by trying to locally expand or reduce the tree around individual nodes. The second algorithm does not modify structure, but only the parameter (thresholds) associated with decision nodes. We also propose to combine both methods by considering an ensemble that contains the union of the two forests. The proposed methods exhibit impressive experimental results over a range of problems.Comment: 2 columns, 14 pages, TPAMI submitte

    Unsupervised Feature Learning for Environmental Sound Classification Using Weighted Cycle-Consistent Generative Adversarial Network

    Full text link
    In this paper we propose a novel environmental sound classification approach incorporating unsupervised feature learning from codebook via spherical KK-Means++ algorithm and a new architecture for high-level data augmentation. The audio signal is transformed into a 2D representation using a discrete wavelet transform (DWT). The DWT spectrograms are then augmented by a novel architecture for cycle-consistent generative adversarial network. This high-level augmentation bootstraps generated spectrograms in both intra and inter class manners by translating structural features from sample to sample. A codebook is built by coding the DWT spectrograms with the speeded-up robust feature detector (SURF) and the K-Means++ algorithm. The Random Forest is our final learning algorithm which learns the environmental sound classification task from the clustered codewords in the codebook. Experimental results in four benchmarking environmental sound datasets (ESC-10, ESC-50, UrbanSound8k, and DCASE-2017) have shown that the proposed classification approach outperforms the state-of-the-art classifiers in the scope, including advanced and dense convolutional neural networks such as AlexNet and GoogLeNet, improving the classification rate between 3.51% and 14.34%, depending on the dataset.Comment: Paper Accepted for Publication in Elsevier Applied Soft Computin

    New Ideas for Brain Modelling 4

    Full text link
    This paper continues the research that considers a new cognitive model based strongly on the human brain. In particular, it considers the neural binding structure of an earlier paper. It also describes some new methods in the areas of image processing and behaviour simulation. The work is all based on earlier research by the author and the new additions are intended to fit in with the overall design. For image processing, a grid-like structure is used with 'full linking'. Each cell in the classifier grid stores a list of all other cells it gets associated with and this is used as the learned image that new input is compared to. For the behaviour metric, a new prediction equation is suggested, as part of a simulation, that uses feedback and history to dynamically determine its course of action. While the new methods are from widely different topics, both can be compared with the binary-analog type of interface that is the main focus of the paper. It is suggested that the simplest of linking between a tree and ensemble can explain neural binding and variable signal strengths

    Mobile Sound Recognition for the Deaf and Hard of Hearing

    Full text link
    Human perception of surrounding events is strongly dependent on audio cues. Thus, acoustic insulation can seriously impact situational awareness. We present an exploratory study in the domain of assistive computing, eliciting requirements and presenting solutions to problems found in the development of an environmental sound recognition system, which aims to assist deaf and hard of hearing people in the perception of sounds. To take advantage of smartphones computational ubiquity, we propose a system that executes all processing on the device itself, from audio features extraction to recognition and visual presentation of results. Our application also presents the confidence level of the classification to the user. A test of the system conducted with deaf users provided important and inspiring feedback from participants.Comment: 25 pages, 8 figure

    Supervised deep learning in high energy phenomenology: a mini review

    Full text link
    Deep learning, a branch of machine learning, have been recently applied to high energy experimental and phenomenological studies. In this note we give a brief review on those applications using supervised deep learning. We first describe various learning models and then recapitulate their applications to high energy phenomenological studies. Some detailed applications are delineated in details, including the machine learning scan in the analysis of new physics parameter space, the graph neural networks in the search of top-squark production and in the CPCP measurement of the top-Higgs coupling at the LHC.Comment: Invited review, 72 pages, 24 figures. References are adde

    Review of Fall Detection Techniques: A Data Availability Perspective

    Full text link
    A fall is an abnormal activity that occurs rarely; however, missing to identify falls can have serious health and safety implications on an individual. Due to the rarity of occurrence of falls, there may be insufficient or no training data available for them. Therefore, standard supervised machine learning methods may not be directly applied to handle this problem. In this paper, we present a taxonomy for the study of fall detection from the perspective of availability of fall data. The proposed taxonomy is independent of the type of sensors used and specific feature extraction/selection methods. The taxonomy identifies different categories of classification methods for the study of fall detection based on the availability of their data during training the classifiers. Then, we present a comprehensive literature review within those categories and identify the approach of treating a fall as an abnormal activity to be a plausible research direction. We conclude our paper by discussing several open research problems in the field and pointers for future research.Comment: 30 pages, 1 figure, 3 Table

    Batch-based Activity Recognition from Egocentric Photo-Streams Revisited

    Full text link
    Wearable cameras can gather large a\-mounts of image data that provide rich visual information about the daily activities of the wearer. Motivated by the large number of health applications that could be enabled by the automatic recognition of daily activities, such as lifestyle characterization for habit improvement, context-aware personal assistance and tele-rehabilitation services, we propose a system to classify 21 daily activities from photo-streams acquired by a wearable photo-camera. Our approach combines the advantages of a Late Fusion Ensemble strategy relying on convolutional neural networks at image level with the ability of recurrent neural networks to account for the temporal evolution of high level features in photo-streams without relying on event boundaries. The proposed batch-based approach achieved an overall accuracy of 89.85\%, outperforming state of the art end-to-end methodologies. These results were achieved on a dataset consists of 44,902 egocentric pictures from three persons captured during 26 days in average

    Toward Automated Classroom Observation: Multimodal Machine Learning to Estimate CLASS Positive Climate and Negative Climate

    Full text link
    In this work we present a multi-modal machine learning-based system, which we call ACORN, to analyze videos of school classrooms for the Positive Climate (PC) and Negative Climate (NC) dimensions of the CLASS observation protocol that is widely used in educational research. ACORN uses convolutional neural networks to analyze spectral audio features, the faces of teachers and students, and the pixels of each image frame, and then integrates this information over time using Temporal Convolutional Networks. The audiovisual ACORN's PC and NC predictions have Pearson correlations of 0.550.55 and 0.630.63 with ground-truth scores provided by expert CLASS coders on the UVA Toddler dataset (cross-validation on n=300n=300 15-min video segments), and a purely auditory ACORN predicts PC and NC with correlations of 0.360.36 and 0.410.41 on the MET dataset (test set of n=2000n=2000 videos segments). These numbers are similar to inter-coder reliability of human coders. Finally, using Graph Convolutional Networks we make early strides (AUC=0.700.70) toward predicting the specific moments (45-90sec clips) when the PC is particularly weak/strong. Our findings inform the design of automatic classroom observation and also more general video activity recognition and summary recognition systems.Comment: The authors discovered that the results are not reproducibl

    A Survey on Content-Aware Video Analysis for Sports

    Full text link
    Sports data analysis is becoming increasingly large-scale, diversified, and shared, but difficulty persists in rapidly accessing the most crucial information. Previous surveys have focused on the methodologies of sports video analysis from the spatiotemporal viewpoint instead of a content-based viewpoint, and few of these studies have considered semantics. This study develops a deeper interpretation of content-aware sports video analysis by examining the insight offered by research into the structure of content under different scenarios. On the basis of this insight, we provide an overview of the themes particularly relevant to the research on content-aware systems for broadcast sports. Specifically, we focus on the video content analysis techniques applied in sportscasts over the past decade from the perspectives of fundamentals and general review, a content hierarchical model, and trends and challenges. Content-aware analysis methods are discussed with respect to object-, event-, and context-oriented groups. In each group, the gap between sensation and content excitement must be bridged using proper strategies. In this regard, a content-aware approach is required to determine user demands. Finally, the paper summarizes the future trends and challenges for sports video analysis. We believe that our findings can advance the field of research on content-aware video analysis for broadcast sports.Comment: Accepted for publication in IEEE Transactions on Circuits and Systems for Video Technology (TCSVT
    • …
    corecore