85 research outputs found

    Action Search: Spotting Actions in Videos and Its Application to Temporal Action Localization

    State-of-the-art temporal action detectors inefficiently search the entire video for specific actions. Despite the encouraging progress these methods have achieved, it is crucial to design automated approaches that explore only the parts of the video most relevant to the actions being searched for. To address this need, we propose the new problem of action spotting in video, which we define as finding a specific action in a video while observing only a small portion of that video. Inspired by the observation that humans are extremely efficient and accurate at spotting and finding action instances in video, we propose Action Search, a novel Recurrent Neural Network approach that mimics the way humans spot actions. Moreover, to address the absence of data recording the behavior of human annotators, we put forward the Human Searches dataset, which compiles the search sequences employed by human annotators spotting actions in the AVA and THUMOS14 datasets. We consider temporal action localization as an application of the action spotting problem. Experiments on the THUMOS14 dataset reveal that our model not only explores the video efficiently (observing on average 17.3% of the video) but also accurately finds human activities, with 30.8% mAP.
    Comment: Accepted to ECCV 2018.
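    Read as pseudocode, the search procedure described above can be sketched as follows. This is a minimal PyTorch-style illustration, assuming pre-extracted frame features; the module sizes, the sigmoid location head, and the fixed observation budget are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ActionSpotter(nn.Module):
    """Recurrent searcher: observe one frame, propose where to look next."""

    def __init__(self, feature_dim=2048, hidden_dim=512):
        super().__init__()
        self.rnn = nn.LSTMCell(feature_dim, hidden_dim)
        self.next_location = nn.Linear(hidden_dim, 1)  # normalized position in [0, 1]

    def forward(self, frame_feature, state=None):
        h, c = self.rnn(frame_feature, state)
        loc = torch.sigmoid(self.next_location(h))
        return loc, (h, c)

def search(video_features, model, start=0.5, budget=20):
    """Hop through the video, observing one frame per step until the budget is spent."""
    n = video_features.shape[0]
    loc, state = torch.tensor([[start]]), None
    visited = []
    for _ in range(budget):
        idx = int(loc.item() * (n - 1))
        visited.append(idx)  # frames actually observed while searching
        loc, state = model(video_features[idx:idx + 1], state)
    return visited
```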

    Scalable Federated Learning for Clients with Different Input Image Sizes and Numbers of Output Categories

    Federated learning is a privacy-preserving training method in which a model is trained across a plurality of clients without sharing their confidential data. However, previous work on federated learning does not explore suitable neural network architectures for clients with different input image sizes and different numbers of output categories. In this paper, we propose an effective federated learning method named ScalableFL, in which the depths and widths of the local models are adjusted according to each client's input image size and number of output categories. In addition, we provide a new bound for the generalization gap of federated learning. In particular, this bound helps to explain the effectiveness of our scalable neural network approach. We demonstrate the effectiveness of ScalableFL in several heterogeneous client settings for both image classification and object detection tasks.
    Comment: 15 pages, 1 figure, 2023 22nd International Conference on Machine Learning and Applications (ICMLA).
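    A hedged sketch of the scaling idea, assuming a simple convolutional family: each client builds a local model whose depth grows with its input resolution and whose width grows with its number of output categories. The specific scaling rules and layer choices below are illustrative, not the formulas from the paper.

```python
import torch.nn as nn

def build_local_model(image_size: int, num_classes: int) -> nn.Sequential:
    """Depth grows with input resolution, width with the number of categories."""
    depth = max(2, image_size.bit_length() - 4)    # e.g. 32px -> 2 stages, 224px -> 4
    width = 32 * max(1, num_classes // 10)         # e.g. 10 classes -> 32 base channels

    layers, in_ch = [], 3
    for stage in range(depth):
        out_ch = width * (2 ** stage)
        layers += [nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
                   nn.BatchNorm2d(out_ch),
                   nn.ReLU(inplace=True)]
        in_ch = out_ch
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(in_ch, num_classes)]
    return nn.Sequential(*layers)

# Two heterogeneous clients drawn from the same architecture family.
client_a = build_local_model(image_size=32, num_classes=10)    # small images, few classes
client_b = build_local_model(image_size=224, num_classes=100)  # large images, many classes
```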

    Application of physical properties measurements to lithological prediction and constrained inversion of potential field data, Victoria Property, Sudbury, Canada.

    In recent years the number of near-surface deposits has decreased significantly; consequently, exploration companies are transitioning from surface-based exploration to subsurface exploration. Geophysical methods are an important tool for exploring below the surface. Physical property data are numerical data derived from geophysical measurements that can be analyzed to extract patterns showing how these measurements vary across different geological units. Knowledge of the links between physical properties and geology is potentially useful for obtaining a more precise understanding of subsurface geology.

    Firstly, down-hole density, gamma radioactivity, and magnetic susceptibility measurements in five drillholes at the Victoria property, Sudbury, Ontario were analyzed to identify meaningful patterns of variation in the physical property measurements. The measurements grouped into distinct clusters, identified by the fuzzy k-means algorithm and termed 'physical log units'. There was a meaningful spatial and statistical correlation between these physical log units and the lithological units (or groups of lithological units) classified by the geologist. The existence of these relationships suggests that it might be possible to train a classifier to produce an inferred function quantifying this link, which can be used to predict lithological units and physical units from physical property data. A neural network was trained on the lithological information from one hole and applied to a new hole, with 64% of the rock types being correctly classified when compared with those logged by geologists. Misclassification can occur as a result of overlap between the physical properties of rock types. However, the predictive accuracy in the training process rose to 95% when the network was trained to classify the physical log units (which group together the units with overlapping properties).

    Secondly, lithological prediction based on down-hole physical property measurements was extended from the borehole to three-dimensional space at the Victoria property. Density and magnetic susceptibility models were produced by geologically constrained inversion of gravity and magnetic field data, and a neural network was trained to predict lithological units from the two physical properties measured in seven holes. The trained network was then applied to the 3D distribution of the two physical properties derived from the inversion models to produce a 3D litho-prediction model. The lithologies used were simplified to remove potential ambiguities due to overlapping physical properties. The resulting 3D model was consistent with the geophysical data and gave a more holistic understanding of the subsurface lithology.

    Finally, to extract more information from the geophysical logs, the density and gamma-ray response logs were analyzed to detect boundaries between lithological units. A derivative method was successfully applied to the down-hole logs; it picked the boundaries between rock types identified by geologists and provided additional information describing the variation of physical properties within and between layers not identified by the geologist.
    Doctor of Philosophy (PhD) in Mineral Deposits and Precambrian Geology
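    The borehole part of this workflow can be sketched with common scikit-learn pieces, under stated substitutions: a Gaussian mixture provides the soft clustering into physical log units (standing in for fuzzy k-means), a small neural network is trained on one hole and applied to another, and a simple derivative rule flags candidate lithological boundaries. File names, column names, and thresholds are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical down-hole logs: one row per depth sample.
train = pd.read_csv("hole_A_logs.csv")   # columns: density, gamma, mag_susc, lithology
test = pd.read_csv("hole_B_logs.csv")

features = ["density", "gamma", "mag_susc"]
scaler = StandardScaler().fit(train[features])
X_train, X_test = scaler.transform(train[features]), scaler.transform(test[features])

# Soft clustering into "physical log units" (a Gaussian mixture stands in for fuzzy k-means).
log_units = GaussianMixture(n_components=6, random_state=0).fit_predict(X_train)

# Train a classifier on one hole, predict rock types in another.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
clf.fit(X_train, train["lithology"])
predicted_lithology = clf.predict(X_test)

# Derivative-based boundary picking on a single log: flag large changes with depth.
density = test["density"].to_numpy()
d_density = np.abs(np.gradient(density))
boundaries = np.where(d_density > 3 * d_density.std())[0]
```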

    Learning with online constraints : shifting concepts and active learning

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 99-102).

    Many practical problems, such as forecasting, real-time decision making, streaming data applications, and resource-constrained learning, can be modeled as learning with online constraints. This thesis is concerned with analyzing and designing algorithms for learning under the following online constraints: i) the algorithm has only sequential, or one-at-a-time, access to data; ii) the time and space complexity of the algorithm must not scale with the number of observations. We analyze learning with online constraints in a variety of settings, including active learning. The active learning model is applicable to any domain in which unlabeled data is easy to come by and there exists a (potentially difficult or expensive) mechanism by which to attain labels.

    First, we analyze a supervised learning framework in which no statistical assumptions are made about the sequence of observations, and algorithms are evaluated based on their regret, i.e. their relative prediction loss with respect to the hindsight-optimal algorithm in a comparator class. We derive a lower bound on regret for a class of online learning algorithms designed to track shifting concepts in this framework. We apply an algorithm we provided in previous work, which avoids this lower bound, to an energy-management problem in wireless networks, and demonstrate this application in a network simulation.

    Second, we analyze a supervised learning framework in which the observations are assumed to be iid, and algorithms are compared by the number of prediction mistakes made in reaching a target generalization error. We provide a lower bound on mistakes for Perceptron, a standard online learning algorithm, in this framework. We introduce a modification to Perceptron and show that it avoids this lower bound and in fact attains the optimal mistake complexity for this setting.

    Third, we motivate and analyze an online active learning framework. The observations are assumed to be iid, and algorithms are judged by the number of label queries needed to reach a target generalization error. Our lower bound applies to the active learning setting as well, as a lower bound on labels for Perceptron paired with any active learning rule. We provide a new online active learning algorithm that avoids the lower bound, and we upper bound its label complexity. The upper bound is optimal and also bounds the algorithm's total errors (labeled and unlabeled). We analyze the algorithm further, yielding a label-complexity bound under relaxed assumptions. Using optical character recognition data, we empirically compare the new algorithm to an online active learning algorithm with data-dependent performance guarantees, as well as to combined variants of these two algorithms.
    by Claire E. Monteleoni. Ph.D.
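    A simplified, margin-based sketch in the spirit of the online active learning setting described here: a Perceptron-style learner that queries a label only when its current prediction has a small margin, and otherwise predicts for free. The query threshold and update rule are illustrative assumptions, not the thesis algorithm.

```python
import numpy as np

def online_active_perceptron(stream, query_threshold=0.2):
    """stream yields (x, get_label); get_label() is called only when a label is queried."""
    w, labels_used = None, 0
    for x, get_label in stream:
        if w is None:
            w = np.zeros_like(x, dtype=float)
        margin = float(w @ x) / (np.linalg.norm(x) + 1e-12)
        if abs(margin) < query_threshold:      # uncertain prediction: pay for a label
            y = get_label()                    # y in {-1, +1}
            labels_used += 1
            if y * (w @ x) <= 0:               # mistake: standard Perceptron update
                w = w + y * x
        # otherwise predict sign(w @ x) without querying and move on
    return w, labels_used
```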

    SPAN: A Stochastic Projected Approximate Newton Method

    Second-order optimization methods have desirable convergence properties. However, the exact Newton method requires expensive computation of the Hessian and its inverse. In this paper, we propose SPAN, a novel approximate and fast Newton method. SPAN computes the inverse of the Hessian matrix via low-rank approximation and stochastic Hessian-vector products. Our experiments on multiple benchmark datasets demonstrate that SPAN outperforms existing first-order and second-order optimization methods in terms of convergence wall-clock time. Furthermore, we provide a theoretical analysis of the per-iteration complexity, the approximation error, and the convergence rate. Both the theoretical analysis and the experimental results show that our proposed method achieves a better trade-off between the convergence rate and per-iteration efficiency.
    Comment: Appeared in AAAI 2020, 25 pages, 6 figures.
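    A hedged sketch of the general idea, assuming a randomized-sketch construction: Hessian-vector products build a low-rank projected Hessian, which is inverted cheaply to produce a Newton-like step. The details below are illustrative, not the paper's SPAN construction.

```python
import numpy as np

def approx_newton_step(grad, hvp, dim, rng, rank=10, damping=1e-6):
    """grad: current gradient; hvp(v): a (possibly stochastic) Hessian-vector product H @ v."""
    omega = rng.standard_normal((dim, rank))
    Y = np.column_stack([hvp(omega[:, j]) for j in range(rank)])  # H @ Omega sketches range(H)
    Q, _ = np.linalg.qr(Y)                                        # orthonormal basis of the sketch
    H_small = Q.T @ np.column_stack([hvp(Q[:, j]) for j in range(rank)])  # projected Hessian
    core = np.linalg.solve(H_small + damping * np.eye(rank), Q.T @ grad)
    return -Q @ core                                              # Newton-like step in the sketched subspace

# Toy usage on a quadratic f(x) = 0.5 x^T A x - b^T x, where H @ v is simply A @ v.
rng = np.random.default_rng(0)
dim = 50
A = np.diag(np.linspace(1.0, 10.0, dim))
b = np.ones(dim)
x = np.zeros(dim)
for _ in range(50):
    grad = A @ x - b
    x = x + approx_newton_step(grad, lambda v: A @ v, dim, rng)
```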