1,223 research outputs found
Fast Non-Parametric Learning to Accelerate Mixed-Integer Programming for Online Hybrid Model Predictive Control
Today's fast linear algebra and numerical optimization tools have pushed the
frontier of model predictive control (MPC) forward, to the efficient control of
highly nonlinear and hybrid systems. The field of hybrid MPC has demonstrated
that exact optimal control laws can be computed, e.g., by mixed-integer
programming (MIP) under piecewise-affine (PWA) system models. Despite the
elegant theory, solving hybrid MPC online is still out of reach for many
applications. We aim to speed up MIP by combining geometric insights from
hybrid MPC, a simple-yet-effective learning algorithm, and MIP warm start
techniques. Following a line of work in approximate explicit MPC, the proposed
learning-control algorithm, LNMS, gains computational advantage over MIP at
little cost and is straightforward for practitioners to implement.
Structured learning of metric ensembles with application to person re-identification
Matching individuals across non-overlapping camera networks, known as person
re-identification, is a fundamentally challenging problem due to the large
visual appearance changes caused by variations of viewpoints, lighting, and
occlusion. Approaches in the literature can be categorized into two streams: the
first stream is to develop reliable features against realistic conditions by
combining several visual features in a pre-defined way; the second stream is to
learn a metric from training data to ensure strong inter-class differences and
intra-class similarities. However, seeking an optimal combination of visual
features which is generic yet adaptive to different benchmarks is an unsolved
problem, and metric learning models easily get over-fitted due to the scarcity
of training data in person re-identification. In this paper, we propose two
effective structured learning based approaches which explore the adaptive
effects of visual features in recognizing persons in different benchmark data
sets. Our framework is built on the basis of multiple low-level visual features
with an optimal ensemble of their metrics. We formulate two optimization
algorithms, CMCtriplet and CMCstruct, which directly optimize the evaluation
measure commonly used in person re-identification, known as the
Cumulative Matching Characteristic (CMC) curve.
Comment: 16 pages. Extended version of "Learning to Rank in Person
Re-Identification With Metric Ensembles", at
http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Paisitkriangkrai_Learning_to_Rank_2015_CVPR_paper.html.
arXiv admin note: text overlap with arXiv:1503.0154
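For readers unfamiliar with the CMC curve named in this abstract, the following is a minimal sketch of how such a curve is commonly computed from a query-by-gallery distance matrix. It is not the CMCtriplet or CMCstruct optimization from the paper. The function name `cmc_curve` and its arguments are illustrative assumptions, and the sketch assumes every query identity appears in the gallery.

```python
def cmc_curve(dist, query_ids, gallery_ids):
    """Return a list whose entry k-1 is the fraction of queries whose
    correct identity appears among the k nearest gallery items.

    dist[q][g] is the distance between query q and gallery item g.
    Assumes (illustratively) that every query identity is in the gallery.
    """
    n_gallery = len(gallery_ids)
    counts = [0] * n_gallery
    for row, qid in zip(dist, query_ids):
        # Gallery indices sorted from nearest to farthest for this query.
        order = sorted(range(n_gallery), key=lambda g: row[g])
        ranked_ids = [gallery_ids[g] for g in order]
        # 0-based rank of the first correct match (ValueError if absent).
        counts[ranked_ids.index(qid)] += 1
    # Accumulate the per-rank counts into matching rates.
    cmc, total = [], 0
    for c in counts:
        total += c
        cmc.append(total / len(query_ids))
    return cmc
```

The first entry of the returned list is the rank-1 matching rate, the figure most often quoted in re-identification benchmarks.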
GraphMatch: Efficient Large-Scale Graph Construction for Structure from Motion
We present GraphMatch, an approximate yet efficient method for building the
matching graph for large-scale structure-from-motion (SfM) pipelines. Unlike
modern SfM pipelines that use vocabulary (Voc.) trees to quickly build the
matching graph and avoid a costly brute-force search of matching image pairs,
GraphMatch does not require an expensive offline pre-processing phase to
construct a Voc. tree. Instead, GraphMatch leverages two priors that can
predict which image pairs are likely to match, thereby making the matching
process for SfM much more efficient. The first is a score computed from the
distance between the Fisher vectors of any two images. The second prior is
based on the graph distance between vertices in the underlying matching graph.
GraphMatch combines these two priors into an iterative "sample-and-propagate"
scheme similar to the PatchMatch algorithm. Its sampling stage uses Fisher
similarity priors to guide the search for matching image pairs, while its
propagation stage explores neighbors of matched pairs to find new ones with a
high image similarity score. Our experiments show that GraphMatch finds more
matching image pairs than competing approximate methods while also being the
most efficient.
Comment: Published at IEEE 3DV 201
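The sample-and-propagate idea described in this abstract can be illustrated with a toy loop. This is a sketch under stated assumptions, not the GraphMatch implementation: `sim` stands in for the Fisher-vector similarity prior, `verify` is a hypothetical callback standing in for pairwise geometric verification, and the parameters `n_iters` and `n_samples` are invented for the example.

```python
def sample_and_propagate(sim, verify, n_iters=3, n_samples=2):
    """Toy matching-graph construction over images 0..n-1.

    sim[i][j] : similarity prior between images i and j (assumed given)
    verify    : hypothetical callable (i, j) -> bool that confirms a match
    Returns the set of confirmed pairs (i, j) with i < j.
    """
    n = len(sim)
    edges, tested = set(), set()

    def try_pair(i, j):
        pair = (min(i, j), max(i, j))
        if i == j or pair in tested:
            return
        tested.add(pair)
        if verify(*pair):
            edges.add(pair)

    for _ in range(n_iters):
        # Sampling: test each image against its most similar untested candidates.
        for i in range(n):
            untested = [j for j in range(n)
                        if j != i and (min(i, j), max(i, j)) not in tested]
            untested.sort(key=lambda j: -sim[i][j])
            for j in untested[:n_samples]:
                try_pair(i, j)
        # Propagation: for a matched pair (u, v), neighbors of v in the
        # current graph become candidate matches for u, best prior first.
        neighbors = {}
        for i, j in edges:
            neighbors.setdefault(i, set()).add(j)
            neighbors.setdefault(j, set()).add(i)
        for i, j in list(edges):
            for u, v in ((i, j), (j, i)):
                candidates = sorted(neighbors[v] - {u}, key=lambda k: -sim[u][k])
                for k in candidates[:n_samples]:
                    try_pair(u, k)
    return edges
```

Because each pair is verified at most once and only a few candidates per image are tried in each pass, the number of verifications stays well below the brute-force count when the prior ranks true matches highly.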
Preferences in Case-Based Reasoning
Case-based reasoning (CBR) is a well-established problem solving paradigm
that has been used in a wide range of real-world applications. Despite
its great practical success, work on the theoretical foundations of CBR is
still under way, and a coherent and universally applicable methodological
framework is still missing. The absence of such a framework motivated the
work developed in this thesis. Drawing on recent research
on preference handling in Artificial Intelligence and related fields, this
work aims to develop a theoretically well-founded framework on the basis
of formal concepts and methods for knowledge representation and reasoning
with preferences.
A Robotic System for Learning Visually-Driven Grasp Planning (Dissertation Proposal)
We use findings in machine learning, developmental psychology, and neurophysiology to guide a robotic learning system's level of representation both for actions and for percepts. Visually-driven grasping is chosen as the experimental task since it has general applicability and it has been extensively researched from several perspectives. An implementation of a robotic system with a gripper, compliant instrumented wrist, arm and vision is used to test these ideas. Several sensorimotor primitives (vision segmentation and manipulatory reflexes) are implemented in this system and may be thought of as the innate perceptual and motor abilities of the system.
Applying empirical learning techniques to real situations brings up such important issues as observation sparsity in high-dimensional spaces, arbitrary underlying functional forms of the reinforcement distribution and robustness to noise in exemplars. The well-established technique of non-parametric projection pursuit regression (PPR) is used to accomplish reinforcement learning by searching for projections of high-dimensional data sets that capture task invariants.
We also pursue the following problem: how can we use human expertise and insight into grasping to train a system to select both appropriate hand preshapes and approaches for a wide variety of objects, and then have it verify and refine its skills through trial and error? To accomplish this learning we propose a new class of Density Adaptive reinforcement learning algorithms. These algorithms use statistical tests to identify possibly interesting regions of the attribute space in which the dynamics of the task change. They automatically concentrate the building of high resolution descriptions of the reinforcement in those areas, and build low resolution representations in regions that are either not populated in the given task or are highly uniform in outcome.
Additionally, the use of any learning process generally implies failures along the way. Therefore, the mechanics of the untrained robotic system must be able to tolerate mistakes during learning and not damage itself. We address this by the use of an instrumented, compliant robot wrist that controls impact forces.
Voronoi-Based Compact Image Descriptors: Efficient Region-of-Interest Retrieval With VLAD and Deep-Learning-Based Descriptors
We investigate the problem of image retrieval based on visual queries when the latter comprise arbitrary regions-of-interest (ROI) rather than entire images. Our proposal is a compact image descriptor that combines the state-of-the-art in content-based descriptor extraction with a multi-level, Voronoi-based spatial partitioning of each dataset image. The proposed multi-level Voronoi-based encoding uses a spatial hierarchical K-means over interest-point locations, and computes a content-based descriptor over each cell. In order to reduce the matching complexity with minimal or no sacrifice in retrieval performance: (i) we utilize the tree structure of the spatial hierarchical K-means to perform a top-to-bottom pruning for local similarity maxima; (ii) we propose a new image similarity score that combines relevant information from all partition levels into a single measure for similarity; (iii) we combine our proposal with a novel and efficient approach for optimal bit allocation within quantized descriptor representations. By deriving both a Voronoi-based VLAD descriptor (termed as Fast-VVLAD) and a Voronoi-based deep convolutional neural network (CNN) descriptor (termed as Fast-VDCNN), we demonstrate that our Voronoi-based framework is agnostic to the descriptor basis, and can easily be slotted into existing frameworks. Via a range of ROI queries in two standard datasets, it is shown that the Voronoi-based descriptors achieve comparable or higher mean Average Precision against conventional grid-based spatial search, while offering more than two-fold reduction in complexity. Finally, beyond ROI queries, we show that Voronoi partitioning improves the geometric invariance of compact CNN descriptors, thereby resulting in competitive performance to the current state-of-the-art on whole image retrieval.
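As background for the VLAD descriptor that this abstract builds on, here is a minimal sketch of plain VLAD aggregation: each local descriptor is assigned to its nearest codebook centroid, residuals are summed per centroid, and the concatenated result is L2-normalized. It does not reproduce the Voronoi partitioning, pruning, or bit allocation from the paper, where a descriptor of this kind would be computed per cell. The function name `vlad` and the toy data are assumptions made for the example.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors against a codebook into one
    L2-normalized vector of length len(centroids) * dim."""
    dim = len(centroids[0])
    residuals = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Index of the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(len(centroids)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(d, centroids[c])),
        )
        # Accumulate the residual between the descriptor and that centroid.
        for t in range(dim):
            residuals[nearest][t] += d[t] - centroids[nearest][t]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

The output length depends only on the codebook size and descriptor dimension, not on the number of local descriptors, which is what makes the representation compact.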
Containing Analog Data Deluge at Edge through Frequency-Domain Compression in Collaborative Compute-in-Memory Networks
Edge computing is a promising solution for handling high-dimensional,
multispectral analog data from sensors and IoT devices for applications such as
autonomous drones. However, edge devices' limited storage and computing
resources make it challenging to perform complex predictive modeling at the
edge. Compute-in-memory (CiM) has emerged as a principal paradigm to minimize
energy for deep learning-based inference at the edge. Nevertheless, integrating
storage and processing complicates memory cells and/or memory peripherals,
essentially trading off area efficiency for energy efficiency. This paper
proposes a novel solution to improve area efficiency in deep learning inference
tasks. The proposed method employs two key strategies. Firstly, a
frequency-domain learning approach uses binarized Walsh-Hadamard transforms, reducing the
number of DNN parameters (by 87% in MobileNetV2) and enabling
compute-in-SRAM, which better utilizes parallelism during inference. Secondly,
a memory-immersed collaborative digitization method is described among CiM
arrays to reduce the area overheads of conventional ADCs. This facilitates more
CiM arrays in limited footprint designs, leading to better parallelism and
reduced external memory accesses. Different networking configurations are
explored, where Flash, SA, and their hybrid digitization steps can be
implemented using the memory-immersed scheme. The results are demonstrated
using a 65 nm CMOS test chip, exhibiting significant area and energy savings
compared to a 40 nm-node 5-bit SAR ADC and 5-bit Flash ADC. By processing
analog data more efficiently, it is possible to selectively retain valuable
data from sensors and alleviate the challenges posed by the analog data deluge.
Comment: arXiv admin note: text overlap with arXiv:2307.03863,
arXiv:2309.0177
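The Walsh-Hadamard transform mentioned in this abstract uses a matrix whose entries are all +1 or -1, so it can be evaluated with additions and subtractions only. The sketch below shows the standard unnormalized fast transform in natural (Hadamard) order. It does not model the paper's binarized frequency-domain layers or the compute-in-memory hardware, and the function name `fwht` is an illustrative assumption.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order.
    The input length must be a power of two; the input is not modified."""
    x = list(values)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine elements h apart using only add and subtract.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
        h *= 2
    return x
```

Applying the transform twice returns the input scaled by its length, so the same routine serves as its own inverse up to a factor of 1/n.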