Unlabeled Sensing with Random Linear Measurements
We study the problem of solving a linear sensing system when the observations are unlabeled. Specifically, we seek a solution to a linear system of equations y = Ax when the order of the observations in the vector y is unknown. Focusing on the setting in which A is a random matrix with i.i.d. entries, we show that if A admits an oversampling ratio of 2 or higher, then, with probability 1, it is possible to recover x exactly without knowledge of the order of the observations in y. Furthermore, if x is of dimension K, then any 2K entries of y are sufficient to recover x. This result implies the existence of deterministic unlabeled sensing matrices with an oversampling factor of 2 that admit perfect reconstruction. The result is universal in that, conditioned on the realization of the matrix A, recovery is guaranteed for all possible choices of x. While the proof is constructive, it relies on a combinatorial algorithm that is not practical, leaving the question of complexity open. We also analyze a noisy version of the problem and show that the solution is locally stable. In particular, for every x, the recovery error tends to zero as the signal-to-noise ratio tends to infinity. The question of universal stability remains open. In addition, we obtain a converse of the result in the noiseless case: if the number of observations in y is less than 2K, then with probability 1, universal recovery fails, i.e., with probability 1, there exist distinct choices of x that lead to the same unordered list of observations in y. We also present extensions of the noiseless result to special cases with non-i.i.d. entries in A, and to a different setting in which the labels of a portion of the observations in y are known.
In terms of applications, the unlabeled sensing problem is related to data association problems encountered in different domains, including robotics, where it appears in a method called "simultaneous localization and mapping"; multi-target tracking applications; and sampling of signals in the presence of jitter.
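The impractical combinatorial recovery that the abstract alludes to can be illustrated on a toy instance by exhaustively testing every ordering of the unlabeled observations against a least-squares consistency check. The dimensions, random seed, and consistency criterion below are illustrative assumptions, not the paper's stated construction:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
K, n = 2, 4                       # signal dimension K, n = 2K observations
A = rng.standard_normal((n, K))   # i.i.d. Gaussian sensing matrix
x_true = rng.standard_normal(K)
y_shuffled = rng.permutation(A @ x_true)  # observations arrive unordered

# Exhaustive search over orderings: keep the one consistent with A.
best_x, best_res = None, np.inf
for perm in itertools.permutations(range(n)):
    y_perm = y_shuffled[list(perm)]
    x_hat = np.linalg.lstsq(A, y_perm, rcond=None)[0]
    res = np.linalg.norm(A @ x_hat - y_perm)
    if res < best_res:
        best_res, best_x = res, x_hat

print(np.allclose(best_x, x_true))  # the correct ordering yields zero residual
```

The n! orderings make this search hopeless beyond tiny n, which is exactly the complexity question the abstract leaves open.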
Multiple pattern classification by sparse subspace decomposition
A robust classification method is developed on the basis of sparse subspace decomposition. The method decomposes a mixture of subspaces of unlabeled data (queries) into as few class subspaces as possible. Each query is classified into the class whose subspace contributes significantly to the decomposed subspace. Multiple queries from different classes can be classified simultaneously into their respective classes. A practical greedy algorithm for the sparse subspace decomposition is designed for the classification. The method achieves a high recognition rate and robust performance by exploiting joint sparsity. Comment: 8 pages, 3 figures, 2nd IEEE International Workshop on Subspace Methods, Workshop Proceedings of ICCV 200
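A greedy subspace-selection strategy of the kind the abstract describes can be sketched as follows; the function name, residual criterion, and toy data are hypothetical illustrations, not the authors' algorithm:

```python
import numpy as np

def greedy_subspace_decomposition(Y, bases, tol=1e-8):
    """Greedily pick class subspaces that jointly explain the queries Y.

    Y     : (d, q) matrix of unlabeled query vectors (columns).
    bases : list of (d, k_i) orthonormal bases, one per class.
    Returns indices of selected classes (illustrative sketch only).
    """
    selected, B = [], np.zeros((Y.shape[0], 0))
    while len(selected) < len(bases):
        best, best_res = None, np.inf
        for i, U in enumerate(bases):
            if i in selected:
                continue
            C = np.hstack([B, U])          # candidate combined basis
            P = C @ np.linalg.pinv(C)      # projector onto span(C)
            r = np.linalg.norm(Y - P @ Y)  # joint residual over all queries
            if r < best_res:
                best, best_res = i, r
        selected.append(best)
        B = np.hstack([B, bases[best]])
        if best_res < tol:                 # queries fully explained: stop early
            break
    return selected

rng = np.random.default_rng(1)
d = 6
bases = [np.linalg.qr(rng.standard_normal((d, 2)))[0] for _ in range(3)]
# Queries drawn jointly from classes 0 and 2 only.
Y = np.hstack([bases[0] @ rng.standard_normal((2, 2)),
               bases[2] @ rng.standard_normal((2, 2))])
sel = greedy_subspace_decomposition(Y, bases)
```

Stopping as soon as the joint residual vanishes is what keeps the number of selected class subspaces small, mirroring the "as few class subspaces as possible" objective.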
Task-Driven Dictionary Learning
Modeling data with linear combinations of a few elements from a learned
dictionary has been the focus of much recent research in machine learning,
neuroscience and signal processing. For signals such as natural images that
admit such sparse representations, it is now well established that these models
are well suited to restoration tasks. In this context, learning the dictionary
amounts to solving a large-scale matrix factorization problem, which can be
done efficiently with classical optimization tools. The same approach has also
been used for learning features from data for other purposes, e.g., image
classification, but tuning the dictionary in a supervised way for these tasks
has proven to be more difficult. In this paper, we present a general
formulation for supervised dictionary learning adapted to a wide variety of
tasks, and present an efficient algorithm for solving the corresponding
optimization problem. Experiments on handwritten digit classification, digital
art identification, nonlinear inverse image problems, and compressed sensing
demonstrate that our approach is effective in large-scale settings, and is well
suited to supervised and semi-supervised classification, as well as regression
tasks for data that admit sparse representations. Comment: final draft post-refereein
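The sparse-coding building block that such dictionary-learning formulations rest on — representing a signal as a sparse combination of dictionary atoms — can be sketched with ISTA applied to the lasso problem. The dictionary, sizes, and parameters below are illustrative assumptions; the paper's actual contribution (tuning the dictionary in a supervised way) is not reproduced here:

```python
import numpy as np

def ista_sparse_code(D, x, lam=0.05, n_iter=500):
    """Solve min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA (illustrative)."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - x) / L   # gradient step on the smooth term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return a

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)          # unit-norm dictionary atoms
a_true = np.zeros(50)
a_true[[3, 17, 42]] = [1.0, -1.0, 1.5]  # 3-sparse ground-truth code
x = D @ a_true
a_hat = ista_sparse_code(D, x)          # recovers a sparse code close to a_true
```

In unsupervised dictionary learning, this coding step alternates with dictionary updates; the task-driven formulation instead optimizes the dictionary through the solution of this inner problem for a supervised loss.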
Demographic Inference and Representative Population Estimates from Multilingual Social Media Data
Social media provide access to behavioural data at an unprecedented scale and
granularity. However, using these data to understand phenomena in a broader
population is difficult due to their non-representativeness and the bias of
statistical inference tools towards dominant languages and groups. While
demographic attribute inference could be used to mitigate such bias, current
techniques are almost entirely monolingual and fail to work in a global
environment. We address these challenges by combining multilingual demographic
inference with post-stratification to create a more representative population
sample. To learn demographic attributes, we create a new multimodal deep neural
architecture for joint classification of age, gender, and organization-status
of social media users that operates in 32 languages. This method substantially
outperforms current state of the art while also reducing algorithmic bias. To
correct for sampling biases, we propose fully interpretable multilevel
regression methods that estimate inclusion probabilities from inferred joint
population counts and ground-truth population counts. In a large experiment
over multilingual heterogeneous European regions, we show that our demographic
inference and bias correction together allow for more accurate estimates of
populations and take a significant step towards representative social sensing
in downstream applications with multilingual social media. Comment: 12 pages, 10 figures, Proceedings of the 2019 World Wide Web
Conference (WWW '19
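The post-stratification idea — estimating inclusion probabilities from inferred sample counts and ground-truth population counts, then reweighting — can be sketched in a few lines; the strata, counts, and outcome values below are invented for illustration and do not come from the paper:

```python
import numpy as np

strata = ["18-29", "30-49", "50+"]                   # hypothetical strata
sample_counts = np.array([600.0, 300.0, 100.0])      # inferred counts in the sample
census_counts = np.array([2000.0, 3500.0, 4500.0])   # ground-truth population counts

incl_prob = sample_counts / census_counts            # estimated inclusion probabilities
weights = 1.0 / incl_prob                            # inverse-probability weights

# Estimate a population mean of a per-user outcome from per-stratum sample means.
y_strata_mean = np.array([0.7, 0.5, 0.2])
naive = sample_counts @ y_strata_mean / sample_counts.sum()
w_counts = weights * sample_counts                   # reweighted stratum sizes
adjusted = w_counts @ y_strata_mean / w_counts.sum()
print(round(naive, 3), round(adjusted, 3))           # naive ≈ 0.59, adjusted ≈ 0.405
```

The over-represented young stratum inflates the naive estimate; reweighting each stratum toward its census share corrects the bias, which is the role the multilevel regression in the paper plays at much finer granularity.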