DeepPCR: Parallelizing Sequential Operations in Neural Networks
Parallelization techniques have become ubiquitous for accelerating inference
and training of deep neural networks. Despite this, several operations are
still performed in a sequential manner. For instance, the forward and backward
passes are executed layer-by-layer, and the output of diffusion models is
produced by applying a sequence of denoising steps. This sequential approach
results in a computational cost proportional to the number of steps involved,
presenting a potential bottleneck as the number of steps increases. In this
work, we introduce DeepPCR, a novel algorithm which parallelizes typically
sequential operations in order to speed up inference and training of neural
networks. DeepPCR is based on interpreting a sequence of steps as the
solution of a specific system of equations, which we recover using the Parallel
Cyclic Reduction algorithm. This reduces the complexity of computing the
sequential operations from O(L) to O(log L), where L is the number of steps, thus
yielding a speedup for large L. To verify the theoretical lower complexity of
the algorithm, and to identify regimes for speedup, we test the effectiveness
of DeepPCR in parallelizing the forward and backward pass in multi-layer
perceptrons, reaching significant speedups for both the forward and the backward
pass. We additionally showcase the flexibility of DeepPCR by parallelizing
training of ResNets with as many as 1024 layers, and generation in diffusion
models, enabling faster training and faster generation, respectively, compared
to the sequential approach.
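The core mechanism above, recasting a chain of sequential steps as a system of equations solved by parallel cyclic reduction, can be sketched on a scalar linear recurrence. This is a minimal illustration of the doubling idea, not the paper's implementation; all names are illustrative:

```python
def sequential_recurrence(a, b):
    """Baseline: x[0] = b[0], x[i] = a[i] * x[i-1] + b[i], one step at a time."""
    x = [b[0]]
    for i in range(1, len(b)):
        x.append(a[i] * x[-1] + b[i])
    return x

def parallel_scan_recurrence(a, b):
    """Solve the same recurrence in O(log L) doubling sweeps.
    Each sweep's inner loop is independent across i, so on parallel
    hardware every sweep costs one step; a sketch of the cyclic-reduction
    idea behind DeepPCR."""
    a, b = list(a), list(b)
    a[0] = 0.0                              # x[-1] is treated as 0
    n, s = len(b), 1
    while s < n:
        for i in range(n - 1, s - 1, -1):   # parallelizable sweep
            b[i] = a[i] * b[i - s] + b[i]   # substitute x[i-s] into x[i]
            a[i] = a[i] * a[i - s]
        s *= 2
    return b                                # b[i] now equals x[i]
```

After log2(L) sweeps each b[i] has absorbed its entire prefix, so the sequential dependency is gone.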
Designing Data: Proactive Data Collection and Iteration for Machine Learning
Lack of diversity in data collection has caused significant failures in
machine learning (ML) applications. While ML developers perform post-collection
interventions, these are time-intensive and rarely comprehensive. Thus, new
methods to track and manage data collection, iteration, and model training are
necessary for evaluating whether datasets reflect real-world variability. We
present designing data, an iterative, bias mitigating approach to data
collection connecting HCI concepts with ML techniques. Our process includes (1)
Pre-Collection Planning, to reflexively prompt and document expected data
distributions; (2) Collection Monitoring, to systematically encourage sampling
diversity; and (3) Data Familiarity, to identify samples that are unfamiliar to
a model through Out-of-Distribution (OOD) methods. We instantiate designing
data through our own data collection and applied ML case study. We find models
trained on "designed" datasets generalize better across intersectional groups
than those trained on similarly sized but less targeted datasets, and that data
familiarity is effective for debugging datasets.
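The Data Familiarity stage relies on OOD scoring. One common heuristic, maximum softmax probability, can serve as an illustrative stand-in (the paper's exact OOD method may differ; names and the threshold are assumptions):

```python
import numpy as np

def familiarity_scores(logits):
    """Maximum softmax probability per sample: higher = more familiar
    to the model. A common OOD baseline, used here as an illustrative
    stand-in for the paper's OOD methods."""
    z = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    return p.max(axis=1)

def flag_unfamiliar(logits, threshold=0.6):
    """Indices of samples the model finds unfamiliar: candidates for
    dataset debugging or targeted re-collection."""
    return np.where(familiarity_scores(logits) < threshold)[0]
```

Low-scoring samples point at regions of the data distribution the model has not learned, which is what makes the score useful for dataset debugging.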
The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning
The mechanisms behind the success of multi-view self-supervised learning
(MVSSL) are not yet fully understood. Contrastive MVSSL methods have been
studied through the lens of InfoNCE, a lower bound of the Mutual Information
(MI). However, the relation between other MVSSL methods and MI remains unclear.
We consider a different lower bound on the MI consisting of an entropy and a
reconstruction term (ER), and analyze the main MVSSL families through its lens.
Through this ER bound, we show that clustering-based methods such as
DeepCluster and SwAV maximize the MI. We also re-interpret the mechanisms of
distillation-based approaches such as BYOL and DINO, showing that they
explicitly maximize the reconstruction term and implicitly encourage a stable
entropy, and we confirm this empirically. We show that replacing the objectives
of common MVSSL methods with this ER bound achieves competitive performance,
while making them stable when training with smaller batch sizes or smaller
exponential moving average (EMA) coefficients.
GitHub repo: https://github.com/apple/ml-entropy-reconstruction.
Comment: 18 pages: 9 of main text, 2 of references, and 7 of supplementary
material. Appears in the proceedings of ICML 202
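The ER bound combines an entropy term on one view's representation with a reconstruction (cross-view log-likelihood) term. A toy NumPy sketch for soft cluster-assignment representations, an illustrative instance of such an objective rather than the paper's exact loss:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def er_objective(logits_a, logits_b, eps=1e-12):
    """Entropy + Reconstruction (ER) style objective for two views
    producing soft cluster assignments. Maximizing it encourages
    (i) high entropy of the marginal assignment (no collapse) and
    (ii) view B's assignment being predictable from view A.
    An illustrative sketch, not the paper's exact loss."""
    p_b = softmax(logits_b)
    marginal = p_b.mean(axis=0)
    entropy = -(marginal * np.log(marginal + eps)).sum()                 # entropy term
    recon = (p_b * np.log(softmax(logits_a) + eps)).sum(axis=1).mean()   # reconstruction term
    return entropy + recon
```

The entropy term penalizes collapse to a single cluster, while the reconstruction term rewards cross-view agreement; the abstract's point is that several MVSSL families implicitly optimize a bound of this shape.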
DUET: 2D Structured and Approximately Equivariant Representations
Multiview Self-Supervised Learning (MSSL) is based on learning invariances
with respect to a set of input transformations. However, invariance partially
or totally removes transformation-related information from the representations,
which might harm performance for specific downstream tasks that require such
information. We propose 2D strUctured and EquivarianT representations (coined
DUET), which are 2D representations organized in a matrix structure, and
equivariant with respect to transformations acting on the input data. DUET
representations maintain information about an input transformation, while
remaining semantically expressive. Compared to SimCLR (Chen et al., 2020)
(unstructured and invariant) and ESSL (Dangovski et al., 2022) (unstructured
and equivariant), the structured and equivariant nature of DUET representations
enables controlled generation with lower reconstruction error, while
controllability is not possible with SimCLR or ESSL. DUET also achieves higher
accuracy for several discriminative tasks, and improves transfer learning.
Comment: Accepted at ICML 202
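Equivariance of the kind DUET targets means an input transformation acts predictably on the representation, whereas an invariant encoder discards it. A toy check, using a row-wise linear map as a stand-in encoder (all names and shapes are illustrative assumptions, not the paper's model):

```python
import numpy as np

def is_equivariant(encode, transform, rep_action, inputs, k, atol=1e-8):
    """True if encode(transform(x, k)) == rep_action(encode(x), k) for all
    inputs, i.e. the input transformation maps to a known action on the
    representation."""
    return all(
        np.allclose(encode(transform(x, k)), rep_action(encode(x), k), atol=atol)
        for x in inputs
    )

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
encode = lambda x: x @ W                   # row-wise map: commutes with row shifts
shift_input = lambda x, k: np.roll(x, k, axis=0)
shift_rep = lambda z, k: np.roll(z, k, axis=0)
inputs = [rng.normal(size=(6, 8)) for _ in range(3)]
```

Here a cyclic shift of the input rows shifts the rows of the 2D representation, so the transformation remains recoverable, which is what enables the controlled generation the abstract describes.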
Antimicrobials: a global alliance for optimizing their rational use in intra-abdominal infections (AGORA)
Manifold clustering for motion segmentation
This thesis studies the problem of motion segmentation. It presents a review of the main motion segmentation algorithms, analyses their principal features, and proposes a classification of the most recent and important techniques. Segmentation can be cast as a manifold clustering problem, and this study tackles some of the most challenging issues of motion segmentation via manifold clustering. New algorithms are proposed for estimating the rank of the trajectory matrix, a measure of similarity between subspaces is presented, issues related to the behaviour of principal (canonical) angles are addressed, and a generic tool for estimating the number of motions in a sequence is developed. The last part of the study is devoted to correcting an initial segmentation estimate; this correction is achieved by bringing together the problems of motion segmentation and structure from motion.
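Several of the contributions above revolve around comparing motion subspaces through principal (canonical) angles. The standard way to compute them, via the SVD of the product of orthonormal bases, can be sketched as follows (a generic computation, not the thesis's specific similarity measure):

```python
import numpy as np

def principal_angles(A, B):
    """Principal (canonical) angles, in radians, between the column
    spaces of A and B. Standard method: take orthonormal bases via QR,
    then the singular values of Qa^T Qb are the cosines of the angles."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))
```

Identical subspaces give all-zero angles and orthogonal subspaces give pi/2, so the angle vector doubles as a distance between trajectory subspaces for clustering.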
Enhanced Model Selection for motion segmentation
In this paper a novel rank estimation technique for trajectory-based motion segmentation within the Local Subspace Affinity (LSA) framework is presented. This technique, called Enhanced Model Selection (EMS), is based on the relationship between the estimated rank of the trajectory matrix and the affinity matrix built by LSA. Results on synthetic and real data show that, without any a priori knowledge, EMS automatically provides an accurate and robust rank estimation, improving the accuracy of the final motion segmentation.
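EMS ties rank estimation of the trajectory matrix to the affinity matrix built by LSA. As a point of comparison, the generic spectrum-gap baseline such methods improve upon can be sketched as follows (a simple heuristic, not the EMS criterion itself):

```python
import numpy as np

def estimate_rank(W, floor=1e-15):
    """Estimate the rank of trajectory matrix W from the largest relative
    gap in its singular-value spectrum. A generic baseline heuristic for
    comparison, not the EMS criterion from the paper."""
    s = np.linalg.svd(W, compute_uv=False)
    gaps = s[:-1] / np.maximum(s[1:], floor)   # ratio of consecutive singular values
    return int(np.argmax(gaps)) + 1
```

With a clear spectral gap this heuristic works well, but under noise and degenerate motions the gap blurs, which is the regime where a model-selection criterion like EMS is needed.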