Efficient Generation of Multimodal Fluid Simulation Data
Applying the representational power of machine learning to the prediction of
complex fluid dynamics has been a relevant subject of study for years. However,
the amount of available fluid simulation data does not match the notoriously
high requirements of machine learning methods. Researchers have typically
addressed this issue by generating their own datasets, preventing a consistent
evaluation of their proposed approaches. Our work introduces a generation
procedure for synthetic multimodal fluid simulation datasets. By leveraging a
GPU implementation, our procedure is also efficient enough that no data needs
to be exchanged between users, except for configuration files required to
reproduce the dataset. Furthermore, our procedure allows multiple modalities
(generating both geometry and photorealistic renderings) and is general enough
for it to be applied to various tasks in data-driven fluid simulation. We then
employ our framework to generate a set of thoughtfully designed benchmark
datasets, which attempt to span specific fluid simulation scenarios in a
meaningful way. The properties of our contributions are demonstrated by
evaluating recently published algorithms for the neural fluid simulation and
fluid inverse rendering tasks using our benchmark datasets. Our contribution
aims to fulfill the community's need for standardized benchmarks, fostering
research that is more reproducible and robust than previous endeavors.
Comment: 10 pages, 7 figures
From Charts to Atlas: Merging Latent Spaces into One
Models trained on semantically related datasets and tasks exhibit comparable
inter-sample relations within their latent spaces. In this study, we
investigate the aggregation of such latent spaces to create a unified space
encompassing the combined information. To this end, we introduce Relative Latent Space
the combined information. To this end, we introduce Relative Latent Space
Aggregation, a two-step approach that first renders the spaces comparable using
relative representations, and then aggregates them via a simple mean. We
carefully divide a classification problem into a series of learning tasks under
three different settings: sharing samples, classes, or neither. We then train a
model on each task and aggregate the resulting latent spaces. We compare the
aggregated space with that derived from an end-to-end model trained over all
tasks and show that the two spaces are similar. We then observe that the
aggregated space is better suited for classification, and empirically
demonstrate that this advantage stems from the unique imprints left by
task-specific embedders within the representations. We finally test our framework in
scenarios where no shared region exists and show that it can still be used to
merge the spaces, albeit with diminished benefits over naive merging.
Comment: To appear in the NeurReps workshop @ NeurIPS 202
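The two-step procedure the abstract describes (make spaces comparable via relative representations, then average them) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the choice of cosine similarity to a shared set of anchor samples, and the NumPy setting are all assumptions.

```python
import numpy as np

def relative_representation(Z, anchors):
    # Re-express each embedding by its cosine similarities to anchor
    # embeddings, making differently-shaped latent spaces comparable.
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Zn @ An.T  # shape: (n_samples, n_anchors)

def aggregate_spaces(spaces, anchor_idx):
    # spaces: latent matrices (n_samples, d_i) from different models,
    # row-aligned on the same inputs; anchor_idx: rows used as anchors.
    rel = [relative_representation(Z, Z[anchor_idx]) for Z in spaces]
    return np.mean(rel, axis=0)  # step 2: a simple mean over the spaces
```

Because every space is projected onto the same number of anchors, the mean is well defined even when the original latent dimensions differ across models.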
Few-Shot Object Detection: A Survey
Deep Learning approaches have recently raised the bar in many fields, from Natural Language Processing to Computer Vision, by leveraging large amounts of data. However, they can fail when the retrieved information is not enough to fit the vast number of parameters, frequently resulting in overfitting and, therefore, in poor generalizability. Few-Shot Learning aims at designing models which can effectively operate in a scarce data regime, yielding learning strategies that need only a few supervised examples to be trained. These procedures are of both practical and theoretical importance, as they are crucial for many real-life scenarios in which data is either costly or even impossible to retrieve. Moreover, they bridge the distance between current data-hungry models and human-like generalization capability. Computer Vision offers various tasks which are inherently few-shot, such as person re-identification. This survey, which to the best of our knowledge is the first tackling this problem, is focused on Few-Shot Object Detection, which has received far less attention than Few-Shot Classification due to its greater intrinsic difficulty. In this regard, this review presents an extensive description of the approaches that have been tested in the current literature, discussing their pros and cons, and classifying them according to a rigorous taxonomy.