Advancing Multi-Modal Deep Learning: Towards Language-Grounded Visual Understanding
Using deep learning, computer vision now rivals people at object recognition and detection, opening doors to tackle new challenges in image understanding. Among these challenges, understanding and reasoning about language-grounded visual content is of fundamental importance to advancing artificial intelligence. Recently, multiple datasets and algorithms have been created as proxy tasks towards this goal, with visual question answering (VQA) being the most widely studied. In VQA, an algorithm needs to produce an answer to a natural language question about an image. However, our survey of datasets and algorithms for VQA uncovered several sources of dataset bias and sub-optimal evaluation metrics that allowed algorithms to perform well by merely exploiting superficial statistical patterns. In this dissertation, we describe new algorithms and datasets that address these issues. We developed two new datasets and evaluation metrics that enable a more accurate measurement of the abilities of a VQA model, and also expand VQA to include new abilities, such as reading text, handling out-of-vocabulary words, and understanding data visualization. We also created new algorithms for VQA that have helped advance the state of the art for VQA, including an algorithm that surpasses humans on two different chart question answering datasets about bar charts, line graphs, and pie charts. Finally, we provide a holistic overview of several yet-unsolved challenges in not only VQA but vision and language research at large. Despite enormous progress, we find that a robust understanding and integration of vision and language is still an elusive goal, and much of the progress may be misleading due to dataset bias, superficial correlations, and flaws in standard evaluation metrics. We carefully study and categorize these issues for several vision and language tasks and outline several possible paths towards the development of safe, robust, and trustworthy AI for language-grounded visual understanding.
Towards Multimodal Open-World Learning in Deep Neural Networks
Over the past decade, deep neural networks have enormously advanced machine perception, especially object classification, object detection, and multimodal scene understanding. But a major limitation of these systems is that they assume a closed-world setting, i.e., that the train and test distributions match exactly. As a result, any input belonging to a category that the system has never seen during training will not be recognized as unknown. However, many real-world applications need this capability. For example, self-driving cars operate in a dynamic world where the data can change over time due to changes in season, geographic location, sensor types, etc. Handling such changes requires building models with open-world learning capabilities. In open-world learning, the system needs to detect novel examples that were not seen during training and update itself with new knowledge, without retraining from scratch. In this dissertation, we address gaps in the open-world learning literature and develop methods that enable efficient multimodal open-world learning in deep neural networks.
Study of the Quantum Advantage in Quantum Machine Learning Applications for drug discovery
National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (Δ.Π.Μ.Σ.) “Mathematical Modelling in Modern Technologies and Finance”
Investigating quantum many-body systems with tensor networks, machine learning and quantum computers
We perform quantum simulation on classical and quantum computers and set up a machine learning framework in which we can map out phase diagrams of known and unknown quantum many-body systems in an unsupervised fashion.
The classical simulations are done with state-of-the-art tensor network methods in one and two spatial dimensions. For one-dimensional systems, we utilize matrix product states (MPS), which have many practical advantages and can be optimized using the efficient density matrix renormalization group (DMRG) algorithm. The data for two-dimensional systems are obtained from projected entangled pair states (PEPS) optimized via imaginary time evolution.
Data in the form of observables, entanglement spectra, or parts of the state vectors from these simulations are then fed into a deep learning (DL) pipeline where we perform anomaly detection to map out the phase diagram.
We extend this notion to quantum computers and introduce quantum variational anomaly detection. Here, we first simulate the ground state and then process it in a quantum machine learning (QML) manner. Both simulation and QML routines are performed on the same device, which we demonstrate both in classical simulation and on a physical quantum computer hosted by IBM.
Postprint (published version)
Quantum Simulation for High Energy Physics
For the first time, Quantum Simulation for High Energy Physics
(HEP) is studied in the U.S. decadal particle-physics community planning;
in fact, until recently, this was not considered a mainstream topic in the
community. This fact speaks of a remarkable rate of growth of this subfield
over the past few years, stimulated by the impressive advancements in Quantum
Information Sciences (QIS) and associated technologies over the past decade,
and the significant investment in this area by the government and private
sectors in the U.S. and other countries. High-energy physicists have quickly
identified problems of importance to our understanding of nature at the most
fundamental level, from tiniest distances to cosmological extents, that are
intractable with classical computers but may benefit from quantum advantage.
They have initiated, and continue to carry out, a vigorous program in theory,
algorithm, and hardware co-design for simulations of relevance to the HEP
mission. This community whitepaper is an attempt to bring this exciting and yet
challenging area of research to the spotlight, and to elaborate on what the
promises, requirements, challenges, and potential solutions are over the next
decade and beyond.
Comment: This is a whitepaper prepared for the topical groups CompF6 (Quantum
computing), TF05 (Lattice Gauge Theory), and TF10 (Quantum Information
Science) within the Computational Frontier and Theory Frontier of the U.S.
Community Study on the Future of Particle Physics (Snowmass 2021). 103 pages
and 1 figure
An Outlook into the Future of Egocentric Vision
What will the future be? We wonder! In this survey, we explore the gap
between current research in egocentric vision and the ever-anticipated future,
where wearable computing, with outward facing cameras and digital overlays, is
expected to be integrated in our everyday lives. To understand this gap, the
article starts by envisaging the future through character-based stories,
showcasing through examples the limitations of current technology. We then
provide a mapping between this future and previously defined research tasks.
For each task, we survey its seminal works, current state-of-the-art
methodologies and available datasets, then reflect on shortcomings that limit
its applicability to future research. Note that this survey focuses on software
models for egocentric vision, independent of any specific hardware. The paper
concludes with recommendations for areas of immediate exploration so as to
unlock our path to the future always-on, personalised and life-enhancing
egocentric vision.
Comment: We invite comments, suggestions and corrections here:
https://openreview.net/forum?id=V3974SUk1
Quantum-Inspired Machine Learning: a Survey
Quantum-inspired Machine Learning (QiML) is a burgeoning field, receiving
global attention from researchers for its potential to leverage principles of
quantum mechanics within classical computational frameworks. However, current
review literature often presents a superficial exploration of QiML, focusing
instead on the broader Quantum Machine Learning (QML) field. In response to
this gap, this survey provides an integrated and comprehensive examination of
QiML, exploring QiML's diverse research domains including tensor network
simulations, dequantized algorithms, and others, showcasing recent
advancements, practical applications, and illuminating potential future
research avenues. Further, a concrete definition of QiML is established by
analyzing various prior interpretations of the term and their inherent
ambiguities. As QiML continues to evolve, we anticipate a wealth of future
developments drawing from quantum mechanics, quantum computing, and classical
machine learning, enriching the field further. This survey serves as a guide
for researchers and practitioners alike, providing a holistic understanding of
QiML's current landscape and future directions.
Comment: 56 pages, 13 figures, 8 tables