Commuting-projector Hamiltonians for chiral topological phases built from parafermions
We introduce a family of commuting-projector Hamiltonians whose degrees of
freedom involve parafermion zero modes residing in a parent
fractional-quantum-Hall fluid. The two simplest models in this family emerge
from dressing Ising-paramagnet and toric-code spin models with parafermions; we
study their edge properties, anyonic excitations, and ground-state degeneracy.
We show that the first model realizes a symmetry-enriched topological phase
(SET) for which spin-flip symmetry from the Ising paramagnet
permutes the anyons. Interestingly, the interface between this SET and the
parent quantum-Hall phase realizes symmetry-enforced parafermion
criticality with no fine-tuning required. The second model exhibits a
non-Abelian phase that is consistent with topological order,
and can be accessed by gauging the symmetry in the SET.
Employing Levin-Wen string-net models with a graded structure,
we generalize this picture to construct a large class of commuting-projector
models for SETs and non-Abelian topological orders exhibiting
the same relation. Our construction provides the first
commuting-projector-Hamiltonian realization of chiral bosonic non-Abelian
topological order.
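For readers unfamiliar with the degrees of freedom involved: parafermion zero modes generalize Majorana operators. In one standard convention (not spelled out in this abstract, and possibly differing from the paper's phase conventions), $\mathbb{Z}_N$ parafermion operators $\alpha_j$ satisfy

```latex
% Standard Z_N parafermion algebra (one common convention):
\alpha_j^N = 1, \qquad \alpha_j^{\dagger} = \alpha_j^{N-1}, \qquad
\alpha_j \alpha_k = e^{2\pi i/N}\, \alpha_k \alpha_j \quad (j < k).
```

Setting $N = 2$ recovers Majorana fermions, with $\alpha_j^2 = 1$ and anticommutation between distinct modes.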
Multi-modal Machine Learning in Engineering Design: A Review and Future Directions
In the rapidly advancing field of multi-modal machine learning (MMML), the
convergence of multiple data modalities has the potential to reshape various
applications. This paper presents a comprehensive overview of the current
state, advancements, and challenges of MMML within the sphere of engineering
design. The review begins with a deep dive into five fundamental concepts of
MMML: multi-modal information representation, fusion, alignment, translation,
and co-learning. Following this, we explore the cutting-edge applications of
MMML, placing a particular emphasis on tasks pertinent to engineering design,
such as cross-modal synthesis, multi-modal prediction, and cross-modal
information retrieval. Through this comprehensive overview, we highlight the
inherent challenges in adopting MMML in engineering design, and proffer
potential directions for future research. To spur on the continued evolution of
MMML in engineering design, we advocate for concentrated efforts to construct
extensive multi-modal design datasets, develop effective data-driven MMML
techniques tailored to design applications, and enhance the scalability and
interpretability of MMML models. As the next generation of intelligent design
tools, MMML models hold strong promise to transform how products are designed.
Federated Deep Multi-View Clustering with Global Self-Supervision
Federated multi-view clustering has the potential to learn a global
clustering model from data distributed across multiple devices. In this
setting, label information is unknown and data privacy must be preserved,
leading to two major challenges. First, views on different clients often have
feature heterogeneity, and mining their complementary cluster information is
not trivial. Second, the storage and usage of data from multiple clients in a
distributed environment can lead to incompleteness of multi-view data. To
address these challenges, we propose a novel federated deep multi-view
clustering method that can mine complementary cluster structures from multiple
clients, while dealing with data incompleteness and privacy concerns.
Specifically, in the server environment, we propose sample alignment and data
extension techniques to explore the complementary cluster structures of
multiple views. The server then distributes global prototypes and global
pseudo-labels to each client as global self-supervised information. In the
client environment, multiple clients use the global self-supervised information
and deep autoencoders to learn view-specific cluster assignments and embedded
features, which are then uploaded to the server for refining the global
self-supervised information. Finally, extensive experiments demonstrate that
the proposed method delivers superior performance in addressing the challenges
of incomplete multi-view data in distributed environments.
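The server/client communication pattern described above can be sketched in miniature. The code below is our own illustrative simplification, not the authors' implementation: all names and update rules are assumptions, plain prototype averaging stands in for the deep autoencoders, and pseudo-label refinement is omitted. Clients share only cluster statistics, never raw samples, in keeping with the privacy constraint.

```python
# Illustrative sketch of one federated multi-view clustering round: clients
# compute cluster assignments from global prototypes and upload per-cluster
# sums/counts; the server aggregates them into refined global prototypes.
import random

def assign(points, prototypes):
    """Assign each point to its nearest prototype (squared Euclidean)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return [min(range(len(prototypes)), key=lambda c: d2(p, prototypes[c]))
            for p in points]

def client_step(view, prototypes):
    """Client: local assignments plus per-cluster sums/counts to upload."""
    labels = assign(view, prototypes)
    k, dim = len(prototypes), len(prototypes[0])
    sums, counts = [[0.0] * dim for _ in range(k)], [0] * k
    for p, c in zip(view, labels):
        counts[c] += 1
        for j, v in enumerate(p):
            sums[c][j] += v
    return labels, sums, counts

def server_step(stats, prototypes):
    """Server: merge uploaded statistics into new global prototypes."""
    k, dim = len(prototypes), len(prototypes[0])
    tot, n = [[0.0] * dim for _ in range(k)], [0] * k
    for sums, counts in stats:
        for c in range(k):
            n[c] += counts[c]
            for j in range(dim):
                tot[c][j] += sums[c][j]
    return [[tot[c][j] / n[c] for j in range(dim)] if n[c] else prototypes[c]
            for c in range(k)]

random.seed(0)
# Two clients holding different "views" drawn around different centers.
clients = [[[random.gauss(m, 0.1), random.gauss(m, 0.1)] for _ in range(20)]
           for m in (0.0, 5.0)]
protos = [[0.5, 0.5], [4.5, 4.5]]
for _ in range(3):
    results = [client_step(view, protos) for view in clients]
    protos = server_step([(s, c) for _, s, c in results], protos)
labels = [r[0] for r in results]
```

In the paper's setting the server additionally aligns samples and extends data across clients before distributing prototypes; the loop above only conveys the division of labor between client and server.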
Multimodal Prediction based on Graph Representations
This paper proposes a learning model, based on rank-fusion graphs, for
general applicability in multimodal prediction tasks, such as multimodal
regression and image classification. Rank-fusion graphs encode information from
multiple descriptors and retrieval models, thus being able to capture
underlying relationships between modalities, samples, and the collection
itself. The solution is based on the encoding of multiple ranks for a query (or
test sample), defined according to different criteria, into a graph. Later, we
project the generated graph into an induced vector space, creating fusion
vectors, targeting broader generality and efficiency. A fusion vector estimator
is then built to infer whether a multimodal input object refers to a class or
not. Our method yields a fusion model that outperforms early-fusion and
late-fusion alternatives. Experiments performed on multiple multimodal and
visual datasets, with several descriptors and retrieval models, demonstrate
that our learning model is highly effective across prediction scenarios
involving visual, textual, and multimodal features, achieving better
effectiveness than state-of-the-art methods.
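The core idea of encoding multiple ranks into one fused representation can be conveyed with a much simpler stand-in. Below, ranked lists from different descriptors are aggregated into a "fusion vector" over the candidate set via reciprocal-rank weighting; the paper's actual rank-fusion graphs and their vector-space projection are more elaborate, so treat every name and weight here as an assumption.

```python
# Toy stand-in for rank fusion: combine several ranked candidate lists into
# a single vector of fused scores using reciprocal-rank weights.
def fusion_vector(rankings, candidates, k=1.0):
    """rankings: list of ranked candidate lists (best first).
    Each candidate's score sums 1/(k + rank) over the rankers that list it."""
    vec = []
    for cand in candidates:
        score = 0.0
        for ranking in rankings:
            if cand in ranking:
                score += 1.0 / (k + ranking.index(cand))
        vec.append(score)
    return vec

text_rank = ["b", "a", "c"]    # ranking from a textual descriptor
visual_rank = ["a", "b", "d"]  # ranking from a visual descriptor
vec = fusion_vector([text_rank, visual_rank], ["a", "b", "c", "d"])
```

Candidates ranked highly by several descriptors accumulate large fused scores; a downstream estimator (a classifier or regressor over these vectors) then performs the multimodal prediction.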
Example-based image colorization using locality consistent sparse representation
Image colorization aims to produce a natural-looking color image from a given grayscale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target grayscale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality-promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with guidance from the target grayscale image. To the best of our knowledge, this is the first work on sparse-pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms state-of-the-art methods, both visually and quantitatively via a user study.
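The dictionary-based formulation described above can be written generically as follows. This is a sketch of a locality-regularized sparse-coding energy with symbols of our choosing, not the paper's exact notation:

```latex
% x: target-superpixel descriptor; D: dictionary of reference-superpixel
% descriptors; alpha: sparse code; d_i: locality weight growing with the
% spatial/feature distance between the target and dictionary atom i.
\min_{\boldsymbol{\alpha}} \;
\|\mathbf{x} - D\boldsymbol{\alpha}\|_2^2
+ \lambda \|\boldsymbol{\alpha}\|_1
+ \gamma \sum_i d_i\, \alpha_i^2
```

The first term is reconstruction fidelity, the second enforces sparsity, and the third is the locality-promoting regularizer: codes are discouraged from using dictionary atoms far from the target, which yields the matching consistency the abstract emphasizes.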
Application of Multi-Sensor Fusion Technology in Target Detection and Recognition
The application of multi-sensor fusion technology has drawn a lot of industrial and academic interest in recent years. Multi-sensor fusion methods are widely used in many applications, such as autonomous systems, remote sensing, video surveillance, and the military. These methods can exploit the complementary properties of targets captured by multiple sensors, achieving a detailed environment description and accurate detection of targets of interest based on information from the different sensors. This book collects novel developments in the field of multi-sensor, multi-source, and multi-process information fusion. Articles emphasize one or more of three facets: architectures, algorithms, and applications. The published papers include fundamental theoretical analyses as well as demonstrations of their application to real-world problems.
Learning Non-Metric Visual Similarity for Image Retrieval
Measuring visual similarity between two or more instances within a data distribution is a fundamental task in image retrieval. Theoretically, non-metric distances are able to generate a more complex and accurate similarity model than metric distances, provided that the non-linear data distribution is precisely captured by the system. In this work, we explore neural network models for learning a non-metric similarity function for instance search. We argue that non-metric similarity functions based on neural networks can build a better model of human visual perception than standard metric distances. As our proposed similarity function is differentiable, we explore a truly end-to-end trainable approach for image retrieval, i.e. we learn the weights from the input image pixels to the final similarity score. Experimental evaluation shows that non-metric similarity networks are able to learn visual similarities between images and improve performance on top of state-of-the-art image representations, boosting results in standard image retrieval datasets with respect to standard metric distances.
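To make "non-metric similarity network" concrete, here is a toy forward pass: a small MLP scores a concatenated pair of feature vectors, and nothing constrains the score to be symmetric or to satisfy the triangle inequality. The architecture and weights below are illustrative assumptions; the paper's networks operate on learned image representations and are trained end to end.

```python
# Toy non-metric similarity function: one hidden ReLU layer scores a pair of
# feature vectors. Weights are hand-picked (not trained) to show asymmetry.
def mlp_similarity(x, y, W1, b1, w2, b2):
    """Score s(x, y) from the concatenated pair via one hidden ReLU layer."""
    z = x + y  # concatenate the two feature vectors
    h = [max(0.0, sum(w * v for w, v in zip(row, z)) + b)
         for row, b in zip(W1, b1)]
    return sum(w * v for w, v in zip(w2, h)) + b2

# Weights chosen so the score depends on argument order:
W1 = [[1.0, 0.0, -1.0, 0.0],
      [0.0, 2.0, 0.0, 0.0]]
b1 = [0.0, 0.0]
w2 = [1.0, 0.5]
b2 = 0.0

s_xy = mlp_similarity([1.0, 0.5], [0.2, 0.0], W1, b1, w2, b2)
s_yx = mlp_similarity([0.2, 0.0], [1.0, 0.5], W1, b1, w2, b2)
# s_xy != s_yx: a valid similarity model, but not a metric.
```

Because every operation is differentiable, such a scorer can be trained jointly with the feature extractor, which is what enables the end-to-end pixels-to-score learning the abstract describes.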