Multi-view constrained clustering with an incomplete mapping between views
Multi-view learning algorithms typically assume a complete bipartite mapping
between the different views in order to exchange information during the
learning process. However, many applications provide only a partial mapping
between the views, creating a challenge for current methods. To address this
problem, we propose a multi-view algorithm based on constrained clustering that
can operate with an incomplete mapping. Given a set of pairwise constraints in
each view, our approach propagates these constraints using a local similarity
measure to those instances that can be mapped to the other views, allowing the
propagated constraints to be transferred across views via the partial mapping.
It uses co-EM to iteratively estimate the propagation within each view based on
the current clustering model, transfer the constraints across views, and then
update the clustering model. By alternating the learning process between views,
this approach produces a unified clustering model that is consistent with all
views. We show that this approach significantly improves clustering performance
over several other methods for transferring constraints and allows multi-view
clustering to be reliably applied when given a limited mapping between the
views. Our evaluation reveals that the propagated constraints have high
precision with respect to the true clusters in the data, explaining their
benefit to clustering performance in both single- and multi-view learning
scenarios.
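
As a rough illustration of the within-view propagation step described above, the following sketch spreads a single pairwise constraint to nearby instance pairs using a Gaussian similarity. It is a simplified assumption, not the authors' method: the function name `propagate_constraint`, the parameters `sigma` and `threshold`, and the product-of-similarities weighting are all hypothetical, and the co-EM estimation and cross-view transfer are omitted.

```python
import numpy as np


def propagate_constraint(X, i, j, sigma=1.0, threshold=0.5):
    """Propagate one pairwise constraint (i, j) to nearby pairs (u, v).

    Hypothetical weighting: sim(u, i) * sim(v, j), where sim is a
    Gaussian similarity. Pairs weighted below `threshold` are dropped.
    """
    X = np.asarray(X, dtype=float)
    # Gaussian similarity of every instance to each constraint endpoint.
    sim_i = np.exp(-np.sum((X - X[i]) ** 2, axis=1) / (2 * sigma ** 2))
    sim_j = np.exp(-np.sum((X - X[j]) ** 2, axis=1) / (2 * sigma ** 2))
    # weights[u, v] is the strength of the propagated constraint on (u, v).
    weights = np.outer(sim_i, sim_j)
    pairs = {}
    for u, v in zip(*np.where(weights >= threshold)):
        if u != v:
            pairs[(int(u), int(v))] = float(weights[u, v])
    return pairs


# Two tight groups of points; one constraint links instance 0 and instance 2.
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
propagated = propagate_constraint(X, 0, 2)
print(sorted(propagated))
```

In this toy data the constraint spreads to every pair that joins the two groups. In a multi-view setting, only the propagated pairs whose instances appear in the partial mapping could then be transferred to the other view.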
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse type. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research.
Semi-supervised model-based clustering with controlled clusters leakage
In this paper, we focus on finding clusters in partially categorized data
sets. We propose a semi-supervised version of the Gaussian mixture model, called
C3L, which retrieves natural subgroups of given categories. In contrast to
other semi-supervised models, C3L is parametrized by a user-defined leakage
level, which controls the maximal inconsistency between the initial categorization
and the resulting clustering. Our method can be implemented as a module in practical
expert systems to detect clusters that combine expert knowledge with the true
distribution of the data. Moreover, it can be used for improving the results of
less flexible clustering techniques, such as projection pursuit clustering. The
paper presents an extensive theoretical analysis of the model and a fast algorithm
for its efficient optimization. Experimental results show that C3L finds a
high-quality clustering model, which can be applied to discovering meaningful groups
in partially classified data.
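
To make the idea of inconsistency between an initial categorization and a resulting clustering concrete, the sketch below computes a simple proxy: the share of categorized points that fall into a cluster dominated by a different category. This is an assumed, illustrative measure only. The abstract does not give C3L's formal definition of leakage, and the name `cluster_leakage` is hypothetical.

```python
from collections import Counter


def cluster_leakage(categories, clusters):
    """Share of categorized points lying in a cluster whose dominant
    category differs from their own (an illustrative proxy, not C3L's
    definition). Uncategorized points are marked with None and ignored.
    """
    labeled = [(c, k) for c, k in zip(categories, clusters) if c is not None]
    if not labeled:
        return 0.0
    # Count categories inside each cluster.
    counts = {}
    for c, k in labeled:
        counts.setdefault(k, Counter())[c] += 1
    # Dominant category of each cluster.
    dominant = {k: cnt.most_common(1)[0][0] for k, cnt in counts.items()}
    mismatched = sum(1 for c, k in labeled if dominant[k] != c)
    return mismatched / len(labeled)


# Partially categorized data: the last point has no category.
categories = ["a", "a", "a", "b", "b", None]
clusters = [0, 0, 1, 1, 1, 2]
leak = cluster_leakage(categories, clusters)
print(leak)
```

A user-defined leakage level could then act as an upper bound on such a quantity, with a value of zero forcing every cluster to stay inside a single category.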
A Radio-fingerprinting-based Vehicle Classification System for Intelligent Traffic Control in Smart Cities
The measurement and provision of precise and up-to-date traffic-related key
performance indicators is a key element and crucial factor for intelligent
traffic control systems in upcoming smart cities. The street network is
considered as a highly-dynamic Cyber Physical System (CPS) where measured
information forms the foundation for dynamic control methods aiming to optimize
the overall system state. Apart from global system parameters like traffic flow
and density, specific data such as velocity of individual vehicles as well as
vehicle type information can be leveraged for highly sophisticated traffic
control methods like dynamic type-specific lane assignments. Consequently,
solutions for acquiring these kinds of information are required and have to
comply with strict requirements ranging from accuracy over cost-efficiency to
privacy preservation. In this paper, we present a system for classifying
vehicles based on their radio-fingerprint. In contrast to other approaches, the
proposed system is able to provide real-time capable and precise vehicle
classification as well as cost-efficient installation and maintenance, privacy
preservation and weather independence. The system performance in terms of
accuracy and resource-efficiency is evaluated in the field using comprehensive
measurements. Using a machine learning based approach, the resulting success
ratio for classifying cars and trucks is above 99%.
On systematic approaches for interpreted information transfer of inspection data from bridge models to structural analysis
In conjunction with the improved methods of monitoring damage and degradation processes, the interest in reliability assessment of reinforced concrete bridges has been increasing in recent years. Automated image-based inspections of the structural surface provide valuable data to extract quantitative information about deteriorations, such as crack patterns. However, the knowledge gain results from processing this information in a structural context, i.e. relating the damage artifacts to building components. This way, transformation to structural analysis is enabled. This approach sets two further requirements: availability of structural bridge information and a standardized storage for interoperability with subsequent analysis tools. Since the involved large datasets are only efficiently processed in an automated manner, the implementation of the complete workflow from damage and building data to structural analysis is targeted in this work. First, domain concepts are derived from the back-end tasks: structural analysis, damage modeling, and life-cycle assessment. The common interoperability format, the Industry Foundation Classes (IFC), and processes in these domains are further assessed. The need for user-controlled interpretation steps is identified and the developed prototype thus allows interaction at subsequent model stages. The latter has the advantage that interpretation steps can be individually separated into either a structural analysis or a damage information model or a combination of both. This approach to damage information processing from the perspective of structural analysis is then validated in different case studies.
Semi-supervised cross-entropy clustering with information bottleneck constraint
In this paper, we propose a semi-supervised clustering method, CEC-IB, that
models data with a set of Gaussian distributions and that retrieves clusters
based on a partial labeling provided by the user (partition-level side
information). By combining the ideas from cross-entropy clustering (CEC) with
those from the information bottleneck method (IB), our method trades between
three conflicting goals: the accuracy with which the data set is modeled, the
simplicity of the model, and the consistency of the clustering with side
information. Experiments demonstrate that CEC-IB has a performance comparable
to Gaussian mixture models (GMM) in a classical semi-supervised scenario, but
is faster, more robust to noisy labels, automatically determines the optimal
number of clusters, and performs well when not all classes are present in the
side information. Moreover, in contrast to other semi-supervised models, it can
be successfully applied in discovering natural subgroups if the partition-level
side information is derived from the top levels of a hierarchical clustering.