BiETech : Bicluster Ensemble Techniques
Many biclustering algorithms have emerged in recent years, each trying to deliver good biclusters from gene expression data that satisfy a particular objective function. Users are often at a loss to pick the best of these algorithms. Ensemble techniques come to the rescue by aggregating all the solutions into a single solution that is more robust and stable than its constituent solutions. In this paper, we present two different ensemble techniques for biclustering solutions: one approach uses classifiers, and the other uses the concept of metaclustering to form the consensus. Experiments are performed on synthetic and real gene expression datasets, as biologists are interested in finding meaningful patterns in the expression of genes. The experiments show that both proposed approaches improve on the input solutions as well as on existing bicluster ensemble techniques.
Evolutionary multiobjective clustering algorithms with ensemble for patient stratification
Patient stratification has been studied widely to tackle subtype diagnosis problems for effective treatment. Because of the curse of dimensionality and the poor interpretability of the data, constructing a stratification model with high diagnostic ability and good generalization remains a long-standing challenge. To address these problems, this paper proposes two novel evolutionary multiobjective clustering algorithms with ensemble (NSGA-II-ECFE and MOEA/D-ECFE), with four cluster validity indices used as the objective functions. First, an effective ensemble construction method is developed to enrich the ensemble diversity. After that, an ensemble clustering fitness evaluation (ECFE) method is proposed to evaluate the ensembles by measuring the consensus clustering under those four objective functions. To generate the consensus clustering, ECFE builds the hybrid co-association matrix from the ensembles and then dynamically selects a suitable clustering algorithm to apply to that matrix. Multiple experiments have been conducted to demonstrate the effectiveness of the proposed algorithms in comparison with seven clustering algorithms, twelve ensemble clustering approaches, and two multiobjective clustering algorithms on 55 synthetic datasets and 35 real patient stratification datasets. The experimental results demonstrate the competitive edge of the proposed algorithms over the compared methods. Furthermore, the proposed algorithm is applied to identify cancer subtypes from five cancer-related single-cell RNA-seq datasets.
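The consensus step described above can be illustrated with a plain co-association matrix. The following is a minimal sketch, not the paper's ECFE method: the function names `co_association` and `consensus_clusters` are hypothetical, the matrix is the ordinary (not "hybrid") co-association matrix, and simple threshold linking stands in for the dynamically selected clustering algorithm.

```python
from itertools import combinations


def co_association(labelings):
    """Return the fraction of base clusterings that co-cluster each sample pair."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Link samples whose co-association exceeds the threshold (assumed rule)."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel the component roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three base clusterings of four samples (toy data, not from the paper).
base = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 0, 1]]
print(consensus_clusters(co_association(base)))  # [0, 0, 1, 1]
```

Label permutations between base clusterings do not matter here, because only pairwise agreement is counted.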
A biocompatible technique for magnetic field sensing at (sub)cellular scale using Nitrogen-Vacancy centers
We present an innovative experimental set-up that uses Nitrogen-Vacancy centres in diamonds to measure magnetic fields with a sensitivity of η = 68 ± 3 nT/√Hz at a demonstrated (sub)cellular scale. The presented method of magnetic sensing, which uses a lock-in based ODMR technique for the optical detection of microwave-driven spin resonances induced in NV centers, is characterized by excellent magnetic sensitivity at such a small scale and by full biocompatibility. The cellular scale is obtained using an NV-rich sensing layer of 15 nm thickness along the z axis and a focused laser spot of (10 × 10) μm² in the x-y plane. The biocompatibility derives from an accurate choice of the applied optical power. In this regard, we also report how the magnetic sensitivity changes for different applied laser powers and discuss the limits of the sensitivity sustainable with a biosystem at such a small volume scale. As such, this method offers a whole range of research possibilities for biosciences.
Bernardi, E; Moreva, E; Traina, P; Petrini, G; Tchernij, SD; Forneris, J; Pastuovic, A; Degiovanni, IP; Olivero, P; Genovese, M
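To put the quoted sensitivity in context: a sensitivity η in nT/√Hz is conventionally read as the smallest field detectable after one second of averaging, and for white-noise-limited measurements the detectable field scales as η/√T with integration time T. The short sketch below applies that standard relation. The function name is hypothetical, and the white-noise assumption is ours, not a claim from the abstract.

```python
import math


def min_detectable_field_nT(eta_nT_per_rtHz, integration_time_s):
    """Smallest detectable field, assuming white-noise-limited averaging."""
    return eta_nT_per_rtHz / math.sqrt(integration_time_s)


# Sensitivity from the abstract, evaluated at a few averaging times.
for t in (1.0, 100.0):
    print(t, min_detectable_field_nT(68.0, t))
```

Under this assumption, averaging for 100 s instead of 1 s lowers the detectable field by a factor of 10.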
Text Detection Using Transformation Scaling Extension Algorithm in Natural Scene Images
Recent research has increasingly stressed the importance of text detection and recognition in images of natural scenes. Natural scene text carries a large amount of useful semantic information that can be applied in a variety of vision-related applications. The detection of shape-robust text faces two major challenges: (1) many traditional detectors based on quadrangular bounding boxes fail to identify text with irregular shapes, because such text is difficult to enclose within perfect rectangles; (2) pixel-wise segmentation-based detectors sometimes struggle to separate closely positioned text instances from one another. Understanding the surroundings and extracting information from images of natural scenes depends heavily on the ability to detect and recognise text. Scene text can be aligned in a variety of ways, including horizontal, vertical, curved, and arbitrary alignments. This paper presents a novel method, the Transformation Scaling Extension Algorithm (TSEA), for text detection using a mask-scoring R-ConvNN (Region Convolutional Neural Network). The method works well at accurately identifying curved text and text with multiple orientations in real-world input images. The study incorporates a mask-scoring R-ConvNN network framework to improve the model's ability to score masks correctly for the detected instances. By giving more weight to accurate mask predictions, the scoring system removes inconsistencies between mask quality and mask score and improves the effectiveness of instance segmentation. The paper also incorporates a Pyramid-based Text Proposal Network (PBTPN) and a Transformation Component Network (TCN) to enhance the feature extraction capabilities of the mask-scoring R-ConvNN for text identification and segmentation with the TSEA. Studies show that Pyramid Networks are especially effective in reducing false alarms caused by images with backgrounds that mimic text. On the benchmark datasets ICDAR 2015 and SCUT-CTW1500, which contain multi-oriented and curved text, the method outperforms existing methods in extensive testing across several scales using a single model. This study expands the field of vision-oriented applications by highlighting the growing significance of effectively locating and detecting text in natural scenes.
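The idea of tying the mask score to mask quality can be sketched as follows. This is an illustration under assumptions, not the paper's network: it computes the intersection-over-union (IoU) between a predicted and a reference binary mask, and rescales a classification confidence by an IoU estimate, which is how mask-scoring detectors are commonly described. The names `mask_iou` and `mask_score` are hypothetical.

```python
def mask_iou(pred, target):
    """IoU between two binary masks given as equally sized nested lists of 0/1."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 0.0


def mask_score(class_confidence, estimated_iou):
    """Down-weight a confident detection whose mask is estimated to be poor."""
    return class_confidence * estimated_iou


# Toy 2x2 masks: the prediction covers two pixels, the reference one of them.
pred = [[1, 1], [0, 0]]
ref = [[1, 0], [0, 0]]
iou = mask_iou(pred, ref)          # 1 shared pixel / 2 covered pixels
print(iou, mask_score(0.9, iou))
```

In a trained model the IoU would be predicted by a dedicated head at inference time, since no reference mask is available; here the true IoU stands in for that prediction.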
Survey: Leakage and Privacy at Inference Time
Leakage of data from publicly available Machine Learning (ML) models is an
area of growing significance as commercial and government applications of ML
can draw on multiple sources of data, potentially including users' and clients'
sensitive data. We provide a comprehensive survey of contemporary advances on
several fronts, covering involuntary data leakage which is natural to ML
models, potential malevolent leakage which is caused by privacy attacks, and
currently available defence mechanisms. We focus on inference-time leakage, as
the most likely scenario for publicly available models. We first discuss what
leakage is in the context of different data, tasks, and model architectures. We
then propose a taxonomy across involuntary and malevolent leakage, available
defences, followed by the currently available assessment metrics and
applications. We conclude with outstanding challenges and open questions,
outlining some promising directions for future research.
Intrinsic Dimension Estimation for non-Euclidean manifolds: from metagenomics to unweighted networks
Within the field of unsupervised manifold learning, Intrinsic Dimension estimators are
among the most important analysis tools. The Intrinsic Dimension provides a measure of the
dimensionality of the hidden manifold from which data are sampled, even if the manifold is
embedded in a space with a much higher number of features. The present Thesis tackles the
still unanswered problem of computing the Intrinsic Dimension (ID) of spaces characterised
by non-Euclidean metrics. In particular, we focus on datasets where the distances between
points are measured by means of Manhattan, Hamming or shortest-path metrics and, thus, can
only assume discrete values. This peculiarity has deep consequences on the way datapoints
populate the neighbourhoods and on the structure of the manifold. For this reason we
develop a general purpose, nearest-neighbours-based ID estimator that has two peculiar
features: the capability of selecting explicitly the scale at which the Intrinsic Dimension is
computed and a validation procedure to check the reliability of the provided estimate. We
thus specialise the estimator to lattice spaces, where the volume is measured by means of the
Ehrhart polynomials. After testing the reliability of the estimator on artificial datasets, we
apply it to genomic sequences and discover an unexpectedly low ID, suggesting that
evolutionary pressure exerts strong constraints on the way the nucleotide bases are allowed to
mutate. This same framework is then employed to profile the scaling of the ID of unweighted
networks. The diversity of the obtained ID profiles prompted us to use the ID as a signature
to characterise the networks. Concretely, we employ the ID as a summary statistic within
an Approximate Bayesian Computation framework in order to pinpoint the parameters
of network mechanistic generative models of increasing complexity. We discover that, by
targeting the ID of a given network, other typical network properties are also retrieved fairly well.
As a last methodological development, we improved the ID estimator by adaptively selecting,
for each datapoint, the largest neighbourhoods with an approximately constant density. This
offers a quantitative criterion to automatically select a meaningful scale at which the ID is
computed and, at the same time, allows us to enforce the hypothesis of the method, yielding
more reliable estimates.
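For readers unfamiliar with nearest-neighbours-based ID estimation, the sketch below shows the general idea with the well-known TwoNN maximum-likelihood estimator, which uses the ratio of the second to the first nearest-neighbour distance of each point. It is a stand-in, not the estimator developed in the thesis: TwoNN assumes continuous Euclidean distances, which is precisely the setting the thesis moves beyond. The function names and the toy dataset are hypothetical.

```python
import math
import random


def two_nn_dimension(points):
    """TwoNN maximum-likelihood estimate: N / sum(log(r2 / r1))."""
    log_ratios = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, where the ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)


def sample_plane_in_5d(n, seed=0):
    """Toy data: a 2-dimensional plane linearly embedded in 5 dimensions."""
    rng = random.Random(seed)
    pairs = ((rng.random(), rng.random()) for _ in range(n))
    return [(u, v, u + v, u - v, 2 * u) for u, v in pairs]


print(two_nn_dimension(sample_plane_in_5d(400)))
```

Although each point has five coordinates, the estimate should come out close to 2, the dimension of the plane the data were sampled from.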
Downstream Task Self-Supervised Learning for Object Recognition and Tracking
This dissertation addresses three limitations of deep learning methods in machine vision applications based on image and video understanding. Firstly, although deep convolutional neural networks (CNNs) are efficient for image recognition applications such as object detection and segmentation, they perform poorly under perspective distortions. In real-world applications, camera perspective is a common problem that is typically addressed by annotating large amounts of data, which limits the applicability of deep learning models. Secondly, the typical approach to single-camera tracking problems is to use separate motion and appearance models, which are expensive in terms of computation and training data requirements. Finally, conventional multi-camera video understanding techniques use supervised learning algorithms to determine temporal relationships among objects. In large-scale applications, these methods are also limited by the need for extensive manually annotated data and computational resources. To address these limitations, we develop an uncertainty-aware self-supervised learning (SSL) technique that captures a model's instance or semantic segmentation uncertainty from overhead images and guides the model to learn the impact of the new perspective on object appearance. The test-time data augmentation-based pseudo-label refinement technique continuously trains a model until convergence on new perspective images. The proposed method can be applied for both self-supervision and semi-supervision, thus increasing the effectiveness of a deep pre-trained model in new domains. Extensive experiments demonstrate the effectiveness of the SSL technique in both object detection and semantic segmentation problems. In video understanding applications, we introduce simultaneous segmentation and tracking as an unsupervised spatio-temporal latent feature clustering problem. The jointly learned multi-task features leverage the task-dependent uncertainty to generate discriminative features in multi-object videos. Experiments show that the proposed tracker outperforms several state-of-the-art supervised methods. Finally, we propose an unsupervised multi-camera tracklet association (MCTA) algorithm to track multiple objects in real time. MCTA leverages the self-supervised detector model for single-camera tracking and solves the multi-camera tracking problem using multiple pair-wise camera associations modeled as a connected graph. The graph optimization method generates a global solution for partially or fully overlapping camera networks.
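The step from pair-wise camera associations to a global solution can be pictured as grouping tracklets that are linked, directly or through intermediate cameras, into one identity. The sketch below does this with connected components over a graph of matches. It is a simplified illustration, not the dissertation's graph optimization: the function name, the `(camera, tracklet_id)` node format, and the assumption that all pair-wise matches are already given and correct are ours.

```python
def global_identities(tracklets, pairwise_matches):
    """Assign one global ID per connected component of matched tracklets."""
    parent = {t: t for t in tracklets}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairwise_matches:
        parent[find(a)] = find(b)

    # Number the components 0, 1, 2, ... in order of first appearance.
    ids = {}
    return {t: ids.setdefault(find(t), len(ids)) for t in tracklets}


# Hypothetical tracklets from three cameras and two pair-wise matches.
tracklets = [("cam1", 0), ("cam2", 3), ("cam3", 7), ("cam1", 1)]
matches = [(("cam1", 0), ("cam2", 3)), (("cam2", 3), ("cam3", 7))]
print(global_identities(tracklets, matches))
```

Here the tracklet in `cam1` and the one in `cam3` receive the same global ID even though they were never matched directly, because both are linked through `cam2`.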
- …