65 research outputs found

    Combining Multiple Classifiers with Dynamic Weighted Voting

    When a multiple classifier system is employed, one of the most popular methods for classifier fusion is simple majority voting. However, when the performance of the ensemble members is not uniform, the effectiveness of this type of voting generally suffers. In this paper, new functions for dynamic weighting in classifier fusion are introduced. Experimental results demonstrate the advantages of these novel strategies over the simple voting scheme.
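    As a rough illustration of the contrast drawn in this abstract, the sketch below compares simple majority voting with a weighted vote. The abstract does not specify the paper's weighting functions, so the weights here are a hypothetical input supplied by the caller (for instance, a per-sample competence estimate for each classifier); the function names are also assumptions.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per ensemble member.
    weights: one non-negative weight per ensemble member (hypothetical;
             the paper's dynamic weighting functions are not given here).
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Ties are resolved in favour of the label that was seen first.
    return max(scores, key=scores.get)


def majority_vote(predictions):
    """Simple majority voting is the special case of equal weights."""
    return weighted_vote(predictions, [1.0] * len(predictions))
```

    With equal weights the two weaker members can outvote a stronger one; with unequal weights, a single member that is trusted more on the sample at hand can decide the outcome.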

    Implicitly Constrained Semi-Supervised Least Squares Classification

    We introduce a novel semi-supervised version of the least squares classifier. This implicitly constrained least squares (ICLS) classifier minimizes the squared loss on the labeled data among the set of parameters implied by all possible labelings of the unlabeled data. Unlike other discriminative semi-supervised methods, our approach does not introduce explicit additional assumptions into the objective function, but leverages implicit assumptions already present in the choice of the supervised least squares classifier. We show this approach can be formulated as a quadratic programming problem and its solution can be found using a simple gradient descent procedure. We prove that, in a certain way, our method never leads to performance worse than the supervised classifier. Experimental results corroborate this theoretical result in the multidimensional case on benchmark datasets, also in terms of the error rate. Comment: 12 pages, 2 figures, 1 table. The Fourteenth International Symposium on Intelligent Data Analysis (2015), Saint-Etienne, Franc
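    The constrained minimization described in this abstract can be illustrated in a deliberately simplified setting. The sketch below assumes a one-dimensional feature, a model without intercept, labels coded as 0/1, and soft labels in [0, 1] for the unlabeled points; these are all assumptions made for the example, not details taken from the paper. Under them, the weights reachable by labeling the unlabeled data form an interval, so the quadratic program reduces to clamping the supervised solution into that interval and no gradient descent is needed.

```python
def icls_1d(x_lab, y_lab, x_unl):
    """One-dimensional, no-intercept sketch of the ICLS idea.

    Every soft labeling q in [0, 1]^n of the unlabeled points implies a
    least squares weight
        w(q) = (sum x_l * y_l + sum x_u * q_u) / (sum x_l^2 + sum x_u^2).
    We return the implied weight that minimizes the squared loss on the
    labeled data alone.
    """
    sxy = sum(x * y for x, y in zip(x_lab, y_lab))
    sxx_lab = sum(x * x for x in x_lab)
    sxx_all = sxx_lab + sum(x * x for x in x_unl)

    # Supervised least squares solution (labeled data only).
    w_sup = sxy / sxx_lab

    # w(q) is linear in q, so its range over [0, 1]^n is an interval.
    w_min = (sxy + sum(min(x, 0.0) for x in x_unl)) / sxx_all
    w_max = (sxy + sum(max(x, 0.0) for x in x_unl)) / sxx_all

    # The labeled loss is a convex quadratic in w with its minimum at w_sup,
    # so the constrained minimizer is w_sup clamped into [w_min, w_max].
    return min(max(w_sup, w_min), w_max)
```

    When there are no unlabeled points the interval collapses to the supervised solution, which is returned unchanged.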

    A novel semi-fragile forensic watermarking scheme for remote sensing images

    Peer-reviewed. A semi-fragile watermarking scheme for multiple band images is presented. We propose to embed a mark into remote sensing images by applying a tree structured vector quantization approach to the pixel signatures, instead of processing each band separately. The signature of the multispectral or hyperspectral image is used to embed the mark in order to detect any significant modification of the original image. The image is segmented into three-dimensional blocks and a tree structured vector quantizer is built for each block. These trees are manipulated using an iterative algorithm until the resulting block satisfies a required criterion which establishes the embedded mark. The method is shown to preserve the mark under lossy compression (above a given threshold) while, at the same time, detecting possibly forged blocks and their position in the whole image.

    Very Important Pool (VIP) genes – an application for microarray-based molecular signatures

    Background: Advances in DNA microarray technology portend that microarray-derived molecular signatures will eventually be used in clinical environments and personalized medicine. Derivation of biomarkers is a large step beyond hypothesis generation and imposes considerably more stringency for accuracy in identifying informative gene subsets to differentiate phenotypes. The inherent nature of microarray data, with fewer samples and replicates compared to the large number of genes, requires identifying informative genes prior to classifier construction. However, improving the ability to identify differentiating genes remains a challenge in bioinformatics. Results: A new hybrid gene selection approach was investigated and tested with nine publicly available microarray datasets. The new method identifies a Very Important Pool (VIP) of genes from the broad patterns of gene expression data. The method uses a bagging sampling principle, where the re-sampled arrays are used to identify the most informative genes. Frequency of selection is used in a repetitive process to identify the VIP genes. The putative informative genes are selected using two methods, t-statistic and discriminatory analysis. In the t-statistic, the informative genes are identified based on p-values. In the discriminatory analysis, disjoint Principal Component Analyses (PCAs) are conducted for each class of samples, and genes with high discrimination power (DP) are identified. The VIP gene selection approach was compared with the p-value ranking approach. The genes identified by the VIP method but not by the p-value ranking approach are also related to the disease investigated. More importantly, these genes are part of the pathways derived from the common genes shared by both the VIP and p-ranking methods. Moreover, the binary classifiers built from these genes are statistically equivalent to those built from the top 50 p-value ranked genes in distinguishing different types of samples. Conclusion: The VIP gene selection approach could identify additional subsets of informative genes that would not always be selected by the p-value ranking method. These genes are likely to be additional true positives since they are a part of pathways identified by the p-value ranking method and expected to be related to the relevant biology. Therefore, these additional genes derived from the VIP method potentially provide valuable biological insights.
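    The bagging and selection-frequency idea in this abstract can be sketched roughly as follows. This is not the authors' implementation: it ranks genes by the absolute Welch t-statistic rather than by p-values, omits the PCA-based discriminatory analysis, and the function names, the `top_k` cut-off, the number of rounds, and the `min_freq` threshold are all hypothetical choices made for illustration.

```python
import random
import statistics
from collections import Counter


def welch_t(a, b, eps=1e-12):
    """Welch t-statistic for two samples (eps guards against zero variance)."""
    se2 = statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / (se2 + eps) ** 0.5


def vip_genes(class_a, class_b, top_k, rounds=200, min_freq=0.8, seed=0):
    """Return (vip, freq) where vip lists genes selected in >= min_freq of rounds.

    class_a, class_b: lists of arrays (samples); each array is a list with
    one expression value per gene.
    """
    rng = random.Random(seed)
    n_genes = len(class_a[0])
    counts = Counter()
    for _ in range(rounds):
        # Bagging step: resample the arrays of each class with replacement.
        boot_a = rng.choices(class_a, k=len(class_a))
        boot_b = rng.choices(class_b, k=len(class_b))
        scores = []
        for g in range(n_genes):
            t = welch_t([s[g] for s in boot_a], [s[g] for s in boot_b])
            scores.append((abs(t), g))
        scores.sort(reverse=True)
        # Record which genes made the top_k in this round.
        counts.update(g for _, g in scores[:top_k])
    freq = {g: counts[g] / rounds for g in range(n_genes)}
    vip = sorted(g for g, f in freq.items() if f >= min_freq)
    return vip, freq
```

    Genes that are selected only in a few resamples receive a low frequency and are filtered out, which is the intuition behind using selection frequency rather than a single ranking.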

    A Machine Learning Approach for Identifying Novel Cell Type–Specific Transcriptional Regulators of Myogenesis

    Transcriptional enhancers integrate the contributions of multiple classes of transcription factors (TFs) to orchestrate the myriad spatio-temporal gene expression programs that occur during development. A molecular understanding of enhancers with similar activities requires the identification of both their unique and their shared sequence features. To address this problem, we combined phylogenetic profiling with a DNA-based enhancer sequence classifier that analyzes the TF binding sites (TFBSs) governing the transcription of a co-expressed gene set. We first assembled a small number of enhancers that are active in Drosophila melanogaster muscle founder cells (FCs) and other mesodermal cell types. Using phylogenetic profiling, we increased the number of enhancers by incorporating orthologous but divergent sequences from other Drosophila species. Functional assays revealed that the diverged enhancer orthologs were active in patterns largely similar to those of their D. melanogaster counterparts, although there was extensive evolutionary shuffling of known TFBSs. We then built and trained a classifier using this enhancer set and identified additional related enhancers based on the presence or absence of known and putative TFBSs. Predicted FC enhancers were over-represented in proximity to known FC genes, and many of the TFBSs learned by the classifier were found to be critical for enhancer activity, including POU homeodomain, Myb, Ets, Forkhead, and T-box motifs. Empirical testing also revealed that the T-box TF encoded by org-1 is a previously uncharacterized regulator of muscle cell identity. Finally, we found extensive diversity in the composition of TFBSs within known FC enhancers, suggesting that motif combinatorics plays an essential role in the cellular specificity exhibited by such enhancers. In summary, machine learning combined with evolutionary sequence analysis is useful for recognizing novel TFBSs and for facilitating the identification of cognate TFs that coordinate cell type–specific developmental gene expression patterns.

    A Tutorial on EEG Signal Processing Techniques for Mental State Recognition in Brain-Computer Interfaces

    This chapter presents an introductory overview and a tutorial of signal processing techniques that can be used to recognize mental states from electroencephalographic (EEG) signals in Brain-Computer Interfaces. More particularly, this chapter presents how to extract relevant and robust spectral, spatial and temporal information from noisy EEG signals (e.g., Band Power features, spatial filters such as Common Spatial Patterns or xDAWN, etc.), as well as a few classification algorithms (e.g., Linear Discriminant Analysis) used to classify this information into a class of mental state. It also briefly touches on alternative, but currently less used approaches. The overall objective of this chapter is to provide the reader with practical knowledge about how to analyse EEG signals as well as to stress the key points to understand when performing such an analysis.
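    To make the Band Power feature mentioned in this abstract concrete, the sketch below estimates the power of a single-channel signal in a frequency band with a naive discrete Fourier transform. It is only one possible way to compute such a feature (the chapter may use band-pass filtering and squaring instead), and the function name, normalization, and band edges are assumptions chosen for the example.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes for frequencies in [f_lo, f_hi] Hz.

    signal: list of samples from one channel.
    fs: sampling rate in Hz.
    Uses a naive O(n^2) DFT, which is fine for short illustrative windows.
    """
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)
            )
            power += abs(coeff) ** 2 / n ** 2
    return power
```

    Features of this kind, computed per channel and per band, would typically be passed to a classifier such as Linear Discriminant Analysis.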