61 research outputs found

    Kernel-based distance metric learning for microarray data classification

    BACKGROUND: The most fundamental task using gene expression data in clinical oncology is to classify tissue samples according to their gene expression levels. Compared with traditional pattern classification tasks, classification of gene expression data is typically characterized by high dimensionality and small sample size, which makes the task quite challenging. RESULTS: In this paper, we present a modified K-nearest-neighbor (KNN) scheme, based on learning an adaptive distance metric in the data space, for cancer classification using microarray data. The distance metric, derived from a data-dependent kernel optimization procedure, can substantially increase the class separability of the data and, consequently, lead to a significant improvement in the performance of the KNN classifier. Extensive experiments show that the performance of the proposed kernel-based KNN scheme is competitive with that of sophisticated classifiers such as support vector machines (SVMs) and uncorrelated linear discriminant analysis (ULDA) in classifying gene expression data. CONCLUSION: A novel distance metric is developed and incorporated into the KNN scheme for cancer classification. This metric can substantially increase the class separability of the data in the feature space and, hence, lead to a significant improvement in the performance of the KNN classifier.
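    The idea of running KNN with a kernel-derived distance can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the data-dependent kernel or its optimization, so the conformal form q(x) q(y) k0(x, y), the Gaussian base kernel, the constant factor function `q`, and the toy samples and labels below are all assumptions made for illustration.

```python
import math
from collections import Counter


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian base kernel k0(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def data_dependent_kernel(x, y, q, base=rbf_kernel):
    """Assumed conformal form q(x) * q(y) * k0(x, y); q is a hypothetical factor function."""
    return q(x) * q(y) * base(x, y)


def kernel_distance(x, y, kernel):
    """Distance induced by a kernel: sqrt(k(x, x) - 2 k(x, y) + k(y, y))."""
    d2 = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(d2, 0.0))  # clamp tiny negative rounding errors


def knn_predict(query, samples, labels, kernel, k=3):
    """Majority vote among the k samples closest to query under the kernel distance."""
    ranked = sorted(
        range(len(samples)),
        key=lambda i: kernel_distance(query, samples[i], kernel),
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature toy data standing in for expression profiles.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]


def q(x):
    """Placeholder factor function; a learned q would reshape the metric."""
    return 1.0


def kernel(x, y):
    return data_dependent_kernel(x, y, q)


prediction = knn_predict((0.1, 0.1), samples, labels, kernel, k=3)
```

    With a constant `q` the induced distance ranks neighbors the same way as the Euclidean distance; the scheme described in the abstract would instead learn the kernel from the training data so that same-class samples end up closer together.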

    The Validity of CET-6 among Chinese Students Studying Overseas

    This paper examines the validity of the College English Test Band 6 (CET-6) for Chinese students living overseas, asking whether CET-6 scores truly reflect students' English language ability and whether they can serve as proof of English language proficiency. We conducted a survey using quantitative research methods with a sample of 50 at Universiti Putra Malaysia (UPM). After collecting and analyzing the data, we identify some current issues with the assessment standards of CET-6 and offer suggestions to improve its validity.

    A ship detector applying Principal Component Analysis to the polarimetric Notch Filter

    Ship detection using polarimetric synthetic aperture radar (PolSAR) data has attracted a lot of attention in recent years. Polarimetry can provide information regarding the scattering mechanisms of targets, which helps discriminate between ships and sea clutter. This enhancement is particularly valuable when the aim is to detect smaller vessels in rough sea states. This work builds on a ship detector called the Geometrical Perturbation-Polarimetric Notch Filter (GP-PNF) and aims to improve its performance, especially when fewer polarimetric images are available (e.g., dual-polarimetric data). The idea is to design a new polarimetric feature vector containing more features that are known to allow separation between ships and sea clutter. Principal Component Analysis (PCA) is then used to reduce the dimensionality of the new feature space. Experiments on four real Sentinel-1 datasets are carried out to demonstrate the validity of the proposed method and compare it against other ship detectors. Analyses of the experimental results show that the proposed algorithm not only reduces false alarms significantly, but also enhances the target-to-clutter ratio (TCR) so that it can more effectively detect weaker ships.
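    The dimensionality-reduction step mentioned above can be sketched in a few lines. This is a generic PCA (leading component found by power iteration on the covariance matrix), not the GP-PNF detector or the authors' feature design; the three-element feature vectors and all function names are hypothetical.

```python
import math


def center(rows):
    """Subtract the per-feature mean; return the centered rows and the means."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows], means


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_component(cov, iterations=200):
    """Approximate the top eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows, means, component):
    """Score each row along the component after removing the means."""
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(component)))
        for r in rows
    ]


# Hypothetical feature vectors: the first two features vary together, the third barely varies.
features = [(1.0, 1.0, 0.0), (2.0, 2.1, 0.1), (3.0, 2.9, 0.0), (4.0, 4.0, 0.1)]
centered, means = center(features)
component = leading_component(covariance(centered))
scores = project(features, means, component)
```

    Each three-element vector is reduced to a single score along the direction of greatest variance. A practical pipeline would typically keep several components and use a library eigensolver rather than power iteration.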

    OpenSARUrban: A Sentinel-1 SAR Image Dataset for Urban Interpretation

    The Sentinel-1 mission provides a freely accessible opportunity for urban interpretation from synthetic aperture radar (SAR) images with specific resolution, which is of paramount importance for earth observation. In parallel, with the rapid development of advanced technologies, especially deep learning, there is an urgent need to construct a large-scale SAR dataset for urban interpretation. This paper presents OpenSARUrban, a Sentinel-1 dataset dedicated to urban interpretation from SAR images, and describes its well-defined hierarchical annotation scheme, the data collection, the procedures for dataset construction and organization, and the properties, visualizations, and applications of the dataset. In particular, OpenSARUrban provides 33358 image patches of SAR urban scenes, covering 21 major cities of China, with 10 different categories, 4 kinds of formats, and 2 kinds of polarization modes, and has 5 essential properties: large scale, diversity, specificity, reliability, and sustainability. These properties make several goals achievable for OpenSARUrban. The first is to support urban target characterization. The second is to help develop applicable and advanced algorithms for Sentinel-1 urban target classification. The dataset visualization is implemented from the perspective of manifolds to give an intuitive understanding. Besides a detailed description and visualization of the dataset, we present results of some benchmark algorithms, demonstrating that this dataset is practical and challenging. Notably, developing algorithms that enhance classification performance on the whole dataset while accounting for the data imbalance is especially challenging.

    The large area detector onboard the eXTP mission

    The Large Area Detector (LAD) is the high-throughput, spectral-timing instrument onboard the eXTP mission, a flagship mission of the Chinese Academy of Sciences and the China National Space Administration, with a large European participation coordinated by Italy and Spain. The eXTP mission is currently performing its phase B study, with a target launch at the end of 2027. The eXTP scientific payload includes four instruments (SFA, PFA, LAD and WFM) offering unprecedented simultaneous wide-band X-ray timing and polarimetry sensitivity. The LAD instrument is based on the design originally proposed for the LOFT mission. It envisages a deployed 3.2 m² effective area in the 2-30 keV energy range, achieved through the technology of large-area Silicon Drift Detectors, offering a spectral resolution of up to 200 eV FWHM at 6 keV, and of capillary plate collimators, limiting the field of view to about 1 degree. In this paper we provide an overview of the LAD instrument design, its current status of development and anticipated performance.

    Optimized Kernel Machine for Cancer Classification Using Gene Expression Data

    Cancer classification using gene expression data has been shown to be very useful for cancer diagnosis and prediction. However, the very high dimensionality and relatively small sample size of gene expression data make the classification task quite challenging. In this paper, we present a new approach, based on optimizing the kernel function, to improve the performance of classifiers on gene expression data. Aiming to increase the class separability of the data, we utilize a more flexible kernel function model, the data-dependent kernel, as the objective kernel to be optimized. The experimental results show that using the optimized kernel usually results in a substantial improvement for the K-nearest-neighbor (KNN) and support vector machine (SVM) classifiers in classifying gene expression data.