258 research outputs found

    Stress evolution in GaAsN alloy films

    We have investigated stress evolution in dilute nitride GaAs1−xNx alloy films grown by plasma-assisted molecular-beam epitaxy. For coherently strained films (x < 2.5%) [...] x > 2.5%, in situ wafer curvature measurements reveal a signature for stress relaxation. Atomic force microscopy and transmission electron microscopy measurements indicate that stress relaxation occurs by a combination of elastic relaxation via island formation and plastic relaxation associated with the formation of stacking faults. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/87566/2/103523_1.pd

    Triple dehydrofluorination as a route to amidine-functionalized, aromatic phosphorus heterocycles

    A rapid and simple route to hitherto unknown amidine-functionalized phosphinines has been developed unexpectedly. Starting from primary amines and CF3-substituted λ3,σ2-phosphinines, a cascade of dehydrofluorination reactions leads selectively to ortho-amidinephosphinines. DFT calculations reveal that this unusual transformation can take place via a series of nucleophilic attacks at the electrophilic, low-coordinate phosphorus atom.

    Triple dehydrofluorination as a route to amidine-functionalized, aromatic phosphorus heterocycles

    Hitherto unknown amidine-functionalized phosphabenzenes form selectively by a cascade of dehydrofluorination reactions.

    Improving Cancer Classification Accuracy Using Gene Pairs

    Recent studies suggest that the deregulation of pathways, rather than individual genes, may be critical in triggering carcinogenesis. Pathway deregulation is often caused by the simultaneous deregulation of more than one gene in the pathway. This suggests that robust gene pair combinations may exploit the underlying bio-molecular reactions that are relevant to the pathway deregulation and could thus provide better biomarkers for cancer than individual genes. To validate this hypothesis, we used gene pair combinations, called doublets, as input to cancer classification algorithms instead of the original expression values, and we showed that the classification accuracy improved consistently across different datasets and classification algorithms. We validated the proposed approach using nine cancer datasets and five classification algorithms: Prediction Analysis for Microarrays (PAM), C4.5 Decision Trees (DT), Naive Bayesian (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN).

    Gene selection for classification of microarray data based on the Bayes error

    BACKGROUND: With DNA microarray data, selecting a compact subset of discriminative genes from thousands of genes is a critical step for accurate classification of phenotypes for, e.g., disease diagnosis. Several widely used gene selection methods select top-ranked genes according to their individual discriminative power in classifying samples into distinct categories, without considering correlations among genes. A limitation of these gene selection methods is that they may result in gene sets with some redundancy and yield an unnecessarily large number of candidate genes for classification analyses. Some recent studies show that incorporating gene-to-gene correlations into gene selection can remove redundant genes and improve classification accuracy. RESULTS: In this study, we propose a new method, Based Bayes error Filter (BBF), to select relevant genes and remove redundant genes in classification analyses of microarray data. The effectiveness and accuracy of this method are demonstrated through analyses of five publicly available microarray datasets. The results show that our gene selection method is capable of achieving better accuracies than previous studies, while effectively selecting relevant genes, removing redundant genes, and obtaining small, efficient gene sets for sample classification purposes. CONCLUSION: The proposed method can effectively identify a compact set of genes with high classification accuracy. This study also indicates that application of the Bayes error is a feasible and effective way for removing redundant genes in gene selection.

    ANMM4CBR: a case-based reasoning method for gene expression data classification

    BACKGROUND: Accurate classification of microarray data is critical for successful clinical diagnosis and treatment. The "curse of dimensionality" problem and noise in the data, however, undermine the performance of many algorithms. METHOD: In order to obtain a robust classifier, a novel Additive Nonparametric Margin Maximum for Case-Based Reasoning (ANMM4CBR) method is proposed in this article. ANMM4CBR employs a case-based reasoning (CBR) method for classification. CBR is a suitable paradigm for microarray analysis, where the rules that define the domain knowledge are difficult to obtain because usually only a small number of training samples are available. Moreover, in order to select the most informative genes, we propose to perform feature selection via additively optimizing a nonparametric margin maximum criterion, which is defined based on gene pre-selection and sample clustering. Our feature selection method is very robust to noise in the data. RESULTS: The effectiveness of our method is demonstrated on both simulated and real data sets. We show that the ANMM4CBR method performs better than some state-of-the-art methods such as support vector machine (SVM) and k-nearest neighbor (kNN), especially when the data contain a high level of noise. AVAILABILITY: The source code is attached as an additional file of this paper.

    Supervised group Lasso with applications to microarray data analysis

    BACKGROUND: A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration. RESULTS: We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. CONCLUSION: We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.

    A boosting method for maximizing the partial area under the ROC curve

    BACKGROUND: The receiver operating characteristic (ROC) curve is a fundamental tool for assessing the discriminant performance not only of a single marker but also of a score function combining multiple markers. The area under the ROC curve (AUC) for a score function measures the intrinsic ability of the score function to discriminate between controls and cases. Recently, the partial AUC (pAUC) has received more attention than the AUC, because it can focus on a suitable range of the false positive rate according to various clinical situations. However, existing pAUC-based methods only handle a few markers and do not take nonlinear combinations of markers into consideration. RESULTS: We have developed a new statistical method that focuses on the pAUC based on a boosting technique. The markers are combined componentwise to maximize the pAUC in the boosting algorithm, using natural cubic splines or decision stumps (single-level decision trees) according to whether the marker values are continuous or discrete. We show that the resulting score plots are useful for understanding how each marker is associated with the outcome variable. We compare the performance of the proposed boosting method with that of other existing methods, and demonstrate its utility using real data sets. The proposed method achieves much better discrimination performance in terms of the pAUC in both simulation studies and real data analysis. CONCLUSIONS: The proposed method addresses how to combine the markers after a pAUC-based filtering procedure in a high-dimensional setting. Hence, it provides a consistent way of analyzing data based on the pAUC, from marker selection to marker combination, for discrimination problems. The method can capture not only linear but also nonlinear associations between the outcome variable and the markers; such nonlinearity is known to be necessary in general for the maximization of the pAUC. The method also emphasizes the accuracy of classification as well as the interpretability of the association, by offering simple and smooth score plots for each marker.
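
    The partial AUC mentioned in this abstract can be illustrated with a minimal sketch. It computes only the empirical pAUC over the false positive rate range [0, max_fpr] by trapezoidal integration of the ROC curve; it is not the authors' boosting algorithm. The function names `roc_points` and `partial_auc`, the parameter `max_fpr`, and the toy scores are assumptions made for illustration.

```python
def roc_points(control_scores, case_scores):
    """Return empirical ROC points (fpr, tpr), ordered from (0, 0) to (1, 1).

    Higher scores are assumed (for this sketch) to indicate a case.
    """
    thresholds = sorted(set(control_scores) | set(case_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        fpr = sum(s >= t for s in control_scores) / len(control_scores)
        tpr = sum(s >= t for s in case_scores) / len(case_scores)
        points.append((fpr, tpr))
    return points


def partial_auc(control_scores, case_scores, max_fpr=1.0):
    """Trapezoidal area under the empirical ROC curve for FPR in [0, max_fpr]."""
    area = 0.0
    pts = roc_points(control_scores, case_scores)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy scores, not taken from the paper.
controls = [0.1, 0.2, 0.3, 0.4]
cases = [0.35, 0.5, 0.6, 0.7]
full_auc = partial_auc(controls, cases)             # whole FPR range
low_fpr_auc = partial_auc(controls, cases, 0.25)    # FPR restricted to [0, 0.25]
```

    With `max_fpr=1.0` the function reduces to the ordinary AUC, so the same routine can compare a score function on the full range and on a clinically relevant low false positive range. Note that the raw pAUC is bounded above by `max_fpr`, not by 1.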