
    Using similarity metrics for mining variability from software repositories


    Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases

    Recent advances of information technology in biomedical sciences and other applied areas have created numerous large, diverse data sets with a high dimensional feature space, which provide a tremendous amount of information and new opportunities for improving the quality of human life. Meanwhile, great challenges also arise from the continuous arrival of new data, which requires researchers to convert these raw data into scientific knowledge in order to benefit from them. Association studies of complex diseases using SNP data have become increasingly popular in biomedical research in recent years. In this paper, we present a review of recent statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic association studies for complex diseases. The review includes both general feature reduction approaches for high dimensional correlated data and more specific approaches for SNP data, which include unsupervised haplotype mapping, tag SNP selection, and supervised SNP selection using statistical testing/scoring, statistical modeling and machine learning methods, with an emphasis on how to identify interacting loci. Comment: Published at http://dx.doi.org/10.1214/07-SS026 in the Statistics Surveys (http://www.i-journals.org/ss/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Strategic Project Portfolio Management by Predicting Project Performance and Estimating Strategic Fit

    Candidate project selections are crucial for infrastructure construction companies. First, they determine how well the planned strategy will be realized during the following years. If the selected projects do not align with the competences of the organization, major losses can occur during the projects' execution phase. Second, participating in tendering competitions is costly manual labour, and losing a bid directly increases the overhead costs of the organization. Still, contractors rarely utilize statistical methods to select projects that are more likely to be successful. In response to these two issues, a tool for the project portfolio selection phase was developed based on existing literature about strategic fit estimation and project performance prediction. One way to define the strategic fit of a project is to evaluate the alignment between the characteristics of the project and the strategic objectives of an organisation. Project performance, on the other hand, can be measured with various financial, technical, production, risk or human-resource related criteria. Depending on which measure is highlighted, the likelihood of succeeding with regard to a performance measure can be predicted with numerous machine learning methods, of which decision trees were used in this study. By combining the strategic fit and likelihood of success measures, a two-by-two matrix was formed. The matrix can be used to categorize the project opportunities into four categories (ignore, analyse, cash-in and focus) that can guide candidate project selections. To test and demonstrate the performance of the matrix, the case company's CRM data was used to estimate strategic fit and the likelihood of succeeding in tendering competitions. First, the projects were plotted on the matrix, and their position and accuracy were analysed per quartile. Afterwards, the project selections were simulated and compared against the case company's real selections during a six-month period.
The first implication after plotting the projects on the matrix was that only a handful of projects were positioned in the focus category of the matrix, which indicates a discrepancy between the planned strategy and the competences of the case company in tendering competitions. Second, the tendering competition outcomes were easier to predict in the low strategic fit quartiles, as the project selections in them were more accurate than in the high strategic fit categories. Finally, the matrix also quite accurately filtered the worst low strategic fit projects out from the market. The simulation was done in two stages. First, when the likelihood of success predictions were emphasized, the matrix increased the hit rate and average strategic fit of the selected project portfolio. When strategic fit values were emphasized, on the other hand, the simulation did not yield useful results. The study contributes to the project portfolio management literature by developing a practice-oriented tool that emphasizes the strategic and statistical perspectives of the candidate project selection phase.
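
The two-by-two matrix described in this abstract can be illustrated with a minimal sketch. The abstract names the four categories but does not say which quadrant each one occupies, so the mapping below is an assumption (focus = high fit and high likelihood, analyse = high fit only, cash-in = high likelihood only, ignore = neither). The function name `categorize`, the 0 to 1 score scale and the 0.5 thresholds are likewise hypothetical and not taken from the study.

```python
def categorize(strategic_fit: float,
               success_likelihood: float,
               fit_threshold: float = 0.5,
               success_threshold: float = 0.5) -> str:
    """Place a project opportunity in one cell of a two-by-two matrix.

    Both scores are assumed to lie in [0, 1]. The thresholds and the
    quadrant-to-category mapping are illustrative assumptions; the
    abstract does not specify them.
    """
    high_fit = strategic_fit >= fit_threshold
    high_success = success_likelihood >= success_threshold

    if high_fit and high_success:
        return "focus"      # assumed: fits the strategy and is likely to be won
    if high_fit:
        return "analyse"    # assumed: fits the strategy but success is unlikely
    if high_success:
        return "cash-in"    # assumed: likely to be won but a weak strategic fit
    return "ignore"         # assumed: weak on both measures


if __name__ == "__main__":
    # Hypothetical opportunities: (name, strategic fit, likelihood of success)
    opportunities = [
        ("bridge renovation", 0.8, 0.7),
        ("tunnel contract", 0.9, 0.2),
        ("road resurfacing", 0.3, 0.8),
        ("harbour extension", 0.1, 0.1),
    ]
    for name, fit, likelihood in opportunities:
        print(f"{name}: {categorize(fit, likelihood)}")
```

In the study, the likelihood of success is predicted with decision trees; here it is simply passed in as a number so that the sketch stays independent of any particular model.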