9 research outputs found

    Tables of moments of sample extremes of order statistics from discrete uniform distribution

    In this paper, moments of the sample extremes of order statistics from a discrete uniform distribution are given. For n up to 15, algebraic expressions for the expected values and variances of the sample extremes are obtained. It is shown that, with the help of the sum s_n^(k), one can obtain all moments of the sample extremes of order statistics from a discrete uniform distribution. Furthermore, for sample size k = 20 and n = 1(1)20, numerical results are calculated using Matlab.
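A minimal sketch of how such moments can be computed exactly, assuming n independent draws from the uniform distribution on {1, ..., k}. It relies only on the standard identities P(max <= x) = (x/k)^n and P(min > x) = ((k - x)/k)^n. The function name `extreme_moments` is hypothetical, and this is not the paper's Matlab code or its s_n^(k) formulation.

```python
from fractions import Fraction


def extreme_moments(k: int, n: int, order: int = 1):
    """Return (E[min**order], E[max**order]) for n i.i.d. draws
    from the discrete uniform distribution on {1, ..., k}.

    Hypothetical helper for illustration; uses exact rational arithmetic.
    """
    def cdf_max(x: int) -> Fraction:
        # P(max <= x): all n draws fall in {1, ..., x}
        return Fraction(x, k) ** n

    def sf_min(x: int) -> Fraction:
        # P(min > x): all n draws fall in {x + 1, ..., k}
        return Fraction(k - x, k) ** n

    e_max = sum(x ** order * (cdf_max(x) - cdf_max(x - 1))
                for x in range(1, k + 1))
    e_min = sum(x ** order * (sf_min(x - 1) - sf_min(x))
                for x in range(1, k + 1))
    return e_min, e_max


# Two fair dice (k = 6, n = 2): E[min] = 91/36, E[max] = 161/36
e_min, e_max = extreme_moments(6, 2)
_, e_max_sq = extreme_moments(6, 2, order=2)
var_max = e_max_sq - e_max ** 2  # variance from the first two moments
```

Using `Fraction` keeps the results exact, which makes them comparable with closed-form algebraic expressions rather than floating-point tables.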

    Statistical method on nonrandom clustering with application to somatic mutations in cancer

    Background: Human cancer is caused by the accumulation of tumor-specific mutations in oncogenes and tumor suppressors that confer a selective growth advantage to cells. As a consequence of genomic instability and high levels of proliferation, many passenger mutations that do not contribute to the cancer phenotype arise alongside mutations that drive oncogenesis. While several approaches have been developed to separate driver mutations from passengers, few approaches can specifically identify activating driver mutations in oncogenes, which are more amenable to pharmacological intervention. Results: We propose a new statistical method for detecting activating mutations in cancer by identifying nonrandom clusters of amino acid mutations in protein sequences. A probability model is derived using order statistics, assuming that the location of amino acid mutations on a protein follows a uniform distribution. Our statistical measure is the difference between pairs of order statistics, which is equivalent to the size of an amino acid mutation cluster, and the probabilities are derived from exact and approximate distributions of the statistical measure. Using data in the Catalog of Somatic Mutations in Cancer (COSMIC) database, we have demonstrated that our method detects well-known clusters of activating mutations in KRAS, BRAF, PI3K, and β-catenin. The method can also identify new cancer targets as well as gain-of-function mutations in tumor suppressors. Conclusions: Our proposed method is useful for discovering activating driver mutations in cancer by identifying nonrandom clusters of somatic amino acid mutations in protein sequences.
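The idea of scoring a cluster by the difference between two order statistics can be illustrated with a rough sketch. It assumes mutation positions are rescaled to [0, 1] and treated as n independent continuous Uniform(0, 1) points, in which case X_(i+r) - X_(i) follows a Beta(r, n - r + 1) distribution whose CDF at d equals P(Binomial(n, d) >= r). The function name `cluster_pvalue` is hypothetical; this is a continuous approximation for a single pre-specified pair of order statistics, not the paper's exact discrete model, and it applies no multiple-testing correction.

```python
from math import comb


def cluster_pvalue(n: int, r: int, d: float) -> float:
    """P(X_(i+r) - X_(i) <= d) for n i.i.d. Uniform(0, 1) points.

    n: number of mutations on the protein
    r: number of gaps spanned by the cluster (cluster holds r + 1 mutations)
    d: cluster width as a fraction of protein length
    Hypothetical helper; assumes the continuous-uniform null model.
    """
    if not 1 <= r <= n - 1:
        raise ValueError("r must satisfy 1 <= r <= n - 1")
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in [0, 1]")
    # Beta(r, n - r + 1) CDF written as a binomial upper tail
    return sum(comb(n, j) * d ** j * (1.0 - d) ** (n - j)
               for j in range(r, n + 1))


# Example: 40 mutations, 11 of them packed into 2% of the protein length
p = cluster_pvalue(n=40, r=10, d=0.02)
```

A small value of `p` indicates that such a tight cluster would be unlikely if mutations were scattered uniformly along the protein.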

    Consumer Search on the Internet

    This paper uses consumer search data to explain search frictions in online markets, within the context of an equilibrium search model. I use a novel dataset of consumer online browsing and purchasing behavior, which tracks all consumer search prior to each transaction. Using observed search intensities from the online book industry, I estimate search cost distributions that allow for asymmetric consumer sampling. Research on consumer search often assumes a symmetric sampling rule for analytical convenience despite its lack of realism. Search behavior in the online book industry is quite limited: in only 25 percent of the transactions did consumers visit more than one bookstore's website. The industry is characterized by a strong consumer preference for certain retailers. Accounting for unequal consumer sampling halves the search cost estimates from 1.8 to 0.9 dollars per search in the online book industry. Analysis of time spent online suggests substitution between the time consumers spend searching and the relative opportunity cost of their time. Retired people, those with lower education levels, and minorities (with the exception of Hispanics) spent significantly more time searching for a book online. There is a negative relationship between income levels and time spent searching. Keywords: consumer search, internet, search costs

    Novel Rank-Based Statistical Methods Reveal MicroRNAs with Differential Expression in Multiple Cancer Types

    BACKGROUND: MicroRNAs (miRNAs) regulate target genes at the post-transcriptional level and play important roles in cancer pathogenesis and development. Variation amongst individuals is a significant confounding factor in miRNA (or other) expression studies. The true character of biologically or clinically meaningful differential expression can be obscured by inter-patient variation. In this study we aim to identify miRNAs with consistent differential expression in multiple tumor types using a novel data analysis approach. METHODS: Using microarrays, we profiled the expression of more than 700 miRNAs in 28 matched tumor/normal samples from 8 different tumor types (breast, colon, liver, lung, lymphoma, ovary, prostate and testis). This set is unique in putting emphasis on minimizing tissue-type and patient-related variability by using normal and tumor samples from the same patient. We develop scores for comparing miRNA expression in the above matched sample data based on a rigorous characterization of the distribution of order statistics over a discrete state set, including exact p-values. Specifically, we compute a Rank Consistency Score (RCoS) for every miRNA measured in our data. Our methods are also applicable in various other contexts. We compare our methods, as applied to matched samples, to the paired t-test and to the Wilcoxon Signed Rank test. RESULTS: We identify miRNAs that are consistently differentially expressed across the cancer types measured. 41 miRNAs are under-expressed in cancer compared to normal at an FDR (False Discovery Rate) of 0.05, and 17 are over-expressed at the same FDR level. Differentially expressed miRNAs include known oncomiRs (e.g., miR-96) as well as miRNAs that were not previously universally associated with cancer. Specific examples include miR-133b and miR-486-5p, which are consistently down-regulated, and miR-629*, which is consistently up-regulated in cancer, in the context of our cohort. Data is available in GEO. Software is available at: http://bioinfo.cs.technion.ac.il/people/zohar/RCoS

    Probabilistic Issues in Biometric Template Design, Journal of Telecommunications and Information Technology, 2010, nr 4

    Since the notion of a biometric template is not well defined, various concepts are used in biometrics practice. In this paper we present a systematic view of a family of template concepts based on the L1 or L2 dissimilarities. In particular, for sample vectors of independent components, we determine how likely it is for the median code to be a sample vector.

    The distribution of order statistics for discrete random variables with applications to bootstrapping

    An algorithm for computing the PDF of order statistics drawn from discrete parent populations is presented, along with an implementation of the algorithm in a computer algebra system. Several examples and applications, including exact bootstrapping analysis, illustrate the utility of this algorithm. Bootstrapping procedures require that B bootstrap samples be generated in order to perform statistical inference concerning a data set. Although the requirements for the magnitude of B are typically modest, a practitioner would prefer to avoid the resampling error introduced by choosing a finite B, if possible. The part of the order-statistic algorithm for sampling with replacement from a finite sample can be used to perform exact bootstrapping analysis in certain applications, eliminating the need for replication in the analysis of a data set.
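The exact-bootstrap idea can be sketched for one simple case: the distribution of the r-th order statistic of a bootstrap resample (n draws with replacement from the n observed values). The sketch uses the standard identity that the r-th order statistic is at most x exactly when at least r of the n draws are at most x. The function name `bootstrap_order_stat_pmf` is hypothetical, and this is an illustration under those assumptions, not the paper's computer-algebra implementation.

```python
from fractions import Fraction
from math import comb


def bootstrap_order_stat_pmf(data, r):
    """Exact PMF of the r-th smallest value in a bootstrap resample.

    data: observed sample (ties allowed); resample size equals len(data)
    r:    rank of the order statistic, 1 <= r <= len(data)
    Hypothetical helper; returns {value: probability} with exact Fractions.
    """
    n = len(data)
    if not 1 <= r <= n:
        raise ValueError("r must satisfy 1 <= r <= len(data)")

    def cdf_order_stat(f: Fraction) -> Fraction:
        # P(at least r of n draws are <= x), where f = P(single draw <= x)
        return sum(comb(n, j) * f ** j * (1 - f) ** (n - j)
                   for j in range(r, n + 1))

    pmf = {}
    previous = Fraction(0)
    count_le = 0
    for value in sorted(set(data)):
        count_le += data.count(value)  # empirical CDF numerator at this value
        current = cdf_order_stat(Fraction(count_le, n))
        pmf[value] = current - previous
        previous = current
    return pmf


# Exact bootstrap distribution of the median of the sample [1, 2, 3]
median_pmf = bootstrap_order_stat_pmf([1, 2, 3], r=2)
exact_mean = sum(v * p for v, p in median_pmf.items())
```

Because the probabilities are computed in closed form, no choice of B is involved, and quantities such as `exact_mean` carry no resampling error.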