
    Unbiased Assessment Of First Language Acquisition In English: Distinguishing Development And Dialect From Disorder

    No abstract available

    Improved model identification for non-linear systems using a random subsampling and multifold modelling (RSMM) approach

    In non-linear system identification, the available observed data are conventionally partitioned into two parts: the training data that are used for model identification and the test data that are used for model performance testing. This sort of 'hold-out' or 'split-sample' data partitioning method is convenient, and the associated model identification procedure is in general easy to implement. The resultant model obtained from such a once-partitioned single training dataset, however, may occasionally lack the robustness and generalisation needed to represent future unseen data, because the performance of the identified model may be highly dependent on how the data partition is made. To overcome this drawback of the hold-out data partitioning method, this study presents a new random subsampling and multifold modelling (RSMM) approach to produce less biased or, preferably, unbiased models. The basic idea and the associated procedure are as follows. First, generate K training datasets (and also K validation datasets) using a K-fold random subsampling method. Secondly, detect significant model terms and identify a common model structure that fits all K datasets using a newly proposed common model selection approach, called the multiple orthogonal search algorithm. Finally, estimate and refine the model parameters of the identified common-structured model using a multifold parameter estimation method. The proposed method can produce robust models with better generalisation performance.
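
    The three-step procedure above can be illustrated with a small, simplified sketch. It assumes X and y are NumPy arrays and the candidate model is linear in the parameters (for example, polynomial NARX terms collected as columns of a design matrix X), uses plain greedy forward selection on each random subsample in place of the paper's multiple orthogonal search algorithm, takes the terms selected on every fold as the common structure, and omits the separate validation subsets for brevity; all function names and default settings are illustrative, not the authors' implementation.

```python
# Simplified RSMM-style sketch: K random subsamples, per-fold term selection,
# a common structure from the intersection, and averaged least-squares estimates.
import numpy as np

def forward_select(X, y, n_terms):
    """Greedy forward selection of column indices of X that best explain y."""
    selected, residual = [], y.astype(float).copy()
    for _ in range(n_terms):
        scores = np.full(X.shape[1], -np.inf)
        for j in range(X.shape[1]):
            if j not in selected:
                scores[j] = abs(X[:, j] @ residual) / (np.linalg.norm(X[:, j]) + 1e-12)
        selected.append(int(np.argmax(scores)))
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef   # refit on the selected terms, update residual
    return set(selected)

def rsmm(X, y, K=5, n_terms=5, train_frac=0.7, seed=0):
    """Random subsampling and multifold modelling, heavily simplified."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = [rng.permutation(n)[: int(train_frac * n)] for _ in range(K)]
    # Steps 1-2: select terms on each random subsample, keep the common structure.
    per_fold = [forward_select(X[idx], y[idx], n_terms) for idx in folds]
    common = sorted(set.intersection(*per_fold))
    # Step 3: multifold parameter estimation for the common-structured model
    # (least-squares estimates averaged over the K subsamples).
    thetas = [np.linalg.lstsq(X[idx][:, common], y[idx], rcond=None)[0] for idx in folds]
    return common, np.mean(thetas, axis=0)
```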

    Clustering exact matches of pairwise sequence alignments by weighted linear regression

    Background: At intermediate stages of genome assembly projects, when a number of contigs have been generated and their validity needs to be verified, it is desirable to align these contigs to a reference genome when one is available. The interest is not to analyze a detailed alignment between a contig and the reference genome at the base level, but rather to obtain a rough estimate of where the contig aligns to the reference genome, specifically by identifying the starting and ending positions of that region. This information is very useful in ordering the contigs, facilitating post-assembly analyses such as gap closure and repeat resolution. Programs such as BLAST and MUMmer can quickly align and identify high-similarity segments between two sequences, which, when seen in a dot plot, tend to agglomerate along a diagonal but can also be disrupted by gaps or shifted away from the main diagonal by mismatches between the contig and the reference. Visually inspecting the dot plots of the large number of contigs produced by a sequence assembly project is tedious and practically impossible, and a forced global alignment between a contig and the reference is not only time-consuming but often meaningless. Results: We have developed an algorithm that takes the coordinates of all the exact matches or high-similarity local alignments, clusters them with respect to the main diagonal in the dot plot using a weighted linear regression technique, and identifies the starting and ending coordinates of the region of interest. Conclusion: This algorithm complements existing pairwise sequence alignment packages by replacing the time-consuming seed extension phase with a weighted linear regression over the alignment seeds. Experiments show that the gain in execution time can be substantial without compromising accuracy. This method should be of great utility to sequence assembly and genome comparison projects.
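
    As a rough illustration of the clustering step described in the Results, the sketch below assumes each alignment seed is given as (contig coordinate, reference coordinate, match length), fits the diagonal with a length-weighted linear regression, trims seeds whose residual from the fitted line is too large, and reports the start and end of the covered reference region. The residual threshold and iteration count are arbitrary choices for illustration, not values from the paper.

```python
# Length-weighted regression clustering of alignment seeds (illustrative sketch).
import numpy as np

def cluster_seeds(seeds, max_residual=5000, n_iter=3):
    """seeds: iterable of (contig_pos, ref_pos, length) for exact/near-exact matches."""
    seeds = np.asarray(seeds, dtype=float)
    keep = np.ones(len(seeds), dtype=bool)
    for _ in range(n_iter):
        x, y, w = seeds[keep, 0], seeds[keep, 1], seeds[keep, 2]
        a, b = np.polyfit(x, y, deg=1, w=w)          # weighted fit of y = a*x + b
        residual = np.abs(seeds[:, 1] - (a * seeds[:, 0] + b))
        keep = residual < max_residual               # drop off-diagonal seeds
    kept = seeds[keep]
    start = int(kept[:, 1].min())                    # start of the region on the reference
    end = int((kept[:, 1] + kept[:, 2]).max())       # end of the region on the reference
    return start, end, (a, b)

# Example: three roughly co-linear seeds plus one off-diagonal repeat match.
print(cluster_seeds([(0, 10000, 400), (1200, 11150, 300),
                     (2500, 12600, 350), (1300, 25000, 150)]))
```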

    Quantifying the behavior of stock correlations under market stress

    Understanding correlations in complex systems is crucial in the face of turbulence, such as the ongoing financial crisis. However, in complex systems such as financial systems, correlations are not constant but instead vary in time. Here we address the question of quantifying state-dependent correlations in stock markets. Reliable estimates of correlations are absolutely necessary to protect a portfolio. We analyze 72 years of daily closing prices of the 30 stocks forming the Dow Jones Industrial Average (DJIA). We find the striking result that the average correlation among these stocks scales linearly with market stress, as reflected by normalized DJIA index returns on various time scales. Consequently, the diversification effect that should protect a portfolio melts away in times of market losses, just when it would most urgently be needed. Our empirical analysis is consistent with the interesting possibility that one could anticipate diversification breakdowns, guiding the design of protected portfolios.
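
    A toy version of this measurement can be sketched as follows. It assumes a pandas DataFrame of daily closing prices for the index constituents (a hypothetical input standing in for the 72 years of DJIA data), uses an equal-weighted average of constituent returns as the index proxy, and regresses the window-averaged mean pairwise correlation on the window-averaged normalised return; the window length and index proxy are illustrative choices, not the paper's exact definitions.

```python
# Sketch: mean pairwise correlation of constituent stocks versus normalised
# index returns, estimated over non-overlapping windows (illustrative only).
import numpy as np
import pandas as pd

def correlation_vs_stress(prices: pd.DataFrame, window: int = 60):
    returns = np.log(prices).diff().dropna()
    index_ret = returns.mean(axis=1)                           # equal-weighted index proxy
    stress = (index_ret - index_ret.mean()) / index_ret.std()  # normalised index return
    rows = []
    for end in range(window, len(returns) + 1, window):
        chunk = returns.iloc[end - window:end]
        corr = chunk.corr().to_numpy()
        avg_corr = corr[np.triu_indices_from(corr, k=1)].mean()  # mean off-diagonal correlation
        rows.append((stress.iloc[end - window:end].mean(), avg_corr))
    out = pd.DataFrame(rows, columns=["normalised_return", "mean_correlation"])
    slope, intercept = np.polyfit(out["normalised_return"], out["mean_correlation"], 1)
    return out, slope, intercept
```

    Under the behaviour reported above, stronger losses (more negative normalised returns) coincide with higher average correlation, so the fitted slope would be expected to come out negative.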

    Evaluating the Quality of Research into a Single Prognostic Biomarker: A Systematic Review and Meta-analysis of 83 Studies of C-Reactive Protein in Stable Coronary Artery Disease

    Background: Systematic evaluations of the quality of research on a single prognostic biomarker are rare. We sought to evaluate the quality of prognostic research evidence for the association of C-reactive protein (CRP) with fatal and nonfatal events among patients with stable coronary disease. Methods and Findings: We searched MEDLINE (1966 to 2009) and EMBASE (1980 to 2009) and selected prospective studies of patients with stable coronary disease reporting a relative risk for the association of CRP with death and nonfatal cardiovascular events. We included 83 studies, reporting 61,684 patients and 6,485 outcome events. No study reported a prespecified statistical analysis protocol; only two studies reported the time elapsed (in months or years) between initial presentation of symptomatic coronary disease and inclusion in the study. Studies reported a median of seven items (of 17) from the REMARK reporting guidelines, with no evidence of change over time. The pooled relative risk for the top versus bottom third of the CRP distribution was 1.97 (95% confidence interval [CI] 1.78–2.17), with substantial heterogeneity (I² = 79.5). Only 13 studies adjusted for conventional risk factors (age, sex, smoking, obesity, diabetes, and low-density lipoprotein [LDL] cholesterol), and these had a relative risk of 1.65 (95% CI 1.39–1.96), I² = 33.7. Studies reported ten different ways of comparing CRP values, with weaker relative risks for those based on continuous measures. Adjusting for publication bias (for which there was strong evidence, Egger's p < 0.001) using a validated method reduced the relative risk to 1.19 (95% CI 1.13–1.25). Only two studies reported a measure of discrimination (c-statistic). In 20 studies the detection rate for subsequent events could be calculated: it was 31% for a 10% false positive rate, and the calculated pooled c-statistic was 0.61 (0.57–0.66). Conclusion: Multiple types of reporting bias, and publication bias, make the magnitude of any independent association between CRP and prognosis among patients with stable coronary disease sufficiently uncertain that no clinical practice recommendations can be made. Publication of prespecified statistical analytic protocols and prospective registration of studies, among other measures, might help improve the quality of prognostic biomarker research.
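
    To make the quoted quantities concrete, the sketch below shows one standard way of combining study-level relative risks into a pooled estimate with Cochran's Q and I² (DerSimonian-Laird random-effects pooling is assumed; the abstract does not state which estimator was used). The input values are placeholders, not data from the 83 studies.

```python
# Random-effects pooling of relative risks with heterogeneity I^2 (illustrative).
import numpy as np

def pool_relative_risks(rr, ci_low, ci_high):
    rr, lo, hi = map(np.asarray, (rr, ci_low, ci_high))
    log_rr = np.log(rr)
    se = (np.log(hi) - np.log(lo)) / (2 * 1.96)      # SE recovered from the 95% CI
    w = 1.0 / se**2                                  # inverse-variance (fixed-effect) weights
    fixed = np.sum(w * log_rr) / np.sum(w)
    q = np.sum(w * (log_rr - fixed) ** 2)            # Cochran's Q
    df = len(rr) - 1
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
    w_re = 1.0 / (se**2 + tau2)                      # random-effects weights
    pooled = np.sum(w_re * log_rr) / np.sum(w_re)
    half = 1.96 * np.sqrt(1.0 / np.sum(w_re))
    return np.exp(pooled), (np.exp(pooled - half), np.exp(pooled + half)), i2

# Placeholder inputs: three hypothetical studies, each reporting an RR with its 95% CI.
print(pool_relative_risks([1.8, 2.3, 1.5], [1.4, 1.6, 1.1], [2.3, 3.3, 2.0]))
```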

    Multi-institutional evaluation of a Pareto navigation guided automated radiotherapy planning solution for prostate cancer

    Background: Current automated planning solutions are calibrated using trial and error or machine learning on historical datasets. Neither method allows for the intuitive exploration of differing trade-off options during calibration, which may aid in ensuring automated solutions align with clinical preference. Pareto navigation provides this functionality and offers a potential calibration alternative. The purpose of this study was to validate an automated radiotherapy planning solution with a novel multi-dimensional Pareto navigation calibration interface across two external institutions for prostate cancer. Methods: The implemented ‘Pareto Guided Automated Planning’ (PGAP) methodology was developed in RayStation using scripting and consisted of a Pareto navigation calibration interface built upon a ‘Protocol Based Automatic Iterative Optimisation’ planning framework. Thirty previous patients were randomly selected by each institution (IA and IB), 10 for calibration and 20 for validation. Utilising the Pareto navigation interface, automated protocols were calibrated to the institutions’ clinical preferences. A single automated plan (VMATAuto) was generated for each validation patient, with plan quality compared against the previously treated clinical plan (VMATClinical) both quantitatively, using a range of DVH metrics, and qualitatively, through blind review at the external institution. Results: PGAP led to marked improvements across the majority of rectal dose metrics, with Dmean reduced by 3.7 Gy and 1.8 Gy for IA and IB, respectively (p < 0.001). For the bladder, results were mixed, with low and intermediate dose metrics reduced for IB but increased for IA. Differences, whilst statistically significant (p < 0.05), were small and not considered clinically relevant. The reduction in rectum dose was not at the expense of PTV coverage (D98% was generally improved with VMATAuto), but was somewhat detrimental to PTV conformality. The prioritisation of the rectum over conformality was, however, aligned with preferences expressed during calibration and was a key driver in both institutions demonstrating a clear preference towards VMATAuto, with 31/40 plans considered superior to VMATClinical upon blind review. Conclusions: PGAP enabled intuitive adaptation of automated protocols to an institution’s planning aims and yielded plans more congruent with the institution’s clinical preference than the locally produced manual clinical plans.
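
    As a conceptual illustration only (not the RayStation-based PGAP implementation), Pareto navigation can be thought of as sliding between pre-computed anchor plans that each push one planning goal to its extreme. The sketch below interpolates a small set of hypothetical anchor-plan metrics with user-chosen slider weights; all plan names and metric values are invented for illustration.

```python
# Toy Pareto navigation: interpolate hypothetical anchor plans with slider weights.
import numpy as np

anchor_plans = {  # illustrative metric values only
    "rectum_sparing": {"rectum_Dmean_Gy": 28.0, "ptv_D98_pct": 95.0, "conformity_index": 0.80},
    "ptv_coverage":   {"rectum_Dmean_Gy": 34.0, "ptv_D98_pct": 98.5, "conformity_index": 0.88},
    "conformal":      {"rectum_Dmean_Gy": 32.0, "ptv_D98_pct": 97.0, "conformity_index": 0.95},
}

def navigate(weights):
    """Return interpolated plan metrics for slider weights over the anchor plans."""
    names = list(anchor_plans)
    w = np.array([weights.get(n, 0.0) for n in names], dtype=float)
    w /= w.sum()                                      # normalise the sliders
    return {m: float(sum(wi * anchor_plans[n][m] for wi, n in zip(w, names)))
            for m in anchor_plans[names[0]]}

# Example: favour rectum sparing while retaining some weight on PTV coverage.
print(navigate({"rectum_sparing": 0.6, "ptv_coverage": 0.3, "conformal": 0.1}))
```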

    The GATA1s isoform is normally down-regulated during terminal haematopoietic differentiation and over-expression leads to failure to repress MYB, CCND2 and SKI during erythroid differentiation of K562 cells

    Background: Although GATA1 is one of the most extensively studied haematopoietic transcription factors, little is currently known about the physiological functions of its naturally occurring isoforms GATA1s and GATA1FL in humans, particularly whether the isoforms have distinct roles in different lineages and whether they have non-redundant roles in haematopoietic differentiation. As well as being of general interest for the understanding of haematopoiesis, GATA1 isoform biology is important for children with Down syndrome-associated acute megakaryoblastic leukaemia (DS-AMKL), where GATA1FL mutations are an essential driver of disease pathogenesis. Methods: Human primary cells and cell lines were analyzed using GATA1 isoform-specific PCR. K562 cells expressing GATA1s or GATA1FL transgenes were used to model the effects of the two isoforms on in vitro haematopoietic differentiation. Results: We found no evidence for lineage-specific use of GATA1 isoforms; however, GATA1s transcripts, but not GATA1FL transcripts, are down-regulated during in vitro induction of terminal megakaryocytic and erythroid differentiation in the cell line K562. In addition, transgenic K562-GATA1s and K562-GATA1FL cells have distinct gene expression profiles both in steady state and during terminal erythroid differentiation, with GATA1s expression characterised by a lack of repression of MYB, CCND2 and SKI. Conclusions: These findings support the theory that the GATA1s isoform plays a role in the maintenance of proliferative multipotent megakaryocyte-erythroid precursor cells and must be down-regulated prior to terminal differentiation. In addition, our data suggest that SKI may be a potential therapeutic target for the treatment of children with DS-AMKL.