
    Evaluation of the performance criteria of optimal futility stopping boundaries in flexible designs

    Group sequential designs and adaptive designs are flexible designs that are frequently applied in clinical trials. Unlike fixed designs, flexible designs allow statistical inferences on trial endpoints before data collection is complete. Such early inferences may lead to different decisions about trial continuation after an interim analysis. If the treatment effect can already be demonstrated, the trial may be stopped early for efficacy. Conversely, if the interim inference indicates a small treatment effect, the trial may be stopped early for futility. Various options for efficacy and futility stopping boundaries have been proposed in the statistical literature. However, futility boundaries are often chosen without thorough planning of operating characteristics and evaluation of design performance. In this research work, performance criteria of flexible designs with early futility stops are evaluated. Moreover, previous work by Schüler is further developed to select so-called "optimal futility boundaries". The optimization approach is developed for trials with continuous or binary endpoints. Application examples from real clinical trials demonstrate the advantages of the new optimal approach and evaluate the performance criteria in various flexible designs. The results indicate that the optimal futility stopping boundaries simultaneously minimize the probability of wrongly stopping for futility and the loss of power. Additionally, boundaries from the optimal approach improve the probability of correctly stopping for futility early. In conclusion, it is recommended to investigate and optimize futility boundaries thoroughly at the planning stage of a clinical trial to achieve greater design efficiency
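    The two performance criteria named in the abstract can be estimated for any candidate futility boundary by simulation. A minimal sketch, not the thesis's actual method: a two-stage design with a continuous endpoint, a non-binding futility boundary at the interim, and an inverse-normal combination test. All names and parameter values (b_fut, delta, n1, n2) are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    def evaluate_futility_boundary(b_fut, delta, n1=50, n2=50, z_crit=1.96, sims=100_000):
        """Estimate P(stop for futility at interim) and overall power under effect delta."""
        # Stage-wise z-statistics for a one-sample test with unit variance:
        z1 = rng.normal(delta * np.sqrt(n1), 1.0, size=sims)
        z2 = rng.normal(delta * np.sqrt(n2), 1.0, size=sims)
        stop_futility = z1 < b_fut              # interim futility decision
        z_comb = (z1 + z2) / np.sqrt(2)         # inverse-normal combination
        reject = ~stop_futility & (z_comb > z_crit)
        return stop_futility.mean(), reject.mean()

    # Under the alternative delta = 0.3, an interim futility stop is a wrong stop;
    # the criteria to trade off are this probability and the resulting power:
    p_wrong_stop, power = evaluate_futility_boundary(b_fut=0.0, delta=0.3)
    ```

    Scanning a grid of b_fut values with such a function is one way to inspect the trade-off between wrongly stopping under the alternative and correctly stopping under the null at the planning stage.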

    Statistics, ethics, and probiotica

    A randomized clinical trial comparing an experimental new treatment to a standard therapy for a life-threatening medical condition should be stopped early on ethical grounds, in either of the following scenarios: (1) it has become overwhelmingly clear that the new treatment is better than the standard; (2) it has become overwhelmingly clear that the trial is not going to show that the new treatment is any better than the standard. The trial is continued in the third scenario: (3) there is a reasonable chance that the new treatment will finally turn out to be better than the standard, but we aren't sure yet. However, the (blinded) data monitoring committee in the "PROPATRIA" trial of an experimental probiotica treatment for patients with acute pancreatitis allowed the trial to continue at the halfway interim analysis, in effect because there was still a good chance of proving that the probiotica treatment was very harmful to their patients. The committee did not know whether treatment A was the probiotica treatment or the placebo. In itself this should not have caused a problem, since it could easily have determined the appropriate decision under both scenarios. Were the decisions in the two scenarios different, then the data would have to be de-blinded, in order to determine the appropriate decision. The committee misread the output of SPSS, which reports the smaller of two one-sided p-values, without informing the user what it is doing. It seems that about 5 lives were sacrificed to the chance of getting a significant result that the probiotica treatment was bad for the patients in the trial
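    The statistical pitfall described here is easy to reproduce. An illustration (not the actual SPSS output): for a standard-normal test statistic, an effect of a given size and an equally large effect in the opposite direction yield exactly the same "smaller one-sided p-value", so reporting only that number loses the direction of the effect.

    ```python
    import math

    def one_sided_ps(z):
        """Right- and left-tail p-values for a standard-normal test statistic z."""
        phi = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))  # standard normal CDF
        return 1.0 - phi, phi  # (p for effect > 0, p for effect < 0)

    # A benefit (z = +2.1) and an equally large harm (z = -2.1) produce the
    # same "smaller one-sided p", so a blinded reader cannot tell them apart:
    p_benefit = min(one_sided_ps(2.1))
    p_harm = min(one_sided_ps(-2.1))
    two_sided = 2 * p_benefit  # the conventional two-sided p-value
    ```

    A small, significant-looking one-sided p-value is therefore consistent with the blinded arm labeled "A" being either the better or the worse treatment.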

    Integrating Phase 2 into Phase 3 based on an Intermediate Endpoint While Accounting for a Cure Proportion -- with an Application to the Design of a Clinical Trial in Acute Myeloid Leukemia

    For a trial with primary endpoint overall survival for a molecule with curative potential, statistical methods that rely on the proportional hazards assumption may underestimate the power and the time to final analysis. We show how a cure proportion model can be used to get the necessary number of events and appropriate timing via simulation. If Phase 1 results for the new drug are exceptional and/or the medical need in the target population is high, a Phase 3 trial might be initiated after Phase 1. Building a futility interim analysis into such a pivotal trial may mitigate the uncertainty of moving directly to Phase 3. However, if cure is possible, overall survival might not be mature enough at the interim to support a futility decision. We propose to base this decision on an intermediate endpoint that is sufficiently associated with survival. Planning for such an interim can be interpreted as making a randomized Phase 2 trial a part of the pivotal trial: if stopped at the interim, the trial data would be analyzed and a decision on a subsequent Phase 3 trial would be made. If the trial continues at the interim then the Phase 3 trial is already underway. To select a futility boundary, a mechanistic simulation model that connects the intermediate endpoint and survival is proposed. We illustrate how this approach was used to design a pivotal randomized trial in acute myeloid leukemia, discuss historical data that informed the simulation model, and describe operational challenges encountered when implementing it. Comment: 23 pages, 3 figures, 3 tables. All code is available on github: https://github.com/numbersman77/integratePhase2.gi
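    The reason event counts plateau under a cure proportion can be shown with a few lines of simulation. A minimal sketch, assuming a simple mixture cure model (not the paper's mechanistic model): a fraction of patients is cured and never has the event, the rest follow an exponential survival distribution. All parameter names and values are illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)

    def simulate_cure_model(n, cure_prop, median_surv, follow_up):
        """Return event indicators and observed times under administrative censoring."""
        cured = rng.random(n) < cure_prop
        hazard = np.log(2) / median_surv       # exponential rate from the median
        t = rng.exponential(1 / hazard, size=n)
        t[cured] = np.inf                      # cured patients never have the event
        event = t <= follow_up
        time = np.minimum(t, follow_up)
        return event, time

    # With a 40% cure proportion, no amount of extra follow-up pushes events past
    # the ~60% non-cured fraction, which drives the timing of the final analysis:
    event, _ = simulate_cure_model(n=500, cure_prop=0.4, median_surv=12, follow_up=36)
    n_events = int(event.sum())
    ```

    A proportional-hazards-based plan that ignores the cure fraction would keep waiting for events that will never occur, which is the underestimation of the time to final analysis the abstract refers to.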

    Futility Analysis in the Cross-Validation of Machine Learning Models

    Many machine learning models have important structural tuning parameters that cannot be directly estimated from the data. The common tactic for setting these parameters is to use resampling methods, such as cross-validation or the bootstrap, to evaluate a candidate set of values and choose the best based on some pre-defined criterion. Unfortunately, this process can be time-consuming. However, the model tuning process can be streamlined by adaptively resampling candidate values so that settings that are clearly sub-optimal can be discarded. The notion of futility analysis is introduced in this context. An example is shown that illustrates how adaptive resampling can be used to reduce training time. Simulation studies are used to understand how the potential speed-up is affected by parallel processing techniques. Comment: 22 pages, 5 figures
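    A toy sketch of the futility-style pruning idea (not the paper's exact procedure): candidate tuning-parameter settings are scored on successive resamples, and a candidate is dropped as soon as its running mean falls clearly below the best one, so later resamples are spent only on promising settings. The candidates, their "true" accuracies, and the futility margin are all assumptions for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical true accuracies of four tuning-parameter candidates:
    true_acc = {"a": 0.70, "b": 0.85, "c": 0.86, "d": 0.60}

    def resample_score(candidate):
        """One noisy cross-validation estimate of a candidate's accuracy."""
        return true_acc[candidate] + rng.normal(0, 0.02)

    active = set(true_acc)
    scores = {c: [] for c in true_acc}
    for step in range(30):
        for c in list(active):
            scores[c].append(resample_score(c))
        if step >= 4:  # start pruning after a burn-in of resamples
            means = {c: np.mean(scores[c]) for c in active}
            best = max(means.values())
            margin = 0.05  # assumed futility margin
            active = {c for c in active if means[c] >= best - margin}

    best_candidate = max(active, key=lambda c: np.mean(scores[c]))
    ```

    The clearly inferior candidates ("a" and "d" here) are discarded after a handful of resamples instead of all 30, which is where the training-time saving comes from.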

    SAFA: a semi-asynchronous protocol for fast federated learning with low overhead

    Federated learning (FL) has attracted increasing attention as a promising approach to driving a vast number of end devices with artificial intelligence. However, it is very challenging to guarantee the efficiency of FL considering the unreliable nature of end devices while the cost of device-server communication cannot be neglected. In this paper, we propose SAFA, a semi-asynchronous FL protocol, to address the problems in federated learning such as low round efficiency and poor convergence rate in extreme conditions (e.g., clients dropping offline frequently). We introduce novel designs in the steps of model distribution, client selection and global aggregation to mitigate the impacts of stragglers, crashes and model staleness in order to boost efficiency and improve the quality of the global model. We have conducted extensive experiments with typical machine learning tasks. The results demonstrate that the proposed protocol is effective in terms of shortening federated round duration, reducing local resource wastage, and improving the accuracy of the global model at an acceptable communication cost
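    A simplified sketch of staleness-aware aggregation in the spirit of the abstract (not SAFA's actual rule): fresh client updates get full weight, lagged updates are discounted, and updates that are too stale are discarded so they cannot drag the global model backwards. The discount function and lag threshold are assumptions.

    ```python
    import numpy as np

    def aggregate(global_model, client_updates, round_now, max_lag=2):
        """client_updates: list of (model_vector, round_the_client_started_from)."""
        total_w, acc = 0.0, np.zeros_like(global_model)
        for model, base_round in client_updates:
            lag = round_now - base_round
            if lag > max_lag:        # too stale: discard, client must resync
                continue
            w = 1.0 / (1.0 + lag)    # assumed staleness discount
            acc += w * model
            total_w += w
        if total_w == 0.0:
            return global_model      # no usable updates this round
        return acc / total_w

    g = np.zeros(3)
    updates = [(np.array([1.0, 1.0, 1.0]), 5),   # fresh: lag 0, weight 1
               (np.array([3.0, 3.0, 3.0]), 4),   # lag 1, weight 0.5
               (np.array([9.0, 9.0, 9.0]), 1)]   # lag 4, discarded
    new_g = aggregate(g, updates, round_now=5)
    ```

    Because stragglers contribute with reduced weight rather than blocking the round, a server using such a rule does not have to wait for every client, which is the round-efficiency argument the abstract makes.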

    Evaluation and improvement of sample size recalculation in adaptive clinical trial designs

    A valid sample size calculation is a key aspect of ethical medical research. While the sample size must be large enough to detect an existing relevant effect with sufficient power, it is at the same time crucial to include as few patients as possible in order to minimize exposure to study-related risks and the time to potential market approval. Different parameter assumptions, such as the expected effect size and the outcome's variance, are required to calculate the sample size. However, even with extensive medical knowledge it is often not easy to make reasonable assumptions about these parameters. Published results from the literature may vary or may not be comparable to the current situation. Adaptive designs offer a possible solution to these planning difficulties. At an interim analysis, the standardized treatment effect is estimated and used to adapt the sample size. In the literature, there is a variety of strategies for recalculating the sample size. However, defining performance criteria for these strategies is complex, since the second-stage sample size is a random variable. It has also long been known that most existing sample size recalculation strategies have major shortcomings, such as high variability in the recalculated sample size. In Thesis Article 1, my coauthors and I developed a new performance score for comparing different sample size recalculation rules in a fair and transparent manner. This performance score is the basis for developing improved sample size recalculation strategies in a second step. In Thesis Article 2, my supervisor and I propose smoothing corrections that can be combined with existing sample size recalculation rules to reduce this variability. Thesis Article 3 deals with determining the second-stage sample size as the numerical solution of a constrained optimization problem, which is solved by a new R package named adoptr. To illustrate the relation of the three Thesis Articles, all new approaches are applied to a clinical trial example to show the methods' benefits in comparison to an established sample size recalculation strategy. My work considerably advanced the global aim of defining high-performance sample size recalculation rules. The performance of adaptive designs with sample size recalculation can now be compared by means of a single comprehensive score. Moreover, our new smoothing corrections offer one possibility for improving an existing sample size recalculation rule with respect to this new performance score. The new software further allows users to directly determine an optimal second-stage sample size with respect to predefined optimality criteria. I was able to reduce methodological shortcomings in sample size recalculation in four respects, providing new methods for 1) performance evaluation, 2) performance improvement, 3) performance optimization and 4) software solutions. In addition, I illustrate how these methods can be combined and applied to a clinical trial example
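    The variability problem the thesis targets can be seen in the most common recalculation rule. A minimal sketch (not the thesis's score or its smoothing corrections): re-estimate the standardized effect at the interim and recompute the per-group stage-two size to reach the target power; the function name and the cap/floor values are illustrative.

    ```python
    import math
    from statistics import NormalDist

    def recalculated_n2(delta_hat, alpha=0.025, power=0.80, n_min=10, n_max=500):
        """Per-group stage-two size for a two-arm z-test at interim estimate delta_hat."""
        if delta_hat <= 0:
            return n_max  # no observed benefit: the rule returns the cap here
        z_a = NormalDist().inv_cdf(1 - alpha)   # one-sided significance quantile
        z_b = NormalDist().inv_cdf(power)       # power quantile
        n = 2 * ((z_a + z_b) / delta_hat) ** 2  # standard two-arm formula
        return min(max(math.ceil(n), n_min), n_max)

    # The instability that motivates smoothing corrections: halving the noisy
    # interim estimate quadruples the recalculated sample size.
    n_at_half_effect = recalculated_n2(0.25)
    n_at_full_effect = recalculated_n2(0.50)
    ```

    Because delta_hat is itself a noisy random quantity, the recalculated sample size inherits and amplifies that noise, which is exactly why a performance score treating the second-stage size as a random variable is needed.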

    Optimized adaptive enrichment designs for multi-arm trials: learning which subpopulations benefit from different treatments

    We consider the problem of designing a randomized trial for comparing two treatments versus a common control in two disjoint subpopulations. The subpopulations could be defined in terms of a biomarker or disease severity measured at baseline. The goal is to determine which treatments benefit which subpopulations. We develop a new class of adaptive enrichment designs tailored to solving this problem. Adaptive enrichment designs involve a preplanned rule for modifying enrollment based on accruing data in an ongoing trial. The proposed designs have preplanned rules for stopping accrual of treatment by subpopulation combinations, either for efficacy or futility. The motivation for this adaptive feature is that interim data may indicate that a subpopulation, such as those with lower disease severity at baseline, is unlikely to benefit from a particular treatment while uncertainty remains for the other treatment and/or subpopulation. We optimize these adaptive designs to have the minimum expected sample size under power and Type I error constraints. We compare the performance of the optimized adaptive design versus an optimized non-adaptive (single stage) design. Our approach is demonstrated in simulation studies that mimic features of a completed trial of a medical device for treating heart failure. The optimized adaptive design has 25% smaller expected sample size compared to the optimized non-adaptive design; however, the cost is that the optimized adaptive design has 8% greater maximum sample size. Open-source software that implements the trial design optimization is provided, allowing users to investigate the tradeoffs in using the proposed adaptive versus standard designs
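    A toy sketch of the interim decision structure described above (not the paper's optimized design): each (treatment, subpopulation) combination gets its own interim z-statistic and is stopped for futility or efficacy, or continued, according to preplanned boundaries. The boundary values and effect sizes are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    def interim_decisions(effects, n_interim=100, b_fut=0.0, b_eff=2.8):
        """effects[t][s]: true effect of treatment t in subpopulation s."""
        decisions = {}
        for t, per_sub in effects.items():
            for s, delta in per_sub.items():
                # Approximate interim z-statistic for this combination:
                z = rng.normal(delta * np.sqrt(n_interim / 2), 1.0)
                if z < b_fut:
                    decisions[(t, s)] = "stop_futility"
                elif z > b_eff:
                    decisions[(t, s)] = "stop_efficacy"
                else:
                    decisions[(t, s)] = "continue"
        return decisions

    # Treatment A helps only subpopulation 1; treatment B helps both:
    effects = {"A": {1: 0.5, 2: 0.0}, "B": {1: 0.4, 2: 0.4}}
    decisions = interim_decisions(effects)
    ```

    Repeating such interim decisions over many simulated trials, and searching over the boundaries to minimize expected sample size under power and Type I error constraints, is the optimization the paper carries out.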