DSpace@RPI (Rensselaer Polytechnic Institute)
    6376 research outputs found

    Stochastic multi-armed bandits: optimality, complexity, robustness, and risk-sensitivity

    No full text
    December 2024, School of Engineering
    Multi-armed bandit problems serve as fundamental models in sequential experimental design, where the goal is to balance exploration and exploitation to maximize cumulative rewards. These problems are pivotal in various real-world applications, including clinical trials, online advertising, and recommendation systems. In the realm of multi-armed bandits, two primary paradigms have emerged: regret minimization, where the objective is to minimize the difference between the rewards obtained and those that would have been obtained by always choosing the best arm, and best arm identification (BAI), which aims to accurately identify the arm with the highest mean reward. This dissertation delves into both paradigms, addressing some unresolved questions in optimal and efficient experiment design. Furthermore, the viewpoint of decision safety and risk sensitivity is adopted, addressing some of the practical challenges in sequential experimental design. In the context of BAI, existing optimal algorithms have been computationally expensive. These algorithms aim to compute an optimal allocation over the arms by solving a minimax optimization problem in each round, introducing significant computational challenges. Conversely, existing efficient algorithms for BAI have not been optimal. These algorithms allocate a pre-chosen fraction of samples to the best arm, denoted by a tunable parameter β, and their optimality depends on selecting an appropriate value of β, creating a gap between efficiency and optimality. This dissertation addresses this gap by introducing algorithms that are both optimal and computationally efficient. The efficiency is attributed to implicitly estimating the optimal value of β, without having to solve any optimization problem.
Furthermore, while existing literature predominantly concentrates on the exponential distribution family, this approach extends its scope to encompass a broader range of distributions parameterized by their mean values, subject to mild regularity conditions. As the second focus, this dissertation delves into the aspect of decision safety in stochastic bandits. In applications such as recommender systems, user feedback is often malicious, and in clinical trials, experiment results are susceptible to human errors. These fluctuations in the reward are modeled through an oblivious adversary that can replace the reward with adversarial samples for a fraction of rounds, a setting also known as Huber's contamination model. Viewing the problem from a robustness perspective, this dissertation introduces two algorithms: a gap-based algorithm and a successive elimination-based algorithm. Performance guarantees show that these algorithms are robust to adversarial contamination and may achieve optimal sample complexity (up to constant factors). Hence, this dissertation takes a step towards decision safety in practical applications involving experimental design. As the third focus, this dissertation proposes to investigate human risk-taking behavior in decision-making contexts. While most existing algorithms aim to maximize the average reward, this is not always the objective in many human-centric applications. For instance, in high-frequency trading, decision-making revolves around the principle of high-risk and high-reward, in contrast to maximizing the average reward over a period of time. Designing experiments in such risk-sensitive settings involves significantly different approaches compared to the canonical risk-neutral counterpart. This dissertation proposes a framework for risk-sensitive bandits and lays down theoretical context and real-world applications in which decision safety is critical to the algorithm design. Ph

    Deep probabilistic and generative models for x-ray based imaging and ecg

    No full text
    May 2025, School of Engineering
    Artificial intelligence (AI) is poised to transform modern medicine and, in particular, medical imaging, as the need for healthcare services continues to outstrip available resources worldwide. Deep learning models show promise in enhancing clinical workflows and patient outcomes through data-driven diagnoses and improved image quality. Despite the pressing need and considerable research efforts, several challenges have delayed the integration of AI into healthcare. Neural networks often lack explainability, make overconfident inferences on out-of-sample data, and are susceptible to adversarial attacks, that is, small, targeted input perturbations capable of fooling otherwise accurate networks. Furthermore, large models are data-hungry, yet patient images and information are legally guarded by health privacy protections. Future virtual clinical trials for medical device validation also require high-quality synthetic datasets that sufficiently represent rare pathologies and population demographics. While deep generative models may address these data gaps, their outputs often lack clinical fidelity despite appearing visually convincing. Finally, many image-enhancement tasks, such as deblurring, lack the ground-truth labels necessary for supervised learning, forcing reliance on simulated degradations that can introduce artifacts when applied to real-world data. Compounding the above issues is the high-dimensional nature of most medical signals, which rules out solutions that cannot feasibly be scaled. This dissertation investigates solutions to several of the aforementioned challenges within specific applications, primarily leveraging deep probabilistic and generative models. First, we examine how diverse deep ensembles, aided by a feature decorrelation mechanism, can improve adversarial robustness in high-dimensional tasks like electrocardiogram classification.
Next, drawing inspiration from the 2023 AAPM Deep Generative Modeling Challenge, we employ denoising diffusion probabilistic models to produce synthetic medical images that are realistic in both visual appearance and clinical relevance, an effort that earned first place in the competition. Finally, we adapt state-of-the-art simulation techniques to create realistic, system-specific degradations to train deep deblurring models for photon-counting computed tomography. By scaling a diffusion model to 3D through a joint 2D inference process and disentangling noise from signal prior to deblurring, we successfully mitigate texture distortions and improve performance on real-world data. Ph

    Ethical Web Science Workshop Overview

    No full text
    The Ethical Web Science Workshop at WebSci’25 addresses the ethical challenges that arise from the rapidly evolving relationship between the Web and AI. With increasing reliance on Web-sourced data, issues of fairness, transparency, and consent have become central to technological innovation. To this end, we aim to bring together researchers, practitioners, ethicists, and policymakers in order to create tools and guidelines for ethical compliance in Web-based research. The workshop thereby offers a forum for discussing standards for ethical data sourcing and usage, which may lead to guidelines for best practices.

    Measurements of dilatational rheology of an insoluble lung surfactant using an oscillatory interfacial dialator

    No full text
    December 2024, School of Engineering
    This research investigates the interfacial rheology of a monolayer of DPPC, the main phospholipid in lung surfactants, within a model lung flow apparatus, the oscillatory interface dilator (OID). The study primarily measures interfacial velocity to explore how surfactant concentration and phase influence the behavior of a dilating monolayer, particularly regarding Marangoni flow and surface viscosity. A particle tracking velocimetry (PTV) code was used to capture velocity distributions across the interface. Temporal and spatial analyses revealed minimal variations in global surface measurements, highlighting the need for finely resolved data to capture subtle changes. Observations identified increased velocity variation during the compression cycle and heightened sensitivity at the cavity center with higher DPPC concentrations. Preliminary computational modeling was performed for comparative purposes, revealing limitations in simulating realistic dynamic contact angles. This study advances the understanding of surfactant dynamics at the air-liquid interface, with implications for respiratory function modeling in health and disease. M

    Folktale story generation and automatic evaluation of generated text

    No full text
    December 2024, School of Science
    Folktales, as cultural narratives originating from oral traditions, simultaneously entertain and educate individuals. These indigenous tales, deeply rooted in societal contexts and imbued with moral lessons, provide invaluable insights into the customs and traditions of the communities responsible for their creation and transmission across generations. Consequently, folktales possess historical and literary importance, making them ideal tools for fostering understanding and promoting cultural exchange among diverse cultures. Remarkably, folktales from various societies frequently exhibit striking similarities, highlighting the interconnectedness of human experiences across the globe. Folklore researchers have meticulously analyzed and classified folktales based on types and motifs to identify shared themes and characteristics. Nonetheless, the automatic identification and discovery of folktale types remain an area of ongoing study, and little work has focused on automatic folktale classification and clustering. The research presented here has three distinct contributions: addressing the challenges associated with automatically recognizing folktale plot types and motifs to cluster tales, generating narratives conforming to one or more established folktale types, and evaluating the produced text with regard to the plot type and a holistic measure of quality. This research lays the foundation for valuable applications in diverse fields, including information retrieval, persuasive communication, negotiation strategies, natural language comprehension, and computational creativity. Additionally, the capacity to abstract natural language semantics could be crucial in various cognitive tasks, and this study offers insights into these fundamental processes. Lastly, this research advances a computational perspective on cultural influences, enabling the exploration of cultural distinctions as they manifest within narratives. Ph

    A Collaborative Data Integration and Visualization Platform for Planetary Exploration: The Mars 2020 Campfire and Mars Mission Minder

    No full text
    The Mars Perseverance Rover is critical to a decades-long investigation of Mars’ habitability and NASA’s Mars Sample Return campaign. The Perseverance instrument payload, including SHERLOC, LIBS, and PIXL, is capturing an unprecedented wealth of data about Mars’ surface. We introduce two new tools for data exploration and analytics. Firstly, we apply the Rensselaer Campfire, a multi-user, collaborative, immersive computing interface, to facilitate interactive data analytics with real-time, publicly available planetary observational data. “Mars 2020 on the Campfire” transcends traditional data presentation methods by integrating a geolocated, multi-faceted dataset atop an interactive map of the rover’s journey. Secondly, through a course-based data analytics research program, we have engaged undergraduate teams to develop an interactive application, the “Mars Mission Minder”, which enables data exploration on personal computing platforms. Both the Campfire and Mars Mission Minder can be shared interactively and virtually across institutions. The Campfire [1,2] is a desk-height, six-foot panoramic screen and floor projection (Figure 1), connected to two large displays that host data analytics content. The system integrates high-resolution panoramic and aerial imagery, and 3D terrestrial models with instrument data. The Mars Mission Minder is an interactive app that allows users to choose data sets and analytical techniques based on a suite of scientific questions. By leveraging cutting-edge data analytics and sophisticated visualization tools, our project offers comprehensive, intuitive data exploration. Participating scientists can select data by geospatial location, sampling metadata category, or even PIXL, LIBS, and SHERLOC data class. The data analytics platform allows users to explore the relationship between organic compounds, mineralogy, and inorganic compositions.
Results highlight mineral/organic compound co-location, and enable prediction of most likely locations for abiotic organic synthesis. Integrating a comprehensive data analytics suite emphasizes analytical rigor with dynamic visualizations. This platform enables hypothesis testing and can accelerate discovery, foster cross-disciplinary collaborations, and serve as an example for ongoing and future astrobiology and planetary science missions. [1] EL Ameres, Ph.D. diss., RPI, USA, 2018; [2] EL Ameres and GP Clement, U.S. Patent 10,996,552 B2. May 2021

    Variable selection and manipulation with missing data

    No full text
    December 2024, School of Engineering
    Both causal modeling and associative feature selection aim to identify essential relationships among the set of modeled variables, albeit with different objectives. While causal modeling focuses on functional relationships, associative feature selection focuses on statistical relationships. In practice, causal or associational relations often have to be inferred from a dataset with missing entries, subject to the potential bias that missing data introduces. Traditionally, missingness has been attributed to benign data collection processes, but as datasets are increasingly curated from diverse sources, including untrusted parties, maliciously engineered missingness has become a likely threat. In turn, to make reliable inferences, a practitioner has to understand how the methods used to extract these causal or associational relationships are affected by benign and adversarial missingness. This dissertation addresses these challenges in three parts. First, we examine the impact of benign missing data on the model-X knockoffs framework, a recent method that provides false discovery rate (FDR) control across a broad range of feature selection techniques. We identify how the distribution shift resulting from imputing the missing entries or dropping partially observed data points interferes with the model-X knockoffs’ FDR guarantees. Next, we introduce sufficient conditions under which imputation using the generative model originally intended for FDR calibration can preserve all assumptions of the model-X framework. Second, we study the effects of adversarial missing data on causal structural learning from observational data. We introduce the adversarial missingness threat model, where an attacker selectively omits data entries. Under this threat model, we show an adversary can asymptotically render a corrupted causal model an optimal solution by concealing a subset of the features in certain observations.
We also propose learning-based attacks that are effective with finite data and show that they can successfully obscure adversarially targeted causal relationships in various experimental setups. Third, we extend our study of adversarial missingness to associative learning tasks through a bi-level optimization approach. To tailor attacks to standard missing data handling methods, we develop differentiable approximations for three widely used techniques: mean imputation, regression-based imputation, and complete-case analysis. Our results demonstrate that these attacks can effectively manipulate generalized linear models, altering p-values from significant to insignificant by omitting less than 20% of targeted features. Ph

    Hydrologic and spatiotemporal controls on pyrogenic carbon export from fire-affected, coastal california streams

    No full text
    May 2025, School of Science
    Wildfires play a crucial role in shaping ecosystems but are increasingly driven by anthropogenic factors, leading to shifts in fire regimes and their associated environmental impacts. One significant consequence of fire is the production and mobilization of pyrogenic carbon (PyC), a complex mixture of thermally altered organic compounds that influence carbon cycling in terrestrial and aquatic systems. Recent increases in wildfire frequency and severity, driven by climate change and land-use alterations, have amplified the need to understand the environmental fate of fire-derived materials in order to mitigate and prepare for continued fire perturbations. The chapters of this dissertation aim to improve our understanding of PyC in fire-affected aquatic systems by assessing the efficacy of current PyC quantification methods, determining the variability in PyC concentration and character, and/or identifying any drivers of PyC export from riverine systems. In Chapter 1, I introduce the two specific fractions of PyC targeted in the following chapters: levoglucosan (a rapidly degraded anhydrosugar) and black carbon (a refractory condensed aromatic compound class). In Chapter 2, I assess the degree of impact free benzenepolycarboxylic acids (BPCAs) have on the current dissolved black carbon quantification method. Black carbon is most commonly quantified via oxidation of its characteristic condensed aromatic structures into BPCA molecular markers for analysis. Results show that dissolved black carbon concentrations are not artificially overestimated by the presence of free BPCAs as long as only certain conversion factors are utilized. In Chapter 3, I investigate two paired temporary streams that were equivalently impacted by the 2020 SCU Lightning Complex Fires. Through high-temporal resolution measurements, watershed-specific drivers of dissolved organic and black carbon export were identified.
These streams also showed that while there is a significant first flush of dissolved black carbon exported during the first stream wet-up post-fire, the greatest concentrations were observed during wet-up of the second year post-fire. In Chapters 4 and 5, I investigate the post-fire PyC export behavior from five coastal mountain watersheds variably impacted by the 2020 CZU Lightning Complex Fires. In Chapter 4, results from black carbon concentrations suggest that hydrology, rather than the percentage of watershed area burned, is the primary driver of post-fire export behavior. In Chapter 5, levoglucosan concentrations were compared to those of black carbon from these same watersheds. Distinct molecular class-specific characteristics were revealed. Levoglucosan, being highly soluble, was rapidly mobilized during the first events post-fire. In contrast, black carbon concentrations were more closely linked with discharge, and the ratio of dissolved black carbon to organic carbon increased with time since fire, highlighting the role of black carbon aging in increasing solubility and therefore mobilization. In Chapter 6, I contextualize the findings from the previous chapters, comment on spatiotemporal controls on PyC export, and suggest future avenues of research. Overall, these findings contribute to a more comprehensive understanding of PyC fluxes and their implications for carbon budgets, water quality, and role in the Earth system. Ph

    Artificial impulse responses for emulating decca trees

    No full text
    May 2025, School of Architecture
    This research explores the generation of artificial impulse responses designed to emulate the acoustical properties of the Decca Tree microphone array, a widely adopted method in classical and film music recording known for its immersive and spatially cohesive sound characteristics. A series of Dirac impulses was systematically created, incorporating calculated time delays and sound pressure level adjustments to represent realistic microphone capture scenarios. These impulses were then combined with various early reflection and diffuse tail algorithms to produce authentic spatial audio simulations. The thesis presents three listening tests conducted to evaluate the efficacy of these artificially generated impulse responses in emulating the acoustic characteristics captured by traditional Decca Tree setups. Results showed a clear listener preference for Cinematic Rooms early reflections over Valhalla Rooms algorithmic reverb, particularly highlighting moderate gain adjustments. Additionally, the research demonstrated listeners’ nuanced sensitivity to distance-dependent processing, revealing that while detailed, custom-tailored reflections and tails were generally favored, simpler uniform reflections remained competitively preferred by attentive listeners. Overall, this study validates a methodological approach for convincingly emulating the revered Decca Tree microphone technique in spatial audio production. M

    Development of integrated and scalable platforms for mrna synthesis, purification, and thermostable mrna-lipid nanoparticle drug formulation

    No full text
    August 2025, School of Engineering
    Messenger RNA (mRNA) therapeutics are poised to transform modern medicine, but their widespread adoption is limited by challenges in large-scale manufacturing, impurity removal, and formulation stability. This dissertation presents a series of integrated solutions for the production, purification, and stabilization of mRNA medicines, emphasizing the translation of laboratory innovation into scalable, industrial processes. The thesis work began with the design and synthesis of mRNAs of varying lengths, encoding concatemeric EGFP proteins, which serve as reference materials for studying the impact of sequence and structure on product quality and delivery. These synthetic mRNAs were thoroughly characterized for integrity, size, and purity using electrophoresis, HPLC, bioanalyzer analyses, and next-generation sequencing (NGS). Following purification, the EGFP mRNAs were encapsulated in lipid nanoparticles (LNPs) using clinically relevant lipid compositions. A comprehensive series of lyophilization protocols was then developed and optimized to produce stable, freeze-dried mRNA-LNP formulations. The physicochemical and functional stability of these lyophilized products was evaluated over extended storage at a range of temperatures, providing new insights into the factors that govern long-term stability of mRNA therapeutics. To address early-stage process impurities, a polyethylene glycol (PEG)-citrate aqueous two-phase system (ATPS) was developed for the rapid and scalable removal of proteins and rNTPs from crude in vitro transcription reactions. The ATPS workflow leverages unique interfacial adsorption properties of mRNA to enable high-yield and gentle separation directly from unprocessed reaction mixtures. This method significantly accelerates the purification process and reduces the burden on downstream chromatographic steps. For final polishing and impurity clearance, a tandem chromatography strategy was established.
Hydrogen bonding chromatography was employed as a first step for the efficient removal of double-stranded RNA (dsRNA) and process-related impurities. This was immediately followed by oligo d(T) affinity chromatography, which selectively captures full-length, polyadenylated mRNA. The two-stage process is fully compatible with large-scale manufacturing, provides high product purity, and meets regulatory requirements for clinical-grade mRNA therapeutics. These advancements provide a robust and scalable framework for the manufacturing of high-purity, thermostable mRNA therapeutics. We hope that the resulting workflow not only meets current regulatory and quality demands but also provides a foundation for the broader deployment and global accessibility of next-generation mRNA medicines. Ph

    223 full texts
    6,376 metadata records
    Updated in last 30 days.
    DSpace@RPI (Rensselaer Polytechnic Institute) is based in United States