DSpace@RPI (Rensselaer Polytechnic Institute)
    6,376 research outputs found

    Towards explainable and actionable Bayesian deep learning

    No full text
    December 2024 · School of Engineering
    Despite significant progress in many fields, conventional deep learning models cannot effectively quantify their prediction uncertainties. They are typically overconfident in areas they do not know, and they are prone to adversarial attacks and out-of-distribution inputs. To address these limitations, we propose explainable and actionable Bayesian deep learning (BDL) methods to perform accurate and efficient uncertainty quantification, identify uncertainty sources, and develop strategies to mitigate their impact on prediction accuracy. Existing BDL methods have several shortcomings. First, they are either accurate but computationally intractable or efficient but inaccurate in uncertainty quantification. Second, they typically have limited explainability and lack an understanding of the sources of uncertainty. Finally, they often fail to mitigate uncertainties to improve model prediction performance. To address these shortcomings, this thesis focuses on three thrusts: uncertainty quantification (UQ), uncertainty attribution (UA), and uncertainty mitigation (UM).
    For UQ, we introduce advanced techniques to achieve a better efficiency-accuracy trade-off. The first approach enhances traditional ensemble methods by increasing component diversity, achieving state-of-the-art UQ performance. The second approach integrates an evidential neural network with Bayesian deep learning, allowing simultaneous prediction and UQ in a single forward pass, which significantly improves computational efficiency without compromising accuracy. Additionally, we develop a gradient-based UQ method for pretrained models, enabling easy calculation of epistemic uncertainty without the need for model refinement or access to the training data.
    For UA, to improve explainability, we develop both gradient-based and optimization-based methods to identify problematic regions in the input that contribute to prediction uncertainty. The gradient-based method offers competitive accuracy, relaxed assumptions, and high efficiency, whereas the optimization-based method formulates UA as an optimization problem, achieving state-of-the-art performance by learning informative perturbations.
    Finally, for UM, we leverage insights from uncertainty attribution to develop strategies that enhance model performance. By using uncertainty attribution maps as attention mechanisms, our approach directs the model's learning toward more informative regions with low uncertainty, improving prediction accuracy and robustness.
    Ph.D.
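    The abstract's first UQ thrust builds on ensemble methods. As a point of reference only (not the thesis's own algorithm), the standard way an ensemble separates epistemic from aleatoric uncertainty is the mutual-information decomposition sketched below; the toy arrays are hypothetical.

```python
import numpy as np

def ensemble_uncertainty(probs):
    """Decompose predictive uncertainty for an ensemble.

    probs: array of shape (n_members, n_classes) -- each row is one
    ensemble member's softmax output for a single input.
    Returns (total, aleatoric, epistemic) in nats.
    """
    eps = 1e-12
    mean_p = probs.mean(axis=0)                          # ensemble-averaged prediction
    total = -np.sum(mean_p * np.log(mean_p + eps))       # entropy of the mean
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))  # mean entropy
    epistemic = total - aleatoric                        # mutual information
    return total, aleatoric, epistemic

# Members that agree yield near-zero epistemic uncertainty;
# members that disagree push it up.
agree = np.array([[0.90, 0.10], [0.88, 0.12], [0.92, 0.08]])
disagree = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
print(ensemble_uncertainty(agree))
print(ensemble_uncertainty(disagree))
```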

    Curation and completion of knowledge graphs

    No full text
    August 2025 · School of Science
    Knowledge graphs (KGs) store facts expressed as relationships between entities. Each fact is represented as an ordered triple (h, r, t), or (head, relation, tail). The incompleteness, incorrectness, and sparseness of real-world KGs motivate the task of link prediction, or KG completion, in which a machine learning model predicts missing entities or relations by learning patterns from a training graph. In this thesis, our contribution focuses on the curation and completion of knowledge graphs, specifically addressing the following challenges.
    1. In Chapter 2, we design a novel ontology to capture malware threat intelligence from unstructured text data. The ontology is combined with a framework we developed, TINKER, to instantiate a malware knowledge graph. We show the graph's competency by assessing it with prominent KG completion models.
    2. In Chapter 3, we consider the inductive KG completion task, in which not all entities were seen during training. The most performant models in the inductive setting employ path encoding modules in addition to subgraph encoding modules. We propose a scalable and efficient alternative to the explicit use of paths. Experimental evaluations on existing inductive KG completion benchmark datasets demonstrate the efficacy of our model.
    3. In Chapter 4, we identify gaps in the benchmark datasets and experimental setup of inductive KG completion. We establish a new research direction of realistic and temporally aware inductive knowledge graph completion by providing appropriate datasets and baseline models. Specifically, we propose an algorithm and use it to generate datasets that consider temporal progression with varying degrees of entity overlap and independent validation context graphs. We suggest an alternative negative sampling strategy for improved representation learning. Furthermore, we propose two baseline models to address the temporal aspect and entity overlap in the datasets.
    Ph.D.
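    For orientation, here is a hedged sketch of the link-prediction setup the abstract describes: score candidate triples with an embedding model and rank completions. The TransE-style scorer below is a standard baseline, not the model proposed in the thesis, and the entity and relation names are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy TransE-style embeddings: entities and relations share one vector
# space, and a plausible triple (h, r, t) should satisfy h + r ≈ t.
entities = {name: rng.normal(size=8) for name in ["malwareX", "dropper", "c2_server"]}
relations = {name: rng.normal(size=8) for name in ["uses", "communicatesWith"]}

def score(h, r, t):
    """Negative L2 distance; higher means the triple is more plausible."""
    return -np.linalg.norm(entities[h] + relations[r] - entities[t])

# Link prediction: rank candidate tails for the query (malwareX, uses, ?)
candidates = sorted(entities, key=lambda t: score("malwareX", "uses", t), reverse=True)
print(candidates)
```

    In the inductive setting of Chapters 3 and 4, entity embeddings like these cannot be looked up for unseen entities, which is why subgraph and path encoders are used instead.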

    Two-stage chromatographic purification of lentiviral vectors with enhanced particle characterization via nanoflow cytometry

    No full text
    August 2025 · School of Engineering
    Lentiviral vectors (LVVs) are central to the advancement of gene therapy, particularly in ex-vivo applications such as chimeric antigen receptor T (CAR-T) cell therapy. However, existing purification methods for LVVs face significant challenges, including process variability, colloidal instability, and low functional recovery. This thesis addresses these limitations through the development of a two-stage, orthogonal chromatographic purification process designed to enhance both recovery and stability of LVVs.
    Anion exchange chromatography (AEX) remains a foundational unit operation in LVV purification, but the strong electrostatic interactions between LVVs and quaternary amine ligands often necessitate high salt concentrations for elution, which can compromise vector integrity. Additionally, conventional bead-based stationary phases rely on pore diffusion to utilize their total binding capacity; however, the size of LVVs inhibits their diffusion into traditional bead-based media. To overcome these limitations, we evaluated monoliths, which enable large modalities like LVVs to access the entire functionalized surface area via convective transport, resulting in improved mass transfer, higher binding capacity, faster processing times, and greater recoveries.
    We evaluated arginine hydrochloride (ArgHCl) for its potential to improve recovery and colloidal stability during the CIM QA step. Arginine is a widely used formulation additive known for enhancing protein solubility, suppressing aggregation, and improving elution profiles. ArgHCl substantially improved physical and infectious recovery in linear gradient elution (LGE) experiments. In addition to improving recovery, dynamic light scattering measurements revealed that ArgHCl enhanced the colloidal stability of eluted fractions compared to the conventional NaCl eluent, where particle aggregation was evident.
    Nanoflow cytometry was then employed to characterize particle heterogeneity and assess the retention behavior of VSVG-positive and VSVG-negative subpopulations across LGE experiments. This analysis revealed a two-peak elution profile on CIM QA, with a primary peak consisting predominantly of VSVG-negative particles and a secondary peak enriched for VSVG-positive particles, which corresponded to infectious recovery. The tetraspanin protein CD9, a known exosomal marker, was found to be enriched in the initial impurity peak of LGE fractions. The impurity fraction containing the highest levels of CD9 was then recombined with infectious particles from the secondary peak, resulting in a measurable reduction in infectivity. These findings suggest that exosomes constitute a substantial portion of the initial impurity peak and may impair LVV transduction efficiency if not effectively removed during purification.
    Following AEX development, a secondary flowthrough polishing step using Capto Core 700 was implemented to further reduce host cell protein (HCP) and double-stranded DNA (dsDNA) impurities. Improved LVV recoveries were observed when low concentrations of ArgHCl (150–300 mM) were added to the feed, without compromising impurity clearance. As a result, the Capto Core step was positioned after the CIM QA capture stage. When operated in tandem, the integrated two-step process achieved high recoveries of both vector transgene and p24, along with a 2-log reduction in HCP and dsDNA levels.
    Overall, this study establishes a two-stage purification process that enables effective impurity clearance and high LVV recovery.
    Ph.D.
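    As a minimal sketch of the figures of merit quoted above (the standard definitions of step recovery and log reduction; the numbers are illustrative, not measurements from the thesis):

```python
import math

def percent_recovery(amount_out, amount_in):
    """Fraction of product (e.g. infectious particles) surviving a step."""
    return 100.0 * amount_out / amount_in

def log_reduction(conc_in, conc_out):
    """Log10 reduction value (LRV) of an impurity across a step."""
    return math.log10(conc_in / conc_out)

# Hypothetical step keeping 70% of infectious vector while clearing
# 99% of host cell protein, i.e. a 2-log HCP reduction:
print(percent_recovery(7.0e9, 1.0e10))    # 70.0 (%)
print(log_reduction(500.0, 5.0))          # 2.0 (log10, e.g. ng/mL HCP)
```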

    Characterizing conformational landscapes of Arf GTPases using high pressure

    No full text
    May 2025 · School of Science
    The adenosine diphosphate (ADP) ribosylation factor (Arf) small guanosine triphosphatases (GTPases) function as molecular switches that activate signaling cascades controlling membrane organization in eukaryotic cells. In Arf1, the GDP/GTP switch does not occur spontaneously but requires guanine nucleotide exchange factors (GEFs) and membranes. The related small GTPase Arf6, however, is able to undergo spontaneous nucleotide exchange. In both cases, exchange involves massive conformational changes, including disruption of the core β-sheet. To probe the switch mechanism, we coupled pressure perturbation with nuclear magnetic resonance (NMR), Fourier transform infrared (FTIR) spectroscopy, small-angle X-ray scattering (SAXS), fluorescence, and computation. For both proteins, pressure induced the formation of a classical molten globule (MG) ensemble, though Arf6 populates an ensemble that is energetically distinct from that of Arf1. Pressure also favored the GDP-to-GTP transition in both proteins, providing strong support for the notion that the MG ensemble plays a functional role in the nucleotide switch. We propose that the MG ensemble allows switching without the requirement for complete unfolding and may be recognized by GEFs. Substitutions in helix α5 of Arf6 locally destabilize this helix relative to Arf1, in which it is the most stable element. Mutating the α5 sequence of Arf6 to that of Arf1 resulted in both increased stability and slower switching, demonstrating “back-to-front” control of nucleotide exchange kinetics. Evolutionary covariance analysis highlighted an extensive non-interacting coupling network in the C-terminal half of Arf6, as well as in many other small GTPases. Our work suggests that an MG-based switching mechanism and “back-to-front” control of switching could constitute pervasive features in Arfs and Arf-like GTPases, and more generally in the evolutionarily related Rags (Ras-like small GTPases) and Gα GTPases.
    Ph.D.
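    For context, the thermodynamics behind pressure perturbation (a textbook relation, not an equation quoted from the thesis): increasing pressure shifts a two-state equilibrium toward the state with the smaller partial molar volume, so a molten globule with negative volume change relative to the native state becomes increasingly populated at high pressure.

```latex
% Two-state native (N) <-> molten globule (MG) equilibrium under pressure,
% with volume change \Delta V = V_{MG} - V_{N}:
\Delta G(p) = \Delta G_0 + \Delta V\,(p - p_0),
\qquad
\ln K(p) = \ln K(p_0) - \frac{\Delta V\,(p - p_0)}{RT},
% so the MG population grows with pressure whenever \Delta V < 0.
```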

    Asymmetric consequence relations and many-valued semantics

    No full text
    May 2025 · School of Humanities, Arts, and Social Sciences
    This work is concerned with the definition and motivation of asymmetric logics: many-valued logics with different sets of designated values for the premises and for the conclusions. Their existence is motivated both philosophically and formally. Not only do they constitute examples of novel logics that can model some aspects of our reasoning, but they are also formally and historically important as counterexamples to the famous thesis of Roman Suszko, which states that every logic is (logically) two-valued. Importantly, asymmetric logics can be given a many-valued (Suszko-style) semantics, as first suggested by Grzegorz Malinowski in his notion of q-consequence. However, these logics are non-Tarskian, which forces a reevaluation of the standard definition of validity. The asymmetric logics ST and wST are discussed and motivated in further detail.
    M.S.
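    To make the asymmetry concrete, here is a minimal sketch of an ST-style validity check over the strong Kleene three-valued tables (a standard presentation of ST, written for illustration rather than taken from the thesis): premises must take the strictly designated value 1, while conclusions need only a tolerantly designated value of at least 1/2.

```python
from itertools import product

# Strong Kleene connectives over the values 0, 0.5 (undefined), 1.
def neg(a):      return 1 - a
def conj(a, b):  return min(a, b)
def disj(a, b):  return max(a, b)
def impl(a, b):  return max(1 - a, b)   # material conditional

def st_valid(premises, conclusion, n_atoms):
    """ST validity: no valuation makes every premise strictly true
    (value 1) while the conclusion falls below the tolerant threshold 0.5.
    premises/conclusion are functions from a valuation tuple to a value."""
    for v in product((0, 0.5, 1), repeat=n_atoms):
        if all(p(v) == 1 for p in premises) and conclusion(v) < 0.5:
            return False
    return True

# Modus ponens is ST-valid ...
print(st_valid([lambda v: v[0], lambda v: impl(v[0], v[1])],
               lambda v: v[1], n_atoms=2))          # True
# ... and so is explosion: p and not-p can never both take value 1.
print(st_valid([lambda v: v[0], lambda v: neg(v[0])],
               lambda v: v[1], n_atoms=2))          # True
```

    The two designated sets {1} for premises and {1, 1/2} for conclusions are exactly what makes the consequence relation asymmetric and, notably, non-transitive at the level of metainferences.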

    Current limitations of AI in education

    No full text
    March 2025 · Information Technology and Web Science Program
    This study examines the effectiveness and limitations of artificial intelligence (AI) in education, focusing on the subjects where AI can enhance student learning and on the barriers to its widespread adoption. Using a semantic review methodology, the research synthesizes existing literature to assess AI's impact across different academic domains. Findings indicate that AI shows promise in subjects that benefit from personalized learning and data-driven insights, such as mathematics and language learning. However, significant challenges remain, including ethical concerns, data privacy risks, and the potential to exacerbate educational inequalities. These findings underscore the need for a balanced approach to AI integration, ensuring that technological advancements align with principles of equity and ethics. Further research, incorporating empirical methods, is recommended to deepen understanding and address these challenges.
    M.S.

    Physical modeling of a liquefiable soil-sheet pile retaining wall system

    No full text
    December 2024 · School of Engineering
    Sheet-pile walls are retaining structures that are prone to liquefaction-induced lateral spreading in waterfront areas. This dissertation explores the dynamic behavior of soil-sheet pile retaining wall systems under seismic loading, focusing on the effects of parameters such as peak ground acceleration (PGA), number of peak cycles, Arias Intensity, soil density, and embedment ratio. The Liquefaction Experiments and Analysis Projects (LEAP) is a global research collaboration aimed at producing reliable test data to advance and verify numerical models for soil liquefaction studies. The study employs a series of dynamic centrifuge tests performed at Rensselaer Polytechnic Institute (RPI) as part of the LEAP 2020 and LEAP 2022 projects.
    The LEAP 2020 experiments simulated the seismic response of retaining wall systems in saturated granular soil to understand soil-structure interaction and liquefaction phenomena. The problem involved a floating sheet pile wall supporting a deposit of liquefiable soil, and the experimental setup featured a backfill height to embedment depth ratio of 2:1. These experiments investigated the impact of varying mass density and input motions on soil-structure interaction. As part of the LEAP 2020 research campaign, test RPI-LEAP 20-E was conducted at the RPI centrifuge facility by Dr. Evangelia Korre to assess the seismic behavior of the sheet pile wall. Subsequently, the author of this thesis performed the RPI-LEAP 20-S test at the same facility as a repeat experiment, achieving results that were highly consistent with the initial test.
    Within the framework of LEAP 2022, a series of six centrifuge experiments (RPI-LEAP 22-5, RPI-LEAP 22-10-1, RPI-LEAP 22-10-2, RPI-LEAP 22-10-3, RPI-LEAP 22-19, and RPI-LEAP 22-36) was conducted at RPI to investigate the impact of varied input motion parameters (number of peak cycles, motion duration, and PGA) on the retaining wall models. The LEAP 2022 models had an increased embedment depth, with an embedment ratio of 1:1.
    The key findings from the LEAP 2022 experiments indicated that higher PGAs lead to increased soil-structure interaction, manifesting in higher accelerations, pore water pressures, and lateral displacements of the retaining walls. An increase in Arias Intensity, at a comparable number of peak cycles, resulted in greater lateral displacement of the retaining wall. The settlement of the backfill surface away from the wall increased with a higher number of peak cycles. The experiments also highlighted the significant role of soil density in mitigating seismic impacts, with denser soil models showing lower displacements. Additionally, varying the embedment ratio of the sheet-pile walls significantly influenced their seismic performance, with deeper embedment reducing wall displacements and rotations.
    The study underscores the importance of considering PGA, number of cycles, Arias Intensity, and soil properties in designing resilient geotechnical structures. The experimental results contribute to a better understanding of the complex dynamic behavior of retaining wall systems and provide valuable data for validating numerical models used in predicting soil liquefaction and seismic responses. This research highlights the critical factors influencing the stability of geotechnical structures in seismic-prone areas and offers insights for improving design and risk assessment practices.
    Ph.D.
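    Of the ground-motion parameters the study varies, Arias Intensity has a closed-form definition, sketched below (the standard formula with an illustrative sinusoidal record, not data or code from the thesis).

```python
import numpy as np

# Standard definition: Ia = (pi / (2 g)) * integral of a(t)^2 dt,
# with the ground acceleration a(t) in m/s^2 and Ia in m/s.
def arias_intensity(accel, dt, g=9.81):
    """Arias Intensity of a uniformly sampled acceleration record."""
    return (np.pi / (2.0 * g)) * np.sum(accel ** 2) * dt  # Riemann sum of a(t)^2

# Illustrative record: a 0.3 g, 2 Hz sinusoid lasting 10 s, sampled at 200 Hz.
dt = 0.005
t = np.arange(0.0, 10.0, dt)
accel = 0.3 * 9.81 * np.sin(2 * np.pi * 2.0 * t)
print(f"Ia = {arias_intensity(accel, dt):.2f} m/s")   # ~6.9 m/s
```

    Because Ia integrates squared acceleration over the full record, two motions with the same PGA can differ widely in Arias Intensity, which is why the study treats PGA, number of peak cycles, and Arias Intensity as separate parameters.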

    Accelerating architectural workflows through automated code compliance

    No full text
    August 2025 · School of Architecture
    As cities become more interconnected, the government plays an increasingly vital role in shaping the relationship between clients and architects. This thesis explores how data and workflows are shared across these three key stakeholders, emphasizing the city’s impact through zoning laws, building codes, and regulatory processes. To address the challenges at these intersections, a set of digital tools was developed. Among them, CodeCheck, a Grasshopper plugin that addresses the legal and administrative setbacks caused by non-compliant designs, stands out for enabling real-time compliance checks directly within architectural design software. By automating zoning and code validation, it helps architects stay focused on design while ensuring projects meet legal requirements. CodeCheck also aims to incorporate generative design capabilities that respond to site boundary conditions, enabling more context-sensitive spatial solutions. To support this functionality, the system introduces mMake, a generative tool that produces compliant building models in real time. Future improvements include expanding the platform beyond New York City to support a wider range of building codes and jurisdictions. Ultimately, this research shows that compliance is not just about rules; it is key to building better cities. By closing the gap between design and regulation, CodeCheck improves efficiency, supports accountability, and contributes to smarter, more sustainable urban development.
    M.S.
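    As an illustration of the kind of rule check a plugin like CodeCheck automates (a hypothetical sketch: the field names, rule values, and thresholds below are invented, not drawn from the thesis or from New York City's actual zoning resolution):

```python
from dataclasses import dataclass

@dataclass
class ZoningRule:
    max_height_m: float
    max_far: float          # floor area ratio limit
    min_setback_m: float

@dataclass
class BuildingModel:
    height_m: float
    gross_floor_area_m2: float
    lot_area_m2: float
    setback_m: float

def check_compliance(model: BuildingModel, rule: ZoningRule) -> list[str]:
    """Return a list of violations; an empty list means compliant."""
    violations = []
    if model.height_m > rule.max_height_m:
        violations.append(f"height {model.height_m} m exceeds {rule.max_height_m} m")
    far = model.gross_floor_area_m2 / model.lot_area_m2
    if far > rule.max_far:
        violations.append(f"FAR {far:.2f} exceeds {rule.max_far}")
    if model.setback_m < rule.min_setback_m:
        violations.append(f"setback {model.setback_m} m below {rule.min_setback_m} m")
    return violations

print(check_compliance(BuildingModel(38.0, 5200.0, 1000.0, 2.5),
                       ZoningRule(max_height_m=35.0, max_far=5.0, min_setback_m=3.0)))
```

    Running such checks on every geometry update inside the design environment is what allows non-compliance to surface during design rather than during plan review.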

    AI models for decentralized finance

    No full text
    August 2025 · School of Science
    This dissertation develops a general framework for modeling user behavior from transactional data, using decentralized finance (DeFi) as a motivating case study. DeFi protocols such as Aave offer a rare opportunity to study large-scale, real-world financial behavior through publicly available transaction-level data. However, these data are complex: high-dimensional, heterogeneous, and irregularly timestamped. To address the modeling challenges this poses, the dissertation explores a range of methods, including clustering, survival analysis, transformer-based representation learning, and code generation with large language models.
    We begin by characterizing user behaviors in Aave using quarterly address-level summaries and unsupervised clustering, revealing dominant behavioral archetypes and their evolution over time. Building on this, we introduce a novel survival analysis framework tailored to DeFi, modeling event timing from raw transaction sequences and uncovering patterns in loan repayments, liquidations, and platform usage. These insights motivate the creation of FinSurvival, a benchmark suite of 16 large-scale survival prediction and classification tasks. FinSurvival is the first publicly available benchmark of its kind in finance, and we show that standard machine learning methods often outperform deep learning models in this high-censoring environment.
    To explore learned representations, we develop Large Transaction Models (LTMs), transformer-based models that generate embeddings of transaction sequences. We evaluate the effectiveness of these embeddings on FinSurvival tasks, demonstrating that they can improve performance in classification settings relative to both raw and hand-engineered features. Finally, we present a benchmark for evaluating the ability of LLMs to generate code for analyzing transaction data, demonstrating the feasibility of natural-language-driven automation of data querying and transformation.
    Collectively, this work contributes new methods, datasets, and evaluation tools for behavioral modeling with transaction data. It highlights the challenges of modeling with transaction data, the tradeoffs between hand-engineered and learned representations, and the promise of AI models for handling this kind of data.
    Ph.D.
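    To make the survival-analysis framing concrete, here is a minimal sketch of how right-censored time-to-event records can be derived from transaction logs (the schema, addresses, and cutoff are hypothetical, not the thesis's FinSurvival pipeline): for each borrow, the duration is the time until repayment, censored at the observation cutoff if no repay event is seen.

```python
from datetime import datetime

CUTOFF = datetime(2024, 1, 1)   # end of the observation window

transactions = [  # (address, event, timestamp) -- invented example data
    ("0xabc", "borrow", datetime(2023, 3, 1)),
    ("0xabc", "repay",  datetime(2023, 7, 15)),
    ("0xdef", "borrow", datetime(2023, 9, 10)),   # no repay before cutoff
]

def to_survival_records(txs, cutoff):
    """Build (address, duration_days, event_indicator) records,
    with event_indicator = 1 if the repayment was observed, else 0."""
    borrows = {a: t for a, e, t in txs if e == "borrow"}
    repays  = {a: t for a, e, t in txs if e == "repay"}
    records = []
    for addr, start in borrows.items():
        if addr in repays:
            records.append((addr, (repays[addr] - start).days, 1))  # observed
        else:
            records.append((addr, (cutoff - start).days, 0))        # censored
    return records

print(to_survival_records(transactions, CUTOFF))
# [('0xabc', 136, 1), ('0xdef', 113, 0)]
```

    The high share of censored rows in data like this is the "high-censoring environment" the abstract refers to, and it is what makes survival methods preferable to plain regression on event times.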

    The Logic of Bias: Using Cognitive Architecture to Explore Interactions Between Cognitive Abilities and Decision Error

    No full text
    The traditional view of biases as cognitive imperfections has been challenged by several strains of research, including the PSI cognitive architecture. On this view, biases are engineered by evolution to prevent dissatisfaction and to assist the subsequent satisfaction of human needs. PSI's general assumption that higher skills and reasoning capacities alleviate biases has recently been called into question, as high numeracy was associated with an exacerbated effect of political bias. We conduct two studies whose results indicate that the basis for this effect (1) does not represent a general cognitive fallacy caused by modulations of perceptual and attentional processes, and (2) is not rooted in the long-term formation of habituated action patterns associated with prior beliefs. This strengthens the evidence that the effect is specific to group dynamics with strong affiliative bonds. Further, we propose a set of revisions to PSI necessary to model this expert bias phenomenon.

    223 full texts
    6,376 metadata records
    Updated in last 30 days.
    DSpace@RPI (Rensselaer Polytechnic Institute) is based in the United States.