Harvard University - DASH

    71302 research outputs found

    Disrupting Bipartite Trading Networks: Matching for Revenue Maximization

    Online platforms and marketplaces have revolutionized everyday commerce, breaking down many of the physical and geographic barriers that previously prevented trade. Today, these online marketplaces facilitate trillions of dollars of trade and have disrupted a wide variety of markets like retail commerce (Amazon, eBay), transportation (Uber, Lyft), food delivery (Instacart, UberEats), and many others. However, the growing power of these platforms also comes at a cost, with regulatory bodies taking an increasing interest in the competitive practices of these platforms in the past years. Motivated by the interest in these competitive practices, we study the incentives facing platforms when choosing how to match their users together – e.g., buyers and sellers on Amazon. We model the role of an online platform disrupting a network of n unit-demand buyers and m unit-supply sellers. Each seller can transact with a subset of the buyers whom she already knows outside of the platform, as well as with any additional buyers to whom she is introduced or matched by the platform. Given these constraints on trade, we model prices and transactions as being induced by a competitive equilibrium. The platform's revenue is proportional to the total price of all trades between platform-introduced buyers and sellers. We consider a revenue-maximizing platform and study the effect of the platform on social welfare (the sum of transacting buyers' values for the items they receive). We show that even when the platform optimizes for revenue, the social welfare is at least an O(log(min{n,m}))-approximation to the ideal welfare, giving non-trivial guarantees for social welfare even in the presence of platform behavior. When the platform can significantly increase social welfare, i.e. 
when the existing market is inefficient, we give a polynomial-time algorithm that guarantees a logarithmic approximation of the optimal welfare as revenue, also attaining a logarithmic fraction of optimal social welfare in the process. In general, we show that the platform's revenue-maximization problem is computationally intractable, but we provide structural results for revenue-optimal matchings and isolate special cases in which the platform can efficiently compute them. Finally, we prove significantly stronger bounds for revenue and social welfare in homogeneous-goods markets, where each seller is selling an identical item. We prove that revenue maximization aligns perfectly with welfare maximization in these markets; any revenue-optimal platform matching also maximizes overall social welfare. In inefficient homogeneous-goods markets, we give a constant-factor polynomial-time approximation algorithm for revenue that also maximizes social welfare.
    Applied Mathematics

    Beyond Intention-to-Treat Analyses in Randomized Trials: Utilizing Per-Protocol Estimands and Observational Data to Investigate Strategies for Cardiovascular Disease Prevention

    No full text
    Randomized trials are considered the gold standard study design for conducting comparative effectiveness research. Yet, evidence from randomized trials to inform cardiovascular disease (CVD) prevention efforts remains scarce for some patient populations, partly due to restrictive eligibility criteria and shorter follow-up of surrogate outcomes. Emulating target trials in observational data can provide insights into the effects of existing medications for CVD prevention in new target populations and on longer-term clinical endpoints. Regardless of study design, that is, whether we use data from randomized trials or emulate target trials in observational data, complementing intention-to-treat analyses, which estimate the effect of the assigned treatment strategies, with appropriate per-protocol analyses, which estimate effects that would have been observed had there been full adherence to the assigned treatment strategies, can be important to facilitate informed clinical decision-making. In this dissertation, I estimate both intention-to-treat and per-protocol effects using data from a large-scale, pragmatic randomized trial and from electronic health records and health insurance claims to elucidate the impact of medications important for CVD prevention in populations for whom evidence has been limited based on existing intention-to-treat analyses of randomized trials. In Chapter 1, I estimated the effect of adhering to assigned treatment strategies of pravastatin or usual care on death and CVD in the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack – Lipid-Lowering Trial. The high initiation of lipid-lowering therapy in the usual care comparator group has been previously suggested to explain the null intention-to-treat findings. 
I demonstrate that deviations from the pravastatin treatment strategy may better explain the intention-to-treat results, showing that pravastatin reduced the risk of death and CVD even when allowing individuals in the usual care group to initiate lipid-lowering therapy. In Chapter 2, I estimated the effect of statin therapy versus usual care on the 5-year risk of CVD among women with breast cancer. I first specified the target trial and then emulated the trial using electronic health record data from a large cohort of patients at several Kaiser Permanente networks. While intention-to-treat findings showed no difference in risk of CVD over five years, the risk of CVD was lower for the statin therapy group compared to usual care in per-protocol analyses, though estimates were imprecise. In Chapter 3, I estimated the effects of adding GLP1-RA to SGLT2i and adding SGLT2i to GLP1-RA on glycemic control and intermediate cardiovascular risk factors at one year, and on the risk of CVD over three years, among persons with inadequately controlled type 2 diabetes. I outlined the protocol of two target trials, then emulated them using electronic health records and claims data from 12 insurance/healthcare systems in the US. Regardless of initial background treatment, adding SGLT2i to GLP1-RA reduced HbA1c, systolic blood pressure, and body mass index after one year. Estimates for the effect of combination therapy on cardiovascular events were imprecise, and additional data are needed to confirm whether adding SGLT2i or GLP1-RA leads to long-term protection against cardiovascular disease. 
In conclusion, when evidence from intention-to-treat analyses of existing randomized trials is insufficient to guide clinical decision-making alone, using per-protocol effects and pragmatic trials as a framework for conducting comparative effectiveness research in observational data can help inform future CVD prevention efforts for populations at high risk, such as breast cancer survivors and patients with type 2 diabetes.
    Population Health Science

    Membrane-Electrolyte System Studies for Aqueous Redox Flow Reactors

    Electrochemical flow reactors are promising for electrochemistry at scale. Reactants are flowed continuously through typically porous electrodes where redox occurs, and ions transport through the electrolyte and typically at least one ion exchange membrane to provide current in the electrochemical circuit and balance charge. Ion exchange membranes are made of charged polymers that take up solvent to create ionically conductive pathways while blocking bulk mixing of adjacent electrolyte solutions. This thesis examines a crucial interface in electrochemical flow reactors, where the membrane and contacting electrolyte interact and exchange chemical species, and we find that the interaction of membrane and electrolyte governs the structure and transport of the membrane phase, which affects device-scale performance. In redox flow batteries, the membrane must block crossover of reactants while enabling high conductivity. A combination of reactant size and charge effects influences permeation rates through charged membranes, with charge exerting especially strong influence under dilute conditions. The overall concentration and composition of the battery electrolyte influence the membrane hydration and hence transport, and we use conclusions from a systematic study of these effects to design crossover-free membrane-electrolyte systems with both commercial and novel membranes. In bipolar membranes, we find that the composition of impure strong electrolytes affects the local composition at the bipolar junction, which determines open circuit voltage and polarization behavior. Finally, we involve a series of ion exchange membranes including a bipolar membrane in a redox electrodialysis process for pH-driven separations, and untangle the concentration- and current-dependent membrane transport phenomena that limit the process efficiency.
    Engineering and Applied Sciences - Engineering Science

    Inferring dynamic extracellular matrix composition of the thymus: towards a biomimetic scaffold for T cell culture

    Immunotherapy is a pioneering approach using T lymphocytes (T cells) to fight cancer and immune deficiency, which together affect millions of patients in the US. Current clinical therapies require patient- or donor-derived T cells, leading to manufacturing delays and limited supply. While stem cell-derived T cells offer a scalable strategy to meet growing demand and increase accessibility to therapy, generating mature and clinically usable T cells through this method is challenging and cell yields remain low. In the body, T cell maturation relies on cell migration through thymic structures that change chemically and physically over time; the extracellular matrix (ECM) has a role in shaping these evolving niches. Existing solutions to producing “off-the-shelf” engineered T cells use mouse thymic epithelial cell lines and native ECM to adapt key in vivo components of the T cell differentiation process. However, these solutions are not fully synthetic, limiting their scalability and potential therapeutic use. Additionally, these models of the thymic ECM overlook developmental shifts in thymic structure that might advance an understanding of T cell maturation. Therefore, dynamic biological and chemical properties in the thymic microenvironment are important in efficiently producing mature T cells in vivo. This project constructs a proof-of-concept biomimetic platform for immune cell differentiation from stem cells based on these changing characteristics of the thymic extracellular matrix. Using single-cell RNA sequencing data to quantify gene expression levels, the extracellular matrix composition of the thymus at different developmental stages is computationally characterized, and characteristic ratios of key extracellular matrix components (fibronectin, collagen, and laminin) are identified at these different stages. 
These ratios are then used to engineer a stage-specific alginate-based hydrogel for cell culture, designed to replicate thymic tissue stiffness and viscoelasticity with tunable properties that reflect different developmental stages. Using this stage-specific platform, researchers can investigate how T cell differentiation, activation, and toxicity are influenced by the thymus's developmental stage. Additionally, this platform could allow for the identification of the optimal stage of thymic development to mimic in T cell culture scaffolds, enabling more scalable T cell expansion.
    Engineering Sciences S

    Remembering the Enslaved and the Emancipated in Latin Verse Epitaphs, Epigrams, and Poems of Consolation

    No full text
    In this dissertation, I examine the extant Latin poetry written to commemorate enslaved and emancipated persons in the ancient Roman world. My study engages with two bodies of evidence. The first comprises the 243 surviving Latin verse epitaphs inscribed on the gravestones of persons from these social groups: these epitaphs date from the first century BCE to the fourth century CE and originate from across the Roman empire. The second body of evidence comprises the poems within the Latin literary canon (that is, the texts circulated in manuscript form rather than being incised on stone) that were written to commemorate enslaved and emancipated persons and/or to console someone on the loss of one such person. 19 poems of this nature survive: one by Lucilius, 14 by Martial, and four by Statius, ranging in length from two lines to 234. These forms of sepulchral poetry, I argue, not only served to honor deceased individuals, but also provided a space in which contrasting perspectives on slavery, including both positive and negative sentiments, could be expressed. In Chapters 1 and 2, I demonstrate that, when slaveholders commemorate their enslaved and emancipated dependents (which is the case in all of the canonical texts and some of the inscribed epitaphs), they tend to depict slavery as benevolent and paternalistic and to position themselves as central to their dependents’ social and professional lives. In Chapter 1, I analyze how these tendencies are manifested in the inscribed verse epitaphs; in Chapter 2, I consider how they are manifested in two canonical genres of poetry (namely, the epigram and the poem of consolation). In Chapters 3 and 4, I examine how, when enslaved and emancipated persons memorialize themselves, or are memorialized by their family and friends (which is the case in most of the inscribed epitaphs), they often challenge or reframe this worldview. 
In Chapter 3, I examine cases where this is done overtly, including an epitaph from Gaul that describes enslavement as traumatic, and a group of epitaphs, two from Carthage and one from Italy, that equate death with liberation from bondage. As I demonstrate in Chapter 4, enslaved and emancipated persons could also engage in more subtle modes of resistance. Many of them, when writing their own epitaphs or those of their family members, choose to tell their life stories in ways that conflict with the picture presented by the slaveholders: for instance, they may foreground their family and community rather than their relationship with their slaveholder. Finally, in Chapter 5, I underscore the subjectivity of sepulchral poetry, and its capacity to shape perceptions of particular social groups, by examining the extant poems, both canonical and epigraphic, for a more complex social group: namely, alumni, meaning children born to enslaved or emancipated parents and ‘fostered’ by their slaveholder. These alumni may be represented in different ways (e.g., as ordinary household slaves or as adoptive children) according to the perspective and agenda of their commemorator(s). Overall, therefore, these poems enabled persons from various social backgrounds not just to mourn their deceased loved ones, but also to participate in a broader dialogue on slavery and to shape the ways in which the experiences of enslavement and emancipation were remembered.
    Classics

    Essays on Data Science: Computational Measurement for Learning and Teaching

    To study teaching and learning at a large scale, we must introduce new methods for the analysis of rich, unstructured data from classrooms, such as audio, video, and transcribed text. In this dissertation, I develop and apply computational and statistical methods to measure teaching and learning processes captured in two unstructured sources: student writing and transcripts of teacher speech from classroom lessons. My first essay introduces Coupled Likelihood Estimation (CLE), a method that improves the precision of parameter estimates in models of unstructured data features while requiring fewer expert-labeled observations. It combines information from limited samples of expert-labeled data with larger samples of data with machine-predicted labels. CLE leverages the geometric structure of the joint likelihood from both identifying (labeled) and non-identifying (unlabeled) data, constraining parameter estimates to, approximately, a surface defined by the unlabeled data's likelihood. Simulations demonstrate that CLE is unbiased, reduces root mean squared error, and yields narrower confidence intervals compared to existing methods, in some cases effectively achieving average efficiency gains equivalent to doubling the expert-labeled sample size. An application estimating the effect of an educational intervention on student writing quality illustrates CLE’s practical utility, producing estimates closer to an oracle benchmark using only 18% of the expert-labeled data. By amplifying the value of limited labeled data, CLE lowers barriers to high-quality inference in resource-constrained domains such as healthcare, education, and policy evaluation. The method’s broad applicability, theoretical guarantees, and computational approach offer a pathway to cost-effective, reliable analyses in settings where researchers face high labeling costs. 
My second essay leverages natural language processing techniques to study the use of mathematical vocabulary in elementary math classrooms. My collaborators and I develop a rules-based computational measure of mathematical vocabulary use. We find that teachers differ substantially in the amount of mathematical vocabulary they model for their students. Students of teachers at the 75th percentile were exposed to 28 more mathematical terms per lesson (4,480 per year) than students of teachers at the 25th percentile. Observed characteristics explain very little of this variation in teachers' mathematical vocabulary use. Finally, students randomly assigned to teachers who used more mathematical vocabulary in previous years scored higher on standardized tests of mathematics. This implies that teachers who expose their students to more mathematical vocabulary are more effective teachers of mathematics. Across value-added studies, a teacher one standard deviation above the mean in effectiveness raises math scores by between 0.10 and 0.15 \citep{BacherHicks2023}; our estimate of the effect of being assigned to a teacher who uses one standard deviation more mathematical language accounts for roughly half of this variation, indicating that our measure is a powerful predictor of teacher effectiveness. In my third essay, I develop Contextual Value Separation (CVS), a general method for identifying words used differently between pre-specified subsets of documents in large text corpora. CVS achieves this by combining contextual embeddings with machine learning classifiers, permutation testing, and statistical adjustments for multiple comparisons. Whereas current methods identify words that predict membership within a given class of documents, CVS reveals cases where separate classes of the documents use the same word in differing ways. 
For example, experienced and novice math teachers may use a mathematical vocabulary term with similar frequency but in markedly different ways or contexts. This approach can search over a specified set of target words or over the entire vocabulary of the corpus. For each target word, CVS infers how consistently its contextual embeddings differ by subset. Because vocabularies are large, the method includes multiple testing correction to control the false-discovery rate, typically yielding a small set of words whose usage varies between the document classes. After identifying the words whose usage most consistently differs, example usages from each subset are extracted for qualitative examination. CVS easily extends to other forms of unstructured data represented by embeddings, such as video and audio. The method can be used as an exploratory tool for hypothesis generation, to test a priori hypotheses, or to detect treatment effects on textual outcomes in experimental settings. To demonstrate the method, I analyze a set of transcripts from upper elementary mathematics lessons and identify two ways that teachers with larger impacts on math scores use mathematical vocabulary differently: more use of the mathematical meanings of polysemous terms and more requests that students engage with questions related to the terms. The method can be easily extended to other forms of unstructured data that can be encoded into vectors, e.g., audio and video. As a collection, these three essays reveal the promise of computational methods for enabling the analysis of text data (and other rich, unstructured data sources). They contribute several novel findings in the field of education regarding mathematical vocabulary and effective teaching. 
From a statistical point of view, CLE introduces a new way to leverage large amounts of machine-labeled data, which, in addition to its value for educational research, can lower the cost of research in several domains, such as phenotyping electronic health records.
    Education
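As an illustration of the permutation-testing and false-discovery-rate steps that the abstract above attributes to Contextual Value Separation (CVS), here is a minimal Python sketch. It is not the author's implementation. The nearest-centroid classifier, the synthetic stand-in "embeddings", and every function and variable name (`centroid_accuracy`, `permutation_p_value`, `benjamini_hochberg`, `make_word`) are assumptions chosen for illustration, and Benjamini-Hochberg is assumed as the false-discovery-rate procedure because the abstract does not name one.

```python
import numpy as np


def centroid_accuracy(X, y, rng, n_splits=5):
    """Cross-validated accuracy of a nearest-centroid classifier.

    Stand-in for the (unspecified) classifier used on contextual embeddings.
    """
    idx = rng.permutation(len(y))
    correct = 0
    for fold in np.array_split(idx, n_splits):
        train = np.setdiff1d(idx, fold)
        c0 = X[train][y[train] == 0].mean(axis=0)
        c1 = X[train][y[train] == 1].mean(axis=0)
        d0 = np.linalg.norm(X[fold] - c0, axis=1)
        d1 = np.linalg.norm(X[fold] - c1, axis=1)
        pred = (d1 < d0).astype(int)
        correct += int((pred == y[fold]).sum())
    return correct / len(y)


def permutation_p_value(X, y, rng, n_perm=200):
    """P-value for 'usage differs by subset': shuffle subset labels to build the null."""
    observed = centroid_accuracy(X, y, rng)
    null = np.array(
        [centroid_accuracy(X, rng.permutation(y), rng) for _ in range(n_perm)]
    )
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject


rng = np.random.default_rng(0)


def make_word(shift, n=40, dim=8):
    """Synthetic embeddings for one hypothetical word in two document subsets."""
    a = rng.normal(0.0, 1.0, size=(n, dim))
    b = rng.normal(shift, 1.0, size=(n, dim))
    return np.vstack([a, b]), np.array([0] * n + [1] * n)


# Two hypothetical target words: one used differently across subsets, one not.
words = {"shifted_usage": make_word(1.5), "same_usage": make_word(0.0)}
p_values = {w: permutation_p_value(X, y, rng) for w, (X, y) in words.items()}
flags = benjamini_hochberg(list(p_values.values()))
for word, flagged in zip(p_values, flags):
    print(word, round(p_values[word], 4), bool(flagged))
```

In a real analysis the synthetic arrays would be replaced by contextual embeddings of each occurrence of a target word, and the flagged words would then be inspected qualitatively, as the abstract describes.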

    Algorithms for Fair Redistricting

    Redistricting is the act of dividing land into geographic regions, called districts, for political or administrative purposes. The most notable, and most contentious, example comes from the United States House of Representatives, where seats are apportioned among states according to their respective populations following a decennial census, then assigned to political districts, with each district holding a separate election for a single representative. Political redistricting occurs periodically at other state and local levels, alongside other forms of redistricting for non-political purposes, such as the determination of school attendance zones. In all of these settings, there is a vast space of possible outcomes to consider and a compelling need for fairness and transparency. In this thesis, we design algorithms for various redistricting tasks and provide theoretical underpinnings for existing algorithms. We begin by considering the initial task of seat apportionment, exploring the space of randomized allocation methods satisfying strong fairness guarantees while eliminating unwanted correlations across states. We then turn to the task of drawing redistricting maps that are provably fair, using ideas and results from the Cake-Cutting model in fair division. Finally, we contribute new algorithms and theory for the task of sampling random redistricting maps, with the aim of building robust statistical tests for assessing partisan fairness.
    Engineering and Applied Sciences - Computer Science

    A Marketplace for Populism: The Moral Politics of Digitization in India’s Informal Economy

    No full text
    Digital India is a flagship policy of the Government of India to foster “economic growth combined with social inclusion.” Central to Digital India is the Aadhaar card, a biometric identification now distributed to 1.3 billion Indians, and a real-time mobile payment technology, Unified Payments Interface (UPI), that has significantly replaced cash transactions. This technological transformation was imagined and implemented against the social backdrop of India's large informal economy. Prior scholarship has shown that technological promises to improve governance and economic opportunities are often not fully realized and create new sources of friction, especially for marginalized communities. Nevertheless, digital technologies have been taken up by actors in the informal economy, such as street vendors in urban areas. In this dissertation, I ask: How has digitization altered informal workers’ economic and political subjectivity, and with what consequences for their agency in the market and political sphere? How do citizens rationalize breakdowns in techno-economic promises, and how has digitization become an instrument of populist politics? I explore these questions through fieldwork across Delhi, Mumbai and Bangalore. While my ethnography is based in the Sarojini Nagar market of New Delhi, my methodology also includes discourse analysis, controversy studies, legal and policy analysis, archival work, and oral histories. I find that over time the Indian state has evolved into a heterogeneous set of actors and institutions with competing politics. Bureaucrats, administrators, lower-level officials, politicians, ministers, and middlemen all represent overlapping components of “the state,” with benevolent, oppressive or ambiguous relationships to street vendors. Digitization of identity and payments alters the political relationship between street vendors, governing authorities, and the nation-state in four salient ways. 
First, digitization opens the possibility of being seen and potentially recognized by the benevolent arms of the state. Second, digital payments represent a traceable and transparent alternative to the cash-based corruption the state is seeking to eliminate. Third, digital welfare creates a personified state with direct connections between welfare recipients and political leaders rather than state services being mediated through the bureaucracy. Fourth, in political discourse, street vendors are reimagined from being subjects in need of social transformation to agents of nation-building through a digital economy that is accessible and immediate. Simultaneously, street vendors resist the digital imaginary of the Indian nation-state, critiquing being seen without recognition, distribution without redistribution, and duty without rights. In postcolonial India, the bureaucratic-state apparatus was strengthened to facilitate the progress of the nation and its citizens. Politics in contemporary India recasts these mediating institutions as threats to democracy. Digitization represents an effort to diminish the bureaucratic state that is understood by political leaders, tech-entrepreneurs, and street vendors as inhibiting the nation’s and its citizens’ progress.
    Public Policy

    Leasing Out Sovereignty; The proliferation of Chinese Surveillance Technologies in Africa

    No full text
    In the everyday lexicon, digital technology is simply an instrument, a functional conduit that automatically realizes ends. Contrary to this supposition, we can explore digital technology as a political and cultural artifact that embodies values embedded in postcolonial societies. This dissertation project, based on ethnographic, documentary, and archival research, discusses digital technology's functional architecture and how it operates as the preeminent artifact to envision developed futures. The project investigates how Chinese surveillance technology has proliferated throughout Africa, particularly in Kenya, and examines the technological fix as an aspiration for improving the capabilities of the postcolonial state to bolster social order and spur economic development. The project's principal goal is to examine the effects of digital technology on (i) state sovereignty and the governance of local populations, including the management of crime and the enforcement of economic regulation; (ii) civil society, concerning the sense of in/security, and of un/freedom it instills in citizens; and (iii) patterns of social and material inequality. What, further, are its hidden, unintended consequences in Kenya, and what particularities, if any, emerge from its postcolonial context? And, finally, what does the spread of Chinese digital infrastructure, as exemplified by the case study here, augur for China's future in Africa? These questions have significant implications for Africa and the global order at large. This project seeks to resolve these questions by focusing on the procurement and use of digital surveillance technology.
    African and African American Studies

    Laboratory evolution of botulinum neurotoxins for therapeutic applications

    No full text
    The goal of this thesis is to expand the therapeutic potential of botulinum neurotoxins (BoNTs) by using directed evolution to reprogram both major functional domains. Specifically, we aimed to engineer the protease domain to cleave novel, disease-relevant substrates and to reprogram the receptor-binding domain towards targeting of non-neuronal cell types, such as cancer cells. Through this work, BoNTs are transformed from neurotoxins into programmable delivery systems for targeted therapies, with applications ranging from cancer treatment to broader modulation of cell fate. Cancers evolve numerous mechanisms to evade both cell death and immune surveillance. Pyroptosis, an inflammatory form of programmed cell death, offers a promising therapeutic strategy by eliminating cancer cells while simultaneously activating immune responses. In this work, phage-assisted evolution was used to reprogram BoNT proteases to cleave two key activators of pyroptosis: procaspase-1 and gasdermin D. A substrate profiling platform was developed to assess the specificity of both wild-type and engineered proteases. While the wild-type protease failed to induce cell death, evolved variants triggered robust cytotoxicity in multiple cancer cell lines. The gasdermin D-cleaving variant induced strictly pyroptotic death, while the procaspase-1-cleaving variant triggered both pyroptosis and apoptosis, suggesting broader caspase-like activity. To enable targeted delivery, evolved proteases were reconstituted into BoNT toxins containing the native translocation domain, resulting in selective death of cancer cells but not non-cancerous cells. These findings demonstrate that BoNT proteases can be reprogrammed to modulate inflammatory cell death and serve as programmable tools for targeted cancer therapy. In addition to protease engineering, we reprogrammed the BoNT receptor-binding domain as a potential strategy for cell-specific targeting. 
Using two target nomination strategies (manual selection and a weighted alignment tool), we successfully evolved binders against all nominated targets. A bacterial two-hybrid circuit paired with phage-assisted evolution supported the evolution of binders to CD44v6, CD30, ACHA3, CD1b, RXFP1, and TSN8. Notably, evolved variants targeting ACHA3 and RXFP1 bound to full-length ectodomains with single-digit nanomolar affinity, exceeding the affinity of the wild-type BoNT HC for its native receptor. While in-cell and in vivo validations are ongoing, this work lays the foundation for BoNT HC evolution, analogous to antibody engineering strategies. Collectively, these studies demonstrate that BoNTs are highly reprogrammable and can be evolved for diverse therapeutic applications. By combining protease and receptor-binding domain engineering, this platform offers a blueprint for developing programmable protein-based therapeutics that modulate cell fate with precision.
    Chemical Biology

    26,017 full texts
    71,312 metadata records
    Updated in last 30 days.
    Harvard University - DASH is based in United States