117 research outputs found

    Economic and Fiscal Impacts of Proposed LNG Facility in Robbinston, Maine

    The purpose of this study is to examine the economic and fiscal impacts of the proposed Downeast LNG facility on the Town of Robbinston, Washington County, and the State of Maine. The economic impact analysis focuses on the employment and income that are associated with the LNG facility construction and operations. The fiscal impact analysis considers additional local and state tax revenues associated with the facility, as well as increased local government expenditures that are projected to result from the LNG project. This report does not address the environmental, homeland security, or energy security impacts of the LNG facility. In addition, this report does not estimate any changes in the price of delivered natural gas in Maine that could potentially result from a new major energy supplier.

    RCT Rejection Sampling for Causal Estimation Evaluation

    Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. For settings with high-dimensional covariates -- such as text data, genomics, or the behavioral social sciences -- researchers have proposed methods to adjust for confounding by adapting machine learning methods to the goal of causal estimation. However, empirical evaluation of these adjustment methods has been challenging and limited. In this work, we build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data: subsampling randomized controlled trials (RCTs) to create confounded observational datasets while using the average causal effects from the RCTs as ground truth. We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data to allow for valid comparisons to the ground-truth RCT. Using synthetic data, we show our algorithm indeed results in low bias when oracle estimators are evaluated on the confounded samples, which is not always the case for a previously proposed algorithm. In addition to this identification result, we highlight several finite data considerations for evaluation designers who plan to use RCT rejection sampling on their own datasets. As a proof of concept, we implement an example evaluation pipeline and walk through these finite data considerations with a novel, real-world RCT -- which we release publicly -- consisting of approximately 70k observations and text data as high-dimensional covariates. Together, these contributions build towards a broader agenda of improved empirical evaluation for causal estimation. Comment: Code and data at https://github.com/kakeith/rct_rejection_samplin

    ComLittee: Literature Discovery with Personal Elected Author Committees

    To help scholars understand and follow a research topic, significant research has been devoted to creating systems that help them discover relevant papers and authors. Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery. However, these systems do not capture and utilize users' evolving knowledge of authors. We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration. In contrast to paper-centric interaction in prior systems, ComLittee's author-centric interaction supports curation of research threads from individual authors, finding new authors and papers with combined signals from a paper recommender and the curated authors' authorship graphs, and understanding them in the context of those signals. In a within-subjects experiment that compares to an author-highlighting approach, we demonstrate how ComLittee leads to higher efficiency, quality, and novelty in author discovery, which also improves paper discovery.

    The Stellar Population of h and chi Persei: Cluster Properties, Membership, and the Intrinsic Colors and Temperatures of Stars

    (Abridged) From photometric observations of ∼47,000 stars and spectroscopy of ∼11,000 stars, we describe the first extensive study of the stellar population of the famous Double Cluster, h and χ Persei, down to subsolar masses. Both clusters have E(B-V) ∼ 0.52--0.55 and dM = 11.8--11.85; the halo population, while more poorly constrained, likely has identical properties. As determined from the main sequence turnoff, the luminosity of M supergiants, and pre-main sequence isochrones, ages for h Persei, χ Persei and the halo population all converge on ≈14 Myr. From these data, we establish the first spectroscopic and photometric membership lists of cluster stars down to early/mid M dwarfs. At minimum, there are ∼5,000 members within 10' of the cluster centers, while the entire h and χ Persei region has at least ∼13,000 and as many as 20,000 members. The Double Cluster contains ≈8,400 M⊙ of stars within 10' of the cluster centers. We estimate a total mass of at least 20,000 M⊙. We conclude our study by outlining outstanding questions regarding the properties of h and χ Persei. From comparing recent work, we compile a list of intrinsic colors and derive a new effective temperature scale for O--M dwarfs, giants, and supergiants. Comment: 88 pages, many figures, accepted for publication in The Astrophysical Journal Supplements. Contact lead author for version with high-resolution figures.

    ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews

    Revising scientific papers based on peer feedback is a challenging task that requires not only deep scientific knowledge and reasoning, but also the ability to recognize the implicit requests in high-level feedback and to choose the best of many possible ways to update the manuscript in response. We introduce this task for large language models and release ARIES, a dataset of review comments and their corresponding paper edits, to enable training and evaluating models. We study two versions of the task: comment-edit alignment and edit generation, and evaluate several baselines, including GPT-4. We find that models struggle even to identify the edits that correspond to a comment, especially in cases where the comment is phrased in an indirect way or where the edit addresses the spirit of a comment but not the precise request. When tasked with generating edits, GPT-4 often succeeds in addressing comments on a surface level, but it rigidly follows the wording of the feedback rather than the underlying intent, and includes fewer technical details than human-written edits. We hope that our formalization, dataset, and analysis will form a foundation for future work in this area. Comment: 11 pages, 2 figures.

    Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections

    Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers. As scientific literature grows, this becomes increasingly challenging. Meanwhile, authors summarize prior research in papers' related work sections, though each summary is scoped to support a single paper. A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci. In this work, we design a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information. From a within-subjects user study (n=15), we found that scholars generate more coherent, insightful, and comprehensive topic outlines using Relatedly compared to a baseline paper list.

    Autoregulation of yeast ribosomal proteins discovered by efficient search for feedback regulation

    Post-transcriptional autoregulation of gene expression is common in bacteria but far fewer examples are known in eukaryotes. We used the yeast collection of genes fused to GFP as a rapid screen for examples of feedback regulation in ribosomal proteins by overexpressing a non-regulatable version of a gene and observing the effects on the expression of the GFP-fused version. We tested 95 ribosomal protein genes and found a wide continuum of effects, with 30% showing at least a 3-fold reduction in expression. Two genes, RPS22B and RPL1B, showed over a 10-fold repression. In both cases the cis-regulatory segment resides in the 5' UTR of the gene, as shown by placing that segment of the mRNA upstream of GFP alone and demonstrating it is sufficient to cause repression of GFP when the protein is over-expressed. Further analyses showed that the intron in the 5' UTR of RPS22B is required for regulation, presumably because the protein inhibits splicing that is necessary for translation. The 5' UTR of RPL1B contains a sequence and structure motif that is conserved in the binding sites of Rpl1 orthologs from bacteria to mammals, and mutations within the motif eliminate repression.

    CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context

    When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work. However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews. This paper introduces CiteSee, a paper reading tool that leverages a user's publishing, reading, and saving activities to provide personalized visual augmentations and context around citations. First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened. Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history. We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines. A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citations when conducting real-world literature reviews.