32 research outputs found

    Non-Standard Errors

    Get PDF
    In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors (NSEs). We study NSEs by letting 164 teams test the same hypotheses on the same data. NSEs turn out to be sizable, but smaller for more reproducible or higher-rated research. Adding peer-review stages reduces NSEs. We further find that this type of uncertainty is underestimated by participants.
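    As a rough illustration of the distinction, the sketch below simulates point estimates from 164 hypothetical teams and contrasts the average reported standard error (within-team sampling uncertainty) with the cross-team dispersion of estimates, one simple way to read a non-standard error. All numbers are simulated assumptions for illustration; this is not the authors' estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 164 teams each report a point estimate and a
# conventional standard error for the same hypothesis on the same data.
n_teams = 164
true_effect = 0.10

# Within-team sampling uncertainty (what a standard error captures).
reported_se = rng.uniform(0.02, 0.05, size=n_teams)

# EGP variation: different defensible analysis choices shift each
# team's point estimate away from the common target.
egp_shift = rng.normal(0.0, 0.04, size=n_teams)
estimates = true_effect + egp_shift + rng.normal(0.0, reported_se)

# Standard error: average reported within-team uncertainty.
standard_error = reported_se.mean()

# Non-standard error: dispersion of point estimates across teams,
# i.e. the uncertainty added by researcher variation.
non_standard_error = estimates.std(ddof=1)

print(f"mean standard error: {standard_error:.3f}")
print(f"non-standard error:  {non_standard_error:.3f}")
```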

    Finishing the euchromatic sequence of the human genome

    Get PDF
    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome, including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
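    Taking the abstract's headline figures at face value, a quick back-of-the-envelope calculation shows what an error rate of ∼1 event per 100,000 bases implies at genome scale. This is an illustration derived from the quoted numbers, not a result reported in the paper.

```python
# Headline figures quoted in the abstract (Build 35).
bases = 2.85e9            # nucleotides in the reported sequence
error_rate = 1 / 100_000  # ~1 event per 100,000 bases
gaps = 341

implied_errors = bases * error_rate
mean_stretch_mb = bases / (gaps + 1) / 1e6

print(f"implied residual errors: ~{implied_errors:,.0f}")              # ~28,500
print(f"average contiguous stretch between gaps: ~{mean_stretch_mb:.1f} Mb")
```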

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Get PDF
    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
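    For readers unfamiliar with the baselines mentioned above, the sketch below shows how a TF-IDF baseline of the kind evaluated here would rank candidate abstracts against a seed article. It uses scikit-learn and placeholder texts; the seed/candidate strings are assumptions for illustration, not RELISH data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder texts standing in for a seed abstract and candidate
# abstracts; the RELISH benchmark itself supplies the real pairs.
seed = "aptamer based electrochemical detection of glycated haemoglobin"
candidates = [
    "electrochemical aptasensor for haemoglobin detection",
    "draft sequence of the euchromatic human genome",
    "document similarity benchmark for biomedical literature search",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([seed] + candidates)

# Cosine similarity between the seed (row 0) and each candidate.
scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
for text, score in sorted(zip(candidates, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {text}")
```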

    Investor sentiment and paradigm shifts in equity premium forecasting

    No full text
    Ministry of Education, Singapore under its Academic Research Funding Tier

    The case for market inefficiency: Investment style and market pricing

    No full text

    Ferrocene-Labelled Electroactive Aptamer-Based Sensors (Aptasensors) for Glycated Haemoglobin

    No full text
    Glycated haemoglobin (HbA1c) is a diagnostic biomarker for type 2 diabetes. Traditional analytical methods for haemoglobin (Hb) detection rely on chromatography, which requires significant instrumentation and is labour-intensive; consequently, miniaturized devices that can rapidly sense HbA1c are urgently required. In this research, we report an aptamer-based sensor (aptasensor) for the rapid and selective electrochemical detection of HbA1c. Aptamers that specifically bind HbA1c and Hb were modified with a sulfhydryl and ferrocene group at the 3′- and 5′-end, respectively. The modified aptamers were coated through sulfhydryl-gold self-assembly onto screen-printed electrodes, producing aptasensors with built-in electroactivity. When haemoglobin was added to the electrodes, the current intensity of the ferrocene in the sensor system was reduced in a concentration-dependent manner, as determined by differential pulse voltammetry. In addition, electrochemical impedance spectroscopy confirmed selective binding of the analytes to the aptamer-coated electrode. This research offers new insight into the development of portable electrochemical sensors for the detection of HbA1c.
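    To illustrate the kind of concentration-dependent signal described above, the following sketch fits a simple Langmuir-type decay to made-up differential pulse voltammetry (DPV) peak currents. The model form, data values, and fitted parameters are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative DPV peak currents (not data from the paper): the
# ferrocene signal decreases as more HbA1c binds the aptamer layer.
conc_ug_ml = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])      # HbA1c concentration
peak_current_uA = np.array([5.1, 4.3, 3.8, 3.1, 2.5, 2.1])  # DPV peak current

def langmuir_decay(c, i0, i_min, kd):
    """Signal falls from i0 toward i_min as binding sites saturate."""
    return i0 - (i0 - i_min) * c / (kd + c)

params, _ = curve_fit(langmuir_decay, conc_ug_ml, peak_current_uA,
                      p0=[5.0, 2.0, 1.0])
i0, i_min, kd = params
print(f"baseline: {i0:.2f} uA, saturated: {i_min:.2f} uA, Kd ≈ {kd:.2f} µg/mL")
```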

    Proceedings of the 29th international conference on computational linguistics

    No full text
    Welcome to COLING 2022 – the 29th International Conference on Computational Linguistics. Held in Gyeongju, this is the first COLING in the Republic of Korea! We visited Gyeongju in 2016, as a site visit together with the local chair Key-Sun Choi, and were amazed by its beauty and its great historical significance. Our report to the ICCL was extremely positive. So here we are, delighted to be General Chairs of COLING in the beautiful capital of the old Silla Kingdom. COLING is organized under the auspices of the International Committee on Computational Linguistics (ICCL, https://ufal.mff.cuni.cz/iccl). The ICCL is a very special committee, with neither bylaws nor bank accounts. The sole function of the ICCL is to ensure that a COLING is held every two years and that the conference is not only scientifically robust but also conducive to the sharing of ideas and cultural experiences in a congenial and inclusive environment. COLING has evolved over the years, together with the changes in our field. But the mission of the ICCL to maintain the COLING “spirit” has never changed: we want COLING to be an inclusive conference that welcomes diversified participants and ideas. We also want to underline the fact that Language is what defines our field and the subject of our scientific inquiries. Thus, we pay special attention to works that help us understand Language, including its complexities, diversity, and robust reflection and facilitation of individual and collective human behaviors and actions. This is why the theme of COLING 2022 is NLP for the Grand Challenges of Our Time. We would like to highlight that, through effective processing of language big data, computational linguistics will play a crucial role in understanding the nature of the grand challenges, how people react to these challenges, and how to manage effective collective behaviors to tackle these challenges. Recall that for COLING the congenial and inclusive environment for exchanges of ideas is part of the gene of the conference that is as important as its scientific excellence. That’s why COLING has kept the tradition of the “excursion” that typically allows participants to be immersed in a new cultural or ecological environment. [excerpt]

    Non-Standard Errors

    Get PDF
    In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in sample estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors. To study them, we let 164 teams test six hypotheses on the same sample. We find that non-standard errors are sizeable, on par with standard errors. Their size (i) co-varies only weakly with team merits, reproducibility, or peer rating, (ii) declines significantly after peer feedback, and (iii) is underestimated by participants.
