
    Outlier Detection in Heterogeneous Datasets using Automatic Tuple Expansion

    Rapidly developing areas of information technology are generating massive amounts of data. Human errors, sensor failures, and other unforeseen circumstances unfortunately tend to undermine the quality and consistency of these datasets by introducing outliers -- data points that exhibit surprising behavior when compared to the rest of the data. Characterizing, locating, and in some cases eliminating these outliers offers interesting insight about the data under scrutiny and reinforces the confidence that one may have in conclusions drawn from otherwise noisy datasets. In this paper, we describe a tuple expansion procedure which reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating point numbers. We then use this procedure as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples. We show that this novel approach achieves good classification performance, both in traditional numerical datasets and in highly non-numerical contexts such as mostly textual datasets. Our implementation is publicly available, under version 3 of the GNU General Public License
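
    The tuple expansion idea described above can be illustrated with a minimal sketch. This is not dBoost's actual code: the function names `expand_value` and `expand_tuple` and the specific expansion rules (parity, Unix-timestamp interpretation, string length and case) are assumptions chosen only to show how semantically poor values might be turned into richer features.

```python
from datetime import datetime, timezone


def expand_value(value):
    """Expand one raw field into derived features (hypothetical rules)."""
    if isinstance(value, bool):
        # bool is a subclass of int, so handle it first.
        return {"as_bool": value}
    if isinstance(value, int):
        features = {"value": value, "parity": value % 2}
        try:
            # Assumption: an integer may encode a Unix timestamp.
            ts = datetime.fromtimestamp(value, tz=timezone.utc)
            features["weekday"] = ts.weekday()
            features["hour"] = ts.hour
        except (OverflowError, OSError, ValueError):
            pass  # Not representable as a timestamp; keep numeric features.
        return features
    if isinstance(value, float):
        return {"value": value, "is_integral": value.is_integer()}
    if isinstance(value, str):
        return {
            "length": len(value),
            "is_upper": value.isupper(),
            "has_digit": any(ch.isdigit() for ch in value),
        }
    return {}


def expand_tuple(row):
    """Expand every field of a database tuple independently."""
    return [expand_value(field) for field in row]


expanded = expand_tuple((0, 2.5, "AB1"))
```

    A statistical model could then be fitted on the expanded features rather than on the raw values, so that, for example, a timestamp falling on an unusual weekday stands out even when its integer value does not.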

    Investigating the influence of pectin content and structure on its functionality in bio-flocculant extracted from okra

    © 2020 The Author(s). Okra extract is known to have potential application as a bio-flocculant for wastewater treatment. However, no research to date has given insight into the components responsible for the flocculating ability of okra extract or its flocculating mechanism. The work presented here addresses this knowledge gap, showing that pectin, especially its homogalacturonan (HGA) regions, appears to be the polysaccharide responsible for the flocculating ability of okra extract. The way pectin works in flocculation may be best explained by a polymer bridging mechanism. Specifically, a linear relationship was found between okra bio-flocculating ability and the weight ratio of the pectin homogalacturonan region to the rhamnogalacturonan-I region (HGA/RG-I) (y = 2.0x + 47.6, R² = 0.93, when GalA content > 300 mg/g extract), and this relationship was also validated using commercial citrus peel pectin.
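
    As a worked illustration of the reported fit, the sketch below evaluates y = 2.0x + 47.6 for a given HGA/RG-I weight ratio. The function name and the guard on GalA content are assumptions for illustration; the abstract does not state the units of y, so none are assumed here.

```python
def predicted_flocculating_ability(hga_rg1_ratio, gala_mg_per_g):
    """Evaluate the linear fit y = 2.0x + 47.6 quoted in the abstract.

    The fit is reported only for GalA content above 300 mg/g extract,
    so values outside that range are rejected (an illustrative choice).
    """
    if gala_mg_per_g <= 300:
        raise ValueError("fit reported only for GalA content > 300 mg/g extract")
    return 2.0 * hga_rg1_ratio + 47.6


# Hypothetical input values, not taken from the paper.
example = predicted_flocculating_ability(hga_rg1_ratio=1.5, gala_mg_per_g=350)
```

    With R² = 0.93, the line explains most but not all of the observed variation, so such a prediction is an estimate rather than an exact value.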

    Quantification of human C1 esterase inhibitor protein using an automated turbidimetric immunoassay

    BACKGROUND: Impaired levels or function of C1 inhibitor (C1-INH) results in angioedema due to increased bradykinin. It is important to distinguish between angioedema related to C1-INH deficiency and that caused by other mechanisms, as treatment options are different. In hereditary (HAE) and acquired (AAE) angioedema, C1-INH concentration is measured to aid patient diagnosis. Here, we describe an automated turbidimetric assay to measure C1-INH concentration on the Optilite® analyzer. METHODS: Linearity, precision, and interference were established over a range of C1-INH concentrations. The 95th percentile reference interval was generated from 120 healthy adult donors. To compare the Optilite C1-INH assay with a predicate assay used in a clinical laboratory, samples sent for C1-INH investigation were used. The predicate results were provided to allow comparison. RESULTS: The Optilite C1-INH assay was linear across the measuring range at the standard sample dilution. Intra- and interassay variability was <6%. The 95th percentile adult reference interval for the assay was 0.21-0.38 g/L. There was a strong correlation between the Optilite concentrations and those generated with the predicate assay (R² = 0.94, P < 0.0001, slope y = 0.83x). All patients with Type I HAE (n = 24) and AAE (n = 3) tested had concentrations below the measuring range in both assays, while all patients with unspecified angioedema (UAE), that is, not diagnosed with HAE or AAE, had values within the reference range. CONCLUSION: The Optilite assay allows the automated and precise quantification of C1-INH concentrations in patient samples. It could therefore be used as a tool to aid the investigation of patients with angioedema.

    The Fourteenth Data Release of the Sloan Digital Sky Survey: First Spectroscopic Data from the extended Baryon Oscillation Spectroscopic Survey and from the second phase of the Apache Point Observatory Galactic Evolution Experiment

    The fourth generation of the Sloan Digital Sky Survey (SDSS-IV) has been in operation since July 2014. This paper describes the second data release from this phase, and the fourteenth from SDSS overall (making this Data Release Fourteen, or DR14). This release makes public data taken by SDSS-IV in its first two years of operation (July 2014-2016). Like all previous SDSS releases, DR14 is cumulative, including the most recent reductions and calibrations of all data taken by SDSS since the first phase began operations in 2000. New in DR14 is the first public release of data from the extended Baryon Oscillation Spectroscopic Survey (eBOSS); the first data from the second phase of the Apache Point Observatory (APO) Galactic Evolution Experiment (APOGEE-2), including stellar parameter estimates from an innovative data driven machine learning algorithm known as "The Cannon"; and almost twice as many data cubes from the Mapping Nearby Galaxies at APO (MaNGA) survey as were in the previous release (N = 2812 in total). This paper describes the location and format of the publicly available data from SDSS-IV surveys. We provide references to the important technical papers describing how these data have been taken (both targeting and observation details) and processed for scientific use. The SDSS website (www.sdss.org) has been updated for this release, and provides links to data downloads, as well as tutorials and examples of data use. SDSS-IV is planning to continue to collect astronomical data until 2020, and will be followed by SDSS-V. Comment: SDSS-IV collaboration alphabetical author data release paper. DR14 happened on 31st July 2017. 19 pages, 5 figures. Accepted by ApJS on 28th Nov 2017 (this is the "post-print" and "post-proofs" version; minor corrections only from v1, and most of errors found in proofs corrected).

    The effects of juvenile stress on anxiety, cognitive bias and decision making in adulthood: a rat model

    Stress experienced in childhood is associated with an increased risk of developing psychiatric disorders in adulthood. These disorders are particularly characterized by disturbances to emotional and cognitive processes, which are not currently fully modeled in animals. Assays of cognitive bias have recently been used with animals to give an indication of their emotional/cognitive state. We used a cognitive bias test, alongside a traditional measure of anxiety (elevated plus maze), to investigate the effects of juvenile stress (JS) on adulthood behaviour using a rodent model. During the cognitive bias test, animals were trained to discriminate between two reward bowls based on a stimulus (rough/smooth sandpaper) encountered before they reached the bowls. One stimulus (e.g. rough) was associated with a lower value reward than the other (e.g. smooth). Once rats were trained, their cognitive bias was explored through the presentation of an ambiguous stimulus (intermediate grade sandpaper): a rat was classed as optimistic if it chose the bowl ordinarily associated with the high value reward. JS animals were lighter than controls, exhibited increased anxiety-like behaviour in the elevated plus maze and were more optimistic in the cognitive bias test. This increased optimism may represent an optimal foraging strategy for these underweight animals. JS animals were also faster than controls to make a decision when presented with an ambiguous stimulus, suggesting altered decision making. These results demonstrate that stress in the juvenile phase can increase anxiety-like behaviour and alter cognitive bias and decision making in adulthood in a rat model

    AVURT: aspirin versus placebo for the treatment of venous leg ulcers – a Phase II pilot randomised controlled trial

    Background: Venous leg ulcers (VLUs) are the most common cause of leg ulceration, affecting 1 in 100 adults. VLUs may take many months to heal (25% fail to heal). Estimated prevalence is between 1% and 3% of the elderly population. Compression is the mainstay of treatment and few additional therapies exist to improve healing. Two previous trials have indicated that low-dose aspirin, as an adjunct to standard care, may improve healing time, but these trials were insufficiently robust. Aspirin is an inexpensive, widely used medication but its safety and efficacy in the treatment of VLUs remains to be established. Objectives: Primary objective – to assess the effects of 300 mg of aspirin (daily) versus placebo on the time to healing of the reference VLU. Secondary objectives – to assess the feasibility of leading into a larger pragmatic Phase III trial and the safety of aspirin in this population. Design: A multicentred, pilot, Phase II randomised double-blind, parallel-group, placebo-controlled efficacy trial. Setting: Community leg ulcer clinics or services, hospital outpatient clinics, leg ulcer clinics, tissue viability clinics and wound clinics in England, Wales and Scotland. Participants: Patients aged ≥ 18 years with a chronic VLU (i.e. the VLU is > 6 weeks in duration or the patient has a history of VLU) and who are not regularly taking aspirin. Interventions: 300 mg of daily oral aspirin versus placebo. All patients were offered care in accordance with Scottish Intercollegiate Guidelines Network (SIGN) guidance with multicomponent compression therapy aiming to deliver 40 mmHg at the ankle when possible. Randomisation: Participants were allocated in a 1 : 1 (aspirin : placebo) ratio by the Research Pharmacy, St George’s University Hospitals NHS Foundation Trust, using a randomisation schedule generated in advance by the investigational medicinal product manufacturer. Randomisation was stratified according to ulcer size (≤ 5 cm² or > 5 cm²). Main outcome measure: The primary outcome was time to healing of the largest eligible ulcer (reference ulcer). Feasibility results – recruitment: 27 patients were recruited from eight sites over a period of 8 months. The target of 100 patients was not achieved and two sites did not recruit. Barriers to recruitment included a short recruitment window and a large proportion of participants failing to meet the eligibility criteria. Results: The average age of the 27 randomised participants (placebo, n = 13; aspirin, n = 14) was 62 years (standard deviation 13 years), and two-thirds were male (n = 18). Participants had their reference ulcer for a median of 15 months, and the median size of ulcer was 17.1 cm². There was no evidence of a difference in time to healing of the reference ulcer between groups in an adjusted analysis for log-ulcer area and duration (hazard ratio 0.58, 95% confidence interval 0.18 to 1.85; p = 0.357). One expected, related serious adverse event was recorded for a participant in the aspirin group. Limitations: The trial under-recruited because many patients did not meet the eligibility criteria. Conclusions: There was no evidence that aspirin was efficacious in hastening the healing of chronic VLUs. It can be concluded that a larger Phase III (effectiveness) trial would not be feasible. Trial registration: ClinicalTrials.gov NCT02333123; European Clinical Trials Database (EudraCT) 2014-003979-39. Funding: This project was funded by the National Institute for Health Research (NIHR) Health Technology Assessment programme and will be published in full in Health Technology Assessment; Vol. 22, No. 55. See the NIHR Journals Library website for further project information.

    Challenges Using Extrapolated Family-level Macroinvertebrate Metrics in Moderately Disturbed Tropical Streams: a Case-study From Belize

    Family-level biotic metrics were originally designed to rapidly assess gross organic pollution effects, but came to be regarded as general measures of stream degradation. Improvements in water quality in developed countries have reignited debate about the limitations of family-level taxonomy to detect subtle change, and are resulting in a shift back towards generic and species-level analysis to assess smaller effects. Although the scale of pollution characterizing past condition of streams in developed countries persists in many developing regions, some areas are still considered to be only moderately disturbed. We sampled streams in Belize to investigate the ability of family-level macroinvertebrate metrics to detect change in stream catchments where less than 30% of forest had been cleared. Where disturbance did not co-vary with natural gradients of change, and in areas characterized by low intensity activities, none of the metrics tested detected significant change, despite evidence of environmental impacts. We highlight the need for further research to clarify the response of metrics to disturbance over a broader study area that allows replication for confounding sources of natural variation. We also recommend research to develop a more detailed understanding of the taxonomy and ecology of Neotropical macroinvertebrates to improve the robustness of metric use.

    Final Targeting Strategy for the SDSS-IV APOGEE-2N Survey

    APOGEE-2 is a dual-hemisphere, near-infrared (NIR), spectroscopic survey with the goal of producing a chemo-dynamical mapping of the Milky Way Galaxy. The targeting for APOGEE-2 is complex and has evolved with time. In this paper, we present the updates and additions to the initial targeting strategy for APOGEE-2N presented in Zasowski et al. (2017). These modifications come in two implementation modes: (i) "Ancillary Science Programs" competitively awarded to SDSS-IV PIs through proposal calls in 2015 and 2017 for the pursuit of new scientific avenues outside the main survey, and (ii) an effective 1.5-year expansion of the survey, known as the Bright Time Extension, made possible through accrued efficiency gains over the first years of the APOGEE-2N project. For the 23 distinct ancillary programs, we provide descriptions of the scientific aims, target selection, and how to identify these targets within the APOGEE-2 sample. The Bright Time Extension permitted changes to the main survey strategy, the inclusion of new programs in response to scientific discoveries or to exploit major new datasets not available at the outset of the survey design, and expansions of existing programs to enhance their scientific success and reach. After describing the motivations, implementation, and assessment of these programs, we also leave a summary of lessons learned from nearly a decade of APOGEE-1 and APOGEE-2 survey operations. A companion paper, Santana et al. (submitted), provides a complementary presentation of targeting modifications relevant to APOGEE-2 operations in the Southern Hemisphere. Comment: 59 pages; 11 Figures; 7 Tables; 2 Appendices; Submitted to Journal and Under Review; Posting to accompany papers using the SDSS-IV/APOGEE-2 Data Release 17 scheduled for December 202

    Sloan Digital Sky Survey IV: mapping the Milky Way, nearby galaxies, and the distant universe

    We describe the Sloan Digital Sky Survey IV (SDSS-IV), a project encompassing three major spectroscopic programs. The Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2) is observing hundreds of thousands of Milky Way stars at high resolution and high signal-to-noise ratios in the near-infrared. The Mapping Nearby Galaxies at Apache Point Observatory (MaNGA) survey is obtaining spatially resolved spectroscopy for thousands of nearby galaxies (median ). The extended Baryon Oscillation Spectroscopic Survey (eBOSS) is mapping the galaxy, quasar, and neutral gas distributions between and 3.5 to constrain cosmology using baryon acoustic oscillations, redshift space distortions, and the shape of the power spectrum. Within eBOSS, we are conducting two major subprograms: the SPectroscopic IDentification of eROSITA Sources (SPIDERS), investigating X-ray AGNs and galaxies in X-ray clusters, and the Time Domain Spectroscopic Survey (TDSS), obtaining spectra of variable sources. All programs use the 2.5 m Sloan Foundation Telescope at the Apache Point Observatory; observations there began in Summer 2014. APOGEE-2 also operates a second near-infrared spectrograph at the 2.5 m du Pont Telescope at Las Campanas Observatory, with observations beginning in early 2017. Observations at both facilities are scheduled to continue through 2020. In keeping with previous SDSS policy, SDSS-IV provides regularly scheduled public data releases; the first one, Data Release 13, was made available in 2016 July