554 research outputs found

    The value of publicly available, textual and non-textual information for startup performance prediction

    Can publicly available, web-scraped data be used to identify promising business startups at an early stage? To answer this question, we use textual and non-textual information about the names of Danish firms and their addresses as well as their business purpose statements (BPSs), supplemented by core accounting information along with founder and initial startup characteristics, to forecast the performance of newly started enterprises over a five-year horizon. The performance outcomes we consider are involuntary exit, above-average employment growth, a return on assets above 20 percent, new patent applications and participation in an innovation subsidy program. Our first key finding is that our models predict startup performance with either high or very high accuracy, with the exception of high returns on assets, where predictive power remains poor. Our second key finding is that the data requirements for predicting performance outcomes with such accuracy are low. To forecast the two innovation-related performance outcomes well, we only need to include a set of variables derived from the BPS texts, while an accurate prediction of startup survival and high employment growth needs the combination of (i) information derived from the names of the startups, (ii) data on elementary founder-related characteristics and (iii) either variables describing the initial characteristics of the startup (to predict startup survival) or business purpose statement information (to predict high employment growth). These sets of variables are easily obtainable since the underlying information is mandatory to report upon business registration. The substantial accuracy of our predictions for survival, employment growth, new patents and participation in innovation subsidy programs indicates ample scope for algorithmic scoring models as an additional pillar of funding and innovation support decisions.

    Asthmatics Exhibit Altered Oxylipin Profiles Compared to Healthy Individuals after Subway Air Exposure

    Asthma is a chronic inflammatory lung disease that causes significant morbidity and mortality worldwide. Air pollutants such as particulate matter (PM) and oxidants are important factors in causing exacerbations in asthmatics, and the source and composition of pollutants greatly affect pathological implications. This randomized crossover study investigated responses of the respiratory system to Stockholm subway air in asthmatics and healthy individuals. Eicosanoids and other oxylipins were quantified in the distal lung to provide a measure of shifts in lipid mediators in association with exposure to subway air relative to ambient air. Sixty-four oxylipins representing the cyclooxygenase (COX), lipoxygenase (LOX) and cytochrome P450 (CYP) metabolic pathways were screened using liquid chromatography-tandem mass spectrometry (LC-MS/MS) of bronchoalveolar lavage (BAL) fluid. Validations through immunocytochemistry staining of BAL cells were performed for 15-LOX-1, COX-1, COX-2 and peroxisome proliferator-activated receptor gamma (PPARγ). Multivariate statistics were employed to interrogate acquired oxylipin and immunocytochemistry data in combination with patient clinical information. Asthmatics and healthy individuals exhibited divergent oxylipin profiles following exposure to ambient and subway air. Significant changes were observed in 8 metabolites of linoleic and α-linolenic acid synthesized via the 15-LOX pathway, and of the COX product prostaglandin E2 (PGE2). Oxylipin levels were increased in healthy individuals following exposure to subway air, whereas asthmatics evidenced decreases or no change. Several of the altered oxylipins have known or suspected bronchoprotective or anti-inflammatory effects, suggesting a possible reduced anti-inflammatory response in asthmatics following exposure to subway air. These observations may have ramifications for sensitive subpopulations in urban areas.

    Explaining Support Vector Machines: A Color Based Nomogram.

    PROBLEM SETTING: Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Due to the large choice of kernels they can be applied with, a large variety of data can be analysed using these tools. Machine learning owes its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models. OBJECTIVE: In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions, each of which depends on one or at most two input variables. RESULTS: Our experiments on simulated and real-life data show that explainability of an SVM depends on the chosen parameter values (degree of polynomial kernel, width of RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable. CONCLUSIONS: This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels, an indication of the reliability of the approximation is presented. The complete methodology is available as an R package, and two apps and a movie are provided to illustrate the possibilities offered by the method.
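
    The claim that second-degree polynomial kernels are represented exactly can be checked numerically. The sketch below is not taken from the paper or its R package; it assumes a kernel of the form k(u, v) = (u·v + c)^2, and the function names and toy values are hypothetical. It expands the decision value into a constant, one contribution per input variable, and one contribution per pair of input variables, then compares their sum with the directly computed decision value.

```python
from itertools import combinations


def poly2_decision(x, support_vectors, coeffs, bias, c=1.0):
    """Decision value f(x) = bias + sum_i coeffs[i] * (sv_i . x + c)**2."""
    total = bias
    for sv, a in zip(support_vectors, coeffs):
        dot = sum(s * v for s, v in zip(sv, x))
        total += a * (dot + c) ** 2
    return total


def poly2_contributions(x, support_vectors, coeffs, bias, c=1.0):
    """Split f(x) into a constant, one term per input, and one term per input pair."""
    n = len(x)
    pairs_sv = list(zip(support_vectors, coeffs))
    # Constant part: the bias plus c**2 times the sum of the coefficients.
    constant = bias + c * c * sum(coeffs)
    # Terms that depend on a single input x[d] (linear and squared parts).
    single = [
        2 * c * sum(a * sv[d] for sv, a in pairs_sv) * x[d]
        + sum(a * sv[d] ** 2 for sv, a in pairs_sv) * x[d] ** 2
        for d in range(n)
    ]
    # Cross terms that depend on exactly two inputs x[d] and x[e], d < e.
    pairwise = {
        (d, e): 2 * sum(a * sv[d] * sv[e] for sv, a in pairs_sv) * x[d] * x[e]
        for d, e in combinations(range(n), 2)
    }
    return constant, single, pairwise


# Toy values, invented purely for illustration (not from the paper).
support_vectors = [[1.0, -2.0, 0.5], [0.0, 1.5, -1.0], [2.0, 0.0, 1.0]]
coeffs = [0.7, -1.2, 0.5]  # signed dual coefficients (alpha_i * y_i)
bias = -0.3
x = [0.4, -1.1, 2.0]

constant, single, pairwise = poly2_contributions(x, support_vectors, coeffs, bias)
reconstructed = constant + sum(single) + sum(pairwise.values())
direct = poly2_decision(x, support_vectors, coeffs, bias)
```

    Under these assumptions, `reconstructed` and `direct` agree up to floating-point rounding, which is the sense in which the decomposition is exact. Kernels of degree three or higher produce terms involving three or more variables, so a decomposition of this form can only approximate them.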

    Liver-Derived IGF-I Regulates Mean Life Span in Mice

    Background: Transgenic mice with low levels of global insulin-like growth factor-I (IGF-I) throughout their life span, including pre- and postnatal development, have increased longevity. This study investigated whether specific deficiency of liver-derived, endocrine IGF-I is of importance for life span. Methods and Findings: Serum IGF-I was reduced by approximately 80% in mice with adult, liver-specific IGF-I inactivation (LI-IGF-I-/- mice), and body weight decreased due to reduced body fat. The mean life span of LI-IGF-I-/- mice (n = 84) increased 10% vs. control mice (n = 137) (Cox’s test, p < 0.01), mainly due to increased life span (16%) of female mice [LI-IGF-I-/- mice (n = 31): 26.7 ± 1.1 vs. control (n = 67): 23.0 ± 0.7 months, p < 0.001]. Male LI-IGF-I-/- mice showed only a tendency for increased longevity (p = 0.10). Energy expenditure, measured as oxygen consumption during and after submaximal exercise, was increased in the LI-IGF-I-/- mice. Moreover, microarray and RT-PCR analyses showed consistent regulation of three genes (heat shock protein 1A and 1B and connective tissue growth factor) in several body organs in the LI-IGF-I-/- mice. Conclusions: Adult inactivation of liver-derived, endocrine IGF-I resulted in moderately increased mean life span. Body weight and body fat decreased in LI-IGF-I-/- mice, possibly due to increased energy expenditure during exercise. Genes earlier reported to modulate stress response and collagen aging showed consistent regulation, providing mechanisms tha

    Hospital mortality is associated with ICU admission time

    Previous studies have shown that patients admitted to the intensive care unit (ICU) after "office hours" are more likely to die. However, these results have been challenged by numerous other studies. We therefore analysed this possible relationship between ICU admission time and in-hospital mortality in The Netherlands. This article relates time of ICU admission to hospital mortality for all patients who were included in the Dutch national ICU registry (National Intensive Care Evaluation, NICE) from 2002 to 2008. We defined office hours as 08:00-22:00 hours during weekdays and 09:00-18:00 hours during weekend days. The weekend was defined as from Saturday 00:00 hours until Sunday 24:00 hours. We corrected hospital mortality for illness severity at admission using Acute Physiology and Chronic Health Evaluation II (APACHE II) score, reason for admission, admission type, age and gender. A total of 149,894 patients were included in this analysis. The relative risk (RR) for mortality outside office hours was 1.059 (1.031-1.088). Mortality varied with time but was consistently higher than expected during "off hours" and lower during office hours. There was no significant difference in mortality between the weekdays Monday to Thursday, but mortality increased slightly on Friday (RR 1.046; 1.001-1.092). During the weekend the RR was 1.103 (1.071-1.136) in comparison with the rest of the week. Hospital mortality in The Netherlands appears to be increased outside office hours and during the weekends, even when corrected for illness severity at admission. However, incomplete adjustment for certain confounders might still play an important role. Further research is needed to fully explain this difference.

    The FAIR Guiding Principles for scientific data management and stewardship

    There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders (representing academia, industry, funding agencies, and scholarly publishers) have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.