Mathematics, Statistics and Data Science
The process of extracting information from data has a long history (see, for example, [1]) stretching back over centuries. Because of the proliferation of data over the last few decades, and projections for its continued proliferation over coming decades, the term Data Science has emerged to describe the substantial current intellectual effort around research with the same overall goal, namely that of extracting information. The type of data currently available in all sorts of application domains is often massive in size, very heterogeneous and far from being collected under designed or controlled experimental conditions. Nonetheless, it contains information, often substantial information, and data science requires new interdisciplinary approaches to make maximal use of this information. Data alone is typically not that informative and (machine) learning from data needs conceptual frameworks. Mathematics and statistics are crucial for providing such conceptual frameworks. The frameworks enhance the understanding of fundamental phenomena, highlight limitations and provide a formalism for properly founded data analysis, information extraction and quantification of uncertainty, as well as for the analysis and development of algorithms that carry out these key tasks. In this personal commentary on data science and its relations to mathematics and statistics, we highlight three important aspects of the emerging field: Models, High-Dimensionality and Heterogeneity, and then conclude with a brief discussion of where the field is now and implications for the mathematical sciences
Toward a unified theory of sparse dimensionality reduction in Euclidean space
Let $\Phi \in \mathbb{R}^{m \times n}$ be a sparse Johnson-Lindenstrauss transform [KN14] with $s$ non-zeroes per column. For a subset $T$ of the unit sphere, $\varepsilon \in (0, 1/2)$ given, we study settings for $m, s$ required to ensure $\mathbb{E}_\Phi \sup_{x \in T} \left| \|\Phi x\|_2^2 - 1 \right| < \varepsilon$, i.e. so that $\Phi$ preserves the norm of every $x \in T$ simultaneously and multiplicatively up to $1 + \varepsilon$. We introduce a new complexity parameter, which depends on the geometry of $T$, and show that it suffices to choose $s$ and $m$ such that this parameter is small. Our result is a sparse analog of Gordon's theorem, which was concerned with a dense $\Phi$ having i.i.d. Gaussian entries. We qualitatively unify several results related to the Johnson-Lindenstrauss lemma, subspace embeddings, and Fourier-based restricted isometries. Our work also implies new results in using the sparse Johnson-Lindenstrauss transform in numerical linear algebra, classical and model-based compressed sensing, manifold learning, and constrained least squares problems such as the Lasso
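As a rough illustration of the object this abstract studies, the sketch below builds a sparse random map with exactly s non-zero entries per column, each equal to plus or minus 1/sqrt(s), and applies it to a vector. This is only one simple way to build such a matrix, assumed here for illustration (s distinct rows drawn uniformly per column); it is not claimed to be the exact construction of [KN14]. The function names `sparse_jl_columns`, `apply_sparse_jl` and `sq_norm` are hypothetical.

```python
import math
import random


def sparse_jl_columns(m, n, s, seed=0):
    """Describe an m x n sparse sign matrix column by column.

    Each column is a list of s (row_index, value) pairs with distinct rows
    and values +/- 1/sqrt(s), so every column has unit Euclidean norm.
    """
    if not 1 <= s <= m:
        raise ValueError("need 1 <= s <= m")
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(n):
        rows = rng.sample(range(m), s)  # s distinct row positions
        columns.append([(r, rng.choice((-1.0, 1.0)) * scale) for r in rows])
    return columns


def apply_sparse_jl(columns, m, x):
    """Compute y = Phi x, touching only the s non-zeroes of each column."""
    y = [0.0] * m
    for j, xj in enumerate(x):
        if xj == 0.0:
            continue
        for r, v in columns[j]:
            y[r] += v * xj
    return y


def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(t * t for t in v)


if __name__ == "__main__":
    m, n, s = 64, 1000, 8
    rng = random.Random(1)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    length = math.sqrt(sq_norm(x))
    x = [t / length for t in x]  # a point on the unit sphere
    phi = sparse_jl_columns(m, n, s)
    # Distortion | ||Phi x||^2 - 1 | for this single x.
    print(abs(sq_norm(apply_sparse_jl(phi, m, x)) - 1.0))
```

Applying the map costs on the order of s operations per non-zero coordinate of x, which is the practical appeal of sparsity; how large m and s must be for a whole set of vectors is the question the abstract addresses.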
Internationalisation of Swiss academic elites in the 20th century: convergences and contrasts
Drawing on an original database of law and economics professors at Swiss universities across the whole of the 20th century, this article examines the various dynamics of internationalisation among these elites. Three main lessons can be drawn from our analyses. First, from a diachronic point of view, the 20th century can be divided into three historical phases: strong internationality of academic elites at the beginning of the century, a nationalisation or "relocalisation" following the First World War, then a "re-internationalisation" from the 1960s onwards, accelerating since the 1980s. Second, economics professors, in terms of nationality or place of training, are more cosmopolitan and less locally anchored than their counterparts in law. Finally, the Germanic predominance among professors at Swiss universities at the beginning of the century, which is explained as much by an internationality of "excellence" as by one of "proximity", gives way, especially in economics, to a rising influence of the United States, revealing an erosion of the internationality of "proximity"
Resource use and outcome in critically ill patients with hematological malignancy: a retrospective cohort study
INTRODUCTION: The paucity of data on resource use in critically ill patients with hematological malignancy and on these patients' perceived poor outcome can lead to uncertainty over the extent to which intensive care treatment is appropriate. The aim of the present study was to assess the amount of intensive care resources needed for, and the effect of treatment of, hemato-oncological patients in the intensive care unit (ICU) in comparison with a nononcological patient population with a similar degree of organ dysfunction. METHODS: A retrospective cohort study of 101 ICU admissions of 84 consecutive hemato-oncological patients and 3,808 ICU admissions of 3,478 nononcological patients over a period of 4 years was performed. RESULTS: As assessed by Therapeutic Intervention Scoring System points, resource use was higher in hemato-oncological patients than in nononcological patients (median (interquartile range), 214 (102 to 642) versus 95 (54 to 224), P < 0.0001). Severity of disease at ICU admission was a less important predictor of ICU resource use than necessity for specific treatment modalities. Hemato-oncological patients and nononcological patients with similar admission Simplified Acute Physiology Score scores had the same ICU mortality. In hemato-oncological patients, improvement of organ function within the first 48 hours of the ICU stay was the best predictor of 28-day survival. CONCLUSION: The presence of a hemato-oncological disease per se is associated with higher ICU resource use, but not with increased mortality. If withdrawal of treatment is considered, this decision should not be based on admission parameters but rather on the evolutional changes in organ dysfunctions
ACS Applied Materials & Interfaces
Key parameters that influence the specific energy of electrochemical double-layer capacitors (EDLCs) are the double-layer capacitance and the operating potential of the cell. The operating potential of the cell is generally limited by the electrochemical window of the electrolyte solution, that is, the range of applied voltages within which the electrolyte or solvent is not reduced or oxidized. Ionic liquids are of interest as electrolytes for EDLCs because they offer relatively wide potential windows. Here, we provide a systematic study of the influence of the physical properties of ionic liquid electrolytes on the electrochemical stability and electrochemical performance (double-layer capacitance, specific energy) of EDLCs that employ a mesoporous carbon model electrode with uniform, highly interconnected mesopores (3DOm carbon). Several ionic liquids with structurally diverse anions (tetrafluoroborate, trifluoromethanesulfonate, trifluoromethanesulfonimide) and cations (imidazolium, ammonium, pyridinium, piperidinium, and pyrrolidinium) were investigated. We show that the cation size has a significant effect on the electrolyte viscosity and conductivity, as well as the capacitance of EDLCs. Imidazolium- and pyridinium-based ionic liquids provide the highest cell capacitance, and ammonium-based ionic liquids offer potential windows much larger than imidazolium and pyridinium ionic liquids. Increasing the chain length of the alkyl substituents in 1-alkyl-3-methylimidazolium trifluoromethanesulfonimide does not widen the potential window of the ionic liquid. We identified the ionic liquids that maximize the specific energies of EDLCs through the combined effects of their potential windows and the double-layer capacitance. 
The highest specific energies are obtained with ionic liquid electrolytes that possess moderate electrochemical stability, small ionic volumes, low viscosity, and hence high conductivity, the best performing ionic liquid tested being 1-ethyl-3-methylimidazolium bis(trifluoromethylsulfonyl)imide
A comparison of block and semi-parametric bootstrap methods for variance estimation in spatial statistics
Efron (1979) introduced the bootstrap method for independent data, but it cannot easily be applied to spatial data because of their dependence. For spatial data that are correlated according to their locations in the underlying space, the moving block bootstrap is usually used to estimate precision measures of estimators. The precision of moving block bootstrap estimators depends on the block size, which is difficult to select. The moving block bootstrap also underestimates the variance. In this paper, we first use the semi-parametric bootstrap to estimate precision measures of estimators in spatial data analysis; this method relies on an estimate of the spatial correlation structure. We then compare the semi-parametric bootstrap with the moving block bootstrap for variance estimation in a simulation study. Finally, we use the semi-parametric bootstrap to analyze the coal-ash data
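To make the moving block bootstrap mentioned in this abstract concrete, here is a minimal sketch for a one-dimensional sequence. This is a simplifying assumption: the paper concerns spatial data, where blocks would be two-dimensional regions rather than runs of consecutive values. The function name `moving_block_bootstrap_variance` and its parameters are hypothetical, and the statistic defaults to the sample mean purely for illustration.

```python
import random
import statistics


def moving_block_bootstrap_variance(data, block_len, n_boot=500,
                                    stat=statistics.fmean, seed=0):
    """Estimate Var(stat) by resampling overlapping blocks of consecutive values.

    Blocks keep short-range dependence intact inside each block, which the
    ordinary (independent-data) bootstrap would destroy.
    """
    n = len(data)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(data)")
    rng = random.Random(seed)
    starts = range(n - block_len + 1)   # every overlapping block start
    n_blocks = -(-n // block_len)       # ceil(n / block_len)
    replicates = []
    for _ in range(n_boot):
        sample = []
        for _ in range(n_blocks):
            start = rng.choice(starts)
            sample.extend(data[start:start + block_len])
        replicates.append(stat(sample[:n]))  # trim to the original length
    return statistics.variance(replicates)


if __name__ == "__main__":
    rng = random.Random(42)
    # Simple autocorrelated series: each value leans on the previous one.
    series = [0.0]
    for _ in range(199):
        series.append(0.7 * series[-1] + rng.gauss(0.0, 1.0))
    for b in (1, 5, 20):
        print(b, moving_block_bootstrap_variance(series, b))
```

Running the demo with several block lengths shows that the estimate changes with the block length, which is the selection difficulty the abstract points out; `block_len=1` reduces to the ordinary bootstrap for independent data.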