Big Data of Materials Science - Critical Role of the Descriptor
Statistical learning of materials properties or functions so far starts with
a largely silent, unchallenged step: the choice of the set of descriptive
parameters (termed the descriptor). However, when the scientific connection
between the descriptor and the actuating mechanisms is unclear, the causality
of the learned descriptor-property relation is uncertain. Trustworthy
prediction of new promising materials, identification of anomalies, and
scientific advancement are then doubtful. We analyse this issue and define
requirements for a suitable descriptor. For a classic example, the energy
difference between the zincblende/wurtzite and rocksalt structures of
semiconductors, we demonstrate how a meaningful descriptor can be found
systematically. Comment: Accepted to Phys. Rev. Lett.
Learning physical descriptors for materials science by compressed sensing
The availability of big data in materials science offers new routes for
analyzing materials properties and functions and achieving scientific
understanding. Finding structure in these data that is not directly visible
with standard tools, and exploiting the scientific information they contain,
requires new and dedicated methodology based on approaches from statistical
learning, compressed sensing, and other recent methods from applied
mathematics, computer science, statistics, signal processing, and information
science. In this paper, we explain and demonstrate a compressed-sensing-based
methodology for feature selection, specifically for discovering physical
descriptors, i.e., physical parameters that describe the material and its
properties of interest, and associated equations that explicitly and
quantitatively describe those relevant properties. As a showcase application
and proof of concept, we describe how to build a physical model for the
quantitative prediction of the crystal structure of binary compound
semiconductors.
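The abstract above describes descriptor discovery as a sparse-recovery problem: out of many candidate physical parameters, only a few should suffice to model the property of interest. As a minimal sketch of that idea, the following pure-NumPy example uses orthogonal matching pursuit, a standard greedy sparse-recovery algorithm, to pick the few informative columns out of a large candidate-feature matrix. All data, dimensions, and the choice of algorithm here are illustrative assumptions, not the paper's actual method or dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 200 candidate features (columns), of which only 3
# actually determine the target property -- the sparse-recovery premise.
n_samples, n_features, sparsity = 100, 200, 3
X = rng.standard_normal((n_samples, n_features))
true_support = [5, 40, 120]
coef = np.zeros(n_features)
coef[true_support] = [2.0, -1.5, 1.0]
y = X @ coef + 0.01 * rng.standard_normal(n_samples)


def omp(X, y, k):
    """Orthogonal matching pursuit: greedily select k columns of X."""
    residual = y.copy()
    support = []
    for _ in range(k):
        # Pick the remaining feature most correlated with the residual.
        corr = np.abs(X.T @ residual)
        corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Re-fit least squares on the current support and update residual.
        sol, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ sol
    support = sorted(support)
    weights, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    return support, weights


support, weights = omp(X, y, sparsity)
print(support, weights)  # should typically recover the informative columns
```

With a low noise level and incoherent random features, the greedy selection recovers the true support; in the materials setting, the columns would be candidate descriptors built from atomic properties rather than random numbers.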
Function spaces with dominating mixed smoothness
We study several techniques which are well known in the case of Besov and Triebel-Lizorkin spaces and extend them to spaces with dominating mixed smoothness. We use the ideas of Triebel to prove three important decomposition theorems. We deal with so-called atomic, subatomic and wavelet decompositions. All these theorems have much in common. Roughly speaking, they say that a function belongs to some function space if, and only if, it can be decomposed into a sum of products of coefficients and corresponding building blocks, where the coefficients belong to an appropriate sequence space. These decomposition theorems establish a very useful connection between function and sequence spaces. We use them in the study of the decay of entropy numbers of compact embeddings between two function spaces of dominating mixed smoothness, reducing this problem to the same question on the sequence-space level. The considered scales cover many important specific spaces (Sobolev, Zygmund, Besov), and we obtain generalisations of respective assertions of Belinsky, Dinh Dung and Temlyakov.
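In schematic notation (the specific function spaces, index sets, and building blocks are those of the paper's decomposition theorems; the symbols below are generic placeholders, not the paper's exact statement), such decomposition theorems take the form:

```latex
f \in A \quad\Longleftrightarrow\quad
f \;=\; \sum_{\nu,\, m} \lambda_{\nu m}\, a_{\nu m}
\quad\text{with}\quad
\bigl\| (\lambda_{\nu m}) \mid a \bigr\| \;<\; \infty,
```

where the $a_{\nu m}$ are the building blocks (atoms, subatomic quarks, or wavelets), $A$ is the function space, and $a$ is the associated sequence space; the equivalence of norms is what allows entropy-number questions to be transferred to the sequence-space level.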
Sparse Proteomics Analysis - A compressed sensing-based approach for feature selection and classification of high-dimensional proteomics mass spectrometry data
Background: High-throughput proteomics techniques, such as mass spectrometry
(MS)-based approaches, produce very high-dimensional data-sets. In a clinical
setting one is often interested in how mass spectra differ between patients of
different classes, for example spectra from healthy patients vs. spectra from
patients having a particular disease. Machine learning algorithms are needed to
(a) identify these discriminating features and (b) classify unknown spectra
based on this feature set. Since the acquired data is usually noisy, the
algorithms should be robust against noise and outliers, while the identified
feature set should be as small as possible.
Results: We present a new algorithm, Sparse Proteomics Analysis (SPA), based
on the theory of compressed sensing that allows us to identify a minimal
discriminating set of features from mass spectrometry data-sets. We show (1)
how our method performs on artificial and real-world data-sets, (2) that its
performance is competitive with standard (and widely used) algorithms for
analyzing proteomics data, and (3) that it is robust against random and
systematic noise. We further demonstrate the applicability of our algorithm to
two previously published clinical data-sets.
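SPA itself is not reproduced here; as a simplified sketch of the pipeline the abstract describes (select a small discriminating feature set from high-dimensional, noisy spectra, then classify on those features), the following example screens synthetic "spectra" for a handful of informative bins and classifies with a nearest-centroid rule. The data, the bin indices, and the screening statistic (a standardized mean difference rather than SPA's compressed-sensing step) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for MS spectra: 4000 intensity bins per sample,
# of which only 5 carry the class difference; the rest is noise.
n_per_class, n_bins = 40, 4000
informative = [100, 900, 1500, 2600, 3900]
shift = np.zeros(n_bins)
shift[informative] = 3.0

healthy = rng.standard_normal((n_per_class, n_bins))
disease = rng.standard_normal((n_per_class, n_bins)) + shift
X = np.vstack([healthy, disease])
y = np.array([0] * n_per_class + [1] * n_per_class)

# Sparse screening: keep the k bins with the largest standardized mean
# difference between classes (a simplified proxy for SPA's selection step).
k = 5
diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
pooled_sd = np.sqrt(0.5 * (X[y == 1].var(axis=0) + X[y == 0].var(axis=0)))
score = np.abs(diff) / (pooled_sd + 1e-12)
selected = np.sort(np.argsort(score)[-k:])

# Nearest-centroid classifier restricted to the selected bins.
c0 = X[y == 0][:, selected].mean(axis=0)
c1 = X[y == 1][:, selected].mean(axis=0)


def classify(spectrum):
    s = spectrum[selected]
    return int(np.sum((s - c1) ** 2) < np.sum((s - c0) ** 2))


preds = np.array([classify(x) for x in X])
accuracy = (preds == y).mean()
print(selected, accuracy)
```

Because only a few bins are kept, the classifier is both interpretable (the selected bins are candidate biomarkers) and robust to the noise spread across the thousands of uninformative bins, which is the design goal the abstract states.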
