Probabilistic waveform inversion: Quest for the law
Full-waveform inversion (FWI) is an algorithm (and a part of the measuring procedure in a wide sense) with the aim to find the governing law of a physical system using the partially measured physical fields with limited computational resources. A law is a forward theory equipped with the model parameters and the data parameters. The main characteristic of the law is the realizability assumption: the law explains all subsets of the measured data parameters and predicts all subsets of the unmeasured (in the given experiment) data parameters. To find the law, we have to guess a law (a forward theory and parametrization), measure some data parameters and check the realizability assumption.
To put it more precisely, I formulate a new probabilistic setting for inverse problems and full-waveform inversion. Instead of using Bayes' theorem, the Tarantola-Valette conjunction or the principle of maximum entropy based on prior information for the averaged quantities, I propose a principle of minimum relative information using prior information for the non-averaged quantities. The Tarantola-Valette formula is obtained as a special case under the assumption that the theoretical and prior measures exist. Using the realizability assumption as prior information, the principle of minimum relative information provides the parametric probabilistic solution with arbitrary misfit functions. Maximization of the parametric probabilistic solution leads to a multiobjective minimization problem. All global Pareto optima are sample points of the probabilistic solution with the highest values of the volumetric measure. Unfortunately, even a local multiobjective minimization problem is computationally intractable for FWI with many millions of model parameters.
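As an illustration of the Pareto optimality mentioned above, the following minimal sketch filters a small set of candidate models by dominance, treating each single-shot misfit as one objective. It is not taken from the thesis: the function names and the misfit values are hypothetical, and a brute-force filter like this is only feasible for a handful of candidates, not for large-scale FWI.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if misfit vector a is no worse than b in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_optimal(misfits: List[Sequence[float]]) -> List[int]:
    """Indices of the candidates that no other candidate dominates."""
    return [
        i
        for i, m in enumerate(misfits)
        if not any(dominates(other, m) for j, other in enumerate(misfits) if j != i)
    ]


# Hypothetical misfit values: one row per candidate model,
# one column per shot gather (objective).
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = pareto_optimal(candidates)
print(front)  # [0, 1, 3]: candidate 2 is dominated by candidate 1
```

Candidates 0, 1 and 3 trade one shot's misfit against the other, so none of them can be preferred without additional information.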
To make it computationally attractive for large-scale FWI and to find at least a few local solutions of the multiobjective minimization problem, I implement the bilevel multiobjective waveform inversion (BMWI) using a single randomly chosen shot gather at each iteration. BMWI is a stochastic, nested algorithm with an adaptive parabolic line search and a multiscale strategy. The computational cost per iteration is only five forward modellings. BMWI can worsen some of the single-shot misfit functions, and different random runs of BMWI converge to different points in the model manifold. I interpret these inverted models as sample points of the probabilistic solution. I estimate the solution, uncertainty and sensitivity using the sample estimates of the mean, standard deviation and initial deviation of the sample points, respectively. Using numerical examples with the Marmousi-2 model, I illustrate the potential of BMWI for automatic uncertainty and sensitivity analysis with just two or three sample points.
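The sample estimates named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: each sample point is taken to be a flat list of model parameters (for example, velocities per grid cell), and the "initial deviation" is assumed here to be the root-mean-square difference between each inverted model and its starting model, since the abstract does not define it. All names and values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sample_statistics(
    inverted: Sequence[Sequence[float]],
    starting: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float], List[float]]:
    """Per-cell mean, sample standard deviation and (assumed) initial deviation."""
    n = len(inverted)
    if n < 2 or len(starting) != n:
        raise ValueError("need at least two sample points, each with a starting model")
    mean, std, init_dev = [], [], []
    for k in range(len(inverted[0])):
        values = [m[k] for m in inverted]
        mu = sum(values) / n
        # Unbiased sample variance across the inverted models.
        var = sum((v - mu) ** 2 for v in values) / (n - 1)
        # Assumed definition: RMS change from starting to inverted model.
        dev = sum((m[k] - s[k]) ** 2 for m, s in zip(inverted, starting)) / n
        mean.append(mu)
        std.append(math.sqrt(var))
        init_dev.append(math.sqrt(dev))
    return mean, std, init_dev


# Hypothetical velocities in m/s for three sample points and two grid cells.
inverted = [[1500.0, 2100.0], [1500.0, 2300.0], [1500.0, 2200.0]]
starting = [[1500.0, 2000.0], [1500.0, 2000.0], [1500.0, 2000.0]]
mean, std, init_dev = sample_statistics(inverted, starting)
```

In this toy case the first cell never moves from its starting value, so both its standard deviation and its initial deviation are zero: the runs agree there only because the data did not change the parameter, which is why a sensitivity measure is reported alongside the uncertainty.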
To test the idea with real-world data, I apply stochastic single-shot BMWI in a 2D acoustic finite-difference approximation to a 2D line of pressure data acquired in a shallow-water river delta with ocean bottom cables. I use minimal data preprocessing (only a new 3D-to-2D transform, which is strictly valid in a linear-gradient medium), linear-gradient starting models and diagonal preconditioners with negligible regularization. I estimate the theoretical uncertainties due to the neglected 3D effects using the 3D-to-2D transforms. The uncertainties estimated from the random sequences of BMWI are higher than the uncertainties related to the 3D-to-2D transforms. I provide estimates of the solution, uncertainty and sensitivity using up to fourteen sample points inverted with different linear-gradient starting models, differently 3D-to-2D-transformed real data sets and different random sequences of descent directions. The uncertainty of sound velocities is lowest in the central semicircle with a radius of 3 km, equal to half the length of the ocean bottom cable. The uncertainty of mass densities is highest in the central semicircle. The sensitivity of the measuring procedure with respect to sound velocity and mass density is highest in the central semicircle, which represents a footprint of the acquisition geometry. Outside the central semicircle, the parameters are not falsifiable in the specified setting.
Full-waveform inversion is the quest for the unique governing law of the physical system under study. If the governing law is deterministic and the sample mean, standard deviation and initial deviation of the sample points provide an insufficient description of the solution, uncertainty and sensitivity, then the measuring procedure in a wide sense has to be improved.
Knowledge graphs in BERD and in NFDI
Knowledge graphs are able to capture, enrich and disseminate research data objects so that the FAIR and Linked Data principles are fulfilled. How can knowledge graphs improve the domain-specific (BERD) and cross-domain (NFDI) research data infrastructures? The answer is based on the use cases in BERD@NFDI and on the activities of the NFDI working group “Knowledge Graphs”. First, we describe the architecture, knowledge graphs and use cases in BERD@NFDI. Then, we present the NFDI working group “Knowledge Graphs”, its work plan and potential base services.
The BERD@NFDI Platform: A Research Infrastructure for Unstructured Data in the Social Sciences
In the NFDI consortium BERD@NFDI, research libraries and researchers work together to improve the infrastructure for documenting, finding and accessing unstructured research data (e.g. text, images, audio files) in business, economics and related disciplines.
In spring 2023, the BERD platform, a central service of the consortium, will launch. The platform makes it possible to store and search research-relevant data sets. These data sets are linked with relevant publications, analysis methods and training materials of the consortium. A focus is placed on documenting relevant machine learning methods for processing unstructured data. In addition, the Data Marketplace gives companies the opportunity to make their data securely available for collaborations with researchers. The result is a comprehensive infrastructure to support researchers.
This talk presents the functionalities of the BERD platform. We also look at the role of libraries in BERD@NFDI and the opportunities for exchange with the research communities, and we look ahead to the future development prospects of the BERD@NFDI infrastructure.
The Case for a Common, Reusable Knowledge Graph Infrastructure for NFDI
The Strategic Research and Innovation Agenda (SRIA) of the European Commission identifies Knowledge Graphs (KGs) as one of the most important technologies for building an interoperability framework and enabling data exchange among users across countries, sectors, and disciplines [1]. A KG is a graph-structured knowledge base containing a terminology (vocabulary or ontology) and data entities interrelated via the terminology [2]. KGs are based on semantic web technologies (RDF, SPARQL, etc.) and are often used for agile data integration. KGs also play an essential role within Germany as a vehicle to connect research data and research-related entities and make them accessible; examples include the GESIS Knowledge Graph Infrastructure, the TIB Open Research Knowledge Graph, and GND.network. Furthermore, the Wikidata knowledge graph, maintained by Wikimedia Germany, contains a large number of research-related entities and is widely used in scientific knowledge management, in addition to being an important advocacy tool for open data [3]. Extending domain-specific, ontology-supported KGs with the multidisciplinary, crowdsourced knowledge in the Wikidata KG would enable significant applications. Linking expert knowledge systems with world knowledge empowers lay persons to benefit from high-quality research data and ultimately contributes to increasing society's confidence in scientific research.
BERD@BW – A Science Data Center to foster Open Science in Business, Economics and Social Sciences
The Center for Business, Economic and Related Data in Baden-Württemberg (BERD@BW) is one of the four science data centers funded by the Ministry of Science, Research and Arts of Baden-Württemberg within the digitization strategy “digital@bw”. BERD@BW aims to improve the sharing, finding and reuse of unstructured and semi-structured research data in the social sciences in accordance with the FAIR principles (findable, accessible, interoperable and reusable). BERD@BW is built by the University of Mannheim and the Leibniz Center for European Economic Research (ZEW). Both institutions are experienced in infrastructure projects and in the empirical social sciences, including business and economics. BERD@BW is based on four pillars: 1) building up methodological knowledge, 2) developing tools and services for dealing with unstructured and semi-structured data, 3) training and consulting on legal and technical issues in research data management, and 4) engaging in national and international networking. The services and materials developed within BERD@BW are available as openly as possible on the project homepage: https://www.berd-bw.de
Extracting research data from historical documents with eScriptorium and Python
This talk presents a workflow based on eScriptorium and Python to extract research data from historical documents. eScriptorium is a rather young transcription tool that uses the OCR engine Kraken. The software makes it possible to adapt not only the text recognition but also the layout recognition to the source material by means of training. Because of the high quality requirements for research data, this step is necessary in many cases. By using existing base models, the training effort can be drastically reduced. The text recognition results can then be exported in PAGE-XML format for further processing. For this purpose, the Python tool “blatt” was developed within the project. It can parse the PAGE-XML exports, sort and extract the contents using algorithms and templates, and convert them into a structured table format such as CSV. The first part of the presentation gives a short introduction to the topic, the source material and the research question. Then we show how a training process based on a base model with minimal training data can be performed using eScriptorium and which problems to pay attention to. In the last section, the Python tool “blatt” is presented, as well as the underlying ideas and algorithms.
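The PAGE-XML to CSV step described above can be sketched with the Python standard library alone. This is a hedged illustration, not the API or the algorithms of “blatt”: the function names, the sample document and the sorting rule (top-to-bottom, then left-to-right by the bounding box of each line) are assumptions chosen for brevity.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened PAGE-XML export with lines out of reading order.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page1.png" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <TextLine id="l2">
        <Coords points="100,300 900,300 900,350 100,350"/>
        <TextEquiv><Unicode>Second line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l1">
        <Coords points="100,200 900,200 900,250 100,250"/>
        <TextEquiv><Unicode>First line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def page_xml_to_rows(xml_text: str) -> list:
    """Extract one row per TextLine, sorted by its top-left corner."""
    root = ET.fromstring(xml_text)
    # Read the namespace from the root tag so any PAGE schema version works.
    ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""

    def q(name: str) -> str:
        return f"{{{ns}}}{name}" if ns else name

    rows = []
    for region in root.iter(q("TextRegion")):
        for line in region.findall(q("TextLine")):
            coords = line.find(q("Coords"))
            points = coords.get("points", "") if coords is not None else ""
            xy = [tuple(int(float(v)) for v in p.split(",")) for p in points.split()]
            unicode_el = line.find(f"{q('TextEquiv')}/{q('Unicode')}")
            text = (unicode_el.text or "") if unicode_el is not None else ""
            rows.append({
                "region": region.get("id", ""),
                "line": line.get("id", ""),
                "top": min((y for _, y in xy), default=0),
                "left": min((x for x, _ in xy), default=0),
                "text": text,
            })
    # Simplified reading order: top-to-bottom, then left-to-right.
    rows.sort(key=lambda r: (r["top"], r["left"]))
    return rows


def rows_to_csv(rows: list) -> str:
    """Serialize the extracted rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["region", "line", "top", "left", "text"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = page_xml_to_rows(SAMPLE)
csv_text = rows_to_csv(rows)
```

Real historical sources usually need more than a global coordinate sort (multi-column layouts, marginalia, tables), which is where templates as mentioned in the abstract would come in.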
Facilitating creation, (re)use and interoperability of Knowledge Graphs in NFDI: perspectives from NFDI4Culture to Base4NFDI and beyond
Slide presentation for the paper 'Facilitating creation, (re)use and interoperability of Knowledge Graphs in NFDI: perspectives from NFDI4Culture to Base4NFDI and beyond' presented at the Second NFDI Berlin-Brandenburg network meeting: 'Ontologies and Knowledge Graphs', on Thursday, July 11th, 2024, at Weierstrass Institute for Applied Analysis and Stochastics Berlin: https://www.wias-berlin.de/workshops/NFDI_BB_2
"ENERGY APPROACH" FOR CALCULATING THE ECONOMIC VALUE OF BIORESOURCES OF THE HUNTING FARM "SVIYAZHSKOE"
The role of animals in an ecosystem is determined by a wide range of factors, above all their number, biomass and feeding habits. Because the metabolic rate differs between groups of animals, the most integral indicator of their significance in the functioning of the ecosystem is the energy flow passing through the community (transformable energy). The article presents data on the use of transformable energy for assessing the resources of terrestrial vertebrate species in the hunting farm "Sviyazhskoe". The implemented approach can be applied to other territories. The only fundamental limitation is the absence of systematic records for all groups.