MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach
Entity linking has recently been the subject of a significant body of
research. Currently, the best performing approaches rely on trained
mono-lingual models. Porting these approaches to other languages is
consequently a difficult endeavor as it requires corresponding training data
and retraining of the models. We address this drawback by presenting a novel
multilingual, knowledge-base agnostic and deterministic approach to entity
linking, dubbed MAG. MAG is based on a combination of context-based retrieval
on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data
sets and in 7 languages. Our results show that the best approach trained on
English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse
on datasets in other languages. MAG, on the other hand, achieves
state-of-the-art performance on English datasets and reaches a micro F-measure
that is up to 0.6 higher than that of PBOH on non-English languages.
Comment: Accepted in K-CAP 2017: Knowledge Capture Conference
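The micro F-measure reported above pools true positives, false positives and false negatives over all documents before computing precision and recall. The following is a minimal sketch of that pooling, assuming each document's annotations are given as a set of (mention, entity) pairs; the function name and data layout are illustrative and are not taken from MAG or its evaluation tooling.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: one link is (mention id, knowledge-base entity id).
Link = Tuple[str, str]


def micro_f_measure(gold: Iterable[Set[Link]], predicted: Iterable[Set[Link]]) -> float:
    """Micro-averaged F1: counts are summed over all documents first."""
    tp = fp = fn = 0
    for gold_links, predicted_links in zip(gold, predicted):
        tp += len(gold_links & predicted_links)   # correct links
        fp += len(predicted_links - gold_links)   # spurious links
        fn += len(gold_links - predicted_links)   # missed links
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Two toy documents with made-up mention and entity identifiers.
gold_docs = [{("m1", "A"), ("m2", "B")}, {("m3", "C")}]
predicted_docs = [{("m1", "A"), ("m2", "X")}, {("m3", "C"), ("m4", "D")}]
score = micro_f_measure(gold_docs, predicted_docs)
```

Because the counts are pooled, documents with many mentions weigh more heavily than they would under a macro average.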
Global morphogenetic flow is accurately predicted by the spatial distribution of myosin motors.
During embryogenesis, tissue layers undergo morphogenetic flow, rearranging and folding into specific shapes. While developmental biology has identified key genes and local cellular processes, global coordination of tissue remodeling at the organ scale remains unclear. Here, we combine in toto light-sheet microscopy of the Drosophila embryo with quantitative analysis and physical modeling to relate cellular flow with the patterns of force generation during the gastrulation process. We find that the complex spatio-temporal flow pattern can be predicted from the measured meso-scale myosin density and anisotropy using a simple, effective viscous model of the tissue, achieving close to 90% accuracy with one time-dependent and two constant parameters. Our analysis uncovers the importance of a) spatial modulation of myosin distribution on the scale of the embryo and b) the non-locality of its effect due to mechanical interaction of cells, demonstrating the need for the global perspective in the study of morphogenetic flow.
Intelligent control system for CFD modelling software
In this thesis we show that it is possible to create an intelligent agent capable of emulating the human ability to control CFD simulations and provide similar benefits in terms of performance, overall reliability and result accuracy. We initially consider the rule-based approach proposed by other researchers. It is argued that heuristic search is better suited to model the techniques used by human experts. The residual graphs are identified as the most important source of heuristic information relevant to the control decisions. Three different graph features are found to be most important and dedicated algorithms are developed for their extraction.
A heuristic evaluation function employing the new extraction algorithms is proposed and implemented in the first version of the heuristic control system (ICS 1.0). The analysis of the test results gives rise to the next version of the system (ICS 2.0). ICS 2.0 employs an additional expert system responsible for dynamic pruning of the search space using the rules obtained by statistical analysis of the initial results. Other features include dedicated goal-driven search plans that help reduce the search space even further. The simulation results and overall improvements are compared with non-controlled runs. We present a detailed analysis of a fire case solution obtained with different control techniques. The effect of the automatic control on the accuracy of the results is explained and discussed. Finally, we provide some indications for further research that promise to provide even greater performance gains
An investigation into establishing a generalised approach for defining similarity metrics between 3D shapes for case-based reasoning (CBR)
This thesis investigates the feasibility of establishing a generalised approach for defining similarity metrics between 3D shapes for the casting design problem in Case-Based Reasoning (CBR).
This research investigates a new approach for improving the quality of casting design advice obtained from a CBR system using casting design knowledge associated with past cases. The new approach uses similarity metrics that improve on those used in previous research in this area. The new similarity metrics proposed here are based on the decomposition of casting shape cases into a set of components. The research defines and uses the Component Type Similarity Metric (CTM) and the Maximum Common Subgraph (MCS) metric between graph representations of the case shapes. Both focus on the definition of partial similarity between components of the same type, taking into account the geometrical features and proportions of each shape component. The investigation also extends the scope of the research to 3D shapes by defining and evaluating a new metric for the overall similarity between 3D shapes. Finally, this research investigates a methodology for the integration of the CBR cycle and automation of the feature extraction from target and source case shapes.
The ShapeCBR system has been developed to demonstrate the feasibility of integrating the CBR approach for retrieving and reusing casting design advice. The ShapeCBR system automates the decomposition process, the classification process and the shape matching process and is used to evaluate the new similarity metrics proposed in this research and the extension of the approach to 3D shapes.
Evaluation of the new similarity metrics shows that the efficiency of the system is enhanced and that the new approach provides useful casting design information for 3D casting shapes. Additionally, ShapeCBR shows that it is possible to automate the decomposition and classification of components that allow a case shape to be represented in graph form and thus provide the basis for automating the overall CBR cycle.
The thesis concludes with new research questions that emerge from this research and an agenda for further work in the area.
Robustness properties of estimators in generalized Pareto Models
We study global and local robustness properties of several estimators for shape and scale in a generalized Pareto model. The estimators considered in this paper cover maximum likelihood estimators, skipped maximum likelihood estimators, moment-based estimators, Cramér-von-Mises Minimum Distance estimators, and, as a special case of quantile-based estimators, Pickands Estimator as well as variants of the latter tuned for higher finite sample breakdown point (FSBP), and lower variance. We further consider an estimator matching population median and median of absolute deviations to the empirical ones (MedMad); again, in order to improve its FSBP, we propose a variant using a suitable asymmetric Mad as constituent, and which may be tuned to achieve an expected FSBP of 34%. These estimators are compared to one-step estimators distinguished as optimal in the shrinking neighborhood setting, i.e., the most bias-robust estimator minimizing the maximal (asymptotic) bias and the estimator minimizing the maximal (asymptotic) MSE. For each of these estimators, we determine the FSBP, the influence function, as well as statistical accuracy measured by asymptotic bias, variance, and mean squared error—all evaluated uniformly on shrinking convex contamination neighborhoods. Finally, we check these asymptotic theoretical findings against finite sample behavior by an extensive simulation study
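As an illustration of the quantile-based estimators mentioned above, here is a minimal sketch of the classical Pickands estimator for the generalized Pareto distribution. It assumes location 0, nonzero shape, and the median and third quartile as matching quantiles. The tuned variants and the MedMad estimator from the abstract are not reproduced, and all function names are hypothetical. With quantile function Q(p) = (scale/shape) * ((1 - p)^(-shape) - 1), the ratio (Q(3/4) - Q(1/2)) / Q(1/2) equals 2^shape, which gives the shape; the scale follows from Q(1/2)^2 / (Q(3/4) - 2 Q(1/2)) = scale/shape.

```python
import math
from typing import Sequence, Tuple


def gpd_quantile(p: float, shape: float, scale: float) -> float:
    """Quantile function of a GPD with location 0 and shape != 0."""
    return scale / shape * ((1.0 - p) ** (-shape) - 1.0)


def pickands_from_quantiles(median: float, q3: float) -> Tuple[float, float]:
    """Return (shape, scale) matched to the median and third quartile.

    Assumes q3 > median > 0 and q3 != 2 * median (shape 0 is not handled).
    """
    shape = math.log((q3 - median) / median) / math.log(2.0)
    scale = shape * median ** 2 / (q3 - 2.0 * median)
    return shape, scale


def empirical_quantile(sample: Sequence[float], p: float) -> float:
    """Simple order-statistic quantile; an illustrative choice among many."""
    ordered = sorted(sample)
    index = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[index]


def pickands_estimate(sample: Sequence[float]) -> Tuple[float, float]:
    """Plug empirical quantiles into the matching equations."""
    return pickands_from_quantiles(
        empirical_quantile(sample, 0.5), empirical_quantile(sample, 0.75)
    )


# Deterministic pseudo-sample: GPD quantiles on an even probability grid.
n = 10001
grid_sample = [gpd_quantile((i + 0.5) / n, 0.7, 1.0) for i in range(n)]
shape_hat, scale_hat = pickands_estimate(grid_sample)
```

Only two order statistics enter the estimate, so observations far in the tail have no influence on it; this sketch does not quantify the breakdown point or the variance discussed in the abstract.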
Finite Element Analysis and Machine Learning Guided Design of Carbon Fiber Organosheet-based Battery Enclosures for Crashworthiness
Carbon fiber composites are a potential candidate for replacing the metal-based
battery enclosures of current electric vehicles (EVs) owing to their better
strength-to-weight ratio and corrosion resistance. However, the strength of
carbon fiber-based structures depends on several parameters that should be
carefully chosen. In this work, we implemented high throughput finite element
analysis (FEA) based thermoforming simulation to virtually manufacture the
battery enclosure using different design and processing parameters.
Subsequently, we performed virtual crash simulations to mimic a side pole crash
to evaluate the crashworthiness of the battery enclosures. This high throughput
crash simulation dataset was utilized to build predictive models to understand
the crashworthiness of an unknown set. Our machine learning (ML) models showed
excellent performance (R² > 0.97) in predicting the crashworthiness metrics,
i.e., crush load efficiency, absorbed energy, intrusion, and maximum
deceleration during a crash. We believe that this FEA-ML framework will be
helpful in down-selecting process parameters for carbon fiber-based component
design and can be transferred to other manufacturing technologies.
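The R² figure quoted above is the coefficient of determination. A minimal sketch of how it is conventionally computed follows, assuming plain lists of observed and predicted values; the numbers are made up for illustration and are not from the paper's dataset.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0.0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (e.g. a crashworthiness metric in arbitrary units).
observed_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed_values, predicted_values)
```

A value of 1 means the predictions reproduce the observations exactly, while 0 means they do no better than always predicting the observed mean.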
Machine learning-based automated segmentation with a feedback loop for 3D synchrotron micro-CT
The development of third-generation synchrotron light sources has laid the foundation for investigating the 3D structure of opaque samples at micrometer resolution and beyond. This led to the development of X-ray synchrotron micro-computed tomography, which fostered the creation of imaging facilities for studying samples of many kinds, e.g., model organisms, to better understand the physiology of complex living systems. The development of modern control systems and robotics enabled full automation of X-ray imaging experiments and calibration of the experimental setup parameters during operation. Advances in digital detector systems brought improvements in resolution, dynamic range, sensitivity, and other essential properties. These improvements considerably increased the throughput of the imaging process, but on the other hand the experiments began to generate substantially larger amounts of data, up to tens of terabytes, which were subsequently processed manually. These technical advances thus paved the way for more efficient high-throughput experiments that examine large numbers of samples and produce datasets of better quality. The scientific community therefore has a strong need for an efficient, automated workflow for X-ray data analysis that can handle such a data load and deliver valuable insights to domain experts. Existing solutions for such a workflow are not directly applicable to high-throughput experiments because they were developed for ad hoc scenarios in medical imaging. Consequently, they are not optimized for high-throughput data streams, nor can they exploit the hierarchical nature of samples.
The main contribution of this thesis is a new automated analysis workflow suited to the efficient processing of heterogeneous X-ray datasets of a hierarchical nature. The workflow is based on improved methods for data preprocessing, registration, localization, and segmentation. Every stage of the workflow that involves a training phase can be fine-tuned automatically to find the best hyperparameters for the specific dataset. For the analysis of fiber structures in samples, a new, highly parallelizable 3D orientation analysis method was developed; it is based on a novel concept of emitting rays and enables more precise morphological analysis. All developed methods were thoroughly validated on synthetic datasets to quantitatively assess their applicability under different imaging conditions. The workflow was shown to be capable of processing a series of datasets of a similar kind. In addition, efficient CPU/GPU implementations of the developed workflow and methods are presented and made available to the community as modules for the Python language.
The developed automated analysis workflow was successfully applied to micro-CT datasets acquired in high-throughput X-ray experiments in developmental biology and materials science. In particular, the workflow was applied to the analysis of medaka fish datasets, enabling automated segmentation and subsequent morphological analysis of the brain, liver, head nephrons, and heart. Furthermore, the developed 3D orientation analysis method was used in the morphological analysis of polymer scaffold datasets to steer a fabrication process toward desirable properties.
New Methods to Improve Large-Scale Microscopy Image Analysis with Prior Knowledge and Uncertainty
Multidimensional imaging techniques provide powerful ways to examine various
kinds of scientific questions. The routinely produced datasets in the
terabyte-range, however, can hardly be analyzed manually and require an
extensive use of automated image analysis. The present thesis introduces a new
concept for the estimation and propagation of uncertainty involved in image
analysis operators and new segmentation algorithms that are suitable for
terabyte-scale analyses of 3D+t microscopy images.
Comment: 218 pages, 58 figures, PhD thesis, Department of Mechanical
Engineering, Karlsruhe Institute of Technology, published online with KITopen
(License: CC BY-SA 3.0, http://dx.doi.org/10.5445/IR/1000057821)