55,829 research outputs found

    Enhancing Knowledge Bases with Quantity Facts


    Entities with quantities: extraction, search, and ranking

    Quantities are more than numeric values. They denote measures of the world's entities, such as heights of buildings, running times of athletes, energy efficiency of car models, or energy production of power plants, all expressed in numbers with associated units. Entity-centric search and question answering (QA) are well supported by modern search engines. However, they do not work well when queries involve quantity filters, such as searching for athletes who ran 200m under 20 seconds or companies with quarterly revenue above $2 billion. State-of-the-art systems fail to understand the quantities, including the condition (less than, above, etc.), the unit of interest (seconds, dollars, etc.), and the context of the quantity (200m race, quarterly revenue, etc.). QA systems based on structured knowledge bases (KBs) also fail, as quantities are poorly covered by state-of-the-art KBs. In this dissertation, we develop new methods to advance the state of the art in quantity knowledge extraction and search. Our main contributions are the following:
    • First, we present Qsearch [Ho et al., 2019, Ho et al., 2020], a system that answers advanced queries with quantity filters by exploiting cues present in both the query and the text sources. Qsearch comprises two main contributions: a deep neural network model for extracting quantity-centric tuples from text sources, and a novel query matching model for finding and ranking matching tuples.
    • Second, to additionally tap into heterogeneous tables, we present QuTE [Ho et al., 2021a, Ho et al., 2021b], a system for extracting quantity information from web sources, in particular ad-hoc web tables in HTML pages. QuTE contributes a method for linking quantity and entity columns by leveraging external text sources. For question answering, we contextualize the extracted entity-quantity pairs with informative cues from the table and introduce a new method for consolidating and re-ranking answer candidates based on inter-fact consistency.
    • Third, we present QL [Ho et al., 2022], a recall-oriented method for enriching knowledge bases with quantity facts. Modern KBs such as Wikidata or YAGO cover many entities and their relevant information but often miss important quantity properties. QL is query-driven and based on iterative learning, with two main contributions for improving KB coverage: a query expansion method that captures a larger pool of fact candidates, and a self-consistency technique that takes the value distributions of quantities into account.
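
    The quantity filters described above combine a comparator, a numeric value with its unit, and a context cue. The following Python sketch is a rough illustration of that idea only, not of the actual Qsearch models (whose extraction and matching are learned from data); the data classes, the matches helper, and the toy unit table are all illustrative assumptions.

        from dataclasses import dataclass

        # Conversion factors to a canonical unit per dimension (illustrative subset).
        TO_CANONICAL = {"seconds": 1.0, "minutes": 60.0, "usd": 1.0, "billion usd": 1e9}

        @dataclass
        class QuantityFilter:
            context: str     # e.g. "200m"
            comparator: str  # "<", ">", or "="
            value: float
            unit: str

        @dataclass
        class QuantityFact:
            entity: str
            context: str
            value: float
            unit: str

        def matches(fact: QuantityFact, q: QuantityFilter) -> bool:
            """Check one candidate entity-quantity tuple against a quantity filter."""
            if q.context.lower() not in fact.context.lower():
                return False                               # context cue must overlap
            fact_val = fact.value * TO_CANONICAL[fact.unit]
            query_val = q.value * TO_CANONICAL[q.unit]     # compare in canonical units
            if q.comparator == "<":
                return fact_val < query_val
            if q.comparator == ">":
                return fact_val > query_val
            return abs(fact_val - query_val) < 1e-9

        # "athletes who ran 200m under 20 seconds"
        query = QuantityFilter(context="200m", comparator="<", value=20, unit="seconds")
        facts = [
            QuantityFact("Usain Bolt", "200m final, Beijing 2008", 19.30, "seconds"),
            QuantityFact("A. Runner", "200m heat", 20.41, "seconds"),
        ]
        print([f.entity for f in facts if matches(f, query)])  # ['Usain Bolt']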

    HYDRO-POLITICS: SOCIO-ECONOMIC ANALYSIS OF INTERNATIONAL WATER TREATIES

    Water resource issues are closely related to property rights issues, as the holders of property rights along a river bank, watershed, lake, or river basin often take priority in terms of water usage. Rivers, aquifers, and other bodies of water transgress national boundaries, giving rise to conflicts. Treaties, agreements, and conventions seek to allocate water rights among countries in a manner that benefits all participants. This study conducts an empirical analysis of macroeconomic, geological, hydrological, and institutional variables in order to determine the factors contributing to the existence of bilateral treaties and to treaty structure. Keywords: International Relations/Trade; Resource/Energy Economics and Policy; K330.
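
    The abstract does not state the estimation technique, but a binary-choice model is one plausible way to relate such country-pair variables to whether a bilateral treaty exists. The Python sketch below is purely illustrative: the variable names, the synthetic placeholder data, and the use of logistic regression are assumptions, not the study's actual specification.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        rng = np.random.default_rng(42)
        n = 200  # synthetic river-sharing country pairs (placeholder data only)

        # Hypothetical explanatory variables for each pair.
        gdp_ratio = rng.lognormal(0.0, 1.0, n)    # macroeconomic asymmetry
        basin_share = rng.uniform(0.0, 1.0, n)    # hydrological dependence
        inst_index = rng.uniform(0.0, 1.0, n)     # institutional quality

        X = np.column_stack([gdp_ratio, basin_share, inst_index])
        # Synthetic outcome: 1 if the pair has signed a bilateral water treaty.
        y = (0.5 * basin_share + 0.8 * inst_index
             + rng.normal(0.0, 0.3, n) > 0.8).astype(int)

        model = LogisticRegression(max_iter=1000).fit(X, y)
        print(dict(zip(["gdp_ratio", "basin_share", "inst_index"],
                       model.coef_[0].round(2))))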

    An Economic Approach to Article 82

    This report argues in favour of an economics-based approach to Article 82, in a way similar to the reform of Article 81 and merger control. In particular, we support an effects-based rather than a form-based approach to competition policy. Such an approach focuses on the presence of anti-competitive effects that harm consumers, and it rests on the examination of each specific case, informed by sound economics and grounded in facts.

    Knowledge Base Population using Semantic Label Propagation

    A crucial aspect of a knowledge base population system that extracts new facts from text corpora is the generation of training data for its relation extractors. In this paper, we present a method that maximizes the effectiveness of newly trained relation extractors at a minimal annotation cost. Manual labeling can be significantly reduced by Distant Supervision, a method that constructs training data automatically by aligning a large text corpus with an existing knowledge base of known facts. For example, all sentences mentioning both 'Barack Obama' and 'US' may serve as positive training instances for the relation born_in(subject,object). However, distant supervision typically results in a highly noisy training set: many training sentences do not really express the intended relation. We propose to combine distant supervision with minimal manual supervision in a technique called feature labeling, to eliminate noise from the large and noisy initial training set, resulting in a significant increase in precision. We further improve on this approach by introducing the Semantic Label Propagation method, which uses the similarity between low-dimensional representations of candidate training instances to extend the training set in order to increase recall while maintaining high precision. Our proposed strategy for generating training data is studied and evaluated on an established test collection designed for knowledge base population tasks. The experimental results show that the Semantic Label Propagation strategy leads to substantial performance gains when compared to existing approaches, while requiring an almost negligible manual annotation effort.
    Comment: Submitted to Knowledge-Based Systems, special issue on Knowledge Bases for Natural Language Processing.
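
    As a rough illustration of the propagation step described above, the Python sketch below extends a small seed set by giving each unlabeled candidate the label of its most similar seed whenever the cosine similarity clears a threshold. The similarity measure, the 0.8 threshold, the toy 16-dimensional vectors, and the propagate_labels helper are assumptions for this example, not the paper's actual representations or procedure.

        import numpy as np

        def cosine(u, v):
            return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

        def propagate_labels(seed_vecs, seed_labels, candidate_vecs, threshold=0.8):
            """Give each unlabeled candidate the label of its most similar seed,
            but only if that similarity exceeds the threshold."""
            extended = []
            for cand in candidate_vecs:
                sims = [cosine(cand, s) for s in seed_vecs]
                best = int(np.argmax(sims))
                if sims[best] >= threshold:
                    extended.append((cand, seed_labels[best]))
            return extended

        rng = np.random.default_rng(0)
        # Toy low-dimensional representations of candidate training sentences.
        seeds = [rng.normal(size=16) for _ in range(4)]
        labels = ["born_in", "born_in", "NEGATIVE", "NEGATIVE"]
        candidates = [seeds[0] + 0.05 * rng.normal(size=16),  # near a positive seed
                      rng.normal(size=16)]                    # unrelated sentence
        # Likely output: ['born_in'] only; the unrelated vector stays below the threshold.
        print([label for _, label in propagate_labels(seeds, labels, candidates)])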

    The Return of Federal Judicial Discretion in Criminal Sentencing


    Data degradation to enhance privacy for the Ambient Intelligence

    Increasing research in ubiquitous computing techniques towards the development of an Ambient Intelligence raises issues regarding privacy. To gain the data needed to enable applications in this Ambient Intelligence to offer smart services to users, sensors will monitor users' behavior to fill personal context histories. Those context histories will be stored in database/information systems that we consider honest: they can be trusted now, but might be subject to attacks in the future. This assumption implies that protecting context histories by means of access control might not be enough. To reduce the impact of possible attacks, we propose to use limited retention techniques. In our approach, we present applications with a degraded set of data, with a retention delay attached to it, that matches both application requirements and users' privacy wishes. Data degradation can be twofold: the accuracy of context data can be lowered such that only the less privacy-sensitive parts are retained, and context data can be transformed such that only particular abilities remain available to applications. Retention periods can be specified to trigger irreversible removal of the context data from the system.
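
    As a rough illustration of the two degradation ideas and the retention trigger described above, the Python sketch below stores only coarsened location readings and removes entries once their retention delay has elapsed. The class names, the rounding rule, and the delays are illustrative assumptions, not the paper's actual design.

        import time
        from dataclasses import dataclass, field
        from typing import List, Optional

        @dataclass
        class ContextEntry:
            timestamp: float
            lat: float
            lon: float
            retention_seconds: float

            def degraded(self, decimals: int = 1) -> "ContextEntry":
                """Lower spatial accuracy by rounding coordinates."""
                return ContextEntry(self.timestamp, round(self.lat, decimals),
                                    round(self.lon, decimals), self.retention_seconds)

        @dataclass
        class ContextHistory:
            entries: List[ContextEntry] = field(default_factory=list)

            def add(self, entry: ContextEntry, decimals: int = 1) -> None:
                # Store only the degraded form, so the precise value never persists.
                self.entries.append(entry.degraded(decimals))

            def purge_expired(self, now: Optional[float] = None) -> None:
                """Irreversibly drop entries whose retention delay has elapsed."""
                now = time.time() if now is None else now
                self.entries = [e for e in self.entries
                                if now - e.timestamp < e.retention_seconds]

        history = ContextHistory()
        history.add(ContextEntry(time.time() - 120, 52.51925, 13.40881, retention_seconds=60))
        history.add(ContextEntry(time.time(), 52.52007, 13.40495, retention_seconds=3600))
        history.purge_expired()
        print(len(history.entries))  # 1: the expired entry has been removed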

    Forensic Fisheries Science: Literature Review and Research Suggestions

    Recent years have seen a dramatic increase in litigation against the National Marine Fisheries Service, NOAA. Litigation may affect personnel throughout the agency, including scientists, whose work is often directly or indirectly influenced by complex legal requirements, but who may not be in a position to comment or engage in public dialogue. It may be helpful for scientists and other agency personnel to join the ongoing discussion in the legal community regarding the interface of science and law. This paper provides a starting point with a selected introduction to relevant legal literature in this area. It uses the phrase “forensic fisheries science” to describe the application of science to legal requirements in the fishery management context. It concludes with suggestions for future research that could assist NMFS scientists as they grapple with the challenge of using science to help the agency meet its complex legal requirements. Forensic: belonging to, used in, or suitable to courts of judicature or to public discussion and debate; argumentative, rhetorical; relating to or dealing with the application of scientific knowledge to legal problems (Merriam-Webster Online Dictionary).
