196 research outputs found

    Knowledge extraction from fictional texts

    Knowledge extraction from text is a key task in natural language processing. It involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction, and knowledge canonicalization. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities, with Wikipedia and mainstream news articles as sources. The constructed knowledge bases therefore lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of human culture, spanning literature, movies, TV series, comics and video games. With thousands of fictional universes having been created, knowledge from fictional domains is the subject of search-engine queries, by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction in specific domains such as fiction and fantasy has to tackle several key challenges:
    - Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy and contains difficult text structures, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction is also not available.
    - Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions about entity-class, subclass and entity-entity relations that are often invalid for fictional domains. Because fictional domains span different genres, models must also transfer across domains.
    - Long fictional texts: While state-of-the-art models are limited in input sequence length, it is essential to develop methods that can deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues.
    This dissertation addresses the above challenges by developing new methodologies that advance the state of the art in knowledge extraction for fictional domains:
    - The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision.
    - The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall.
    - The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.
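The abstract above mentions collecting distant training samples from seed entity pairs and ranking them by similarity to reduce false positives. The following is only a minimal sketch of that general idea, not KnowFi's actual method: the function names, the toy entities, the cue words, and the use of Jaccard token overlap as the similarity measure are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Lowercased word tokens of a passage, as a set."""
    return set(re.findall(r"[a-z']+", text.lower()))


def collect_distant_samples(passages, seed_facts):
    """Distant supervision: label a passage with a relation whenever it
    mentions both entities of a seed fact (naive substring matching)."""
    samples = []
    for passage in passages:
        lowered = passage.lower()
        for (subj, obj), relation in seed_facts.items():
            if subj.lower() in lowered and obj.lower() in lowered:
                samples.append((passage, subj, obj, relation))
    return samples


def rank_by_cue_overlap(samples, relation_cues):
    """Sort samples by Jaccard overlap between passage tokens and cue words.
    Jaccard overlap is a stand-in here, not the ranking used by KnowFi."""
    def score(sample):
        passage, _, _, relation = sample
        words = tokens(passage)
        cues = relation_cues.get(relation, set())
        union = words | cues
        return len(words & cues) / len(union) if union else 0.0
    return sorted(samples, key=score, reverse=True)


# Hypothetical toy data: invented entities, relation label and cue words.
seed_facts = {("Arin", "Belor"): "relative_of"}
relation_cues = {"relative_of": {"uncle", "nephew", "cousin", "heir"}}
passages = [
    "Arin walked past the garden where Belor once sat.",
    "Belor named Arin as his heir before leaving the valley.",
    "The innkeeper closed the shutters.",
]

samples = collect_distant_samples(passages, seed_facts)
ranked = rank_by_cue_overlap(samples, relation_cues)
for passage, subj, obj, relation in ranked:
    print(relation, "|", passage)
```

In this toy run, both passages that mention the seed pair become training candidates, and the one containing a cue word is ranked first; the passage that merely co-mentions the two entities is the kind of false positive that a ranking step is meant to push down.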

    First-principles study, fabrication and characterization of (Zr0.25Nb0.25Ti0.25V0.25)C high-entropy ceramic

    The feasibility of forming a new (Zr0.25Nb0.25Ti0.25V0.25)C high-entropy ceramic (ZHC-1) was first analyzed by first-principles calculations and thermodynamic analysis, and the ceramic was then successfully fabricated by hot-pressing sintering. The first-principles calculations showed that the mixing enthalpy of ZHC-1 was 5.526 kJ/mol and that its mixing entropy was in the range of 0.693R to 1.040R. The thermodynamic analysis showed that ZHC-1 was thermodynamically stable above 959 K owing to its negative mixing Gibbs free energy. The experimental results showed that the as-prepared ZHC-1 (95.1% relative density) possessed a single rock-salt crystal structure, some interesting nanoplate-like structures, and high compositional uniformity from the nanoscale to the microscale. Owing to these features, it showed a relatively low thermal conductivity of 15.3 ± 0.3 W/(m·K) at room temperature compared with the initial metal carbides (ZrC, NbC, TiC and VC), which was attributed to solid solution effects, nanoplates and porosity. It also exhibited a relatively high nanohardness of 30.3 ± 0.7 GPa, an elastic modulus of 460.4 ± 19.2 GPa, and a higher fracture toughness of 4.7 ± 0.5 MPa·m^(1/2), which were attributed to solid solution strengthening and to toughening by nanoplate pullout and microcrack deflection.
    Comment: 49 pages, 6 figures, 4 tables
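The stability threshold quoted in the abstract above can be checked with the standard relation ΔG_mix = ΔH_mix − T·ΔS_mix, which changes sign at T = ΔH_mix / ΔS_mix. The sketch below assumes that the mixing enthalpy and entropy are independent of temperature and that the reported threshold was derived this way; the variable and function names are illustrative.

```python
R = 8.314          # gas constant, J/(mol*K)
DH_MIX = 5526.0    # mixing enthalpy from the abstract, J/mol (5.526 kJ/mol)


def gibbs_mixing(temperature_k, ds_mix):
    """Mixing Gibbs free energy in J/mol: dG = dH - T * dS."""
    return DH_MIX - temperature_k * ds_mix


def crossover_temperature(ds_mix):
    """Temperature in K above which dG becomes negative (dG = 0 at T = dH / dS)."""
    return DH_MIX / ds_mix


# Entropy bounds quoted in the abstract: 0.693R and 1.040R.
t_low_entropy = crossover_temperature(0.693 * R)
t_high_entropy = crossover_temperature(1.040 * R)

print(round(t_low_entropy))   # 959
print(round(t_high_entropy))  # 639
```

Under these assumptions, the lower entropy bound of 0.693R reproduces the 959 K figure given in the abstract, while the upper bound of 1.040R would put the crossover at roughly 639 K.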

    Improving Items and Contexts Understanding with Descriptive Graph for Conversational Recommendation

    State-of-the-art methods for conversational recommender systems (CRS) leverage external knowledge to enhance the representations of both items and contextual words, in order to achieve high-quality recommendations and response generation. However, the representations of items and words are usually modeled in two separate semantic spaces, which leads to a misalignment between them. Consequently, the CRS achieves only sub-optimal ranking performance, especially when the user's input does not provide sufficient information. To address the limitations of previous work, we propose a new CRS framework, KLEVER, which jointly models items and their associated contextual words in the same semantic space. In particular, we construct an item descriptive graph from the items' rich textual features, such as item descriptions and categories. Based on the constructed descriptive graph, KLEVER jointly learns the embeddings of words and items to enhance both the recommender and dialog generation modules. Extensive experiments on a benchmark CRS dataset demonstrate that KLEVER achieves superior performance, especially when the information from the users' responses is lacking.
    Comment: 14 pages, 3 figures, 9 tables

    IN VITRO EVALUATION OF ANTIBACTERIAL ACTIVITY OF GARLIC ALLIUM SATIVUM AGAINST POULTRY PATHOGENS AND EFFECT OF GARLIC SUPPLEMENTATION ON DUCKLING GROWTH PERFORMANCE

    ABSTRACT-QMFS2019
    Poultry production provides a source of protein and an important income for Vietnamese farmers. Among the poultry in Vietnam, ducks account for 27.3% of the poultry headcount, and as much as 55.7% in the Mekong Delta region. Alongside the growth of duck rearing, bacterial, viral and fungal diseases occurring over the last two decades have harmed poultry producers. Escherichia coli, Salmonella enterica, Streptococcus and Pasteurella are major bacterial pathogens in ducks. The aims of this study were to investigate the antibacterial activity of garlic (Allium sativum) against E. coli, Staphylococcus aureus and Salmonella Typhimurium, and to evaluate the effect of garlic on the growth performance of ducks from 1 to 28 days of age. The results indicated that fresh garlic and dried garlic powder inhibited the tested pathogenic strains at concentrations from 2% and 4% w/v, respectively. The inhibition zones and the minimal inhibitory concentration (MIC) values of garlic extract ranged from 11.3 to 28.3 mm and from 0.02 to 0.2 g/ml, respectively. After 28 days on garlic-supplemented diets, D3 (2% fresh garlic in water) differed significantly in weight gain (WG), feed conversion ratio (FCR), protein efficiency ratio (PER) and average daily weight (ADW), whereas D2 (2% garlic powder in the basal diet) differed significantly only in feed consumption (FC), compared with D1 (control without garlic supplementation). The results demonstrate the potential of garlic application in poultry production.

    Adaptation options for agricultural cultivation systems in the South Central Coast under the context of climate change: Assessment Report.

    This report highlights the results of consultation meetings and field visits organized by the Department of Crop Production and the CGIAR Research Program on Climate Change, Agriculture and Food Security in Southeast Asia, in association with the three offices of the Department of Agriculture and Rural Development in the South Central Coast provinces of Binh Thuan, Ninh Thuan, and Khanh Hoa. It also draws on consultation with the provinces at the conference “Summing up crops production in the Winter-Spring season in 2018-2019, implementing the Summer-Autumn season, Main rice season in 2019 for the South Central Coast and the Central Highlands”, held by the Ministry of Agriculture and Rural Development in Tam Ky City, Quang Nam Province on 12 April 2019. The meetings underlined the progress made by the provinces on climate change adaptation and mitigation, options for risk reduction in agricultural production, and conversion of crop structure as a result of implementing the guidelines of the provinces and the Sector, especially solutions for the reservation and efficient, economical use of water in the context of climate change. This assessment report also reviews some issues related to the agricultural transformation of the region in adapting to risks caused by climate change. The review is based on comparative advantages in terms of geographical location and markets for key agricultural products. The report also points out shortcomings in land use and unreasonable practices in managing and using important natural resources, especially water, and provides recommendations for agricultural transformation and inter-regional connection with the Central Highlands and the Southeast. The team also introduces climate-related risk maps and adaptation plans (CS MAP), an approach applied in five provinces in the Mekong Delta region, and hopes that its expansion will be supported by the Ministry of Agriculture and Rural Development and the provinces.

    Ecommerce risk management: analysing the case Vietnam Airlines incident

    E-commerce is the purchase and sale of goods and services and the exchange of information over communications networks and the Internet. Information, information systems, computers, computer networks, and other electronic means play an especially important role. These are valuable assets and are targeted by cybercriminals. E-commerce risk management aims to protect the development of e-commerce. It includes setting information security objectives, assessing vulnerabilities, threats and attacks, and selecting countermeasures. The paper presents the theory of e-commerce risk management and analyses the Vietnam Airlines e-commerce risk management case using the DREAD model. It also provides a discussion and short recommendations for other enterprises on e-commerce risk management.
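The DREAD model named in the abstract above rates a threat on five factors (Damage, Reproducibility, Exploitability, Affected users, Discoverability) and commonly averages them into one score. The sketch below assumes a 1-10 scale per factor and a plain average; the threat names and ratings are hypothetical and are not taken from the paper or from the Vietnam Airlines case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DreadRating:
    """One threat rated on the five DREAD factors (assumed scale: 1-10)."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        """Average of the five factor ratings."""
        values = (
            self.damage,
            self.reproducibility,
            self.exploitability,
            self.affected_users,
            self.discoverability,
        )
        for value in values:
            if not 1 <= value <= 10:
                raise ValueError("each DREAD factor must be between 1 and 10")
        return sum(values) / len(values)


# Hypothetical threats and ratings, for illustration only.
threats = {
    "website defacement": DreadRating(6, 5, 5, 8, 6),
    "customer data leak": DreadRating(9, 6, 6, 9, 5),
}

# Rank threats from highest to lowest average score.
ranked = sorted(threats.items(), key=lambda item: item[1].score(), reverse=True)
for name, rating in ranked:
    print(f"{name}: {rating.score():.1f}")
```

Ranking threats by this score is one simple way to decide which countermeasures to select first; organisations that use DREAD often adapt the scale and weighting to their own context.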

    China's Angel Investment Policy

    Purpose: This paper summarizes and systematizes the policies China has applied to attract angel investment, which have made the country a global highlight in angel investment.
    Theoretical framework: The authors conducted a literature review on start-ups and angel investment to introduce angel investment in China and to analyze the role of angel investment in developing start-ups, as well as the determinants and policies of angel investment in China.
    Design/Methodology/Approach: This paper uses a qualitative method to give an overview of how China attracts angel investment for its start-ups, with an updated analysis of the relevant determinants and policies that contributed to China's success.
    Findings: The Chinese Government has actively implemented packages of measures, including tax incentives, programs, investment cooperation funds, networks, and angel investment education systems, to boost funding readiness for firms. It also focuses on promoting the start-up ecosystem and providing financial support to implement an innovation-driven development strategy that raises the nation's competitiveness.
    Research, practical & social implications: This paper provides some suggestions for the Chinese government in making policies to enhance the role of angel investment in developing start-ups.
    Originality/Value: This paper contributes to the development of research on start-ups and angel investment, particularly in China, in the context of high-tech production and breakthrough technologies, by providing a policy perspective that offers lessons for other countries.

    Addressing the Scalability Bottleneck of Semantic Technologies at Bosch

    At the heart of smart manufacturing is real-time semi-automatic decision-making. Such decisions are vital for optimizing production lines, e.g., reducing resource consumption, improving the quality of discrete manufacturing operations, and optimizing the actual products, e.g., optimizing the sampling rate for measuring product dimensions during production. Such decision-making relies on massive industrial data thus posing a real-time processing bottleneck