Search CORE

2,092 research outputs found

Knowledge extraction from fictional texts

Author: Chu Cuong Xuan
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2022

Knowledge extraction from text is a key task in natural language processing. It involves many sub-tasks, such as taxonomy induction, named entity recognition and typing, relation extraction and knowledge canonicalization. By constructing structured knowledge from natural language text, knowledge extraction becomes a key asset for search engines, question answering and other downstream applications. However, current knowledge extraction methods mostly focus on prominent real-world entities, with Wikipedia and mainstream news articles as sources. The constructed knowledge bases therefore lack information about long-tail domains, with fiction and fantasy as archetypes. Fiction and fantasy are core parts of human culture, spanning literature, movies, TV series, comics and video games. With thousands of fictional universes in existence, knowledge from fictional domains is the subject of search-engine queries by fans as well as cultural analysts. Unlike the real-world domain, knowledge extraction on specific domains such as fiction and fantasy has to tackle several key challenges:

- Training data: Sources for fictional domains mostly come from books and fan-built content, which is sparse and noisy and contains difficult text structures, such as dialogues and quotes. Training data for key tasks such as taxonomy induction, named entity typing or relation extraction is also not available.
- Domain characteristics and diversity: Fictional universes can be highly sophisticated, containing entities, social structures and sometimes languages that are completely different from the real world. State-of-the-art methods for knowledge extraction make assumptions on entity-class, subclass and entity-entity relations that are often invalid for fictional domains. With different genres of fictional domains, another requirement is to transfer models across domains.
- Long fictional texts: While state-of-the-art models have limitations on the input sequence length, it is essential to develop methods that are able to deal with very long texts (e.g. entire books), to capture multiple contexts and leverage widely spread cues.

This dissertation addresses the above challenges by developing new methodologies that advance the state of the art on knowledge extraction in fictional domains:

- The first contribution is a method, called TiFi, for constructing type systems (taxonomy induction) for fictional domains. By tapping noisy fan-built content from online communities such as Wikia, TiFi induces taxonomies through three main steps: category cleaning, edge cleaning and top-level construction. Exploiting a variety of features from the original input, TiFi is able to construct taxonomies for a diverse range of fictional domains with high precision.
- The second contribution is a comprehensive approach, called ENTYFI, for named entity recognition and typing in long fictional texts. Built on 205 automatically induced high-quality type systems for popular fictional domains, ENTYFI exploits the overlap and reuse of these fictional domains on unseen texts. By combining different typing modules with a consolidation stage, ENTYFI is able to do fine-grained entity typing in long fictional texts with high precision and recall.
- The third contribution is an end-to-end system, called KnowFi, for extracting relations between entities in very long texts such as entire books. KnowFi leverages background knowledge from 142 popular fictional domains to identify interesting relations and to collect distant training samples. KnowFi devises a similarity-based ranking technique to reduce false positives in training samples and to select potential text passages that contain seed pairs of entities. By training a hierarchical neural network for all relations, KnowFi is able to infer relations between entity pairs across long fictional texts, and achieves gains over the best prior methods for relation extraction.
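
As a rough illustration of what a similarity-based ranking of text passages for a seed entity pair could look like, here is a minimal sketch. It is not KnowFi's actual method: the bag-of-words cosine similarity, the `relation_hint` query string, and the names `bow`, `cosine` and `rank_passages` are all assumptions made for this example only.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased bag-of-words vector (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(passages, seed_pair, relation_hint):
    """Keep passages mentioning both seed entities, ranked by
    similarity to a short textual hint for the relation (assumed input)."""
    e1, e2 = (e.lower() for e in seed_pair)
    query = bow(relation_hint)
    candidates = [p for p in passages if e1 in p.lower() and e2 in p.lower()]
    return sorted(candidates, key=lambda p: cosine(bow(p), query), reverse=True)


passages = [
    "Frodo walked past Bilbo in silence",
    "Frodo is the nephew and heir of Bilbo",
    "Sam cooked rabbits",
]
ranked = rank_passages(passages, ("Frodo", "Bilbo"), "nephew heir relative of")
```

Passages that merely co-mention the two entities fall to the bottom of the ranking, which is the intuition behind using such a score to filter noisy distant-supervision samples.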

Universaar

MPG.PuRe

sWOM and Online Shopping within a Disease Menace: The Case of Vietnam

Author: Chu Ba Quyet
Le Xuan Cu
Publication venue: 'Vilnius University Press'
Publication date: 21/06/2022

Although electronic word-of-mouth via social networking sites (sWOM) has greatly stimulated online shopping, its importance in shopping decisions during the coronavirus disease (COVID-19) pandemic has not been holistically considered. Based on the necessity of sWOM, uses and gratifications theory (UGT), and health belief theory (HBT), this study frames a consumer shopping tendency model toward sWOM in the context of the pandemic. A web-based survey was designed to collect data from 403 respondents who are inclined to patronize e-stores during the pandemic. The measurement model was then examined using a two-step method of structural equation modeling. The findings indicate that sWOM is an influential communication mode for online shopping in the pandemic. sWOM is of primary importance to information quality. Moreover, utilitarian value, social value, perceived threat, and self-efficacy toward shopping tendency are significantly motivated by sWOM. In addition, information quality, utilitarian value, social value, and perceived threat are major predictors of shopping tendency during COVID-19. Finally, theoretical and practical implications are discussed.

Organizations and Markets in Emerging Economies

Unearthing Common Inconsistency for Generalisable Deepfake Detection

Author: Chu Beilin
Xu Xuan
You Weike
Zhou Linna
Publication date: 20/11/2023

Deepfakes have been around for several years, yet efficient detection techniques that generalize across different manipulation methods require further research. While current image-level detection methods fail to generalize to unseen domains, owing to the domain shift caused by CNNs' strong inductive bias towards Deepfake texture, video-level methods show potential for both generalization across multiple domains and robustness to compression. We argue that although distinct face manipulation tools have different inherent biases, they all disrupt the consistency between frames, which is a natural characteristic shared by authentic videos. Inspired by this, we propose a detection approach that captures the frame inconsistency broadly present in different forgery techniques, termed unearthing-common-inconsistency (UCI). Concretely, the UCI network, based on self-supervised contrastive learning, can better distinguish temporal consistency between real and fake videos from multiple domains. We introduce a temporally-preserved module to add spatial noise perturbations, directing the model's attention towards temporal information. Subsequently, leveraging a multi-view cross-correlation learning module, we extensively learn the disparities in temporal representations between genuine and fake samples. Extensive experiments demonstrate the generalization ability of our method on unseen Deepfake domains.

Comment: 9 pages, 2 figures and 5 tables

arXiv.org e-Print Archive

Preconcentration of Arsenic Species in Environmental Waters by Solid Phase Extraction Using Metal-loaded Chelating Resins

Author: Chu Xuan Anh
Do Quang Trung
Fujita Masanori
Nguyen Xuan Trung
Tanaka Minoru
Yasaka Yuta
Publication venue: 日本分析化学会
Publication date: 01/01/2001

Joint Research on Environmental Science and Technology for the Earth『Annual Report of FY 2002, The Core University Program between Japan Society for the Promotion of Science (JSPS) and National Centre for Natural Science and Technology (NCST)』pp.20-23, Core University Program Office, Fujita Laboratory, Dept. of Environmental Engineering, Osaka University, 200

Osaka University Knowledge Archive

Analyse géomatique de la correspondance entre la localisation des hôpitaux de la ville d'Hanoi (Viêt-nam) et les besoins de la population en soins de santé

Author: Chu Xuan Huy
Publication venue: 'Universite de Sherbrooke'
Publication date: 01/01/2005

Currently, in the city of Hanoi, the health care system no longer meets the demand for services expressed by the population. Studies of the relationship between the supply of and demand for services can provide additional information for analysing the situation and support better decision-making for the development of the system. Our study focuses in particular on the spatial relationships between the population, health infrastructure and the environment. The objective of this study is to develop, with the help of geomatics, a model for evaluating medical care services based on the characteristics of the population and of the existing medical care system. This model consists of two parts: determining the demand for care and defining the supply of services of the health system. From cartographic, socio-economic and hospital data, indicators expressing demand and indicators related to the accessibility and availability of hospital supply were calculated. A multivariate analysis then made it possible to estimate preliminary results. These results show the current state of medical care services in the city. Hospitals are, in most cases, far from the areas where demand is high. Demand is concentrated in the city centre, and the results obtained show a deficiency there. In areas of urban expansion, the supply of adequate medical care must be improved; likewise, in several areas, the quality of the supply must be raised.

Savoirs UdeS

Effects of peer feedback on Taiwanese adolescents’ English speaking practices and development

Author: Chu Rong-Xuan
Publication venue: The University of Edinburgh
Publication date: 05/07/2013

This thesis explores the impact of peer feedback on two secondary level classrooms studying English as a foreign language in Taiwan. The effectiveness of teacher-led feedback has consistently been the focus of the relevant literature, but relatively few studies have experimentally investigated the impact of peer-led feedback on learning. This research is based on the belief that the investigation of the process of peer-led feedback, as well as the effectiveness of peer-led correction, will enhance our understanding of learners’ communicative interactions. These data will allow us the opportunity to provide suggestions for successful second/foreign language learning. This study was conducted following a mixed-methods quasi-experimental design involving a variety of data collection and analysis techniques. Observations of peer-peer dialogues taken from a Year 7 and a Year 8 class were analysed using content analysis, in order to classify the types of peer feedback provided by the Year 7 and Year 8 learners. Pre- and post-measures, including English speaking tests, questionnaires, and checklists, were examined with non-parametric statistical tests used to explore any changes in relation to the learners’ speaking development after the quasi-experiment. Key findings included the frequency and distribution of seven types of peer feedback, as used by the Year 7 and Year 8 learners, and the statistical results that revealed the differences between the pre- and post-measures. Among the seven types of peer feedback (translation, confirmation, completion, explicit indication, explicit correction, explanation and recasts), explicit correction and translation were the two techniques used most frequently by the learners. Post-test results indicated an improvement in the learners’ speaking performance. The results of pre- and post-questionnaires and pre- and post-checklists showed different levels of change in the learners’ self-evaluation of their own ability to speak English, as well as their attitudes towards corrective feedback. These results allow us to gain insight into the nature of peer interaction in communicative speaking activities as well as learners’ motives behind their feedback behaviours. Additionally, the results shed light on learners’ opinions towards corrective feedback that they received or provided in peer interaction. Further, the results yield a deepened understanding of the impacts of peer feedback on L2 development by examining changes in learners’ speaking performance, self-confidence in speaking English and self-evaluation of their own ability to speak English after a peer-led correction treatment. In conclusion, the study suggests that adolescent learners are willing and able to provide each other with feedback in peer interaction. The feedback that they delivered successfully helps their peers to attend to form and has positive impacts on their peers’ English-speaking performance. Moreover, the study provides explanations for learners’ preference for certain types of feedback techniques, which hopefully helps to tackle the mismatch between teachers’ intentions and learners’ expectations of corrective feedback in L2 classrooms.

Edinburgh Research Archive

Validating Problem Solving Competency Instrument in the New General Education Curriculum

Author: Chu Cam Tho
Dang Xuan Cuong
Duong Thi Thu Huong
Publication venue: SCHOLINK INC.
Publication date: 16/12/2023

Problem solving is a crucial skill for students learning and living in the 21st century. To enhance this skill, students need to be confronted with a problem situation and then solve the problem. The 2018 general education curriculum has been developed according to the competency approach. As a result, the instruction and assessment system needs to be adapted to align with the requirements of the new curriculum. The purpose of the present study is to develop and validate a problem solving competency (PSC) instrument based on the general requirements of this competency in the general education curriculum in Vietnam. The results of Exploratory Factor Analysis (EFA) show that the instrument can be divided into three different components with good factor loadings to measure the problem solving competency of Vietnamese students. The instrument is reliable and valid. Reliability analysis using Cronbach’s alpha revealed satisfactory internal consistency for each factor, with values ranging from .670 to .812.
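
For readers unfamiliar with the reliability statistic mentioned above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the toy score matrix are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents,
    each given as a list of k item scores (k >= 2)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical toy data: 3 respondents, 2 items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

The function assumes that the total scores are not all identical; otherwise the total variance is zero and the ratio is undefined.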

Scholink Journals

Some field experience with subsynchronous vibration of centrifugal compressors

Author: Du Yun-Tian
Gu Jin-Chu
Hua Yong-Li
Shen Qin-Gen
Wang Xi-Xuan
Zhu Lan-Sheng

Many large chemical fertilizer plants producing 1000 tons of NH3 per day and 1700 tons of urea per day were constructed in China in the 1970s. During operation, subsynchronous vibration occasionally occurs in some of the large turbine-compressor sets and has resulted in heavy economic losses. Two cases of subsynchronous vibration are described: self-excited vibration of the low-pressure (LP) cylinder of one kind of N2-H2 multistage compressor, and forced subsynchronous vibration of the high-pressure (HP) cylinder of the CO2 compressor.

NASA Technical Reports Server