
212 research outputs found

Heterogeneous Entity Matching with Complex Attribute Associations using BERT and Neural Networks

Authors: Lu Jiamin; Wang Shitao
Publication date: 19/09/2023

Across various domains, data from different sources such as Baidu Baike and Wikipedia often appear in distinct forms. Current entity matching methods focus predominantly on homogeneous data, characterized by attributes that share the same structure and by concise attribute values. This orientation makes it difficult to handle data with diverse formats. Moreover, prevailing approaches aggregate the similarity of attribute values between corresponding attributes to determine entity similarity, yet they often overlook the intricate interrelationships between attributes, where one attribute may have multiple associations. Simple pairwise attribute comparison fails to exploit the wealth of information encapsulated within entities. To address these challenges, we introduce a novel entity matching model, the Entity Matching Model for Capturing Complex Attribute Relationships (EMM-CCAR), built upon pre-trained models. Specifically, the model transforms the matching task into a sequence matching problem to mitigate the impact of varying data formats. By introducing attention mechanisms, it also identifies complex relationships between attributes, emphasizing the degree of matching among multiple attributes rather than one-to-one correspondences. With the EMM-CCAR model, we address the challenges posed by data heterogeneity and intricate attribute interdependencies. Compared with the prevalent DER-SSM and Ditto approaches, our model improves F1 scores by approximately 4% and 1%, respectively, providing a robust solution for handling attribute complexity in entity matching.

arXiv.org e-Print Archive
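
The abstract above describes turning entity matching into a sequence matching problem so that records with different schemas can be compared. A minimal sketch of that idea follows. It assumes a Ditto-style serialization with `[COL]` and `[VAL]` markers; the abstract does not state the exact input format EMM-CCAR uses, and the function names and example records are hypothetical.

```python
def serialize(entity):
    """Flatten one record into a token sequence.

    The [COL]/[VAL] markers follow the Ditto-style convention and are an
    assumption here, not the confirmed EMM-CCAR input format.
    """
    return " ".join(f"[COL] {name} [VAL] {value}" for name, value in entity.items())


def pair_sequence(left, right):
    """Join two serialized records into one sequence-pair input for a BERT-style model."""
    return f"[CLS] {serialize(left)} [SEP] {serialize(right)} [SEP]"


# Two hypothetical records for the same entity with different attribute sets.
left = {"title": "Example City", "population": "1200000"}
right = {"name": "Example City", "summary": "A city with about 1.2 million residents"}

sequence = pair_sequence(left, right)
print(sequence)
```

Because both records become plain token sequences, the schemas do not need to line up attribute by attribute; an attention-based model can then relate `population` on one side to `summary` on the other.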

Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

Authors: Hawkins Cole; Ram Dhananjay; Zha Sheng; Zhang Qingru; Zhao Tuo
Publication date: 18/10/2023

Pretrained transformer models have demonstrated remarkable performance across various natural language processing tasks. These models leverage the attention mechanism to capture long- and short-range dependencies in the sequence. However, the (full) attention mechanism incurs a high computational cost, quadratic in the sequence length, which is not affordable in tasks with long sequences, e.g., inputs with 8k tokens. Although sparse attention can be used to improve computational efficiency, as suggested in existing work, it has limited modeling capacity and often fails to capture complicated dependencies in long sequences. To tackle this challenge, we propose MASFormer, an easy-to-implement transformer variant with Mixed Attention Spans. Specifically, MASFormer is equipped with full attention to capture long-range dependencies, but only at a small number of layers. For the remaining layers, MASFormer employs only sparse attention to capture short-range dependencies. Our experiments on natural language modeling and generation tasks show that a decoder-only MASFormer model with 1.3B parameters can achieve performance competitive with vanilla transformers using full attention while significantly reducing computational cost (by up to 75%). Additionally, we investigate the effectiveness of continual training with long sequence data and how sequence length impacts downstream generation performance, which may be of independent interest. Comment: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023 Findings)

arXiv.org e-Print Archive
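
The mixed-attention-span idea in the MASFormer abstract above (full attention at a few layers, sparse attention elsewhere) can be illustrated with per-layer attention masks. The sketch below is not the authors' implementation. It assumes that the sparse layers use a causal sliding window and that the last layer is the one with full attention; the abstract fixes neither choice, and all names and sizes are hypothetical.

```python
def causal_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean mask; True means query i may attend to key j.

    window=None gives full causal attention. Otherwise each query sees only
    the most recent `window` positions, itself included (assumed sparse pattern).
    """
    return [
        [j <= i and (window is None or i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def layer_masks(num_layers, full_layers, seq_len, window):
    """One mask per layer: full attention for layers in `full_layers`, windowed otherwise."""
    return [
        causal_mask(seq_len, None if layer in full_layers else window)
        for layer in range(num_layers)
    ]


def attended_pairs(mask):
    """Count query-key pairs that are computed, a rough proxy for attention cost."""
    return sum(sum(row) for row in mask)


# Hypothetical configuration: 4 layers, full attention only at the last one.
masks = layer_masks(num_layers=4, full_layers={3}, seq_len=8, window=2)
costs = [attended_pairs(m) for m in masks]
print(costs)
```

In this toy setting the windowed layers compute far fewer query-key pairs than the full layer, and the gap widens as the sequence grows, since the windowed cost is linear in sequence length while the full cost is quadratic.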

Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment

Authors: Jiang Xin; Liu Qun; Mi Fei; Shang Lifeng; Wang Hongru; Wang Rui; Wang Weichao; Wang Yasheng; Wong Kam-Fai; Xue Boyang
Publication date: 03/11/2023

Knowledge-grounded dialogue systems based on pretrained language models (PLMs) are prone to generating responses that are factually inconsistent with the provided knowledge source. In such inconsistent responses, the dialogue models fail to accurately express the external knowledge they rely upon. Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability of FFNs, by knowledge enhancement and by alignment respectively. We first propose K-Dial, which explicitly introduces extended FFNs in Transformers to enhance factual knowledge expressions given the specific patterns of knowledge-grounded dialogue inputs. Additionally, we apply the reinforcement learning for factual consistency (RLFC) method to implicitly adjust FFNs' expressions in responses by aligning with gold knowledge for the factual consistency preference. To comprehensively assess the factual consistency and dialogue quality of responses, we employ extensive automatic measures and human evaluations, including sophisticated fine-grained NLI-based metrics. Experimental results on the WoW and CMU_DoG datasets demonstrate that our methods efficiently enhance the ability of the FFN module to convey factual knowledge, validating the efficacy of improving factual consistency for knowledge-grounded dialogue systems. Comment: EMNLP2023 Findings

arXiv.org e-Print Archive

Improve Long-term Memory Learning Through Rescaling the Error Temporally

Authors: Wang Shida; Yan Zhanglu
Publication date: 21/07/2023

This paper studies the selection of error metrics for long-term memory learning in sequence modelling. We examine the bias towards short-term memory in commonly used errors, including mean absolute/squared error. Our findings show that all temporally positive-weighted errors are biased towards short-term memory in learning linear functionals. To reduce this bias and improve long-term memory learning, we propose the use of a temporally rescaled error. In addition to reducing the bias towards short-term memory, this approach can also alleviate the vanishing gradient issue. We conduct numerical experiments on different long-memory tasks and sequence models to validate our claims. Numerical results confirm the importance of an appropriate temporally rescaled error for effective long-term memory learning. To the best of our knowledge, this is the first work that quantitatively analyzes the bias of different errors towards short-term memory in sequence modelling. Comment: 12 pages, 7 figures

arXiv.org e-Print Archive

Artificial intelligence approaches to predicting and detecting cognitive decline in older adults: A conceptual review.

Authors: Depp Colin A; Graham Sarah A; Jeste Dilip V; Kim Ho-Cheol; Lee Ellen E; Nebeker Camille; Twamley Elizabeth W; Van Patten Ryan; Yamada Yasunori
Publication venue: eScholarship, University of California
Publication date: 01/02/2020

Preserving cognition and mental capacity is critical to aging with autonomy. Early detection of pathological cognitive decline facilitates the greatest impact of restorative or preventative treatments. Artificial Intelligence (AI) in healthcare is the use of computational algorithms that mimic human cognitive functions to analyze complex medical data. AI technologies like machine learning (ML) support the integration of biological, psychological, and social factors when approaching diagnosis, prognosis, and treatment of disease. This paper serves to acquaint clinicians and other stakeholders with the use, benefits, and limitations of AI for predicting, diagnosing, and classifying mild and major neurocognitive impairments, by providing a conceptual overview of this topic with emphasis on the features explored and AI techniques employed. We present studies that fell into six categories of features used for these purposes: (1) sociodemographics; (2) clinical and psychometric assessments; (3) neuroimaging and neurophysiology; (4) electronic health records and claims; (5) novel assessments (e.g., sensors for digital data); and (6) genomics/other omics. For each category we provide examples of AI approaches, including supervised and unsupervised ML, deep learning, and natural language processing. AI technology, still nascent in healthcare, has great potential to transform the way we diagnose and treat patients with neurocognitive disorders.

eScholarship - University of California

On Effectively Learning of Knowledge in Continual Pre-training

Authors: Huang Fei; Li Yanyang; Luo Fuli; Wang Cunxiang; Xu Runxin; Zhang Yue
Publication date: 17/04/2022

Pre-trained language models (PLMs) like BERT have made significant progress in various downstream NLP tasks. However, by asking models to do cloze-style tests, recent work finds that PLMs fall short in acquiring knowledge from unstructured text. To understand the internal behaviour of PLMs in retrieving knowledge, we first define knowledge-baring (K-B) tokens and knowledge-free (K-F) tokens for unstructured text and ask professional annotators to label some samples manually. We then find that PLMs are more likely to give wrong predictions on K-B tokens and pay less attention to those tokens inside the self-attention module. Based on these observations, we develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner. Experiments on knowledge-intensive tasks show the effectiveness of the proposed methods. To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.

arXiv.org e-Print Archive

Sustainable supply management & supply chain resilience: a systematic literature review

Author: Tikkanen E. (Eetu)
Publication venue: University of Oulu
Publication date: 17/05/2022

Abstract. For the past two years there have been significant disruptions to supply chains (SC) across the planet. Larger problems arose when the COVID-19 outbreak began and a great number of workers had to quarantine, leading to factory closures. These closures affected the world’s supply lines, which then led demand to grow to unsustainable levels. A number of things can be done to soften the blow of disruptions to the performance of a business entity. One is to develop supply chain resilience (SCRes) to achieve a more robust SC. Another is the more proactive route of practicing sustainable supply management (SSM), which takes into account and addresses an organization’s weaknesses in the areas of environmental, social and corporate governance. The aim of this research is to find out how much real-world case study literature exists on the combined topic of SSM & SCRes, by conducting a systematic literature review. The knowledge this master’s thesis seeks to draw from the literature is which industries have conducted case studies on improving SSM or SCRes and which methods were used to improve the organization’s SSM & SCRes. For this reason, two research questions were set out: RQ1: In which industries were SSM & SCRes considered simultaneously in the reviewed literature? RQ2: Which management areas, best practices and tools most improved SSM and SCRes? The key findings indicate that a surprisingly wide range of industries have been involved in case study research, and that about a dozen valid methods for improving an organization’s SSM & SCRes emerged from them. The findings of this research can be used as a starting point for managerial implications and as a way to seek data on the case study literature of SSM & SCRes published to this point.
Finnish-language abstract (Kestävän hankinnan johtaminen & toimitusketjun resilienssi: systemaattinen kirjallisuuskatsaus), in English translation: Over the past two years, supply chains around the globe have seen significant disruptions. The biggest problems emerged when the COVID-19 pandemic began, forcing many factory workers into quarantine. The quarantines led to factory closures, which in turn contributed to supply difficulties worldwide. As a result, demand grew to an unsustainable level. Disruptions that affect the performance of business units cannot be entirely prevented, but performance itself can be made more resilient to them. One way to do this is to develop supply chain resilience (SCRes) and thereby achieve a more robust supply chain. Another factor in achieving better disruption resistance is to develop the organization’s sustainable supply management (SSM), since SSM takes into account and addresses the organization’s weaknesses in environmental, social and governance matters. The purpose of this study is to determine, through a systematic literature review, how much real-world case study literature exists on the combination of SSM & SCRes. The knowledge this master’s thesis seeks from the literature is what different industries have done on the topics of SSM & SCRes and which concrete methods were used to develop these aspects. The objectives are therefore addressed through two research questions: RQ1: In which industries had the development of SSM & SCRes been considered simultaneously in the reviewed literature? RQ2: Which SSM & SCRes methods produced the best results? The key findings show that a surprisingly wide range of industries have taken part in case studies. From these industries, about ten valid SSM & SCRes development methods were identified. The findings of this master’s thesis can be used to develop and further study SSM & SCRes management methods.

Musculoskeletal Load Exposure Estimation by Non-supervised Annotation of Events on Motion Data

Author: Santos António Luís Marim dos
Publication date: 01/11/2021

A significant number of work pressures promote the incidence of musculoskeletal disorders in industrial environments. Because many workplace conditions are subject to these biomechanical hazards, such disorders have become a very common health problem. To properly adjust intervention strategies, an ergonomic assessment through surveillance measurements is required. However, most measurements still depend on subjective assessment tools such as self-reporting and expert observation. The ideal approach for this scenario would be to use direct measurements, in which sensors retrieve more precise and accurate information on how workers interact with their work environment. One of the major constraints of this approach is that systematic retrieval of data from a labor environment would require a tiresome process of analysis and manual annotation, diverting resources and requiring data analysts. Hence, this work proposes an unsupervised methodology able to automatically annotate relevant events from direct acquisitions, with the final intent of promoting this type of analysis. The event detection methodology aims to detect three different event types: 1) work period transition; 2) work cycle transition; and 3) sub-sequence matching by query. To achieve this, the multivariate time series are represented as a Self-Similarity matrix built from the extracted features. This matrix is analysed for each type of event that needs to be found. The results were successful in the segmentation of Active and Non-active working periods and in the detection of points of transition between repetitive human motions, i.e. work cycles. A search-by-example method is also presented, which allows the user to detect specific motions of interest.
Although this method could still be further optimized in future work, the approach is very promising, as it proposes a strategy of similarity analysis that has not yet been deeply explored in the context of ergonomic acquisition. These advances are also significant given that the summarization of ergonomic data is still a subject in expansion.
Portuguese-language abstract, in English translation: In an industrial context, several strains promote the incidence of musculoskeletal disorders. Since most working conditions are subject to these biomechanical hazards, musculoskeletal disorders have become widely diagnosed conditions in the working population. To design efficient intervention strategies, an ergonomic assessment based on surveillance methodologies is needed. Although this need is recognized, most measurements still depend on subjective tools such as self-assessment and external observation by experts. The preferred approach would be to apply direct measurements that use sensors to extract exact and reliable information from the work environment. One of the major limitations of this range of solutions is that a data collection system in this environment entails an exhaustive process of analysis and manual annotation, which consumes resources and requires the services of data analysts. This work therefore proposes a methodology capable of automatically annotating relevant events from direct acquisitions, with the ultimate goal of promoting more efficient analyses of this kind. The proposed event detection methodology focuses on three types of events: 1) transitions between tasks; 2) transitions between work cycles; and 3) search for example movements in segmented samples. To carry out this work, a study of self-similarity matrices was conducted. The results proved mostly successful in segmenting Active and Non-active periods and in detecting moments of transition between repetitive movements, that is, work cycles. A search-by-example method is also presented that allows the user to detect example movements of interest. Although this method can still be optimized in future work, it reflects a promising approach, since it proposes a similarity analysis strategy that has not yet been particularly explored in the context of ergonomic studies. These advances are also significant given that the summarization of ergonomic data is a line of research still in expansion.
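
The self-similarity matrix mentioned in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis method: it assumes cosine similarity between per-window feature vectors and flags a transition wherever two consecutive windows fall below a similarity threshold. The feature values, the threshold, and the function names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def self_similarity(features):
    """Pairwise similarity between every pair of windows in the series."""
    return [[cosine(u, v) for v in features] for u in features]


def transitions(ssm, threshold=0.5):
    """Indices where a window differs sharply from the previous one (assumed rule)."""
    return [i for i in range(1, len(ssm)) if ssm[i][i - 1] < threshold]


# Hypothetical features: three windows of one motion pattern, then three of another.
features = [[1.0, 0.1], [0.9, 0.2], [1.0, 0.0], [0.1, 1.0], [0.0, 0.9], [0.2, 1.0]]
ssm = self_similarity(features)
boundaries = transitions(ssm)
print(boundaries)
```

Windows belonging to the same activity form high-similarity blocks along the diagonal of the matrix, so a drop in similarity between neighbouring windows marks a candidate transition.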