Heterogeneous Entity Matching with Complex Attribute Associations using BERT and Neural Networks
Across various domains, data from different sources such as Baidu Baike and
Wikipedia often manifest in distinct forms. Current entity matching
methodologies predominantly focus on homogeneous data, characterized by
attributes that share the same structure and concise attribute values. However,
this orientation poses challenges in handling data with diverse formats.
Moreover, prevailing approaches aggregate the similarity of attribute values
between corresponding attributes to ascertain entity similarity. Yet, they
often overlook the intricate interrelationships between attributes, where one
attribute may have multiple associations. The simplistic approach of pairwise
attribute comparison fails to harness the wealth of information encapsulated
within entities. To address these challenges, we introduce a novel entity
matching model, dubbed Entity Matching Model for Capturing Complex Attribute
Relationships (EMM-CCAR), built upon pre-trained models. Specifically, this model
transforms the matching task into a sequence matching problem to mitigate the
impact of varying data formats. Moreover, by introducing attention mechanisms,
it identifies complex relationships between attributes, emphasizing the degree
of matching among multiple attributes rather than one-to-one correspondences.
Through the integration of the EMM-CCAR model, we adeptly surmount the
challenges posed by data heterogeneity and intricate attribute
interdependencies. In comparison with the prevalent DER-SSM and Ditto
approaches, our model achieves improvements of approximately 4% and 1% in F1
scores, respectively. This furnishes a robust solution for addressing the
intricacies of attribute complexity in entity matching.
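The reformulation of entity matching as sequence matching can be illustrated with a minimal sketch. It assumes a Ditto-style serialization with `[COL]` and `[VAL]` markers; the helper names `serialize_entity` and `build_pair_input` and the sample records are hypothetical, and this is not the EMM-CCAR implementation itself.

```python
def serialize_entity(record):
    """Flatten an attribute dict into one text sequence.

    The [COL]/[VAL] markers follow the Ditto-style convention (an assumption here).
    """
    parts = []
    for attr, value in record.items():
        parts.append(f"[COL] {attr} [VAL] {value}")
    return " ".join(parts)


def build_pair_input(left, right):
    """Join two serialized entities into a single sequence-pair input string."""
    return f"[CLS] {serialize_entity(left)} [SEP] {serialize_entity(right)} [SEP]"


# Two invented records with different schemas (heterogeneous sources).
left = {"name": "Alan Turing", "born": "1912"}
right = {
    "title": "Alan Mathison Turing",
    "birth_date": "23 June 1912",
    "field": "computer science",
}

pair_text = build_pair_input(left, right)
print(pair_text)
```

Because both records end up in one token sequence, a pre-trained encoder with attention can relate any attribute on one side to any number of attributes on the other, rather than only comparing aligned columns.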
Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
Pretrained transformer models have demonstrated remarkable performance across
various natural language processing tasks. These models leverage the attention
mechanism to capture long- and short-range dependencies in the sequence.
However, the (full) attention mechanism incurs high computational cost -
quadratic in the sequence length, which is not affordable in tasks with long
sequences, e.g., inputs with 8k tokens. Although sparse attention can be used
to improve computational efficiency, as suggested in existing work, it has
limited modeling capacity and often fails to capture complicated dependencies
in long sequences. To tackle this challenge, we propose MASFormer, an
easy-to-implement transformer variant with Mixed Attention Spans. Specifically,
MASFormer is equipped with full attention to capture long-range dependencies,
but only at a small number of layers. For the remaining layers, MASFormer only
employs sparse attention to capture short-range dependencies. Our experiments
on natural language modeling and generation tasks show that a decoder-only
MASFormer model of 1.3B parameters can achieve competitive performance to
vanilla transformers with full attention while significantly reducing
computational cost (up to 75%). Additionally, we investigate the effectiveness
of continual training with long sequence data and how sequence length impacts
downstream generation performance, which may be of independent interest.
Comment: The 2023 Conference on Empirical Methods in Natural Language
Processing (EMNLP 2023 Findings)
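The mixed-attention-span idea can be sketched with per-layer attention masks. The sketch below assumes causal (decoder-only) attention, a sliding window as the sparse pattern, and full attention at the last layer only; the choice of layers, the window size, and the function names are illustrative assumptions, not details taken from the abstract.

```python
def causal_mask(seq_len, window=None):
    """Build a seq_len x seq_len boolean mask.

    mask[i][j] is True when position i may attend to position j.
    window=None gives full causal attention; otherwise each position
    sees only the most recent `window` positions, itself included.
    """
    return [
        [j <= i and (window is None or i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def mixed_span_masks(num_layers, full_layers, seq_len, window):
    """Full attention at the layers in `full_layers`, sliding window elsewhere."""
    return [
        causal_mask(seq_len, None if layer in full_layers else window)
        for layer in range(num_layers)
    ]


def attended_pairs(mask):
    """Count the (query, key) pairs the mask allows, a rough proxy for cost."""
    return sum(sum(row) for row in mask)


masks = mixed_span_masks(num_layers=4, full_layers={3}, seq_len=8, window=2)
counts = [attended_pairs(m) for m in masks]
print(counts)  # [15, 15, 15, 36]
```

In this toy setting the windowed layers score 15 query-key pairs each against 36 for the full layer, which shows where the savings come from as the sequence length grows.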
Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment
Pretrained language models (PLMs) based knowledge-grounded dialogue systems
are prone to generate responses that are factually inconsistent with the
provided knowledge source. In such inconsistent responses, the dialogue models
fail to accurately express the external knowledge they rely upon. Inspired by
previous work which identified that feed-forward networks (FFNs) within
Transformers are responsible for factual knowledge expressions, we investigate
two methods to efficiently improve the factual expression capability of FFNs
by knowledge enhancement and alignment respectively. We first propose
K-Dial, which explicitly introduces extended FFNs in Transformers
to enhance factual knowledge expressions given the specific patterns of
knowledge-grounded dialogue inputs. Additionally, we apply the reinforcement
learning for factual consistency (RLFC) method to implicitly adjust FFNs'
expressions in responses by aligning with gold knowledge for the factual
consistency preference. To comprehensively assess the factual consistency and
dialogue quality of responses, we employ extensive automatic measures and human
evaluations including sophisticated fine-grained NLI-based metrics.
Experimental results on WoW and CMU_DoG datasets demonstrate that our methods
efficiently enhance the ability of the FFN module to convey factual knowledge,
validating the efficacy of improving factual consistency for knowledge-grounded
dialogue systems.
Comment: EMNLP 2023 Findings
Improve Long-term Memory Learning Through Rescaling the Error Temporally
This paper studies the error metric selection for long-term memory learning
in sequence modelling. We examine the bias towards short-term memory in
commonly used errors, including mean absolute/squared error. Our findings show
that all temporally positive-weighted errors are biased towards short-term
memory in learning linear functionals. To reduce this bias and improve
long-term memory learning, we propose the use of a temporally rescaled error.
In addition to reducing the bias towards short-term memory, this approach can
also alleviate the vanishing gradient issue. We conduct numerical experiments
on different long-memory tasks and sequence models to validate our claims.
Numerical results confirm the importance of appropriate temporally rescaled
error for effective long-term memory learning. To the best of our knowledge,
this is the first work that quantitatively analyzes different errors' memory
bias towards short-term memory in sequence modelling.
Comment: 12 pages, 7 figures
Artificial intelligence approaches to predicting and detecting cognitive decline in older adults: A conceptual review.
Preserving cognition and mental capacity is critical to aging with autonomy. Early detection of pathological cognitive decline facilitates the greatest impact of restorative or preventative treatments. Artificial Intelligence (AI) in healthcare is the use of computational algorithms that mimic human cognitive functions to analyze complex medical data. AI technologies like machine learning (ML) support the integration of biological, psychological, and social factors when approaching diagnosis, prognosis, and treatment of disease. This paper serves to acquaint clinicians and other stakeholders with the use, benefits, and limitations of AI for predicting, diagnosing, and classifying mild and major neurocognitive impairments, by providing a conceptual overview of this topic with emphasis on the features explored and AI techniques employed. We present studies that fell into six categories of features used for these purposes: (1) sociodemographics; (2) clinical and psychometric assessments; (3) neuroimaging and neurophysiology; (4) electronic health records and claims; (5) novel assessments (e.g., sensors for digital data); and (6) genomics/other omics. For each category we provide examples of AI approaches, including supervised and unsupervised ML, deep learning, and natural language processing. AI technology, still nascent in healthcare, has great potential to transform the way we diagnose and treat patients with neurocognitive disorders.
On Effectively Learning of Knowledge in Continual Pre-training
Pre-trained language models (PLMs) like BERT have made significant progress
in various downstream NLP tasks. However, by asking models to do cloze-style
tests, recent work finds that PLMs fall short in acquiring knowledge from
unstructured text. To understand the internal behaviour of PLMs in retrieving
knowledge, we first define knowledge-bearing (K-B) tokens and knowledge-free
(K-F) tokens for unstructured text and ask professional annotators to label
some samples manually. Then, we find that PLMs are more likely to give wrong
predictions on K-B tokens and pay less attention to those tokens inside the
self-attention module. Based on these observations, we develop two solutions to
help the model learn more knowledge from unstructured text in a fully
self-supervised manner. Experiments on knowledge-intensive tasks show the
effectiveness of the proposed methods. To the best of our knowledge, we are the first
to explore fully self-supervised learning of knowledge in continual
pre-training.
Sustainable supply management & supply chain resilience: a systematic literature review
Abstract. For the past two years there have been significant disruptions to supply chains (SC) across the planet. Larger problems arose when the COVID-19 outbreak happened and a great number of workers had to quarantine, leading to factory closures. These closures affected the world’s supply lines, which then led demand to grow to unsustainable levels. A number of things can be done to soften the blow of disruptions to the performance of a business entity. One is to develop supply chain resilience (SCRes) to achieve a more robust SC. Another is the more proactive route of practicing sustainable supply management (SSM), which takes into account and addresses an organization’s weaknesses in the areas of environmental, social and corporate governance.
The aim of this research is to find out how much real-world case study literature exists on the combined topic of SSM & SCRes, by conducting a systematic literature review. This master’s thesis seeks to learn from the literature which industries have conducted case studies on improving SSM or SCRes, and which methods were used to improve the organizations’ SSM & SCRes. For this reason, two research questions were set out:
RQ1: In which industries were SSM & SCRes considered simultaneously in the reviewed literature?
RQ2: Which management areas, best practices and tools most improved SSM and SCRes?
The key findings indicate that a surprisingly wide range of industries have been involved in case study research, and that these studies yield about a dozen validated methods for improving an organization’s SSM & SCRes. The findings of this research can be used as a starting point for managerial implications and as a way to locate the case study literature on SSM & SCRes published to this point.
Sustainable supply management & supply chain resilience: a systematic literature review. Abstract. Over the past two years there have been significant disruptions to supply chains around the globe. The biggest problems emerged when the COVID-19 pandemic began, as a result of which many factory workers had to quarantine. The quarantines led to factory closures, which in turn contributed to supply difficulties worldwide. This was followed by demand growing to an unsustainable level. Disruptions affecting the performance of business units cannot be prevented entirely, but performance itself can be made more resilient to disruptions. One way is to develop supply chain resilience (SCRes) and thereby achieve a more robust supply chain. Another factor in achieving better resistance to disruption is to develop the organization’s sustainable supply management (SSM), since SSM takes into account and addresses the organization’s weaknesses in environmental, social and governance matters.
The purpose of this research is to determine, by means of a systematic literature review, how much real-world case study literature exists on the combination of SSM & SCRes. The knowledge this master’s thesis seeks from the literature is what different industries have done on the topics of SSM & SCRes and which concrete methods were used to develop SSM & SCRes. The aims of the research are therefore pursued through these two research questions:
RQ1: In which industries had the development of SSM & SCRes been examined simultaneously in the reviewed literature?
RQ2: Which SSM & SCRes methods produced the best results?
The key findings show that a surprisingly large number of different industries have taken part in case studies. About a dozen valid SSM & SCRes development methods were identified from these industries. The findings of this master’s thesis can be used for developing and further researching SSM & SCRes management methods.
Musculoskeletal Load Exposure Estimation by Non-supervised Annotation of Events on Motion Data
A significant number of work pressures promote the incidence of musculoskeletal
disorders in industrial environments. As, unfortunately, many workplace
conditions are subject to these biomechanical hazards, this has become an extremely
common health disorder. To properly adjust intervention strategies, an ergonomic assessment
through surveillance measurements is required. However, most measurements still
depend on subjective assessment tools like self-reporting and expert observation.
The ideal approach for this scenario would be to use direct measurements that use
sensors to retrieve more precise and accurate information on how workers interact with their
work environment. One major constraint of this approach is that
a systematic retrieval of data from a labor environment would require a tiresome process
of analysis and manual annotation, diverting resources and requiring data analysts.
Hence, this work proposes an unsupervised methodology able to automatically annotate
relevant events from direct acquisitions, with the final intent of promoting this type
of analysis. The event detection methodology proposes to detect three different event
types: 1) work period transition; 2) work cycle transition; and 3) sub-sequence matching
by query. To achieve this, the multivariate time series are represented as a Self-Similarity
matrix built from the extracted features. This matrix is then analysed for each event type
to be detected.
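A self-similarity matrix of the kind described can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it uses a one-dimensional toy signal rather than multivariate sensor data, mean and standard deviation as the only window features, and `1 / (1 + Euclidean distance)` as the similarity measure. None of these choices, nor the function names, are taken from the thesis.

```python
import math


def window_features(signal, size):
    """Split a 1-D signal into non-overlapping windows, each described by (mean, std)."""
    feats = []
    for start in range(0, len(signal) - size + 1, size):
        chunk = signal[start:start + size]
        mean = sum(chunk) / size
        var = sum((x - mean) ** 2 for x in chunk) / size
        feats.append((mean, math.sqrt(var)))
    return feats


def self_similarity(feats):
    """Pairwise similarity 1 / (1 + Euclidean distance) between window features."""
    n = len(feats)
    return [
        [1.0 / (1.0 + math.dist(feats[i], feats[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy signal: a resting segment followed by an active, oscillating segment.
rest = [0.0] * 8
active = [1.0, -1.0] * 4
ssm = self_similarity(window_features(rest + active, size=4))

for row in ssm:
    print([round(v, 2) for v in row])
```

The resulting matrix has two high-similarity blocks on the diagonal (rest versus rest, active versus active) and low similarity off the blocks, so the boundary between the blocks marks the transition between the two periods.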
The results were successful in the segmentation of Active and Non-active working
periods and in the detection of points of transition between repetitive human motions,
i.e. work cycles. A search-by-example method is also presented, allowing
the user to detect specific motions of interest. Although this method could still be further
optimized in future work, this approach has a very promising prospect as it proposes
a strategy of similarity analysis that has not yet been deeply explored in the context of
ergonomic acquisition. These advances are also significant given that the summarization
of ergonomic data is still an expanding field of study.
In an industrial context, several pressures promote the incidence of musculoskeletal
disorders. Since most working conditions are subject to these biomechanical
hazards, musculoskeletal disorders have become widely diagnosed conditions
in the working population. To design efficient intervention strategies,
an ergonomic assessment based on surveillance methodologies is necessary.
Despite recognition of this need, most measurements still depend on subjective
tools such as self-assessment and external observation by experts.
The preferred approach to this problem would be to apply direct measurements
that use sensors to extract accurate and reliable information from the
work environment. One of the main limitations of this range of solutions is that
a data collection system in this environment entails an exhaustive process of
manual analysis and annotation, which consumes resources and requires the services
of data analysts.
Thus, this work proposes a methodology capable of automatically annotating relevant
events from direct acquisitions, with the final goal of promoting more efficient
analyses of this type. The proposed event detection methodology focuses on
three different event types: 1) transitions between tasks; 2) transitions between work cycles;
and 3) search for example movements in segmented samples. To carry out
this work, a study of self-similarity matrices was conducted.
The results proved, for the most part, successful in the segmentation of
Active and Non-active periods and in the detection of transition moments between
repetitive movements, that is, work cycles. A search-by-example method is also presented,
which allows the user to detect example movements of interest. Although
this method can still be optimized in future work, it reflects a promising
approach, since it proposes a similarity analysis strategy that has not yet been
particularly explored in the context of ergonomic studies. These advances are
also significant in that the summarization of ergonomic data is a line
of research still in expansion.
- …