Entity matching with transformer architectures - a step forward in data integration
Transformer architectures have proven to be very effective, providing state-of-the-art results on many natural language tasks. The attention-based architecture, combined with pre-training on large amounts of text, led to the recent breakthrough and to a variety of slightly different implementations.
In this paper we analyze how well four of the most recent attention-based transformer architectures (BERT, XLNet, RoBERTa, and DistilBERT) perform on the task of entity matching, a crucial part of data integration. Entity matching (EM) is the task of finding data instances that refer to the same real-world entity. It is challenging when the data instances consist of long textual data or are "dirty" due to misplaced values.
To evaluate the capability of transformer architectures and transfer learning on the task of EM, we empirically compare the four approaches on inherently difficult data sets. We show that transformer architectures outperform classical deep learning methods in EM by an average margin of 27.5%.
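To make the task concrete: transformer-based EM systems typically flatten a pair of records into a single sequence for a cross-encoder. The sketch below uses the common [COL]/[VAL] tagging scheme; the tags and field layout are our own assumptions for illustration, not necessarily this paper's exact input format.

```python
def serialize_pair(a: dict, b: dict) -> str:
    """Flatten two records into one sequence for a transformer cross-encoder.

    Uses the widely adopted [COL]/[VAL] attribute-tagging scheme (an
    illustrative assumption, not the paper's confirmed input format).
    """
    def ser(rec: dict) -> str:
        # Tag each attribute name and value so the model sees the schema.
        return " ".join(f"[COL] {k} [VAL] {v}" for k, v in rec.items())
    # [SEP] marks the boundary between the two candidate records.
    return f"{ser(a)} [SEP] {ser(b)}"
```

The resulting string would then be tokenized and fed to the fine-tuned model, which predicts match / non-match.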
Humans and language models diverge when predicting repeating text
Language models that are trained on the next-word prediction task have been shown to accurately model human behavior in word prediction and reading speed. In contrast with these findings, we present a scenario in which the performance of humans and LMs diverges. We collected a dataset of human next-word predictions for five stimuli that are formed by repeating spans of text. Human and GPT-2 LM predictions are strongly aligned in the first presentation of a text span, but their performance quickly diverges when memory (or in-context learning) begins to play a role. We traced the cause of this divergence to specific attention heads in a middle layer. Adding a power-law recency bias to these attention heads yielded a model that performs much more similarly to humans. We hope that this scenario will spur future work in bringing LMs closer to human behavior.

Comment: To appear in the 26th Conference on Computational Natural Language Learning (CoNLL 2023). Code and data are available at https://github.com/HuthLab/lm-repeating-tex
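The intervention the abstract describes, a power-law recency bias on attention heads, can be sketched as follows. The exact bias form, (distance + 1)^(-alpha) applied in log space before the softmax, and the value of alpha are illustrative assumptions, not the paper's fitted parameters.

```python
import math

def recency_biased_attention(scores, alpha=1.0):
    """Apply a power-law recency bias to raw attention scores for a query
    at the last position, then softmax.

    Keys farther from the query are down-weighted by (distance + 1)^(-alpha),
    added as -alpha * log(distance + 1) to the pre-softmax scores. The bias
    form and alpha are illustrative assumptions.
    """
    n = len(scores)
    # Key i sits at distance (n - 1 - i) from the final-position query.
    biased = [s - alpha * math.log(n - i) for i, s in enumerate(scores)]
    # Numerically stable softmax.
    m = max(biased)
    exps = [math.exp(b - m) for b in biased]
    z = sum(exps)
    return [e / z for e in exps]
```

With uniform raw scores, the returned weights increase monotonically toward the most recent positions, which is the qualitative effect the paper attributes to the bias.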
Emerging technologies and future trends in substation automation systems for the protection, monitoring and control of electrical substations
Integrated Master's thesis (Tese de Mestrado Integrado). Electrical and Computer Engineering (Automation). Faculdade de Engenharia, Universidade do Porto. 201
When Urban Region Profiling Meets Large Language Models
Urban region profiling from web-sourced data is of utmost importance for urban planning and sustainable development. We are witnessing a rising trend of applying LLMs to various fields, especially in multi-modal data research such as vision-language learning, where the text modality serves as supplementary information for the image. Since the textual modality has never been introduced into the modality combinations used in urban region profiling, we aim to answer two fundamental questions in this paper: i) Can the textual modality enhance urban region profiling? ii) If so, in what ways and with regard to which aspects? To answer these questions, we leverage the power of Large Language Models (LLMs) and introduce the first-ever LLM-enhanced framework that integrates the knowledge of the textual modality into urban imagery profiling, named LLM-enhanced Urban Region Profiling with Contrastive Language-Image Pretraining (UrbanCLIP). Specifically, it first generates a detailed textual description for each satellite image using an open-source image-to-text LLM. The model is then trained on the image-text pairs, seamlessly unifying natural language supervision for urban visual representation learning, jointly with a contrastive loss and a language modeling loss. Results on predicting three urban indicators in four major Chinese metropolises demonstrate its superior performance, with an average improvement of 6.1% in R^2 compared to state-of-the-art methods. Our code and the image-language dataset will be released upon paper notification.
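A minimal sketch of the joint objective described above: a CLIP-style symmetric contrastive loss over an image-text similarity matrix, added to a weighted language-modeling loss. The temperature and the weighting `lam` are assumed values for illustration, not the paper's settings.

```python
import math

def info_nce(sim, temperature=0.07):
    """Symmetric CLIP-style contrastive loss over an image-text similarity
    matrix `sim`, where entry [i][j] is the similarity of image i and
    text j, and row i's positive pair is column i."""
    def xent(rows):
        loss = 0.0
        for i, row in enumerate(rows):
            logits = [s / temperature for s in row]
            # Stable log-sum-exp for the softmax normalizer.
            m = max(logits)
            logz = m + math.log(sum(math.exp(l - m) for l in logits))
            loss += logz - logits[i]  # cross-entropy with target class i
        return loss / len(rows)
    # Average the image-to-text and text-to-image directions.
    cols = [list(c) for c in zip(*sim)]
    return 0.5 * (xent(sim) + xent(cols))

def joint_loss(sim, lm_loss, lam=1.0):
    """Combined objective sketch: contrastive + weighted language-modeling
    loss. `lam` is an assumed balance, not the paper's value."""
    return info_nce(sim) + lam * lm_loss
```

Well-aligned pairs (large diagonal similarities) yield a lower contrastive loss than misaligned ones, which is what drives the image and text encoders toward a shared space.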
A Survey on Graph Neural Networks in Intelligent Transportation Systems
Intelligent Transportation Systems (ITS) are vital in alleviating traffic congestion, reducing traffic accidents, optimizing urban planning, etc. However, due to the complexity of the traffic network, traditional machine learning and statistical methods have been relegated to the background. With the advent of the artificial intelligence era, many deep learning frameworks have made remarkable progress in various fields and are now considered effective methods in many areas. As a deep learning method, Graph Neural Networks (GNNs) have emerged as a highly competitive approach in the ITS field since 2019, owing to their strong ability to model graph-related problems. As a result, more and more scholars are paying attention to the applications of GNNs in transportation domains, where they have shown excellent performance. However, most of the research in this area is still concentrated on traffic forecasting, while other ITS domains, such as autonomous vehicles and urban planning, still require more attention. This paper reviews the applications of GNNs in six representative and emerging ITS domains: traffic forecasting, autonomous vehicles, traffic signal control, transportation safety, demand prediction, and parking management. We have reviewed extensive graph-related studies from 2018 to 2023, summarized their methods, features, and contributions, and presented them in informative tables and lists. Finally, we identify the challenges of applying GNNs to ITS and suggest potential future directions.
Proceedings of the 3rd Workshop on Domain-Specific Language Design and Implementation (DSLDI 2015)
The goal of the DSLDI workshop is to bring together researchers and practitioners interested in sharing ideas on how DSLs should be designed, implemented, supported by tools, and applied in realistic application contexts. We are interested both in discovering how already known domains, such as graph processing or machine learning, can best be supported by DSLs, and in exploring new domains that could be targeted by DSLs. More generally, we are interested in building a community that can drive forward the development of modern DSLs. These informal post-proceedings contain the submitted talk abstracts to the 3rd DSLDI workshop (DSLDI'15) and a summary of the panel discussion on Language Composition.
Extracting and classifying exceptional COVID-19 measures from multilingual legal texts: The merits and limitations of automated approaches
This paper contributes to ongoing scholarly debates on the merits and limitations of computational legal text analysis by reflecting on the results of a research project documenting exceptional COVID-19 management measures in Europe. The variety of exceptional measures adopted in countries characterized by different legal systems and natural languages, as well as the rapid evolution of such measures, pose considerable challenges to manual textual analysis methods traditionally used in the social sciences. To address these challenges, we develop a supervised classifier to support the manual coding of exceptional policies by a multinational team of human coders. After presenting the results of various natural language processing (NLP) experiments, we show that human-in-the-loop approaches to computational text analysis outperform unsupervised approaches in accurately extracting policy events from legal texts. We draw lessons from our experience to ensure the successful integration of NLP methods into social science research agendas.
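As a minimal illustration of one human-in-the-loop step, hypothetical and not this project's actual pipeline, uncertainty sampling routes the classifier's least confident predictions to human coders, so manual effort concentrates where the model is unsure:

```python
def select_for_review(probs, k=2):
    """Pick the k predictions whose positive-class probability is closest
    to 0.5, i.e. the most uncertain ones, for human coders to label next.

    A minimal uncertainty-sampling sketch of a human-in-the-loop workflow;
    the function and parameters are our own illustration, not the
    project's code.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:k]
```

The newly labeled items would then be added to the training set and the classifier retrained, closing the loop.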
Limitations and challenges of unsupervised cross-lingual pre-training
[ES, translated] Cross-lingual alignment methods for monolingual language representations have received notable attention in natural language processing in recent years, largely due to their capacity to generate alignments between languages using little or no parallel information. However, their use in pre-training techniques for machine translation models, a role in which monolingual models are particularly successful and which should benefit from the cross-lingual information obtained, remains limited. This proposal tries to shed some light on the effects of some of the factors that affect cross-lingual representations and pre-training strategies, in the hope that it can help future research in this field. To this end, this work studies the two main components that constitute cross-lingual pre-training: cross-lingual alignments and their integration as pre-training models. The former are explored through several widely known unsupervised cross-lingual methods, which mainly employ distributional similarities to find a satisfactory alignment between languages. Because of this, they are an interesting testing ground on which to analyze the effects of language similarity both on cross-lingual alignment techniques and on the representation spaces over which they operate. Regarding pre-training integration, cross-lingual representation spaces are used to pre-train machine translation models, which are compared against schemes that employ independent representation spaces. The results show that weakly supervised cross-lingual methods are remarkably effective at generating alignments even for very different language pairs, and benefit notably from subword-level information. However, the effect of cross-lingual alignment in pre-training is reduced due to the difficulty of maintaining the structure of the projection during training, as well as the limited influence that pre-training itself has on the supervised model.

[EN] Cross-lingual alignment methods for monolingual language representations have received notable research attention in the past few years due to their capacity to induce bilingual alignments with little or no supervision signals. However, their use in machine translation pre-training, a function that monolingual models excel at and which should benefit from cross-lingual information, remains limited. This work tries to shed light on the effects of some of the factors that play a role in cross-lingual representations and pre-training strategies, with the hope that it can help guide future endeavors in the field. To this end, the survey studies the two main components that constitute cross-lingual pre-training: cross-lingual mappings and their pre-training integration. The former are explored through some widely known fully unsupervised cross-lingual methods, which rely on distributional similarities between languages. Consequently, they are a great basis upon which to consider the effects of language similarity on both cross-mapping techniques and the representation spaces over which they operate. In pre-training integration, cross-lingual representation spaces are used to pre-train neural machine translation models, which are compared against techniques that employ independent monolingual spaces. The results show that weakly-supervised cross-lingual methods are remarkably effective at inducing alignment even for distant languages, and they benefit noticeably from subword information. However, the effect of cross-linguality in pre-training is diminished due to difficulties in maintaining the structure of the projection during training, and the limited influence that pre-training itself has on the supervised model.

Quesada Zaragoza, M. (2021). Limitations and challenges of unsupervised cross-lingual pre-training. Universitat Politècnica de València. http://hdl.handle.net/10251/174111TFG
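The cross-lingual mapping step discussed above is usually solved with an SVD-based orthogonal Procrustes alignment in high dimensions. As a toy stand-in, the 2-D case has a closed form: the best rotation between two point sets follows directly from their dot and cross products. This is our own illustration, not the thesis's implementation.

```python
import math

def fit_rotation_2d(src, tgt):
    """Closed-form 2-D orthogonal mapping between two embedding spaces.

    A toy stand-in for the Procrustes step used by cross-lingual alignment
    methods: real embeddings are high-dimensional and the mapping is found
    with an SVD. Returns a function applying the fitted rotation.
    """
    # Accumulate dot and cross products between paired points.
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(src, tgt))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(src, tgt))
    # Optimal rotation angle minimizing squared error (2-D Kabsch).
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return lambda p: (c * p[0] - s * p[1], s * p[0] + c * p[1])
```

Given word pairs embedded in two spaces, the fitted rotation maps the source space onto the target one, which is the core of the alignment methods the work studies.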
Efficient Long-Text Understanding with Short-Text Models
Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles, and long documents, due to their quadratic complexity. While a myriad of efficient transformer variants have been proposed, they are typically based on custom implementations that require expensive pretraining from scratch. In this work, we propose SLED: SLiding-Encoder and Decoder, a simple approach for processing long sequences that reuses and leverages battle-tested short-text pretrained LMs. Specifically, we partition the input into overlapping chunks, encode each with a short-text LM encoder, and use the pretrained decoder to fuse information across chunks (fusion-in-decoder). We illustrate through controlled experiments that SLED offers a viable strategy for long text understanding and evaluate our approach on SCROLLS, a benchmark with seven datasets across a wide range of language understanding tasks. We find that SLED is competitive with specialized models that are up to 50x larger and require a dedicated and expensive pretraining step.

Comment: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2023. Authors' final version (pre-MIT
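SLED's first step, partitioning the input into overlapping chunks before each chunk is encoded independently, can be sketched as follows; the chunk length and overlap here are toy values, not the paper's actual settings.

```python
def sliding_chunks(tokens, chunk_len=8, overlap=2):
    """Split a long token sequence into overlapping chunks, as in a
    sliding-encoder scheme.

    Consecutive chunks share `overlap` tokens so that context near chunk
    boundaries is seen twice; sizes are illustrative toy values.
    """
    stride = chunk_len - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + chunk_len])
        if start + chunk_len >= len(tokens):
            break  # the final chunk already reaches the end
    return chunks
```

Each chunk would then be run through the short-text encoder, and the pretrained decoder attends over all chunk encodings at once (fusion-in-decoder) to produce the output.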
- …