Embeddings for Named Entity Recognition in Geoscience Portuguese Literature
This work focuses on Portuguese Named Entity Recognition (NER) in the Geology domain. The only domain-specific dataset in the Portuguese language annotated for Named Entity Recognition is the GeoCorpus. Our approach relies on Bidirectional Long Short-Term Memory with Conditional Random Fields neural networks (BiLSTM-CRF), a widely used type of network for this area of research, that use vector and tensor embedding representations. We used three types of embedding models (Word Embeddings, Flair Embeddings, and Stacked Embeddings) under two versions (domain-specific and generalized). We originally trained the domain-specific Flair Embeddings model with a generalized context in mind, but fine-tuned it with domain-specific Oil and Gas corpora, as there were simply not enough domain corpora to properly train such a model. We evaluated each of these embeddings separately, as well as stacked with another embedding. Finally, we achieved state-of-the-art results for this domain with one of our embeddings, and we performed an error analysis on the language model that achieved the best results. Furthermore, we investigated the effects of domain-specific versus generalized embeddings. UIDB/00057/2020, CEECIND/01997/201
A Deep Learning Entity Extraction Model for Chinese Government Documents
In this paper, we propose an entity recognition model for Chinese documents that combines a Whole-Word-Masking based Robustly Optimized BERT pretraining approach (RoBERTa) with dictionary embeddings. By using multiple feature vectors, generated by sources such as RoBERTa and domain dictionaries, as embedding layers, the contextual semantic information of the text is fully considered. Meanwhile, Bi-directional Long Short-Term Memory (BiLSTM) and a multi-head attention mechanism are used to learn the long-distance dependencies of the text. We use a conditional random field (CRF) to obtain the globally optimal annotation sequence, which is expected to improve the performance of the model. We conduct comparison experiments against five baseline methods on an official-document dataset from the government affairs domain. The Precision of the model is 91.8%, Recall is 90.5%, and F1 value is 91.1%, which are better than those of the baseline models, indicating that the proposed model is more accurate at recognizing named entities in government documents.
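As a quick consistency check on the figures reported above, the F1 value can be recomputed as the harmonic mean of Precision and Recall. This is a minimal sketch, not code from the paper; the function name `f1_score` is hypothetical, and it assumes the abstract uses the standard F1 definition.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition).

    Hypothetical helper for illustration; inputs may be fractions or
    percentages, as long as both use the same scale.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the abstract, in percent.
reported_precision = 91.8
reported_recall = 90.5

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.1f}%")
```

Under that assumption, the recomputed value rounds to the 91.1% stated in the abstract.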
Relation Network for Multi-label Aerial Image Classification
Multi-label classification plays an important role in perceiving the intricate contents of an aerial image and has triggered several related studies in recent years. However, most of them make little effort to exploit label relations, while such dependencies are crucial for making accurate predictions. Although an LSTM layer can be introduced to model such label dependencies in a chain propagation manner, its efficiency might be questioned when certain labels are improperly inferred. To address this, we propose a novel aerial image multi-label classification network, the attention-aware label relational reasoning network. Our network consists of three elemental modules: 1) a label-wise feature parcel learning module, 2) an attentional region extraction module, and 3) a label relational inference module. More specifically, the label-wise feature parcel learning module is designed for extracting high-level label-specific features. The attentional region extraction module aims at localizing discriminative regions in these features and yielding attentional label-specific features. The label relational inference module finally predicts label existences using label relations reasoned from the outputs of the previous module. The proposed network is characterized by its capacity to extract discriminative label-wise features in a proposal-free way and to reason about label relations naturally and interpretably. In our experiments, we evaluate the proposed model on the UCM multi-label dataset and a newly produced dataset, the AID multi-label dataset. Quantitative and qualitative results on these two datasets demonstrate the effectiveness of our model. To facilitate progress in multi-label aerial image classification, the AID multi-label dataset will be made publicly available.
Deep Learning for Aerial Scene Understanding in High Resolution Remote Sensing Imagery from the Lab to the Wild
This thesis presents the application of deep learning to aerial scene understanding, e.g., aerial scene recognition, multi-label object classification, and semantic segmentation. Beyond training deep networks under laboratory conditions, this work also offers learning strategies for practical scenarios, e.g., when data are collected without constraints or annotations are scarce.
Sustainable Agriculture and Advances of Remote Sensing (Volume 2)
Agriculture, as the main source of food and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase the production of our global food system, reduce biodiversity loss, and preserve our natural ecosystems, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering leading to sustainable agricultural practices. Earth observation data, in situ data, and proxy-remote sensing data are the main sources of information for monitoring and analyzing agricultural activities. Particular attention is given to earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, and to WebGIS and the Internet of Things for sharing and publishing the results, among others.
LIPIcs, Volume 277, GIScience 2023, Complete Volume