
6 research outputs found

An overview of mixing augmentation methods and augmentation strategies

Author: Lewy Dominik
Mańdziuk Jacek
Publication date: 21/07/2021

Deep Convolutional Neural Networks have made incredible progress in many Computer Vision tasks. This progress, however, often relies on the availability of large amounts of training data, required to prevent over-fitting, which in many domains entails a significant cost of manual data labeling. An alternative approach is the application of data augmentation (DA) techniques that aim at model regularization by creating additional observations from the available ones. This survey focuses on two DA research streams: image mixing and automated selection of augmentation strategies. First, the presented methods are briefly described and then qualitatively compared with respect to their key characteristics. Various quantitative comparisons are also included, based on the results reported in recent DA literature. This review mainly covers methods published in the proceedings of top-tier conferences and in leading journals in the years 2017-2021.

arXiv.org e-Print Archive

AttentionMix: Data augmentation method that relies on BERT attention mechanism

Author: Lewy Dominik
Mańdziuk Jacek
Publication date: 20/09/2023

The Mixup method has proven to be a powerful data augmentation technique in Computer Vision, with many successors that perform image mixing in a guided manner. One interesting research direction is transferring the underlying Mixup idea to other domains, e.g. Natural Language Processing (NLP). Even though several methods already apply Mixup to textual data, there is still room for new, improved approaches. In this work, we introduce AttentionMix, a novel mixing method that relies on attention-based information. While the paper focuses on the BERT attention mechanism, the proposed approach can be applied to virtually any attention-based model. AttentionMix is evaluated on three standard sentiment classification datasets and in all three cases outperforms two benchmark approaches that utilize the Mixup mechanism, as well as the vanilla BERT method. The results confirm that attention-based information can be effectively used for data augmentation in the NLP domain.

arXiv.org e-Print Archive

Training CNN classifiers solely on webly data

Author: Lewy Dominik
Mańdziuk Jacek
Publication venue: Społeczna Akademia Nauk w Łodzi. Polskie Towarzystwo Sieci Neuronowych
Publication date: 01/01/2023

Real-life applications of deep learning (DL) are often limited by the lack of expert-labeled data required to train DL models effectively. Creating such data usually requires a substantial amount of time for manual categorization, which is costly and is considered one of the major impediments to the development of DL methods in many areas. This work proposes a classification approach that completely removes the need for costly expert-labeled data and instead uses noisy web data created by users who are not subject matter experts. The experiments are performed with two well-known Convolutional Neural Network (CNN) architectures, VGG16 and ResNet50, trained on three randomly collected Instagram-based sets of images from three distinct domains: metropolitan cities, popular food and common objects. The last two sets were compiled by the authors and made freely available to the research community. The dataset containing common objects is a webly counterpart of the PascalVOC2007 set. It is demonstrated that, despite a significant amount of label noise in the training data, the proposed approach paired with a standard CNN training protocol leads to high classification accuracy on representative data in all three above-mentioned domains. Additionally, two straightforward procedures for automatic cleaning of the data before its use in the training process are proposed. Data cleaning does not improve the results, which suggests that the presence of noise in webly data is actually helpful in learning meaningful and robust class representations. Manual inspection of a subset of web-based test data shows that the labels assigned to many images are ambiguous even for humans. It is our conclusion that, for the datasets and CNN architectures used in this paper, when training with webly data, a major factor contributing to the final classification accuracy is the representativeness of the test data rather than the application of data cleaning procedures.

Biblioteka Nauki - repozytorium artykułów

The sediments of Wadi Qena (Eastern Desert, Egypt)

Author: Abd El Razik
Abdallah
Bandel
Barthel
Basahel
Böttcher
Dominik
El Deftar
Fay
Garrison
Ghorab
Issawi
Jux
Keheila
Klitzsch
Kostandis
Kuss
Lewy
Luger
Mazhar
Said
Schellwien
Schweinfurth
Van Houten
Walther
Ward
Youssef
Zittel
Publication venue: Elsevier BV

Crossref

Late Cenomanian oysters from Egypt and Jordan

Author: Abdallah
Abdel Hamid
Abdel-Gawad
Ahmad
Amard
Aqrabawi
Awad
Ayoub-Hannaa
Bandel
Barber
Basha
Bauer
Blanckenhorn
Boreham
Cherif
Choffat
Collignon
Coquand
Dhondt
Dhondta
Dilley
Dominik
El-Qot
Farouk
Fawzi
Feldmann
Fischer
Fourtau
Freneix
Gale
Greco
Haq
Hewaidy
Kassab
Khalil
Kora
Kuss
Lefranc
Lewy
MacLeod
Malchus
Mekawy
Perrilliat
Pervinquière
Powell
Quennell
Reeside
Schulze
Seeling
Seguenza
Sepùlveda
Sharpe
Soares
Stoliczka
Trevisan
Wetzel
Wiese
Wilmsen
Woods
Zakhera
Publication venue: Elsevier BV

Crossref

Lithostratigraphy, sedimentology, and cyclicity of the Duwi Formation (late Cretaceous) at Abu Tartur plateau, Western Desert of Egypt: evidences for reworking and redeposition

Author: A Almogi-Labin
A Katz
A Lewy
A Scotti
A Seilacher
A Wetzel
AA Tantawy
Abdalla M. El Ayyat
AF Embry
AM Abed
AM Kammar El
AS Wassef
B Issawi
BF Atwater
BR Wilkinson
BU Haq
C Savrda
CA Cowan
CR Glenn
D Osleger
DE Sibley
DH Porrenga
DJC Laming
E Klitzsch
E Schrank
EA Ahmed
F Hendriks
FB Houten Van
G Einsele
G Mateu-Vicens
GN Baturin
GV Chilinger
H Mansour
HD Johanson
HD Klemme
HR Wanless
J Small
J Trappe
JA Dorr
JA Kupecz
JF Burst
JF Read
JL Cisne
JL Wilson
JM Chievelet
JP Grotzinger
K Barthel
KB Föllmi
KC Cloyd
KG Condie
KN Sediek
M Friedman
M Slansky
M Yoo
MA Khalifa
MF Barad
MH Hermina
MI Youssef
MS Compton
N Kumar
PN Southgate
PO Reynolds
PW Goodwin
R Oyarzun
R Said
RJ Dunham
RK Goldhammer
RV Demicco
RY Aruna
S Lüning
SE Nakkady
SO Schlanger
T Aigner
TR McHargue
VD Robinson
W Dominik
WF Körschner
X Picon Le
Publication venue: Springer Science and Business Media LLC

Crossref