Grid Recognition: Classical and Parameterized Computational Perspectives
Grid graphs, and, more generally, $k \times r$ grid graphs, form one of the
most basic classes of geometric graphs. Over the past few decades, a large body
of work has studied the (in)tractability of various computational problems on
grid graphs, which often admit substantially faster algorithms than general
graphs.
Unfortunately, the recognition of a grid graph is particularly hard -- it was
shown to be NP-hard even on trees of pathwidth 3 already in 1987. Yet, in this
paper, we provide several positive results in this regard in the framework of
parameterized complexity (additionally, we present new and complementary
hardness results). Specifically, our contribution is threefold. First, we show
that the problem is fixed-parameter tractable (FPT) parameterized by
$k + \mathrm{mcc}$, where $\mathrm{mcc}$ is the maximum size of a connected
component of the input graph $G$. This also implies that the problem is FPT
parameterized by $\mathrm{td} + k$, where $\mathrm{td}$ is the treedepth of $G$
(to be compared with the hardness for pathwidth 2 where $k = 3$). Further, we
derive as a corollary that strip
packing is FPT with respect to the height of the strip plus the maximum of the
dimensions of the packed rectangles, which was previously only known to be in
XP. Second, we present a new parameterization, relating graph distance to
geometric distance, which may be of independent interest. We show that the
problem is para-NP-hard under this parameterization alone, but FPT under it on
trees, as well as FPT when it is combined with $k$. Third, we show that
the recognition of $k \times r$ grid graphs is NP-hard on graphs of pathwidth 2
where $k = 3$. Moreover, when $k$ and $r$ are unrestricted, we show that the
problem is NP-hard on trees of pathwidth 2, but trivially solvable in
polynomial time on graphs of pathwidth 1.
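As context for the recognition problem: a graph is a grid graph exactly when its vertices can be placed at distinct integer points so that edges connect precisely the pairs at distance 1. Verifying a candidate placement is easy; it is *finding* one that is NP-hard. A minimal illustrative sketch (my own, not from the paper):

```python
from itertools import combinations

def is_grid_embedding(edges, pos):
    """Check whether placing each vertex v at integer point pos[v] realizes
    the graph as a grid graph: points are distinct, and edges coincide
    exactly with pairs at distance 1 (for integer points, Euclidean
    distance 1 equals Manhattan distance 1)."""
    if len(set(pos.values())) != len(pos):          # points must be distinct
        return False
    edge_set = {frozenset(e) for e in edges}
    for u, v in combinations(pos, 2):
        (ux, uy), (vx, vy) = pos[u], pos[v]
        unit = abs(ux - vx) + abs(uy - vy) == 1     # unit grid distance
        if unit != (frozenset((u, v)) in edge_set):
            return False
    return True

# A 4-cycle embeds as a 2x2 grid; a triangle admits no grid embedding.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_grid_embedding(square, {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}))  # True
```

The check is quadratic in the number of vertices; the hardness results above concern the search over all possible placements.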
Distance Metric Learning Loss Functions in Few-Shot Scenarios of Supervised Language Models Fine-Tuning
This paper presents an analysis of the influence of Distance Metric Learning
(DML) loss functions on the supervised fine-tuning of language models for
classification tasks. We experimented with known datasets from SentEval
Transfer Tasks.
Our experiments show that applying the DML loss function can increase
performance on downstream classification tasks of RoBERTa-large models in
few-shot scenarios. Models fine-tuned with the SoftTriple loss can achieve
better results than models trained with a standard categorical cross-entropy
loss function: better by about 2.89 percentage points on average, and by 0.04
to 13.48 percentage points depending on the training dataset. Additionally, we
performed a comprehensive analysis with explainability techniques to assess the
models' reliability and explain their results.
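The SoftTriple loss referenced above (Qian et al., 2019) gives each class several centres, scores a class by an attention-weighted similarity over its centres, and subtracts a margin from the true class. A minimal pure-Python sketch on toy 2-D embeddings — the vectors and hyperparameters here are illustrative, not the paper's settings:

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def soft_triple_loss(x, centers, y, lam=20.0, gamma=0.1, delta=0.01):
    """SoftTriple-style loss for one L2-normalised embedding x.
    centers[c] lists the centre vectors of class c; the class score is a
    soft attention over its centres, and margin delta is subtracted from
    the true class y before the final cross-entropy."""
    scores = []
    for c, ws in enumerate(centers):
        sims = [sum(a * b for a, b in zip(x, w)) for w in ws]
        attn = softmax([s / gamma for s in sims])
        s_c = sum(a * s for a, s in zip(attn, sims))
        scores.append(lam * (s_c - (delta if c == y else 0.0)))
    return -math.log(softmax(scores)[y])

x = (1.0, 0.0)                                   # embedding of one example
centers = [[(1.0, 0.0), (0.8, 0.6)],             # class 0: two unit centres
           [(-1.0, 0.0), (0.0, -1.0)]]           # class 1: two unit centres
print(soft_triple_loss(x, centers, 0) < soft_triple_loss(x, centers, 1))  # True
```

In the paper's setting such a term would be used during RoBERTa fine-tuning; here it only demonstrates the loss geometry: an embedding near a centre of its own class incurs a much smaller loss than one assigned to the wrong class.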
Learning the Ordering of Coordinate Compounds and Elaborate Expressions in Hmong, Lahu, and Chinese
Coordinate compounds (CCs) and elaborate expressions (EEs) are coordinate
constructions common in languages of East and Southeast Asia. Mortensen (2006)
claims that (1) the linear ordering of EEs and CCs in Hmong, Lahu, and Chinese
can be predicted via phonological hierarchies and (2) these phonological
hierarchies lack a clear phonetic rationale. These claims are significant
because morphosyntax has often been seen as in a feed-forward relationship with
phonology, and phonological generalizations have often been assumed to be
phonetically "natural". We investigate whether the ordering of CCs and EEs can
be learned empirically and whether computational models (classifiers and
sequence labeling models) learn unnatural hierarchies similar to those posited
by Mortensen (2006). We find that decision trees and SVMs learn to predict the
order of CCs/EEs on the basis of phonology, with DTs learning hierarchies
strikingly similar to those proposed by Mortensen. However, we also find that a
neural sequence labeling model is able to learn the ordering of elaborate
expressions in Hmong very effectively without using any phonological
information. We argue that EE ordering can be learned through two independent
routes: phonology and lexical distribution, presenting a more nuanced picture
than previous work. [ISO 639-3: hmn, lhu, cmn]
Comment: To be published in NAACL202
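The kind of rule a decision tree can extract from ordering data can be mimicked with a one-feature comparator over a tone hierarchy. The ranking below is purely illustrative (not Mortensen's published hierarchy), and the syllables are invented:

```python
# Hypothetical ranking of Hmong RPA tone letters, for illustration only:
# lower-ranked tones prefer the first slot of the coordinate construction.
RANK = {"b": 0, "j": 1, "v": 2, "s": 3, "g": 4, "m": 5, "d": 6}

def predict_order(syllables):
    """Order two coordinated syllables, given as (base, tone) pairs, by the
    rank of their tones; ties keep the input order. This mimics the
    single-feature splits a decision tree can learn from ordering data."""
    a, b = syllables
    return (a, b) if RANK[a[1]] <= RANK[b[1]] else (b, a)

# The "b"-tone syllable is predicted to take the first slot.
print(predict_order((("po", "m"), ("tx", "b"))))  # (('tx', 'b'), ('po', 'm'))
```

A learned tree generalises this by choosing which features (tone, rime, onset) to split on and in what order, which is why the hierarchies it induces can be compared directly with the posited phonological ones.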
Coherence in Machine Translation
Coherence ensures that individual sentences work together to form a meaningful document. When properly translated, a coherent document in one language should result in a coherent document in another language. In Machine Translation, however, for reasons of modeling and computational complexity, sentences are pieced together from words or phrases based on short context windows and
with no access to extra-sentential context.
In this thesis I propose ways to automatically assess the coherence of machine translation output. The work is structured around three dimensions: entity-based coherence, coherence as evidenced via syntactic patterns, and coherence as
evidenced via discourse relations.
For the first time, I evaluate existing monolingual coherence models on this new task, identifying issues and challenges that are specific to the machine translation setting. In order to address these issues, I adapt a state-of-the-art syntax
model, which also results in improved performance on the monolingual task. The results clearly indicate how much more difficult the new task is than the task of detecting shuffled texts. I propose a new coherence model that explores the crosslingual transfer of discourse relations in machine translation. This model is novel in that it measures the correctness of a discourse relation by comparison to the source text rather than to a reference translation. I identify patterns of incoherence common across different language pairs, and create a corpus of machine-translated output annotated with coherence errors for evaluation purposes. I then examine
lexical coherence in a multilingual context, as a preliminary study for crosslingual transfer. Finally, I determine how the new and adapted models correlate with human judgements of translation quality, and suggest that general evaluation within machine translation would benefit from a coherence component that evaluates the translation output with respect to the source text.
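For the entity-based dimension, the standard starting point is the entity grid of Barzilay and Lapata (2008), which scores a document by the distribution of syntactic-role transitions its entities make between adjacent sentences. A minimal sketch (the role labels and toy document are illustrative):

```python
from collections import Counter

def transition_probs(grid):
    """Entity-grid transition features: grid[i] maps each entity to its
    syntactic role in sentence i ('S' subject, 'O' object, 'X' other;
    absent entities are '-'). Returns the distribution of role transitions
    between adjacent sentences; incoherent texts skew toward '-' pairs."""
    entities = set().union(*(set(s) for s in grid))
    counts = Counter()
    for prev, curr in zip(grid, grid[1:]):
        for e in entities:
            counts[(prev.get(e, "-"), curr.get(e, "-"))] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

# Three sentences tracking two entities.
grid = [{"Obama": "S", "bill": "O"},
        {"Obama": "S"},
        {"bill": "S"}]
probs = transition_probs(grid)
print(probs[("S", "S")])  # 0.25
```

Applying such a monolingual model to MT output is what exposes the task-specific issues mentioned above: translation errors perturb entity mentions and roles rather than shuffling whole sentences.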
Soil moisture deficit estimation using satellite multi-angle brightness temperature
Accurate soil moisture information is critically important for hydrological modelling. Although remote sensing soil moisture measurement has become an important data source, it cannot be used directly in hydrological modelling. A novel study based on nonlinear techniques (a local linear regression (LLR) and two feedforward artificial neural networks (ANNs)) is carried out to estimate soil moisture deficit (SMD), using the Soil Moisture and Ocean Salinity (SMOS) multi-angle brightness temperatures (Tbs) with both horizontal (H) and vertical (V) polarisations. The gamma test is used for the first time to determine the optimum number of Tbs required to construct a reliable smooth model for SMD estimation, and the relationship between model input and output is assessed through error variance estimation. The simulated SMD time series in the study area is from the Xinanjiang hydrological model. The results show that the LLR model is better at capturing the interrelations between SMD and Tbs than the ANNs, with outstanding statistical performance during both the training (NSE = 0.88, r = 0.94, RMSE = 0.008 m) and testing phases (NSE = 0.85, r = 0.93, RMSE = 0.009 m). Nevertheless, both ANN training algorithms (radial BFGS and conjugate gradient) performed well in estimating the SMD data and showed excellent performance compared with estimates derived directly from the SMOS soil moisture products. This study also demonstrates the informative capability of the gamma test in input data selection for model development. These results provide interesting perspectives for data assimilation in flood forecasting.
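The local linear regression step can be illustrated with a one-dimensional toy (the real model maps multi-angle Tb vectors to SMD; the data below are synthetic):

```python
def local_linear_predict(xs, ys, x0, k=5):
    """Local linear regression: fit an ordinary least-squares line through
    the k training points nearest to x0 and evaluate it at x0, so the fit
    adapts to local structure in a nonlinear relationship."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    px = [xs[i] for i in nearest]
    py = [ys[i] for i in nearest]
    mx, my = sum(px) / k, sum(py) / k
    sxx = sum((x - mx) ** 2 for x in px)
    sxy = sum((x - mx) * (y - my) for x, y in zip(px, py))
    slope = sxy / sxx if sxx else 0.0    # guard: all-equal neighbours
    return my + slope * (x0 - mx)

xs = list(range(10))
ys = [x * x for x in xs]                 # a nonlinear target
est = local_linear_predict(xs, ys, 2.5, k=3)
print(round(est, 2))                     # 6.67, close to the true 6.25
```

A global linear fit would miss the curvature entirely; restricting the fit to a neighbourhood is what lets LLR track nonlinear Tb-SMD relationships while remaining a smooth, interpretable model.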
Unitary Representations of Wavelet Groups and Encoding of Iterated Function Systems in Solenoids
For points in $d$ real dimensions, we introduce a geometry for general digit
sets. We introduce a positional number system where the basis for our
representation is a fixed $d$ by $d$ matrix $A$ over $\mathbb{Z}$. Our starting point is a
given pair $(A, \mathcal{D})$, with the matrix $A$ assumed expansive, and
$\mathcal{D}$ a chosen complete digit set, i.e., in bijective correspondence
with the points in $\mathbb{Z}^d / A^T \mathbb{Z}^d$. We give an explicit geometric
representation and encoding with infinite words in letters from $\mathcal{D}$.
We show that the attractor for an affine Iterated Function
System (IFS) based on $(A, \mathcal{D})$ is a set of fractions for our digital
representation of points in $\mathbb{R}^d$. Moreover, our positional "number
representation" is spelled out in the form of an explicit IFS-encoding of a
compact solenoid $\mathcal{S}_A$ associated with the pair $(A, \mathcal{D})$. The intricate
part (Theorem \ref{thenccycl}) is played by the cycles in $\mathbb{Z}^d$ for the
initial $(A, \mathcal{D})$-IFS. Using these cycles we are able to write down
formulas for the two maps which do the encoding as well as the decoding in our
positional $\mathcal{D}$-representation.
We show how some wavelet representations can be realized on the solenoid, and
on symbolic spaces.
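The positional system can be made concrete in a toy sketch (my own illustration, not the paper's construction; for simplicity it takes digits congruent modulo the base $A$ itself): the encoding repeatedly strips off the unique digit matching the point and divides by $A$, and nonzero cycles are exactly where the greedy encoding fails to terminate.

```python
def encode_digits(x, A, digits, max_len=64):
    """Radix encoding with a 2x2 integer base matrix A: pick the unique
    digit d in the complete digit set with A^{-1}(x - d) integral, emit it,
    and recurse on the quotient, so x = d0 + A d1 + A^2 d2 + ... .
    Returns None when the orbit lands on a nonzero cycle, in which case no
    finite expansion exists."""
    a, b = A[0]
    c, d_ = A[1]
    det = a * d_ - b * c
    out = []
    for _ in range(max_len):
        if x == (0, 0):
            return out
        for dig in digits:
            u, v = x[0] - dig[0], x[1] - dig[1]
            p, q = d_ * u - b * v, -c * u + a * v   # adjugate(A) @ (x - dig)
            if p % det == 0 and q % det == 0:       # integral quotient?
                out.append(dig)
                x = (p // det, q // det)
                break
    return None

# "Twindragon" system: expansive base with |det A| = 2 and two digits.
A = [[1, -1], [1, 1]]
D = [(0, 0), (1, 0)]
print(encode_digits((1, 1), A, D))   # [(0, 0), (1, 0)], since (1,1) = A @ (1,0)
print(encode_digits((0, 1), A, D))   # None: (0, 1) sits on a nonzero cycle
```

The point $(0, 1)$ maps back to itself under one radix step, illustrating why the cycles in $\mathbb{Z}^d$ govern which points have finite expansions and must be handled separately in the solenoid encoding.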