Proceedings of the 2022 XCSP3 Competition
This document represents the proceedings of the 2022 XCSP3 Competition. The results of this competition of constraint solvers were presented at the FLOC (Federated Logic Conference) 2022 Olympic Games, held in Haifa, Israel, from 31 July to 7 August 2022.
Recommender system in a non-stationary context: recommending job ads in pandemic times
This paper focuses on the recommendation of job ads to job seekers, exploiting proprietary data from the French Public Employment Service (PES) and focusing more specifically on low-skilled or unskilled workers. Besides the usual challenges of data sparsity, the signal-to-noise ratio is low (few job seekers have diplomas), and scalability requirements are paramount. As a first contribution, a two-tiered approach is designed to handle these requirements; its empirical validation shows significant computational gains with no performance loss compared to boosted tree ensembles representative of the state of the art. A second contribution is a methodology aimed at assessing the impact of the non-stationarity of the item and user distributions. Specifically, across three periods (before, during and after the Covid lock-downs), the numbers of job ads and job seekers vary dramatically in some industries. A normalized recall indicator is proposed to filter out the impact of variations in the number of job ads. This normalization suggests that the same score function adapts to the multi-faceted changes of the environment, resulting in different recommendations but with similar accuracy as before, at least for the job seekers finding a job.
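The exact definition of the paper's normalized recall indicator is not given above. Purely as an illustration, the sketch below assumes a simple form in which recall@k is divided by the recall that a uniform random recommender would achieve over the current pool of job ads, so that a shrinking or growing ad pool does not mechanically inflate or deflate the score. All function names are hypothetical.

```python
# Illustrative sketch only: this assumes the normalization divides recall@k
# by the expected recall of random top-k picks from a pool of n_ads job ads.

def recall_at_k(recommended, relevant, k):
    """Fraction of relevant job ads appearing in the top-k recommendations."""
    hits = len(set(recommended[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def normalized_recall_at_k(recommended, relevant, k, n_ads):
    """Recall@k divided by the expected recall of a random recommender."""
    baseline = min(k, n_ads) / n_ads  # random top-k picks hit this fraction
    return recall_at_k(recommended, relevant, k) / baseline

# Identical raw recall, but measured against ad pools of different sizes:
few_ads = normalized_recall_at_k([1, 2, 3], [2], k=3, n_ads=10)
many_ads = normalized_recall_at_k([1, 2, 3], [2], k=3, n_ads=1000)
```

Under this assumed form, the same hit list scores higher against the larger pool, which is the intended effect: success against a large, hard pool counts for more than success against a small one.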
Recognition and Exploitation of Gate Structure in SAT Solving
In theoretical computer science, the SAT problem is the archetypal representative of the class of NP-complete problems, which is why efficient SAT solving is generally considered impossible.
Nevertheless, astonishing results are often achieved in practice, where some applications generate problems with millions of variables that recent SAT solvers can solve within reasonable time.
The practical success of SAT solving is due to current implementations of the Conflict-Driven Clause Learning (CDCL) algorithm, whose performance largely depends on the heuristics employed, which implicitly exploit the structure of the instances generated in industrial practice.
In this work, we present a new generic algorithm for the efficient recognition of gate structure in CNF encodings of SAT instances, as well as three approaches in which we exploit this structure explicitly.
Our contributions also include the implementation of these approaches in our SAT solver Candy and the development of a tool for the distributed management of benchmark instances and their attributes, the Global Benchmark Database (GBD).
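As a concrete illustration of what gate recognition in a CNF encoding can look like (a toy sketch, not the algorithm implemented in Candy), the snippet below detects the Tseitin encoding of an AND gate o = a ∧ b, which produces the clauses (¬o ∨ a), (¬o ∨ b) and (o ∨ ¬a ∨ ¬b). Literals are signed integers, as in the DIMACS format.

```python
# Toy sketch of gate recognition: find Tseitin-encoded AND gates in a CNF.
# A gate o = AND(a, b) is present when the three clauses
# (-o, a), (-o, b), (o, -a, -b) all occur in the clause set.

def find_and_gates(clauses):
    """Return (output, input_a, input_b) triples whose Tseitin AND clauses
    all occur in the given clause list."""
    clause_set = {frozenset(c) for c in clauses}
    gates = []
    for c in clause_set:
        if len(c) != 3:
            continue  # only the ternary clause can define the gate
        for o in c:  # try each literal as the candidate gate output
            a, b = sorted(-l for l in c if l != o)  # candidate inputs
            if frozenset({-o, a}) in clause_set and \
               frozenset({-o, b}) in clause_set:
                gates.append((o, a, b))
    return gates

cnf = [(-5, 1), (-5, 2), (5, -1, -2),  # encodes 5 = AND(1, 2)
       (3, 4)]                          # unrelated clause
```

Real recognizers generalize this idea to arbitrary gate types and arities, and must do so without the quadratic clause lookups a naive version would incur.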
On Maximum Weight Clique Algorithms, and How They Are Evaluated
Maximum weight clique and maximum weight independent set solvers are often benchmarked using maximum clique problem instances, with weights allocated to vertices by taking the vertex number mod 200, plus 1. For constraint programming approaches, this rule has clear implications, favouring weight-based rather than degree-based heuristics. We show that similar implications hold for dedicated algorithms, and that additionally, weight distributions affect whether certain inference rules are cost-effective. We look at other families of benchmark instances for the maximum weight clique problem, coming from winner determination problems, graph colouring, and error-correcting codes, and introduce two new families of instances, based upon kidney exchange and the Research Excellence Framework. In each case the weights carry much more interesting structure, and do not in any way resemble the 200 rule. We make these instances available in the hopes of improving the quality of future experiments.
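For reference, the "200 rule" described above fits in two lines of code (vertex numbering from 1 is assumed, as is conventional for DIMACS clique instances):

```python
# The 200 rule: vertex v receives weight (v mod 200) + 1, so weights cycle
# through 2, 3, ..., 200, 1 as the vertex number increases.

def vertex_weight(v):
    """Weight assigned to vertex v under the 200 rule."""
    return (v % 200) + 1

weights = [vertex_weight(v) for v in (1, 199, 200, 201)]
# vertices 1..200 cycle through weights 2..200 then 1; vertex 201 restarts at 2
```

The cyclic, low-range structure this produces is exactly what the abstract contrasts with the richer weight distributions of the newly introduced instance families.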
On the Nature and Types of Anomalies: A Review
Anomalies are occurrences in a dataset that are in some way unusual and do not fit the general patterns. The concept of the anomaly is generally ill-defined and perceived as vague and domain-dependent. Moreover, despite some 250 years of publications on the topic, no comprehensive and concrete overviews of the different types of anomalies have hitherto been published. By means of an extensive literature review this study therefore offers the first theoretically principled and domain-independent typology of data anomalies, and presents a full overview of anomaly types and subtypes. To concretely define the concept of the anomaly and its different manifestations, the typology employs five dimensions: data type, cardinality of relationship, anomaly level, data structure and data distribution. These fundamental and data-centric dimensions naturally yield 3 broad groups, 9 basic types and 61 subtypes of anomalies. The typology facilitates the evaluation of the functional capabilities of anomaly detection algorithms, contributes to explainable data science, and provides insights into relevant topics such as local versus global anomalies.
Artificial cognitive architecture with self-learning and self-optimization capabilities. Case studies in micromachining processes
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Date of defence: 22-09-201
Hierarchical Text Classification: a review of current research
It is often the case that collections of documents are annotated with hierarchically-structured concepts. However, the benefits of this structure are rarely taken into account by commonly-used classification techniques. Conversely, Hierarchical Text Classification methods are devised to take advantage of the labels’ organization to boost classification performance. With this work, we aim to deliver an updated overview of current research in this domain. We begin by defining the task and framing it within the broader text classification area, examining important shared concepts such as text representation. Then, we dive into details regarding the specific task, providing a high-level description of its traditional approaches. We then summarize recently proposed methods, highlighting their main contributions. We additionally provide statistics for the most adopted datasets and describe the benefits of using evaluation metrics tailored to hierarchical settings. Finally, a selection of recent proposals is benchmarked against non-hierarchical baselines on five domain-specific datasets.
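To make the last point concrete, one widely used family of hierarchy-aware evaluation metrics (not necessarily the ones adopted in this survey) is hierarchical precision and recall, in which predicted and gold label sets are first closed under the ancestor relation of the taxonomy, so that near-misses in the correct subtree receive partial credit. The taxonomy and labels below are invented for illustration.

```python
# Hierarchical precision/recall sketch: extend both label sets with all
# taxonomy ancestors, then compute ordinary set precision and recall.

def with_ancestors(labels, parent):
    """Close a label set under the ancestor relation (parent maps
    child -> parent; roots are absent from the map)."""
    closed = set()
    for label in labels:
        while label is not None:
            closed.add(label)
            label = parent.get(label)
    return closed

def hierarchical_pr(predicted, gold, parent):
    """Return (hierarchical precision, hierarchical recall)."""
    p = with_ancestors(predicted, parent)
    g = with_ancestors(gold, parent)
    overlap = len(p & g)
    return overlap / len(p), overlap / len(g)

# Toy taxonomy: science -> {physics, biology}, physics -> quantum
parent = {"physics": "science", "biology": "science", "quantum": "physics"}
# Predicting the sibling branch still earns credit for the shared ancestor:
hp, hr = hierarchical_pr({"biology"}, {"quantum"}, parent)
```

A flat metric would score this prediction zero; the hierarchical variant rewards it for being in the right top-level branch, which is precisely why such metrics are preferred when the label space is a taxonomy.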