840 research outputs found
A review of differentiable digital signal processing for music and speech synthesis
The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
SCALE: Scaling up the Complexity for Advanced Language Model Evaluation
Recent strides in Large Language Models (LLMs) have saturated many NLP benchmarks (even professional domain-specific ones), emphasizing the need for novel, more challenging novel ones to properly assess LLM capabilities. In this paper, we introduce a novel NLP benchmark that poses challenges to current LLMs across four key dimensions: processing long documents (up to 50K tokens), utilizing domain specific knowledge (embodied in legal texts), multilingual understanding (covering five languages), and multitasking (comprising legal document to
document Information Retrieval, Court View Generation, Leading Decision Summarization, Citation Extraction, and eight challenging Text Classification tasks). Our benchmark comprises diverse legal NLP datasets from the Swiss legal system, allowing for a comprehensive study of the underlying Non-English, inherently multilingual, federal legal system. Despite recent advances, efficiently processing long documents for intense review/analysis tasks remains an open challenge for language models. Also, comprehensive, domain-specific benchmarks requiring high expertise to develop are rare, as are multilingual benchmarks. This scarcity underscores our contribution’s value, considering most public models are trained predominantly on English corpora, while other languages remain understudied, particularly for practical domain-specific NLP tasks. Our benchmark allows for testing and advancing the state-of-the-art LLMs. As part of our study, we evaluate several pre-trained multilingual language models on our benchmark to establish strong baselines as a point of reference. Despite the large size of our datasets ∗ Equal contribution. (tens to hundreds of thousands of examples), existing publicly available models struggle with most tasks, even after in-domain pretraining. We publish all resources (benchmark suite, pre-trained models, code) under a fully permissive open CC BY-SA license
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
First Part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
Because more applications of DSmT have emerged in the past years since the apparition of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general published or presented along the years since 2015. These contributions are related with decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring
Artificially intelligent perception is increasingly present in the lives of
every one of us. Vehicles are no exception, (...) In the near future, pattern
recognition will have an even stronger role in vehicles, as self-driving cars
will require automated ways to understand what is happening around (and within)
them and act accordingly. (...) This doctoral work focused on advancing
in-vehicle sensing through the research of novel computer vision and pattern
recognition methodologies for both biometrics and wellbeing monitoring. The
main focus has been on electrocardiogram (ECG) biometrics, a trait well-known
for its potential for seamless driver monitoring. Major efforts were devoted to
achieving improved performance in identification and identity verification in
off-the-person scenarios, well-known for increased noise and variability. Here,
end-to-end deep learning ECG biometric solutions were proposed and important
topics were addressed such as cross-database and long-term performance,
waveform relevance through explainability, and interlead conversion. Face
biometrics, a natural complement to the ECG in seamless unconstrained
scenarios, was also studied in this work. The open challenges of masked face
recognition and interpretability in biometrics were tackled in an effort to
evolve towards algorithms that are more transparent, trustworthy, and robust to
significant occlusions. Within the topic of wellbeing monitoring, improved
solutions to multimodal emotion recognition in groups of people and
activity/violence recognition in in-vehicle scenarios were proposed. At last,
we also proposed a novel way to learn template security within end-to-end
models, dismissing additional separate encryption processes, and a
self-supervised learning approach tailored to sequential data, in order to
ensure data security and optimal performance. (...)Comment: Doctoral thesis presented and approved on the 21st of December 2022
to the University of Port
XAIR: A Framework of Explainable AI in Augmented Reality
Explainable AI (XAI) has established itself as an important component of
AI-driven interactive systems. With Augmented Reality (AR) becoming more
integrated in daily lives, the role of XAI also becomes essential in AR because
end-users will frequently interact with intelligent services. However, it is
unclear how to design effective XAI experiences for AR. We propose XAIR, a
design framework that addresses "when", "what", and "how" to provide
explanations of AI output in AR. The framework was based on a
multi-disciplinary literature review of XAI and HCI research, a large-scale
survey probing 500+ end-users' preferences for AR-based explanations, and three
workshops with 12 experts collecting their insights about XAI design in AR.
XAIR's utility and effectiveness was verified via a study with 10 designers and
another study with 12 end-users. XAIR can provide guidelines for designers,
inspiring them to identify new design opportunities and achieve effective XAI
designs in AR.Comment: Proceedings of the 2023 CHI Conference on Human Factors in Computing
System
Recommended from our members
Computational Models of Argument Structure and Argument Quality for Understanding Misinformation
With the continuing spread of misinformation and disinformation online, it is of increasing importance to develop combating mechanisms at scale in the form of automated systems that can find checkworthy information, detect fallacious argumentation of online content, retrieve relevant evidence from authoritative sources and analyze the veracity of claims given the retrieved evidence. The robustness and applicability of these systems depend on the availability of annotated resources to train machine learning models in a supervised fashion, as well as machine learning models that capture patterns beyond domain-specific lexical clues or genre-specific stylistic insights. In this thesis, we investigate the role of models for argument structure and argument quality in improving tasks relevant to fact-checking and furthering our understanding of misinformation and disinformation. We contribute to argumentation mining, misinformation detection, and fact-checking by releasing multiple annotated datasets, developing unified models across datasets and task formulations, and analyzing the vulnerabilities of such models in adversarial settings.
We start by studying the argument structure's role in two downstream tasks related to fact-checking. As it is essential to differentiate factual knowledge from opinionated text, we develop a model for detecting the type of news articles (factual or opinionated) using highly transferable argumentation-based features. We also show the potential of argumentation features to predict the checkworthiness of information in news articles and provide the first multi-layer annotated corpus for argumentation and fact-checking.
We then study qualitative aspects of arguments through models for fallacy recognition. To understand the reasoning behind checkworthiness and the relation of argumentative fallacies to fake content, we develop an annotation scheme of fallacies in fact-checked content and investigate avenues for automating the detection of such fallacies considering single- and multi-dataset training. Using instruction-based prompting, we introduce a unified model for recognizing twenty-eight fallacies across five fallacy datasets. We also use this model to explain the checkworthiness of statements in two domains.
Next, we show our models for end-to-end fact-checking of statements that include finding the relevant evidence document and sentence from a collection of documents and then predicting the veracity of the given statements using the retrieved evidence. We also analyze the robustness of end-to-end fact extraction and verification by generating adversarial statements and addressing areas for improvements for models under adversarial attacks. Finally, we show that evidence-based verification is essential for fine-grained claim verification by modeling the human-provided justifications with the gold veracity labels
Synthesizing Conjunctive Queries for Code Search
This paper presents Squid, a new conjunctive query synthesis algorithm for searching code with target patterns. Given positive and negative examples along with a natural language description, Squid analyzes the relations derived from the examples by a Datalog-based program analyzer and synthesizes a conjunctive query expressing the search intent. The synthesized query can be further used to search for desired grammatical constructs in the editor. To achieve high efficiency, we prune the huge search space by removing unnecessary relations and enumerating query candidates via refinement. We also introduce two quantitative metrics for query prioritization to select the queries from multiple candidates, yielding desired queries for code search. We have evaluated Squid on over thirty code search tasks. It is shown that Squid successfully synthesizes the conjunctive queries for all the tasks, taking only 2.56 seconds on average
Using Implicit Feedback to Improve Question Generation
Question Generation (QG) is a task of Natural Language Processing (NLP) that
aims at automatically generating questions from text. Many applications can
benefit from automatically generated questions, but often it is necessary to
curate those questions, either by selecting or editing them. This task is
informative on its own, but it is typically done post-generation, and, thus,
the effort is wasted. In addition, most existing systems cannot incorporate
this feedback back into them easily. In this work, we present a system, GEN,
that learns from such (implicit) feedback. Following a pattern-based approach,
it takes as input a small set of sentence/question pairs and creates patterns
which are then applied to new unseen sentences. Each generated question, after
being corrected by the user, is used as a new seed in the next iteration, so
more patterns are created each time. We also take advantage of the corrections
made by the user to score the patterns and therefore rank the generated
questions. Results show that GEN is able to improve by learning from both
levels of implicit feedback when compared to the version with no learning,
considering the top 5, 10, and 20 questions. Improvements go up from 10%,
depending on the metric and strategy used.Comment: 27 pages, 8 figure
- …