16,858 research outputs found
Discontinuous grammar as a foreign language
[Abstract] In order to achieve deep natural language understanding, syntactic constituent parsing is a vital step, highly demanded by many artificial intelligence systems to process both text and speech. One of the most recent proposals is the use of standard sequence-to-sequence models to perform constituent parsing as a machine translation task, instead of applying task-specific parsers. While they show a competitive performance, these text-to-parse transducers are still lagging behind classic techniques in terms of accuracy,
coverage and speed. To close the gap, we here extend the framework of sequence-to-sequence models for constituent parsing, not only by providing a more powerful neural architecture for improving their performance, but also by enlarging their coverage to handle the most complex syntactic phenomena: discontinuous structures. To that end, we design several novel linearizations that can fully produce discontinuities and, for the first time, we test a sequence-to-sequence model on the main discontinuous benchmarks, obtaining competitive results on par with task-specific discontinuous constituent parsers and achieving state-of-the-art scores on the (discontinuous) English Penn Treebank.Xunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2020/11We acknowledge the European Research Council (ERC), which has funded this research under the European Unionâs Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150) and the Horizon Europe research and innovation programme (SALSA, grant agreement No 101100615), ERDF/ MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), Xunta de Galicia (ED431C 2020/11), and Centro de InvestigaciĂłn de Galicia ââCITICâ, funded by Xunta de Galicia and the European Union (ERDF - Galicia 2014â2020 Program), by grant ED431G 2019/01. Funding for open access charge: Universidade da Coruña/CISUG
Protein-DNA binding sites prediction based on pre-trained protein language model and contrastive learning
Protein-DNA interaction is critical for life activities such as replication,
transcription, and splicing. Identifying protein-DNA binding residues is
essential for modeling their interaction and downstream studies. However,
developing accurate and efficient computational methods for this task remains
challenging. Improvements in this area have the potential to drive novel
applications in biotechnology and drug design. In this study, we propose a
novel approach called CLAPE, which combines a pre-trained protein language
model and the contrastive learning method to predict DNA binding residues. We
trained the CLAPE-DB model on the protein-DNA binding sites dataset and
evaluated the model performance and generalization ability through various
experiments. The results showed that the AUC values of the CLAPE-DB model on
the two benchmark datasets reached 0.871 and 0.881, respectively, indicating
superior performance compared to other existing models. CLAPE-DB showed better
generalization ability and was specific to DNA-binding sites. In addition, we
trained CLAPE on different protein-ligand binding sites datasets, demonstrating
that CLAPE is a general framework for binding sites prediction. To facilitate
the scientific community, the benchmark datasets and codes are freely available
at https://github.com/YAndrewL/clape
A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges
Measuring and evaluating source code similarity is a fundamental software
engineering activity that embraces a broad range of applications, including but
not limited to code recommendation, duplicate code, plagiarism, malware, and
smell detection. This paper proposes a systematic literature review and
meta-analysis on code similarity measurement and evaluation techniques to shed
light on the existing approaches and their characteristics in different
applications. We initially found over 10000 articles by querying four digital
libraries and ended up with 136 primary studies in the field. The studies were
classified according to their methodology, programming languages, datasets,
tools, and applications. A deep investigation reveals 80 software tools,
working with eight different techniques on five application domains. Nearly 49%
of the tools work on Java programs and 37% support C and C++, while there is no
support for many programming languages. A noteworthy point was the existence of
12 datasets related to source code similarity measurement and duplicate codes,
of which only eight datasets were publicly accessible. The lack of reliable
datasets, empirical evaluations, hybrid methods, and focuses on multi-paradigm
languages are the main challenges in the field. Emerging applications of code
similarity measurement concentrate on the development phase in addition to the
maintenance.Comment: 49 pages, 10 figures, 6 table
Focused Decoding Enables 3D Anatomical Detection by Transformers
Detection Transformers represent end-to-end object detection approaches based
on a Transformer encoder-decoder architecture, exploiting the attention
mechanism for global relation modeling. Although Detection Transformers deliver
results on par with or even superior to their highly optimized CNN-based
counterparts operating on 2D natural images, their success is closely coupled
to access to a vast amount of training data. This, however, restricts the
feasibility of employing Detection Transformers in the medical domain, as
access to annotated data is typically limited. To tackle this issue and
facilitate the advent of medical Detection Transformers, we propose a novel
Detection Transformer for 3D anatomical structure detection, dubbed Focused
Decoder. Focused Decoder leverages information from an anatomical region atlas
to simultaneously deploy query anchors and restrict the cross-attention's field
of view to regions of interest, which allows for a precise focus on relevant
anatomical structures. We evaluate our proposed approach on two publicly
available CT datasets and demonstrate that Focused Decoder not only provides
strong detection results and thus alleviates the need for a vast amount of
annotated data but also exhibits exceptional and highly intuitive
explainability of results via attention weights. Our code is available at
https://github.com/bwittmann/transoar.Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA) https://melba-journal.org/2023:00
DeepOnto: A Python Package for Ontology Engineering with Deep Learning
Applying deep learning techniques, particularly language models (LMs), in
ontology engineering has raised widespread attention. However, deep learning
frameworks like PyTorch and Tensorflow are predominantly developed for Python
programming, while widely-used ontology APIs, such as the OWL API and Jena, are
primarily Java-based. To facilitate seamless integration of these frameworks
and APIs, we present Deeponto, a Python package designed for ontology
engineering. The package encompasses a core ontology processing module founded
on the widely-recognised and reliable OWL API, encapsulating its fundamental
features in a more "Pythonic" manner and extending its capabilities to include
other essential components including reasoning, verbalisation, normalisation,
projection, and more. Building on this module, Deeponto offers a suite of
tools, resources, and algorithms that support various ontology engineering
tasks, such as ontology alignment and completion, by harnessing deep learning
methodologies, primarily pre-trained LMs. In this paper, we also demonstrate
the practical utility of Deeponto through two use-cases: the Digital Health
Coaching in Samsung Research UK and the Bio-ML track of the Ontology Alignment
Evaluation Initiative (OAEI).Comment: under review at Semantic Web Journa
Towards Suicide Prevention from Bipolar Disorder with Temporal Symptom-Aware Multitask Learning
Bipolar disorder (BD) is closely associated with an increased risk of
suicide. However, while the prior work has revealed valuable insight into
understanding the behavior of BD patients on social media, little attention has
been paid to developing a model that can predict the future suicidality of a BD
patient. Therefore, this study proposes a multi-task learning model for
predicting the future suicidality of BD patients by jointly learning current
symptoms. We build a novel BD dataset clinically validated by psychiatrists,
including 14 years of posts on bipolar-related subreddits written by 818 BD
patients, along with the annotations of future suicidality and BD symptoms. We
also suggest a temporal symptom-aware attention mechanism to determine which
symptoms are the most influential for predicting future suicidality over time
through a sequence of BD posts. Our experiments demonstrate that the proposed
model outperforms the state-of-the-art models in both BD symptom identification
and future suicidality prediction tasks. In addition, the proposed temporal
symptom-aware attention provides interpretable attention weights, helping
clinicians to apprehend BD patients more comprehensively and to provide timely
intervention by tracking mental state progression.Comment: KDD 2023 accepte
Towards A Practical High-Assurance Systems Programming Language
Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make the matter worse, formally reasoning about their correctness properties introduces yet another level of complexity to the task. It requires considerable expertise in both systems programming and formal verification. The development can be extremely costly due to the sheer complexity of the systems and the nuances in them, if not assisted with appropriate tools that provide abstraction and automation.
Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code.
To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof with a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process
SimpleMTOD: A Simple Language Model for Multimodal Task-Oriented Dialogue with Symbolic Scene Representation
SimpleMTOD is a simple language model which recasts several sub-tasks in
multimodal task-oriented dialogues as sequence prediction tasks. SimpleMTOD is
built on a large-scale transformer-based auto-regressive architecture, which
has already proven to be successful in uni-modal task-oriented dialogues, and
effectively leverages transfer learning from pre-trained GPT-2. In-order to
capture the semantics of visual scenes, we introduce both local and
de-localized tokens for objects within a scene. De-localized tokens represent
the type of an object rather than the specific object itself and so possess a
consistent meaning across the dataset. SimpleMTOD achieves a state-of-the-art
BLEU score (0.327) in the Response Generation sub-task of the SIMMC 2.0
test-std dataset while performing on par in other multimodal sub-tasks:
Disambiguation, Coreference Resolution, and Dialog State Tracking. This is
despite taking a minimalist approach for extracting visual (and non-visual)
information. In addition the model does not rely on task-specific architectural
changes such as classification heads
Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations
The local explanation provides heatmaps on images to explain how
Convolutional Neural Networks (CNNs) derive their output. Due to its visual
straightforwardness, the method has been one of the most popular explainable AI
(XAI) methods for diagnosing CNNs. Through our formative study (S1), however,
we captured ML engineers' ambivalent perspective about the local explanation as
a valuable and indispensable envision in building CNNs versus the process that
exhausts them due to the heuristic nature of detecting vulnerability. Moreover,
steering the CNNs based on the vulnerability learned from the diagnosis seemed
highly challenging. To mitigate the gap, we designed DeepFuse, the first
interactive design that realizes the direct feedback loop between a user and
CNNs in diagnosing and revising CNN's vulnerability using local explanations.
DeepFuse helps CNN engineers to systemically search "unreasonable" local
explanations and annotate the new boundaries for those identified as
unreasonable in a labor-efficient manner. Next, it steers the model based on
the given annotation such that the model doesn't introduce similar mistakes. We
conducted a two-day study (S2) with 12 experienced CNN engineers. Using
DeepFuse, participants made a more accurate and "reasonable" model than the
current state-of-the-art. Also, participants found the way DeepFuse guides
case-based reasoning can practically improve their current practice. We provide
implications for design that explain how future HCI-driven design can move our
practice forward to make XAI-driven insights more actionable.Comment: 32 pages, 6 figures, 5 tables. Accepted for publication in the
Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 202
- âŠ