Search CORE

8,595 research outputs found

Validating Multimedia Content Moderation Software via Semantic Fusion

Author: Chen Chang
Gu Jiazhen
He Pinjia
Huang Jingyuan
Lyu Michael
Wang Wenxuan
Wu Weibin
Zhang Jianping
Publication venue
Publication date: 22/05/2023
Field of study

The exponential growth of social media platforms, such as Facebook and TikTok, has revolutionized communication and content publication in human society. Users on these platforms can publish multimedia content that delivers information via the combination of text, audio, images, and video. Meanwhile, the multimedia content release facility has been increasingly exploited to propagate toxic content, such as hate speech, malicious advertisements, and pornography. To this end, content moderation software has been widely deployed on these platforms to detect and blocks toxic content. However, due to the complexity of content moderation models and the difficulty of understanding information across multiple modalities, existing content moderation software can fail to detect toxic content, which often leads to extremely negative impacts. We introduce Semantic Fusion, a general, effective methodology for validating multimedia content moderation software. Our key idea is to fuse two or more existing single-modal inputs (e.g., a textual sentence and an image) into a new input that combines the semantics of its ancestors in a novel manner and has toxic nature by construction. This fused input is then used for validating multimedia content moderation software. We realized Semantic Fusion as DUO, a practical content moderation software testing tool. In our evaluation, we employ DUO to test five commercial content moderation software and two state-of-the-art models against three kinds of toxic content. The results show that DUO achieves up to 100% error finding rate (EFR) when testing moderation software. In addition, we leverage the test cases generated by DUO to retrain the two models we explored, which largely improves model robustness while maintaining the accuracy on the original test set.Comment: Accepted by ISSTA 202

arXiv.org e-Print Archive

Getting Past the Language Gap: Innovations in Machine Translation

Author: Hush NS
McKemmish LK
McKenzie RH
Reimers JR
Publication venue: Attuale: SPRINGER, NEW YOIRK
Publication date: 01/01/2013
Field of study

In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

Archivio Ricerca Ca'Foscari

Crossref

OPUS - University of Technology Sydney

UCL Discovery

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

University of Queensland eSpace

Image Quality Assessment for Population Cardiac MRI: From Detection to Synthesis

Author: Zhang Le
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 27/08/2019
Field of study

Cardiac magnetic resonance (CMR) images play a growing role in diagnostic imaging of cardiovascular diseases. Left Ventricular (LV) cardiac anatomy and function are widely used for diagnosis and monitoring disease progression in cardiology and to assess the patient's response to cardiac surgery and interventional procedures. For population imaging studies, CMR is arguably the most comprehensive imaging modality for non-invasive and non-ionising imaging of the heart and great vessels and, hence, most suited for population imaging cohorts. Due to insufficient radiographer's experience in planning a scan, natural cardiac muscle contraction, breathing motion, and imperfect triggering, CMR can display incomplete LV coverage, which hampers quantitative LV characterization and diagnostic accuracy. To tackle this limitation and enhance the accuracy and robustness of the automated cardiac volume and functional assessment, this thesis focuses on the development and application of state-of-the-art deep learning (DL) techniques in cardiac imaging. Specifically, we propose new image feature representation types that are learnt with DL models and aimed at highlighting the CMR image quality cross-dataset. These representations are also intended to estimate the CMR image quality for better interpretation and analysis. Moreover, we investigate how quantitative analysis can benefit when these learnt image representations are used in image synthesis. Specifically, a 3D fisher discriminative representation is introduced to identify CMR image quality in the UK Biobank cardiac data. Additionally, a novel adversarial learning (AL) framework is introduced for the cross-dataset CMR image quality assessment and we show that the common representations learnt by AL can be useful and informative for cross-dataset CMR image analysis. Moreover, we utilize the dataset invariance (DI) representations for CMR volumes interpolation by introducing a novel generative adversarial nets (GANs) based image synthesis framework, which enhance the CMR image quality cross-dataset

White Rose E-theses Online

Data-driven machine translation for sign languages

Author: Morrissey Sara
Publication venue: Dublin City University. School of Computing
Publication date: 01/11/2008
Field of study

This thesis explores the application of data-driven machine translation (MT) to sign languages (SLs). The provision of an SL MT system can facilitate communication between Deaf and hearing people by translating information into the native and preferred language of the individual. We begin with an introduction to SLs, focussing on Irish Sign Language - the native language of the Deaf in Ireland. We describe their linguistics and mechanics including similarities and differences with spoken languages. Given the lack of a formalised written form of these languages, an outline of annotation formats is discussed as well as the issue of data collection. We summarise previous approaches to SL MT, highlighting the pros and cons of each approach. Initial experiments in the novel area of example-based MT for SLs are discussed and an overview of the problems that arise when automatically translating these manual-visual languages is given. Following this we detail our data-driven approach, examining the MT system used and modifications made for the treatment of SLs and their annotation. Through sets of automatically evaluated experiments in both language directions, we consider the merits of data-driven MT for SLs and outline the mainstream evaluation metrics used. To complete the translation into SLs, we discuss the addition and manual evaluation of a signing avatar for real SL output

DCU Online Research Access Service

A Survey on Interpretable Cross-modal Reasoning

Author: Qian Shengsheng
Xu Changsheng
Xue Dizhan
Zhou Zuyi
Publication venue
Publication date: 05/09/2023
Field of study

In recent years, cross-modal reasoning (CMR), the process of understanding and reasoning across different modalities, has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics. As the deployment of AI systems becomes more ubiquitous, the demand for transparency and comprehensibility in these systems' decision-making processes has intensified. This survey delves into the realm of interpretable cross-modal reasoning (I-CMR), where the objective is not only to achieve high predictive performance but also to provide human-understandable explanations for the results. This survey presents a comprehensive overview of the typical methods with a three-level taxonomy for I-CMR. Furthermore, this survey reviews the existing CMR datasets with annotations for explanations. Finally, this survey summarizes the challenges for I-CMR and discusses potential future directions. In conclusion, this survey aims to catalyze the progress of this emerging research area by providing researchers with a panoramic and comprehensive perspective, illuminating the state of the art and discerning the opportunities

arXiv.org e-Print Archive

Learning Explicit and Implicit Arabic Discourse Relations.

Author: Benamara Farah
Hadrich Belguith Lamia
Keskes Iskandar
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014
Field of study

We propose in this paper a supervised learning approach to identify discourse relations in Arabic texts. To our knowledge, this work represents the first attempt to focus on both explicit and implicit relations that link adjacent as well as non adjacent Elementary Discourse Units (EDUs) within the Segmented Discourse Representation Theory (SDRT). We use the Discourse Arabic Treebank corpus (D-ATB) which is composed of newspaper documents extracted from the syntactically annotated Arabic Treebank v3.2 part3 where each document is associated with complete discourse graph according to the cognitive principles of SDRT. Our list of discourse relations is composed of a three-level hierarchy of 24 relations grouped into 4 top-level classes. To automatically learn them, we use state of the art features whose efficiency has been empirically proved. We investigate how each feature contributes to the learning process. We report our experiments on identifying fine-grained discourse relations, mid-level classes and also top-level classes. We compare our approach with three baselines that are based on the most frequent relation, discourse connectives and the features used by Al-Saif and Markert (2011). Our results are very encouraging and outperform all the baselines with an F-score of 78.1% and an accuracy of 80.6%

Elsevier - Publisher Connector

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Directory of Open Access Journals

Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

Author: Apostol Elena-Simona
Babii Andrii
Berend Gábor
Calixto Iacer
Erdem Aykut
Erdem Erkut
Frank Anette
Gatt Albert
Korvel Grăzina
Kuyu Menekse
Lloret Elena
Martinčić-Ipšić Sanda
Parcalabescu Letitia
Truică Ciprian-Octavian
Turuta Oleksii
Yagcioglu Semih
Šandrih Branislava
Publication venue: 'AI Access Foundation'
Publication date: 06/04/2022
Field of study

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed an impressive progress on both of these problems, giving rise to a new family of approaches. Especially, the advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-networks based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG in its full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.This work has been partially supported by the European Commission ICT COST Action “Multi-task, Multilingual, Multi-modal Language Generation” (CA18231). AE was supported by BAGEP 2021 Award of the Science Academy. EE was supported in part by TUBA GEBIP 2018 Award. BP is in in part funded by Independent Research Fund Denmark (DFF) grant 9063-00077B. IC has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 838188. EL is partly funded by Generalitat Valenciana and the Spanish Government throught projects PROMETEU/2018/089 and RTI2018-094649-B-I00, respectively. SMI is partly funded by UNIRI project uniri-drustv-18-20. GB is partly supported by the Ministry of Innovation and the National Research, Development and Innovation Office within the framework of the Hungarian Artificial Intelligence National Laboratory Programme. COT is partially funded by the Romanian Ministry of European Investments and Projects through the Competitiveness Operational Program (POC) project “HOLOTRAIN” (grant no. 29/221 ap2/07.04.2020, SMIS code: 129077) and by the German Academic Exchange Service (DAAD) through the project “AWAKEN: content-Aware and netWork-Aware faKE News mitigation” (grant no. 91809005). ESA is partially funded by the German Academic Exchange Service (DAAD) through the project “Deep-Learning Anomaly Detection for Human and Automated Users Behavior” (grant no. 91809358)

Repositorio Institucional de la Universidad de Alicante

The reliability of cephalometric tracing using AI

Author: Suissa Emmanuel
Publication venue
Publication date: 01/02/2023
Field of study

Introduction : L'objectif de cette étude est de comparer la différence entre l'analyse céphalométrique manuelle et l'analyse automatisée par l’intelligence artificielle afin de confirmer la fiabilité de cette dernière. Notre hypothèse de recherche est que la technique manuelle est la plus fiable des deux méthodes. Méthode : Un total de 99 radiographies céphalométriques latérales sont recueillies. Des tracés par technique manuelle (MT) et par localisation automatisée par intelligence artificielle (AI) sont réalisés pour toutes les radiographies. La localisation de 29 points céphalométriques couramment utilisés est comparée entre les deux groupes. L'erreur radiale moyenne (MRE) et un taux de détection réussie (SDR) de 2 mm sont utilisés pour comparer les deux groupes. Le logiciel AudaxCeph version 6.2.57.4225 est utilisé pour l'analyse manuelle et l'analyse AI. Résultats : Le MRE et SDR pour le test de fiabilité inter-examinateur sont respectivement de 0,87 ± 0,61mm et 95%. Pour la comparaison entre la technique manuelle MT et le repérage par intelligence artificielle AI, le MRE et SDR pour tous les repères sont respectivement de 1,48 ± 1,42 mm et 78 %. Lorsque les repères dentaires sont exclus, le MRE diminue à 1,33 ± 1,39 mm et le SDR augmente à 84 %. Lorsque seuls les repères des tissus durs sont inclus (excluant les points des tissus mous et dentaires), le MRE diminue encore à 1,25 ± 1,09 mm et le SDR augmente à 85 %. Lorsque seuls les points de repère des tissus mous sont inclus, le MRE augmente à 1,68 ± 1,89 mm et le SDR diminue à 78 %. Conclusion: La performance du logiciel est similaire à celles précédemment rapportée dans la littérature pour des logiciels utilisant un cadre de modélisation similaire. Nos résultats révèlent que le repérage manuel a donné lieu à une plus grande précision. Le logiciel a obtenu de très bons résultats pour les points de tissus durs, mais sa précision a diminué pour les tissus mous et dentaires. Nous concluons que cette technologie est très prometteuse pour une application en milieu clinique sous la supervision du docteur.Introduction: The objective of this study is to compare the difference between manual cephalometric analysis and automatic analysis by artificial intelligence to confirm the reliability of the latter. Our research hypothesis is that the manual technique is the most reliable of the methods and is still considered the gold standard. Method: A total of 99 lateral cephalometric radiographs were collected in this study. Manual technique (MT) and automatic localization by artificial intelligence (AI) tracings were performed for all radiographs. The localization of 29 commonly used landmarks were compared between both groups. Mean radial error (MRE) and a successful detection rate (SDR) of 2mm were used to compare both groups. AudaxCeph software version 6.2.57.4225 (Audax d.o.o., Ljubljana, Slovenia) was used for both manual and AI analysis. Results: The MRE and SDR for the inter-examinator reliability test were 0.87 ± 0.61mm and 95% respectively. For the comparison between the manual technique MT and landmarking with artificial intelligence AI, the MRE and SDR for all landmarks were 1.48 ± 1.42mm and 78% respectively. When dental landmarks are excluded, the MRE decreases to 1.33 ± 1.39mm and the SDR increases to 84%. When only hard tissue landmarks are included (excluding soft tissue and dental points) the MRE decreases further to 1.25 ± 1.09mm and the SDR increases to 85%. When only soft tissue landmarks are included the MRE increases to 1.68 ± 1.89mm and the SDR decreases to 78%. Conclusion: The software performed similarly to what was previously reported in literature for software that use analogous modeling framework. Comparing the software’s landmarking to manual landmarking our results reveal that the manual landmarking resulted in higher accuracy. The software operated very well for hard tissue points, but its accuracy went down for soft and dental tissue. Our conclusion is this technology shows great promise for application in clinical settings under the doctor’s supervision

Dépôt Institutionnel Numérique