1,093 research outputs found

    Team Triple-Check at Factify 2: Parameter-Efficient Large Foundation Models with Feature Representations for Multi-Modal Fact Verification

    Full text link
    Multi-modal fact verification has become an important but challenging issue on social media due to the mismatch between the text and images in the misinformation of news content, which has been addressed by considering cross-modalities to identify the veracity of the news in recent years. In this paper, we propose the Pre-CoFactv2 framework with new parameter-efficient foundation models for modeling fine-grained text and input embeddings with lightening parameters, multi-modal multi-type fusion for not only capturing relations for the same and different modalities but also for different types (i.e., claim and document), and feature representations for explicitly providing metadata for each sample. In addition, we introduce a unified ensemble method to boost model performance by adjusting the importance of each trained model with not only the weights but also the powers. Extensive experiments show that Pre-CoFactv2 outperforms Pre-CoFact by a large margin and achieved new state-of-the-art results at the Factify challenge at AAAI 2023. We further illustrate model variations to verify the relative contributions of different components. Our team won the first prize (F1-score: 81.82%) and we made our code publicly available at https://github.com/wwweiwei/Pre-CoFactv2-AAAI-2023.Comment: AAAI-23 DeFactify 2 Workshop (1st Prize

    Benchmarking Diverse-Modal Entity Linking with Generative Models

    Full text link
    Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables. While existing entity linking (EL) models work well on per modality configuration, such as text-only EL, visual grounding, or schema linking, it is more challenging to design a unified model for diverse modality configurations. To bring various modality configurations together, we constructed a benchmark for diverse-modal EL (DMEL) from existing EL datasets, covering all three modalities including text, image, and table. To approach the DMEL task, we proposed a generative diverse-modal model (GDMM) following a multimodal-encoder-decoder paradigm. Pre-training \Model with rich corpora builds a solid foundation for DMEL without storing the entire KB for inference. Fine-tuning GDMM builds a stronger DMEL baseline, outperforming state-of-the-art task-specific EL models by 8.51 F1 score on average. Additionally, extensive error analyses are conducted to highlight the challenges of DMEL, facilitating future research on this task.Comment: 15 pages. ACL 202

    Guidelines for digital storytelling for Arab children

    Get PDF
    Children are getting more exposed to various technologies in teaching-learning. Various types of teaching-learning have been designed, including interactive digital storytelling. In Malaysia, local children have been clear about story-based learning materials. However, the situation is a little bit different with Arab children. Because the number of Arab children migrating into Malaysia is increasing, for following their parents who are studying at higher levels, they have to also make themselves familiar with the local scenario. In accordance, this study is initiates, to identify their acceptance towards story-based learning materials, or specifically interactive digital storytelling. Hence, this study reacts proactively, by approaching Arab children asking for their feedback on whether they have any desire for interactive digital storytelling. Through a series of interviews, this study found that they have a strong desire and tendency. Then, the following objectives have been stated: (1) to determine the components for the interactive digital storytelling for Arab children, (2) to design and develop a prototype of the interactive digital storytelling, and (3) to observe on how the Arab children experience the interactive digital storytelling. User-centered design (UCD) approach has been gone through in ensuring that the objectives are achieved. The process of determining the components for the interactive digital storytelling was carried out by directly involving Arab children and their teachers from three preschools in Changlun and Sintok. It was similar with the efforts in determining the contents, and interface design until the prototype development. Having the prototype ready, user testing was carried out to explore the way Arab children experience the prototype. All the processes involved various techniques through observation, interviews, and noting. Specifically, the user testing involved qualitative and empirical data. Qualitative data were gathered through observation, meanwhile the empirical data were gathered using Computer System Usability Questionnaire (CSUQ) tool. In the end, having data processed, the findings show that Arab children are highly satisfied with the prototype. Scientifically, the developed prototype is a mirror of the obtained guidelines, obtained through the UCD seminars. Hence, the positive acceptance on the prototype reflects positive acceptance on the guidelines, as the main contribution of this study. Besides the guidelines as the main contribution of this study, the developed prototype is also a wonderful contribution to the Arab children and their teacher. They will be using it as part of their teaching and learning material

    Proceedings of the international conference on cooperative multimodal communication CMC/95, Eindhoven, May 24-26, 1995:proceedings

    Get PDF

    Image database system for glaucoma diagnosis support

    Get PDF
    Tato práce popisuje přehled standardních a pokročilých metod používaných k diagnose glaukomu v ranném stádiu. Na základě teoretických poznatků je implementován internetově orientovaný informační systém pro oční lékaře, který má tři hlavní cíle. Prvním cílem je možnost sdílení osobních dat konkrétního pacienta bez nutnosti posílat tato data internetem. Druhým cílem je vytvořit účet pacienta založený na kompletním očním vyšetření. Posledním cílem je aplikovat algoritmus pro registraci intenzitního a barevného fundus obrazu a na jeho základě vytvořit internetově orientovanou tři-dimenzionální vizualizaci optického disku. Tato práce je součásti DAAD spolupráce mezi Ústavem Biomedicínského Inženýrství, Vysokého Učení Technického v Brně, Oční klinikou v Erlangenu a Ústavem Informačních Technologií, Friedrich-Alexander University, Erlangen-Nurnberg.This master thesis describes a conception of standard and advanced eye examination methods used for glaucoma diagnosis in its early stage. According to the theoretical knowledge, a web based information system for ophthalmologists with three main aims is implemented. The first aim is the possibility to share medical data of a concrete patient without sending his personal data through the Internet. The second aim is to create a patient account based on a complete eye examination procedure. The last aim is to improve the HRT diagnostic method with an image registration algorithm for the fundus and intensity images and create an optic nerve head web based 3D visualization. This master thesis is a part of project based on DAAD co-operation between Department of Biomedical Engineering, Brno University of Technology, Eye Clinic in Erlangen and Department of Computer Science, Friedrich-Alexander University, Erlangen-Nurnberg.

    Applications of Large Scale Foundation Models for Autonomous Driving

    Full text link
    Since DARPA Grand Challenges (rural) in 2004/05 and Urban Challenges in 2007, autonomous driving has been the most active field of AI applications. Recently powered by large language models (LLMs), chat systems, such as chatGPT and PaLM, emerge and rapidly become a promising direction to achieve artificial general intelligence (AGI) in natural language processing (NLP). There comes a natural thinking that we could employ these abilities to reformulate autonomous driving. By combining LLM with foundation models, it is possible to utilize the human knowledge, commonsense and reasoning to rebuild autonomous driving systems from the current long-tailed AI dilemma. In this paper, we investigate the techniques of foundation models and LLMs applied for autonomous driving, categorized as simulation, world model, data annotation and planning or E2E solutions etc.Comment: 23 pages. A survey pape

    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Get PDF
    [[abstract]]Although several researchers have used statistical methods to prove that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could generate different recovery rates for the patients. Therefore, this study adopts the statistical method and decision tree techniques together to analyze the postoperative status of ovarian endometriosis patients under different conditions. Since our collected data set is small, containing only 212 records, we use all of these data as the training data. Therefore, instead of using a resultant tree to generate rules directly, we use the value of each node as a cut point to generate all possible rules from the tree first. Then, using t-test, we verify the rules to discover some useful description rules after all possible rules from the tree have been generated. Experimental results show that our approach can find some new interesting knowledge about recurrent ovarian endometriomas under different conditions.[[journaltype]]國外[[incitationindex]]EI[[booktype]]紙本[[countrycodes]]FI
    corecore