
    Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text

    Real-world multimedia data is often composed of multiple modalities, such as an image or a video with associated text (e.g. captions, user comments) and metadata. Such multimodal data packages are prone to manipulation: a subset of these modalities can be altered to misrepresent or repurpose a package, possibly with malicious intent. It is therefore important to develop methods to assess or verify the integrity of these multimedia packages. Using computer vision and natural language processing methods to directly compare the image (or video) and the associated caption to verify the integrity of a media package is only possible for a limited set of objects and scenes. In this paper, we present a novel deep learning-based approach for assessing the semantic integrity of multimedia packages containing images and captions, using a reference set of multimedia packages. We construct a joint embedding of images and captions with deep multimodal representation learning on the reference dataset, in a framework that also provides image-caption consistency scores (ICCSs). The integrity of a query media package is assessed as the inlierness of its ICCS with respect to the reference dataset. We present the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media packages from Flickr, which we make available to the research community. We use the newly created dataset as well as the Flickr30K and MS COCO datasets to quantitatively evaluate our proposed approach. The reference dataset does not contain unmanipulated versions of tampered query packages. Our method achieves F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO, respectively, for detecting semantically incoherent media packages.

    Comment: Ayush Jaiswal and Ekraam Sabir contributed equally to the work in this paper.
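    The reference-set idea in the abstract above can be sketched as a toy outlier check: compute a consistency score for every reference image-caption pair, then flag a query package whose score falls in the low tail of the reference distribution. Everything below is an illustrative placeholder, not the paper's learned model: random vectors stand in for the joint embedding, cosine similarity stands in for the ICCS, and a fixed quantile stands in for the inlierness test.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-ins for a learned joint embedding: reference captions are a
    # noisy copy of their images, so consistent pairs score high.
    ref_img = rng.normal(size=(1000, 128))
    ref_cap = ref_img + 0.1 * rng.normal(size=(1000, 128))

    def iccs(img_emb, cap_emb):
        """Image-caption consistency score (here: cosine similarity)."""
        num = np.sum(img_emb * cap_emb, axis=-1)
        den = np.linalg.norm(img_emb, axis=-1) * np.linalg.norm(cap_emb, axis=-1)
        return num / den

    ref_scores = iccs(ref_img, ref_cap)

    def is_inlier(query_score, ref_scores, alpha=0.05):
        """Treat a query package as coherent if its ICCS is not in the
        low tail of the reference score distribution."""
        return bool(query_score >= np.quantile(ref_scores, alpha))

    # A matching caption scores near 1; an unrelated caption scores near 0.
    q_img = rng.normal(size=128)
    coherent = is_inlier(iccs(q_img, q_img + 0.01 * rng.normal(size=128)), ref_scores)
    tampered = is_inlier(iccs(q_img, rng.normal(size=128)), ref_scores)
    ```

    The quantile threshold is a simplification; the paper's inlierness assessment operates on scores from the trained multimodal model rather than a fixed cutoff over cosine similarities.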

    Negotiating the Maze: Case based, Collaborative Distance Learning in Dentistry

    The module was developed as an elective to give motivated senior dental students an opportunity to expand their horizons in planning oral rehabilitation. It comprised one tutor and 12 students, from five universities world-wide, communicating on the World Wide Web (WWW) to develop oral rehabilitation plans for simulated patients. Trigger material came from one of two Case Profiles and consisted of diagnostic casts and details of the clinical and radiographic examination in WWW/CD-ROM form. No background material was supplied as to the "patient's" age, sex, history or main concern(s). Students worked in groups of three, each student from a different location. Individual students were given a role within the group: "Patient", who developed a "personal background" belonging to the trigger examination material; "Academic", who identified state-of-the-art treatment options available for the dental treatment needs identified by the group; and "General Practitioner", who tailored these options to the "patient's" needs and wants. Student feedback focused on their perception of their experience with the program, in response to a questionnaire comprising 11 structured and four "open" questions. All students felt that the program increased their confidence in planning oral rehabilitation. Ten students felt that the "best thing about the program" was the interaction with students from other universities and the exposure to different philosophies from the different schools. Eight students mentioned their increased awareness of the importance of patient input into holistic planning. Under the heading "What was the worst thing", students cited some technical hitches and the snowball effect of two sluggish students who were not identified early enough and thus impacted negatively on the working of their groups. Student feedback showed that the module succeeded in its aims but needed modification to improve the logistics of working with an extended campus.

    Integrating Authentic Digital Resources in Support of Deep, Meaningful Learning

    "Integrating Authentic Digital Resources in Support of Deep, Meaningful Learning," a white paper prepared for the Smithsonian by Interactive Educational Systems Design Inc., describes instructional approaches that apply to successful teaching with the Smithsonian Learning Lab. After defining its use of terms such as "deeper learning" and "authentic resources," the authors review the research basis of three broad approaches that support integrating digital resources into the classroom: project-based learning; guided exploration of concepts and principles; and guided development of academic skills. These approaches find practical application in the last section of the paper, which includes seven case studies. Examples range from first-grade science, to middle-school English (including ELL strategy), to a high-school American government class. In each example, students study and analyze digital resources, going on to apply their knowledge and deepen their understanding of a range of topics and problems.

    Evocative computing – creating meaningful lasting experiences in connecting with the past

    We present an approach – evocative computing – that demonstrates how 'at hand' technologies can be 'picked up' and used by people to create meaningful and lasting experiences, through connecting and interacting with the past. The approach is instantiated here through a suite of interactive technologies configured for an indoor-outdoor setting that enables groups to explore, discover and research the history and background of a public cemetery. We report on a two-part study where different groups visited the cemetery and interacted with the digital tools and resources. During their activities, serendipitous uses of the technology led to connections being made between personal memories and ongoing activities. Furthermore, these experiences were found to be long-lasting; a follow-up study, one year later, showed them to be highly memorable, in some cases leading participants to take up new directions in their work. We discuss the value of evocative computing for enriching user experiences and engagement with heritage practices.

    Scraping social media photos posted in Kenya and elsewhere to detect and analyze food types

    Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56 million images over a period of 20 days in March 2019. We also propose a scrape-by-keywords methodology and used it to scrape ∼30,000 images and their captions of 38 Kenyan food types. We publish two datasets of 104,000 and 8,174 image/caption pairs, respectively. With the first dataset, Kenya104K, we train a Kenyan Food Classifier, called KenyanFC, to distinguish Kenyan food from non-food images posted in Kenya. We used the second dataset, KenyanFood13, to train a classifier KenyanFTR, short for Kenyan Food Type Recognizer, to recognize 13 popular food types in Kenya. KenyanFTR is a multimodal deep neural network that can identify 13 types of Kenyan foods using both images and their corresponding captions. Experiments show that the average top-1 accuracy of KenyanFC is 99% over 10,400 tested Instagram images and of KenyanFTR is 81% over 8,174 tested data points. Ablation studies show that three of the 13 food types are particularly difficult to categorize based on image content only, and that adding analysis of captions to the image analysis yields a classifier that is 9 percentage points more accurate than a classifier that relies only on images. Our food trend analysis revealed that cakes and roasted meats were the most popular foods in photographs on Instagram in Kenya in March 2019.

    Accepted manuscript.
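    The multimodal design described for KenyanFTR can be illustrated with a minimal late-fusion head: concatenate an image feature vector with a caption feature vector and apply a classification layer over the 13 food types. The dimensions, weights, and random features below are illustrative placeholders, not the paper's trained network.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative sizes: a CNN-style image embedding, a word-vector-style
    # caption embedding, and the 13 Kenyan food classes.
    IMG_DIM, TXT_DIM, N_CLASSES = 2048, 300, 13

    def softmax(z):
        """Numerically stable softmax over the last axis."""
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    def fused_logits(img_feat, txt_feat, W, b):
        """Late fusion: concatenate image and caption features, then
        apply a linear classification head."""
        x = np.concatenate([img_feat, txt_feat], axis=-1)
        return x @ W + b

    # Random stand-ins for trained parameters and extracted features.
    W = rng.normal(scale=0.01, size=(IMG_DIM + TXT_DIM, N_CLASSES))
    b = np.zeros(N_CLASSES)
    img_feat = rng.normal(size=IMG_DIM)   # stand-in for an image embedding
    txt_feat = rng.normal(size=TXT_DIM)   # stand-in for a caption embedding

    probs = softmax(fused_logits(img_feat, txt_feat, W, b))
    pred = int(np.argmax(probs))          # predicted food-type index, 0..12
    ```

    The ablation result in the abstract (captions adding 9 percentage points) is consistent with this kind of fusion: the text branch contributes signal the image branch lacks for visually ambiguous dishes.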