48 research outputs found

    Short-Video Marketing in E-commerce: Analyzing and Predicting Consumer Response

    This study analyzes and predicts consumer viewing response to e-commerce short-videos (ESVs). We first construct a large-scale ESV dataset that contains 23,001 ESVs across 40 product categories. The dataset pairs a consumer response label, measured as average viewing duration, with human-annotated ESV content attributes. Using the constructed dataset and a mixed-effects model, we find that product description, product demonstration, pleasure, and aesthetics are four key determinants of ESV viewing duration. Furthermore, we design a content-based multimodal-multitask framework to predict consumer viewing response to ESVs. We propose an information distillation module to extract the shared, special, and conflicting information from ESV multimodal features. Additionally, we employ a hierarchical multitask classification module to capture feature-level and label-level dependencies. We conduct extensive experiments to evaluate the prediction performance of the proposed framework. Taken together, our paper makes theoretical and methodological contributions to the IS and related literature.
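
    As a rough illustration of the kind of mixed-effects analysis described above, the sketch below regresses average viewing duration on the four annotated content attributes with a random intercept per product category. The file name and column names are assumptions for illustration only, not the paper's actual data schema.

```python
# Minimal sketch of a mixed-effects analysis of viewing duration.
# Column and file names are hypothetical, not the authors' schema.
import pandas as pd
import statsmodels.formula.api as smf

esv = pd.read_csv("esv_dataset.csv")  # hypothetical: one row per ESV

model = smf.mixedlm(
    "avg_viewing_duration ~ product_description + product_demonstration"
    " + pleasure + aesthetics",
    data=esv,
    groups=esv["product_category"],  # random intercept per product category
)
result = model.fit()
print(result.summary())
```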

    Experiences from the ImageCLEF Medical Retrieval and Annotation Tasks

    The medical tasks in ImageCLEF were run every year from 2004 to 2018, and many different tasks and data sets have been used over these years. The resources created are used by many researchers well beyond the actual evaluation campaigns and allow the performance of many techniques to be compared on the same grounds and in a reproducible way. Many of the larger data sets are drawn from the medical literature, as such images are easier to obtain and to share than clinical data, which was used in a few smaller ImageCLEF challenges that are specifically marked with the disease type and anatomic region. This chapter describes the main results of the various tasks over the years, including the data, participants, and types of tasks evaluated, as well as the lessons learned in organizing such tasks for the scientific community.

    A Multimodal Approach to Sarcasm Detection on Social Media

    In recent times, a major share of human communication takes place online, mainly because of the ease of communication on social networking sites (SNSs). Due to the variety and large number of users, SNSs have drawn the attention of the computer science (CS) community, particularly the affective computing (also known as emotional AI), information retrieval, natural language processing, and data mining groups. Researchers are trying to make computers understand the nuances of human communication, including sentiment and sarcasm. Emotion or sentiment detection requires more insight into the communication than factual information retrieval does. Sarcasm detection is more difficult still than categorizing sentiment, because in sarcasm the intended meaning of the expression is the opposite of its literal meaning. Because of its complex nature, it is often difficult even for humans to detect sarcasm without proper context. However, people on social media succeed in detecting sarcasm despite interacting with strangers across the world. That motivates us to investigate the human process of detecting sarcasm on social media, where abundant context information is often unavailable and the users communicating with each other are rarely well acquainted. We have conducted a qualitative study to examine the patterns with which users convey sarcasm on social media. Whereas most sarcasm detection systems work on a word-by-word basis, we focused on the holistic sentiment conveyed by the post. We argue that relying on word-level information limits a system's performance to the domain of the dataset used to train it and might not transfer well to non-English languages. As an endeavor to make our system less dependent on text data, we proposed a multimodal approach for sarcasm detection. We showed the applicability of images and reaction emoticons as additional sources of hints about the sentiment of a post. Our research showed superior results for the multimodal approach compared to a unimodal one. Multimodal sarcasm detection systems such as the one presented in this research, with the inclusion of more modes or sources of data, might lead to better sarcasm detection models.
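
    The sketch below illustrates the general late-fusion idea behind a multimodal approach of this kind: text, emoticon, and image signals are extracted separately, concatenated, and fed to a single classifier. The feature extractors, emoticon list, and model choice are illustrative assumptions, not the system actually built in this thesis.

```python
# Late-fusion sketch for multimodal sarcasm detection (illustrative only).
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

def emoticon_counts(posts, emoticons=(":)", ":(", ":P", ";)")):
    # Count reaction emoticons per post as a crude "reaction" modality.
    return np.array([[p.count(e) for e in emoticons] for p in posts])

def image_features(posts):
    # Placeholder: in practice, use embeddings from a pretrained CNN for the
    # image attached to each post (zeros here when no image is available).
    return np.zeros((len(posts), 128))

def fit_sarcasm_classifier(posts, labels):
    vec = TfidfVectorizer(min_df=2)
    X_text = vec.fit_transform(posts)
    X = hstack([X_text,
                csr_matrix(emoticon_counts(posts)),
                csr_matrix(image_features(posts))])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    return vec, clf
```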

    The role of context in image annotation and recommendation

    With the rise of smart phones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with < 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), where one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research the semantic gap still largely exists, and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-exist in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy: does NY refer to New York or New Year? This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios. In part II, we combine textual, content-based (e.g. the number of faces present) and contextual (e.g. the day of the week the photo was taken) signals for tag recommendation purposes, achieving up to a 75% improvement in precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to the (slower moving) Flickr on which to build recommendation models, showing that similar accuracy can be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images, such as memes and visual duplicates, before (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features of an image, we are able to predict whether it will become “popular” (or not) with 74% accuracy, using an SVM classifier. Finally, in chapter 9 we employ blur detection and perceptual-hash clustering to remove noisy images from lifelogs, before combining visual and geo-temporal signals to capture a user’s “key moments” within their day. We believe that the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
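
    As a toy illustration of the tag co-occurrence idea that underpins most PTR baselines (and that the contextual signals above are designed to complement), the sketch below scores candidate tags by how often they have historically co-occurred with the tags already on a photo. The data and helper names are illustrative assumptions only.

```python
# Toy co-occurrence-based photo tag recommendation sketch (illustrative only).
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(tag_lists):
    # Count how often each pair of tags appears on the same historical photo.
    co = defaultdict(Counter)
    for tags in tag_lists:
        for a, b in combinations(set(tags), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend(existing_tags, co, k=5):
    # Sum co-occurrence counts over the tags already present, then rank.
    scores = Counter()
    for t in existing_tags:
        scores.update(co.get(t, Counter()))
    for t in existing_tags:  # do not re-suggest tags the photo already has
        scores.pop(t, None)
    return [tag for tag, _ in scores.most_common(k)]

history = [["ny", "timessquare", "night"], ["ny", "broadway"], ["newyear", "fireworks"]]
co = build_cooccurrence(history)
print(recommend(["ny"], co))  # e.g. ['timessquare', 'night', 'broadway']
```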

    A proactive chatbot framework designed to assist students based on the PS2CLH model

    Nowadays, universities are using new technologies to improve the efficiency and effectiveness of learning and to help students enhance their academic performance. In fact, for decades, new ways to convey the information required to teach and support students have slowly been integrated into education, starting with the popularity of e-mail and the Web. A review of the relevant literature revealed that learning requires more innovative and efficient technologies to cope with natural learning challenges, highlighting a need for more effective tools to establish interaction between humans and machines, lecturers and students. In addition, the COVID-19 pandemic presented new challenges for the collaboration and interaction of lecturers and students at universities, leading to great demand for such tools. Researchers have been trying to develop such tools for decades and have made good progress, but the tools are still in their infancy. The significant evolution in computer hardware over the last decade has led to advances in AI, machine learning and deep learning, which have made tools such as chatbots more usable. However, the efficiency and effectiveness of chatbots are still insufficient to meet many educational needs. According to our investigation, current chatbots are mainly based on subject knowledge and therefore give users answers that take no account of their personal circumstances, which is essential in education. This research aims to design a proactive chatbot framework to assist students. The new chatbot framework integrates students’ learning profiles and subject knowledge, making the chatbot more intelligent so that it can improve student learning and interaction more effectively. The research consists of two main parts. The first part seeks to determine the most effective students’ learning profiles on the basis of the controllable academic factors which affect their performance. The second part develops a chatbot framework to which students’ learning profiles are applied. Due to the different nature of these two endeavours, a hybrid methodology was used in this research. The literature on learners’ characteristics and the academic factors that affect their performance was reviewed in depth, and this formed the basis for developing a new PS2CLH (psychology, self-responsibility, sociology, communication, learning and health & wellbeing) model on which an individual’s web profile can be built. The PS2CLH model combines the perspectives of psychology, self-responsibility, sociology, communication, learning and health & wellbeing to build a student-controllable learning factor model. This study identifies the impact of students’ controllable factors on their achievement; the model was found to be 94% accurate. In addition, this research raised participating students’ awareness of the PS2CLH perspectives, which helped learners and educators manage the factors affecting academic performance more effectively. A comprehensive investigation, including a survey, showed that chatbots supported by AI technology performed better and more efficiently in various assistant situations, including education. However, there is still room for improvement in the effectiveness of educational chatbots. Therefore, the research proposes a new chatbot framework assistant which integrates students’ learning profiles and develops components to improve student interaction. The new framework uses knowledge from the PS2CLH model and AI deep learning to build a proactive chatbot that assists students with their academic subjects and with the controllable factors that affect their performance. One of the principal novelties of the chatbot framework lies in its role as a communication facilitator between student and lecturer/assistant. The proactive chatbot applies multimodality to the students’ learning process to retain their attention and explain content in different ways, using text, image, video and audio, to assist students and improve their learning experience effectively. Furthermore, the chatbot proactively suggests new controllable factors for students to work on, including related factors that influence their academic performance. Tests of the framework showed that the proactive chatbot achieved better question-response accuracy than a current BERT (Bidirectional Encoder Representations from Transformers) chatbot and presented a more effective learning method for students.
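
    Purely as a hedged sketch of the general idea of pairing subject knowledge with a student profile, the snippet below wires an off-the-shelf extractive QA model to a toy knowledge passage and a PS2CLH-style profile string. It is not the framework built in this thesis; the model name, profile format, and knowledge text are assumptions for illustration.

```python
# Illustrative sketch only: answer from subject knowledge, then adapt delivery
# using a student-profile string. The real framework is considerably richer.
from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")  # assumed model

subject_knowledge = (
    "A binary search tree is a data structure in which each node's left "
    "subtree holds smaller keys and its right subtree holds larger keys."
)
student_profile = (
    "The student reports high exam anxiety (psychology) and prefers "
    "video explanations over long texts (learning)."
)

question = "How are keys arranged in a binary search tree?"
answer = qa(question=question, context=subject_knowledge)

# A proactive layer could then adapt the delivery using the profile,
# e.g. attach a short video and a reassuring prompt for an anxious student.
print(answer["answer"], "| adapt delivery using:", student_profile)
```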

    Identifying Antecedents to Learning Effectively with Digital Media: A Student-Centered Approach

    Digital media is becoming more pervasive in the classroom. Even in Germany, which has hesitated compared with other OECD countries to bring technology into the classroom, there is increasing pressure to use digital media for teaching and learning processes (Gerick, Eickelmann, & Bos, 2017). This effort to equip schools with digital media such as tablets has emerged despite it not being known how regular digital media use in classrooms affects student learning processes. Although research has attempted to keep up with the pace at which digital media has entered classrooms, it has tended to emphasize gains in student achievement (Lai & Bower, 2020), with much less attention paid to the factors that precede student learning processes. To understand how students may learn with digital media in classrooms, recent conceptions of learning have highlighted that students arrive in the classroom with a range of learning skills, beliefs, prior knowledge, and experiences that significantly influence how they interpret their learning environments and acquire new knowledge (Bransford, Brown, & Cocking, 2000). Rather than looking at student learning outcomes, this dissertation builds on previous theories and models about how students learn in classrooms to understand the antecedent factors that precede effective learning processes in classrooms with digital media. Following the opportunity-to-learn model (Seidel, 2014), students’ previous learning environments, including their families, influence students’ individual learning prerequisites, such as their cognitive and motivational-affective characteristics, which in turn affect students’ individual learning processes and subsequently their learning outcomes (Seidel, 2014). Therefore, I investigated (1) how parents’ beliefs and behaviors at home affect the development of students’ digital media self-efficacy, (2) how students perceive instructional quality in classrooms with digital media depending on their cognitive and motivational-affective characteristics, and (3) how students’ perceptions compared in classes with and without digital media as well as in classes where teachers had lower or higher technology innovativeness. These questions were addressed in three empirical studies that used data from a school trial investigating the use of digital media in classrooms. In Study 1, I investigated how students’ family environments and experiences at home shape the development of their digital media self-efficacy. Specifically, using the parent socialization model, one link of the widely used expectancy-value model framework (Eccles et al., 1983), I examined whether parents’ behaviors, including modeling and provision of digital media, mediated the relation between parents’ value beliefs regarding digital media and students’ digital media self-efficacy (N = 1,206 students and their parents). A questionnaire was developed to assess parents’ beliefs and behaviors regarding digital media. Results showed that although parents’ value beliefs were related to students’ digital media self-efficacy, only parents’ provision of smartphones mediated this relation. The findings indicate the importance of parents’ beliefs regarding digital media and the need for future research into the at-home factors that influence students’ digital media self-efficacy. In Study 2 and Study 3, I investigated students’ perceptions of supportive climate and cognitive activation in classes with tablets to understand how tablet use may affect how students experience their new learning context and, in turn, inform students’ learning outcomes. In both studies, I used latent profile analysis to first examine whether students could be grouped into distinct profiles based on their subject-specific motivational and cognitive characteristics and whether these profiles differentially predicted students’ perceptions. In Study 2, I compared the profiles’ perceptions of supportive climate in biology classes with (n = 518 students) and without tablets (n = 540 students). After four months of tablet use, the ‘struggling’ and ‘unmotivated’ profiles perceived supportive climate significantly more positively than the same profiles in classes that were not given tablets. Building on these findings, in Study 3 I investigated whether there were differences in students’ perceptions of supportive climate and cognitive activation in math classes with tablets depending on teachers’ beliefs towards using technology (n = 575 students; n = 23 teachers). I found that most students perceived instructional practices more positively in classes where teachers had higher technology innovativeness, with the exception of the ‘unmotivated’ profile, which perceived instructional practices more negatively. The contribution of this dissertation is to show that students perceive instruction with digital media differently depending on their cognitive and motivational-affective characteristics. Understanding that not all students will perceive and learn with digital media in the same way has important implications for teachers’ use of digital media in the classroom as well as for researchers investigating how digital media facilitates student learning. Furthermore, students’ previous experiences with digital media and characteristics such as digital media self-efficacy can affect how they feel towards and learn with digital media. Moving forward, research exploring how learning with digital media in classrooms takes place should also examine factors outside the classroom, such as students’ experiences with digital media at home.
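
    For readers unfamiliar with the method used in Studies 2 and 3, the sketch below shows one common way to approximate latent profile analysis in code: fitting Gaussian mixture models over standardized student characteristics and choosing the number of profiles via BIC. The indicators and sample are simulated stand-ins, not the dissertation's data.

```python
# Latent-profile-style analysis approximated with Gaussian mixtures (sketch).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical standardized indicators: interest, self-concept, prior knowledge
X = rng.normal(size=(500, 3))

# Fit candidate models with 2-5 profiles and pick the one with the lowest BIC.
models = {k: GaussianMixture(n_components=k, random_state=0).fit(X)
          for k in range(2, 6)}
best_k = min(models, key=lambda k: models[k].bic(X))
profiles = models[best_k].predict(X)  # profile membership per student
print(best_k, np.bincount(profiles))
```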

    Machine learning into metaheuristics: A survey and taxonomy of data-driven metaheuristics

    In recent years, research on applying machine learning (ML) to design efficient, effective and robust metaheuristics has become increasingly popular. Many of these data-driven metaheuristics have generated high-quality results and represent state-of-the-art optimization algorithms. Although various approaches have been proposed, a comprehensive survey and taxonomy of this research topic is lacking. In this paper we investigate different opportunities for using ML in metaheuristics. We define in a uniform way the various synergies that might be achieved. A detailed taxonomy is proposed according to the search component concerned: the target optimization problem, and the low-level and high-level components of metaheuristics. Our goal is also to motivate researchers in optimization to incorporate ideas from ML into metaheuristics. We identify some open research issues in this topic which need further in-depth investigation.
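
    As one small, concrete example of the kind of synergy surveyed here (ML inside a low-level search component), the sketch below uses a random-forest surrogate to pre-screen offspring in a simple evolution strategy, so the expensive objective is evaluated only on the most promising candidate. The problem, operators, and parameters are illustrative assumptions, not taken from the survey.

```python
# Surrogate-assisted (1+lambda) evolution strategy sketch (illustrative only).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def expensive_objective(x):
    # Stand-in for a costly simulation or evaluation.
    return float(np.sum(x ** 2))

rng = np.random.default_rng(1)
x = rng.normal(size=10)
fx = expensive_objective(x)
archive_X, archive_y = [x], [fx]  # all truly evaluated solutions

for _ in range(50):
    # Learn a cheap surrogate of the objective from the evaluation archive.
    surrogate = RandomForestRegressor(n_estimators=50).fit(archive_X, archive_y)
    offspring = [x + 0.3 * rng.normal(size=10) for _ in range(20)]
    # Screen offspring with the surrogate; evaluate only the best-looking one.
    best = min(offspring, key=lambda c: surrogate.predict([c])[0])
    f_best = expensive_objective(best)
    archive_X.append(best)
    archive_y.append(f_best)
    if f_best < fx:
        x, fx = best, f_best

print(fx)
```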