
    Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval

    The heterogeneity gap is the main challenge in cross-modal retrieval, because cross-modal data (e.g. audio-visual) have different distributions and representations that cannot be directly compared. To bridge the gap between the audio and visual modalities, we learn a common subspace for them by utilizing the intrinsic correlation in the natural synchronization of audio-visual data with the aid of annotated labels. TNN-CCCA is the best audio-visual cross-modal retrieval (AV-CMR) model so far, but its training is sensitive to hard negative samples when learning the common subspace, because it applies triplet loss to predict the relative distance between inputs. In this paper, to reduce the interference of hard negative samples in representation learning, we propose a new AV-CMR model that optimizes semantic features by directly predicting labels and then measuring the intrinsic correlation between audio-visual data using a complete cross-triplet loss. In particular, our model projects audio-visual features into label space by minimizing the distance between the predicted label features after feature projection and the ground label representations. Moreover, we adopt the complete cross-triplet loss to optimize the predicted label features by leveraging the relationship between all possible similarity and dissimilarity semantic information across modalities. Extensive experimental results on two audio-visual double-checked datasets show an improvement of approximately 2.1% in average MAP over the current state-of-the-art method TNN-CCCA for the AV-CMR task, which indicates the effectiveness of our proposed model. Comment: 9 pages, 5 figures, 3 tables, accepted by IEEE ISM 202
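As a rough illustration of the triplet-loss idea this abstract refers to, the sketch below computes a generic margin-based triplet loss with anchors taken from one modality and positives and negatives from the other, in both directions. It is not the paper's complete cross-triplet loss: the function names, the Euclidean distance, and the margin value are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: the positive should be closer to the anchor than the negative by `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def cross_modal_triplet_loss(audio, visual, labels, margin=1.0):
    """Average triplet loss over all cross-modal triplets (hypothetical helper).

    Anchors come from one modality; positives (same label) and negatives
    (different label) come from the other. Both directions are covered.
    """
    losses = []
    n = len(labels)
    for anchors, others in ((audio, visual), (visual, audio)):
        for i in range(n):
            for p in range(n):
                if labels[p] != labels[i]:
                    continue  # p must share the anchor's label
                for q in range(n):
                    if labels[q] == labels[i]:
                        continue  # q must have a different label
                    losses.append(triplet_loss(anchors[i], others[p], others[q], margin))
    return sum(losses) / len(losses) if losses else 0.0
```

Enumerating every triplet is cubic in the batch size, so this form is only practical for small batches; it is written this way to keep the definition visible.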

    Temporal Success Analyses in Music Collaboration Networks: Brazilian and Global Scenarios

    Collaboration is a part of the music industry and has increased over recent decades, but little is known about its effects on success and evolution. Our goal is to analyze how success has evolved over collaboration networks and to compare the global scenario with a local, thriving one: the Brazilian music industry. Specifically, we build collaboration networks from data collected from Spotify's Global and Brazilian daily charts, analyze them, and identify collaboration profiles in these networks. Analyses of their topological characteristics reveal collaboration patterns mapped into four different profiles: Standard, Niche, Ephemeral and Absent, of which the first two have a higher level of success. Furthermore, we go deeper by evaluating the temporal evolution of these profiles through case studies: pop and k-pop globally, and pop and forró in Brazil. Overall, our findings emphasize the importance of collaboration profiles in assessing success, and show differences between the global and Brazilian scenarios.
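A minimal sketch of how a collaboration network could be derived from chart data, assuming each chart entry is reduced to the list of artists credited on a track. The input shape and the function name are hypothetical, and the paper's actual construction may differ.

```python
from collections import Counter
from itertools import combinations


def build_collaboration_edges(tracks):
    """Count how often each pair of artists is credited on the same track.

    `tracks` is an iterable of artist-name lists (an assumed input format).
    Returns a Counter mapping a sorted (artist_a, artist_b) pair to the
    number of shared tracks, which can serve as an edge weight.
    """
    edges = Counter()
    for artists in tracks:
        # Deduplicate and sort so each pair has one canonical orientation.
        for pair in combinations(sorted(set(artists)), 2):
            edges[pair] += 1
    return edges


edges = build_collaboration_edges([["A", "B"], ["A", "B", "C"], ["D"]])
```

Solo tracks contribute no edges, so an artist who never collaborates does not appear in the edge list at all.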

    Selected Papers from the First International Symposium on Future ICT (Future-ICT 2019) in Conjunction with 4th International Symposium on Mobile Internet Security (MobiSec 2019)

    The International Symposium on Future ICT (Future-ICT 2019), in conjunction with the 4th International Symposium on Mobile Internet Security (MobiSec 2019), was held on 17–19 October 2019 in Taichung, Taiwan. The symposium gave academic and industry professionals an opportunity to discuss the latest issues and progress in advancing smart applications based on future ICT and its related security. The symposium aimed to publish high-quality papers strictly related to the theories and practical applications concerning advanced smart applications, future ICT, and related communications and networks. It was expected that the symposium and its publications would trigger further related research and technology improvements in this field.

    Multimodal Automated Fact-Checking: A Survey

    Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks unique to multimodal misinformation. Furthermore, we discuss related terms used in different communities and map them to our framework. We focus on four modalities prevalent in real-world fact-checking: text, image, audio, and video. We survey benchmarks and models, and discuss limitations and promising directions for future research. Comment: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP): Findings

    Listener Modeling and Context-aware Music Recommendation Based on Country Archetypes

    Music preferences are strongly shaped by the cultural and socio-economic background of the listener, which is reflected, to a considerable extent, in country-specific music listening profiles. Previous work has already identified several country-specific differences in the popularity distribution of music artists listened to. In particular, what constitutes the "music mainstream" varies strongly between countries. To complement and extend these results, the article at hand delivers the following major contributions: First, using state-of-the-art unsupervised learning techniques, we identify and thoroughly investigate (1) country profiles of music preferences on the fine-grained level of music tracks (in contrast to earlier work that relied on music preferences on the artist level) and (2) country archetypes that subsume countries sharing similar patterns of listening preferences. Second, we formulate four user models that leverage the user's country information on music preferences. Among others, we propose a user modeling approach that describes a music listener as a vector of similarities over the identified country clusters or archetypes. Third, we propose a context-aware music recommendation system that leverages implicit user feedback, where context is defined via the four user models. More precisely, it is a multi-layer generative model based on a variational autoencoder, in which contextual features can influence recommendations through a gating mechanism. Fourth, we thoroughly evaluate the proposed recommendation system and user models on a real-world corpus of more than one billion listening records of users around the world (out of which we use 369 million in our experiments) and show its merits vis-a-vis state-of-the-art algorithms that do not exploit this type of context information. Comment: 30 pages, 3 tables, 12 figures

    A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness

    People increasingly use videos on the Web as a source for learning. To support this way of learning, researchers and developers are continuously developing tools, proposing guidelines, analyzing data, and conducting experiments. However, it is still not clear what characteristics a video should have to be an effective learning medium. In this paper, we present a comprehensive review of 257 articles on video-based learning for the period from 2016 to 2021. One of the aims of the review is to identify the video characteristics that have been explored by previous work. Based on our analysis, we suggest a taxonomy that organizes the video characteristics and contextual aspects into eight categories: (1) audio features, (2) visual features, (3) textual features, (4) instructor behavior, (5) learner activities, (6) interactive features (quizzes, etc.), (7) production style, and (8) instructional design. We also identify four representative research directions: (1) proposals of tools to support video-based learning, (2) studies with controlled experiments, (3) data analysis studies, and (4) proposals of design guidelines for learning videos. We find that the most explored characteristics are textual features, followed by visual features, learner activities, and interactive features. Transcript text, video frames, and images (figures and illustrations) are most frequently used by tools that support learning through videos. Learner activity is heavily explored through log files in data analysis studies, and interactive features have been frequently scrutinized in controlled experiments. We complement our review by contrasting research findings that investigate the impact of video characteristics on learning effectiveness, reporting on tasks and technologies used to develop tools that support learning, and summarizing trends in design guidelines for producing learning videos.

    Investigating cross-country relationship between users' social ties and music mainstreaminess

    We investigate the complex relationship between the factors (i) preference for music mainstream, (ii) social ties in an online music platform, and (iii) demographics. We define (i) on a global and a country level, (ii) by several network centrality measures such as Jaccard index among users’ connections, closeness centrality, and betweenness centrality, and (iii) by country and age information. Using the LFM-1b dataset of listening events of Last.fm users, we uncover country-dependent differences in consumption of mainstream music as well as in user behavior with respect to social ties and users’ centrality. We find that users inclined to mainstream music tend to have stronger connections than the group of less mainstream-inclined users. Furthermore, our analysis reveals that users typically have fewer connections within a country than across countries, although the within-country connections are the stronger social ties. The results will help build better user models of listeners and in turn improve personalized music retrieval and recommendation algorithms.
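For readers unfamiliar with the Jaccard index mentioned in this abstract, the sketch below computes it for two users' connection sets: the size of the intersection divided by the size of the union. The function name and the convention of returning 0.0 for two empty sets are assumptions of this example, not details taken from the paper.

```python
def jaccard_index(connections_a, connections_b):
    """Jaccard index of two connection sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(connections_a), set(connections_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when neither user has connections
    return len(a & b) / len(union)


# Two users sharing two of four distinct connections.
overlap = jaccard_index({"u1", "u2", "u3"}, {"u2", "u3", "u4"})
```

A value near 1 means the two users have almost the same set of connections; a value of 0 means they share none.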

    A Survey of Wireless Communication Technologies & Their Performance for High Speed Railways

    High Speed Railway (HSR) provides its customers not only safety, security, comfort and on-time commuting, but also a fast transportation alternative to air travel or regular passenger rail services. Providing these benefits would not be possible without the tremendous growth and prevalence of wireless communication technologies. Due to advances in wireless communication systems, both trains and passengers are connected through high speed wireless networks to the Internet, data centers and railroad control centers. Railroad communities, academia, related industries and standards bodies, even the European Space Agency, are involved in advancing developments of HSR for highly connected train communication systems. The goal of these efforts is to provide uninterrupted, high-speed, fault-tolerant communication networks for all possible geographic, structural and weather conditions. This survey provides an overview of the current state of the art and future trends for wireless technologies aiming to realize the concept of HSR communication services. Our goal is to highlight the challenges for these technologies, including GSM-R, Wi-Fi, WiMAX, LTE-R, RoF, LCX & Cognitive Radio, the offered solutions, their performance, and other related issues. Currently, providing HSR services is the goal of many countries across the globe. Europe, Japan & Taiwan, China, as well as North & South America have increased their efforts to advance HSR technologies, not only to monitor and control operations but also to deliver extensive broadband solutions to passengers. This survey identified an industry trend to transition control plane operations towards narrowband frequencies, i.e. LTE400/700, and to concurrently use other technologies for passenger broadband access, such that the services of both users and train control systems are supported. With traditional technologies, a tradeoff was required and often favored train control services over passenger amenities. However, with the advances in communication systems, such as LTE-R and cognitive radios, it is becoming possible for system designers to offer rich services to passengers while also providing support for enhanced train control operations such as Positive Train Control.