74 research outputs found

Complete Cross-triplet Loss in Label Space for Audio-visual Cross-modal Retrieval

Author: Ikeda Kazushi
Wang Yanan
Wu Jianming
Zeng Donghuo
Publication date: 07/11/2022

The heterogeneity gap problem is the main challenge in cross-modal retrieval, because cross-modal data (e.g. audiovisual) have different distributions and representations that cannot be directly compared. To bridge the gap between audiovisual modalities, we learn a common subspace for them by utilizing the intrinsic correlation in the natural synchronization of audio-visual data with the aid of annotated labels. TNN-CCCA is the best audio-visual cross-modal retrieval (AV-CMR) model so far, but its training is sensitive to hard negative samples when learning the common subspace by applying triplet loss to predict the relative distance between inputs. In this paper, to reduce the interference of hard negative samples in representation learning, we propose a new AV-CMR model that optimizes semantic features by directly predicting labels and then measuring the intrinsic correlation between audio-visual data using complete cross-triplet loss. In particular, our model projects audio-visual features into label space by minimizing the distance between predicted label features after feature projection and ground label representations. Moreover, we adopt complete cross-triplet loss to optimize the predicted label features by leveraging the relationship between all possible similarity and dissimilarity semantic information across modalities. Extensive experimental results on two audio-visual double-checked datasets show an improvement of approximately 2.1% in average MAP over the current state-of-the-art method TNN-CCCA for the AV-CMR task, which indicates the effectiveness of our proposed model.
Comment: 9 pages, 5 figures, 3 tables, accepted by IEEE ISM 202
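The triplet-loss idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' complete cross-triplet loss, whose exact formulation is not given here; it only assumes a standard margin-based triplet loss with Euclidean distance, applied with the anchor in one modality and the positive and negative in the other. The function names, the margin value, and the toy embeddings are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss (assumed form, not the paper's)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def cross_modal_triplet_loss(audio, visual, labels, margin=1.0):
    """Average triplet loss over all cross-modal triplets.

    Anchors come from one modality; positives (same label) and negatives
    (different label) come from the other. Both directions are covered.
    """
    total, count = 0.0, 0
    for anchors, others in ((audio, visual), (visual, audio)):
        for i, anchor in enumerate(anchors):
            for pos, pos_label in zip(others, labels):
                if pos_label != labels[i]:
                    continue
                for neg, neg_label in zip(others, labels):
                    if neg_label == labels[i]:
                        continue
                    total += triplet_loss(anchor, pos, neg, margin)
                    count += 1
    return total / count if count else 0.0


# Hypothetical toy embeddings: item k of each list shares labels[k].
audio = [[0.0, 0.0], [10.0, 10.0]]
visual = [[0.0, 1.0], [10.0, 9.0]]
labels = [0, 1]
loss = cross_modal_triplet_loss(audio, visual, labels)
```

In this toy data, matching audio and visual items lie close together and mismatched ones far apart, so every triplet already satisfies the margin. A negative that sits closer to the anchor than the positive yields a positive loss.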

arXiv.org e-Print Archive

Temporal Success Analyses in Music Collaboration Networks: Brazilian and Global Scenarios

Author: B. Seufitelli Danilo
M. Moro Mirella
O. Silva Mariana
P. Oliveira Gabriel
Publication venue: Universidade Estadual do Parana - Unespar
Publication date: 03/08/2023

Collaboration is a part of the music industry and has increased over recent decades, but little is known about its effects on success and evolution. Our goal is to analyze how success has evolved over collaboration networks and compare its global scenario to a local, thriving one: the Brazilian music industry. Specifically, we build collaboration networks from data collected from Spotify's Global and Brazilian daily charts, analyze them and identify collaboration profiles in such networks. Analyses of their topological characteristics reveal collaboration patterns mapped into four different profiles: Standard, Niche, Ephemeral and Absent, where the first two have a higher level of success. Furthermore, we go deeper by evaluating the temporal evolution of such profiles through case studies: pop and k-pop globally, and pop and forró in Brazil. Overall, our findings emphasize the importance of collaboration profiles in assessing success, and show differences between the global and Brazilian scenarios.
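One plausible way to build such a collaboration network from chart data is sketched below. It assumes each chart entry is simply a list of credited artist names and that an edge links two artists who share a track; the abstract does not specify the authors' actual construction, and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_collaboration_edges(tracks):
    """Count how often each unordered pair of artists shares a track."""
    edges = Counter()
    for artists in tracks:
        # sorted(set(...)) gives each pair one canonical orientation.
        for a, b in combinations(sorted(set(artists)), 2):
            edges[(a, b)] += 1
    return edges


def degree(edges):
    """Number of distinct collaborators per artist."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Hypothetical chart entries; solo tracks contribute no edges.
tracks = [["A", "B"], ["B", "A", "C"], ["D"]]
edges = build_collaboration_edges(tracks)
degrees = degree(edges)
```

Edge weights and degrees of this kind are the raw material for the topological analyses the abstract mentions. An artist with only solo tracks, such as "D" here, has degree zero.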

UNESPAR - Portal de Periódicos (E-Journal)

Temporal Success Analyses in Music Collaboration Networks: Brazilian and Global Scenarios

Author: Danilo B. Seufitelli
Gabriel P. Oliveira
Mariana O. Silva
Mirella M. Moro
Publication venue: Universidade Estadual do Paraná
Publication date: 01/08/2023

Directory of Open Access Journals

Selected Papers from the First International Symposium on Future ICT (Future-ICT 2019) in Conjunction with 4th International Symposium on Mobile Internet Security (MobiSec 2019)

Publication venue: MDPI AG
Publication date: 11/01/2022

The International Symposium on Future ICT (Future-ICT 2019), in conjunction with the 4th International Symposium on Mobile Internet Security (MobiSec 2019), was held on 17–19 October 2019 in Taichung, Taiwan. The symposium provided academic and industry professionals an opportunity to discuss the latest issues and progress in advancing smart applications based on future ICT and its related security. The symposium aimed to publish high-quality papers strictly related to the various theories and practical applications concerning advanced smart applications, future ICT, and related communications and networks. It was expected that the symposium and its publications would be a trigger for further related research and technology improvements in this field.

Directory of Open Access Books (DOAB)

Multimodal Automated Fact-Checking: A Survey

Author: Akhtar Mubashara
Cocarascu Oana
Guo Zhijiang
Schlichtkrull Michael
Simperl Elena
Vlachos Andreas
Publication date: 25/10/2023

Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks unique to multimodal misinformation. Furthermore, we discuss related terms used in different communities and map them to our framework. We focus on four modalities prevalent in real-world fact-checking: text, image, audio, and video. We survey benchmarks and models, and discuss limitations and promising directions for future research.
Comment: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP): Finding

arXiv.org e-Print Archive

Listener Modeling and Context-aware Music Recommendation Based on Country Archetypes

Author: Bauer Christine
Kowald Dominik
Lex Elisabeth
Reisinger Wolfgang
Schedl Markus
Sub Human-Centered Computing
Publication venue: Frontiers Media SA
Publication date: 11/09/2020

Music preferences are strongly shaped by the cultural and socio-economic background of the listener, which is reflected, to a considerable extent, in country-specific music listening profiles. Previous work has already identified several country-specific differences in the popularity distribution of music artists listened to. In particular, what constitutes the "music mainstream" strongly varies between countries. To complement and extend these results, the article at hand delivers the following major contributions: First, using state-of-the-art unsupervised learning techniques, we identify and thoroughly investigate (1) country profiles of music preferences on the fine-grained level of music tracks (in contrast to earlier work that relied on music preferences on the artist level) and (2) country archetypes that subsume countries sharing similar patterns of listening preferences. Second, we formulate four user models that leverage the user's country information on music preferences. Among others, we propose a user modeling approach to describe a music listener as a vector of similarities over the identified country clusters or archetypes. Third, we propose a context-aware music recommendation system that leverages implicit user feedback, where context is defined via the four user models. More precisely, it is a multi-layer generative model based on a variational autoencoder, in which contextual features can influence recommendations through a gating mechanism. Fourth, we thoroughly evaluate the proposed recommendation system and user models on a real-world corpus of more than one billion listening records of users around the world (out of which we use 369 million in our experiments) and show its merits vis-a-vis state-of-the-art algorithms that do not exploit this type of context information.
Comment: 30 pages, 3 tables, 12 figure
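The user model described as "a vector of similarities over the identified country clusters or archetypes" could look roughly like the sketch below. It assumes cosine similarity between a listener's track-play vector and each archetype centroid; the abstract does not state which similarity measure the authors use, and all names and values here are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def archetype_profile(user_vector, archetype_centroids):
    """Describe a listener as similarities to each archetype centroid."""
    return [cosine(user_vector, centroid) for centroid in archetype_centroids]


# Hypothetical two-track universe with two archetype centroids.
user = [1.0, 0.0]
centroids = [[2.0, 0.0], [0.0, 3.0]]
profile = archetype_profile(user, centroids)
```

A profile of this kind has one entry per archetype, so it stays compact even when the underlying track vocabulary is very large.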

arXiv.org e-Print Archive

Utrecht University Repository

Influence of speech codecs selection on transcoding steganography

Publication venue: Springer

Springer - Publisher Connector

A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness

Author: Ewerth Ralph
Hoppe Anett
Navarrete Evelyn
Nehring Andreas
Schanze Sascha
Publication date: 11/08/2023

People increasingly use videos on the Web as a source for learning. To support this way of learning, researchers and developers are continuously developing tools, proposing guidelines, analyzing data, and conducting experiments. However, it is still not clear what characteristics a video should have to be an effective learning medium. In this paper, we present a comprehensive review of 257 articles on video-based learning for the period from 2016 to 2021. One of the aims of the review is to identify the video characteristics that have been explored by previous work. Based on our analysis, we suggest a taxonomy which organizes the video characteristics and contextual aspects into eight categories: (1) audio features, (2) visual features, (3) textual features, (4) instructor behavior, (5) learner activities, (6) interactive features (quizzes, etc.), (7) production style, and (8) instructional design. Also, we identify four representative research directions: (1) proposals of tools to support video-based learning, (2) studies with controlled experiments, (3) data analysis studies, and (4) proposals of design guidelines for learning videos. We find that the most explored characteristics are textual features followed by visual features, learner activities, and interactive features. Text of transcripts, video frames, and images (figures and illustrations) are most frequently used by tools that support learning through videos. Learner activity is heavily explored through log files in data analysis studies, and interactive features have been frequently scrutinized in controlled experiments. We complement our review by contrasting research findings that investigate the impact of video characteristics on the learning effectiveness, report on tasks and technologies used to develop tools that support learning, and summarize trends of design guidelines to produce learning video

arXiv.org e-Print Archive

Investigating cross-country relationship between users' social ties and music mainstreaminess

Author: Bauer Christine
Schedl Markus
Publication date: 01/01/2018

We investigate the complex relationship between the factors (i) preference for music mainstream, (ii) social ties in an online music platform, and (iii) demographics. We define (i) on a global and a country level, (ii) by several network centrality measures such as Jaccard index among users’ connections, closeness centrality, and betweenness centrality, and (iii) by country and age information. Using the LFM-1b dataset of listening events of Last.fm users, we are able to uncover country-dependent differences in consumption of mainstream music as well as in user behavior with respect to social ties and users’ centrality. We find that users inclined to mainstream music tend to have stronger connections than the group of less mainstreamy users. Furthermore, our analysis reveals that users typically have fewer connections within a country than cross-country ones, although the former are stronger social ties. The results will help build better user models of listeners and in turn improve personalized music retrieval and recommendation algorithms.
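The Jaccard index named in this abstract is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to two users' sets of connections. How the authors aggregate it into a per-user measure is not stated here, so the function name, the empty-set convention, and the sample identifiers are assumptions.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical connection sets of two users.
friends_of_x = {"u1", "u2", "u3"}
friends_of_y = {"u2", "u3", "u4"}
overlap = jaccard(friends_of_x, friends_of_y)
```

Here the two users share two of four distinct connections, so the index is 0.5. Values closer to 1 indicate more heavily overlapping social neighbourhoods.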

Utrecht University Repository

A Survey of Wireless Communication Technologies & Their Performance for High Speed Railways

Author: Banerjee Subharthi
Hempel Michael
Sharif Hamid
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2016

High Speed Railway (HSR) provides its customers not only safety, security, comfort and on-time commuting, but also a fast transportation alternative to air travel or regular passenger rail services. Providing these benefits would not be possible without the tremendous growth and prevalence of wireless communication technologies. Due to advances in wireless communication systems, both trains and passengers are connected through high speed wireless networks to the Internet, data centers and railroad control centers. Railroad communities, academia, related industries and standards bodies, even the European Space Agency, are involved in advancing developments of HSR for highly connected train communication systems. The goal of these efforts is to provide the capabilities for uninterrupted high-speed fault-tolerant communication networks for all possible geographic, structural and weather conditions. This survey provides an overview of the current state-of-the-art and future trends for wireless technologies aiming to realize the concept of HSR communication services. Our goal is to highlight the challenges for these technologies, including GSM-R, Wi-Fi, WiMAX, LTE-R, RoF, LCX & Cognitive Radio, the offered solutions, their performance, and other related issues. Currently, providing HSR services is the goal of many countries across the globe. Europe, Japan & Taiwan, China, as well as North & South America have increased their efforts to advance HSR technologies, not only to monitor and control operations but also to deliver extensive broadband solutions to passengers. This survey determined a trend of the industry to transition control plane operations towards narrowband frequencies, i.e. LTE400/700, and to utilize concurrently other technologies for broadband access for passengers such that services of both user and train control systems are supported.
With traditional technologies, a tradeoff was required and often favored train control services over passenger amenities. However, with the advances in communication systems, such as LTE-R and cognitive radios, it is becoming possible for system designers to offer rich services to passengers while also providing support for enhanced train control operations such as Positive Train Control.