212,065 research outputs found

    One Deep Music Representation to Rule Them All? : A comparative analysis of different representation learning strategies

    Inspired by the success of deploying deep learning in the fields of Computer Vision and Natural Language Processing, this learning paradigm has also found its way into the field of Music Information Retrieval. In order to benefit from deep learning in an effective, but also efficient manner, deep transfer learning has become a common approach. In this approach, it is possible to reuse the output of a pre-trained neural network as the basis for a new learning task. The underlying hypothesis is that if the initial and new learning tasks show commonalities and are applied to the same type of input data (e.g. music audio), the generated deep representation of the data is also informative for the new task. Since, however, most of the networks used to generate deep representations are trained using a single initial learning source, their representation is unlikely to be informative for all possible future tasks. In this paper, we present the results of our investigation into the most important factors for generating deep representations for the data and learning tasks in the music domain. We conducted this investigation via an extensive empirical study that involves multiple learning sources, as well as multiple deep learning architectures with varying levels of information sharing between sources, in order to learn music representations. We then validate these representations considering multiple target datasets for evaluation. The results of our experiments yield several insights on how to approach the design of methods for learning widely deployable deep data representations in the music domain. Comment: This work has been accepted to "Neural Computing and Applications: Special Issue on Deep Learning for Music and Audio"

    Introduction to the Special Section on Computational Modeling and Understanding of Emotions in Conflictual Social Interactions

    The editorial work of C. Clavel for this special issue was partially supported by a grant overseen by the French National Research Agency (ANR17-MAOI) and by the European project H2020 ANIMATAS (MSCA-ITN-ETN 7659552). The editorial work of V. Patti was partially funded by Progetto di Ateneo/CSP 2016 (Immigrants, Hate and Prejudice in Social Media, S1618_L2_BOSC_01). P. Rosso was partially funded by Spanish MICINN under the research project MISMIS-FAKEnHATE on MISinformation and MIScommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). Damiano, R.; Patti, V.; Clavel, C.; Rosso, P. (2020). Introduction to the Special Section on Computational Modeling and Understanding of Emotions in Conflictual Social Interactions. ACM Transactions on Internet Technology. 20(2):1-5. https://doi.org/10.1145/3392334

    Introduction to the special issue on cross-language algorithms and applications

    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing. Postprint (published version)

    Open Vocabulary Learning on Source Code with a Graph-Structured Cache

    Machine learning models that take computer program source code as input typically use Natural Language Processing (NLP) techniques. However, a major challenge is that code is written using an open, rapidly changing vocabulary due to, e.g., the coinage of new variable and method names. Reasoning over such a vocabulary is not something for which most NLP methods are designed. We introduce a Graph-Structured Cache to address this problem; this cache contains a node for each new word the model encounters with edges connecting each word to its occurrences in the code. We find that combining this graph-structured cache strategy with recent Graph-Neural-Network-based models for supervised learning on code improves the models' performance on a code completion task and a variable naming task, with over 100% relative improvement on the latter, at the cost of a moderate increase in computation time. Comment: Published in the International Conference on Machine Learning (ICML 2019), 13 pages
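
    The cache structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the `known_vocab` set, and the function names `build_vocab_cache` and `cache_edges` are assumptions made for illustration, and the graph neural network that would consume the edges is omitted.

```python
import re
from collections import defaultdict


def build_vocab_cache(tokens, known_vocab):
    """Map each token outside known_vocab to the positions where it occurs.

    Each key plays the role of a cache node for a new word; each position
    is an occurrence that the node will be linked to.
    """
    cache = defaultdict(list)
    for position, token in enumerate(tokens):
        if token not in known_vocab:
            cache[token].append(position)
    return dict(cache)


def cache_edges(cache):
    """Flatten the cache into (word, occurrence position) edges."""
    return [(word, pos) for word, positions in cache.items() for pos in positions]


# Hypothetical input: a one-line program with two newly coined identifiers.
source = "total_cnt = total_cnt + step_sz"
tokens = re.findall(r"[A-Za-z_]\w*|\S", source)  # identifiers or single symbols
known_vocab = {"=", "+"}  # assumed fixed vocabulary

cache = build_vocab_cache(tokens, known_vocab)
edges = cache_edges(cache)
```

    Here both occurrences of `total_cnt` attach to a single cache node, which is what would let a downstream model relate uses of an identifier it has never seen before.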

    Sequence Mining and Pattern Analysis in Drilling Reports with Deep Natural Language Processing

    Drilling activities in the oil and gas industry have been reported over decades for thousands of wells on a daily basis, yet the analysis of this text at large scale for information retrieval, sequence mining, and pattern analysis is very challenging. Drilling reports contain interpretations written by drillers based on measurements from downhole sensors and surface equipment, and can be used for operation optimization and accident mitigation. In this initial work, a methodology is proposed for automatic classification of sentences written in drilling reports into three relevant labels (EVENT, SYMPTOM and ACTION) for hundreds of wells in an actual field. Some of the main challenges in the text corpus were overcome, which include the high frequency of technical symbols, mistyping/abbreviation of technical terms, and the presence of incomplete sentences in the drilling reports. We obtain state-of-the-art classification accuracy within this technical language and illustrate advanced queries enabled by the tool. Comment: 7 pages, 14 figures, technical report

    Controlling Risk of Web Question Answering

    Web question answering (QA) has become an indispensable component in modern search systems, which can significantly improve users' search experience by providing a direct answer to users' information need. This could be achieved by applying machine reading comprehension (MRC) models over the retrieved passages to extract answers with respect to the search query. With the development of deep learning techniques, state-of-the-art MRC performances have been achieved by recent deep methods. However, existing studies on MRC seldom address the predictive uncertainty issue, i.e., how likely it is that the prediction of an MRC model is wrong, leading to uncontrollable risks in real-world Web QA applications. In this work, we first conduct an in-depth investigation over the risk of Web QA. We then introduce a novel risk control framework, which consists of a qualify model for uncertainty estimation using the probe idea, and a decision model for selective output. For evaluation, we introduce risk-related metrics, rather than the traditional EM and F1 in MRC, for the evaluation of risk-aware Web QA. The empirical results over both the real-world Web QA dataset and the academic MRC benchmark collection demonstrate the effectiveness of our approach. Comment: 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
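
    The idea of selective output can be sketched as a simple threshold rule over confidence scores. This is only an illustration under stated assumptions: the confidence values are taken as given (the paper's probe-based qualify model is not reproduced), the function names and sample data are hypothetical, and coverage and risk are computed here in the generic selective-prediction sense, which may differ from the metrics the paper defines.

```python
def selective_answers(predictions, threshold):
    """Return each answer only if its confidence reaches the threshold.

    predictions: list of (answer, confidence) pairs; abstentions become None.
    """
    return [answer if conf >= threshold else None for answer, conf in predictions]


def coverage_and_risk(outputs, correct):
    """Coverage: fraction of questions answered.
    Risk: fraction of the answered questions that are wrong."""
    answered = [is_right for out, is_right in zip(outputs, correct) if out is not None]
    coverage = len(answered) / len(outputs) if outputs else 0.0
    risk = sum(1 for is_right in answered if not is_right) / len(answered) if answered else 0.0
    return coverage, risk


# Hypothetical model outputs with assumed confidence scores.
predictions = [("Paris", 0.9), ("1987", 0.4), ("blue", 0.8), ("42", 0.7)]
correct = [True, False, False, True]

outputs = selective_answers(predictions, threshold=0.75)
coverage, risk = coverage_and_risk(outputs, correct)
```

    Raising the threshold trades coverage for lower risk only to the extent that the confidence scores actually track correctness, which is why the quality of the uncertainty estimate matters.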