2 research outputs found

    PREDICTING MUSIC GENRE PREFERENCES BASED ON ONLINE COMMENTS

    Communication Accommodation Theory (CAT) states that individuals adapt to each other’s communicative behaviors. This adaptation is called “convergence.” In this work we explore the convergence of writing styles among users of the online music distribution platform SoundCloud.com. To evaluate our system we created a corpus of over 38,000 comments retrieved from SoundCloud in April 2014. The corpus represents comments from 8 distinct musical genres: Classical, Electronic, Hip Hop, Jazz, Country, Metal, Folk, and World. The comments are short and contain frequent misspellings, little sentence structure, hashtags, emoticons, and URLs. We adapt techniques used by researchers analyzing other short web-text corpora to deal with these problems. We use a supervised machine learning approach to classify the genre of comments in our corpus, and we examine the effects of different feature sets and supervised machine learning algorithms on classification accuracy. In total we ran 180 experiments in which we varied the number of genres, the feature set composition, and the machine learning algorithm. In experiments with all 8 genres we achieve up to 40% accuracy using either a Naive Bayes classifier or a C4.5-based classifier with a feature set consisting of 1262 token unigrams and bigrams. This represents a roughly threefold improvement over chance levels.
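The classification setup described in this abstract can be illustrated with a minimal sketch of a multinomial Naive Bayes classifier over token unigrams and bigrams. This is not the authors' system: the whitespace tokenizer, the add-one smoothing, the `ngrams` and `NaiveBayes` names, and the four toy comments are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text):
    """Lowercased whitespace tokens plus adjacent-token bigrams."""
    toks = text.lower().split()
    return toks + [a + " " + b for a, b in zip(toks, toks[1:])]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)  # label -> feature counts
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = ngrams(text)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feat_counts[label].values())
            score = math.log(count / n_docs)  # log prior
            for feat in ngrams(text):
                if feat in self.vocab:  # ignore features never seen in training
                    seen = self.feat_counts[label][feat]
                    score += math.log((seen + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy comments; the real corpus has over 38,000 comments.
train_texts = [
    "sick beat drop",
    "heavy bass drop",
    "lovely violin solo",
    "beautiful violin concerto",
]
train_labels = ["Electronic", "Electronic", "Classical", "Classical"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("that bass drop"))
print(model.predict("such a lovely violin"))
```

On the reported result: if the eight genres were equally represented (an assumption; the abstract does not state the class balance), uniform guessing would score 1/8 = 12.5%, so 40% accuracy is about 3.2 times chance.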

    Mining and Analyzing the Academic Network

    Social network research has attracted the interest of many researchers, not only for analyzing online social networking applications such as Facebook and Twitter, but also for providing comprehensive services in the scientific research domain. We define an Academic Network as a social network that integrates scientific factors, such as authors, papers, affiliations, and publishing venues, together with their relationships, such as co-authorship among authors and citations among papers. By mining and analyzing the academic network, we can provide users with comprehensive services such as searching for research experts, published papers, and conferences, as well as detecting research communities or the evolution of hot research topics. We can also recommend to users whom to collaborate with, whom to cite, and where to submit. In this dissertation, we investigate two main tasks that have fundamental applications in academic network research. In the first, we address the problem of expertise retrieval, also known as expert finding or ranking, in which we identify and return a ranked list of researchers, based upon their estimated expertise or reputation, in response to user-specified queries. In the second, we address the problem of research action recommendation (prediction), specifically the tasks of publishing venue recommendation, citation recommendation, and coauthor recommendation. For both tasks, our principal goal is to effectively mine and integrate heterogeneous information and thereby develop well-functioning ranking or recommender systems.
For the task of expertise retrieval, we first proposed or applied three modified PageRank-like algorithms to citation network analysis; we then proposed an enhanced author-topic model that simultaneously models citation and publishing venue information; finally, we incorporated a pairwise learning-to-rank algorithm into the traditional topic modeling process and further improved the model by integrating groups of author-specific features. For the task of research action recommendation, we first proposed an improved neighborhood-based collaborative filtering approach for publishing venue recommendation; we then applied our enhanced author-topic model and demonstrated its effectiveness in both cited author prediction and publishing venue prediction; finally, we proposed an extended latent factor model that jointly models several relations in an academic environment in a unified way, and verified its performance on four recommendation tasks: author-coauthorship, author-paper citation, paper-paper citation, and paper-venue submission. Extensive experiments conducted on large-scale real-world data sets demonstrated the superiority of our proposed models over existing state-of-the-art methods.
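As background for the PageRank-like ranking mentioned above, the following is a minimal sketch of standard PageRank by power iteration on a small citation graph. It shows the unmodified baseline only, not any of the dissertation's three modified variants; the `pagerank` function, the damping factor of 0.85, the iteration count, and the four-paper toy graph are assumptions made for illustration.

```python
def pagerank(cites, damping=0.85, iters=50):
    """Standard PageRank. `cites` maps each paper to the papers it cites."""
    nodes = set(cites) | {t for targets in cites.values() for t in targets}
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iters):
        # Papers that cite nothing spread their rank uniformly over all papers.
        dangling = sum(rank[p] for p in nodes if not cites.get(p))
        new_rank = {p: (1 - damping) / n + damping * dangling / n for p in nodes}
        for src, targets in cites.items():
            for target in targets:
                # Each paper splits its rank evenly among the papers it cites.
                new_rank[target] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


# Hypothetical toy citation graph: A and B cite C, and C cites D.
citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
ranks = pagerank(citations)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph the cited papers C and D end up ranked above A and B, which nobody cites. Ranking authors rather than papers would need an extra aggregation step, for example summing the scores of each author's papers, which is again an assumption and not a method taken from the abstract.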