Coauthor prediction for junior researchers
Research collaboration can bring in different perspectives and generate more productive results. However, finding an appropriate collaborator can be difficult due to a lack of sufficient information. Link prediction is a related technique for collaborator discovery, but its focus has been mostly on core authors who have relatively more publications. We argue that junior researchers actually need more help in finding collaborators. Thus, in this paper, we focus on coauthor prediction for junior researchers. Most previous work on coauthor prediction considered global network features and local network features separately, or tried to combine local network features and content features. We found, however, a significant improvement by simply combining local and global network features. We further developed a regularization-based approach to incorporate multiple features simultaneously. Experimental results demonstrated that this approach outperformed the simple linear combination of multiple features. We further showed that content features, which have proved useful in link prediction, can be easily integrated into our regularization approach. © 2013 Springer-Verlag
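As a rough illustration of the "simple linear combination" of a local and a global network feature mentioned in the abstract above, the sketch below scores candidate coauthors on a toy graph. The abstract does not name the features used, so everything here is an assumption: common neighbors stands in for the local feature, a truncated Katz-style walk count stands in for the global feature, and the names `common_neighbors`, `truncated_katz`, `combined_score`, `rank_candidates`, and the parameters `alpha`, `beta`, and `max_len` are hypothetical. The paper's regularization-based approach is not reproduced.

```python
def common_neighbors(adj, u, v):
    """Assumed local feature: number of coauthors shared by u and v."""
    return len(adj[u] & adj[v])


def truncated_katz(adj, u, v, beta=0.1, max_len=3):
    """Assumed global feature: walks from u to v of length 1..max_len,
    each weighted by beta ** length (a truncated Katz-style index)."""
    counts = {u: 1}  # counts[w] = number of walks of the current length from u to w
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for node, c in counts.items():
            for nb in adj[node]:
                nxt[nb] = nxt.get(nb, 0) + c
        counts = nxt
        score += (beta ** length) * counts.get(v, 0)
    return score


def combined_score(adj, u, v, alpha=0.5):
    """Linear combination of the two features; alpha is an illustrative weight."""
    return alpha * common_neighbors(adj, u, v) + (1 - alpha) * truncated_katz(adj, u, v)


def rank_candidates(adj, u, alpha=0.5):
    """Rank authors who are not yet coauthors of u, best candidate first."""
    candidates = [v for v in adj if v != u and v not in adj[u]]
    return sorted(candidates, key=lambda v: combined_score(adj, u, v, alpha), reverse=True)


# Toy coauthorship graph (made-up data): author -> set of coauthors.
toy = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
ranking = rank_candidates(toy, "a")
```

On this toy graph, author `d` shares two coauthors with `a` and so ranks ahead of `e`, who is reachable only through longer walks.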
The OU Linked Open Data: production and consumption
The aim of this paper is to introduce the current efforts toward the release and exploitation of The Open University's (OU) Linked Open Data (LOD). We introduce the work that has been done within the LUCERO project to select, extract and structure subsets of information contained within the OU data sources, and to migrate and expose this information as part of the LOD cloud. To show the potential of such exposure, we also introduce three prototypes that exploit this new educational resource: (1) the OU expert search system, a tool focused on finding the best experts for a certain topic within the OU staff; (2) the Buddy Study system, a tool that relies on Facebook information to identify common interests among friends and recommend potential courses within the OU that 'buddies' can study together; and (3) Linked OpenLearn, an application that enables exploring courses, podcasts and tags linked to OpenLearn units. Its aim is to enhance the browsing experience for students by detecting relevant educational resources on the fly while they read an OpenLearn unit.
Comprehensive Review of Opinion Summarization
The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. Researchers have introduced various techniques and paradigms to solve this task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of their key weaknesses. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that remain to be addressed, as this will help set the trend for future research in this area. Unpublished; not peer reviewed.
Broad expertise retrieval in sparse data environments
Expertise retrieval has been largely unexplored on data other than the W3C collection. At the same time, many intranets of universities and other knowledge-intensive organisations offer examples of relatively small but clean multilingual expertise data, covering broad ranges of expertise areas. We first present two main expertise retrieval tasks, along with a set of baseline approaches based on generative language modeling, aimed at finding expertise relations between topics and people. For our experimental evaluation, we introduce (and release) a new test set based on a crawl of a university site. Using this test set, we conduct two series of experiments. The first is aimed at determining the effectiveness of baseline expertise retrieval methods applied to the new test set. The second is aimed at assessing refined models that exploit characteristic features of the new test set, such as the organizational structure of the university and the hierarchical structure of the topics in the test set. Expertise retrieval models are shown to be robust with respect to environments smaller than the W3C collection, and current techniques appear to be generalizable to other settings.
Collaborative Summarization of Topic-Related Videos
Large collections of videos are grouped into clusters by a topic keyword,
such as Eiffel Tower or Surfing, with many important visual concepts repeating
across them. Such a topically close set of videos has mutual influence on each
other, which could be used to summarize one of them by exploiting information
from others in the set. We build on this intuition to develop a novel approach
to extract a summary that captures both the important particularities of the
given video and the generalities identified from the set of videos. The
topic-related videos provide visual context to
identify the important parts of the video being summarized. We achieve this by
developing a collaborative sparse optimization method which can be efficiently
solved by a half-quadratic minimization algorithm. Our work builds upon the
idea of collaborative techniques from information retrieval and natural
language processing, which typically use the attributes of other similar
objects to predict the attribute of a given object. Experiments on two
challenging and diverse datasets demonstrate the efficacy of our approach
over state-of-the-art methods. Comment: CVPR 201
Search Rank Fraud De-Anonymization in Online Systems
We introduce the fraud de-anonymization problem, which goes beyond fraud
detection, to unmask the human masterminds responsible for posting search rank
fraud in online systems. We collect and study search rank fraud data from
Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters
recruited from 6 crowdsourcing sites. We propose Dolos, a fraud
de-anonymization system that leverages traits and behaviors extracted from
these studies, to attribute detected fraud to crowdsourcing site fraudsters,
thus to real identities and bank accounts. We introduce MCDense, a min-cut
dense component detection algorithm to uncover groups of user accounts
controlled by different fraudsters, and leverage stylometry and deep learning
to attribute them to crowdsourcing site profiles. Dolos correctly identified
the owners of 95% of fraudster-controlled communities, and uncovered fraudsters
who promoted as many as 97.5% of fraud apps we collected from Google Play. When
evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6
months, Dolos identified 1,056 apps with suspicious reviewer groups. We report
orthogonal evidence of their fraud, including fraud duplicates and fraud
re-posts. Comment: The 29th ACM Conference on Hypertext and Social Media, July 201