4,626 research outputs found
The Best Answers? Think Twice: Online Detection of Commercial Campaigns in the CQA Forums
In an emerging trend, more and more Internet users search for information
from Community Question and Answer (CQA) websites, as interactive communication
in such websites provides users with a rare feeling of trust. More often than
not, end users look for instant help when they browse the CQA websites for the
best answers. Hence, it is imperative that they should be warned of any
potential commercial campaigns hidden behind the answers. However, existing
research focuses more on the quality of answers and does not meet the above
need. In this paper, we develop a system that automatically analyzes the hidden
patterns of commercial spam and raises alarms instantaneously to end users
whenever a potential commercial campaign is detected. Our detection method
integrates semantic analysis and posters' track records and utilizes the
special features of CQA websites largely different from those in other types of
forums such as microblogs or news reports. Our system is adaptive and
accommodates new evidence uncovered by the detection algorithms over time.
Validated with real-world trace data from a popular Chinese CQA website over a
period of three months, our system shows great potential towards adaptive
online detection of CQA spams.Comment: 9 pages, 10 figure
Investigating the Impact of the Blogsphere: Using PageRank to Determine the Distribution of Attention
Much has been written in recent years about the blogosphere and its impact on political, educational and scientific debates. Lately the issue has received significant attention from the industry. As the blogosphere continues to grow, even doubling its size every six months, this paper investigates its apparent impact on the overall Web itself. We use the popular Google PageRank algorithm which employs a model of Web used to measure the distribution of user attention across sites in the blogosphere. The paper is based on an analysis of the PageRank distribution for 8.8 million blogs in 2005 and 2006. This paper addresses the following key questions: How is PageRank distributed across the blogosphere? Does it indicate the existence of measurable, visible effects of blogs on the overall mediasphere? Can we compare the distribution of attention to blogs as characterised by the PageRank with the situation for other forms of Web content? Has there been a growth in the impact of the blogosphere on the Web over the two years analysed here? Finally, it will also be necessary to examine the limitations of a PageRank-centred approach
SportsAnno: what do you think?
The automatic summarisation of sports video is of growing importance with the increased availability of on-demand content. Consumers who are unable to view events live often have a desire to watch a summary which allows then to quickly come to terms with all that has happened during a sporting event. Sports forums show that it is not only summaries that are desirable but also the opportunity to share one’s own point of view and discuss the opinions with a community of similar users. In this paper we give an overview of the ways in which annotations have been used to augment existing visual media. We present SportsAnno, a system developed to summarise World Cup 2006 matches and provide a means for open discussion of events within
these matches
How Google Works; Are Search Engines Really Dumb and Should Educators Care?
Google is like air; your students use Google every day but do they know how it works and ranks results? This “real life literacy session” will explain the factors that determine a webpage’s Google ranking and those factors that students can control to return the best results
- …