Clustering of twitter technology tweets and the impact of stopwords on clusters
2010 could be called the year in which Twitter became completely mainstream. Twitter, which started as a means of communicating with friends, has become much more than that. Companies now use it to promote new products, and the movie industry uses it to promote movies. A great deal of advertising and branding is now tied to Twitter and, most importantly, Twitter is the first place people search when news breaks. Whether it was the Mumbai attacks in 2008, the minor earthquakes in the Bay Area in 2010, or the "Twitter revolution" surrounding the Iran elections, most viewers, tech-savvy or not, followed Twitter rather than mainstream news channels. In fact, most breaking news now appears on Twitter first, owing to its huge user base, rather than in traditional mainstream media. The focus of this paper is clustering daily technology news tweets from prominent bloggers and news sites, using TF-IDF weighting and Apache Mahout, and evaluating how including or removing stop words affects clustering quality. The project is restricted to tweets in the English language.
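As a rough illustration of why stop words matter for TF-IDF-based clustering, the sketch below computes TF-IDF vectors and cosine similarity for three made-up tweets, with and without a stop-word filter. It is not the paper's pipeline: it uses plain Python rather than Apache Mahout, and the tweets, the stop-word list, and all function names are hypothetical. It also stops short of clustering; an algorithm such as k-means would operate on vectors like these.

```python
import math
from collections import Counter

# Hypothetical mini-corpus and stop-word list, for illustration only.
TWEETS = [
    "the new phone is out and the reviews are in",
    "the new tablet is out and the price is high",
    "a security flaw in the browser is being patched",
]
STOP_WORDS = frozenset({"the", "is", "a", "and", "in", "are", "out", "being"})


def tokenize(text, stop_words=frozenset()):
    """Lowercase, split on whitespace, and drop any stop words."""
    return [w for w in text.lower().split() if w not in stop_words]


def tfidf_vectors(docs, stop_words=frozenset()):
    """Return one sparse {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [tokenize(d, stop_words) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append({
            term: (count / len(toks)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


with_stop = tfidf_vectors(TWEETS)
without_stop = tfidf_vectors(TWEETS, STOP_WORDS)

# Tweets 0 and 2 are about different topics but share the word "in".
print("phone vs. browser, stop words kept:   ", cosine(with_stop[0], with_stop[2]))
print("phone vs. browser, stop words removed:", cosine(without_stop[0], without_stop[2]))
```

In this toy corpus, words that occur in every tweet (such as "the") already receive a TF-IDF weight of zero, but a stop word shared by only some tweets ("in") still creates a small similarity between unrelated tweets. Filtering the stop words removes that overlap.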
Mining Images in Biomedical Publications: Detection and Analysis of Gel Diagrams
Authors of biomedical publications use gel images to report experimental
results such as protein-protein interactions or protein expressions under
different conditions. Gel images offer a concise way to communicate such
findings, not all of which need to be explicitly discussed in the article text.
This fact together with the abundance of gel images and their shared common
patterns makes them prime candidates for automated image mining and parsing. We
introduce an approach for the detection of gel images, and present a workflow
to analyze them. We are able to detect gel segments and panels with high
accuracy, and present preliminary results for the identification of gene names
in these images. While we cannot provide a complete solution at this point, we
present evidence that this kind of image mining is feasible. Comment: arXiv admin note: substantial text overlap with arXiv:1209.148