Clustering of twitter technology tweets and the impact of stopwords on clusters

Author: Bhagvat Surya
Publication venue: SJSU ScholarWorks
Publication date: 01/10/2011

2010 could be called the year in which Twitter became fully mainstream. Twitter, which started as a means of communicating with friends, has grown well beyond its beginnings. Companies now use it to promote new products, and the movie industry uses it to promote films. A great deal of advertising and branding is now tied to Twitter and, most importantly, it is the first place people search when news breaks. Whether it was the Mumbai attacks of 2008, the minor earthquakes in the Bay Area in 2010, or the "Twitter revolution" surrounding the Iran elections, most viewers, tech-savvy or not, followed Twitter rather than mainstream news channels. In fact, most breaking news now appears on Twitter rather than in traditional mainstream media, because of Twitter's huge user base. The focus of this paper is clustering the daily technology news tweets of prominent bloggers and news sites with TF-IDF weighting, using Apache Mahout, and evaluating the effect of including and removing stop words on the quality of the clusters. The project is restricted to tweets in English.

SJSU ScholarWorks
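
The TF-IDF weighting and the stop-word comparison described in the abstract above can be illustrated with a minimal sketch. It is plain Python rather than Apache Mahout, it stops at the pairwise similarity a clustering step would consume, and the tweets, the stop-word list, and all function names are hypothetical, invented for illustration only.

```python
import math
from collections import Counter

# Hypothetical stop-word list and toy tweets (not data from the paper).
STOP_WORDS = frozenset({"the", "a", "is", "on", "of", "for", "and", "to", "in"})

TWEETS = [
    "the new phone is on sale",
    "the new tablet is on sale",
    "a review of the search engine update",
]


def tokenize(text, stop_words=frozenset()):
    """Lowercase, split on whitespace, and drop any listed stop words."""
    return [w for w in text.lower().split() if w not in stop_words]


def tfidf(docs):
    """Return one {term: weight} dict per document, weight = tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


with_stops = tfidf([tokenize(t) for t in TWEETS])
without_stops = tfidf([tokenize(t, STOP_WORDS) for t in TWEETS])

print("phone vs tablet, stop words kept:   ", cosine(with_stops[0], with_stops[1]))
print("phone vs tablet, stop words removed:", cosine(without_stops[0], without_stops[1]))
```

In this toy data the shared words "is" and "on" raise the similarity of the first two tweets while stop words are kept, which is the kind of effect on cluster quality the abstract sets out to evaluate. A word that occurs in every tweet, such as "the", already receives zero weight from the IDF term.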

Key exchange with the help of a public ledger

Author: A Shamir, C Gehrmann, DP Jablon, F Hao, M Petraschek, S Laur, S Vaudenay, T Aura, V Boyko, W Diffie
Publication date: 11/08/2017

Blockchains and other public ledger structures promise a new way to create globally consistent event logs and other records. We make use of this consistency property to detect and prevent man-in-the-middle attacks in a key exchange such as Diffie-Hellman or ECDH. Essentially, the MitM attack creates an inconsistency in the world views of the two honest parties, and they can detect it with the help of the ledger. Thus, there is no need for prior knowledge or trusted third parties apart from the distributed ledger. To prevent impersonation attacks, we require user interaction. It appears that, in some applications, the required user interaction is reduced in comparison to other user-assisted key-exchange protocols.

arXiv.org e-Print Archive
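
The idea in the abstract above, that a man-in-the-middle leaves the two honest parties with inconsistent views which a shared ledger can expose, can be sketched roughly as follows. This is not the protocol from the paper: the ledger is a plain Python list, the Diffie-Hellman parameters are toy values, and the step of each party publishing a hash of the public values it saw is an assumption made only for illustration.

```python
import hashlib
import secrets

# Toy Diffie-Hellman parameters (assumed for illustration, not secure).
P = 2**127 - 1
G = 3


def keypair():
    """Return a (private, public) pair for the toy group."""
    priv = secrets.randbelow(P - 3) + 2
    return priv, pow(G, priv, P)


def view_digest(initiator_pub, responder_pub):
    """Hash of the two public values as one party saw them."""
    return hashlib.sha256(f"{initiator_pub}:{responder_pub}".encode()).hexdigest()


def run_exchange(mitm=False):
    """Run one exchange and return (views_consistent, keys_match)."""
    ledger = []  # hypothetical stand-in for an append-only public ledger
    a_priv, a_pub = keypair()
    b_priv, b_pub = keypair()

    if mitm:
        # The attacker replaces both public values in transit with its own.
        _, m_pub = keypair()
        seen_by_a, seen_by_b = m_pub, m_pub
    else:
        seen_by_a, seen_by_b = b_pub, a_pub

    # Each honest party publishes a digest of the exchange as it saw it.
    ledger.append(("alice", view_digest(a_pub, seen_by_a)))
    ledger.append(("bob", view_digest(seen_by_b, b_pub)))

    # Both parties read the ledger and compare the two recorded views.
    views_consistent = ledger[0][1] == ledger[1][1]
    keys_match = pow(seen_by_a, a_priv, P) == pow(seen_by_b, b_priv, P)
    return views_consistent, keys_match


print("honest run:", run_exchange(mitm=False))
print("MitM run:  ", run_exchange(mitm=True))
```

The sketch covers only the inconsistency check. It does not model an attacker who writes its own ledger entries while impersonating a party, which is the case the abstract handles through user interaction.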

Danish Phraseography

Author: Farø Ken
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2007

Copenhagen University Research Information System

(How) is formulaic language universal? Insights from Korean, German and English

Author: Buerki Andreas
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 20/01/2020

The existence of common expressions, also referred to as formulaic language or phraseological units, has been evidenced in a very large number of languages. However, the extent to which languages feature such formulaic material, how formulaicity may be understood across typologically different languages and whether indeed there is a concept of formulaic language that applies across languages, are questions that have been less commonly discussed. Using a novel data set consisting of topically matched corpora in three typologically different languages (Korean, German and English), this study proposes an empirically founded universal concept for formulaic language and discusses what the shape of this concept suggests for the theoretical understanding of formulaic language going forward. In particular, it is argued that the nexus of the concept of formulaic language cannot be fixed at any particular structural level (such as the phrase or the level of polylexicality) and incorporates elements specified at varying levels of abstraction (or schematicity). This means that a cross-linguistic concept of formulaic language fits in well with a constructionist view of linguistic structure.