218,583 research outputs found
I Know Where You are and What You are Sharing: Exploiting P2P Communications to Invade Users' Privacy
In this paper, we show how to exploit real-time communication applications to
determine the IP address of a targeted user. We focus our study on Skype,
although other real-time communication applications may have similar privacy
issues. We first design a scheme that calls an identified targeted user
inconspicuously to find his IP address, which can be done even if he is behind
a NAT. By calling the user periodically, we can then observe the mobility of
the user. We show how to scale the scheme to observe the mobility patterns of
tens of thousands of users. We also consider the linkability threat, in which
the identified user is linked to his Internet usage. We illustrate this threat
by combining Skype and BitTorrent to show that it is possible to determine the
file-sharing usage of identified users. We devise a scheme based on the
identification field of the IP datagrams to verify with high accuracy whether
the identified user is participating in specific torrents. We conclude that any
Internet user can leverage Skype, and potentially other real-time communication
systems, to observe the mobility and file-sharing usage of tens of millions of
identified users.Comment: This is the authors' version of the ACM/USENIX Internet Measurement
Conference (IMC) 2011 pape
"'Who are you?' - Learning person specific classifiers from video"
We investigate the problem of automatically labelling
faces of characters in TV or movie material with their
names, using only weak supervision from automaticallyaligned
subtitle and script text. Our previous work (Everingham
et al. [8]) demonstrated promising results on the
task, but the coverage of the method (proportion of video
labelled) and generalization was limited by a restriction to
frontal faces and nearest neighbour classification.
In this paper we build on that method, extending the coverage
greatly by the detection and recognition of characters
in profile views. In addition, we make the following contributions:
(i) seamless tracking, integration and recognition
of profile and frontal detections, and (ii) a character specific
multiple kernel classifier which is able to learn the features
best able to discriminate between the characters.
We report results on seven episodes of the TV series
âBuffy the Vampire Slayerâ, demonstrating significantly increased
coverage and performance with respect to previous
methods on this material
The bad guys are using it, are you?
From Occupy Wall Street to 2011 England riots to Arab Spring to Mumbai 26/11 to the ethnic cleansing rumors in India and increasingly used by pedophiles, social media is a very powerful tool for pedophiles, troublemakers, criminals and even terrorists to target individuals and even to go against the establishment. On the other hand, social media can save lives in a disaster, and its a natural extension of community policing or engagement. Community engagement is a must-have strategy for any public safety and security agency. However, this strategy requires the removal of stovepipe processes and systems within an agency, allowing better process integration and information sharing, thereby offering improved services to all stakeholders. Improved information sharing within an agency is also a stepping stone towards cross-agency information sharing. Criminal and terrorist organizations do not follow and do not commit crimes and terrorism based on how public safety and security agencies are structured. They commit crimes based on maximum returns, be they financial ideological or to instill fear. This is why information sharing or intelligence fusion is crucial across agencies especially with more illicit use of technologies and social media by the âbad guysâ
These are not the k-mers you are looking for: efficient online k-mer counting using a probabilistic data structure
K-mer abundance analysis is widely used for many purposes in nucleotide
sequence analysis, including data preprocessing for de novo assembly, repeat
detection, and sequencing coverage estimation. We present the khmer software
package for fast and memory efficient online counting of k-mers in sequencing
data sets. Unlike previous methods based on data structures such as hash
tables, suffix arrays, and trie structures, khmer relies entirely on a simple
probabilistic data structure, a Count-Min Sketch. The Count-Min Sketch permits
online updating and retrieval of k-mer counts in memory which is necessary to
support online k-mer analysis algorithms. On sparse data sets this data
structure is considerably more memory efficient than any exact data structure.
In exchange, the use of a Count-Min Sketch introduces a systematic overcount
for k-mers; moreover, only the counts, and not the k-mers, are stored. Here we
analyze the speed, the memory usage, and the miscount rate of khmer for
generating k-mer frequency distributions and retrieving k-mer counts for
individual k-mers. We also compare the performance of khmer to several other
k-mer counting packages, including Tallymer, Jellyfish, BFCounter, DSK, KMC,
Turtle and KAnalyze. Finally, we examine the effectiveness of profiling
sequencing error, k-mer abundance trimming, and digital normalization of reads
in the context of high khmer false positive rates. khmer is implemented in C++
wrapped in a Python interface, offers a tested and robust API, and is freely
available under the BSD license at github.com/ged-lab/khmer
- âŠ