Third Party Tracking in the Mobile Ecosystem
Third-party tracking allows companies to identify users and track their
behaviour across multiple digital services. This paper presents an empirical
study of the prevalence of third-party trackers on 959,000 apps from the US and
UK Google Play stores. We find that most apps contain third-party tracking, and
the distribution of trackers is long-tailed with several highly dominant
trackers accounting for a large portion of the coverage. The extent of tracking
also differs between categories of apps; in particular, news apps and apps
targeted at children appear to be amongst the worst in terms of the number of
third-party trackers associated with them. Third-party tracking is also
revealed to be a highly trans-national phenomenon, with many trackers operating
in jurisdictions outside the EU. Based on these findings, we draw out some
significant legal compliance challenges facing the tracking industry.
Comment: Corrected missing company info (LinkedIn owned by Microsoft). Figures
for Microsoft and LinkedIn re-calculated and added to Table.
Static Score Bucketing in Inverted Indexes
Maintaining strict static-score order of inverted lists is a heuristic used by
search engines to improve the quality of query results when entire inverted lists
cannot be processed. This heuristic, however, increases the cost of index generation
and requires time-consuming index build algorithms. In this paper, we study a new
index organization based on static score bucketing. We show that this new technique
significantly improves index build performance while having minimal impact on the
quality of search results. We also provide upper bounds on the quality degradation
and verify the benefits of the proposed approach experimentally.
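The bucketing idea the abstract describes can be sketched as follows. This is an illustrative outline only, with the bucket boundaries, function names, and per-list layout all assumed for the example rather than taken from the paper:

```python
# Illustrative sketch of static score bucketing for one inverted list.
# Instead of keeping postings in strict descending static-score order
# (which requires a full sort at build time), postings are grouped into
# a few score buckets; only the bucket order is maintained, so index
# build is cheaper while traversal still visits high-score docs first.

from bisect import bisect_right

def bucket_index(score, boundaries):
    """Map a static score to a bucket id via sorted bucket boundaries."""
    return bisect_right(boundaries, score)

def build_bucketed_list(postings, boundaries):
    """postings: iterable of (doc_id, static_score) pairs.
    Returns a list of buckets, highest-score bucket first."""
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for doc_id, score in postings:
        buckets[bucket_index(score, boundaries)].append(doc_id)
    # Higher scores land in later buckets; reverse so traversal
    # visits high-score buckets first, approximating strict order.
    return [b for b in reversed(buckets) if b]

postings = [(1, 0.2), (2, 0.9), (3, 0.55), (4, 0.85), (5, 0.1)]
bucketed = build_bucketed_list(postings, boundaries=[0.33, 0.66])
# high-score docs 2 and 4 are traversed before docs 3, 1, 5
```

Within a bucket no order is enforced, which is where the build-time saving comes from; the paper's upper bounds concern the quality lost to that within-bucket disorder.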
Attack graph approach to dynamic network vulnerability analysis and countermeasures
A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy.
It is widely accepted that modern computer networks (often presented as a heterogeneous collection of functioning organisations, applications, software, and hardware) contain vulnerabilities. This research proposes a new methodology to compute a dynamic severity cost for each state. Here a state refers to the behaviour of a system during an attack; an example of a state is one in which an attacker could manipulate the information in an application to alter credentials. This is performed by utilising a modified variant of the Common Vulnerability Scoring System (CVSS), referred to as the Dynamic Vulnerability Scoring System (DVSS). This calculates scores for intrinsic, time-based, and ecological metrics by combining related sub-scores and modelling the problem's parameters in a mathematical framework to develop a unique severity cost.
The static nature of individual CVSS scores affects the scoring value, so the author has adapted a novel model to produce a DVSS metric that is more precise and efficient.
In this approach, different parameters are used to compute the final scores determined from a number of parameters including network architecture, device setting, and the impact of vulnerability interactions.
An attack graph (AG) is a security model representing the chains of vulnerability exploits in a network. A number of researchers have acknowledged the visual complexity of attack graphs and the lack of in-depth understanding they provide. Current attack graph tools are constrained to limited attributes or even rely on hand-generated input. The automatic generation of vulnerability information has proved troublesome, and vulnerability descriptions are frequently created by hand or based on limited data. The network architectures and configurations, along with the interactions between the individual vulnerabilities, are considered in the method of computing the cost using the DVSS and a dynamic cost-centric framework.
A new methodology was developed to present an attack graph with a dynamic cost metric based on DVSS, and a novel methodology to estimate and represent the cost-centric approach for each host's states was also devised.
The framework is applied to a test network, using the Nessus scanner to detect known vulnerabilities, and these results are then used to build and represent the dynamic cost-centric attack graph using ranking algorithms (in a manner similar to Mehta et al. 2006 and Kijsanayothin 2010). However, instead of using vulnerabilities for each host, a CostRank Markov Model has been developed utilising a novel cost-centric approach, thereby reducing the complexity of the attack graph and the problem of visibility.
A parallel algorithm is developed to implement CostRank. The reason for developing a parallel CostRank algorithm is to expedite the state-ranking calculations as the number of hosts and/or vulnerabilities increases. In the same way, the author intends to secure large-scale networks that require fast and reliable computing to calculate the ranking of enormous graphs with thousands of vertices (states) and millions of arcs (each representing an action that moves the system from one state to another). In the proposed approach, the focus is on a parallel CostRank computational architecture to appraise the enhancement in CostRank calculations and the scalability of the algorithm. In particular, partitioning the input data, graph files, and ranking vectors with a load-balancing technique can enhance the performance and scalability of parallel CostRank computations.
A practical model of parallel CostRank calculation is undertaken, resulting in a substantial decrease in communication overhead and in iteration time. The results are presented analytically in terms of scalability, efficiency, memory usage, speed-up, and input/output rates.
Finally, a countermeasures model is developed to protect against network attacks by using a Dynamic Countermeasures Attack Tree (DCAT). The following scheme is used to build the DCAT: (i) use the scalable parallel CostRank algorithm to determine the critical assets that system administrators need to protect; (ii) use the Nessus scanner to determine the vulnerabilities associated with each asset, applying the dynamic cost-centric framework and DVSS; (iii) check all published mitigations for those vulnerabilities; (iv) assess how well each security solution mitigates the risks; (v) assess the DCAT algorithm in terms of effective security cost, probability, and cost/benefit analysis to reduce the total impact of a specific vulnerability.
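The abstract does not give the CostRank update rule, so the following is only a plausible analogue rather than the thesis's algorithm: a cost-weighted, PageRank-style power iteration over the state graph, where a state's DVSS-style severity cost biases the teleport vector. All names, values, and parameters are assumed for illustration:

```python
# Hypothetical sketch: rank attack-graph states by a PageRank-style
# power iteration whose teleport vector is weighted by each state's
# severity cost, so costly, highly reachable states rank highest.
# This is an assumed analogue of CostRank, not the thesis's method.

def cost_rank(adj, costs, damping=0.85, iters=50):
    """adj: {state: [successor states]}; costs: {state: severity cost}.
    Returns a ranking score per state; high scores flag critical states."""
    states = list(adj)
    n = len(states)
    total_cost = sum(costs.values())
    # Teleport probability proportional to each state's severity cost.
    teleport = {s: costs[s] / total_cost for s in states}
    rank = {s: 1.0 / n for s in states}
    for _ in range(iters):
        new = {s: (1 - damping) * teleport[s] for s in states}
        for s in states:
            succ = adj[s] or states        # dangling state: spread everywhere
            share = damping * rank[s] / len(succ)
            for t in succ:
                new[t] += share
        rank = new
    return rank

# Toy attack graph: s0 -> s1 -> s2, s0 -> s2; s2 is the goal state.
adj = {"s0": ["s1", "s2"], "s1": ["s2"], "s2": []}
costs = {"s0": 2.0, "s1": 5.0, "s2": 9.0}
ranks = cost_rank(adj, costs)
# s2 (highest cost, most reachable) ranks highest
```

The partitioning the abstract describes would split the rank vector and the graph's edge lists across workers, with each iteration exchanging only boundary rank mass; the sequential sketch above is just the per-iteration arithmetic.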
LambdaFM: Learning Optimal Ranking with Factorization Machines Using Lambda Surrogates
State-of-the-art item recommendation algorithms, which apply
Factorization Machines (FM) as a scoring function and
pairwise ranking loss as a trainer (PRFM for short), have
been recently investigated for the implicit feedback based
context-aware recommendation problem (IFCAR). However,
good recommenders particularly emphasize accuracy
near the top of the ranked list, and typical pairwise loss functions
may not match this requirement well. In this
paper, we demonstrate, both theoretically and empirically,
that PRFM models usually lead to suboptimal item recommendation
results due to this mismatch. Inspired by the success
of LambdaRank, we introduce Lambda Factorization
Machines (LambdaFM), which is particularly intended for
optimizing ranking performance for IFCAR. We also point
out that the original lambda function suffers from expensive
computational complexity in such settings due
to a large amount of unobserved feedback. Hence, instead
of directly adopting the original lambda strategy, we create
three effective lambda surrogates by conducting a theoretical
analysis for lambda from the top-N optimization perspective.
Further, we prove that the proposed lambda surrogates
are generic and applicable to a large set of pairwise
ranking loss functions. Experimental results demonstrate that
LambdaFM significantly outperforms state-of-the-art algorithms
on three real-world datasets in terms of four standard
ranking measures.
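The core idea of weighting pairwise updates by a rank-sensitive lambda can be illustrated with a toy sketch. Note the scorer below is a plain dot product rather than a Factorization Machine, and the position discount is a generic DCG-style weight assumed for illustration, not one of the paper's actual surrogates:

```python
import math

# Toy sketch of a lambda-weighted pairwise (BPR-style) update: the
# gradient for a (positive, negative) pair is scaled by a weight that
# grows as the positive item's current rank nears the top of the list,
# mimicking the top-N emphasis the LambdaFM abstract describes.

def pairwise_lambda_update(w, x_pos, x_neg, rank_of_pos, lr=0.05):
    """One SGD step on a pairwise loss, scaled by a lambda weight."""
    s = sum(wi * (p - n) for wi, p, n in zip(w, x_pos, x_neg))
    sigmoid = 1.0 / (1.0 + math.exp(-s))
    lam = 1.0 / math.log2(rank_of_pos + 2)   # DCG-style position discount
    grad_scale = lr * lam * (1.0 - sigmoid)  # push positive above negative
    return [wi + grad_scale * (p - n) for wi, p, n in zip(w, x_pos, x_neg)]

w = [0.0, 0.0]
x_pos, x_neg = [1.0, 0.0], [0.0, 1.0]   # one-hot toy features
for _ in range(100):
    w = pairwise_lambda_update(w, x_pos, x_neg, rank_of_pos=0)
# after training, the positive item scores higher than the negative one
```

The surrogates the paper proposes replace this explicit rank-dependent weight, because computing the true rank of each positive item over all unobserved items is exactly the expensive step the abstract points out.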
Survey over Existing Query and Transformation Languages
A widely acknowledged obstacle for realizing the vision of the Semantic Web is the inability
of many current Semantic Web approaches to cope with data available in such diverging
representation formalisms as XML, RDF, or Topic Maps. A common query language is the first
step to allow transparent access to data in any of these formats. To further the understanding
of the requirements and approaches proposed for query languages in the conventional as well
as the Semantic Web, this report surveys a large number of query languages for accessing
XML, RDF, or Topic Maps. This is the first systematic survey to consider query languages from
all these areas. From the detailed survey of these query languages, a common classification
scheme is derived that is useful for understanding and differentiating languages within and
among all three areas.
SparkIR: a Scalable Distributed Information Retrieval Engine over Spark
Search engines have to deal with a huge amount of data (e.g., billions of
documents in the case of the Web) and find scalable and efficient ways to produce
effective search results. In this thesis, we propose to use the Spark framework, an
in-memory distributed big data processing framework, and leverage its powerful
capabilities for handling large amounts of data to build an efficient and scalable
experimental search engine over textual documents. The proposed system, SparkIR,
can serve as a research framework for conducting information retrieval (IR)
experiments. SparkIR supports two indexing schemes, document-based partitioning
and term-based partitioning, to adopt document-at-a-time (DAAT) and term-at-a-time
(TAAT) query evaluation methods. Moreover, it offers static and dynamic pruning to
improve retrieval efficiency. For static pruning, it employs champion lists and
tiering, while for dynamic pruning, it uses MaxScore top-k retrieval. We evaluated the
performance of SparkIR using ClueWeb12-B13 collection that contains about 50M
English Web pages. Experiments over different subsets of the collection, compared
against an Elasticsearch baseline, show that SparkIR exhibits reasonable efficiency
and scalability overall for both indexing and retrieval. Since SparkIR is implemented
as an open-source library over Spark, its users can also benefit from other Spark
libraries (e.g., MLlib and GraphX), which eliminates the need of usin
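Of the pruning techniques the abstract names, champion lists are simple enough to sketch. The structures and names below are a generic illustration of the technique, not SparkIR's actual code:

```python
# Generic sketch of champion-list static pruning: for each term, only
# the r postings with the highest per-term scores are retained at
# index time, so query evaluation touches far fewer postings at the
# risk of missing documents outside every query term's champion list.

def build_champion_lists(index, r):
    """index: {term: [(doc_id, score), ...]}.
    Keep only each term's top-r scoring postings."""
    return {
        term: sorted(postings, key=lambda p: p[1], reverse=True)[:r]
        for term, postings in index.items()
    }

def query(champions, terms):
    """Score docs by summing per-term scores over champion lists only."""
    scores = {}
    for t in terms:
        for doc_id, s in champions.get(t, []):
            scores[doc_id] = scores.get(doc_id, 0.0) + s
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

index = {
    "spark": [(1, 0.9), (2, 0.4), (3, 0.7), (4, 0.1)],
    "index": [(2, 0.8), (3, 0.3), (4, 0.6)],
}
champions = build_champion_lists(index, r=2)
results = query(champions, ["spark", "index"])
```

Tiering generalizes this by keeping the pruned-away postings in lower tiers that are consulted only when the top tier yields too few results; MaxScore, by contrast, is a dynamic technique that skips postings at query time using per-term score upper bounds.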
Dublin City University video track experiments for TREC 2003
In this paper, we describe our experiments for both the News Story Segmentation task and Interactive Search task for
TRECVID 2003. Our News Story Segmentation task involved the use of a Support Vector Machine (SVM) to combine evidence from audio-visual analysis tools in order to generate a listing of news stories from a given news programme. Our
Search task experiment compared a video retrieval system based on text, image and relevance feedback with a text-only
video retrieval system in order to identify which was more effective. In order to do so we developed two variations of our Físchlár video retrieval system and conducted user testing in a controlled lab environment. In this paper we outline our work on both of these tasks.