adPerf: Characterizing the Performance of Third-party Ads
Monetizing websites and web apps through online advertising is widespread in
the web ecosystem. Today's online advertising ecosystem forces publishers to
integrate ads from third-party domains. On the one hand, this raises several
privacy and security concerns that have been actively studied in recent years.
On the other hand, given the ability of today's browsers to load dynamic web
pages with complex animations and JavaScript, online advertising has also
transformed and can have a significant impact on webpage performance. The
performance cost of online ads is critical because it ultimately affects user
satisfaction as well as users' Internet bills and device energy consumption.
In this paper, we present an in-depth, first-of-its-kind performance
evaluation of web ads. Unlike prior efforts that rely primarily on ad blockers,
we perform a fine-grained analysis of the web browser's page loading process to
demystify the performance cost of web ads. We aim to characterize the cost
incurred by every component of an ad, so that publishers, ad syndicates, and
advertisers can improve ad performance with detailed guidance. For this
purpose, we develop an infrastructure, adPerf, for the Chrome browser that
classifies page loading workloads into ad-related and main-content at the
granularity of browser activities (such as JavaScript and Layout). Our
evaluations show that online advertising accounts for more than 15% of the
browser's page loading workload, and approximately 88% of that is spent on
JavaScript. We also track the sources and delivery chains of web ads and
analyze performance by the origin of the ad contents. We observe that two
well-known third-party ad domains contribute 35% of the ads' performance cost
and, surprisingly, that top news websites implicitly include unknown
third-party ads which in some cases account for more than 37% of the ads'
performance cost.
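As a rough illustration of the kind of attribution adPerf performs, the sketch below splits per-activity wall time in a Chrome trace into ad and main-content buckets. It assumes a trace captured in Chrome's standard Trace Event Format; the args.data.url field lookup and the two-domain blocklist are illustrative assumptions, not adPerf's actual implementation.

```python
import json
from collections import defaultdict
from urllib.parse import urlparse

# Illustrative blocklist; a real analysis would use a full filter list.
AD_DOMAINS = {"doubleclick.net", "googlesyndication.com"}

def is_ad_url(url: str) -> bool:
    """Naive suffix match of a URL's host against the ad-domain blocklist."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in AD_DOMAINS)

def attribute_trace(path: str) -> dict:
    """Split per-activity time (JavaScript, Layout, ...) into ad vs. main content."""
    totals = defaultdict(lambda: defaultdict(float))
    with open(path) as f:
        events = json.load(f)["traceEvents"]
    for ev in events:
        if ev.get("ph") != "X":  # only "complete" events carry a duration
            continue
        data = ev.get("args", {}).get("data", {})
        url = data.get("url", "") if isinstance(data, dict) else ""
        bucket = "ad" if url and is_ad_url(url) else "main"
        totals[ev["name"]][bucket] += ev.get("dur", 0) / 1000.0  # us -> ms
    return {name: dict(buckets) for name, buckets in totals.items()}

# Usage: print(attribute_trace("trace.json").get("EvaluateScript"))
```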
Righting Web Development
The web browser is the most important application runtime today, encompassing all types of applications on practically every Internet-connected device. Browsers power complete office suites, media players, games, and augmented and virtual reality experiences, and they integrate with cameras, microphones, GPSes, and other sensors available on computing devices. Many apparently native mobile and desktop applications are secretly hybrid apps that contain a mix of native and browser code. History has shown that when new devices, sensors, and experiences appear on the market, the browser will evolve to support them.
Despite the browser's importance, developing web applications is exceedingly difficult. Web browsers organically evolved from a document viewer into a ubiquitous program runtime. The browser's scripting language for web designers, JavaScript, has grown into the only universally supported programming language in the browser. Unfortunately, JavaScript is notoriously difficult to write and debug. The browser's high-level and event-driven I/O interfaces make it easy to add simple interactions to webpages, but these same interfaces lead to nondeterministic bugs and performance issues in larger applications. These bugs are challenging for developers to reason about and fix.
This dissertation revisits web development and provides developers with a complete set of development tools with full support for the browser environment. McFly is the first time-traveling debugger for the browser, and lets developers debug web applications and their visual state during time-travel; components of this work shipped in Microsoft's ChakraCore JavaScript engine. BLeak is the first system for automatically debugging memory leaks in web applications, and provides developers with a ranked list of memory leaks along with the source code responsible for them. BCause constructs a causal graph of a web application's events, which helps developers understand their code's behavior. Doppio lets developers run code written in conventional languages in the browser, and Browsix brings Unix into the browser to enable unmodified programs expecting a Unix-like environment to run directly in the browser. Together, these five systems form a solid foundation for web development.
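BLeak's published approach rests on a growth heuristic: heap state reachable via the same path that grows on every repetition of a user interaction loop is a likely leak. Below is a highly simplified sketch of that heuristic, assuming snapshots have already been flattened into path-to-retained-size maps; the snapshot format and function names are illustrative, not BLeak's actual API.

```python
def growing_paths(snapshots: list[dict[str, int]]) -> list[tuple[str, int]]:
    """Rank heap paths whose retained size grew between every consecutive
    pair of snapshots; snapshots are taken after each run of a user-supplied
    interaction loop, so steady growth suggests a leak."""
    candidates = set(snapshots[0])
    for before, after in zip(snapshots, snapshots[1:]):
        candidates = {p for p in candidates
                      if p in after and after[p] > before.get(p, 0)}
    growth = {p: snapshots[-1][p] - snapshots[0][p] for p in candidates}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)

# Three illustrative snapshots: window.cache grows every round, window.ui is stable.
snaps = [{"window.cache": 10, "window.ui": 5},
         {"window.cache": 20, "window.ui": 5},
         {"window.cache": 31, "window.ui": 5}]
print(growing_paths(snaps))  # [('window.cache', 21)]
```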
Scaling Causality Analysis for Production Systems.
Causality analysis reveals how program values influence each other.
It is important for debugging, optimizing, and understanding the execution of
programs. This thesis scales causality analysis to production systems
consisting of desktop and server applications as well as large-scale Internet
services. This enables developers to employ causality analysis to debug and
optimize complex, modern software systems. This thesis shows that it is
possible to scale causality analysis to both fine-grained instruction level
analysis and analysis of Internet scale distributed systems with thousands of
discrete software components by developing and employing automated methods to
observe and reason about causality.
First, we observe causality at a fine-grained instruction level by developing
the first taint tracking framework to support tracking millions of input
sources. We also introduce flexible taint tracking to allow
for scoping different queries and dynamic filtering of inputs, outputs, and
relationships.
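A minimal sketch of multi-source taint propagation along these lines: each value carries the set of input-source IDs that influenced it, and queries filter that set dynamically. The class and encoding below are illustrative; systems at this scale use compact bitset or interned-label representations to reach millions of sources.

```python
class Tainted:
    """A value paired with the set of input-source IDs that influenced it.
    A plain frozenset is used here for clarity; real frameworks encode these
    sets compactly to track millions of distinct sources."""
    def __init__(self, value, sources=frozenset()):
        self.value = value
        self.sources = frozenset(sources)

    def __add__(self, other):
        # Propagation rule: the result depends on both operands' sources.
        return Tainted(self.value + other.value, self.sources | other.sources)

def query(result: Tainted, scope: set) -> frozenset:
    """Dynamic filtering: report only the sources relevant to this query."""
    return result.sources & frozenset(scope)

# Two inputs tagged with distinct source IDs, combined by ordinary arithmetic:
a = Tainted(3, {1001})
b = Tainted(4, {1002})
c = a + b                      # value 7, sources {1001, 1002}
print(query(c, {1002, 9999}))  # frozenset({1002})
```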
Next, we introduce the Mystery Machine, which uses a "big data" approach to
discover causal relationships between software components in a large-scale
Internet service. We leverage the fact that large-scale Internet services
receive a large number of requests in order to observe counterexamples to
hypothesized causal relationships. Using discovered causal relationships, we
identify the critical path for request execution and use critical path
analysis to explore potential scheduling optimizations.
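The hypothesize-and-refute step lends itself to a short sketch: hypothesize a happens-before edge for every ordered pair of events, then discard any edge that some observed trace contradicts. The trace format below is an illustrative assumption, not the Mystery Machine's actual log schema.

```python
from itertools import permutations

def infer_causality(traces: list[list[tuple[str, int]]]) -> set[tuple[str, str]]:
    """Start with every pairwise 'A happens before B' hypothesis, then drop
    any hypothesis for which some trace is a counterexample (B seen at or
    before A). Each trace is a list of (event, timestamp) pairs."""
    events = {e for trace in traces for e, _ in trace}
    hypotheses = set(permutations(events, 2))      # all ordered pairs
    for trace in traces:
        seen_at = {e: t for e, t in trace}
        for a, b in list(hypotheses):
            if a in seen_at and b in seen_at and seen_at[a] >= seen_at[b]:
                hypotheses.discard((a, b))         # counterexample observed
    return hypotheses

# Many requests give many chances to refute spurious orderings:
traces = [[("dns", 0), ("tcp", 5), ("js", 9)],
          [("dns", 0), ("js", 4), ("tcp", 8)]]    # js/tcp order varies
print(infer_causality(traces))  # {('dns', 'tcp'), ('dns', 'js')}
```

The surviving edges form a dependency graph, and a longest-path computation over that graph with per-segment latencies yields the critical path for a request.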
Finally, we explore using causality to make data-quality tradeoffs in
Internet services. A data-quality tradeoff is an explicit decision by a software
component to return lower-fidelity data in order to improve response time or
minimize resource usage. We perform a study of data-quality tradeoffs in a
large-scale Internet service to show the pervasiveness of these
tradeoffs. We develop DQBarge, a system that enables better data-quality
tradeoffs by propagating critical information along the causal path of request
processing. Our evaluation shows that DQBarge helps Internet services mitigate
load spikes, improve utilization of spare resources, and implement dynamic
capacity planning.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/135888/1/mcchow_1.pd
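A minimal sketch of the DQBarge idea from the paragraph above: critical information rides along the causal path of a request, letting a downstream component make an explicit data-quality tradeoff. The latency budget, cost estimates, and provenance log here are illustrative stand-ins, not DQBarge's actual interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class RequestContext:
    """Information propagated along the causal path of a request: a latency
    budget plus provenance of any tradeoffs already made upstream."""
    deadline_ms: float
    elapsed_ms: float = 0.0
    degraded_steps: list = field(default_factory=list)

def fetch_recommendations(ctx: RequestContext) -> dict:
    """Explicit data-quality tradeoff: return lower-fidelity data when the
    remaining latency budget cannot cover the full-fidelity computation."""
    FULL_COST_MS, CHEAP_COST_MS = 40.0, 5.0   # illustrative cost estimates
    remaining = ctx.deadline_ms - ctx.elapsed_ms
    if remaining < FULL_COST_MS:
        ctx.degraded_steps.append("recommendations")   # record provenance
        return {"items": ["cached-top-10"], "fidelity": "low"}
    return {"items": ["personalized"], "fidelity": "full"}

# Under a load spike the remaining budget shrinks, so the component degrades:
ctx = RequestContext(deadline_ms=50.0, elapsed_ms=20.0)
print(fetch_recommendations(ctx), ctx.degraded_steps)
```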
MT-WAVE: Profiling multi-tier web applications
The web is evolving: what was once primarily used for sharing static content has now evolved into a platform
for rich client-side applications. These applications do not run exclusively on the client; while the client is
responsible for presentation and some processing, there is a significant amount of processing and persistence
that happens server-side. This has advantages and disadvantages. The biggest advantage is that the user’s
data is accessible from anywhere. It doesn’t matter which device you sign into a web application from;
everything you’ve been working on is instantly accessible. The largest disadvantage is that large numbers
of servers are required to support a growing user base; unlike traditional client applications, an organization
making a web application needs to provision compute and storage resources for each expected user. This
infrastructure is designed in tiers that are responsible for different aspects of the application, and these tiers
may not even be run by the same organization.
As these systems grow in complexity, it becomes progressively more challenging to identify and solve
performance problems. While there are many measures of software system performance, web application
users only care about response latency. This “fingertip-to-eyeball performance” is the only metric that users
directly perceive: when a button is clicked in a web application, how long does it take for the desired action
to complete?
MT-WAVE is a system for solving fingertip-to-eyeball performance problems in web applications. The
system is designed for multi-tier tracing: each piece of the application is instrumented, execution
traces are collected, and the system merges these traces into a single coherent snapshot of system latency
at every tier. To ensure that user-perceived latency is accurately captured, the tracing begins in the web
browser. The application developer then uses the MT-WAVE Visualization System to explore the execution
traces to first identify which system is causing the largest amount of latency, and then zooms in on the
specific function calls in that tier to find optimization candidates. After fixing an identified problem, the
system is used to verify that the changes had the intended effect.
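A minimal sketch of the merge step, assuming each tier emits spans tagged with a shared request ID; the span fields below are illustrative assumptions, not MT-WAVE's actual trace format.

```python
from collections import defaultdict

def merge_traces(spans: list[dict]) -> dict[str, list[dict]]:
    """Group spans from every tier by request ID and order them by start
    time, yielding one coherent latency timeline per user interaction."""
    by_request = defaultdict(list)
    for span in spans:
        by_request[span["request_id"]].append(span)
    for trace in by_request.values():
        trace.sort(key=lambda s: s["start_ms"])
    return dict(by_request)

# Spans from browser, app server, and database for one button click. The
# browser span approximates fingertip-to-eyeball latency; the gap between it
# and the server-side spans points at network or queuing time.
spans = [
    {"request_id": "r1", "tier": "browser",  "start_ms": 0,  "dur_ms": 120},
    {"request_id": "r1", "tier": "app",      "start_ms": 15, "dur_ms": 70},
    {"request_id": "r1", "tier": "database", "start_ms": 30, "dur_ms": 25},
]
for span in merge_traces(spans)["r1"]:
    print(f'{span["tier"]:>8}: {span["dur_ms"]} ms')
```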
This optimization methodology and toolset is explained through a series of case studies that identify and
solve performance problems in open-source and commercial applications. These case studies demonstrate
both the utility of the MT-WAVE system and the unintuitive nature of system optimization
Improving the robustness and privacy of HTTP cookie-based tracking systems within an affiliate marketing context : a thesis presented in fulfilment of the requirements for the degree of Doctor of Philosophy at Massey University, Albany, New Zealand
E-commerce activities provide a global reach for enterprises large and small. Third parties generate visitor traffic for a fee, through affiliate marketing, search engine marketing, keyword bidding, and organic search, amongst others. Therefore, improving the robustness of the underlying tracking and state management techniques is a vital requirement for the growth and stability of e-commerce. In an inherently stateless ecosystem such as the Internet, HTTP cookies have been the de facto tracking vector for decades. In a previous study, the thesis author exposed circumstances under which cookie-based tracking systems can fail, some due to technical glitches, others due to manipulations made for monetary gain by fraudulent actors.
Following a design science research paradigm, this research explores alternative tracking vectors discussed in previous research studies within a cross-domain tracking environment. It evaluates their efficacy in the current context and demonstrates how to use them to improve the robustness of existing tracking techniques. Research outputs include methods, instantiations, and a privacy model artefact based on the information-seeking behaviour of different categories of tracking software and their resulting privacy intrusion levels. This privacy model provides clarity and is useful for practitioners and regulators in creating regulatory frameworks that do not hinder technological advancement but rather curtail privacy-intrusive tracking practices on the Internet. The method artefacts are instantiated as functional prototypes, available publicly on the Internet, to demonstrate the efficacy and utility of the methods through live tests.
The research contributes to the theoretical knowledge base through generalisation of empirical findings, and to industry through problem-solving design artefacts.
Alter ego, state of the art on user profiling: an overview of the most relevant organisational and behavioural aspects regarding User Profiling.
This report gives an overview of the most relevant organisational and behavioural aspects regarding user profiling. It discusses not only the most important aims of user profiling from both an organisation's and a user's perspective, but also organisational motives and barriers for user profiling and the most important conditions for its success. Finally, recommendations are made and suggestions for further research are given.
Augmenting Network Flows with User Interface Context to Inform Access Control Decisions
Whitelisting IP addresses and hostnames allows organizations to employ a default-deny approach to network traffic. Organizations employing a default-deny approach can stop many malicious threats, including zero-day attacks, because the approach allows only explicitly stated legitimate activities. However, creating a comprehensive whitelist for a default-deny approach is difficult due to user-supplied destinations that can only be known at the time of usage. Whitelists, therefore, interfere with user experience by denying network traffic to user-supplied legitimate destinations. In this thesis, we focus on creating dynamic whitelists that are capable of allowing user-supplied network activity. We designed and built a system called Harbinger, which leverages user interface activity to provide the context in which network activity took place. We built Harbinger for Microsoft Windows operating systems and have tested its usability and effectiveness on four popular Microsoft applications. We find that Harbinger can reduce false positive detection rates from 44%-54% to 0%-0.4% in IP and DNS whitelists. Furthermore, while traditional whitelists failed to detect propagation attacks, Harbinger detected the same attacks 96% of the time. We find that our system introduced only six milliseconds of delay or less for 96% of network activity.
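A minimal sketch of a whitelist augmented with user-interface context, in the spirit of Harbinger; the event fields, time-window policy, and host names below are assumptions for illustration, not the system's actual design.

```python
import time

class ContextualWhitelist:
    """Default-deny whitelist that also admits destinations the user recently
    supplied through the UI (typed, clicked, pasted), closing the gap that
    static IP/DNS whitelists leave for user-chosen destinations."""
    def __init__(self, static_allow: set[str], window_s: float = 10.0):
        self.static_allow = static_allow
        self.window_s = window_s                 # how long UI context stays valid
        self.ui_context: dict[str, float] = {}   # host -> time last seen in UI

    def observe_ui(self, host: str) -> None:
        """Called when UI monitoring sees the user reference a destination."""
        self.ui_context[host] = time.monotonic()

    def allow(self, host: str) -> bool:
        if host in self.static_allow:
            return True
        seen = self.ui_context.get(host)
        return seen is not None and time.monotonic() - seen <= self.window_s

wl = ContextualWhitelist({"update.example.com"})
wl.observe_ui("files.example.org")        # user typed this host into a dialog
print(wl.allow("files.example.org"))      # True: backed by UI context
print(wl.allow("c2.malicious.example"))   # False: no UI evidence, default deny
```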
VisANT: an online visualization and analysis tool for biological interaction data
BACKGROUND: New techniques for determining relationships between biomolecules of all types – genes, proteins, noncoding DNA, metabolites and small molecules – are now making a substantial contribution to the widely discussed explosion of facts about the cell. The data generated by these techniques promote a picture of the cell as an interconnected information network, with molecular components linked with one another in topologies that can encode and represent many features of cellular function. This networked view of biology brings the potential for systematic understanding of living molecular systems. RESULTS: We present VisANT, an application for integrating biomolecular interaction data into a cohesive, graphical interface. This software features a multi-tiered architecture for data flexibility, separating back-end modules for data retrieval from a front-end visualization and analysis package. VisANT is a freely available, open-source tool for researchers, and offers an online interface for a large range of published data sets on biomolecular interactions, including those entered by users. This system is integrated with standard databases for organized annotation, including GenBank, KEGG and SwissProt. VisANT is a Java-based, platform-independent tool suitable for a wide range of biological applications, including studies of pathways, gene regulation and systems biology. CONCLUSION: VisANT has been developed to provide interactive visual mining of biological interaction data sets. The new software provides a general tool for mining and visualizing such data in the context of sequence, pathway, structure, and associated annotations. Interaction and predicted association data can be combined, overlaid, manipulated and analyzed using a variety of built-in functions. VisANT is available at
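As a toy illustration of the combine-and-overlay capability described above, the sketch below merges an experimental interaction set with a predicted association set into one annotated graph. It uses Python and networkx purely as a stand-in for VisANT's Java internals; the gene names and edge attributes are illustrative.

```python
import networkx as nx

# Two illustrative data sets: experimentally observed protein-protein
# interactions and computationally predicted associations.
experimental = nx.Graph()
experimental.add_edges_from([("YFL039C", "YDL029W"), ("YDL029W", "YOR326W")],
                            source="experimental")
predicted = nx.Graph()
predicted.add_edges_from([("YFL039C", "YOR326W")], source="predicted")

# Overlay the two networks; each edge keeps a 'source' attribute so the
# merged view can be filtered or styled per data source.
merged = nx.compose(experimental, predicted)
for u, v, data in merged.edges(data=True):
    print(u, "--", v, data["source"])
```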
Automated Injection of Curated Knowledge Into Real-Time Clinical Systems: CDS Architecture for the 21st Century
Clinical Decision Support (CDS) is primarily associated with alerts, reminders, order entry, rule-based invocation, diagnostic aids, and on-demand information retrieval. While valuable, these foci have been in production use for decades, and do not provide a broader, interoperable means of plugging structured clinical knowledge into live electronic health record (EHR) ecosystems for purposes of orchestrating the user experiences of patients and clinicians. To date, the gap between knowledge representation and user-facing EHR integration has been considered an “implementation concern” requiring unscalable manual human effort and governance coordination. Drafting a questionnaire engineered to meet the HL7 CDS Knowledge Artifact specification, for example, carries no reasonable expectation that it may be imported and deployed into a live system without significant burdens. Dramatic reduction of the time and effort gap in the research and application cycle could be revolutionary. Doing so, however, requires both a floor-to-ceiling precoordination of functional boundaries in the knowledge management lifecycle and formalization of the human processes by which this occurs.
This research introduces ARTAKA: Architecture for Real-Time Application of Knowledge Artifacts, a concrete floor-to-ceiling technological blueprint for both provider health IT (HIT) and vendor organizations to incrementally introduce value into existing systems dynamically. This is made possible by service-ization of curated knowledge artifacts, which are then injected into a highly scalable backend infrastructure by automated orchestration through public marketplaces. Supplementary examples of client app integration are also provided. Compilation of knowledge into platform-specific form has been left flexible, insofar as implementations comply with ARTAKA’s Context Event Service (CES) communication and Health Services Platform (HSP) Marketplace service packaging standards.
Towards the goal of interoperable human processes, ARTAKA’s treatment of knowledge artifacts as a specialized form of software allows knowledge engineering to operate as a type of software engineering practice. Thus, nearly a century of software development processes, tools, policies, and lessons offer immediate benefit: in some cases, with remarkable parity. Analyses of experimentation are provided, with guidelines on how choice aspects of software development life cycles (SDLCs) apply to knowledge artifact development in an ARTAKA environment.
Portions of this culminating document have been further initiated with Standards Developing Organizations (SDOs) intended to ultimately produce normative standards, as have active relationships with other bodies.
Dissertation/Thesis, Doctoral Dissertation, Biomedical Informatics, 201