
    adPerf: Characterizing the Performance of Third-party Ads

    Monetizing websites and web apps through online advertising is widespread in the web ecosystem. The online advertising ecosystem nowadays forces publishers to integrate ads from third-party domains. On the one hand, this raises several privacy and security concerns that have been actively studied in recent years. On the other hand, given the ability of today's browsers to load dynamic web pages with complex animations and JavaScript, online advertising has also transformed and can have a significant impact on webpage performance. The performance cost of online ads is critical since it eventually impacts user satisfaction as well as users' Internet bills and device energy consumption. In this paper, we present an in-depth, first-of-its-kind performance evaluation of web ads. Unlike prior efforts that rely primarily on adblockers, we perform a fine-grained analysis of the web browser's page loading process to demystify the performance cost of web ads. We aim to characterize the cost incurred by every component of an ad, so that the publisher, ad syndicate, and advertiser can improve the ad's performance with detailed guidance. For this purpose, we develop an infrastructure, adPerf, for the Chrome browser that classifies page loading workloads into ad-related and main-content at the granularity of browser activities (such as JavaScript and Layout). Our evaluations show that online advertising entails more than 15% of the browser's page loading workload, and approximately 88% of that is spent on JavaScript. We also track the sources and delivery chain of web ads and analyze performance considering the origin of the ad contents. We observe that two well-known third-party ad domains contribute 35% of the ads' performance cost and, surprisingly, top news websites implicitly include unknown third-party ads which in some cases account for more than 37% of the ads' performance cost.
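
    To make the attribution idea concrete, here is a minimal sketch (not adPerf's actual instrumentation, which lives inside Chrome; the trace format, helper names, and ad-domain list are assumptions) of how browser activity time could be split into ad-related and main-content work by matching each activity's initiating script origin against known third-party ad domains:

```python
from urllib.parse import urlparse
from collections import defaultdict

# Illustrative list of third-party ad domains (assumption, not adPerf's list).
AD_DOMAINS = {"doubleclick.net", "googlesyndication.com", "adnxs.com"}

def is_ad_origin(url):
    """Return True if the URL's host matches or is a subdomain of a known ad domain."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in AD_DOMAINS)

def attribute_trace(events):
    """Aggregate per-activity time (ms), split into ad vs. main-content work.

    `events` uses an assumed simplified trace format:
    {"activity": "JavaScript", "duration_ms": 12.5, "initiator_url": "https://..."}
    """
    totals = defaultdict(lambda: {"ad": 0.0, "main": 0.0})
    for ev in events:
        bucket = "ad" if is_ad_origin(ev["initiator_url"]) else "main"
        totals[ev["activity"]][bucket] += ev["duration_ms"]
    return dict(totals)

if __name__ == "__main__":
    sample = [
        {"activity": "JavaScript", "duration_ms": 40.0,
         "initiator_url": "https://securepubads.googlesyndication.com/gpt.js"},
        {"activity": "Layout", "duration_ms": 8.0,
         "initiator_url": "https://news.example.com/app.js"},
    ]
    print(attribute_trace(sample))
```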

    Scaling Causality Analysis for Production Systems.

    Causality analysis reveals how program values influence each other. It is important for debugging, optimizing, and understanding the execution of programs. This thesis scales causality analysis to production systems consisting of desktop and server applications as well as large-scale Internet services. This enables developers to employ causality analysis to debug and optimize complex, modern software systems. This thesis shows that it is possible to scale causality analysis both to fine-grained instruction-level analysis and to analysis of Internet-scale distributed systems with thousands of discrete software components, by developing and employing automated methods to observe and reason about causality. First, we observe causality at a fine-grained instruction level by developing the first taint tracking framework to support tracking millions of input sources. We also introduce flexible taint tracking to allow for scoping different queries and dynamic filtering of inputs, outputs, and relationships. Next, we introduce the Mystery Machine, which uses a "big data" approach to discover causal relationships between software components in a large-scale Internet service. We leverage the fact that large-scale Internet services receive a large number of requests in order to observe counterexamples to hypothesized causal relationships. Using the discovered causal relationships, we identify the critical path for request execution and use critical path analysis to explore potential scheduling optimizations. Finally, we explore using causality to make data-quality tradeoffs in Internet services. A data-quality tradeoff is an explicit decision by a software component to return lower-fidelity data in order to improve response time or minimize resource usage. We perform a study of data-quality tradeoffs in a large-scale Internet service to show the pervasiveness of these tradeoffs. We develop DQBarge, a system that enables better data-quality tradeoffs by propagating critical information along the causal path of request processing. Our evaluation shows that DQBarge helps Internet services mitigate load spikes, improve utilization of spare resources, and implement dynamic capacity planning.

    Ph.D. dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/135888/1/mcchow_1.pd
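
    As a rough illustration of the counterexample-driven discovery the Mystery Machine performs (the trace format and segment names below are invented for the example; the real system operates on production request logs at much larger scale), one can start by hypothesizing that every pair of execution segments is causally ordered and discard any hypothesis that some trace contradicts:

```python
from itertools import permutations

def discover_causal_edges(traces):
    """Infer 'a happens-before b' edges that hold in every trace.

    Each trace maps a segment name to its (start, end) timestamps.
    A hypothesis a -> b survives only if no trace shows b starting
    before a ends (the counterexample test).
    """
    segments = set().union(*(t.keys() for t in traces))
    hypotheses = set(permutations(segments, 2))  # start by assuming everything is ordered
    for trace in traces:
        for a, b in list(hypotheses):
            if a in trace and b in trace and trace[b][0] < trace[a][1]:
                hypotheses.discard((a, b))  # counterexample: b overlapped or preceded a
    return hypotheses

if __name__ == "__main__":
    traces = [
        {"server": (0, 10), "network": (10, 15), "client_render": (15, 30)},
        {"server": (0, 12), "network": (12, 20), "client_render": (18, 35)},
    ]
    # network -> client_render is dropped by the second trace (they overlap),
    # while server -> network and server -> client_render survive.
    print(sorted(discover_causal_edges(traces)))
```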

    MT-WAVE: Profiling multi-tier web applications

    The web is evolving: what was once primarily used for sharing static content has now evolved into a platform for rich client-side applications. These applications do not run exclusively on the client; while the client is responsible for presentation and some processing, there is a significant amount of processing and persistence that happens server-side. This has advantages and disadvantages. The biggest advantage is that the user’s data is accessible from anywhere. It doesn’t matter which device you sign into a web application from: everything you’ve been working on is instantly accessible. The largest disadvantage is that large numbers of servers are required to support a growing user base; unlike traditional client applications, an organization making a web application needs to provision compute and storage resources for each expected user. This infrastructure is designed in tiers that are responsible for different aspects of the application, and these tiers may not even be run by the same organization. As these systems grow in complexity, it becomes progressively more challenging to identify and solve performance problems. While there are many measures of software system performance, web application users only care about response latency. This “fingertip-to-eyeball performance” is the only metric that users directly perceive: when a button is clicked in a web application, how long does it take for the desired action to complete? MT-WAVE is a system for solving fingertip-to-eyeball performance problems in web applications. The system is designed for multi-tier tracing: each piece of the application is instrumented, execution traces are collected, and the system merges these traces into a single coherent snapshot of system latency at every tier. To ensure that user-perceived latency is accurately captured, tracing begins in the web browser. The application developer then uses the MT-WAVE Visualization System to explore the execution traces, first identifying which tier causes the largest amount of latency and then zooming in on the specific function calls in that tier to find optimization candidates. After fixing an identified problem, the system is used to verify that the changes had the intended effect. This optimization methodology and toolset is explained through a series of case studies that identify and solve performance problems in open-source and commercial applications. These case studies demonstrate both the utility of the MT-WAVE system and the unintuitive nature of system optimization.
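
    A toy sketch of the trace-merging step (the span format and tier names are assumptions, not MT-WAVE's actual data model): spans from each tier are joined on a shared request identifier, after which the tier contributing the most latency to a given request can be read off directly:

```python
def merge_traces(tier_traces):
    """Merge per-tier spans into one view keyed by request ID.

    `tier_traces` uses an assumed simplified format:
    {"browser": [{"request_id": "r1", "duration_ms": 120}, ...],
     "app": [...], "db": [...]}
    """
    merged = {}
    for tier, spans in tier_traces.items():
        for span in spans:
            merged.setdefault(span["request_id"], {})[tier] = span["duration_ms"]
    return merged

def slowest_tier(merged, request_id):
    """Return the tier contributing the most latency for one request."""
    tiers = merged[request_id]
    return max(tiers, key=tiers.get)

if __name__ == "__main__":
    traces = {
        "browser": [{"request_id": "r1", "duration_ms": 40}],
        "app": [{"request_id": "r1", "duration_ms": 180}],
        "db": [{"request_id": "r1", "duration_ms": 95}],
    }
    merged = merge_traces(traces)
    print(merged["r1"], "->", slowest_tier(merged, "r1"))  # the app tier dominates
```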

    Improving the robustness and privacy of HTTP cookie-based tracking systems within an affiliate marketing context: a thesis presented in fulfilment of the requirements for the degree of Doctor of Philosophy at Massey University, Albany, New Zealand

    E-commerce activities provide a global reach for enterprises large and small. Third parties generate visitor traffic for a fee, through affiliate marketing, search engine marketing, keyword bidding, and organic search, amongst others. Therefore, improving the robustness of the underlying tracking and state management techniques is a vital requirement for the growth and stability of e-commerce. In an inherently stateless ecosystem such as the Internet, HTTP cookies have been the de facto tracking vector for decades. In a previous study, the thesis author exposed circumstances under which cookie-based tracking systems can fail, some due to technical glitches, others due to manipulations made for monetary gain by fraudulent actors. Following a design science research paradigm, this research explores alternative tracking vectors discussed in previous research studies within a cross-domain tracking environment. It evaluates their efficacy in the current context and demonstrates how to use them to improve the robustness of existing tracking techniques. Research outputs include methods, instantiations, and a privacy model artefact based on the information-seeking behaviour of different categories of tracking software and their resulting privacy intrusion levels. This privacy model provides clarity and is useful for practitioners and regulators to create regulatory frameworks that do not hinder technological advancement but rather curtail privacy-intrusive tracking practices on the Internet. The method artefacts are instantiated as functional prototypes, available publicly on the Internet, to demonstrate the efficacy and utility of the methods through live tests. The research contributes to the theoretical knowledge base through generalisation of empirical findings, and to the industry through problem-solving design artefacts.
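
    The thesis's own artefacts are not reproduced here, but as an illustration of the general idea of redundant tracking vectors, the following sketch (hypothetical helper names and signing scheme) issues a tamper-evident affiliate click identifier that can travel both in a cookie and as a URL parameter, so attribution degrades gracefully when the cookie vector fails:

```python
import hashlib
import hmac
import time

SECRET = b"replace-with-real-secret"  # illustrative only

def make_click_id(affiliate_id):
    """Create a tamper-evident click ID that can be carried as a cookie
    and as a URL parameter, so attribution survives cookie loss."""
    payload = f"{affiliate_id}.{int(time.time())}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()[:16]
    return f"{payload}.{sig}"

def verify_click_id(click_id):
    """Return the affiliate ID if the signature checks out, else None."""
    try:
        affiliate_id, ts, sig = click_id.rsplit(".", 2)
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{affiliate_id}.{ts}".encode(),
                        hashlib.sha256).hexdigest()[:16]
    return affiliate_id if hmac.compare_digest(sig, expected) else None

def attribute(cookie_value, url_param):
    """Prefer the cookie vector; fall back to the URL parameter vector."""
    for vector in (cookie_value, url_param):
        if vector and verify_click_id(vector):
            return verify_click_id(vector)
    return None

if __name__ == "__main__":
    cid = make_click_id("affiliate-42")
    print(attribute(None, cid))  # cookie missing, URL parameter still attributes
```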

    Alter ego, state of the art on user profiling: an overview of the most relevant organisational and behavioural aspects regarding User Profiling.

    This report gives an overview of the most relevant organisational and behavioural aspects regarding user profiling. It discusses not only the most important aims of user profiling from both an organisation’s and a user’s perspective, but also organisational motives and barriers for user profiling and the most important conditions for its success. Finally, recommendations are made and suggestions for further research are given.

    Augmenting Network Flows with User Interface Context to Inform Access Control Decisions

    Whitelisting IP addresses and hostnames allows organizations to employ a default-deny approach to network traffic. Organizations employing a default-deny approach can stop many malicious threats, including zero-day attacks, because it only allows explicitly stated legitimate activities. However, creating a comprehensive whitelist for a default-deny approach is difficult due to user-supplied destinations that can only be known at the time of usage. Whitelists therefore interfere with user experience by denying network traffic to user-supplied legitimate destinations. In this thesis, we focus on creating dynamic whitelists that are capable of allowing user-supplied network activity. We designed and built a system called Harbinger, which leverages user interface activity to provide contextual information about the circumstances in which network activity took place. We built Harbinger for Microsoft Windows operating systems and have tested its usability and effectiveness on four popular Microsoft applications. We find that Harbinger can reduce false positive rates in IP and DNS whitelists from 44%-54% to 0%-0.4%. Furthermore, while traditional whitelists failed to detect propagation attacks, Harbinger detected the same attacks 96% of the time. We find that our system introduced six milliseconds of delay or less for 96% of network activity.
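
    As a simplified model of the idea (Harbinger itself hooks the Windows UI and network stacks; the class, time window, and whitelist entries below are illustrative assumptions), a default-deny policy can be relaxed for destinations the user has just referenced in the interface:

```python
import time

class UIContextWhitelist:
    """Toy model of augmenting flow decisions with user interface context."""

    def __init__(self, window_seconds=10):
        self.window = window_seconds
        self.recent_ui = {}  # hostname -> timestamp of last user-driven mention
        self.static_whitelist = {"update.example-corp.com"}  # illustrative entry

    def record_ui_event(self, hostname):
        """Call when the user clicks a link or types a destination."""
        self.recent_ui[hostname] = time.time()

    def allow_connection(self, hostname):
        """Default-deny: allow only statically whitelisted hosts or hosts
        the user referenced in the UI within the last few seconds."""
        if hostname in self.static_whitelist:
            return True
        seen = self.recent_ui.get(hostname)
        return seen is not None and (time.time() - seen) <= self.window

if __name__ == "__main__":
    acl = UIContextWhitelist()
    acl.record_ui_event("example.org")           # user clicked a link
    print(acl.allow_connection("example.org"))   # True: user-driven destination
    print(acl.allow_connection("evil.example"))  # False: no UI context
```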

    VisANT: an online visualization and analysis tool for biological interaction data

    BACKGROUND: New techniques for determining relationships between biomolecules of all types – genes, proteins, noncoding DNA, metabolites and small molecules – are now making a substantial contribution to the widely discussed explosion of facts about the cell. The data generated by these techniques promote a picture of the cell as an interconnected information network, with molecular components linked with one another in topologies that can encode and represent many features of cellular function. This networked view of biology brings the potential for systematic understanding of living molecular systems. RESULTS: We present VisANT, an application for integrating biomolecular interaction data into a cohesive, graphical interface. This software features a multi-tiered architecture for data flexibility, separating back-end modules for data retrieval from a front-end visualization and analysis package. VisANT is a freely available, open-source tool for researchers, and offers an online interface for a large range of published data sets on biomolecular interactions, including those entered by users. This system is integrated with standard databases for organized annotation, including GenBank, KEGG and SwissProt. VisANT is a Java-based, platform-independent tool suitable for a wide range of biological applications, including studies of pathways, gene regulation and systems biology. CONCLUSION: VisANT has been developed to provide interactive visual mining of biological interaction data sets. The new software provides a general tool for mining and visualizing such data in the context of sequence, pathway, structure, and associated annotations. Interaction and predicted association data can be combined, overlaid, manipulated and analyzed using a variety of built-in functions. VisANT is available at
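
    As a generic illustration of the kind of data such a tool consumes (this is not VisANT's API; the triple format and example interactions are assumptions), an interaction network can be kept as an adjacency structure that records the data source supporting each edge, so observed and predicted associations can be combined and queried together:

```python
from collections import defaultdict

def build_network(interactions):
    """Build an undirected interaction graph from (molecule_a, molecule_b, source) triples."""
    graph = defaultdict(dict)
    for a, b, source in interactions:
        graph[a].setdefault(b, set()).add(source)
        graph[b].setdefault(a, set()).add(source)
    return graph

def neighbors(graph, node):
    """Return interaction partners of a node with their supporting data sources."""
    return graph.get(node, {})

if __name__ == "__main__":
    data = [
        ("TP53", "MDM2", "BIND"),       # curated interaction
        ("TP53", "CDKN1A", "KEGG"),     # pathway-derived edge
        ("MDM2", "UBE2D1", "predicted"),  # predicted association
    ]
    g = build_network(data)
    print(neighbors(g, "TP53"))
```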

    Automated Injection of Curated Knowledge Into Real-Time Clinical Systems: CDS Architecture for the 21st Century

    Clinical Decision Support (CDS) is primarily associated with alerts, reminders, order entry, rule-based invocation, diagnostic aids, and on-demand information retrieval. While valuable, these foci have been in production use for decades and do not provide a broader, interoperable means of plugging structured clinical knowledge into live electronic health record (EHR) ecosystems for purposes of orchestrating the user experiences of patients and clinicians. To date, the gap between knowledge representation and user-facing EHR integration has been considered an “implementation concern” requiring unscalable manual human effort and governance coordination. Drafting a questionnaire engineered to meet the HL7 CDS Knowledge Artifact specification, for example, carries no reasonable expectation that it may be imported and deployed into a live system without significant burdens. Dramatic reduction of the time and effort gap in the research and application cycle could be revolutionary. Doing so, however, requires both a floor-to-ceiling precoordination of functional boundaries in the knowledge management lifecycle and formalization of the human processes by which this occurs. This research introduces ARTAKA: Architecture for Real-Time Application of Knowledge Artifacts, a concrete floor-to-ceiling technological blueprint for both provider health IT (HIT) and vendor organizations to incrementally introduce value into existing systems dynamically. This is made possible by service-ization of curated knowledge artifacts, which are then injected into a highly scalable backend infrastructure through automated orchestration via public marketplaces. Supplementary examples of client app integration are also provided. Compilation of knowledge into platform-specific form has been left flexible, insofar as implementations comply with ARTAKA’s Context Event Service (CES) communication and Health Services Platform (HSP) Marketplace service packaging standards. Towards the goal of interoperable human processes, ARTAKA’s treatment of knowledge artifacts as a specialized form of software allows knowledge engineering to operate as a type of software engineering practice. Thus, nearly a century of software development processes, tools, policies, and lessons offer immediate benefit: in some cases, with remarkable parity. Analyses of experimentation are provided, with guidelines on how selected aspects of software development life cycles (SDLCs) apply to knowledge artifact development in an ARTAKA environment. Portions of this culminating document have also been initiated with Standards Developing Organizations (SDOs), intended to ultimately produce normative standards, as have active relationships with other bodies.

    Doctoral dissertation, Biomedical Informatics, 201
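
    As a purely illustrative sketch of treating a curated knowledge artifact as a deployable, machine-readable package (the manifest fields below are hypothetical and are not drawn from the CES or HSP Marketplace specifications), an automated orchestrator could consume something like the following:

```python
import json

def package_artifact(artifact_id, version, fhir_questionnaire, trigger_events):
    """Wrap a curated knowledge artifact in a deployment manifest so an
    automated orchestrator can inject it into a backend without manual steps.
    All field names here are hypothetical, not taken from the CES or
    HSP Marketplace specifications."""
    return {
        "id": artifact_id,
        "version": version,
        "payload_type": "questionnaire",
        "payload": fhir_questionnaire,
        "subscriptions": trigger_events,  # clinical context events that activate it
    }

if __name__ == "__main__":
    manifest = package_artifact(
        artifact_id="asthma-intake-questionnaire",
        version="1.2.0",
        fhir_questionnaire={"resourceType": "Questionnaire", "status": "active"},
        trigger_events=["patient-encounter-start"],
    )
    print(json.dumps(manifest, indent=2))
```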