1,287 research outputs found
Recommended from our members
Democratizing Web Automation: Programming for Social Scientists and Other Domain Experts
We have promised social scientists a data revolution, but it has not arrived. What stands between practitioners and the data-driven insights they want? Acquiring the data. In particular, acquiring the social media, online forum, and other web data that was supposed to help them produce big, rich, ecologically valid datasets. Web automation programming is resistant to high-level abstractions, so end-user programmers end up stymied by the need to reverse engineer website internals—DOM, JavaScript, AJAX. Programming by Demonstration (PBD) offered one promising avenue towards democratizing web automation. Unfortunately, as the web matured, the programs became too complex for PBD tools to synthesize, and web PBD progress stalled.This dissertation describes how I reformulated traditional web PBD around the insight that demonstrations are not always the easiest way for non-programmers to communicate their intent. By shifting from a purely Programming-By-Demonstration view to a Programming-By-X view that accepts a variety of user-friendly inputs, we can dramatically broaden the class of programs that come in reach for end-user programmers. Our Helena ecosystem combines (i) usable PBD-based program drafting tools, (ii) learnable programming languages, and (iii) novel programming environment interactions. The end result: non-coders write Helena programs in 10 minutes that can handle the complexity of modern webpages, while coders attempt the same task and time out in an hour. I conclude with a discussion of the abstraction-resistant domains that will fall next and how hybrid PL-HCI breakthroughs will vastly expand access to programming
Detecting Abnormal Behavior in Web Applications
The rapid advance of web technologies has made the Web an essential part of our daily lives. However, network attacks have exploited vulnerabilities of web applications, and caused substantial damages to Internet users. Detecting network attacks is the first and important step in network security. A major branch in this area is anomaly detection. This dissertation concentrates on detecting abnormal behaviors in web applications by employing the following methodology. For a web application, we conduct a set of measurements to reveal the existence of abnormal behaviors in it. We observe the differences between normal and abnormal behaviors. By applying a variety of methods in information extraction, such as heuristics algorithms, machine learning, and information theory, we extract features useful for building a classification system to detect abnormal behaviors.;In particular, we have studied four detection problems in web security. The first is detecting unauthorized hotlinking behavior that plagues hosting servers on the Internet. We analyze a group of common hotlinking attacks and web resources targeted by them. Then we present an anti-hotlinking framework for protecting materials on hosting servers. The second problem is detecting aggressive behavior of automation on Twitter. Our work determines whether a Twitter user is human, bot or cyborg based on the degree of automation. We observe the differences among the three categories in terms of tweeting behavior, tweet content, and account properties. We propose a classification system that uses the combination of features extracted from an unknown user to determine the likelihood of being a human, bot or cyborg. Furthermore, we shift the detection perspective from automation to spam, and introduce the third problem, namely detecting social spam campaigns on Twitter. Evolved from individual spammers, spam campaigns manipulate and coordinate multiple accounts to spread spam on Twitter, and display some collective characteristics. We design an automatic classification system based on machine learning, and apply multiple features to classifying spam campaigns. Complementary to conventional spam detection methods, our work brings efficiency and robustness. Finally, we extend our detection research into the blogosphere to capture blog bots. In this problem, detecting the human presence is an effective defense against the automatic posting ability of blog bots. We introduce behavioral biometrics, mainly mouse and keyboard dynamics, to distinguish between human and bot. By passively monitoring user browsing activities, this detection method does not require any direct user participation, and improves the user experience
A Web-Based Collaborative Multimedia Presentation Document System
With the distributed and rapidly increasing volume of data and expeditious development of modern web browsers, web browsers have become a possible legitimate vehicle for remote interactive multimedia presentation and collaboration, especially for geographically dispersed teams. To our knowledge, although there are a large number of applications developed for these purposes, there are some drawbacks in prior work including the lack of interactive controls of presentation flows, general-purpose collaboration support on multimedia, and efficient and precise replay of presentations.
To fill the research gaps in prior work, in this dissertation, we propose a web-based multimedia collaborative presentation document system, which models a presentation as media resources together with a stream of media events, attached to associated media objects. It represents presentation flows and collaboration actions in events, implements temporal and spatial scheduling on multimedia objects, and supports real-time interactive control of the predefined schedules. As all events are represented by simple messages with an object-prioritized approach, our platform can also support fine-grained precise replay of presentations. Hundreds of kilobytes could be enough to store the events in a collaborative presentation session for accurate replays, compared with hundreds of megabytes in screen recording tools with a pixel-based replay mechanism
AUTOMATED CYBER OPERATIONS MISSION DATA REPLAY
The Persistent Cyber Training Environment (PCTE) has been developed as the joint force solution to provide a single training environment for cyberspace operations. PCTE offers a closed network for Joint Cyberspace Operations Forces, which provides a range of training solutions from individual sustainment training to mission rehearsal and post-operation analysis. Currently, PCTE does not have the ability to replay previously executed training scenarios or external scenarios. Replaying cyber mission data on a digital twin virtual network within PCTE would support operator training as well as enable development and testing of new strategies for offensive and defensive cyberspace operations. A necessary first step in developing such a tool is to acquire network specifications for a target network, or to extract network specifications from a cyber mission data set. This research developed a program design and proof-of-concept tool, Automated Cyber Operations Mission Data Replay (ACOMDR), to extract a portion of the network specifications necessary to instantiate a digital twin network within PCTE from cyber mission data. From this research, we were able to identify key areas for future work to increase the fidelity of the network specification and replay cyber events within PCTE.Captain, United States Marine CorpsApproved for public release. Distribution is unlimited
Low-Code Programming Models
Traditionally, computer programming has been the prerogative of professional
developers using textual programming languages such as C, Java, or Python.
Low-code programming promises an alternative: letting citizen developers create
programs using visual abstractions, demonstrations, or natural language. While
low-code programming is currently getting a lot of attention in industry, the
relevant research literature is scattered, and in fact, rarely uses the term
"low-code". This article brings together low-code literature from various
research fields, explaining how techniques work while providing a unified point
of view. Low-code has the potential to empower more people to automate tasks by
creating computer programs, making them more productive and less dependent on
scarce professional software developers
Managing big data experiments on smartphones
The explosive number of smartphones with ever growing sensing and computing capabilities have brought a paradigm shift to many traditional domains of the computing field. Re-programming smartphones and instrumenting them for application testing and data gathering at scale is currently a tedious and time-consuming process that poses significant logistical challenges. Next generation smartphone applications are expected to be much larger-scale and complex, demanding that these undergo evaluation and testing under different real-world datasets, devices and conditions. In this paper, we present an architecture for managing such large-scale data management experiments on real smartphones. We particularly present the building blocks of our architecture that encompassed smartphone sensor data collected by the crowd and organized in our big data repository. The given datasets can then be replayed on our testbed comprising of real and simulated smartphones accessible to developers through a web-based interface. We present the applicability of our architecture through a case study that involves the evaluation of individual components that are part of a complex indoor positioning system for smartphones, coined Anyplace, which we have developed over the years. The given study shows how our architecture allows us to derive novel insights into the performance of our algorithms and applications, by simplifying the management of large-scale data on smartphones
LTE Spectrum Sharing Research Testbed: Integrated Hardware, Software, Network and Data
This paper presents Virginia Tech's wireless testbed supporting research on
long-term evolution (LTE) signaling and radio frequency (RF) spectrum
coexistence. LTE is continuously refined and new features released. As the
communications contexts for LTE expand, new research problems arise and include
operation in harsh RF signaling environments and coexistence with other radios.
Our testbed provides an integrated research tool for investigating these and
other research problems; it allows analyzing the severity of the problem,
designing and rapidly prototyping solutions, and assessing them with
standard-compliant equipment and test procedures. The modular testbed
integrates general-purpose software-defined radio hardware, LTE-specific test
equipment, RF components, free open-source and commercial LTE software, a
configurable RF network and recorded radar waveform samples. It supports RF
channel emulated and over-the-air radiated modes. The testbed can be remotely
accessed and configured. An RF switching network allows for designing many
different experiments that can involve a variety of real and virtual radios
with support for multiple-input multiple-output (MIMO) antenna operation. We
present the testbed, the research it has enabled and some valuable lessons that
we learned and that may help designing, developing, and operating future
wireless testbeds.Comment: In Proceeding of the 10th ACM International Workshop on Wireless
Network Testbeds, Experimental Evaluation & Characterization (WiNTECH),
Snowbird, Utah, October 201
- …