Goal-Oriented RE for E-Services
Current research in service-oriented computing (SoC) is mainly
about technology standards for SoC and the design of software components that
implement these standards. In this paper we investigate the problem of
requirements engineering (RE) for SoC. We propose a framework for goal-oriented
RE for e-services that identifies patterns in service provisioning and
shows how to compose business models from them. Based on an analysis of 19
business models for e-intermediaries we identified 10 intermediation service
patterns and their goals, and show how we can compose new business models
from those patterns in a goal-oriented way. We represent the service patterns
using value models, which are models that show which value exchanges
business patterns engage in. We conclude the paper with a discussion of how
this approach can be extended to include business process patterns to perform
the services, and software components that support these processes.
Integrating E-Commerce and Data Mining: Architecture and Challenges
We show that the e-commerce domain can provide all the right ingredients for
successful data mining and claim that it is a killer domain for data mining. We
describe an integrated architecture, based on our experience at Blue Martini
Software, for supporting this integration. The architecture can dramatically
reduce the pre-processing, cleaning, and data understanding effort often
documented to take 80% of the time in knowledge discovery projects. We
emphasize the need for data collection at the application server layer (not the
web server) in order to support logging of data and metadata that is essential
to the discovery process. We describe the data transformation bridges required
from the transaction processing systems and customer event streams (e.g.,
clickstreams) to the data warehouse. We detail the mining workbench, which
needs to provide multiple views of the data through reporting, data mining
algorithms, visualization, and OLAP. We conclude with a set of challenges. Comment: KDD workshop: WebKDD 200
Analyzing the Language of Food on Social Media
We investigate the predictive power behind the language of food on social
media. We collect a corpus of over three million food-related posts from
Twitter and demonstrate that many latent population characteristics can be
directly predicted from this data: overweight rate, diabetes rate, political
leaning, and home geographical location of authors. For all tasks, our
language-based models significantly outperform the majority-class baselines.
Performance is further improved with more complex natural language processing,
such as topic modeling. We analyze which textual features have most predictive
power for these datasets, providing insight into the connections between the
language of food, geographic locale, and community characteristics. Lastly, we
design and implement an online system for real-time query and visualization of
the dataset. Visualization tools, such as geo-referenced heatmaps,
semantics-preserving wordclouds and temporal histograms, allow us to discover
more complex, global patterns mirrored in the language of food. Comment: An extended abstract of this paper will appear in IEEE Big Data 201
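The majority-class baseline mentioned in the abstract above can be illustrated with a minimal sketch. The function name and the toy labels are hypothetical and not taken from the paper; the sketch only shows what such a baseline computes: always predict the most frequent training label and measure how often that is right.

```python
from collections import Counter


def majority_class_accuracy(train_labels, test_labels):
    """Return the most frequent training label and the accuracy obtained
    by predicting that label for every test example."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


# Toy, made-up labels purely for illustration.
label, accuracy = majority_class_accuracy(
    ["high", "high", "low"],
    ["high", "low", "low", "high"],
)
print(label, accuracy)
```

Any learned model is only informative to the extent that it beats this number on the same test set.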
XRay: Enhancing the Web's Transparency with Differential Correlation
Today's Web services - such as Google, Amazon, and Facebook - leverage user
data for varied purposes, including personalizing recommendations, targeting
advertisements, and adjusting prices. At present, users have little insight
into how their data is being used. Hence, they cannot make informed choices
about the services they choose. To increase transparency, we developed XRay,
the first fine-grained, robust, and scalable personal data tracking system for
the Web. XRay predicts which data in an arbitrary Web account (such as emails,
searches, or viewed products) is being used to target which outputs (such as
ads, recommended products, or prices). XRay's core functions are service
agnostic and easy to instantiate for new services, and they can track data
within and across services. To make predictions independent of the audited
service, XRay relies on the following insight: by comparing outputs from
different accounts with similar, but not identical, subsets of data, one can
pinpoint targeting through correlation. We show both theoretically, and through
experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision
and recall by correlating data from a surprisingly small number of extra
accounts. Comment: Extended version of a paper presented at the 23rd USENIX Security
Symposium (USENIX Security 14
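The insight described in the XRay abstract, comparing outputs across accounts that hold overlapping but different subsets of data, can be illustrated with a toy sketch. This is not XRay's actual algorithm: the scoring rule, the function name, and the account data below are assumptions made for illustration only. Each input is scored by how much more often the ad appears in accounts that contain the input than in accounts that do not.

```python
def targeting_scores(accounts, ad_seen):
    """Score each input by the difference in ad appearance rates.

    accounts: dict mapping an account name to the set of inputs it holds.
    ad_seen:  set of account names in which the ad was observed.
    """
    all_inputs = set().union(*accounts.values())
    scores = {}
    for item in all_inputs:
        with_item = [a for a, data in accounts.items() if item in data]
        without_item = [a for a, data in accounts.items() if item not in data]
        rate_with = sum(a in ad_seen for a in with_item) / len(with_item)
        rate_without = (
            sum(a in ad_seen for a in without_item) / len(without_item)
            if without_item
            else 0.0
        )
        scores[item] = rate_with - rate_without
    return scores


# Hypothetical shadow accounts, each holding a different subset of emails.
accounts = {
    "a1": {"e1", "e2"},
    "a2": {"e1", "e3"},
    "a3": {"e2", "e3"},
    "a4": {"e3"},
}
ad_seen = {"a1", "a2"}  # the ad appeared only in accounts containing e1

scores = targeting_scores(accounts, ad_seen)
likely_target = max(scores, key=scores.get)
print(likely_target, scores[likely_target])
```

In this toy setup the ad shows up exactly in the accounts that contain `e1`, so `e1` receives the highest score.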
Location Data: Perils, Profits, Promise
Most of the modern online economy is based on websites offering free services and content in exchange for advertising access and user data. Web companies collect vast troves of data about their users in order to better target their advertisements. An important subset of this harvested data is the locations visited by users. Location data is valuable as it is a "real world" signal compared to online behaviors: a visit to a store is a stronger signal than a visit to a website, and location data can reveal user attributes that are interesting to advertisers. The collection of this data, however, raises many concerns. Location data can reveal important attributes that users may not wish to disclose: ZIP codes can reveal income and race, visits to places of worship may allow discrimination, and insurers may want to know about trips to hospitals. The risks exist at both an individual level, with location tied to physical safety, and at a collective level, with inference about group membership a necessary step towards discrimination. In this thesis, I examine issues of privacy and fairness in the use of location data. In the first portion, I empirically demonstrate new attacks on the anonymity and privacy of users, including a theoretical basis for user identification. In the second portion, I propose and analyze new solutions for dealing with privacy, anonymity, and fairness in the collection and use of location data. In contrast to previous work which presents privacy in abstract ways or ignores the power of data aggregators, the work presented here focuses on concretely informing users and incorporates the economic incentives driving privacy and fairness concerns.
Context aware advertising
IP Television (IPTV) has created a new arena for digital advertising that has not been explored to its full potential yet. IPTV allows users to retrieve on-demand content and recommended content; however, very limited research has been applied in the domain of advertising in IPTV systems. The diversity of the field led to many mature efforts in the fields of content recommendation and mobile advertising. The introduction of IPTV and smart devices led to the ability to gather more context information that had not been studied before. This research studies the different contextual parameters and how to enrich the advertising context to tailor better ads for users, devises a recommendation engine that utilizes the new context, builds a prototype to prove the viability of the system, and evaluates it on different quality of service and quality of experience measures. To tackle this problem, the state of the art in context-aware advertising and in the related field of context-aware multimedia was reviewed. The intent was to identify the most relevant contextual parameters that could yield higher precision when recommending advertisements to users. Subsequently, a prototype application was developed to validate the feasibility and viability of the approach. The prototype gathers contextual information related to the number of viewers, their ages, genders, viewing angles as well as their emotions. The gathered context is then dispatched to a web service which generates advertisement recommendations and sends them back to the user. A scheduler was also implemented to identify the most suitable time to push advertisements to users based on their attention span. To achieve our contributions, a corpus of 421 ads was gathered and processed for streaming. The advertisements were displayed in reality during the holy month of Ramadan, 2016.
A data gathering application was developed where sample users were presented with 10 random ads and asked to rate and evaluate the advertisements according to predetermined criteria. The gathered data was used for training the recommendation engine and computing the latent context-item preferences. This also served to identify the performance of a system that randomly sends advertisements to users. The resulting performance is used as a benchmark to compare our results against. When it comes to the recommendation engine itself, several implementation options were considered that pertain to the methodology to create a vector representation of an advertisement as well as the metric to use to measure the similarity between two advertisement vectors. The goal is to find a representation of advertisements that circumvents the cold start problem and the best similarity measure to use with the different vectorization techniques. A set of experiments has been designed and executed to identify the right vectorization methodology and similarity measure to apply in this problem domain. To evaluate the overall performance of the system, several experiments were designed and executed that cover different quality aspects of the system such as quality of service, quality of experience and quality of context. All three aspects have been measured and our results show that our recommendation engine exhibits a significant improvement over other mechanisms of pushing ads to users that are employed in currently existing systems. The other mechanisms compared are random ad generation and targeted ad generation. The targeted ads mechanism relies on demographic information of the viewer and disregards his/her historical consumption. Our system showed a precision of 69.70%, which means that roughly 7 out of 10 recommended ads are actually liked and viewed to the end by the viewer.
The practice of randomly generating ads yields a precision of 41.11%, which means that only 4 out of 10 recommended ads are actually liked by viewers. The targeted ads system resulted in 51.39% precision. Our results show that a significant improvement can be introduced when employing context within a recommendation engine. When introducing emotion context, our results show a significant improvement when the user’s emotion is happiness; however, performance degrades when the user’s emotion is sadness. When considering all emotions, the overall results did not show a significant improvement. It is worth noting though that ads recommended based on detected emotions using our system proved to always be relevant to the user's current mood.
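Two quantities from the abstract above can be sketched briefly: the similarity between two advertisement vectors and the precision of a recommendation list. The abstract does not say which similarity measure was finally chosen, so cosine similarity is used here only as one common candidate; the function names, vectors, and ad identifiers are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def precision(recommended, liked):
    """Fraction of recommended ads that the viewer actually liked."""
    if not recommended:
        return 0.0
    hits = sum(1 for ad in recommended if ad in liked)
    return hits / len(recommended)


# Made-up ad vectors and feedback, purely for illustration.
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
print(precision(["ad1", "ad2", "ad3", "ad4"], {"ad1", "ad3", "ad9"}))
```

Under this definition, a precision of about 0.70 corresponds to roughly 7 liked ads out of every 10 recommended, which is how the abstract interprets its reported figure.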