
    Predictive User Modeling with Actionable Attributes

    Different machine learning techniques have been proposed and used for modeling individual and group user needs, interests and preferences. In traditional predictive modeling, instances are described by observable variables, called attributes, and the goal is to learn a model for predicting the target variable for unseen instances. For example, for marketing purposes a company may consider profiling a new user based on her observed web browsing behavior, referral keywords or other relevant information. In many real-world applications the values of some attributes are not only observable but can be actively set by a decision maker. Furthermore, in some of these applications the decision maker is interested not only in generating accurate predictions but in maximizing the probability of the desired outcome. For example, a direct marketing manager can choose which type of special offer to send to a client (an actionable attribute), hoping that the right choice will make a positive response more likely. We study how to learn to choose the value of an actionable attribute in order to maximize the probability of a desired outcome in predictive modeling. We emphasize that not all instances are equally sensitive to changes in actions: accurate choice of an action is critical for borderline instances (e.g. users who do not have a strong opinion one way or the other). We formulate three supervised learning approaches for selecting the value of an actionable attribute at the instance level, and we introduce a focused training procedure that puts more emphasis on situations where varying the action is most likely to take effect. A proof-of-concept experimental validation on two real-world case studies in the web analytics and e-learning domains highlights the potential of the proposed approaches.
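    The basic setting can be sketched as follows: fit an outcome model that includes the actionable attribute as a feature, then, for a new instance, score every candidate action and pick the most promising one. This is only a minimal illustration under assumed names (OFFER_TYPES, GradientBoostingClassifier as the model), not the paper's three approaches or its focused training procedure.

```python
# Minimal sketch: choose the value of an actionable attribute that maximises
# the predicted probability of a positive outcome. All names are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

OFFER_TYPES = [0, 1, 2]  # hypothetical values of the actionable attribute

def fit_outcome_model(X_observed, actions, y):
    """Learn P(positive outcome | observed attributes, action)."""
    X = np.column_stack([X_observed, actions])
    model = GradientBoostingClassifier()
    model.fit(X, y)
    return model

def choose_action(model, x_observed):
    """Score each candidate action for one instance and pick the best one."""
    candidates = np.array([np.append(x_observed, a) for a in OFFER_TYPES])
    probs = model.predict_proba(candidates)[:, 1]
    return OFFER_TYPES[int(np.argmax(probs))], probs
```

    In this simplified view, instances whose predicted probabilities are close across actions are the borderline cases the abstract refers to; a focused training procedure would weight such cases more heavily.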

    Automatic Chinese Postal Address Block Location Using Proximity Descriptors and Cooperative Profit Random Forests.

    Locating the destination address block is key to the automated sorting of mail. Due to the characteristics of Chinese envelopes used in mainland China, we exploit proximity cues to describe the investigated regions on envelopes. We propose two proximity descriptors encoding the spatial distributions of the connected components obtained from binary envelope images. To locate the destination address block, these descriptors are used together with cooperative profit random forests (CPRFs). Experimental results show that the proposed proximity descriptors are superior to two component descriptors that exploit only the shape characteristics of individual components, and that the CPRF classifier produces higher recall values than seven state-of-the-art classifiers. These promising results are due to the fact that the proposed descriptors encode the proximity characteristics of the binary envelope images and the CPRF classifier uses an effective tree-node split approach.
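    As a rough illustration of the pipeline, the sketch below derives a simplified proximity-style feature from the connected components of a binarised envelope image and feeds it to a plain random forest. Both the descriptor and the RandomForestClassifier stand-in are assumptions for illustration, not the authors' exact encoding or their cooperative profit random forest.

```python
# Illustrative only: a simplified proximity-style descriptor for a candidate
# region, classified with an off-the-shelf random forest (CPRF stand-in).
import numpy as np
import cv2
from sklearn.ensemble import RandomForestClassifier

def proximity_descriptor(binary_img, region_box, k=5):
    """Encode a region by the spatial distribution of nearby connected components."""
    n, _labels, _stats, centroids = cv2.connectedComponentsWithStats(binary_img)
    rx, ry, rw, rh = region_box
    centre = np.array([rx + rw / 2.0, ry + rh / 2.0])
    # distances from the region centre to every component centroid (skip background row 0)
    d = np.linalg.norm(centroids[1:] - centre, axis=1)
    nearest = np.sort(d)[:k]
    nearest = np.pad(nearest, (0, max(0, k - nearest.size)))  # zero-pad if few components
    return np.concatenate([nearest, [rw, rh, n - 1]])

# clf = RandomForestClassifier(n_estimators=200).fit(descriptors, labels)
# clf.predict(...) then flags which candidate region is the destination address block
```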

    Applying Domain Knowledge to the Recognition of Handwritten Zip Codes


    A data-driven method for unsupervised electricity consumption characterisation at the district level and beyond

    A bottom-up methodology for characterising the electricity use of the building stock at the local level is presented. It is based on statistical learning analysis of aggregated energy consumption data, weather data, cadastre data, and socioeconomic information. To demonstrate the validity of this methodology, the characterisation of the electricity consumption of the whole province of Lleida, located in northeast Spain, is implemented and tested. The geographical aggregation level considered is the postal code, since it is the highest data resolution available through the open data sources used in the research work. The development and the experimental tests are supported by a web application environment formed by interactive user interfaces specifically developed for this purpose. The paper's novelty lies in the application of statistical data methods able to infer the main energy performance characteristics of a large number of urban districts without prior knowledge of their building characteristics, using only measured data coming from smart meters, cadastre databases and weather forecasting services. A data-driven technique disaggregates electricity consumption into multiple uses (space heating, cooling, holidays and baseload). In addition, multiple Key Performance Indicators (KPIs) are derived from these disaggregated energy uses to obtain the energy characterisation of the buildings within a specific area. The potential reuse of this methodology allows for a better understanding of the drivers of electricity use, with multiple applications for the public and private sector. This work emanated from research conducted with the financial support of the European Commission through the H2020 project BIGG, grant agreement 957047, and the JRC Expert Contract CT-EX2017D306558-102. D. Chemisana thanks ICREA for the ICREA Acadèmia. Dr J. Cipriano also thanks the Ministerio de Ciencia e Innovación of the Spanish Government for the Juan de la Cierva Incorporación grant.
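    One plausible way to realise such a disaggregation, shown here only as a sketch, is a degree-day regression of daily smart-meter consumption against outdoor temperature for each postal code; the temperature thresholds, the regression form and the example KPI are assumptions rather than the paper's actual method.

```python
# Sketch of a weather-based disaggregation of daily electricity consumption
# into baseload, heating and cooling components for one postal code.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def disaggregate(df, t_heat=15.0, t_cool=22.0):
    """df: daily rows with columns ['kwh', 'temp']; thresholds are assumed."""
    hdd = np.maximum(t_heat - df["temp"], 0.0)   # heating degree days
    cdd = np.maximum(df["temp"] - t_cool, 0.0)   # cooling degree days
    reg = LinearRegression().fit(np.column_stack([hdd, cdd]), df["kwh"])
    return pd.DataFrame({"baseload": reg.intercept_,
                         "heating": reg.coef_[0] * hdd,
                         "cooling": reg.coef_[1] * cdd},
                        index=df.index)

# Example KPI (assumed): share of weather-dependent consumption in the district
# parts = disaggregate(df)
# kpi = (parts["heating"] + parts["cooling"]).sum() / df["kwh"].sum()
```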

    A Computational Theory of Contextual Knowledge in Machine Reading

    Machine recognition of off-line handwriting can be achieved either by recognising words as individual symbols (word-level recognition) or by segmenting a word into parts, usually letters, and classifying those parts (letter-level recognition). Whichever method is used, current handwriting recognition systems cannot overcome the inherent ambiguity in writing without recourse to contextual information. This thesis presents a set of experiments that use Hidden Markov Models of language to resolve ambiguity in the classification process. It goes on to describe an algorithm designed to recognise a document written by a single author and to improve recognition by adapting to the writing style and learning new words. Learning and adaptation are achieved by reading the document over several iterations. The algorithm is designed to incorporate contextual processing, adaptation to modify the shape of known words, and learning of new words within a constrained dictionary. Adaptation occurs when a word that has previously been trained in the classifier is recognised at either the word or letter level and the word image is used to modify the classifier. Learning occurs when a new word that was not in the training set is recognised at the letter level and is subsequently added to the classifier. Words and letters are recognised using a nearest-neighbour classifier with features based on the two-dimensional Fourier transform. By incorporating a measure of confidence based on the distribution of training points around an exemplar, adaptation and learning are constrained to occur only when a word is confidently classified. The algorithm was implemented and tested with a dictionary of 1000 words. Results show that adaptation of the letter classifier improved recognition by 3.9% on average, with only a 1.6% improvement at the whole-word level. Two experiments were carried out to evaluate learning in the system. It was found that learning accounted for little improvement in the classification results and that learning new words was prone to propagating misclassifications.
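    The recognition loop described above can be sketched roughly as follows: low-frequency 2-D Fourier magnitudes as features, a nearest-neighbour classifier, and adaptation gated by a confidence score. The margin-based confidence and the threshold below are simplifications, not the thesis's distribution-based measure.

```python
# Minimal sketch of confidence-gated adaptation for a nearest-neighbour
# word/letter classifier over 2-D Fourier features. Thresholds are assumed.
import numpy as np

def fourier_features(word_img, size=16):
    """Low-frequency 2-D FFT magnitudes of a normalised word image."""
    spectrum = np.abs(np.fft.fft2(word_img))
    return spectrum[:size, :size].ravel()

class AdaptiveNN:
    def __init__(self, threshold=0.8):
        self.X, self.y = [], []
        self.threshold = threshold  # assumed confidence cut-off

    def fit(self, feats, labels):
        self.X.extend(feats)
        self.y.extend(labels)

    def classify(self, feat):
        """Return the nearest label and a crude margin-based confidence."""
        d = np.linalg.norm(np.array(self.X) - feat, axis=1)
        i, j = np.argsort(d)[:2]
        conf = 1.0 - d[i] / (d[j] + 1e-9)
        return self.y[i], conf

    def adapt(self, feat, label, conf):
        """Add the word image's features only when classification was confident."""
        if conf >= self.threshold:
            self.X.append(feat)
            self.y.append(label)
```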