11 research outputs found

    A novel MapReduce Lift association rule mining algorithm (MRLAR) for Big Data

    Big Data mining is an analytic process used to discover hidden knowledge and patterns in massive, complex, and multi-dimensional datasets. A single processor's memory and CPU resources are very limited, which makes algorithm performance ineffective. Recently, there has been renewed interest in using association rule mining (ARM) in Big Data to uncover relationships between seemingly unrelated items. However, traditional ARM discovery techniques are unable to handle this huge amount of data. Therefore, there is a vital need for scalable and parallel strategies for ARM based on Big Data approaches. This paper develops a novel MapReduce framework for an association rule algorithm based on Lift interestingness measurement (MRLAR), which can handle massive datasets with a large number of nodes. The experimental results show the efficiency of the proposed algorithm in measuring the correlations between itemsets by integrating MapReduce and LIM instead of depending on confidence.
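
The lift measure named in this abstract can be illustrated with a small single-machine sketch. This is not the MRLAR algorithm itself, whose details the abstract does not give. It only shows the standard definition lift(A → B) = supp(A ∪ B) / (supp(A) · supp(B)), with itemset counting arranged as a map step and a reduce step. The function names and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction):
    """Map step: emit (itemset, 1) for every single item and every item pair."""
    items = sorted(set(transaction))
    for item in items:
        yield (item,), 1
    for pair in combinations(items, 2):
        yield pair, 1


def reduce_counts(emitted):
    """Reduce step: sum the counts emitted for each itemset key."""
    counts = Counter()
    for key, value in emitted:
        counts[key] += value
    return counts


def lift(counts, n_transactions, a, b):
    """Standard lift: supp(a, b) / (supp(a) * supp(b)).

    Assumes both items occur at least once (no zero-support guard).
    """
    pair = tuple(sorted((a, b)))
    supp_ab = counts[pair] / n_transactions
    supp_a = counts[(a,)] / n_transactions
    supp_b = counts[(b,)] / n_transactions
    return supp_ab / (supp_a * supp_b)


# Hypothetical toy data, not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
emitted = (kv for t in transactions for kv in map_transaction(t))
counts = reduce_counts(emitted)
print(lift(counts, len(transactions), "bread", "butter"))
```

A lift above 1 suggests the two items co-occur more often than independence would predict, and a lift below 1 suggests the opposite. Confidence alone does not make this distinction.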

    A cloud-based remote sensing data production system

    The data processing capability of existing remote sensing systems has not kept pace with the amount of data that is typically received and needs to be processed. Existing product services are also unable to offer users a variety of remote sensing data sources to select from. Therefore, in this paper, we present a product generation programme that uses multisource remote sensing data across distributed data centers in a cloud environment, to compensate for the low production efficiency, limited product types and simple services of existing systems. The programme adopts a “master–slave” architecture. Specifically, the master center is mainly responsible for receiving and parsing production orders, as well as task and data scheduling, results feedback, and so on; the slave centers are the distributed remote sensing data centers, which store one or more types of remote sensing data and are mainly responsible for executing production tasks. In general, each production task runs on only one data center, and data scheduling among centers adopts a “minimum data transferring” strategy. The logical workflow of each production task is organized based on a knowledge base and then turned into the actual executed workflow by Kepler. In addition, the scheduling strategy of each production task mainly depends on the Ganglia monitoring results, so computing resources can be allocated or expanded adaptively. Finally, we evaluated the proposed programme using test experiments performed at global, regional and local scales, and the results showed that the proposed cloud-based remote sensing production system could deal with massive remote sensing data and the generation of different products, as well as on-demand remote sensing computing and information services.
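
One way to read the “minimum data transferring” strategy is as a greedy choice: run the task on the center that already holds the most required data, measured by size. The sketch below illustrates that reading only. The abstract does not describe the paper's scheduler, so the function name, dataset names, center names and sizes are all hypothetical.

```python
def pick_center(required, holdings, sizes):
    """Return (center, transfer_cost) for the center needing the least data moved in.

    required: set of dataset names the task needs
    holdings: dict mapping center name -> set of dataset names stored there
    sizes:    dict mapping dataset name -> size (arbitrary units)
    Ties are broken by alphabetical center name.
    """
    best = None
    for center in sorted(holdings):
        missing = required - holdings[center]
        cost = sum(sizes[name] for name in missing)
        if best is None or cost < best[1]:
            best = (center, cost)
    return best


# Hypothetical example data.
sizes = {"dataset_x": 40, "dataset_y": 10, "dataset_z": 25}
holdings = {
    "center_a": {"dataset_x"},
    "center_b": {"dataset_y", "dataset_z"},
    "center_c": {"dataset_y"},
}
required = {"dataset_x", "dataset_y"}
print(pick_center(required, holdings, sizes))
```

A real scheduler would also weigh current load, which the abstract says is taken from Ganglia monitoring. This sketch ignores load entirely.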

    Classification techniques for airfares prediction

    This work focuses on the multi-factorial problems that commercial airlines face, such as price wars and the creation of a dynamic discount table, through the implementation of evolutionary algorithms and data mining methods. On the one hand, in the airline industry, Revenue and Pricing teams generally spend a considerable amount of time analysing and interpreting the actions of their competitors. Most of the time, the analysts have to use their analytical skills to create ad-hoc methods to interpret or find patterns in the fares. The use of automatic methodologies is key to reducing time and avoiding human errors. This thesis proposes a new methodology to predict, analyse and interpret airline fares, which is capable of mimicking the manual processes executed by pricing teams. A gene expression programming algorithm is proposed to mimic the manual process carried out by pricing teams by adding new features automatically. The algorithm can explore huge search spaces, which is a daunting task when done manually, as pricing teams do daily. A real scenario was considered in the experimental analysis, using Air Canada fares published in the period December 2019 to January 2020, corresponding to a travel period between December 2019 and April 2020.
On the other hand, historically, airlines around the globe have used static pricing structures, which are constrained to discrete price points and offer limited segmentation between their guests. Because of these limitations and constraints, there is a great need for novel methods to calculate the willingness to pay and to identify potential guests whose propensity to book a flight increases if they receive a discount, so that airlines can increase revenue through higher fare sales. Thus, this thesis proposes a novel methodology to identify interesting subgroups whose chance of booking a flight increases if they receive a discount offer. This proposal includes a grammatical evolution algorithm that acts as a feature selector to extract the best subgroups by analysing the booking behaviour of historical passengers. A real case scenario was considered in the experimental analysis using private data from a world-class commercial airline.

    Autonomy and relatedness : an ethnography of Wik people of Aurukun, western Cape York Peninsula

    I seek in this thesis to provide a critical account of Wik Aboriginal people living in and near the township of Aurukun on western Cape York Peninsula, north Queensland. It is set in a period of rapid and often traumatic changes for Wik, the seeds of which were sown during the seventy-four year mission period, but which accelerated dramatically with the imposition in 1978 of a local government administrative system based on the mainstream Queensland model. The decade or so following this saw the massive and cumulative penetration of the forms and institutions of the wider, dominant society. Yet, despite this, Wik people continued to carve out a social and spatial domain established through a distinctive way of life, defined in terms of particular sets of conjoint dispositions, beliefs, and understandings and through the forms, styles and contexts of social practices. In analysing this particular style of life, I argue that the essentially unresolved tension between personal autonomy and relatedness provided a fundamental dynamic to Wik social forms and processes. I examine the changing symbolic and material resources, such as cash and alcohol, through which autonomy could be realized but which at the same time instantiated relatedness. These new resources, I suggest, provided potent and unprecedented means through which personal autonomy could be realized. For these and other reasons, there was a trend towards increasing individuation of Wik, and the sundering of the control of the means of social reproduction which had lain essentially with senior generations. At the same time as this developing individuation, there was a rise in the importance of 'community' based forms, and of a construction of 'culture' as a set of reified practices which were posited as differentiating Wik from others, particularly Whites. I also examine Wik political processes in detail. 
The Wik domain was distinguished by a high degree of fluidity and contingency in the composition of the various collectivities coalescing around social actions. Despite the attempts of the Mission and more recent secular regimes to alter the legitimate definitions of social and geographic space, the constantly ebbing and flowing currents of Wik social life acted to subvert these imposed designations of public and private spaces and their appropriate uses. This fluidity of structure and process extended to Wik political forms. Within the Wik domain, relations of domination and subordination were essentially created in and through the direct interactions between persons, rather than being mediated through objective institutions such as a legislature or bureaucracy. In such circumstances, not only political groupings but orthodoxy and legitimacy themselves were contingent and embedded in the flux of social life. Implicit in this thesis also is an argument against theories which see phenomena such as violence, large-scale alcohol consumption, and gambling, characteristic of many remote areas of Aboriginal Australia, as in some simple causal sense resulting from dispossession and alienation. Rather, it is argued that such phenomena can only be understood in terms of the complex interaction between core cultural themes, themselves historically located, and the circumstances of settlement life which have arisen through the colonial and post-colonial periods.

    The influence of English on the lexical expansion of Bahasa Malaysia

    Source: SIGLE. Available from British Library Document Supply Centre (BLDSC), DSC:D38970/82. GB, United Kingdom.