Search CORE

181 research outputs found

USER INTERFACE CONCEPTION FOR ASSET MANAGEMENT SYSTEM

Author: Xue Xiaoguo
Publication venue
Publication date: 01/01/2015
Field of study

Data as a critical resource nowadays, it is growing expotentially. As a consequence, the issue of data storing, filtering, processing and analysis have attracted a lot of attention in the database industry, since they can not be handled by the traditional database systems. The new situation has been discussed as a concept of Big Data. To be able to handle and process the Big Data, one muct be capable of deal with large volume, high velocity and various types data. With this trend, Relational Database also succeeded in integrating parts of the functions which are required to handle Big Data. So it becomes that those two techniques advance side by side. Here in this work both Big Data and Relational Database are discussed based on the current database industry development situation. Moreover, data mining techniques and communication standards are also studied and discussed to give directions for further implementation work. An interfacing work, which includes both software interfacing for third party user and user interface design and implementation, is done to provide Wärtsilä internal users access to their remote monitoring data. The desing work is done based on the needs of the different user groups of the personnel of Wärtsilä.fi=Opinnäytetyö kokotekstinä PDF-muodossa.|en=Thesis fulltext in PDF format.|sv=Lärdomsprov tillgängligt som fulltext i PDF-format

Osuva

Enhanced image encryption scheme with new mapreduce approach for big size images

Author: Al-Khasawneh Mahmoud Ahmad Salem
Publication venue
Publication date: 01/01/2018
Field of study

Achieving a secured image encryption (IES) scheme for sensitive and confidential data communications, especially in a Hadoop environment is challenging. An accurate and secure cryptosystem for colour images requires the generation of intricate secret keys that protect the images from diverse attacks. To attain such a goal, this work proposed an improved shuffled confusion-diffusion based colour IES using a hyper-chaotic plain image. First, five different sequences of random numbers were generated. Then, two of the sequences were used to shuffle the image pixels and bits, while the remaining three were used to XOR the values of the image pixels. Performance of the developed IES was evaluated in terms of various measures such as key space size, correlation coefficient, entropy, mean squared error (MSE), peak signal to noise ratio (PSNR) and differential analysis. Values of correlation coefficient (0.000732), entropy (7.9997), PSNR (7.61), and MSE (11258) were determined to be better (against various attacks) compared to current existing techniques. The IES developed in this study was found to have outperformed other comparable cryptosystems. It is thus asserted that the developed IES can be advantageous for encrypting big data sets on parallel machines. Additionally, the developed IES was also implemented on a Hadoop environment using MapReduce to evaluate its performance against known attacks. In this process, the given image was first divided and characterized in a key-value format. Next, the Map function was invoked for every key-value pair by implementing a mapper. The Map function was used to process data splits, represented in the form of key-value pairs in parallel modes without any communication between other map processes. The Map function processed a series of key/value pairs and subsequently generated zero or more key/value pairs. Furthermore, the Map function also divided the input image into partitions before generating the secret key and XOR matrix. The secret key and XOR matrix were exploited to encrypt the image. The Reduce function merged the resultant images from the Map tasks in producing the final image. Furthermore, the value of PSNR did not exceed 7.61 when the developed IES was evaluated against known attacks for both the standard dataset and big data size images. As can be seen, the correlation coefficient value of the developed IES did not exceed 0.000732. As the handling of big data size images is different from that of standard data size images, findings of this study suggest that the developed IES could be most beneficial for big data and big size images

Universiti Teknologi Malaysia Institutional Repository

Spatial frequency based video stream analysis for object classification and recognition in clouds

Author: Anjum Ashiq
Ding Da
Ojansivu V.
Yaseen Muhammad Usman
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016
Field of study

The recent rise in multimedia technology has made it easier to perform a number of tasks. One of these tasks is monitoring where cheap cameras are producing large amount of video data. This video data is then processed for object classification to extract useful information. However, the video data obtained by these cheap cameras is often of low quality and results in blur video content. Moreover, various illumination effects caused by lightning conditions also degrade the video quality. These effects present severe challenges for object classification. We present a cloud-based blur and illumination invariant approach for object classification from images and video data. The bi-dimensional empirical mode decomposition (BEMD) has been adopted to decompose a video frame into intrinsic mode functions (IMFs). These IMFs further undergo to first order Reisz transform to generate monogenic video frames. The analysis of each IMF has been carried out by observing its local properties (amplitude, phase and orientation) generated from each monogenic video frame. We propose a stack based hierarchy of local pattern features generated from the amplitudes of each IMF which results in blur and illumination invariant object classification. The extensive experimentation on video streams as well as publically available image datasets reveals that our system achieves high accuracy from 0.97 to 0.91 for increasing Gaussian blur ranging from 0.5 to 5 and outperforms state of the art techniques under uncontrolled conditions. The system also proved to be scalable with high throughput when tested on a number of video streams using cloud infrastructure

Crossref

UDORA - University of Derby Online Research Archive

Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review

Author: Iskandaryan Ditsuhi
Ramos Francisco
Trilles Sergio
Publication venue: 'MDPI AG'
Publication date: 01/04/2020
Field of study

The influence of machine learning technologies is rapidly increasing and penetrating almost in every field, and air pollution prediction is not being excluded from those fields. This paper covers the revision of the studies related to air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. Using the most popular databases and executing the corresponding filtration, the most relevant papers were selected. After thorough reviewing those papers, the main features were extracted, which served as a base to link and compare them to each other. As a result, we can conclude that: (1) instead of using simple machine learning techniques, currently, the authors apply advanced and sophisticated techniques, (2) China was the leading country in terms of a case study, (3) Particulate matter with diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data had an hourly rate, (6) 49% of the papers used open data and since 2016 it had a tendency to increase, and (7) for efficient air quality prediction it is important to consider the external factors such as weather conditions, spatial characteristics, and temporal features

Multidisciplinary Digital Publishing Institute

Repositori Institucional de la Universitat Jaume I

Query-driven learning for predictive analytics of data subspace cardinality

Author: Anagnostopoulos Christos
Triantafillou Peter
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/06/2017
Field of study

Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of data items) of multi-dimensional data subspaces, defined by query selections over datasets. This is crucial for data analysts dealing with, e.g., interactive data subspace explorations, data subspace visualizations, and in query processing optimization. However, in many modern data systems, predictive analytics may be (i) too costly money-wise, e.g., in clouds, (ii) unreliable, e.g., in modern Big Data query engines, where accurate statistics are difficult to obtain/maintain, or (iii) infeasible, e.g., for privacy issues. We contribute a novel, query-driven, function estimation model of analyst-defined data subspace cardinality. The proposed estimation model is highly accurate in terms of prediction and accommodating the well-known selection queries: multi-dimensional range and distance-nearest neighbors (radius) queries. Our function estimation model: (i) quantizes the vectorial query space, by learning the analysts’ access patterns over a data space, (ii) associates query vectors with their corresponding cardinalities of the analyst-defined data subspaces, (iii) abstracts and employs query vectorial similarity to predict the cardinality of an unseen/unexplored data subspace, and (iv) identifies and adapts to possible changes of the query subspaces based on the theory of optimal stopping. The proposed model is decentralized, facilitating the scaling-out of such predictive analytics queries. The research significance of the model lies in that (i) it is an attractive solution when data-driven statistical techniques are undesirable or infeasible, (ii) it offers a scale-out, decentralized training solution, (iii) it is applicable to different selection query types, and (iv) it offers a performance that is superior to that of data-driven approaches

Warwick Research Archives Portal Repository

Enlighten

Scalable aggregation predictive analytics: a query-driven machine learning approach

Author: Anagnostopoulos Christos
Savva Fotis
Triantafillou Peter
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/09/2018
Field of study

We introduce a predictive modeling solution that provides high quality predictive analytics over aggregation queries in Big Data environments. Our predictive methodology is generally applicable in environments in which large-scale data owners may or may not restrict access to their data and allow only aggregation operators like COUNT to be executed over their data. In this context, our methodology is based on historical queries and their answers to accurately predict ad-hoc queries’ answers. We focus on the widely used set-cardinality, i.e., COUNT, aggregation query, as COUNT is a fundamental operator for both internal data system optimizations and for aggregation-oriented data exploration and predictive analytics. We contribute a novel, query-driven Machine Learning (ML) model whose goals are to: (i) learn the query-answer space from past issued queries, (ii) associate the query space with local linear regression & associative function estimators, (iii) define query similarity, and (iv) predict the cardinality of the answer set of unseen incoming queries, referred to the Set Cardinality Prediction (SCP) problem. Our ML model incorporates incremental ML algorithms for ensuring high quality prediction results. The significance of contribution lies in that it (i) is the only query-driven solution applicable over general Big Data environments, which include restricted-access data, (ii) offers incremental learning adjusted for arriving ad-hoc queries, which is well suited for query-driven data exploration, and (iii) offers a performance (in terms of scalability, SCP accuracy, processing time, and memory requirements) that is superior to data-centric approaches. We provide a comprehensive performance evaluation of our model evaluating its sensitivity, scalability and efficiency for quality predictive analytics. In addition, we report on the development and incorporation of our ML model in Spark showing its superior performance compared to the Spark’s COUNT method

Warwick Research Archives Portal Repository

Enlighten

Real-time Feature Detection in Mass Spectrometer Data

Author: Hillman Chris
Publication venue
Publication date: 01/01/2018
Field of study

University of Dundee Online Publications

Improving Content Based Video Retrieval Performance by Using Hadoop-MapReduce Model

Author: Abderrahim Sekkaki
Abderrahmane Adoui El Ouadrhiri
El Mehdi Saoudi
Othman El Warrak
Said Jai Andaloussi
Publication venue: FRUCT
Publication date: 01/11/2018
Field of study

In this paper, we present a distributed Content- Based Video Retrieval (CBVR) system based on MapReduce pro- gramming model. A CBVR system called Bounded Coordinate of Motion Histogram (BCMH) has been implemented as case study by using Hadoop framework. Our work consists of proposing a distributed model to extract videos signatures and compute similarity with the BCMH system based on a set of Mapreduce jobs assigned to multiple nodes of the Hadoop cluster in order to reduce computation time of training process. The proposed approach is tested on HOLLYWOOD2 dataset and the obtained results demonstrate efﬁciency of the proposed approach

Directory of Open Access Journals