316 research outputs found

    Deep Data Locality on Apache Hadoop

    The amount of data being collected in areas such as social media, networks, scientific instruments, mobile devices, and sensors is growing continuously, and the technology to process it is also advancing rapidly. One of the fundamental technologies for processing big data is Apache Hadoop, which has been adopted by many commercial products, such as InfoSphere by IBM or Spark by Cloudera. MapReduce on Hadoop has been widely used in many data science applications. As a dominant big data processing platform, the performance of MapReduce on Hadoop has a significant impact on big data processing capability across multiple industries. Most research on improving the speed of big data analysis has targeted Hadoop modules such as Hadoop Common, the Hadoop Distributed File System (HDFS), Hadoop Yet Another Resource Negotiator (YARN), and Hadoop MapReduce. In this research, we focused on data locality on HDFS to improve the performance of MapReduce. MapReduce utilizes data locality to reduce the amount of data transfer. However, even though the majority of the processing cost occurs in the later stages, data locality has been exploited only in the early stages, which we call Shallow Data Locality (SDL). As a result, the benefit of data locality has not been fully realized. We explore a new concept called Deep Data Locality (DDL), in which the data is pre-arranged to maximize locality in the later stages. Specifically, we introduce two implementation methods of DDL: block-based DDL and key-based DDL. In block-based DDL, data blocks are pre-arranged to reduce block copying time in two ways. First, Rack-Local Map (RLM) blocks are eliminated: under the conventional default block placement policy (DBPP), data blocks are randomly placed on any available slave nodes, requiring copies of RLM blocks, whereas block-based DDL places blocks so as to avoid RLMs and thereby reduces block copy time. Second, block-based DDL concentrates the blocks on a smaller number of nodes, reducing the data transfer time among them. We analyzed the block distribution status with customer review data from TripAdvisor and measured performance with the TeraSort benchmark. Our test results show that the execution times of Map and Shuffle improved by up to 25% and 31%, respectively. In key-based DDL, the input data is divided into several blocks and stored in HDFS before going into the Map stage. In contrast to conventional blocks, which contain random keys, each of our blocks has a unique key. This requires a pre-sorting of the key-value pairs, which can be done during the ETL process. This eliminates some data movement in the Map, Shuffle, and Reduce stages, and thereby improves performance. In our experiments, MapReduce with key-based DDL performed 21.9% faster than default MapReduce and 13.3% faster than MapReduce with block-based DDL. Additionally, key-based DDL can be combined with other methods to further improve performance; when key-based DDL and block-based DDL were combined, Hadoop performance improved by 34.4%. In this research, we also developed MapReduce workflow models with a novel computational model, together with a numerical simulator that integrates the computational models. The model faithfully predicts Hadoop performance under various conditions.
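    As an illustration of the key-based pre-arrangement idea described above, the following sketch (plain Python, not the authors' Hadoop implementation; the function name is an assumption made for illustration) range-partitions pre-sorted key-value pairs so that each block covers a contiguous, non-overlapping key range, which is what lets the later stages skip cross-node data movement:

```python
# Illustrative sketch of key-based pre-arrangement (not the authors'
# code): sort key-value pairs during ETL, then cut the sorted stream
# into blocks so that each block covers a distinct key range.

def partition_by_key(records, num_blocks):
    """records: iterable of (key, value) pairs.
    Returns num_blocks lists whose key ranges do not overlap."""
    ordered = sorted(records)                  # pre-sorting during ETL
    size = -(-len(ordered) // num_blocks)      # ceiling division
    return [ordered[i * size:(i + 1) * size] for i in range(num_blocks)]
```

    Because block i's largest key is smaller than block i+1's smallest key, a reduce task responsible for a key range can read whole blocks instead of shuffling records from every mapper.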

    Improved Correction of Atmospheric Pressure Data Obtained by Smartphones through Machine Learning

    A machine-learning correction method is proposed to improve the conventional linear regression (LR)-based correction of atmospheric pressure data obtained from smartphones. The method conducts clustering and regression analysis with time-domain classification. Smartphone data obtained from July 2014 through December 2014 in Gyeonggi-do, one of the most populous provinces in South Korea, surrounding Seoul with an area of 10,000 km2, were classified with respect to time of day (daytime or nighttime), day of the week (weekday or weekend), and the user's mobility, prior to expectation-maximization (EM) clustering. Subsequently, the results were compared by applying machine learning methods such as the multilayer perceptron (MLP) and support vector regression (SVR). The results showed a mean absolute error (MAE) 26% lower on average when regression analysis was performed with EM clustering than without it. Among the machine learning methods, the MAE for SVR was around 31% lower than that of LR and about 19% lower than that of MLP. It is concluded that pressure data from smartphones are as good as those from the national automatic weather station (AWS) network.

    Enhanced artificial bee colony-least squares support vector machines algorithm for time series prediction

    Over the past decades, Least Squares Support Vector Machines (LSSVM) have been widely utilized in prediction tasks across various application domains. Nevertheless, the existing literature shows that the capability of LSSVM is highly dependent on the values of its hyper-parameters, namely the regularization parameter and the kernel parameter, which greatly affect the generalization of LSSVM in prediction tasks. This study proposes a hybrid algorithm, based on the Artificial Bee Colony (ABC) and LSSVM, that consists of three algorithms: ABC-LSSVM, lvABC-LSSVM and cmABC-LSSVM. The lvABC algorithm is introduced to overcome the local optima problem by enriching the searching behaviour using Levy mutation. The cmABC algorithm, which incorporates conventional mutation, addresses the over-fitting or under-fitting problem. The combination of the lvABC and cmABC algorithms, later introduced as the Enhanced Artificial Bee Colony-Least Squares Support Vector Machine (eABC-LSSVM), is realized in the prediction of non-renewable natural resource commodity prices. Upon completion of data collection and pre-processing, the eABC-LSSVM algorithm was designed and developed. The predictability of eABC-LSSVM is measured with five statistical metrics: Mean Absolute Percentage Error (MAPE), prediction accuracy, symmetric MAPE (sMAPE), Root Mean Square Percentage Error (RMSPE) and Theil's U. Results show that eABC-LSSVM has a lower prediction error rate compared to eight hybridization models of LSSVM and Evolutionary Computation (EC) algorithms. In addition, the proposed algorithm is compared to single prediction techniques, namely Support Vector Machines (SVM) and the Back Propagation Neural Network (BPNN). In general, eABC-LSSVM produced more than 90% prediction accuracy, indicating that it is capable of solving the optimization problem, specifically in the prediction task. The eABC-LSSVM should be useful to investors and commodity traders in planning their investments and projecting their profits.
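    Two of the error metrics named above can be stated compactly. The sketch below uses one common formulation of each (sMAPE in particular has several variants in the literature, so this is an assumption rather than necessarily the study's exact definition):

```python
# Common formulations of MAPE and sMAPE, both expressed in percent.

def mape(actual, predicted):
    """Mean Absolute Percentage Error; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

def smape(actual, predicted):
    """Symmetric MAPE (one common variant), bounded above by 200%."""
    return 100.0 * sum(2.0 * abs(p - a) / (abs(a) + abs(p))
                       for a, p in zip(actual, predicted)) / len(actual)
```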

    Air Force Institute of Technology Research Report 2001

    This report summarizes the research activities of the Air Force Institute of Technology’s Graduate School of Engineering and Management. It describes research interests and faculty expertise; lists student theses/dissertations; identifies research sponsors and contributions; and outlines the procedures for contacting the school. Included in the report are: faculty publications, conference presentations, consultations, and funded research projects. Research was conducted in the areas of Aeronautical and Astronautical Engineering, Electrical Engineering and Electro-Optics, Computer Engineering and Computer Science, Systems and Engineering Management, Operational Sciences, and Engineering Physics

    Analysis of Android Device-Based Solutions for Fall Detection

    Falls are a major cause of health and psychological problems, as well as hospitalization costs, among older adults. Thus, the investigation of automatic Fall Detection Systems (FDSs) has received special attention from the research community during the last decade. In this area, the widespread popularity, decreasing price, computing capabilities, built-in sensors and multiplicity of wireless interfaces of Android-based devices (especially smartphones) have fostered the adoption of this technology for deploying wearable and inexpensive fall detection architectures. This paper presents a critical and thorough analysis of existing fall detection systems that are based on Android devices. The review systematically classifies and compares the proposals in the literature according to criteria such as the system architecture, the employed sensors, the detection algorithm, and the response in case of a fall alarm. The study emphasizes the analysis of the evaluation methods employed to assess the effectiveness of the detection process. The review reveals the complete lack of a reference framework to validate and compare the proposals. In addition, the study shows that most research works do not evaluate the actual applicability of Android devices (with limited battery and computing resources) to fall detection solutions. Ministerio de Economía y Competitividad TEC2013-42711-
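    A minimal example of the kind of detection algorithm such surveys compare is simple thresholding of the accelerometer magnitude: a free-fall dip followed by an impact peak. The function name and threshold values below are hypothetical placeholders for illustration, not figures from the paper:

```python
import math

# Illustrative threshold-based fall detector (a sketch, not any
# surveyed system): look for a free-fall dip in acceleration
# magnitude followed later by an impact peak.

FREE_FALL_G = 0.5   # magnitude below this suggests free fall
IMPACT_G = 2.5      # magnitude above this suggests impact

def detect_fall(samples):
    """samples: iterable of (ax, ay, az) accelerations in units of g.
    Returns True if a free-fall dip is later followed by an impact."""
    seen_free_fall = False
    for ax, ay, az in samples:
        mag = math.sqrt(ax * ax + ay * ay + az * az)
        if mag < FREE_FALL_G:
            seen_free_fall = True
        elif seen_free_fall and mag > IMPACT_G:
            return True
    return False
```

    Real proposals refine this with windowing, posture analysis after the impact, or machine-learning classifiers, which is precisely where the surveyed evaluation methods diverge.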

    A review of multi-car elevator system

    This paper presents a review of a new generation of elevator systems, the Multi-Car Elevator System: an elevator system that contains more than one elevator car in a single elevator shaft. The introduction explains why the Multi-Car Elevator System is a new trend in elevator systems, based on its structural design, cost savings and efficiency. Different types of Multi-Car Elevator System, such as circulation (loop-type), non-circulation and bifurcate circulation, are described in section 2. In section 3, research since 2002 on dispatch strategies, control strategies and car collision avoidance strategies for Multi-Car Elevator Systems is reviewed. The discussion section reveals some drawbacks of the Multi-Car Elevator System in transport capability and the risk of car collision. Recommendations for future work are given as well.
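    The simplest form of collision avoidance for two cars sharing a shaft can be sketched as clamping a car's target floor to preserve a safety gap. The function, names and gap value below are illustrative assumptions, not taken from any reviewed work:

```python
# Hypothetical sketch of a minimal collision-avoidance rule for two
# cars in one shaft: the lower car's target is clamped so it never
# comes within SAFETY_GAP floors of the car above it.

SAFETY_GAP = 2  # minimum number of floors kept between the two cars

def allowed_target(upper_car_floor, requested_floor):
    """Clamp the lower car's requested floor against the upper car."""
    return min(requested_floor, upper_car_floor - SAFETY_GAP)
```

    Practical dispatch strategies extend this static rule with motion prediction and zoning, which is where the reviewed works differ most.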

    Air Force Institute of Technology Research Report 2004

    This report summarizes the research activities of the Air Force Institute of Technology’s Graduate School of Engineering and Management. It describes research interests and faculty expertise; lists student theses/dissertations; identifies research sponsors and contributions; and outlines the procedures for contacting the school. Included in the report are: faculty publications, conference presentations, consultations, and funded research projects. Research was conducted in the areas of Aeronautical and Astronautical Engineering, Electrical Engineering and Electro-Optics, Computer Engineering and Computer Science, Systems and Engineering Management, Operational Sciences, and Engineering Physics

    Socially Assistive Robots in Smart Environments to Attend Elderly People—A Survey.

    Assistive environments for daily living (Ambient Assisted Living, AAL) involve the deployment of sensors and certain actuators in the home or residence of the person being cared for so that, with the help of the necessary computational management and decision-making mechanisms, the person can live a more autonomous life. Such technologies are becoming more affordable and popular. However, despite the undoubted potential of the services offered by these AAL systems, there are serious problems of acceptance today. In part, these problems arise from the design phase, which often does not sufficiently take end users into account. In addition, it is complex for older people to interact with interfaces that are sometimes not very natural or intuitive. The use of a socially assistive robot (SAR) that serves as an interface to the AAL system and takes responsibility for the interaction with the person is a possible solution. The robot is a physical entity that can operate with a certain degree of autonomy and bring features to the interaction with the person that a tablet or smartphone obviously cannot. The robot can benefit from the recent popularization of artificial intelligence-based solutions to personalize its attention to the person and to provide new services. Its inclusion in an AAL ecosystem should, however, be carefully assessed. The robot's mission should not be to replace the person but to be a tool that facilitates the elderly person's daily life. Its design should consider the AAL system in which it is integrated, the needs and preferences of the people with whom it will interact, and the services that, in conjunction with this system, the robot can offer. The aim of this article is to review the current state of the art in the integration of SARs into the AAL ecosystem and to determine whether an initial phase of high expectations but very limited results has been overcome. This work has been supported by grants PDC2022-133597-C42, TED2021-131739B-C21 and PID2022-137344OB-C32, funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR (for the first two grants), and "ERDF A way of making Europe" (for the third grant). Furthermore, this work has also been supported by the "Vivir en Casa" project (8.07/5.14.6298), funded by the European Union Next Generation/PRTR and by the Government of Andalusia.

    Air Force Institute of Technology Research Report 1997

    This report summarizes the research activities of the Air Force Institute of Technology's Graduate School of Engineering and the Graduate School of Logistics and Acquisition Management. It describes research interests and faculty expertise; lists student theses/dissertations; identifies research sponsors and contributions; and outlines the procedures for contacting either school.

    In vivo skin capacitive imaging analysis by using grey level co-occurrence matrix (GLCM).

    We present our latest work on in vivo skin capacitive image analysis using the grey level co-occurrence matrix (GLCM). The in vivo skin capacitive images were taken with a capacitance-based fingerprint sensor, and the images were then analysed by GLCM. Four GLCM feature vectors, angular second moment (ASM), entropy (ENT), contrast (CON) and correlation (COR), were selected to describe the skin texture. The results show that the angular second moment increases with age, while entropy decreases with age. The results also suggest that the angular second moment and entropy values reflect more about the skin texture, whilst the contrast and correlation values reflect more about the topically applied solvents. Overall, the results show that the GLCM is an effective way to extract and analyse skin texture information, which can potentially be a valuable reference for evaluating the effects of medical and cosmetic treatments.
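    The four GLCM features named above can be computed directly from the normalized co-occurrence matrix. The self-contained sketch below uses a horizontal offset of one pixel and log base 2 for entropy; a real analysis would typically use a library such as scikit-image, and the offset and log base here are illustrative choices:

```python
import math

# Sketch of the four GLCM texture features (ASM, entropy, contrast,
# correlation) for a 2-D image of integer grey levels, offset (0, 1).

def glcm_features(image, levels):
    """image: 2-D list of grey levels in [0, levels).
    Returns (ASM, entropy, contrast, correlation).
    Assumes the image is not uniform (non-zero variance)."""
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for row in image:
        for a, b in zip(row, row[1:]):     # horizontal neighbour pairs
            counts[a][b] += 1
            total += 1
    p = [[c / total for c in row] for row in counts]

    asm = sum(v * v for row in p for v in row)
    ent = -sum(v * math.log2(v) for row in p for v in row if v > 0)
    con = sum(p[i][j] * (i - j) ** 2
              for i in range(levels) for j in range(levels))

    mu_i = sum(i * p[i][j] for i in range(levels) for j in range(levels))
    mu_j = sum(j * p[i][j] for i in range(levels) for j in range(levels))
    var_i = sum((i - mu_i) ** 2 * p[i][j]
                for i in range(levels) for j in range(levels))
    var_j = sum((j - mu_j) ** 2 * p[i][j]
                for i in range(levels) for j in range(levels))
    cor = sum((i - mu_i) * (j - mu_j) * p[i][j]
              for i in range(levels)
              for j in range(levels)) / math.sqrt(var_i * var_j)
    return asm, ent, con, cor
```

    Higher ASM indicates a more ordered texture, while higher entropy indicates a more complex one, which matches the age trends reported above.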