5 research outputs found

    Modelling of Spatial Big Data Analysis and Visualization

    Today’s advanced survey tools open new approaches and opportunities for geoscience researchers to create new models, systems, and frameworks that support the lifecycle of spatial big data. Mobile Mapping Systems use LiDAR technology to provide an efficient and accurate way to collect geographic features and their attributes in the field, which helps city planning departments and surveyors design and update city GIS maps with high accuracy. The challenge is not only the heterogeneous increase in the volume of point cloud data, but also several other characteristics such as its velocity and variety. The vast amount of point cloud data gathered by Mobile Mapping Systems leads to new challenges for research, innovation, and business development: addressing the characteristics of Volume, Velocity, Variety, and Veracity in order to achieve the Value of spatial big data (SBD). Cloud computing has provided a new paradigm for publishing and consuming spatial models as a service, together with big data utilities and services that can be used to overcome the challenges of point cloud data analysis and visualization. This paper presents a model built on cloud-based spatial big data services, using spatial-join service capabilities to relate analysis results to their locations on a map; it describes how cloud computing supports the visualization and analysis of spatial big data and reviews examples of related scientific models.
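    The spatial-join idea mentioned above — relating each analysis result to the map region that contains it — can be sketched in plain Python. This is only an illustrative sketch, not the paper's cloud service; the region names, point ids, and bounding-box representation are all assumptions.

    ```python
    # Minimal spatial-join sketch: assign each analysis result (a point)
    # to the map region whose bounding box contains it.
    # Regions and points here are hypothetical illustration data.

    def spatial_join(points, regions):
        """Return {point_id: region_name} for points falling inside a region box.

        points:  {point_id: (x, y)}
        regions: {region_name: (xmin, ymin, xmax, ymax)}
        """
        joined = {}
        for pid, (x, y) in points.items():
            for name, (xmin, ymin, xmax, ymax) in regions.items():
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    joined[pid] = name   # first matching region wins
                    break
        return joined

    regions = {"district_a": (0, 0, 10, 10), "district_b": (10, 0, 20, 10)}
    points = {"p1": (3, 4), "p2": (15, 2), "p3": (25, 25)}  # p3 matches no region
    print(spatial_join(points, regions))  # → {'p1': 'district_a', 'p2': 'district_b'}
    ```

    A production service would use real polygon geometries and a spatial index rather than bounding boxes, but the join contract — point in, region label out — is the same.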

    Big Data Analytics: Principles, Directions and Tasks (a Review)

    We review the directions (avenues) of Big Data analysis and their practical significance, as well as the problems and tasks in this field. Big Data Analytics appears to be a dominant trend in the development of modern information technologies for management and planning in business. A few examples of real applications of Big Data are briefly outlined. The analysis of Big Data aims to extract useful sense from raw data collections. Big Data and Big Analytics have evolved as the computer society’s response to the challenges raised by rapid growth in data volume, variety, heterogeneity, velocity, and veracity. Big Data Analytics may be seen as today’s phase of the research and development known under the names ‘Data Mining’, ‘Knowledge Discovery in Data’, ‘intelligent data analysis’, etc. We suggest that there exist four modes of large-scale usage of Big Data: 1) ‘intelligent’ information retrieval; 2) massive “intermediate” data processing (concentration, mining), which may be performed in one or two scans; 3) model inference from data; 4) knowledge discovery in data. The stages of the data analysis cycle are outlined. Because Big Data are raw, distributed, unstructured, heterogeneous, and disaggregated (vertically split), the data should be prepared for deep analysis.
    Data preparation may comprise such jobs as data retrieval, access, filtering, cleaning, aggregation, integration, dimensionality reduction, reformatting, etc. There are several classes of typical data analysis problems (tasks), including: grouping of cases (clustering); predictive model inference (regression, classification, recognition, etc.); generative model inference; and extracting structures and regularities from data. The distinction between model inference and knowledge discovery is elucidated. We suggest why ‘deep learning’ (one of the most attractive topics by now) is so successful and popular. One drawback of traditional models is their inability to make predictions under an incomplete list of predictors (when some predictors are missing) or under an augmented list of predictors. One may overcome this drawback using causal models. Causal networks are highlighted in the survey as attractive in that they are expressive generative models and, simultaneously, predictive models in the strict sense. This means they purport to explain how the object at hand operates (provided they are adequate). Being adequate, a causal network facilitates predicting the causal effect of a local intervention on the object. Six ‘pillars’ on which the methodological core of Big Analytics is built are indicated.
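    The first of the typical task classes above, grouping of cases (clustering), can be illustrated with a minimal sketch. The one-dimensional data, the choice of k-means with two clusters, and the fixed initial centers are illustrative assumptions, not taken from the paper.

    ```python
    # Minimal 1-D k-means sketch illustrating the "grouping of cases"
    # (clustering) task class. Data and k = 2 are hypothetical.

    def kmeans_1d(values, centers, iterations=10):
        """Lloyd's algorithm on scalar values with given initial centers."""
        for _ in range(iterations):
            # Assignment step: each value joins its nearest center.
            clusters = [[] for _ in centers]
            for v in values:
                nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
                clusters[nearest].append(v)
            # Update step: each center moves to the mean of its cluster.
            centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        return centers, clusters

    centers, clusters = kmeans_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], [0.0, 5.0])
    print(centers)  # → [1.0, 9.0]
    ```

    The other task classes (predictive and generative model inference, structure discovery) follow the same pattern of iterating between fitting a model and evaluating it against the data, but with richer model families.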

    Parallel fast Fourier transform in SPMD style of Cilk

    Copyright © 2019 Inderscience Enterprises Ltd. In this paper, we propose a parallel one-dimensional non-recursive fast Fourier transform (FFT) program based on the conventional Cooley-Tukey algorithm, written in C using Cilk in single program multiple data (SPMD) style. As a highly compact code, it is compared with a highly tuned parallel recursive FFT using Cilk, which is included in the Cilk package, version 5.4.6. Both algorithms are executed on multicore servers, and experimental results show that the performance of the SPMD-style Cilk FFT parallel code is highly competitive and promising.
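    The non-recursive Cooley-Tukey scheme the paper builds on — a bit-reversal permutation followed by iterative butterfly passes — can be sketched sequentially. The paper's code is parallel C/Cilk; this single-threaded Python version, including its function name, is only an illustrative assumption.

    ```python
    # Non-recursive radix-2 Cooley-Tukey FFT sketch (sequential, not the
    # paper's parallel Cilk code). Input length must be a power of two.
    import cmath

    def fft_iterative(a):
        n = len(a)
        assert n and (n & (n - 1)) == 0, "length must be a power of two"
        a = list(a)
        # Bit-reversal permutation: reorder input so in-place butterflies work.
        j = 0
        for i in range(1, n):
            bit = n >> 1
            while j & bit:
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                a[i], a[j] = a[j], a[i]
        # Butterfly passes over sub-transforms of doubling length.
        length = 2
        while length <= n:
            w_len = cmath.exp(-2j * cmath.pi / length)  # principal root of unity
            for start in range(0, n, length):
                w = 1.0
                for k in range(length // 2):
                    u = a[start + k]
                    v = a[start + k + length // 2] * w
                    a[start + k] = u + v
                    a[start + k + length // 2] = u - v
                    w *= w_len
            length <<= 1
        return a

    # Spectrum of a constant signal: all energy lands in bin 0.
    print(fft_iterative([1, 1, 1, 1]))
    ```

    In the SPMD parallelization the paper describes, each worker would execute the same butterfly code over a disjoint slice of `start` values within each pass, since butterflies inside one pass are independent.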

    Scheduling in MapReduce Clusters

    MapReduce is a framework proposed by Google for processing huge amounts of data in a distributed environment. The simplicity of the programming model and the fault-tolerance feature of the framework make it very popular in Big Data processing. As MapReduce clusters grow popular, their scheduling becomes increasingly important. On one hand, many MapReduce applications have high performance requirements, for example, on response time and/or throughput. On the other hand, with the increasing size of MapReduce clusters, energy-efficient scheduling of MapReduce clusters becomes inevitable. These scheduling challenges, however, have not been systematically studied. The objective of this dissertation is to provide MapReduce applications with low cost and energy consumption through the development of scheduling theory and algorithms, energy models, and energy-aware resource management. In particular, we will investigate energy-efficient scheduling in hybrid CPU-GPU MapReduce clusters. This research work is expected to achieve a breakthrough in Big Data processing, particularly in providing green computing to Big Data applications such as social network analysis, medical care data mining, and financial fraud detection. The tools we propose to develop are expected to increase utilization and reduce energy consumption for MapReduce clusters. In this PhD dissertation, we propose to address the aforementioned challenges by investigating and developing 1) a match-making scheduling algorithm for improving the data locality of MapReduce applications, 2) a real-time scheduling algorithm for heterogeneous MapReduce clusters, and 3) an energy-efficient scheduler for hybrid CPU-GPU MapReduce clusters. Advisers: Ying Lu and David Swanson
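    The data-locality idea behind match-making scheduling — when a node asks for work, prefer a map task whose input block is already stored on that node — can be sketched as follows. This is a hypothetical illustration of the general principle, not the dissertation's algorithm; the task and node names and the data layout are invented.

    ```python
    # Hypothetical sketch of locality-aware matchmaking: when a node requests
    # work, prefer a map task whose input block is stored on that node.
    # Task/node names and the data layout are illustrative, not from the thesis.

    def pick_task(node, pending_tasks, block_locations):
        """Return a data-local task for `node` if one exists, else any pending task.

        pending_tasks:   list of task ids awaiting scheduling
        block_locations: {task_id: set of nodes holding that task's input block}
        """
        for task in pending_tasks:
            if node in block_locations[task]:
                return task          # data-local assignment: no network transfer
        return pending_tasks[0] if pending_tasks else None  # fall back to remote

    block_locations = {"t1": {"node_a"}, "t2": {"node_b"}, "t3": {"node_a", "node_b"}}
    pending = ["t1", "t2", "t3"]
    print(pick_task("node_b", pending, block_locations))  # → t2 (local to node_b)
    ```

    A real scheduler must also weigh fairness, deadlines, and (in the hybrid CPU-GPU setting the dissertation targets) per-device energy cost, so locality is one term in a larger assignment decision rather than the whole policy.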