995 research outputs found

    TV-L1 Planarity Regularization for 3D Shape Approximation

    Get PDF
    The modern emergence of automation in many industries has given impetus to extensive research into mobile robotics. Novel perception technologies now enable cars to drive autonomously, tractors to till fields automatically and underwater robots to construct pipelines. An essential requirement for both perception and autonomous navigation is the analysis of the 3D environment using sensors such as laser scanners or stereo cameras. 3D sensors generate a very large number of 3D data points when sampling object shapes within an environment, but crucially they do not provide any intrinsic information about the environment in which the robots operate. This work focuses on the fundamental task of 3D shape reconstruction and modelling from 3D point clouds. The novelty lies in representing surfaces by algebraic functions with limited support, which enables the extraction of smooth, consistent implicit shapes from noisy samples with heterogeneous density. Minimizing the total variation of the second differential degree makes it possible to enforce the planar surfaces that often occur in man-made environments. Applying the new technique means that less accurate, low-cost 3D sensors can be employed without sacrificing 3D shape reconstruction accuracy.
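The planarity prior described above admits a standard schematic form. As an illustration only (the thesis's exact functional and notation may differ), a TV-L1 energy with a second-order regularizer for a surface or depth function u fitted to noisy samples f can be written as:

```latex
E(u) \;=\; \underbrace{\int_{\Omega} \lvert u - f \rvert \, dx}_{\text{robust L1 data term}}
\;+\; \lambda \underbrace{\int_{\Omega} \bigl\lVert \nabla^{2} u \bigr\rVert_{1} \, dx}_{\text{second-order total variation}}
```

Because the regularizer penalizes second derivatives, affine (planar) pieces of u incur zero cost, so minimizers favor piecewise-planar surfaces, while the L1 data term keeps the fit robust to outlier samples.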

    Computing server power modeling in a data center: survey, taxonomy and performance evaluation

    Full text link
    Data centers are large-scale, energy-hungry infrastructure serving the increasing computational demands of a world that is becoming more connected through smart cities. The emergence of advanced technologies such as cloud-based services, the internet of things (IoT) and big data analytics has augmented the growth of global data centers, leading to high energy consumption. This upsurge in data center energy consumption not only incurs surging operational and maintenance costs but also has an adverse effect on the environment. Dynamic power management in a data center environment requires cognizance of the correlation between system- and hardware-level performance counters and power consumption. Power consumption modeling captures this correlation and is crucial for designing energy-efficient optimization strategies based on resource utilization. Several power models have been proposed and used in the literature; however, these models have been evaluated using different benchmarking applications, power measurement techniques and error calculation formulas on different machines. In this work, we present a taxonomy and evaluation of 24 software-based power models using a unified environment, benchmarking applications, power measurement technique and error formula, with the aim of achieving an objective comparison. We use different server architectures to assess the impact of heterogeneity on the models' comparison. The performance analysis of these models is elaborated in the paper.
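As a concrete illustration of the simplest family such a taxonomy covers, here is a hedged sketch of a linear, utilization-based software power model fitted by least squares and scored with a typical percentage-error formula; the function names and synthetic measurements are invented for illustration and are not from the paper:

```python
import numpy as np

def fit_linear_power_model(util, power):
    """Least-squares fit of P = p_idle + k * util (a common baseline model)."""
    A = np.column_stack([np.ones_like(util), util])
    coeffs, *_ = np.linalg.lstsq(A, power, rcond=None)
    return coeffs  # [p_idle, k]

def predict_power(coeffs, util):
    return coeffs[0] + coeffs[1] * util

def mape(actual, predicted):
    """Mean absolute percentage error, one typical unified error formula."""
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

# Synthetic server measurements: 100 W idle, 150 W dynamic range.
util = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
power = np.array([100.0, 137.5, 175.0, 212.5, 250.0])
coeffs = fit_linear_power_model(util, power)
print(predict_power(coeffs, 0.6))        # ≈ 190.0 W
print(mape(power, predict_power(coeffs, util)))  # ~0 on this exact data
```

Real models in the survey's taxonomy range from this single-counter linear form to multi-counter and nonlinear variants; evaluating them all with one error formula on one machine is what makes the comparison objective.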

    DAPHNE: An Open and Extensible System Infrastructure for Integrated Data Analysis Pipelines

    Get PDF
    Integrated data analysis (IDA) pipelines, which combine data management (DM) and query processing, high-performance computing (HPC), and machine learning (ML) training and scoring, are becoming increasingly common in practice. Interestingly, systems in these areas share many compilation and runtime techniques, and the increasingly heterogeneous hardware infrastructure they use is converging as well. Yet the programming paradigms, cluster resource management, data formats and representations, and execution strategies differ substantially. DAPHNE is an open and extensible system infrastructure for such IDA pipelines, including language abstractions, compilation and runtime techniques, multi-level scheduling, hardware (HW) accelerators, and computational storage for increasing productivity and eliminating unnecessary overheads. In this paper, we make a case for IDA pipelines, describe the overall DAPHNE system architecture, its key components, and the design of a vectorized execution engine for computational storage, HW accelerators, and local and distributed operations. Preliminary experiments comparing DAPHNE with MonetDB, Pandas, DuckDB, and TensorFlow show promising results.
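DAPHNE's actual engine and its DaphneDSL are far richer, but the core idea of a vectorized execution engine, running a fused chain of operators over tiles of the input so intermediates stay small and tiles can be dispatched to different workers or devices, can be sketched in plain NumPy; the function names and the toy pipeline below are illustrative assumptions, not DAPHNE's API:

```python
import numpy as np

def vectorized_execute(data, pipeline, tile_rows=2):
    """Run a fused operator pipeline over row tiles of the input.

    Tiling bounds the size of intermediates; in a real engine the
    per-tile work would be scheduled across workers or accelerators.
    """
    tiles = []
    for start in range(0, len(data), tile_rows):
        tile = data[start:start + tile_rows]
        for op in pipeline:          # fused operator chain per tile
            tile = op(tile)
        tiles.append(tile)
    return np.concatenate(tiles)

# A miniature IDA pipeline mixing the paper's worlds:
pipeline = [
    lambda t: t[t[:, 0] > 0],        # DM-style selection predicate
    lambda t: t / 10.0,              # ML-style feature scaling
]
data = np.array([[-1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
result = vectorized_execute(data, pipeline)   # shape (3, 2)
```

The design point this illustrates is that query-style and ML-style operators share one tiled execution loop instead of materializing full intermediates between separate systems.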

    ๋น„์ •ํ˜•๋ฐ์ดํ„ฐ๊ฐ€ ์žˆ๋Š” ์ œํ•œ์ ์ธ ์ƒํ’ˆ์ •๋ณด ์ œ๊ณตํ™˜๊ฒฝ์—์„œ์˜ ๊ฒ€์ƒ‰๊ณผ ๊ตฌ๋งค ํ–‰๋™์— ๊ด€ํ•œ ์—ฐ๊ตฌ

    Get PDF
    Master's thesis -- Seoul National University Graduate School: Department of Business Administration, College of Business Administration, August 2020. Advisor: 송인성.
    I develop an empirical model of search and choice in which consumers are presented with limited product information prior to search. In the model, consumers search and click on items listed on product listing pages. After clicking through, they expect to view vertical as well as horizontal attribute values that cannot be observed on the listing pages (i.e., costly attribute values). Vertical costly attributes include quantified review scores for several product attributes, which reflect actual users' satisfaction with those attributes. This paper makes the following contributions to the literature. First, the model reflects consumers' higher uncertainty about their utility prior to search, which can be reduced by obtaining information about costly attribute values; this is in line with the consumer learning literature. Second, the model also reflects consumers' heteroskedastic uncertainty about their utility while searching for products, without sacrificing the model's parsimony. Third, the paper uses a deep learning method to extract structured features from reviews. The model is applied to aggregate search and choice data for Chrome OS laptops at Bestbuy.com. It yields realistic parameter estimates and a better in-sample fit than Kim et al. (2016). With the estimated model parameters, I conduct a counterfactual experiment showing how consumer search set size and manufacturer market share and revenue change in a full-information environment. In the full-information environment, consumers reduce their search set size by 3.9% and choose almost the same products as they do in the limited-information environment, which increases consumer surplus by 3.19%. For most producers, market share and revenue increase. Furthermore, brands that rank relatively low in total rating but high in average review score show a relatively larger increase. I therefore suggest that manufacturers post quantified review scores for each attribute on product listing pages in order to boost sales and revenue, especially when their total rating is relatively low.
    Contents: 1 Introduction; 2 Data (2-1 Details of Search and Choice Data; 2-2 Data Summary; 2-3 Review Feature Extraction; 2-3-1 Convolutional Neural Network for Extracting Features); 3 Empirical Settings (3-1 Product Information Environment; 3-2 Model-free Evidence); 4 Model (4-1 Utility and Empirical Specification; 4-2 Optimal Sequential Search: Reservation Utility; 4-3 Search and Choice Probabilities); 5 Estimation and Identification Strategy (5-1 Pre-estimation; 5-2 Main Model Estimation; 5-3 Identification); 6 Results; 7 Counterfactual Experiment; 8 Conclusion; References; Appendix
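The reservation utilities behind such sequential-search models follow the Weitzman tradition: a consumer clicks another option only while its reservation value z exceeds the best utility found so far, where z equates the expected gain from one more search with the search cost, i.e. cost = E[max(X − z, 0)]. A hedged numerical sketch, assuming a normal payoff distribution (the thesis's empirical specification may differ):

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_gain(z, mu, sigma):
    """E[max(X - z, 0)] for X ~ N(mu, sigma^2), in closed form."""
    a = (z - mu) / sigma
    return (mu - z) * (1.0 - norm_cdf(a)) + sigma * norm_pdf(a)

def reservation_value(cost, mu=0.0, sigma=1.0, lo=-10.0, hi=10.0):
    """Solve cost = E[max(X - z, 0)] for z by bisection.

    expected_gain is strictly decreasing in z, so bisection converges.
    """
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_gain(mid, mu, sigma) > cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

z = reservation_value(cost=0.1)   # higher search cost -> lower z
```

Lower search costs raise z, so consumers search more; a full-information setting effectively removes the need to search at all, which is the mechanism behind the counterfactual's smaller search sets.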

    Data Quality Over Quantity: Pitfalls and Guidelines for Process Analytics

    Full text link
    A significant portion of the effort involved in advanced process control, process analytics, and machine learning involves acquiring and preparing data. The literature often emphasizes increasingly complex modelling techniques with incremental performance improvements; however, when industrial case studies are published, they often lack important details on data acquisition and preparation. Although data pre-processing is unfairly maligned as trivial and technically uninteresting, in practice it has an outsized influence on the success of real-world artificial intelligence applications. This work describes best practices for acquiring and preparing operating data to pursue data-driven modelling and control opportunities in industrial processes. We present practical considerations for pre-processing industrial time series data to inform the efficient development of reliable soft sensors that provide valuable process insights.
    Comment: This work has been accepted to the 22nd IFAC World Congress 202
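In that spirit, a minimal sketch of typical pre-processing for an industrial sensor series: range-check implausible readings, fill short gaps, then smooth the noise before any modelling. The thresholds and data are invented for illustration and are not taken from the paper:

```python
import numpy as np

def preprocess(signal, lo, hi, window=3):
    """Basic soft-sensor data preparation: range check, gap fill, smooth."""
    x = np.asarray(signal, dtype=float)
    x[(x < lo) | (x > hi)] = np.nan      # flag physically impossible readings
    # forward-fill short gaps (assumes the first sample is valid)
    for i in range(1, len(x)):
        if np.isnan(x[i]):
            x[i] = x[i - 1]
    # moving-average smoothing as a simple noise filter
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="valid")

raw = [20.1, 20.3, 999.0, 20.2, 20.4, 20.5]   # 999.0: a sensor glitch
clean = preprocess(raw, lo=0.0, hi=100.0)     # glitch removed, series smoothed
```

Each step here is a judgment call in practice (what counts as implausible, how long a gap may be filled, how much smoothing the downstream model tolerates), which is exactly why the paper argues these details deserve reporting.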

    Simulation of the performance of complex data-intensive workflows

    Get PDF
    PhD thesis. Recently, cloud computing has been used for analytical and data-intensive processes, as it offers many attractive features, including resource pooling, on-demand capability and rapid elasticity. Scientific workflows use these features to tackle the problems of complex data-intensive applications. Data-intensive workflows are composed of many tasks that may involve large input data sets and produce large amounts of data as output, and they typically run in highly dynamic environments. Resources should therefore be allocated dynamically as the workflow's demands change, since over-provisioning increases cost and under-provisioning causes Service Level Agreement (SLA) violations and poor Quality of Service (QoS). Performance prediction of complex workflows is a necessary step prior to deployment. Performance analysis of complex data-intensive workflows is challenging because of the complexity of their structure, the diversity of big data, and data dependencies, in addition to the need to examine the performance and challenges of running such workflows in a real cloud. In this thesis, a solution to these challenges is explored using a Next Generation Sequencing (NGS) workflow pipeline as a case study, which may require hundreds or thousands of CPU hours to process a terabyte of data. We propose a methodology to model, simulate and predict the runtime and the number of resources used by complex data-intensive workflows. One contribution of our simulation methodology is that it can extract the simulation parameters (e.g., MIPS and bandwidth values) required for constructing a training set, and it gives a fairly accurate prediction of the runtime for cluster sizes much larger than those used in training the prediction model. The proposed methodology permits runtime prediction based on historical data from provenance files.
    We present runtime prediction for the complex workflow under different execution scenarios in the cloud, such as execution failure and library deployment time. In the case of failure, the framework can apply the prediction partially, considering only the successful parts of the pipeline; in the other case, the framework can predict with or without the time to deploy libraries. To further improve the accuracy of prediction, we propose a simulation model that handles I/O contention.
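The extrapolation idea (train on runtimes measured or simulated at small cluster sizes, then predict much larger clusters) can be sketched as follows. The power-law scaling form and the synthetic, ideally-scaling runtimes are assumptions for illustration, not the thesis's actual model:

```python
import numpy as np

def fit_power_law(sizes, runtimes):
    """Fit T = a * n**b by linear regression in log-log space."""
    b, log_a = np.polyfit(np.log(sizes), np.log(runtimes), 1)
    return np.exp(log_a), b

def predict_runtime(a, b, n):
    return a * n ** b

# Training set from small clusters (here: perfect strong scaling, b = -1).
sizes = np.array([2, 4, 8, 16])
runtimes = np.array([800.0, 400.0, 200.0, 100.0])   # seconds
a, b = fit_power_law(sizes, runtimes)
print(predict_runtime(a, b, 64))   # extrapolate to a much larger cluster
```

Real pipelines rarely scale this cleanly, which is why the thesis builds the training set from simulation parameters extracted from provenance data and validates the extrapolation rather than assuming a fixed scaling law.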