Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Authors: Alam Mansaf; Ali Syed Arshad; Khan Samiya; Liu Xiufeng
Publication date: 01/01/2019

Developing big data systems is challenging because of the variety of application areas and domains that the technology promises to serve. Fundamental design decisions typically include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. For storage, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfill its requirements. However, every big data application has different data characteristics, so its data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models: document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider when making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.

arXiv.org e-Print Archive

Online Research Database In Technology

Taming Numbers and Durations in the Model Checking Integrated Planning System

Author: Edelkamp S.
Publication venue: 'AI Access Foundation'
Publication date: 30/06/2011

The Model Checking Integrated Planning System (MIPS) is a temporal least commitment heuristic search planner based on a flexible object-oriented workbench architecture. Its design clearly separates explicit and symbolic directed exploration algorithms from the set of on-line and off-line computed estimates and associated data structures. MIPS has shown distinguished performance in the last two international planning competitions. In the last event the description language was extended from pure propositional planning to include numerical state variables, action durations, and plan quality objective functions. Plans were no longer sequences of actions but time-stamped schedules. As a participant of the fully automated track of the competition, MIPS has proven to be a general system; in each track and every benchmark domain it efficiently computed plans of remarkable quality. This article introduces and analyzes the most important algorithmic novelties that were necessary to tackle the new layers of expressiveness in the benchmark problems and to achieve a high level of performance. The extensions include critical path analysis of sequentially generated plans to generate corresponding optimal parallel plans. The linear time algorithm to compute the parallel plan bypasses known NP hardness results for partial ordering by scheduling plans with respect to the set of actions and the imposed precedence relations. The efficiency of this algorithm also allows us to improve the exploration guidance: for each encountered planning state the corresponding approximate sequential plan is scheduled. One major strength of MIPS is its static analysis phase that grounds and simplifies parameterized predicates, functions and operators, that infers knowledge to minimize the state description length, and that detects domain object symmetries. The latter aspect is analyzed in detail. 
MIPS has been developed to serve as a complete and optimal state space planner, with admissible estimates, exploration engines and branching cuts. In the competition version, however, certain performance compromises had to be made, including floating point arithmetic, weighted heuristic search exploration according to an inadmissible estimate, and parameterized optimization.
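
The critical path step described in this abstract can be illustrated with a small sketch. It is not the MIPS implementation: it assumes the precedence relations are already given as index pairs over the sequential plan, and the names `schedule_parallel` and `makespan` are hypothetical. Because the sequential plan is already a valid ordering of the actions, a single forward pass can assign each action its earliest start time, in time linear in the number of actions plus precedence pairs.

```python
from typing import Dict, List, Tuple


def schedule_parallel(
    durations: List[float], precedence: List[Tuple[int, int]]
) -> List[float]:
    """Return the earliest start time of each action in a sequential plan.

    Assumption: a pair (i, j) in `precedence` means action j may only start
    after action i has finished, and i comes before j in the plan.
    """
    predecessors: Dict[int, List[int]] = {}
    for i, j in precedence:
        if not 0 <= i < j < len(durations):
            raise ValueError(f"invalid precedence pair ({i}, {j})")
        predecessors.setdefault(j, []).append(i)

    start = [0.0] * len(durations)
    # The plan order is a topological order, so one forward pass suffices.
    for j in range(len(durations)):
        start[j] = max(
            (start[i] + durations[i] for i in predecessors.get(j, [])),
            default=0.0,
        )
    return start


def makespan(durations: List[float], start: List[float]) -> float:
    """Finish time of the last action in the parallel schedule."""
    return max((s + d for s, d in zip(start, durations)), default=0.0)


# Toy plan: actions 0 and 1 are independent, 2 needs both, 3 needs 2.
durations = [3.0, 2.0, 4.0, 1.0]
precedence = [(0, 2), (1, 2), (2, 3)]
start = schedule_parallel(durations, precedence)
print(start)                       # [0.0, 0.0, 3.0, 7.0]
print(makespan(durations, start))  # 8.0, versus 10.0 when run sequentially
```

In this toy plan the first two actions overlap, so the schedule finishes two time units earlier than the sequential plan.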

arXiv.org e-Print Archive

Crossref

Physical Representation-based Predicate Optimization for a Visual Analytics Database

Authors: Anderson Michael R.; Cafarella Michael; Ros German; Wenisch Thomas F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/02/2019

Querying the content of images, video, and other non-textual data sources requires expensive content extraction methods. Modern extraction techniques are based on deep convolutional neural networks (CNNs) and can classify objects within images with astounding accuracy. Unfortunately, these methods are slow: processing a single image can take about 10 milliseconds on modern GPU-based hardware. As massive video libraries become ubiquitous, running a content-based query over millions of video frames is prohibitive. One promising approach to reduce the runtime cost of queries of visual content is to use a hierarchical model, such as a cascade, where simple cases are handled by an inexpensive classifier. Prior work has sought to design cascades that optimize the computational cost of inference by, for example, using smaller CNNs. However, we observe that there are critical factors besides the inference time that dramatically impact the overall query time. Notably, by treating the physical representation of the input image as part of our query optimization (that is, by including image transforms, such as resolution scaling or color-depth reduction, within the cascade), we can optimize data handling costs and enable drastically more efficient classifier cascades. In this paper, we propose Tahoma, which generates and evaluates many potential classifier cascades that jointly optimize the CNN architecture and input data representation. Our experiments on a subset of ImageNet show that Tahoma's input transformations speed up cascades by up to 35 times. We also find up to a 98x speedup over the ResNet50 classifier with no loss in accuracy, and a 280x speedup if some accuracy is sacrificed. Comment: Camera-ready version of the paper submitted to ICDE 2019, in Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE 2019).
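
The cascade idea in this abstract can be sketched as follows. This is a toy illustration, not Tahoma: the `Stage` type, the `run_cascade` function, the brightness-based classifier, and the confidence thresholds are all hypothetical stand-ins. Each stage pairs an input transform (here, crude downsampling) with a classifier, and an image only falls through to the next, more expensive stage when the current stage is not confident enough.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]          # toy stand-in: flat pixel intensities in [0, 1]
Prediction = Tuple[bool, float]  # (label, confidence in [0, 1])


@dataclass
class Stage:
    transform: Callable[[Image], Image]     # physical representation change
    classify: Callable[[Image], Prediction]
    threshold: float                        # accept if confidence >= threshold


def run_cascade(stages: List[Stage], image: Image) -> Tuple[bool, int]:
    """Return (label, index of the stage that produced the answer)."""
    if not stages:
        raise ValueError("cascade needs at least one stage")
    for index, stage in enumerate(stages[:-1]):
        label, confidence = stage.classify(stage.transform(image))
        if confidence >= stage.threshold:
            return label, index
    # The last stage always answers, whatever its confidence.
    last = stages[-1]
    label, _ = last.classify(last.transform(image))
    return label, len(stages) - 1


def downsample(image: Image) -> Image:
    return image[::4]  # keep every fourth pixel: cheaper to load and classify


def identity(image: Image) -> Image:
    return image


def brightness_classifier(image: Image) -> Prediction:
    # Hypothetical classifier: "bright" if mean intensity exceeds 0.5.
    mean = sum(image) / len(image)
    return mean > 0.5, min(1.0, abs(mean - 0.5) * 2)


cascade = [
    Stage(downsample, brightness_classifier, threshold=0.6),
    Stage(identity, brightness_classifier, threshold=0.0),
]
print(run_cascade(cascade, [0.9] * 16))   # clear case, answered by stage 0
print(run_cascade(cascade, [0.55] * 16))  # ambiguous case, falls through to stage 1
```

The design point is that the transform belongs to the stage: a cheaper representation reduces data handling cost as well as inference cost for the easy cases.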

arXiv.org e-Print Archive

Crossref

Pemilihan kerjaya di kalangan pelajar aliran perdagangan sekolah menengah teknik: satu kajian kes (Career choice among commerce-stream students at technical secondary schools: a case study)

Author: Minhat Norhasyila
Publication date: 01/03/2004

This research is a survey to determine the career choices of form four students in the commerce stream. Three aspects of career choice are examined: information about careers, types of career, and the factors that most influence students in choosing a career. The study was conducted at Sekolah Menengah Teknik Kajang, Selangor Darul Ehsan. Thirty-six form four students were selected as respondents using non-random purposive sampling. All information was gathered using a questionnaire. The data were analyzed in terms of frequency, percentage, and mean, and the results are presented in tables and graphs. The findings show that career information has helped improve students' career choices, and that mass media is the main factor influencing students in choosing their career.

UTHM Institutional Repository

Carbon Dynamics and Land-Use Choices: Building a Regional-Scale Multidisciplinary Model

Authors: Alexander S. P. Pfaff; R. Flint Hughes; Shuguang Liu; Suzi Kerr

Policy enabling tropical forests to approach their potential contribution to global-climate-change mitigation requires forecasts of land use and carbon storage on a large scale over long periods. In this paper, we present an integrated modeling methodology that addresses these needs. We model the dynamics of the human land-use system and of C pools contained in each ecosystem, as well as their interactions. The model is national scale, and is currently applied in a preliminary way to Costa Rica using data spanning a period of over fifty years. It combines an ecological process model, parameterized using field and other data, with an economic model, estimated using historical data to ensure a close link to actual behavior. These two models are linked so that ecological conditions affect land-use choices and vice versa. The integrated model predicts land use and its consequences for C storage for policy scenarios. These predictions can be used to create baselines, reward sequestration, and estimate the value in both environmental and economic terms of including C sequestration in tropical forests as part of the efforts to mitigate global climate change. The model can also be used to assess the benefits from costly activities to increase accuracy and thus reduce errors and their societal costs. Keywords: carbon, sequestration, climate change, land use, modelling

Research Papers in Economics

Carbon Dynamics and Land-use Choices: Building a Regional-scale Multidisciplinary Model

Authors: Alexander Pfaff; Flint Hughes; Shuguang Liu; Suzi Kerr

Research Papers in Economics