
211 research outputs found

The MapReduce Model on Cascading Platform for Frequent Itemset Mining

Author: Nursanti Amelia
Rokhman Nur
Publication venue: Universitas Gadjah Mada
Publication date: 31/07/2018

The implementation of parallel algorithms has recently become an active research topic. Parallelism is well suited to large-scale data processing. MapReduce is one of the parallel and distributed programming models. Implementing parallel programs, however, faces many difficulties. Cascading provides a simpler scheme on top of the Hadoop system, which implements the MapReduce model. Frequent itemsets are the sets of objects that appear most often in a dataset. Frequent Itemset Mining (FIM) requires complex computation and becomes a complicated problem when applied to large-scale data. This paper discusses the implementation of the MapReduce model on Cascading for FIM. The experiment uses the Amazon product co-purchasing network metadata dataset. The experiment shows that the simple mechanism of Cascading can be used to solve the FIM problem. It gives time complexity O(n), more efficient than the nonparallel approach, which has complexity O(n^2/m).
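The map and reduce roles in frequent itemset counting can be illustrated with a minimal single-process sketch. This is not the paper's Cascading implementation: the function names (`map_phase`, `shuffle`, `reduce_phase`, `frequent_itemsets`) and the toy transactions are assumptions made for illustration, and the shuffle step that Hadoop would perform across machines is simulated here with a dictionary.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, k):
    """Emit (itemset, 1) for every k-item subset of one transaction."""
    for itemset in combinations(sorted(set(transaction)), k):
        yield itemset, 1


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, min_support):
    """Sum the counts per itemset and keep those that meet the threshold."""
    totals = {key: sum(values) for key, values in groups.items()}
    return {key: count for key, count in totals.items() if count >= min_support}


def frequent_itemsets(transactions, k, min_support):
    """Run map, shuffle and reduce in sequence over all transactions."""
    emitted = (pair for t in transactions for pair in map_phase(t, k))
    return reduce_phase(shuffle(emitted), min_support)


# Hypothetical toy data, not the Amazon co-purchasing dataset.
transactions = [
    ["book", "dvd", "cd"],
    ["book", "dvd"],
    ["book", "cd"],
    ["dvd", "cd", "book"],
]
result = frequent_itemsets(transactions, k=2, min_support=3)
print(result)
```

Each transaction is mapped independently of the others, which is the property that would allow the map step to be distributed across workers.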

Crossref

IJCCS (Indonesian Journal of Computing and Cybernetics Systems)

Corporate Smart Content Evaluation

Author: Einhaus Johannes
Hasan Ahmad
La Fleur Alexandra
Paschke Adrian
Schäfermeier Ralph
Todor Alexandru-Aurelian
Publication date: 01/01/2016

Nowadays, a wide range of information sources is available due to the evolution of the web and the collection of data. Much of this information is consumable and usable by humans but not understandable or processable by machines. Some data may be directly accessible in web pages or via data feeds, but most of the meaningful existing data is hidden within deep web databases and enterprise information systems. Besides the inability to access a wide range of data, manual processing by humans is laborious, error-prone and no longer up to date. Semantic web technologies deliver capabilities for machine-readable, exchangeable content and metadata for automatic processing of content. The enrichment of heterogeneous data with background knowledge described in ontologies promotes reusability and supports automatic processing of data. The establishment of “Corporate Smart Content” (CSC), semantically enriched data with high information content and sufficient benefits in economic areas, is the main focus of this study. We describe three current research areas in the field of CSC concerning scenarios and datasets applicable for corporate applications, algorithms and research. Aspect-oriented Ontology Development advances modular ontology development and partial reuse of existing ontological knowledge. Complex Entity Recognition enhances traditional entity recognition techniques to recognize clusters of related textual information about entities. Semantic Pattern Mining combines semantic web technologies with pattern learning to mine for complex models by attaching background knowledge. This study introduces the aforementioned topics by analyzing applicable scenarios with economic and industrial focus, as well as research emphasis. Furthermore, a collection of existing datasets for the given areas of interest is presented and evaluated.
The target audience includes researchers and developers of CSC technologies: people interested in semantic web features, ontology development, automation, and extracting and mining valuable information in corporate environments. The aim of this study is to provide a comprehensive and broad overview of the three topics, and to assist with decision making in interesting scenarios and with choosing practical datasets for evaluating custom problem statements. Detailed descriptions of the attributes and metadata of the datasets should serve as a starting point for individual ideas and approaches.

Institutional Repository of the Freie Universität Berlin

Fraunhofer-ePrints

A Survey on Index Support for Item Set Mining

Author: Dr.P K Singhal
Senthil Prakash.T
Publication venue: Global Journals Inc. (US)
Publication date: 30/06/2011

It is very difficult to handle the huge amount of information stored in modern databases. Association rule mining is currently used to manage these databases, but it is a costly process that involves a significant amount of time and memory. Therefore, it is necessary to develop an approach to overcome these difficulties. Suitable data structures and algorithms must be developed to perform item set mining effectively. An index includes all the characteristics potentially needed during the mining task, so the extraction can be executed with the help of the index, without accessing the database. A database index is a data structure that speeds up information retrieval operations on a database table at very low cost, in exchange for increased storage space. Using the index permits user interaction, in which the user can specify different attributes for item set extraction. The index also supports reuse: item sets can be mined with any support threshold. This paper surveys the index-support approaches for item set mining proposed by various authors.
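The idea of answering support queries from an index alone can be illustrated with a minimal sketch. It assumes a simple vertical index that maps each item to the set of transaction IDs containing it; this is one common structure and not necessarily any of the index designs covered by the survey. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def build_index(transactions):
    """Scan the database once and map each item to its transaction IDs."""
    index = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def support(index, itemset):
    """Count transactions containing every item, using only the index."""
    tid_sets = [index.get(item, set()) for item in itemset]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


def frequent_pairs(index, min_support):
    """Mine frequent pairs for any threshold without rescanning the data."""
    found = {}
    for pair in combinations(sorted(index), 2):
        count = support(index, pair)
        if count >= min_support:
            found[pair] = count
    return found


# Hypothetical toy database of four transactions.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
index = build_index(transactions)
print(frequent_pairs(index, min_support=2))
print(frequent_pairs(index, min_support=3))
```

Once `index` is built, the same structure serves both thresholds, which mirrors the reuse property described in the abstract; the price is the extra storage for the transaction ID sets.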

Global Journal of Computer Science and Technology (GJCST)

Collaborative Planning and Event Monitoring Over Supply Chain Network

Author: Ray Sujoy
Publication date: 06/04/2017

The shifting paradigm of supply chain management shows an increasing reliance on automated collaborative planning and event monitoring through information-bounded interaction across organizations. End-to-end support for the course of actions is becoming vital for faster incident response and proactive decision making. Many current platforms are limited in handling supply chain planning and monitoring in a decentralized setting, where participants may divide their responsibilities and share the computational load of solution generation. In this thesis, we investigate modeling and solution generation techniques for shared commodity delivery planning and event monitoring problems in a collaborative setting. In particular, we first elaborate a new model of the Multi-Depot Vehicle Routing Problem (MDVRP) to jointly serve customer demands using multiple vehicles, followed by a heuristic technique to search for near-optimal solutions to such problem instances. Secondly, we propose two distributed mechanisms, namely Passive Learning and Active Negotiation, to find near-optimal MDVRP solutions while executing the heuristic algorithm at the participant's side. Thirdly, we illustrate a collaboration mechanism to cost-effectively deploy execution monitors over a supply chain network in order to collect in-field plan execution data. Finally, we describe a distributed approach to collaboratively monitor associations among recent events from an incoming stream of plan execution data. Experimental results over known datasets demonstrate the efficiency of the approaches in handling medium and large problem instances. The work has also produced considerable knowledge on collaborative transportation planning and execution event monitoring.

Concordia University Research Repository

A Survey on Behavioral Pattern Mining from Sensor Data in Internet of Things

Author: Bhuiyan Md Zakirul
Hassan Mohammad
Kamruzzaman Joarder
Rashid Md Mamunur
Shahriar Shafin Sakib
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2020

The deployment of large-scale wireless sensor networks (WSNs) for the Internet of Things (IoT) applications is increasing day-by-day, especially with the emergence of smart city services. The sensor data streams generated from these applications are largely dynamic, heterogeneous, and often geographically distributed over large areas. For high-value use in business, industry and services, these data streams must be mined to extract insightful knowledge, such as about monitoring (e.g., discovering certain behaviors over a deployed area) or network diagnostics (e.g., predicting faulty sensor nodes). However, due to the inherent constraints of sensor networks and application requirements, traditional data mining techniques cannot be directly used to mine IoT data streams efficiently and accurately in real-time. In the last decade, a number of works have been reported in the literature proposing behavioral pattern mining algorithms for sensor networks. This paper presents the technical challenges that need to be considered for mining sensor data. It then provides a thorough review of the mining techniques proposed in the recent literature to mine behavioral patterns from sensor data in IoT, and their characteristics and differences are highlighted and compared. We also propose a behavioral pattern mining framework for IoT and discuss possible future research directions in this area. © 2013 IEEE

Federation ResearchOnline

aCQUIRe

A Novel Frequent Pattern Mining Algorithm for Evaluating Applicability of a Mobile Learning Framework

Author: D.D.M. Dolawattha
H. K. Salinda Premadasa
Publication venue: Global Journals Inc. (US)
Publication date: 28/10/2023

The applicability of a mobile learning system reflects how it works in an actual situation under diverse conditions. In previous studies, research that evaluates applicability in learning systems using data mining approaches is hard to find. The main objective of this study is to evaluate the applicability of the proposed mobile learning framework. This framework consists of seven independent variables and their influencing factors. Initially, 1000 students and teachers were allowed to use the mobile learning system developed based on the proposed mobile learning framework. The authors implemented the system using the Moodle mobile learning environment and used its transaction log file for evaluation. Transactional records generated by various user activities with the facilities integrated into the system were extracted. These activities were classified under eight different features, i.e., chat, forum, quiz, assignment, book, video, game, and app usage, in thousand transactional rows. A novel pattern mining algorithm, namely Binary Total for Pattern Mining (BTPM), was developed using the above transactional dataset's binary incidence matrix format to test the system applicability. Similarly, the Apriori frequent itemsets mining and Frequent Pattern (FP) Growth mining algorithms were applied to the same dataset to predict system applicabilit
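The binary incidence matrix format mentioned in the abstract can be illustrated with a minimal sketch. It is not the BTPM algorithm, whose details the abstract does not give; it only shows how a transaction log over the eight named features could be encoded as 0/1 rows and how supports could be read off that encoding. The function names, the short label `app` for app usage, and the four-row log are assumptions.

```python
# The eight activity features named in the abstract ("app" stands for app usage).
FEATURES = ["chat", "forum", "quiz", "assignment", "book", "video", "game", "app"]


def to_incidence_matrix(transactions, features=FEATURES):
    """Encode each transaction as a 0/1 row, one column per feature."""
    return [[1 if f in t else 0 for f in features] for t in transactions]


def column_totals(matrix):
    """Sum each column, giving the support of every single feature."""
    return [sum(column) for column in zip(*matrix)]


def itemset_support(matrix, itemset, features=FEATURES):
    """Count rows in which every feature of the itemset is set to 1."""
    columns = [features.index(f) for f in itemset]
    return sum(1 for row in matrix if all(row[c] for c in columns))


# Hypothetical log rows, not data from the study.
log = [
    {"chat", "quiz"},
    {"quiz", "video"},
    {"chat", "quiz", "video"},
    {"forum"},
]
matrix = to_incidence_matrix(log)
totals = column_totals(matrix)
print(totals)
print(itemset_support(matrix, ["chat", "quiz"]))
```

Apriori and FP-Growth would operate on the same transactions; the matrix form simply makes per-feature and per-itemset counts a matter of column arithmetic.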

Global Journal of Computer Science and Technology (GJCST)

Engaging Mainstream Media for Efficient Content Distribution and Creation

Author: Lobzhanidze Aleksandre
Publication venue: University of Missouri--Columbia

University of Missouri: MOspace