272 research outputs found

    WEIDJ: Development of a new algorithm for semi-structured web data extraction

    In the era of industrial digitalization, people are increasingly investing in solutions that support data collection, data analysis and performance improvement. This paper considers advancing web-scale knowledge extraction and alignment by integrating a few sources and exploring different methods of aggregation and attention, with a focus on image information. The main aim of semi-structured data extraction is to retrieve useful information from the web. Data in the deep web is retrievable, but only through form submission, because search engines cannot reach it. As HTML documents grow larger, data extraction has been plagued by lengthy processing times. In this work, we propose an improved model, wrapper extraction of image using document object model (DOM) and JavaScript object notation (JSON) data (WEIDJ), in response to the promising results of mining a higher volume of images from various formats. To assess the efficiency of WEIDJ, we compare its data extraction performance at different levels of page extraction with VIBS, MDR, DEPTA and VIDE. WEIDJ yielded the best results, with a precision of 100, a recall of 97.93103 and an F-measure of 98.9547.
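
    The reported F-measure can be checked against the reported precision and recall. The sketch below assumes the standard balanced F-measure (the harmonic mean of precision and recall) and assumes the three figures are percentages; the abstract states neither explicitly, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Both inputs must be on the same scale (here assumed to be percentages).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract (scale assumed to be percent).
reported_precision = 100.0
reported_recall = 97.93103

f_score = f_measure(reported_precision, reported_recall)
print(round(f_score, 4))
```

    Under these assumptions the result agrees with the reported F-measure of 98.9547 to four decimal places.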

    A performance of comparative study for semi-structured web data extraction model

    Extracting information from multiple web sources is an essential yet complicated step for data analysis in multiple domains. In this paper, we present a data extraction model based on visual segmentation, the DOM tree and a JSON approach, known as Wrapper Extraction of Image using DOM and JSON (WEIDJ), for extracting semi-structured data from biodiversity websites. A large amount of image information from multiple web sources is extracted using three different approaches: Document Object Model (DOM), Wrapper image using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity websites. The experimental results show that the WEIDJ approach gives promising results with respect to time analysis values. The WEIDJ wrapper successfully extracted more than 100 images from over 15 different biodiversity websites.
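
    To make the general idea of pulling image information out of HTML and emitting it as JSON concrete, here is a minimal sketch. It is not the WEIDJ wrapper: it omits visual segmentation, and it uses Python's standard-library event-based `HTMLParser` as a stand-in for a real DOM traversal. The class name, function name and sample markup are all hypothetical.

```python
import json
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src and alt attributes of every <img> element seen."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            attr = dict(attrs)
            if attr.get("src"):
                self.images.append(
                    {"src": attr["src"], "alt": attr.get("alt") or ""}
                )


def extract_images_as_json(html: str) -> str:
    """Returns a JSON array describing the images found in the markup."""
    collector = ImageCollector()
    collector.feed(html)
    return json.dumps(collector.images)


# Hypothetical sample page, standing in for a biodiversity website.
SAMPLE_HTML = """
<div class="record">
  <img src="a.jpg" alt="Specimen A">
  <p>Description of specimen A</p>
  <img src="b.jpg" />
  <img alt="no source, so skipped">
</div>
"""

records = json.loads(extract_images_as_json(SAMPLE_HTML))
print(records)
```

    A full wrapper would additionally need to locate the repeated data regions on a page before collecting images, which this sketch does not attempt.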

    i-Eclat: performance enhancement of eclat via incremental approach in frequent itemset mining

    One example of a state-of-the-art vertical rule mining technique is the equivalence class transformation (Eclat) algorithm. Both horizontal and vertical data formats still suffer from huge memory consumption. In response to the promising results of mining a higher volume of data in a vertical format, and taking into consideration the dynamic transactions of data in a database, this research proposes a performance enhancement of the Eclat algorithm that relies on an incremental approach, called the Incremental-Eclat (i-Eclat) algorithm. Motivated by the fast intersection in Eclat, the enhanced algorithm adopts the MySQL database management system (DBMS) as its platform. It serves as the association rule mining database engine for testing benchmark frequent itemset mining (FIMI) datasets from an online repository. The MySQL DBMS is chosen in order to reduce the preprocessing stages of the datasets. The experimental results indicate that the proposed algorithm outperforms the traditional Eclat by 17% on both chess and T10I4D100K, 69% on mushroom, and 5% and 8% on pumsb_star and retail, respectively. Thus, across the five dense and sparse datasets, the average performance of i-Eclat is concluded to be 23% better than that of Eclat.
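
    The vertical format and the intersection step mentioned above can be illustrated with a small sketch of classical Eclat. This is not i-Eclat: it has no incremental component and runs on in-memory Python sets rather than MySQL. The function names and the toy transactions are hypothetical, and `min_support` is an absolute transaction count.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Converts horizontal transactions into item -> set of transaction ids."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return tidsets


def eclat(transactions, min_support):
    """Returns {itemset tuple: support count} for all frequent itemsets."""
    vertical = to_vertical(transactions)
    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs that are frequent together with prefix
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support via tidset intersection
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in vertical.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Toy data, not one of the FIMI benchmark datasets.
toy_transactions = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
result = eclat(toy_transactions, min_support=2)
print(result)
```

    The support of each larger itemset is obtained purely by intersecting transaction-id sets, so the original transactions are scanned only once, when the vertical layout is built.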

    A FRAMEWORK FOR DEVELOPMENT OF SOCIAL NETWORKING SITE SKILL AMONG RURAL WOMEN COMMUNITIES

    Purpose: A Social Networking Site (SNS), also called a social networking service or social media, is a platform that connects people and lets them share information about themselves. The purpose of this research is to construct a framework for the Development of Social Networking Site Skill to help women in rural areas keep pace with the growth of ICT. This paper discusses how the proposed framework can help them develop their marketing skills using SNS. It is hoped that this effort will empower the targeted marginalized group with knowledge of information engineering and increase their awareness and use of ICT in their everyday activities. Methodology: The data are the result of ongoing projects in Setiu Wetlands, Terengganu. Rural women in the Setiu Wetlands community are the respondents for this study. A total of 30 people (identified as women entrepreneurs) responded, and the profile data come from preliminary studies of their existing skills, ICT literacy and internet use. Main Findings: Based on the profiling data collected, a framework for developing skills in using social media as a business medium has been developed. Implications/Applications: The framework is expected to produce successful entrepreneurs from rural women communities. These entrepreneurs will serve as examples to other women. This effort is also expected to help the rural women community improve the living standards of their families.

    Formal Specification for Spatial Information Databases Integration Framework (SIDIF)

    This paper discusses the formal validation of the spatial information databases integration framework (SIDIF). A SIDIF database is a large, organized body of persistent data, usually associated with computerized software designed to update, query, and retrieve components of the data stored within the system. One of the common difficulties faced by developers is designing a robust database system. To address this, developers have to focus their efforts on formal specifications. A formal specification is expected to reduce the overall development time. Formal specifications can provide an unambiguous and precise supplement to natural language descriptions. Moreover, they can be rigorously validated and verified, leading to the early detection of specification errors. Consequently, to validate the framework formally, we specify the SIDIF database framework using the Z language and prove it using the Z/EVES theorem prover. Such tools may help to reduce time, effort and mistakes compared with manual theorem proving, which can be tedious and error-prone.

    Gold nanoparticle sensor for the visual detection of pork adulteration in meatball formulation.

    We visually identify pork adulteration in beef and chicken meatball preparations using 20 nm gold nanoparticles (GNPs) as colorimetric sensors. Meatballs are a popular food in certain Asian and European countries. Verification of pork adulteration in meatballs is necessary to meet Halal and Kosher food standards. The 20 nm GNPs change color from pinkish-red to gray-purple, and their absorption peak at 525 nm is red-shifted by 30–50 nm in 3 mM phosphate-buffered saline (PBS). Adsorption of single-stranded DNA protects the particles against salt-induced aggregation. Mixing and annealing a 25-nucleotide (nt) single-stranded (ss) DNA probe with denatured DNA from different meatballs differentiated well between perfectly matched and mismatched hybridization at a critical annealing temperature. In vials containing nonpork DNA, the probes remain available because of mismatches and interact with the GNPs to protect them from salt-induced aggregation. In contrast, all the pork-containing vials, in either pure or mixed form, consumed the probes completely through perfect hybridization and turned gray, indicating aggregation. This is clearly reflected by a well-defined red shift of the absorption peak and significantly increased absorbance in the 550–800 nm region. This label-free, low-cost assay should find applications in food analysis, genetic screening, and homology studies.