
    NATDATA: integrando dados de recursos naturais dos biomas brasileiros.

    ABSTRACT: Brazilian agriculture demands the intensification of planted areas together with the preservation of the natural resources of the Brazilian biomes. Rapid answers to questions involving topics such as soil, water resources, biodiversity, and climate are therefore essential. Brazil holds a large collection of data on these topics, distributed across several research institutions. The heterogeneity of standards, combined with this distribution, hinders their combined use. This work presents an initiative under development at the Empresa Brasileira de Pesquisa Agropecuária (Embrapa) whose main objective is to integrate natural-resource data from the different Brazilian biomes, providing users with an environment for fast, integrated querying of these data. SBIAgro 2011

    DYMS (Dynamic Matcher Selector) – Scenario-based Schema Matcher Selector

    Schema matching is one of the main challenges in different information system integration contexts. Over the past 20 years, different schema matching methods have been proposed and shown to be successful in various situations. Although numerous advanced matching algorithms have emerged, schema matching research remains a critical issue. Different algorithms are implemented to resolve different types of schema heterogeneities, including differences in design methodologies, naming conventions, and the level of specificity of schemas, amongst others. The algorithms are usually too generic, regardless of the schema matching scenario. This situation indicates that a single matcher cannot be optimized for all matching scenarios. In this research, I propose a dynamic matcher selector (DYMS) as a probable solution to the aforementioned problem. The proposed DYMS analyzes the schema matching scenario and selects the most appropriate matchers for a given scenario. The selected matchers are weighted through a parameter optimization process, which adopts a heuristic learning approach. The DYMS returns the alignment result of the input schemas.
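    The idea of selecting matchers per scenario can be sketched as follows. This is a minimal illustration, not the actual DYMS implementation: the two matcher functions and the weight table are hypothetical stand-ins for the learned, scenario-specific weights the abstract describes.

    ```python
    # Toy scenario-based matcher selection (names are hypothetical, not DYMS itself).

    def name_matcher(a, b):
        """Score two element names by exact lowercase equality."""
        return 1.0 if a.lower() == b.lower() else 0.0

    def prefix_matcher(a, b):
        """Score two element names by the length of their shared prefix."""
        common = 0
        for x, y in zip(a.lower(), b.lower()):
            if x != y:
                break
            common += 1
        return common / max(len(a), len(b))

    def select_matchers(scenario_weights, matchers, top_k=1):
        """Pick the top_k matchers whose weight for this scenario is highest."""
        ranked = sorted(matchers,
                        key=lambda m: scenario_weights.get(m.__name__, 0.0),
                        reverse=True)
        return ranked[:top_k]

    # Weights a parameter-optimization step might have learned for one scenario.
    weights = {"name_matcher": 0.3, "prefix_matcher": 0.9}
    chosen = select_matchers(weights, [name_matcher, prefix_matcher])
    print(chosen[0].__name__)                                   # prefix_matcher
    print(round(chosen[0]("CustomerID", "CustomerName"), 2))    # 0.67
    ```

    In a full system, the weight table would be produced by the heuristic learning step rather than hard-coded.
    
    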

    Web Personalization using Neuro-Fuzzy Clustering Algorithms

    Different users have different needs from the same web page, and hence it is necessary to develop a system which understands the needs and demands of the users. Web server logs have abundant information about the nature of the users accessing it. In this paper we discuss how to mine these web server logs for a given period of time using an unsupervised, competitive learning algorithm, Kohonen's self-organizing map (SOM), and how to interpret the results using the unified distance matrix (U-matrix). These algorithms help us efficiently cluster users based on similar web access patterns, with each cluster containing users with similar browsing patterns. These clusters are useful in web personalization, so that the site communicates better with its users, and in web traffic analysis for predicting web traffic at a given period of time.

    Data Warehousing Scenarios for Model Management

    Model management is a framework for supporting meta-data related applications where models and mappings are manipulated as first class objects using operations such as Match, Merge, ApplyFunction, and Compose. To demonstrate the approach, we show how to use model management in two scenarios related to loading data warehouses. The case study illustrates the value of model management as a methodology for approaching meta-data related problems. It also helps clarify the required semantics of key operations. These detailed scenarios provide evidence that generic model management is useful and, very likely, implementable.
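    Two of the named operators, Match and Compose, can be sketched by treating a model as a set of element names and a mapping as a dict. This is a toy rendering under assumed names, not the paper's formal semantics, which operate on richer model and mapping structures.

    ```python
    # Toy Match and Compose: models as sets of element names, mappings as dicts.

    def match(model_a, model_b):
        """Match: pair up identically named elements of two models."""
        return {a: a for a in model_a if a in model_b}

    def compose(map_ab, map_bc):
        """Compose: chain a mapping A->B with a mapping B->C into A->C."""
        return {a: map_bc[b] for a, b in map_ab.items() if b in map_bc}

    source = {"cust_id", "cust_name", "order_date"}
    staging = {"cust_id", "cust_name", "load_ts"}
    warehouse_map = {"cust_id": "customer_key", "cust_name": "customer_name"}

    m = match(source, staging)              # shared elements of source and staging
    end_to_end = compose(m, warehouse_map)  # source -> warehouse mapping
    print(sorted(end_to_end.items()))
    ```

    In the warehouse-loading scenarios, composing the source-to-staging match with the staging-to-warehouse mapping yields the end-to-end lineage in one operation.
    
    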

    Analyzing Web Server Access Log Files Using Data Mining Techniques

    Nowadays the web is considered not only a network for acquiring data, buying products, and obtaining services, but also a social environment for interaction and information sharing. As the number of web sites continues to grow, it becomes more difficult for users to find and extract information. As a solution to that problem, over the last decade web mining has been used to evaluate web sites, to personalize the information displayed to a user or set of users, or to adapt the indexing structure of a web site to meet the needs of the users. In this work we describe a methodology for web usage mining that enables discovering user access patterns. In particular, we are interested in whether the topology of the web site matches the desires of the users. The data collections used for analysis and interpretation of user viewing patterns are taken from the web server log files. Data mining techniques, such as classification, clustering, and association rules, are applied to the preprocessed data. The intent of this research is to propose techniques for improving user perception of, and interaction with, a web site.
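    The preprocessing-plus-association step can be sketched as follows: parse Common Log Format lines, group requests into per-host sessions, and count page pairs that co-occur. This is a toy stand-in under assumed log data, not the paper's full association-rule mining pipeline.

    ```python
    import re
    from itertools import combinations
    from collections import Counter

    # Parse Common Log Format GET requests: host ... [timestamp] "GET path ..." status bytes
    LOG_RE = re.compile(r'(\S+) \S+ \S+ \[.*?\] "GET (\S+) [^"]*" \d+ \d+')

    def sessions_from_log(lines):
        """Group requested pages into one session per host (a crude sessionization)."""
        sessions = {}
        for line in lines:
            m = LOG_RE.match(line)
            if m:
                host, page = m.groups()
                sessions.setdefault(host, set()).add(page)
        return list(sessions.values())

    def frequent_pairs(sessions, min_support=2):
        """Count page pairs co-occurring in a session; keep those above min_support."""
        counts = Counter()
        for s in sessions:
            for pair in combinations(sorted(s), 2):
                counts[pair] += 1
        return {p: c for p, c in counts.items() if c >= min_support}

    log = [
        '1.1.1.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512',
        '1.1.1.1 - - [10/Oct/2023:13:56:01 +0000] "GET /contact.html HTTP/1.1" 200 256',
        '2.2.2.2 - - [10/Oct/2023:14:01:12 +0000] "GET /index.html HTTP/1.1" 200 512',
        '2.2.2.2 - - [10/Oct/2023:14:02:40 +0000] "GET /contact.html HTTP/1.1" 200 256',
    ]
    print(frequent_pairs(sessions_from_log(log)))
    ```

    Pages that co-occur often but are far apart in the site's link structure would then be candidates for topology changes.
    
    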

    Natdata - plataforma de recursos naturais dos biomas brasileiros: informações geoespaciais para sustentabilidade na agricultura.

    Abstract: This work presents Natdata, a platform for integrating information on the natural resources of the Brazilian biomes. The paper discusses the process to be adopted in its development to handle problems such as semantic, data, and spatial heterogeneity. In this context, the contribution of the Pantanal biome is especially noteworthy: rich in information diversity, with much of its data already structured, it can serve as an environment for validating the activities under development. Geopantanal 2012

    Online Real Estate Feed Reader

    This report discusses the preliminary research done and a basic understanding of the proposed topic, "Online Real Estate Feed Reader". The online real estate feed reader is intended to serve people living in developed areas, where finding land or houses can be difficult. It also gives people an easier alternative for finding a real estate agent than traditional channels such as newspapers or approaching an agent directly. Users can click the search button to display results, and can subscribe to an RSS feed to receive updates over time. The objective of the website is to help users search exactly the real estate websites available on the internet, rather than returning anonymous results that are sometimes not related to real estate at all. Users can also obtain information about what they are searching for, in terms of description, contact person, or pictures, ensuring they get the right information from the selected real estate website. The website also provides RSS feeds, which can be subscribed to from the top of the page; the feeds are updated automatically and show an outline of the information, along with pictures where available. Initially, the scope of the feed reader is limited to real estate in the KL area only: the website lists real estate from the KL area and stores all the information in a database.
The methodology used in this project is the prototyping methodology, which consists of several phases: planning, analysis, design, and implementation; the planning, analysis, and design phases are performed repeatedly until the system is completed. The analysis was performed in the results and discussion session, where most users gave positive feedback about the system. In conclusion, the author hopes the project will succeed, achieve its scope and objectives as planned, and benefit its users.

    A New Algorithm to Preserve Sensitive Frequents Itemsets (APSFI) in Horizontal or Vertical Database

    This research aims to preserve the privacy of sensitive information against adversaries. We propose an Algorithm to Preserve Sensitive Frequent Itemsets (APSFI), with two variants that hide sensitive frequent itemsets in horizontal or vertical databases while minimizing the number of database scans during the hiding operation. The main approach to hiding sensitive frequent itemsets is to reduce the support of each given sensitive frequent 1-itemset until it becomes insensitive, and to convert another insensitive item into a sensitive one in the same transaction, so that the database size and the nature of the transactions are unchanged and adversaries' suspicion is avoided. The experiments with APSFI showed very encouraging results: compared with the well-known FHSFI algorithm, it eliminated 91% of database scan operations in vertical databases and 41% in horizontal-layout databases. The experiments also depict APSFI's tolerance to database-size scalability and its linear outperformance, in terms of execution time, in contrast with FHSFI.
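    The support-reduction idea can be sketched in a few lines: drop the sensitive item from just enough transactions to push its support below the mining threshold, and swap in a non-sensitive filler so the transaction size is unchanged. This is toy code under assumed data, not APSFI itself, which additionally minimizes database scans.

    ```python
    # Toy support reduction for hiding a sensitive 1-itemset (not APSFI itself).

    def support(db, item):
        """Number of transactions containing the item."""
        return sum(1 for t in db if item in t)

    def hide_item(db, sensitive, filler, min_support):
        db = [set(t) for t in db]
        for t in db:
            if support(db, sensitive) < min_support:
                break  # already below the mining threshold
            if sensitive in t:
                t.discard(sensitive)
                t.add(filler)  # keep transaction size constant to avoid suspicion
        return db

    db = [{"a", "b"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
    hidden = hide_item(db, sensitive="a", filler="e", min_support=2)
    print(support(hidden, "a"))  # 1 -- below min_support, so "a" is no longer frequent
    ```

    Because only as many transactions as necessary are modified, the side effects on non-sensitive itemsets stay small.
    
    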

    IMPROVING THE CUCKOO ALGORITHM FOR ASSOCIATION RULE HIDING

    Nowadays, the problem of data security in the process of data mining receives more and more attention. The question is how to balance legitimate data exploitation against revealing sensitive information. There have been many approaches, and one remarkable approach is privacy preservation in association rule mining to hide sensitive rules. Recently, a relatively effective meta-heuristic algorithm for this purpose has emerged: the cuckoo optimization algorithm (COA4ARH). In this paper, an improved version of COA4ARH is presented for calculating the minimum number of sensitive items that should be removed to hide sensitive rules, while limiting the loss of non-sensitive rules. The experimental results obtained from three real datasets showed that the proposed method achieves better results than the original algorithm in several cases.
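    The trade-off the paper optimizes can be sketched with a toy fitness function: a candidate solution marks items to delete, and fitness rewards hiding sensitive rules while penalizing lost non-sensitive rules. This is an illustrative formulation under assumed data, not COA4ARH's exact objective, and the cuckoo search itself is omitted.

    ```python
    # Toy fitness for association rule hiding (illustrative, not COA4ARH's formula).

    def confidence(db, antecedent, consequent):
        """Fraction of transactions containing the antecedent that also contain the consequent."""
        ante = [t for t in db if antecedent <= t]
        if not ante:
            return 0.0
        return sum(1 for t in ante if consequent <= t) / len(ante)

    def apply_deletions(db, deletions):
        """deletions: list of (transaction_index, item) pairs to remove."""
        db = [set(t) for t in db]
        for idx, item in deletions:
            db[idx].discard(item)
        return db

    def fitness(db, deletions, sensitive, harmless, min_conf=0.6):
        new_db = apply_deletions(db, deletions)
        hidden = sum(1 for a, c in sensitive if confidence(new_db, a, c) < min_conf)
        lost = sum(1 for a, c in harmless if confidence(new_db, a, c) < min_conf)
        return hidden - lost  # a search (e.g. cuckoo) would maximize this

    db = [{"x", "y"}, {"x", "y"}, {"x"}, {"a", "b"}, {"a", "b"}, {"a"}]
    sensitive = [({"x"}, {"y"})]   # rule x -> y, to be hidden
    harmless = [({"a"}, {"b"})]    # rule a -> b, to be kept
    print(fitness(db, [(0, "y")], sensitive, harmless))  # 1: hides x->y, keeps a->b
    ```

    A good deletion scores +1 here; deleting from the wrong transaction (say, dropping "b" from transaction 3) hides nothing and loses the harmless rule, scoring -1. The paper's contribution is, in effect, finding deletion sets that maximize such an objective with as few removed items as possible.
    
    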