
    An Integrated Approach for Mining Meta-Rules

    Abstract: An integrated approach to mining association rules and meta-rules based on a hyper-structure is put forward. In this approach, time-series databases are partitioned according to time segments, and the database is scanned only twice in total. In the first scan, a set of frequent 1-itemsets and its projected database are formed for every partition. Then every projected database is scanned to construct a hyper-structure. By mining the hyper-structure, various rules can be obtained, including global association rules, meta-rules, stable association rules, and trend rules. Compared with existing algorithms for mining association rules, our approach can mine and obtain more useful rules. Compared with existing algorithms for meta-mining or change mining, our approach has higher efficiency. The experimental results show that our approach is very promising.
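
    The partition idea in this abstract can be illustrated with a minimal sketch (my own simplified data structures, not the paper's hyper-structure): count item supports per time segment, then label items whose support stays flat across segments as "stable" and items whose support rises monotonically as "trend".

```python
from collections import Counter

def partition_supports(partitions):
    """For each time-segment partition (a list of transactions, each a set
    of items), compute the support of every single item."""
    supports = []
    for transactions in partitions:
        counts = Counter(item for t in transactions for item in t)
        n = len(transactions)
        supports.append({item: c / n for item, c in counts.items()})
    return supports

def classify_items(supports, tol=0.1):
    """Label an item 'stable' if its support stays within `tol` of its mean
    across all partitions, 'trend' if its support rises monotonically."""
    items = set().union(*[s.keys() for s in supports])
    labels = {}
    for item in items:
        series = [s.get(item, 0.0) for s in supports]
        mean = sum(series) / len(series)
        if all(abs(v - mean) <= tol for v in series):
            labels[item] = "stable"
        elif all(a < b for a, b in zip(series, series[1:])):
            labels[item] = "trend"
        else:
            labels[item] = "other"
    return labels
```

    The paper's actual hyper-structure additionally supports global and per-segment association rules from the same two database scans; the sketch only shows the stable/trend distinction.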

    Query Recommendation Using Hybrid Query Relevance

    With the explosion of web information, search engines have become the main tools of information retrieval. However, most queries submitted in web search are ambiguous and multifaceted, so understanding queries and mining query intent is critical for search engines. In this paper, we present a novel query recommendation algorithm that combines query information and URL information to obtain broad and accurate query relevance. Query relevance is calculated from query information via query co-occurrence and query embedding vectors. Incorporating ranking into query-URL pairs allows the strength between a query and a URL to be calculated more precisely. Empirical experiments are performed on the AOL query log. The results demonstrate the effectiveness of the proposed query recommendation algorithm, which achieves superior performance compared to other algorithms.
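
    The query co-occurrence part of the relevance calculation can be sketched as follows (a simplified, assumed formulation using cosine-normalized session co-occurrence, not necessarily the authors' exact formula):

```python
from collections import defaultdict
import math

def cooccurrence_similarity(sessions):
    """Given search sessions (lists of queries issued in one session),
    score query relatedness by cosine-normalized co-occurrence:
    sim(q1, q2) = #sessions containing both / sqrt(#sessions(q1) * #sessions(q2))."""
    q_count = defaultdict(int)
    pair_count = defaultdict(int)
    for session in sessions:
        uniq = sorted(set(session))
        for q in uniq:
            q_count[q] += 1
        for i in range(len(uniq)):
            for j in range(i + 1, len(uniq)):
                pair_count[(uniq[i], uniq[j])] += 1

    def sim(q1, q2):
        pair = tuple(sorted((q1, q2)))
        return pair_count[pair] / math.sqrt(q_count[q1] * q_count[q2])

    return sim
```

    The full algorithm additionally combines this signal with embedding-based similarity and ranked query-URL strengths, which this sketch omits.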

    Geographic Information Analysis - Past and Trends

    Effectively analyzing geographic information has attracted renewed attention in a wide range of fields, especially in urban studies. The specific characteristics of geographic information make its analysis rather unique. In this review, we discuss these unique characteristics and introduce some classical and new techniques for analyzing geographic information. In particular, we pay specific attention to analyzing the spatial heterogeneity that inherently exists in geographic information, especially in regressed relationships derived from it. The extension of spatial-heterogeneity analysis to a temporal context is discussed as well. This review echoes the belief that one of the major tasks in analyzing geographic information is to develop new methods that help us better understand the ever-increasing volume of geographic information. We hope this review serves as a starting point for interested scholars to seek better analytical techniques and gain more insight into what geographic information can offer.
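
    A standard tool for the spatially heterogeneous regressed relationships this review discusses is geographically weighted regression. A minimal sketch of the local estimate, assuming a Gaussian distance kernel (one common choice, picked here for illustration):

```python
import numpy as np

def gwr_local_coefficients(X, y, coords, target, bandwidth):
    """Estimate regression coefficients at location `target` by weighted
    least squares, with Gaussian kernel weights decaying with distance:
    w_i = exp(-(d_i / bandwidth)^2).  X should include an intercept column."""
    d = np.linalg.norm(coords - target, axis=1)
    w = np.exp(-(d / bandwidth) ** 2)
    W = np.diag(w)
    # beta_hat = (X^T W X)^{-1} X^T W y
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

    Repeating the estimate at every observation location produces a surface of local coefficients, which is how spatial heterogeneity in a relationship is made visible.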

    A Personalized Recommendation Algorithm Based on the User’s Implicit Feedback in E-Commerce

    A recommendation system can recommend items of interest to users. However, due to the scarcity of user rating data and the similarity of single ratings, the accuracy of traditional collaborative filtering (CF) algorithms is limited. Compared with user rating data, the user’s behavior log is easier to obtain and contains a large amount of implicit feedback information, such as purchase behavior, comparison behavior, and sequences of items (item-sequences). In this paper, we propose a personalized recommendation algorithm based on the user’s implicit feedback (BUIF). BUIF considers not only the user’s purchase behavior but also the user’s comparison behavior and item-sequences. We extract purchase behavior, comparison behavior, and item-sequences from the user’s behavior log; calculate user similarity from purchase and comparison behavior; and extend word-embedding to item-embedding to obtain item similarity. On this basis, we build a secondary reordering model to generate recommendation results for users. Experimental results on the JData dataset show that our algorithm achieves better recommendation accuracy than other CF algorithms.
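
    The item-embedding idea can be approximated with a count-based stand-in (an assumption for illustration; the paper extends word-embedding training to items rather than using raw counts): items that occur in similar positions within browsing sequences end up with similar context vectors.

```python
from collections import defaultdict
import math

def item_vectors(sequences, window=1):
    """Build a context-count vector for each item from behavior sequences:
    items seen within `window` positions of each other share context.
    A simplified count-based stand-in for trained item-embeddings."""
    vecs = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, item in enumerate(seq):
            lo, hi = max(0, i - window), min(len(seq), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[item][seq[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

    In BUIF this item similarity is one signal among several (purchase and comparison behavior contribute user similarity) feeding the secondary reordering model.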

    Novel Methods to Demarcate Urban Housing Submarkets: Cluster Analysis with Spatially Varying Relationships Between House Value and Attributes

    In urban housing market studies, the urban housing market can be divided into a series of submarkets. Usually, submarkets are identified by geographic location, by housing structural characteristics, or by some combination of both. In this study, we propose an alternative way to identify urban housing submarkets. Instead of using house characteristics or locations, we use the relationships obtained through a geographically weighted hedonic regression (GWHR) model. In particular, we apply K-means clustering to the coefficients obtained via GWHR to identify different submarkets. Data from the City of Milwaukee are used to test the model and procedure. The proposed procedure is compared, in terms of prediction accuracy, with a regular cluster analysis that uses housing structural and neighborhood socioeconomic information. The analytical results suggest that hedonic regression on demarcated submarkets outperforms a uniform market model, and that our proposed method yields more reasonable results than those using raw data.
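
    The demarcation step, clustering locations by their local hedonic coefficients rather than by raw attributes, can be sketched as follows (synthetic coefficients here; the real input would be one coefficient vector per location from the fitted GWHR model):

```python
import numpy as np

def kmeans(coefs, k, iters=50, seed=0):
    """Plain K-means on a (n_locations, n_coefficients) matrix of local
    hedonic coefficients; each cluster is read as one housing submarket."""
    rng = np.random.default_rng(seed)
    centers = coefs[rng.choice(len(coefs), size=k, replace=False)]
    for _ in range(iters):
        # assign each location to the nearest coefficient centroid
        dist = np.linalg.norm(coefs[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # recompute centroids; keep the old center if a cluster empties
        centers = np.array([coefs[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers
```

    Clustering in coefficient space groups locations where house value responds to attributes in the same way, which is the paper's definition of a submarket.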

    Mining Approximate Keys based on Reasoning from XML Data

    Keys are very important for data management. Due to the hierarchical structure and syntactic flexibility of XML, mining keys from XML data is a more complex and difficult task than mining them from relational databases. Discovering keys from XML data poses practical challenges such as the unclearness of keys, the storage of enormous numbers of keys, and the need for efficient mining algorithms. In this paper, to fill the gap between theory and practice, we propose a novel approximate measure of support and confidence for XML keys based on the number of null values on key paths. In the mining process, inference rules are used to derive new keys. Through two-phase reasoning, a target set of approximate keys and its reduced set are obtained. We conducted experiments on ten benchmark XML datasets from XMark and four files from the UW XML Repository. The results show that the approach is feasible and efficient and that it can discover effective keys in various XML data.
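
    One simplified reading of a null-aware key measure (my own illustrative definitions; the paper's exact formulas may differ): support counts how often the key paths are non-null, and confidence counts how often the resulting complete key values are unique.

```python
def key_support_confidence(records, key_paths):
    """Illustrative null-aware key quality for flattened XML records
    (dicts mapping path -> value, None for a missing/null path):
    support    = fraction of records whose key paths are all non-null;
    confidence = fraction of those complete key tuples that are unique."""
    complete = [tuple(r.get(p) for p in key_paths) for r in records
                if all(r.get(p) is not None for p in key_paths)]
    support = len(complete) / len(records) if records else 0.0
    confidence = len(set(complete)) / len(complete) if complete else 0.0
    return support, confidence
```

    A candidate key with high support (rarely null) and high confidence (rarely duplicated) is a good approximate key; inference rules can then derive further keys from the accepted ones.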