1,061 research outputs found

    GCG: Mining Maximal Complete Graph Patterns from Large Spatial Data

    Recent research on pattern discovery has progressed from mining frequent patterns and sequences to mining structured patterns, such as trees and graphs. As a general data structure, graphs can model complex relations among data, with wide applications in web exploration and social networks. However, mining large graph patterns is challenging because of the large number of subgraphs. In this paper, we aim to mine only frequent complete graph patterns. A graph g in a database is complete if every pair of distinct vertices is connected by a unique edge. Grid Complete Graph (GCG) is a mining algorithm developed to explore interesting pruning techniques for extracting maximal complete graphs from the large spatial dataset in the Sloan Digital Sky Survey (SDSS) data. Using a divide-and-conquer strategy, GCG shows high efficiency, especially in the presence of a large number of patterns. In this paper, we describe GCG, which can mine not only simple co-location spatial patterns but also complex ones. To the best of our knowledge, this is the first algorithm to exploit the extraction of maximal complete graphs in the process of mining complex co-location patterns in large spatial datasets.
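The definition of a complete graph given in this abstract can be illustrated with a short sketch. This is not the GCG algorithm: the abstract does not describe its grid-based pruning, so the code below swaps in the textbook Bron-Kerbosch enumeration (without pivoting) purely to show what "maximal complete graph" means. The function names and the toy graph are hypothetical.

```python
from itertools import combinations


def is_complete(vertices, edges):
    """Return True if every pair of distinct vertices is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(vertices, 2))


def maximal_cliques(adj):
    """Enumerate maximal complete subgraphs with basic Bron-Kerbosch.

    adj maps each vertex to the set of its neighbours. This is a standard
    textbook method, used here as a stand-in; it is not GCG.
    """
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Hypothetical toy graph: a triangle A-B-C plus a pendant edge C-D.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = maximal_cliques(adj)
```

On the toy graph, the triangle A-B-C is complete, while the full vertex set is not, because A and D are not connected.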

    NEW METHODS FOR MINING SEQUENTIAL AND TIME SERIES DATA

    Data mining is the process of extracting knowledge from large amounts of data. It covers a variety of techniques aimed at discovering diverse types of patterns on the basis of the requirements of the domain. These techniques include association rule mining, classification, cluster analysis and outlier detection. The availability of applications that produce massive amounts of spatial, spatio-temporal (ST) and time series data (TSD) is the rationale for developing specialized techniques to mine such data. In spatial data mining, the spatial co-location rule problem is different from the association rule problem, since there is no natural notion of transactions in spatial datasets that are embedded in continuous geographic space. Therefore, we have proposed an efficient algorithm (GridClique) to mine interesting spatial co-location patterns (maximal cliques). These patterns are used as the raw transactions for an association rule mining technique to discover complex co-location rules. Our proposal includes certain types of complex relationships – especially negative relationships – in the patterns. The relationships can be obtained from only the maximal clique patterns, which have never been used until now. Our approach is applied to a well-known astronomy dataset obtained from the Sloan Digital Sky Survey (SDSS). ST data is continuously collected and made accessible in the public domain. We present an approach to mine and query large ST data with the aim of finding interesting patterns and understanding the underlying process of data generation. An important class of queries is based on the flock pattern. A flock is a large subset of objects moving along paths close to each other for a predefined time.
One approach to processing a “flock query” is to map ST data into high-dimensional space and to reduce the query to a sequence of standard range queries that can be answered using a spatial indexing structure; however, the performance of spatial indexing structures rapidly deteriorates in high-dimensional space. This thesis sets out a preprocessing strategy that uses a random projection to reduce the dimensionality of the transformed space. We use probabilistic arguments to prove the accuracy of the projection and present experimental results that show the possibility of managing the curse of dimensionality in an ST setting by combining random projections with traditional data structures. In time series data mining, we devised a new space-efficient algorithm (SparseDTW) to compute the dynamic time warping (DTW) distance between two time series, which always yields the optimal result. This is in contrast to other approaches, which typically sacrifice optimality to attain space efficiency. The main idea behind our approach is to dynamically exploit the existence of similarity and/or correlation between the time series: the more similar the time series, the less space is required to compute the DTW between them. Other techniques for speeding up DTW impose a priori constraints and do not exploit similarity characteristics that may be present in the data. Our experiments demonstrate that SparseDTW outperforms these approaches. By applying the SparseDTW algorithm, we discover an interesting pattern, “pairs trading”, in a large stock-market dataset of daily index prices from the Australian stock exchange (ASX) from 1980 to 2002.
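For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic full-table DTW computation, which is the baseline that space-efficient variants aim to improve on. It is not SparseDTW: the abstract does not give that algorithm's details, so none are reproduced here. The function name and the choice of absolute difference as the local cost are assumptions made for illustration.

```python
def dtw_distance(s, q):
    """Classic dynamic-programming DTW between two numeric sequences.

    Uses a full (len(s)+1) x (len(q)+1) table, so space is O(n*m).
    The local cost |s[i] - q[j]| is an illustrative assumption.
    """
    n, m = len(s), len(q)
    inf = float("inf")
    # d[i][j] holds the cheapest warping cost aligning s[:i] with q[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - q[j - 1])
            # Extend the best of the three admissible predecessor alignments.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Hypothetical toy series: the second repeats one value of the first.
a = [1, 2, 3]
b = [1, 2, 2, 3]
distance = dtw_distance(a, b)
```

Because warping lets the repeated value in `b` align with the single `2` in `a`, the two toy series are at distance zero, even though they differ in length. The quadratic table is exactly the space cost that the abstract says SparseDTW reduces when the series are similar.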

    Mining for Significant Information from Unstructured and Structured Biological Data and Its Applications

    Massive amounts of biological data are being accumulated in science. Searching for significant, meaningful information and patterns in different types of data is necessary for gaining knowledge from the large amounts of data available to users. However, data mining techniques do not normally deal with significance. Integrating data mining techniques with standard statistical procedures provides a way of mining statistically significant, interesting information from both structured and unstructured data. In this dissertation, different algorithms for mining significant biological information from both unstructured and structured data are proposed. A weighted-density-based approach is presented for mining item data from unstructured textual representations. Different algorithms in the area of radiation hybrid mapping are developed for mining significant information from structured binary data. The proposed algorithms have different applications in the ordering problem in radiation hybrid mapping, including identifying unreliable markers and building solid framework maps. The effectiveness of the proposed algorithms in improving map stability is demonstrated. Map stability is determined based on resampling analysis. The proposed algorithms deal effectively and efficiently with multidimensional data and also reduce computational cost dramatically. Evaluation shows that the proposed algorithms outperform comparative methods in terms of both accuracy and computational cost.

    Analysis of Accident Data and Evaluation of Leading Causes for Traffic Accidents in Jordan

    Road safety is a primary concern and goal of highway and traffic engineers worldwide. The road network in Jordan exhibits relatively high traffic volumes, particularly in urban areas and in the Central Business District (CBD) areas of major cities. Jordan ranks among the top countries worldwide in the number of road traffic accidents, which lead to a relatively high number of fatalities and injuries. In the past few years in particular, the number of registered vehicles in Jordan has increased considerably. As a result, traffic volumes and Vehicle Miles of Travel (VMT) have increased significantly, leading to deteriorating traffic flows and escalating traffic congestion and jams. Consequently, the number of road traffic accidents in Jordan has also increased noticeably in the past decade. A complete analysis of statistical data obtained for traffic accidents in Jordan was conducted in this study. An evaluation of the possible leading causes of traffic accidents in Jordan was also carried out. Different possible causes, along with the behaviors of drivers and pedestrians, were investigated and correlated with the number of traffic accidents, fatalities and injuries. Jordan was found to have accident, fatality and injury rates that are considerably higher than those of other countries in the world. Nonetheless, viewed as rates over time, the fatality and injury rates seemed to be moving in the right direction. Yet, the number of traffic accidents, fatalities and injuries looked critical. Traffic accidents and casualties were observed to be higher in summer. More than 90 percent of traffic accidents, fatalities and injuries occurred on roads with speed limits between 40 and 60 km/h. Pedestrians composed the highest percentage of the total numbers of fatalities and injuries. The majority of driver casualties and passenger casualties (fatalities and injuries) belonged to the age group of 18-42 years.
On the other hand, the highest percentage of pedestrian casualties belonged to the age group of 0-18 years. About 80 percent of the casualties in Jordan were males and only 20 percent were females. “Tailgating” and “not taking safety measurements during driving” were the two most important driver behaviors in terms of traffic accidents. Yet, the behaviors of “using wrong lane” and “not taking safety measurements during driving” led to the highest percentages of the total number of fatalities and injuries. The majority of the pedestrians killed or injured were in fact walking on the road at the time of the accident, and about one third of them were walking on the sidewalk. Other behaviors of drivers and pedestrians were also important and created traffic complexity and hazardous situations, leading to a reduction in saturation flow rates and in capacities and causing bottleneck conditions and traffic jams, hence resulting in traffic safety concerns.

    The Mass Housing Dilemma: An Industrial Design Process in Architecture

    World population growth and global warming are accentuating the long-recognized problem of housing for the masses: millions are homeless, live in inadequate shelter, or, as in the US Manufactured Housing market that is the focus of this thesis, live in nondurable, poor-quality “manufactured” houses that are at best detrimental to health and that, during extreme weather events, suffer catastrophic damage, often resulting in the death of occupants. In this thesis, we have reviewed the role of the architect in the US Manufactured Housing industry; additionally, we identified the major problems that plague the US Manufactured Housing industry. Further, we have reviewed how architects and industrial designers use technology in their respective fields. Our findings and analysis suggest that an Industrial Design approach, applied in architecture for mass housing, offers a means of improving the architect's role in manufactured housing for the masses.

    Semiotic Analysis of the Visual Signs of Protest on Online Jordanian Platforms: Code Choice and Language Mobility

    The political discourse of protesting, which comprises carrying signs to clarify demands and express feelings, constitutes a significant area of study in the signs of online platforms within the linguistic landscape field. Taking as a case in point the Jordanian protest on May 30, 2018, a few examples of the signs of protest are analyzed using some aspects of visual semiotics, particularly code choice. The study is grounded in both quantitative and qualitative data culled from online sources. The analysis of the data finds a variety of linguistic codes used to reach different readerships: the standard form of Arabic as the official language in the country and in other Arab countries; Jordanian Arabic, investigated as the device for voicing the local audience; English, viewed as the language for addressing the global audience; and the occurrence of multilingualism as a significant feature in the corpus for reaching further readerships. These codes are largely motivated by other significant semiotic resources, including multimodality, font size, color relevance, and materiality practices. The study further views the signs of protest as a new trend of mobility, often considered a challenging notion to the territoriality of fixed signs in most linguistic landscape studies.

    The British Stance on the Arab Emirates in the North of the Arabian Peninsula during the First World War 1914-1918

    This study deals with a significant historical subject, focusing on the British stance towards the Arab Emirates in the north of the Arabian Peninsula during the First World War, particularly in relation to the British conflict with the Ottoman Empire. The primary goal for Britain was to end the Ottoman presence in the region and counter the influence of local forces in the north of the Arabian Peninsula. To achieve this objective, Britain employed a combination of cooperative and coercive methods. Furthermore, the study explores the support from the people of the north of the Arabian Peninsula for the Great Arab Revolt, a movement aimed at liberating and elevating the status of the Arabs. The revolt had the potential to succeed in its aims were it not for Britain's failure to honor its promises to the Arabs. This study aims to understand the nature of the British position on the Arab Emirates in the north of the Arabian Peninsula during the First World War and assess the consequences of this stance for the region and the Arabian Peninsula as a whole. Additionally, the study aims to elucidate the British plan to assert control over the entire Arab East and understand the circumstances and conditions surrounding the north of the Arabian Peninsula during the First World War. It also examines the reaction of the local forces to the British intervention. The significance of this research lies in its examination of British policies towards the local forces in the north of the Arabian Peninsula during the First World War and the responses of these forces. It fills a gap in the existing literature, as most writings have primarily focused on the Najd and Makkah regions, paying little attention to the northern Arabian Peninsula.