15 research outputs found
Clustering daily patterns of human activities in the city
Data mining and statistical learning techniques are powerful analysis tools that have yet to be incorporated into the domain of urban studies and transportation research. In this work, we analyze an activity-based travel survey conducted in the Chicago metropolitan area over a demographically representative sample of its population. Detailed data on activities by time of day were collected from more than 30,000 individuals (and 10,552 households) who participated in a 1-day or 2-day survey implemented from January 2007 to February 2008. We examine this large-scale data in order to explore three critical issues: (1) the inherent daily activity structure of individuals in a metropolitan area, (2) the variation of individual daily activities, that is, how they grow and fade over time, and (3) clusters of individual behaviors and the revelation of their related socio-demographic information. We find that the population can be clustered into 8 and 7 representative groups according to their activities during weekdays and weekends, respectively. Our results enrich the traditional divisions consisting of only three groups (workers, students and non-workers) and provide clusters based on activities at different times of day. The generated clusters, combined with socio-demographic information, provide a new perspective for urban and transportation planning as well as for emergency response and spreading dynamics, by addressing when, where, and how individuals interact with places in metropolitan areas.
Massachusetts Institute of Technology. Dept. of Urban Studies and Planning; United States. Dept. of Transportation (Region One University Transportation Center); Singapore-MIT Alliance for Research and Technology
Generation of knowledge about the control of a Flow Shop, using simulation and a learning algorithm
International audience
A simulation and learning technique for generating knowledge about manufacturing systems behavior
International audience
Dynamic clustering for interval data based on L2 distance
Clustering, Symbolic Data Analysis, Interval Data, Standardization, Cluster Interpretation
New clustering methods for interval data
Dynamic clustering, interval data, distances, prototypes
An efficient clustering algorithm for partitioning Y-short tandem repeats data
Background: Y-Short Tandem Repeats (Y-STR) data consist of many similar and almost similar objects. This characteristic of Y-STR data causes two problems with partitioning: non-unique centroids and local minima problems. As a result, the existing partitioning algorithms produce poor clustering results. Results: Our new algorithm, called k-Approximate Modal Haplotypes (k-AMH), obtains the highest clustering accuracy scores for five out of six datasets, and produces an equal performance for the remaining dataset. Furthermore, clustering accuracy scores of 100% are achieved for two of the datasets. The k-AMH algorithm records the highest mean accuracy score of 0.93 overall, compared to that of other algorithms: k-Population (0.91), k-Modes-RVF (0.81), New Fuzzy k-Modes (0.80), k-Modes (0.76), k-Modes-Hybrid 1 (0.76), k-Modes-Hybrid 2 (0.75), Fuzzy k-Modes (0.74), and k-Modes-UAVM (0.70). Conclusions: The partitioning performance of the k-AMH algorithm for Y-STR data is superior to that of other algorithms, owing to its ability to solve the non-unique centroids and local minima problems. Our algorithm is also efficient in terms of time complexity, which is recorded as O(km(n-k)) and considered to be linear.