7,964 research outputs found
Dynamic feature selection for clustering high dimensional data streams
Change in a data stream can occur at the concept level and at the feature level. Change at the feature level can occur if new, additional features appear in the stream or if the importance and relevance of a feature changes as the stream progresses. This type of change has not received as much attention as concept-level change. Furthermore, many of the methods proposed for clustering streams (density-based, graph-based, and grid-based) rely on some form of distance as a similarity metric, and this is problematic in high-dimensional data, where the curse of dimensionality renders distance measurements and any concept of "density" difficult. To address these two challenges we propose combining them and framing the problem as a feature selection problem, specifically a dynamic feature selection problem. We propose a dynamic feature mask for clustering high-dimensional data streams. Redundant features are masked and clustering is performed along unmasked, relevant features. If a feature's perceived importance changes, the mask is updated accordingly; previously unimportant features are unmasked and features which lose relevance become masked. The proposed method is algorithm-independent and can be used with any of the existing density-based clustering algorithms, which typically do not have a mechanism for dealing with feature drift and struggle with high-dimensional data. We evaluate the proposed method on four density-based clustering algorithms across four high-dimensional streams: two text streams and two image streams. In each case, the proposed dynamic feature mask improves clustering performance and reduces the processing time required by the underlying algorithm. Furthermore, change at the feature level can be observed and tracked
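The masking idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how feature importance is measured, so the sketch assumes a simple stand-in: a feature is unmasked while its variance over a sliding window exceeds a threshold. The class name `DynamicFeatureMask` and the parameters `window` and `threshold` are hypothetical and are not taken from the paper.

```python
from collections import deque
from statistics import pvariance


class DynamicFeatureMask:
    """Sliding-window feature mask (illustrative; not the paper's method)."""

    def __init__(self, n_features, window=50, threshold=0.01):
        self.window = deque(maxlen=window)   # most recent points
        self.threshold = threshold           # assumed variance cut-off
        self.mask = [True] * n_features      # True = unmasked (used for clustering)

    def update(self, point):
        """Add a point and recompute which features are unmasked."""
        self.window.append(list(point))
        if len(self.window) < 2:
            return self.mask
        columns = zip(*self.window)
        # Assumption: low variance over the window marks a feature as redundant.
        self.mask = [pvariance(col) > self.threshold for col in columns]
        return self.mask

    def apply(self, point):
        """Project a point onto the currently unmasked features."""
        return [x for x, keep in zip(point, self.mask) if keep]


mask = DynamicFeatureMask(n_features=3, window=4)
# The third feature is constant at first, then starts to vary (feature drift).
stream = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 5.0],
    [0.0, 1.0, 9.0],
    [1.0, 0.0, 1.0],
]
history = [list(mask.update(p)) for p in stream]
```

In this toy stream the third feature stays masked while it is constant and becomes unmasked once it begins to vary. The output of `apply` could then be passed to any clustering algorithm, which is one way to read the abstract's claim of algorithm independence.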
The ground state of the two-dimensional Hubbard model
We have studied the ground state of the two-dimensional Hubbard model by
using the adaptive sampling quantum Monte Carlo method. We found enhancement of
the d-wave correlation function, the spin gap and the coexistence of both the
commensurate and incommensurate peaks in , which does not
contradict a recent experimental finding that both the resonance peak and the
incommensurate peaks reside in the same doping level of YBCO and BSCCO. Comment: To be published in Proceedings of Strongly Correlated Electron System
(SCES99
Finding and tracking multi-density clusters in an online dynamic data stream
The file attached to this record is the author's final peer reviewed version. Change is one of the biggest challenges in dynamic stream mining. From a data-mining perspective, adapting and tracking change is desirable in order to understand how and why change has occurred. Clustering, a form of unsupervised learning, can be used to identify the underlying patterns in a stream. Density-based clustering identifies clusters as areas of high density separated by areas of low density. This paper proposes a Multi-Density Stream Clustering (MDSC) algorithm to address these two problems: the multi-density problem and the problem of discovering and tracking changes in a dynamic stream. MDSC consists of two on-line components: discovered, labelled clusters and an outlier buffer. Incoming points are assigned to a live cluster or passed to the outlier buffer. New clusters are discovered in the buffer using an ant-inspired swarm intelligence approach. The newly discovered cluster is uniquely labelled and added to the set of live clusters. Processed data is subject to an ageing function and will disappear when it is no longer relevant. MDSC is shown to perform favourably compared with state-of-the-art peer stream-clustering algorithms on a range of real and synthetic data-streams. Experimental results suggest that MDSC can discover qualitatively useful patterns while being scalable and robust to noise
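The assign-or-buffer loop and the ageing step described in this abstract can be sketched in Python as follows. This is a skeleton under stated assumptions, not the MDSC algorithm itself. The abstract does not give the assignment rule or the ageing function, so the sketch assumes nearest-centroid assignment within a fixed `radius` and exponential decay with a `min_weight` cut-off. The ant-inspired discovery of new clusters in the buffer is not implemented, and all names are hypothetical.

```python
import math


class StreamClusterer:
    """Skeleton of an assign-or-buffer loop with ageing (illustrative only)."""

    def __init__(self, radius=1.0, decay=0.1, min_weight=0.05):
        self.radius = radius          # assumed assignment radius
        self.decay = decay            # assumed exponential ageing rate
        self.min_weight = min_weight  # points below this weight are dropped
        self.clusters = {}            # label -> list of (point, timestamp)
        self.buffer = []              # outlier buffer of (point, timestamp)

    def _centroid(self, members):
        dims = zip(*(p for p, _ in members))
        return [sum(d) / len(members) for d in dims]

    def insert(self, point, now):
        """Assign the point to the nearest live cluster, else buffer it."""
        best_label, best_dist = None, self.radius
        for label, members in self.clusters.items():
            dist = math.dist(point, self._centroid(members))
            if dist <= best_dist:
                best_label, best_dist = label, dist
        if best_label is None:
            self.buffer.append((point, now))
        else:
            self.clusters[best_label].append((point, now))
        return best_label

    def age(self, now):
        """Drop points whose decayed weight has fallen below min_weight."""
        def alive(ts):
            return math.exp(-self.decay * (now - ts)) >= self.min_weight
        self.buffer = [(p, t) for p, t in self.buffer if alive(t)]
        for label in list(self.clusters):
            kept = [(p, t) for p, t in self.clusters[label] if alive(t)]
            if kept:
                self.clusters[label] = kept
            else:
                del self.clusters[label]


sc = StreamClusterer(radius=1.0, decay=0.1, min_weight=0.05)
sc.clusters["A"] = [([0.0, 0.0], 0)]      # one pre-existing live cluster
near = sc.insert([0.5, 0.0], now=1)       # within radius of cluster A
far = sc.insert([10.0, 10.0], now=2)      # too far: goes to the buffer
sc.age(now=100)                           # everything is old by now
```

A full implementation would periodically search the buffer for new clusters and promote them to uniquely labelled live clusters, which is the step the abstract attributes to the swarm-intelligence component.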
Dar es Salaam as a 'Harbour of Peace' in East Africa: Tracing the Role of Creolized Urban Ethnicity in Nation-State Formation
Dar es Salaam is exceptional in East Africa for having a record of relatively little ethnic tension, and remaining tranquil and true to its name, the "harbour of peace". This paper explores the interface between ethnic and national identities in Tanzania's capital city, focusing on its ethnic foundations and their malleability with regard to nationalism, asking how nationalist identities were negotiated vis-à-vis existing local ethnic identities. How willing were ethnic groups that were indigenous to the locality to "share" the city, its land, and amenities with newcomer compatriots, given that the city was almost as new as the nation-state? How did their modus operandi affect nation-building? Keywords: nation-state, Tanzania, nationalism, urbanization
Soil Nitrogen and Carbon in Urban and Rural Forests
Previous work by Dr. Nancy Broshot has revealed high tree mortality and low recruitment (new seedlings) in an urban forest (Forest Park in Portland, Oregon). A series of lichen surveys in 2013 showed the lichen community has shifted to one dominated by lichens tolerant of and thriving on high nitrogen levels. To ascertain if nitrogenous air pollution could be a cause of low recruitment, we measured the level of nitrogen and carbon in the soil at 32 sites in Forest Park (24 permanent sites and 8 conifer recruitment sites). We also added 3 control sites in the Mount Hood National Forest above Estacada along an apparent air pollution gradient. The plant community was measured at three transects at each control site and lichen surveys were conducted. Four soil samples were collected at each site, dried at 35 °C until weight remained constant and sieved to reduce to fine soil particle size. The samples will be assessed using an elemental analyzer to determine total nitrogen and total carbon
Tree Composition and Seedling Recruitment in Urban and Rural Forests
In 1993, Dr. Nancy Broshot randomly located 25 permanent study sites in Forest Park in Portland, Oregon to examine the effects of urbanization on forest health. Plant community structure was examined. In 2003, Dr. Broshot reexamined the plant communities at each site and found significantly higher tree mortality and reduced recruitment (young trees) in all areas of the park. Many seedlings that had been present in 1993 were absent in 2003. In 2013, a 20-year follow up study of the tree community was conducted. Although the rate of tree mortality had dropped, recruitment of seedlings and saplings was still low. A series of lichen studies completed at each site in 2013 indicated high levels of nitrogenous air pollution at all sites in the park. In 2014, three control sites along a gradient of air quality in the Mount Hood National Forest above Estacada, Oregon were added to the study. Plant community variables were measured in the same manner as in Forest Park. We found significantly more live trees, saplings and seedlings at the control sites than at sites in Forest Park. We also found significantly fewer dead trees at control sites. Indeed, we had more seedlings at the three control sites than at all 25 of the Forest Park sites. We believe the low level of recruitment may be due to nitrogenous deposition from air pollution in Forest Park; we are waiting for results from collected soil samples to evaluate this hypothesis
The Financial Crisis and the Changing Profile of Mortgage Arrears in Ireland. ESRI Research Notes 2014/4/2
Understanding which households go into mortgage arrears during both boom and bust periods in Ireland is of critical importance to ensure suitable policies are deployed to safeguard future financial stability. Many of the difficulties in Ireland arose from the loosening of underwriting standards by financial institutions. This led to excessive household leverage ratios and provided households with limited buffers with which to absorb shocks (McCarthy and McQuinn, 2017; Lydon and McCann, 2017). The joint effects of labour market difficulties and large falls in house prices led to a situation where nearly one-in-five mortgage loans was in arrears at the height of the crisis (McCarthy, 2014)
- …