15 research outputs found

    Multi-scale Kernel Discriminant Analysis

    The bandwidth that minimizes the mean integrated square error of a kernel density estimator may not always be good when the density estimate is used for classification purposes. On the other hand, cross-validation-based techniques for choosing bandwidths may not be computationally feasible when there are many competing classes. Instead of concentrating on a single optimum bandwidth for each population density estimate, it would be more useful in practice to look at the results for different scales of smoothing. This paper presents such a multi-scale approach for classification using kernel density estimates, along with a graphical device that leads to a more informative discriminant analysis. The usefulness of the proposed methodology is illustrated using some benchmark data sets.
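The idea of looking at classification results across several scales of smoothing can be sketched as follows. This is only a minimal illustration, not the paper's procedure: it assumes one-dimensional data, a Gaussian kernel, a common bandwidth for all classes, and made-up toy samples. The function names `gaussian_kde` and `classify_over_scales` are hypothetical.

```python
import math

def gaussian_kde(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h (1-D data)."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm

def classify_over_scales(x, samples, priors, bandwidths):
    """Return {h: predicted label} for each bandwidth h.

    samples: dict mapping label -> list of training observations
    priors:  dict mapping label -> prior probability
    """
    result = {}
    for h in bandwidths:
        # Prior-weighted density estimate of each class at x.
        scores = {c: priors[c] * gaussian_kde(x, s, h) for c, s in samples.items()}
        result[h] = max(scores, key=scores.get)
    return result

# Toy data, invented purely for illustration.
samples = {"A": [0.0, 0.2, 0.4, 3.0], "B": [1.0, 1.2, 1.4, 1.6]}
priors = {"A": 0.5, "B": 0.5}
decisions = classify_over_scales(0.3, samples, priors, [0.1, 0.5, 1.0, 5.0])
```

On this toy data the predicted label at the smallest bandwidth differs from the one at the largest, which is the kind of scale dependence a single "optimum" bandwidth would hide.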

    Sign tests in multidimension: inference based on the geometry of the data cloud

    No full text
    Multivariate sign tests attracted several statisticians in the past, and it is evident from recent nonparametric literature that they continue to draw attention. One of the most important features of the univariate sign test is that it does not involve many technical assumptions or much complexity, and this makes it quite popular among statistics users. In this article we propose a new method for constructing multivariate sign tests that have reasonable statistical properties and can be used conveniently to solve one-sample location problems. Our principal strategy is to exploit certain geometric structures in the constellation of data points for making inference about the location of their distribution. As we proceed with the development of a fairly broad and general methodology, we indicate its relationship with previous work done by others and sometimes attempt to unify some of the earlier ideas. In particular, we pick up some well-known tests for uniform distribution of directional data and integrate them into the technology of multivariate sign tests to synthesize useful new procedures. Our procedures enjoy affine invariance and the distribution-free property for elliptically symmetric models. We report several interesting results that provide powerful insights into certain critical aspects of the problem. What is most appealing is the fundamental dependence of our approach on the basic geometry of the data cloud formed by the observations. In this article our only key to unlock the information contained in the data is the spatial arrangement of data points in a d-dimensional Euclidean space.
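For reference, the univariate sign test mentioned in the abstract can be written in a few lines. The sketch below covers only the classical one-dimensional exact test, not the multivariate procedures the article develops. The function name `sign_test_pvalue` and the choice to drop observations equal to the hypothesized median are assumptions made for illustration.

```python
from math import comb

def sign_test_pvalue(data, theta0=0.0):
    """Two-sided exact sign test of H0: median == theta0."""
    pos = sum(1 for x in data if x > theta0)
    neg = sum(1 for x in data if x < theta0)
    n = pos + neg  # observations equal to theta0 are dropped
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Under H0 the number of positive signs is Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

The test uses only the signs of the centered observations, which is why it needs so few distributional assumptions.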

    Introduction

    No full text
    The last sixty-seven years correspond to the political independence of India, freed from the British colonial yoke. They have allowed the country to witness far-reaching changes in the political, legal, cultural, and economic spheres. The one element of stability that has persisted is adherence to democratic institutions and to the democratization of India. The country has managed to remain a political democracy even as its neighbors have experienced military coups..

    A note on robust estimation of location

    No full text
    A modified version of the usual M-estimation problem is proposed, and the sample median is shown to be a solution of this problem for a wide range of choices of the score function. This exposes a certain universality in the robustness of the sample median in the univariate case, and this property continues to hold even in multivariate set-ups if we consider the multivariate L1-median. Some interesting facts related to this 'modified M-estimation' are discussed, and the consequences of a similar modification of the traditional maximum likelihood approach are explored.
    Keywords: modified M-estimation; score function; median unbiasedness; multivariate L1-median; modified maximum likelihood
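The multivariate L1-median referred to above is the point that minimizes the sum of Euclidean distances to the observations. A minimal sketch of one common way to approximate it, the Weiszfeld iteration, is given below. This algorithm is not taken from the note itself, and the function name `l1_median` and its tolerance defaults are assumptions.

```python
import math

def l1_median(points, n_iter=200, tol=1e-10):
    """Approximate the multivariate L1-median by Weiszfeld iterations."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    y = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(n_iter):
        num = [0.0] * dim
        den = 0.0
        for p in points:
            d = math.dist(p, y)
            if d < tol:  # skip points coinciding with the current iterate
                continue
            w = 1.0 / d
            den += w
            for j in range(dim):
                num[j] += w * p[j]
        if den == 0.0:
            break
        y_new = [v / den for v in num]
        converged = math.dist(y_new, y) < tol
        y = y_new
        if converged:
            break
    return y

# Four corners of the unit square plus one gross outlier (toy data).
m = l1_median([(0, 0), (1, 0), (0, 1), (1, 1), (100, 100)])
```

On this toy data the coordinate-wise mean is pulled to (20.4, 20.4) by the outlier, whereas the L1-median stays inside the unit square.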

    Multi-scale Kernel Discriminant Analysis

    No full text
    The bandwidth that minimizes the mean integrated square error of a kernel density estimator may not always be good when the density estimate is used for classification purposes. On the other hand, cross-validation-based techniques for choosing bandwidths may not be computationally feasible when there are many competing classes. Instead of concentrating on a single optimum bandwidth for each population density estimate, it would be more useful in practice to look at the results for different scales of smoothing. This paper presents such a multi-scale approach for classification using kernel density estimates, along with a graphical device that leads to a more informative discriminant analysis. The usefulness of the proposed methodology is illustrated using some benchmark data sets.

    Classification using kernel density estimates

    No full text
    The use of kernel density estimates in discriminant analysis is well known among scientists and engineers interested in statistical pattern recognition. Using a kernel density estimate involves properly selecting the scale of smoothing, namely the bandwidth parameter. The bandwidth that is optimum for the mean integrated square error of a class density estimator may not always be good for discriminant analysis, where the main emphasis is on the minimization of misclassification rates. On the other hand, cross-validation-based methods for bandwidth selection, which try to minimize estimated misclassification rates, may require heavy computation when there are several competing populations. Besides, such methods usually allow only one bandwidth for each population density estimate, whereas in a classification problem the optimum bandwidth for a class density estimate may vary significantly, depending on its competing class densities and their prior probabilities. Therefore, in a multiclass problem, it would be more meaningful to have different bandwidths for a class density when it is compared with different competing class densities. Moreover, a good choice of bandwidths should also depend on the specific observation to be classified. Consequently, instead of concentrating on a single optimum bandwidth for each population density estimate, it is more useful in practice to look at the results for different scales of smoothing for the kernel density estimates. This article presents such a multiscale approach along with a graphical device leading to a more informative discriminant analysis than the usual approach based on a single optimum scale of smoothing for each class density estimate.
    When there are more than two competing classes, this method splits the problem into a number of two-class problems, which allows the flexibility of using different bandwidths for different pairs of competing classes and at the same time reduces the computational burden that one faces with the usual cross-validation-based bandwidth selection in the presence of several competing populations. We present some benchmark examples to illustrate the usefulness of the proposed methodology.
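The pairwise decomposition described above can be illustrated with a short sketch. It assumes one-dimensional data, a Gaussian kernel, and a simple majority vote to combine the two-class comparisons. The abstract does not say how the article combines them, so the voting rule, the function names, and the toy data are all assumptions.

```python
import math
from itertools import combinations

def kde(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h (1-D data)."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm

def pairwise_vote(x, samples, priors, pair_bandwidths):
    """Count wins per class over all two-class comparisons.

    pair_bandwidths maps a sorted label pair (a, b) to (h_a, h_b), so the
    same class may use a different bandwidth against each competitor.
    """
    votes = {c: 0 for c in samples}
    for a, b in combinations(sorted(samples), 2):
        h_a, h_b = pair_bandwidths[(a, b)]
        score_a = priors[a] * kde(x, samples[a], h_a)
        score_b = priors[b] * kde(x, samples[b], h_b)
        votes[a if score_a >= score_b else b] += 1
    return votes

# Toy data and bandwidths, invented purely for illustration.
samples = {"A": [0.0, 0.5, 1.0], "B": [2.0, 2.5, 3.0], "C": [5.0, 5.5, 6.0]}
priors = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
pair_bandwidths = {("A", "B"): (0.5, 0.5), ("A", "C"): (0.5, 1.0), ("B", "C"): (1.0, 1.0)}
votes = pairwise_vote(0.4, samples, priors, pair_bandwidths)
```

With J classes this makes J(J-1)/2 two-class comparisons, each of which can be tuned or examined across scales on its own.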

    Land use and land cover patterns as a reflection of subsurface architecture and groundwater quality in a large urban center (Varanasi) in the Ganges river basin, India

    No full text
    Varanasi is a rapidly developing city in the Himalayan-sourced Ganges river basin. To understand the sustainability of groundwater-sourced drinking water in Varanasi, it is essential to study the land use and land cover, which reflects the surface geomorphology vis-a-vis the sub-surface geology and influences groundwater conditions. We incorporate lithological and groundwater data obtained from an extensive network of boreholes at 110 sites in and around the city, reaching a maximum depth of 100 m below ground level (bgl). The unconsolidated subsurface is primarily composed of sand, silt, clay, and gravel, with a silty clay layer. Groundwater quality and stresses were determined through multi-dimensional hydrogeological approaches. The data were analyzed through multivariate statistics (Principal Component Analysis) to identify the governing factors influencing the broad hydrogeochemistry. PC1 for urban areas has higher loading values for Fe and Cl− than for semi-urban areas, highlighting contamination by municipal wastewater. PC2 for urban areas shows higher loading values for Mg2+ and HCO3− than for semi-urban areas. Due to heavy urbanization in Varanasi, the aquifer suffers substantial groundwater abstraction during particular times of the day compared to the agricultural lands. An increase of about 9% in built-up areas within a span of 10 years (2012–2022) poses a threat to the aquifer system of our study area, jeopardizing access to sustainable drinking water. With the expansion of urbanization and unregulated groundwater extraction, the vulnerability of the aquifer system will probably increase in the foreseeable future. Implementation of sustainable water management policies, engaging all economic sectors of the population in Varanasi, can expedite the process and safeguard the aquifer against its emerging vulnerability.
    Thus, comprehending evolving groundwater risks through non-invasive methods such as those discussed in the present study holds significant promise for effectively targeting safe groundwater availability in the future.