1,611 research outputs found

    The Loss Rank Principle for Model Selection

    We introduce a new principle for model selection in regression and classification. Many regression models are controlled by some smoothness or flexibility or complexity parameter c, e.g. the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. Let f_D^c be the (best) regressor of complexity c on data D. A more flexible regressor can fit more data D' well than a more rigid one. If something (here small loss) is easy to achieve it's typically worth less. We define the loss rank of f_D^c as the number of other (fictitious) data D' that are fitted better by f_D'^c than D is fitted by f_D^c. We suggest selecting the model complexity c that has minimal loss rank (LoRP). Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP only depends on the regression function and loss function. It works without a stochastic noise model, and is directly applicable to any non-parametric regressor, like kNN. In this paper we formalize, discuss, and motivate LoRP, study it for specific regression problems, in particular linear ones, and compare it to other model selection schemes. Comment: 16 pages
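    The loss rank described in this abstract can be illustrated with a small brute-force sketch. It is not the paper's formalization: it assumes the fictitious data D' keep the inputs fixed and vary only the targets over a small finite grid (`label_grid`), that loss is the squared error of kNN on its own training data, and that ties count toward the rank (otherwise a regressor that fits everything perfectly would trivially get rank 0). The function names `knn_fit_loss` and `loss_rank` are hypothetical.

```python
from itertools import product


def knn_fit_loss(xs, ys, k):
    """Squared-error loss of k-NN regression evaluated on its own training data."""
    total = 0.0
    for i, x in enumerate(xs):
        # Indices of the k training points nearest to x (x itself included);
        # distance ties are broken by the lower index.
        nearest = sorted(range(len(xs)), key=lambda j: (abs(xs[j] - x), j))[:k]
        prediction = sum(ys[j] for j in nearest) / k
        total += (ys[i] - prediction) ** 2
    return total


def loss_rank(xs, ys, k, label_grid):
    """Count target vectors on the grid that k-NN fits at least as well as ys.

    Assumption: inputs xs stay fixed, and only targets drawn from
    label_grid are enumerated as fictitious data.
    """
    observed = knn_fit_loss(xs, ys, k)
    return sum(
        1
        for alt in product(label_grid, repeat=len(xs))
        if knn_fit_loss(xs, alt, k) <= observed + 1e-12
    )


if __name__ == "__main__":
    xs = [0, 1, 2, 3]
    ys = [0, 1, 2, 3]
    grid = (0, 1, 2, 3)
    # Under these assumptions, pick the k with the smallest loss rank.
    ranks = {k: loss_rank(xs, ys, k, grid) for k in (1, 2, 3, 4)}
    print(ranks, "selected k:", min(ranks, key=ranks.get))
```

    With k = 1 every target vector is fitted perfectly, so the rank equals the full grid size; this is the sense in which an easily achieved small loss is "worth less".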

    Determining Principal Component Cardinality through the Principle of Minimum Description Length

    PCA (Principal Component Analysis) and its variants are ubiquitous techniques for matrix dimension reduction and reduced-dimension latent-factor extraction. One significant challenge in using PCA is the choice of the number of principal components. The information-theoretic MDL (Minimum Description Length) principle gives objective compression-based criteria for model selection, but it is difficult to analytically apply its modern definition - NML (Normalized Maximum Likelihood) - to the problem of PCA. This work shows a general reduction of NML problems to lower-dimension problems. Applying this reduction, it bounds the NML of PCA, by terms of the NML of linear regression, which are known. Comment: LOD 201

    Free Fashion?

    Nonparametric Hierarchical Clustering of Functional Data

    In this paper, we deal with the problem of curve clustering. We propose a nonparametric method which partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data-grid which is obtained using a Bayesian model selection approach while making no assumptions regarding the curves. Finally, a post-processing technique, aiming at reducing the number of clusters in order to improve the interpretability of the clustering, is proposed. It consists of optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly, this measure is none other than the sum of the Kullback-Leibler divergences between cluster distributions before and after the merges. The practical interest of the approach for functional data exploratory analysis is presented and compared with an alternative approach on an artificial and a real-world data set.

    Avian Climate Messengers

    This visual essay and video work were commissioned by Philip Ely, for the 'Climate Domesday Book': a speculative design project that explores contemporary questions related to the climate emergency. The book is a hybrid print-digital device designed by researchers in Australia and the UK, which uses a magic bookmark to read pages and trigger (via Bluetooth) an interaction: the playing of a video or audio on a nearby big screen. Our contribution explores ways to affectively communicate biodiversity loss in the Sixth Extinction through creative practice, particularly through visual metaphor and synecdoche. In this case, we focus on the Black-Throated Finch as a modern-day canary in (Adani's) coal mine.

    Drawing The Extinction Crisis

    ‘Precarious Birds’ is an ongoing collaboration through which the authors ‘stay with the trouble’ of the extinction crisis, engaging in creative practice to process our grief in response to critically endangered and extinct bird species. The project uses birds as an index: markers that point to the ecological, cultural and ethical dimensions of the extinction crisis more broadly. The collaborative aspect of the project involves thinking through deliberately slow processes of drawing, cross-stitch and writing, as well as contextualising this creative practice with shared texts and conversations: with each other, as well as ecologists, historians, artists and nature writers. This paper frames the collaboration as an ‘expanded conversation’ and uses the unfolding creative processes in response to two birds (Passenger Pigeon and Laysan Duck) to demonstrate how processes of drawing and tracing open opportunities for us to understand the ‘entangled significance’ of individual species within the extinction crisis. We argue that through documenting and sharing our expanded conversations, processes and artworks, we contribute to cultural ‘archives of loss’, which foster collective cultural memory about precarious bird species.

    To Companion a Companion

    Commissioned visual essays for the 'Companion Companion Reader', presented both on site (in Contemporary Art Tasmania, Hobart and UNSW Galleries, Sydney) as well as on the online archive: https://www.companioncompanionreader.com

    Dietary fat, cholesterol and colorectal cancer in a prospective study

    The relationships between consumption of total fat, major dietary fatty acids, cholesterol, consumption of meat and eggs, and the incidence of colorectal cancers were studied in a cohort based on the Finnish Mobile Clinic Health Examination Survey. Baseline (1967–1972) information on habitual food consumption over the preceding year was collected from 9959 men and women free of diagnosed cancer. A total of 109 new colorectal cancer cases were ascertained late 1999. High cholesterol intake was associated with increased risk for colorectal cancers. The relative risk between the highest and lowest quartiles of dietary cholesterol was 3.26 (95% confidence interval 1.54–6.88) after adjusting for age, sex, body mass index, occupation, smoking, geographic region, energy intake and consumption of vegetables, fruits and cereals. Consumption of total fat and intake of saturated, monounsaturated, or polyunsaturated fatty acids were not significantly associated with colorectal cancer risk. Nonsignificant associations were found between consumption of meat and eggs and colorectal cancer risk. The results of the present study indicate that high cholesterol intake may increase colorectal cancer risk, but do not suggest the presence of significant effects of dietary fat intake on colorectal cancer incidence. © 2001 Cancer Research Campaign http://www.bjcancer.co