The Loss Rank Principle for Model Selection
We introduce a new principle for model selection in regression and
classification. Many regression models are controlled by some smoothness or
flexibility or complexity parameter c, e.g. the number of neighbors to be
averaged over in k nearest neighbor (kNN) regression or the polynomial degree
in regression with polynomials. Let f_D^c be the (best) regressor of complexity
c on data D. A more flexible regressor can fit more data D' well than a more
rigid one. If something (here small loss) is easy to achieve it's typically
worth less. We define the loss rank of f_D^c as the number of other
(fictitious) data D' that are fitted better by f_D'^c than D is fitted by
f_D^c. We suggest selecting the model complexity c that has minimal loss rank
(LoRP). Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP
only depends on the regression function and loss function. It works without a
stochastic noise model, and is directly applicable to any non-parametric
regressor, like kNN. In this paper we formalize, discuss, and motivate LoRP,
study it for specific regression problems, in particular linear ones, and
compare it to other model selection schemes. Comment: 16 page
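The counting definition above can be illustrated by brute force on a toy problem. The following is a minimal sketch and not the paper's implementation. It assumes that the fictitious data D' keep the same inputs x and vary only the labels, that labels come from a small finite set, that the loss is the squared error, and that kNN counts a point among its own neighbors. It also counts D' that are fitted at least as well (ties included), since with a strict comparison k = 1 would trivially get rank 0. The names `knn_fit`, `empirical_loss`, `loss_rank` and `select_k` are hypothetical.

```python
from itertools import product


def knn_fit(xs, ys, k):
    """Predict each y_i as the mean label of the k points nearest to x_i (x_i included)."""
    preds = []
    for x in xs:
        # Stable sort by distance; ties are broken by index order.
        nearest = sorted(range(len(xs)), key=lambda j: abs(xs[j] - x))[:k]
        preds.append(sum(ys[j] for j in nearest) / k)
    return preds


def empirical_loss(xs, ys, k):
    """Squared-error loss of the kNN regressor on its own training data."""
    preds = knn_fit(xs, ys, k)
    return sum((y - p) ** 2 for y, p in zip(ys, preds))


def loss_rank(xs, ys, k, labels=(0, 1)):
    """Count label vectors y' that kNN fits at least as well as it fits ys.

    Assumption: D' shares the inputs xs and draws labels from `labels`.
    """
    actual = empirical_loss(xs, ys, k)
    return sum(
        1
        for fake in product(labels, repeat=len(ys))
        if empirical_loss(xs, fake, k) <= actual + 1e-12  # tolerance for float noise
    )


def select_k(xs, ys, ks, labels=(0, 1)):
    """Return the complexity k with minimal loss rank (first one on ties)."""
    return min(ks, key=lambda k: loss_rank(xs, ys, k, labels))


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4, 5]
    ys = [0, 0, 0, 1, 1, 1]
    for k in (1, 2, 3, 6):
        print(k, loss_rank(xs, ys, k))
```

With k = 1 every label vector is fitted perfectly, so the rank equals the total number of candidate label vectors and the perfect fit carries no information. The enumeration grows exponentially in the number of points, so this only works for toy data.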
Determining Principal Component Cardinality through the Principle of Minimum Description Length
PCA (Principal Component Analysis) and its variants are ubiquitous techniques
for matrix dimension reduction and reduced-dimension latent-factor extraction.
One significant challenge in using PCA is the choice of the number of principal
components. The information-theoretic MDL (Minimum Description Length) principle
gives objective compression-based criteria for model selection, but it is
difficult to analytically apply its modern definition - NML (Normalized Maximum
Likelihood) - to the problem of PCA. This work shows a general reduction of NML
problems to lower-dimension problems. Applying this reduction, it bounds the
NML of PCA in terms of the NML of linear regression, which are known. Comment: LOD 201
Nonparametric Hierarchical Clustering of Functional Data
In this paper, we deal with the problem of curve clustering. We propose a
nonparametric method which partitions the curves into clusters and discretizes
the dimensions of the curve points into intervals. The cross-product of these
partitions forms a data-grid which is obtained using a Bayesian model selection
approach while making no assumptions regarding the curves. Finally, a
post-processing technique, aiming at reducing the number of clusters in order
to improve the interpretability of the clustering, is proposed. It consists in
optimally merging the clusters step by step, which corresponds to an
agglomerative hierarchical classification whose dissimilarity measure is the
variation of the criterion. Interestingly, this measure is none other than the
sum of the Kullback-Leibler divergences between cluster distributions before
and after the merges. The practical interest of the approach for functional
data exploratory analysis is presented and compared with an alternative
approach on an artificial and a real-world data set.
Avian Climate Messengers
This visual essay and video work were commissioned by Philip Ely, for the 'Climate Domesday Book': a speculative design project that explores contemporary questions related to the climate emergency. The book is a hybrid print-digital device designed by researchers in Australia and the UK, which uses a magic bookmark to read pages and trigger (via Bluetooth) an interaction: the playing of a video or audio on a nearby big screen.
Our contribution explores ways to affectively communicate biodiversity loss in the Sixth Extinction through creative practice, particularly through visual metaphor and synecdoche. In this case, we focus on the Black-Throated Finch as a modern-day canary in (Adani's) coal mine.
Drawing The Extinction Crisis
‘Precarious Birds’ is an ongoing collaboration through which the authors ‘stay with the trouble’ of the extinction crisis, engaging in creative practice to process our grief in response to critically endangered and extinct bird species. The project uses birds as an index – markers that point to the ecological, cultural and ethical dimensions of the extinction crisis more broadly. The collaborative aspect of the project involves thinking through deliberately slow processes of drawing, cross-stitch and writing, as well as contextualising this creative practice with shared texts and conversations: with each other, as well as ecologists, historians, artists and nature writers. This paper frames the collaboration as an ‘expanded conversation’ and uses the unfolding creative processes in response to two birds – Passenger Pigeon and Laysan Duck – to demonstrate how processes of drawing and tracing open opportunities for us to understand the ‘entangled significance’ of individual species within the extinction crisis, and argue that through documenting and sharing our expanded conversations, processes and artworks, we contribute to cultural ‘archives of loss’, which foster collective cultural memory about precarious bird species.
To Companion a Companion
Commissioned visual essays for the 'Companion Companion Reader', presented both on site (in Contemporary Art Tasmania, Hobart and UNSW Galleries, Sydney) and in the online archive: https://www.companioncompanionreader.com
Dietary fat, cholesterol and colorectal cancer in a prospective study
The relationships between consumption of total fat, major dietary fatty acids, cholesterol, consumption of meat and eggs, and the incidence of colorectal cancers were studied in a cohort based on the Finnish Mobile Clinic Health Examination Survey. Baseline (1967–1972) information on habitual food consumption over the preceding year was collected from 9959 men and women free of diagnosed cancer. A total of 109 new colorectal cancer cases were ascertained by late 1999. High cholesterol intake was associated with increased risk for colorectal cancers. The relative risk between the highest and lowest quartiles of dietary cholesterol was 3.26 (95% confidence interval 1.54–6.88) after adjusting for age, sex, body mass index, occupation, smoking, geographic region, energy intake and consumption of vegetables, fruits and cereals. Consumption of total fat and intake of saturated, monounsaturated, or polyunsaturated fatty acids were not significantly associated with colorectal cancer risk. Nonsignificant associations were found between consumption of meat and eggs and colorectal cancer risk. The results of the present study indicate that high cholesterol intake may increase colorectal cancer risk, but do not suggest the presence of significant effects of dietary fat intake on colorectal cancer incidence. © 2001 Cancer Research Campaign http://www.bjcancer.co