Flexible constrained sampling with guarantees for pattern mining
Pattern sampling has been proposed as a potential solution to the infamous
pattern explosion. Instead of enumerating all patterns that satisfy the
constraints, individual patterns are sampled proportional to a given quality
measure. Several sampling algorithms have been proposed, but each of them has
its limitations when it comes to 1) flexibility in terms of quality measures
and constraints that can be used, and/or 2) guarantees with respect to sampling
accuracy. We therefore present Flexics, the first flexible pattern sampler that
supports a broad class of quality measures and constraints, while providing
strong guarantees regarding sampling accuracy. To achieve this, we leverage the
perspective on pattern mining as a constraint satisfaction problem and build
upon the latest advances in sampling solutions in SAT as well as existing
pattern mining algorithms. Furthermore, the proposed algorithm is applicable to
a variety of pattern languages, which allows us to introduce and tackle the
novel task of sampling sets of patterns. We introduce and empirically evaluate
two variants of Flexics: 1) a generic variant that addresses the well-known
itemset sampling task and the novel pattern set sampling task as well as a wide
range of expressive constraints within these tasks, and 2) a specialized
variant that exploits existing frequent itemset techniques to achieve
substantial speed-ups. Experiments show that Flexics is both accurate and
efficient, making it a useful tool for pattern-based data exploration.
Comment: Accepted for publication in Data Mining & Knowledge Discovery journal
(ECML/PKDD 2017 journal track)
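To make the sampling task concrete, here is a minimal sketch of the naive baseline the abstract argues against: enumerate every frequent itemset, then draw patterns with probability proportional to a quality measure. It is not Flexics. The function names, the toy transactions, and the choice of support as the quality measure are assumptions made for illustration only. The exhaustive enumeration step is the part that stops scaling once the number of patterns explodes.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset contained in at least min_support transactions.

    Brute force over all item combinations: fine for a toy example,
    infeasible on real data (this is the pattern explosion).
    """
    items = sorted({item for t in transactions for item in t})
    supports = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                supports[frozenset(candidate)] = support
    return supports


def sample_patterns(supports, k, rng):
    """Draw k patterns (with replacement), each proportional to its support."""
    patterns = list(supports)
    weights = [supports[p] for p in patterns]
    return rng.choices(patterns, weights=weights, k=k)


# Hypothetical toy data set: four transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
supports = frequent_itemsets(transactions, min_support=2)
samples = sample_patterns(supports, k=5, rng=random.Random(0))
```

In this toy data each single item has support 3 and each pair has support 2, so a single item is drawn 1.5 times as often as a pair. The full itemset {a, b, c} occurs only once and is filtered out by the support constraint.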
The success-index: an alternative approach to the h-index for evaluating an individual's research output
Among the most recent bibliometric indicators for normalizing the differences among fields of science in terms of citation behaviour, Kosmulski (J Informetr 5(3):481-485, 2011) proposed the NSP (number of successful papers) index. According to the authors, NSP deserves much attention for its great simplicity and immediate meaning, equivalent to those of the h-index, while it has the disadvantage of being prone to manipulation and not very efficient in terms of statistical significance. In the first part of the paper, we introduce the success-index, aimed at reducing the NSP-index's limitations, although requiring more computing effort. Next, we present a detailed analysis of the success-index from the point of view of its operational properties and a comparison with those of the h-index. Particularly interesting is the examination of the success-index scale of measurement, which is much richer than the h-index's. This makes the success-index much more versatile for different types of analysis, e.g., (cross-field) comparisons of the scientific output of (1) individual researchers, (2) researchers with different seniority, (3) research institutions of different size, (4) scientific journals, etc.
An index to quantify an individual's scientific research output that takes into account the effect of multiple coauthorship
I propose the index ħ ("hbar"), defined as the number of papers of an
individual that have citation count larger than or equal to the ħ of all
coauthors of each paper, as a useful index to characterize the scientific
output of a researcher that takes into account the effect of multiple
coauthorship. The bar is higher for ħ.
Comment: A few minor changes from v1. To be published in Scientometrics
The hw-rank: an h-index variant for ranking web pages
We introduce a novel ranking of search results based on a variant of the h-index for directed information networks such as the Web. The h-index was originally introduced to measure an individual researcher's scientific output and influence, but here a variant of it is applied to assess the "importance" of web pages. Like PageRank, the "importance" of a page is defined by the "importance" of the pages linking to it. However, unlike the computation of PageRank, which involves the whole web graph, computing the h-index for web pages (the hw-rank) is based on a local computation, and only the neighbors of the neighbors of the given node are considered. Preliminary results show a strong correlation between ranking with the hw-rank and PageRank; moreover, its computation is simpler than that of PageRank. Further, larger-scale experiments are needed in order to assess the applicability of the method.
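The abstract does not spell out the hw-rank formula, so the following is only one plausible reading, labeled as an assumption: a page gets the largest h such that at least h of the pages linking to it each have at least h incoming links themselves. The names `in_links` and `hw_rank_sketch` and the toy graph are hypothetical. The sketch does illustrate the locality claim: scoring a page touches only its in-neighbors and their in-neighbors.

```python
from collections import defaultdict


def in_links(out_links):
    """Invert a {page: [pages it links to]} map into {page: {pages linking to it}}."""
    incoming = defaultdict(set)
    for source, targets in out_links.items():
        for target in targets:
            incoming[target].add(source)
    return dict(incoming)


def hw_rank_sketch(page, incoming):
    """ASSUMED definition, not taken from the paper: h-index over the
    in-degrees of the pages that link to `page`."""
    neighbors = incoming.get(page, set())
    degrees = sorted((len(incoming.get(p, set())) for p in neighbors), reverse=True)
    h = 0
    for rank, degree in enumerate(degrees, start=1):
        if degree >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical four-page web graph.
web = {"a": ["c"], "b": ["c", "a"], "d": ["a", "b", "c"], "c": []}
incoming = in_links(web)
scores = {page: hw_rank_sketch(page, incoming) for page in web}
```

Unlike PageRank, nothing here iterates over the whole graph until convergence: each score is a single pass over a two-hop neighborhood.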
Statistical regularities in the rank-citation profile of scientists
Recent science of science research shows that scientific impact measures for journals and individual articles have quantifiable regularities across both time and discipline. However, little is known about the scientific impact distribution at the scale of an individual scientist. We analyze the aggregate production and impact using the rank-citation profile ci(r) of 200 distinguished professors and 100 assistant professors. For the entire range of paper rank r, we fit each ci(r) to a common distribution function. Since two scientists with equivalent Hirsch h-index can have significantly different ci(r) profiles, our results demonstrate the utility of the βi scaling parameter in conjunction with hi for quantifying individual publication impact. We show that the total number of citations Ci tallied from a scientist's Ni papers scales as . Such statistical regularities in the input-output patterns of scientists can be used as benchmarks for theoretical models of career progress
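As a small illustration of why the h-index alone can hide differences between scientists, the sketch below builds a rank-citation profile (citation counts sorted from most to least cited) and reads the standard Hirsch h-index off it. The citation counts are invented toy data and the function names are hypothetical. The fitting of a common distribution function and the scaling parameter from the abstract are not reproduced here.

```python
def rank_citation_profile(citations):
    """Return c(r): citation counts ordered by paper rank r = 1, 2, ..."""
    return sorted(citations, reverse=True)


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(rank_citation_profile(citations), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Two made-up scientists with the same h-index but different profiles.
scientist_a = [10, 8, 5, 4, 3]
scientist_b = [100, 50, 5, 4, 3]

h_a, h_b = h_index(scientist_a), h_index(scientist_b)
total_a, total_b = sum(scientist_a), sum(scientist_b)
```

Both toy scientists have h = 4, yet their total citation counts are 30 and 162. A single-number summary cannot show that difference, which the full profile does.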
Natural disasters and indicators of social cohesion
Do adversarial environmental conditions create social cohesion? We provide new answers to this question by exploiting spatial and temporal variation in exposure to earthquakes across Chile. Using a variety of methods and controlling for a number of socio-economic variables, we find that exposure to earthquakes has a positive effect on several indicators of social cohesion. Social cohesion increases after a big earthquake and slowly erodes in periods where environmental conditions are less adverse. Our results contribute to the current debate on whether and how environmental conditions shape formal and informal institutions
A Measurement of Gravitational Lensing of the Cosmic Microwave Background Using SPT-3G 2018 Data
We present a measurement of gravitational lensing over 1500 deg² of the
Southern sky using SPT-3G temperature data at 95 and 150 GHz taken in 2018. The
lensing amplitude relative to a fiducial Planck 2018 ΛCDM cosmology is
found to be , excluding instrumental and astrophysical
systematic uncertainties. We conduct extensive systematic and null tests to
check the robustness of the lensing measurements, and report a minimum-variance
combined lensing power spectrum over angular multipoles of , which
we use to constrain cosmological models. When analyzed alone and jointly with
primary cosmic microwave background (CMB) spectra within the ΛCDM
model, our lensing amplitude measurements are consistent with measurements from
SPT-SZ, SPTpol, ACT, and Planck. Incorporating loose priors on the baryon
density and other parameters including uncertainties on a foreground bias
template, we obtain a constraint on using the SPT-3G 2018 lensing data alone, where
σ₈ is a common measure of the amplitude of structure today and
Ωₘ is the matter density parameter. Combining SPT-3G 2018 lensing
measurements with baryon acoustic oscillation (BAO) data, we derive parameter
constraints of , , and Hubble constant
km s⁻¹ Mpc⁻¹. Using CMB anisotropy and lensing measurements from
SPT-3G only, we provide independent constraints on the spatial curvature of
(95% C.L.) and the dark energy density
of (68% C.L.). When combining SPT-3G
lensing data with SPT-3G CMB anisotropy and BAO data, we find an upper limit on
the sum of the neutrino masses of eV (95% C.L.)