Enhancing the selection of a model-based clustering with external qualitative variables
In cluster analysis, it can be useful to interpret the partition built from
the data in the light of external categorical variables which were not directly
involved in clustering the data. An approach is proposed in the model-based
clustering context to select a model and a number of clusters which both fit
the data well and take advantage of the potential illustrative ability of the
external variables. This approach makes use of the integrated joint likelihood
of the data and the partitions at hand, namely the model-based partition and
the partitions associated with the external variables. It is noteworthy that each
mixture model is fitted by the maximum likelihood methodology to the data,
excluding the external variables which are used to select a relevant mixture
model only. Numerical experiments illustrate the promising behaviour of the
derived criterion.
Forecasting Player Behavioral Data and Simulating in-Game Events
Understanding player behavior is fundamental in game data science. Video
games evolve as players interact with the game, so being able to foresee player
experience would help to ensure successful game development. In particular,
game developers need to evaluate beforehand the impact of in-game events.
Simulation optimization of these events is crucial to increase player
engagement and maximize monetization. We present an experimental analysis of
several methods to forecast game-related variables, with two main aims: to
obtain accurate predictions of in-app purchases and playtime in an operational
production environment, and to perform simulations of in-game events in order
to maximize sales and playtime. Our ultimate purpose is to take a step towards
the data-driven development of games. The results suggest that, even though the
performance of traditional approaches such as ARIMA is still better, the
outcomes of state-of-the-art techniques like deep learning are promising. Deep
learning comes up as a well-suited general model that could be used to forecast
a variety of time series with different dynamic behaviors.
Convergence of economic growth in Russian megacities
Purpose: The article presents the results of an empirical analysis of the economic growth of Russian cities with a population of over 1 million people (megacities). Design/Methodology/Approach: The analyzed indicator is the city product calculated according to the UN methodology for the period from 2010 to 2016. The paper analyses the process of β- and σ-convergence across Russian megacities using methods of spatial econometrics in addition to the traditional β-convergence techniques from the neoclassical theoretical framework. Findings: The dynamics of the coefficient of variation confirmed the presence of σ-convergence in city product. Positive spatial autocorrelation has been confirmed empirically. β-convergence for Russian megacities is found to be significant, and the spatial location of megacities significantly affects β-convergence. Control factors such as fixed capital investment per capita in 2010, average retail volume per capita in 2010, average annual number of employees of enterprises and organizations in 2010, and the dummy variable introduced for the "federal cities" Moscow and St. Petersburg are all found to have a positive and statistically significant impact on economic growth. Practical Implications: Policymakers may take the results into account when planning economic strategies for megacities and regions in Russia in order to facilitate regional economic growth and speed up convergence. Originality/Value: The main contribution of the study is that it considers economic growth for megacities rather than for regions, as is often the case in similar studies. The important finding is that megacities' economies do converge and that the influence of control factors is pronounced.
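The two convergence notions named in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it covers only the plain, non-spatial, unconditional case, and the city product figures, variable names, and helper functions below are hypothetical values chosen for illustration. σ-convergence is read off as a fall in the coefficient of variation between two years, and β-convergence as a negative OLS slope when average annual log growth is regressed on the log of the initial level.

```python
import math


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def beta_convergence_slope(initial, final, years):
    """OLS slope of average annual log growth on log initial level.

    A negative slope means that units starting from a lower level
    grew faster on average (unconditional beta-convergence).
    """
    x = [math.log(v) for v in initial]
    y = [(math.log(f) - math.log(i)) / years for i, f in zip(initial, final)]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical city product per capita for four cities (NOT data from the paper).
product_2010 = [10.0, 20.0, 40.0, 80.0]
product_2016 = [16.0, 28.0, 50.0, 90.0]

cv_2010 = coefficient_of_variation(product_2010)
cv_2016 = coefficient_of_variation(product_2016)
slope = beta_convergence_slope(product_2010, product_2016, years=6)

print("sigma-convergence (dispersion fell):", cv_2016 < cv_2010)
print("beta-convergence (negative slope):", slope < 0)
```

The spatial specification and the control factors described in the abstract would add further regressors and a spatial weights matrix, which this sketch leaves out.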
Beyond subjective and objective in statistics
We argue that the words "objectivity" and "subjectivity" in statistics
discourse are used in a mostly unhelpful way, and we propose to replace each of
them with broader collections of attributes, with objectivity replaced by
transparency, consensus, impartiality, and correspondence to observable
reality, and subjectivity replaced by awareness of multiple perspectives and
context dependence. The advantage of these reformulations is that the
replacement terms do not oppose each other. Instead of debating over whether a
given statistical method is subjective or objective (or normatively debating
the relative merits of subjectivity and objectivity in statistical practice),
we can recognize desirable attributes such as transparency and acknowledgment
of multiple perspectives as complementary goals. We demonstrate the
implications of our proposal with recent applied examples from pharmacology,
election polling, and socioeconomic stratification. Comment: 35 pages
Non-Gaussian statistics of pencil beam surveys
We study the effect of the non-Gaussian clustering of galaxies on the
statistics of pencil beam surveys. We find that the higher order moments of the
galaxy distribution play an important role in the probability distribution for
the power spectrum peaks. Taking into account the observed values for the
kurtosis of galaxy distribution we derive the general probability distribution
for the power spectrum modes in non-Gaussian models and show that the
probability to obtain the 128 $h^{-1}$ Mpc periodicity found in pencil beam surveys is
raised by roughly one order of magnitude. The non-Gaussianity of the galaxy
distribution is however still insufficient to explain the reported
peak-to-noise ratio of the periodicity, so that extra power on large scales
seems required. Comment: 9 pages, 2 figs available on request, LaTeX, revised version with
significant changes, preprint Fermilab-Pub-94-043-
Analysis of the evolution of the Spanish labour market through unsupervised learning
Unemployment in Spain is one of the biggest concerns of its inhabitants. Its unemployment rate is the second highest in the European Union, and in the second quarter of 2018 the rate was 15.2%, some 3.4 million unemployed. Construction is one of the activity sectors that have suffered the most from the economic crisis. In addition, the economic crisis affected the labour market in different ways in terms of occupation level or location. The aim of this paper is to discover how the labour market is organised, taking into account the jobs that workers obtained during two periods: 2011-2013, which corresponds to the economic crisis period, and 2014-2016, which was a period of economic recovery. The data used are official records of the Spanish administration corresponding to 1.9 and 2.4 million job placements, respectively. The labour market was analysed by applying unsupervised machine learning techniques to obtain clear and structured information on the employment generation process and the underlying labour mobility. We have applied two clustering methods with two different technologies, and the results indicate that there were some movements in the Spanish labour market which have changed the physiognomy of some of the jobs. The analysis reveals the changes in the labour market: the crisis forced greater geographical mobility and favoured the subsequent emergence of new job sources. Nevertheless, some clusters remained stable despite the crisis. We may conclude that we have achieved a characterisation of some important groups of workers in Spain. The methodology used, being supported by Big Data techniques, could serve to analyse any alternative job market. Funding: Ministerio de Economía y Competitividad, TIN2014-55894-C2-R and TIN2017-88209-C2-2-R, CO2017-8678