48 research outputs found

    Predicting Dominance Rankings for Score-Based Games

    Game competitions may involve different player roles and be score-based rather than win/loss based. This raises the issue of how best to draw opponents for matches in ongoing competitions, and how best to rank the players in each role. An example is the Ms Pac-Man versus Ghosts Competition, which requires competitors to develop software controllers to take charge of the game's protagonists: participants may develop software controllers for either or both Ms Pac-Man and the team of four ghosts. In this paper, we compare two ranking schemes for win-loss games, Bayes Elo and Glicko. We convert the game into one of win-loss ("dominance") by matching controllers of identical type against the same opponent in a series of pair-wise comparisons. This implicitly creates a "solution concept" as to what constitutes a good player. We analyze how many games are needed under two popular ranking algorithms, Glicko and Bayes Elo, before one can infer the strength of the players, according to our proposed solution concept, without performing an exhaustive evaluation. We show that Glicko should be the method of choice for online score-based game competitions.
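
The dominance conversion described in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the `higher_is_better` flag, the K-factor of 32, and the use of a plain Elo update (rather than the Bayes Elo or Glicko systems the paper evaluates) are all assumptions made for brevity.

```python
def dominance_outcome(score_a, score_b, higher_is_better=True):
    """Turn two scores earned against the SAME opponent into a win/loss result.

    Returns 1.0 if controller A dominates B, 0.0 if B dominates A, 0.5 on a tie.
    `higher_is_better` is a hypothetical flag: for a ghost team, a lower
    Ms Pac-Man score would count as the better result.
    """
    if score_a == score_b:
        return 0.5
    a_wins = score_a > score_b
    if not higher_is_better:
        a_wins = not a_wins
    return 1.0 if a_wins else 0.0


def elo_expected(rating_a, rating_b):
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, outcome_a, k=32.0):
    """Apply one plain Elo update; k=32 is an assumed, commonly used K-factor."""
    delta = k * (outcome_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two hypothetical Ms Pac-Man controllers play the same ghost team.
outcome = dominance_outcome(12000, 9500)       # A scored higher, so A dominates
print(elo_update(1500.0, 1500.0, outcome))     # (1516.0, 1484.0)
```

Each pairwise comparison yields one win/loss observation, so any win-loss rating system can consume the results without knowing the raw scores.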

    Team formation using recommendation systems

    The importance of team formation has long been recognized, but finding the most effective team from the available human resources remains an open problem. Having members with complementary skills, along with a few must-have behavioral traits, such as trust and collaborativeness among the team members, are the key ingredients behind team synergy and performance. This thesis designs and implements two different algorithms for the team formation problem using ideas adapted from the recommender systems literature. One of the proposed solutions uses the Glicko-2 rating system to rate the employees' skills, which can easily separate the skill ability and experience of the employees. The final contribution of this thesis is to build a system with "plug-in" capability, meaning any new recommendation algorithm could be easily plugged into the system. Our extensive experimental analyses explore nuances of data sources, data storage methodologies, as well as characteristics of different recommendation algorithms with rating and ranking sub-systems.

    Tailoring a psychophysiologically driven rating system

    Humans have always been interested in ways to measure and compare their performances to establish who is best at a particular activity. The first Olympic Games, for instance, were held in 776 BC, a defining moment in history at which ranking-based competitive activities reached the general populace. Every competition must face the issue of how to evaluate and rank competitors, and often rules are required to account for many different aspects such as variations in conditions, the ability to cheat, and, of course, the value of entertainment. Nowadays, measurements are performed through various rating systems, which consider the outcomes of the activity to rate the participants. However, they do not seem to address the psychological aspects of an individual in a competition. This dissertation employs several psychophysiological assessment instruments intending to facilitate the acquisition of skill level rating in competitive gaming. To do so, an exergame that uses non-conventional inputs, such as body tracking to prevent input biases, was developed. The sample size of this study is ten, and the participants were placed in a round-robin tournament to provide equal intervals between games for each player. Analysis of the competition outcomes revealed some critical insights into the psychophysiological instruments, especially the significance of Flow in terms of the prolificacy of a player. Although the findings did not provide an alternative to the traditional rating systems, they show the importance of considering other aspects of the competition, such as psychophysiological metrics, to fine-tune the rating. These potentially reveal more in-depth insight into the competition in comparison to just the binary outcome.

    A State-Space Perspective on Modelling and Inference for Online Skill Rating

    This paper offers a comprehensive review of the main methodologies used for skill rating in competitive sports. We advocate for a state-space model perspective, wherein players' skills are represented as time-varying, and match results serve as the sole observed quantities. The state-space model perspective facilitates the decoupling of modeling and inference, enabling a more focused approach highlighting model assumptions, while also fostering the development of general-purpose inference tools. We explore the essential steps involved in constructing a state-space model for skill rating before turning to a discussion on the three stages of inference: filtering, smoothing and parameter estimation. Throughout, we examine the computational challenges of scaling up to high-dimensional scenarios involving numerous players and matches, highlighting approximations and reductions used to address these challenges effectively. We provide concise summaries of popular methods documented in the literature, along with their inferential paradigms, and introduce new approaches to skill rating inference based on sequential Monte Carlo and finite state-spaces. We close with numerical experiments demonstrating a practical workflow on real data across different sports.

    Comparing Elo, Glicko, IRT, and Bayesian IRT Statistical Models for Educational and Gaming Data

    Statistical models used for estimating skill or ability levels often vary by field; however, their underlying mathematical models can be very similar. Differences in the underlying models can be due to the need to accommodate data with different underlying formats and structure. As the models from varying fields increase in complexity, their ability to be applied to different types of data may increase. Models that are applied to educational or psychological data have advanced to accommodate a wide range of data formats, including increased estimation accuracy with sparsely populated data matrices. Conversely, the field of online gaming has expanded over the last two decades to include the use of more complex statistical models to provide real-time game matching based on ability estimates. It can be useful to see how statistical models from educational and gaming fields compare, as different datasets may benefit from different ability estimation procedures. This study compared statistical models typically used in game matchmaking systems (Elo, Glicko) to models used in psychometric modeling (item response theory and Bayesian item response theory) using both simulated data and real data under a variety of conditions. Results indicated that conditions with small numbers of items or matches had the most accurate skill estimates using the Bayesian IRT (item response theory) one-parameter logistic (1PL) model, regardless of whether educational or gaming data were used. This held true for all sample sizes with small numbers of items. However, the Elo and the non-Bayesian IRT 1PL models were close to the Bayesian IRT 1PL model's estimations for both gaming and educational data. While the 2PL models were not shown to be accurate for the gaming study conditions, the IRT 2PL and Bayesian IRT 2PL models outperformed the 1PL models when 2PL educational data were generated with the larger sample size and item condition. Overall, the Bayesian IRT 1PL model seemed to be the best choice across the smaller sample and match size conditions.
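
The similarity between rating-system and psychometric models noted in the abstract above can be made concrete for the simplest case. The worked equation below uses the standard Elo expected-score formula and the standard 1PL (Rasch) item response function; the symbols $R_A$, $R_B$, $\theta$ and $b$ are notation chosen here for illustration, not taken from the study.

```latex
\begin{gather*}
E_A = \frac{1}{1 + 10^{(R_B - R_A)/400}}
  \qquad \text{(Elo expected score of player } A \text{ against } B\text{)} \\[4pt]
P(X = 1 \mid \theta, b) = \frac{1}{1 + e^{-(\theta - b)}}
  \qquad \text{(1PL probability of a correct response)} \\[4pt]
\theta = \frac{\ln 10}{400}\, R_A, \quad b = \frac{\ln 10}{400}\, R_B
  \;\Longrightarrow\;
  e^{-(\theta - b)} = 10^{(R_B - R_A)/400}
  \;\Longrightarrow\;
  P(X = 1 \mid \theta, b) = E_A
\end{gather*}
```

Under this rescaling the opponent's rating plays the role of item difficulty, so the two models share the same logistic form and differ mainly in how their parameters are estimated.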

    A Social Network Approach Reveals Associations between Mouse Social Dominance and Brain Gene Expression

    Modelling complex social behavior in the laboratory is challenging and requires analyses of dyadic interactions occurring over time in a physically and socially complex environment. In the current study, we approached the analyses of complex social interactions in group-housed male CD1 mice living in a large vivarium. Intensive observations of social interactions during a 3-week period indicated that male mice form a highly linear and steep dominance hierarchy that is maintained by fighting and chasing behaviors. Individual animals were classified as dominant, sub-dominant or subordinate according to their David's Scores and I&SI ranking. Using a novel dynamic temporal Glicko rating method, we ascertained that the dominance hierarchy was stable across time. Using social network analyses, we characterized the behavior of individuals within 66 unique relationships in the social group. We identified two individual network metrics, Kleinberg's Hub Centrality and Bonacich's Power Centrality, as accurate predictors of individual dominance and power. Comparing across behaviors, we establish that agonistic, grooming and sniffing social networks possess their own distinctive characteristics in terms of density, average path length, reciprocity, out-degree centralization and out-closeness centralization. Though grooming ties between individuals were largely independent of other social networks, sniffing relationships were highly predictive of the directionality of agonistic relationships. Individual variation in dominance status was associated with brain gene expression, with more dominant individuals having higher levels of corticotropin releasing factor mRNA in the medial and central nuclei of the amygdala and the medial preoptic area of the hypothalamus, as well as higher levels of hippocampal glucocorticoid receptor and brain-derived neurotrophic factor mRNA. This study demonstrates the potential and significance of combining complex social housing and intensive behavioral characterization of group-living animals with the utilization of novel statistical methods to further our understanding of the neurobiological basis of social behavior at the individual, relationship and group levels.

    Statistical methods for detecting match-fixing in tennis

    Match-fixing is a key problem facing many sports, undermining the integrity and sporting spectacle of events, ruining players' careers and enabling the criminals behind the fixes to funnel funds into other illicit activities. Although for a long time authorities were reticent to act, more and more sports bodies and betting companies are now taking steps to tackle the issue, though much remains to be done. Tennis in particular has faced past criticism for its approach to combatting match-fixing, culminating in widespread media coverage of a leak of match-fixing related documents in 2016, although the Tennis Integrity Unit has since intensified its efforts to deal with the problem. In this thesis, we develop new statistical methods for identifying tennis matches in which suspicious betting activity occurs. We also make some advancements on existing sports models to enable us to better analyse tennis matches to detect this corrupt activity. Our work is among the first to use both pre-match and in-play odds data to investigate match-fixing, and to also integrate betting volumes. Our pre-match odds are sampled at several intervals during the pre-match market, allowing for more detailed analysis than other work. Our in-play odds data are recorded during every game break along with live scores so that we can explore how the odds vary as the score progresses. In particular, we look for divergences between market odds and predictions coming both from sports models and from direct predictions of odds based on in-play events. Our methods successfully identify past matches that other external sources have found to contain suspicious betting activity, and are able to quantify how unusual this activity was in relation to typical betting behaviour. This suggests that our methods, coupled with other sources of evidence, can provide a valuable quantification of suspicious betting activity in future matches.