
    Optimizing CIGB-300 intralesional delivery in locally advanced cervical cancer

    Background: We conducted a phase 1 trial in patients with locally advanced cervical cancer by injecting 0.5 ml of the CK2 antagonist CIGB-300 into two different sites on tumours to assess tumour uptake, safety and pharmacodynamic activity, and to identify the recommended dose.

    Methods: Fourteen patients were treated with intralesional injections containing 35 or 70 mg of CIGB-300 in three alternate cycles of three consecutive days each before standard chemoradiotherapy. Tumour uptake was determined using 99mTc-radiolabelled peptide. In situ B23/nucleophosmin was determined by immunohistochemistry.

    Results: Maximum tumour uptake for the CIGB-300 70-mg dose was significantly higher than that observed for 35 mg (16.1±8.9 mg at 35 mg vs 31.3±12.9 mg at 70 mg; P=0.01). Both the AUC over 24 h and the biological half-life were also significantly higher with 70 mg of CIGB-300 (P<0.001). Unincorporated CIGB-300 diffused rapidly into the blood and was distributed mainly to the kidneys, and marginally to the liver, lungs, heart and spleen. There was no dose-limiting toxicity, and moderate allergic-like reactions were the most common systemic side effect, with a strong correlation between unincorporated CIGB-300 and histamine levels in blood. CIGB-300 at 70 mg downregulated B23/nucleophosmin (P=0.03) in tumour specimens.

    Conclusion: Intralesional injections of 70 mg CIGB-300 in two sites (0.5 ml per injection) under this treatment plan are recommended for evaluation in phase 2 studies.

    Authors: Sarduy, M. R. (Medical-Surgical Research Center, Cuba); García, I. (Centro de Ingeniería Genética y Biotecnología, Cuba); Coca, M. A. (Clinical Investigation Center, Cuba); Perera, A. (Clinical Investigation Center, Cuba); Torres, L. A. (Clinical Investigation Center, Cuba); Valenzuela, C. M. (Centro de Ingeniería Genética y Biotecnología, Cuba); Baladrón, I. (Centro de Ingeniería Genética y Biotecnología, Cuba); Solares, M. (Hospital Materno Ramón González Coro, Cuba); Reyes, V. (Center for Genetic Engineering and Biotechnology, Havana, Cuba); Hernández, I. (Isotope Center, Cuba); Perera, Y. (Centro de Ingeniería Genética y Biotecnología, Cuba); Martínez, Y. M. (Medical-Surgical Research Center, Cuba); Molina, L. (Medical-Surgical Research Center, Cuba); González, Y. M. (Medical-Surgical Research Center, Cuba); Ancízar, J. A. (Centro de Ingeniería Genética y Biotecnología, Cuba); Prats, A. (Clinical Investigation Center, Cuba); González, L. (Centro de Ingeniería Genética y Biotecnología, Cuba); Casacó, C. A. (Clinical Investigation Center, Cuba); Acevedo, B. E. (Centro de Ingeniería Genética y Biotecnología, Cuba); López Saura, P. A. (Centro de Ingeniería Genética y Biotecnología, Cuba); Alonso, Daniel Fernando (Universidad Nacional de Quilmes, Argentina); Gómez, R. (Elea Laboratories, Argentina); Perea Rodríguez, S. E. (Center for Genetic Engineering and Biotechnology, Havana; Centro de Ingeniería Genética y Biotecnología, Cuba).
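
    The pharmacokinetic endpoints mentioned in this abstract (AUC over 24 h and biological half-life) are standard derived quantities. As an illustrative sketch only, not the trial's analysis, the following shows how they are typically computed from time-activity data; the sampling times and mono-exponential decay constant below are hypothetical:

    ```python
    import math

    def auc_trapezoid(times_h, amounts_mg):
        """Area under the amount-time curve by the linear trapezoidal rule."""
        pairs = list(zip(times_h, amounts_mg))
        return sum((t2 - t1) * (a1 + a2) / 2
                   for (t1, a1), (t2, a2) in zip(pairs, pairs[1:]))

    def biological_half_life(times_h, amounts_mg):
        """Half-life from a log-linear least-squares fit of A(t) = A0 * exp(-k t)."""
        n = len(times_h)
        logs = [math.log(a) for a in amounts_mg]
        mean_t = sum(times_h) / n
        mean_y = sum(logs) / n
        # slope of log-amount vs time is -k
        k = -(sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
              / sum((t - mean_t) ** 2 for t in times_h))
        return math.log(2) / k

    # Hypothetical decay of retained peptide (not trial data), k = 0.1 per hour:
    t = [0, 2, 4, 8, 24]
    a = [31.3 * math.exp(-0.1 * ti) for ti in t]
    print(round(auc_trapezoid(t, a), 1),
          round(biological_half_life(t, a), 2))  # half-life → 6.93 h (= ln 2 / 0.1)
    ```

    With exact mono-exponential data the fitted half-life recovers ln 2 / k; with real measurements the log-linear fit is usually restricted to the terminal elimination phase.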

    On environment difficulty and discriminating power

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10458-014-9257-1

    This paper presents a way to estimate the difficulty and discriminating power of any task instance. We focus on a very general setting for tasks: interactive (possibly multiagent) environments where an agent acts upon observations and rewards. Instead of analysing the complexity of the environment, the state space or the actions that are performed by the agent, we analyse the performance of a population of agent policies against the task, leading to a distribution that is examined in terms of policy complexity. This distribution is then sliced by the algorithmic complexity of the policy and analysed through several diagrams and indicators. The notion of environment response curve is also introduced, by inverting the performance results into an ability scale. We apply all these concepts, diagrams and indicators to two illustrative problems: a class of agent-populated elementary cellular automata, showing how the difficulty and discriminating power may vary for several environments, and a multiagent system, where agents can become predators or preys, and may need to coordinate. Finally, we discuss how these tools can be applied to characterise (interactive) tasks and (multi-agent) environments. These characterisations can then be used to get more insight about agent performance and to facilitate the development of adaptive tests for the evaluation of agent abilities.

    I thank the reviewers for their comments, especially those aiming at a clearer connection with the field of multi-agent systems and the suggestion of better approximations for the calculation of the response curves. The implementation of the elementary cellular automata used in the environments is based on the library 'CellularAutomaton' by John Hughes for R [58]. I am grateful to Fernando Soler-Toscano for letting me know about their work [65] on the complexity of 2D objects generated by elementary cellular automata. I would also like to thank David L. Dowe for his comments on a previous version of this paper. This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project PROMETEO/2008/051, the COST (European Cooperation in the field of Scientific and Technical Research) Action IC0801, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA) and funded by the Ministerio de Economía y Competitividad in Spain (PCIN-2013-037).

    José Hernández-Orallo (2015). On environment difficulty and discriminating power. Autonomous Agents and Multi-Agent Systems, 29(3), 402–454. https://doi.org/10.1007/s10458-014-9257-1

    References:
    Anderson, J., Baltes, J., & Cheng, C. T. (2011). Robotics competitions as benchmarks for AI research. The Knowledge Engineering Review, 26(1), 11–17.
    Andre, D., & Russell, S. J. (2002). State abstraction for programmable reinforcement learning agents. In Proceedings of the National Conference on Artificial Intelligence (pp. 119–125). AAAI Press / MIT Press.
    Antunes, L., Fortnow, L., van Melkebeek, D., & Vinodchandran, N. V. (2006). Computational depth: Concept and applications. Theoretical Computer Science, 354(3), 391–404.
    Arai, K., Kaminka, G. A., Frank, I., & Tanaka-Ishii, K. (2003). Performance competitions as research infrastructure: Large scale comparative studies of multi-agent teams. Autonomous Agents and Multi-Agent Systems, 7(1–2), 121–144.
    Ashcraft, M. H., Donley, R. D., Halas, M. A., & Vakali, M. (1992). Working memory, automaticity, and problem difficulty. In J. I. D. Campbell (Ed.), The nature and origins of mathematical skills (Advances in Psychology, Vol. 91, pp. 301–329). North-Holland.
    Ay, N., Müller, M., & Szkola, A. (2010). Effective complexity and its relation to logical depth. IEEE Transactions on Information Theory, 56(9), 4593–4607.
    Barch, D. M., Braver, T. S., Nystrom, L. E., Forman, S. D., Noll, D. C., & Cohen, J. D. (1997). Dissociating working memory from task difficulty in human prefrontal cortex. Neuropsychologia, 35(10), 1373–1380.
    Bordini, R. H., Hübner, J. F., & Wooldridge, M. (2007). Programming multi-agent systems in AgentSpeak using Jason. Wiley.
    Boutilier, C., Reiter, R., Soutchanski, M., Thrun, S., et al. (2000). Decision-theoretic, high-level agent programming in the situation calculus. In Proceedings of the National Conference on Artificial Intelligence (pp. 355–362). AAAI Press / MIT Press.
    Busoniu, L., Babuska, R., & De Schutter, B. (2008). A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 38(2), 156–172.
    Chaitin, G. J. (1977). Algorithmic information theory. IBM Journal of Research and Development, 21, 350–359.
    Chedid, F. B. (2010). Sophistication and logical depth revisited. In 2010 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) (pp. 1–4). IEEE.
    Cheeseman, P., Kanefsky, B., & Taylor, W. M. (1991). Where the really hard problems are. In Proceedings of IJCAI-1991 (pp. 331–337).
    Dastani, M. (2008). 2APL: A practical agent programming language. Autonomous Agents and Multi-Agent Systems, 16(3), 214–248.
    Delahaye, J. P., & Zenil, H. (2011). Numerical evaluation of algorithmic complexity for short strings: A glance into the innermost structure of randomness. Applied Mathematics and Computation, 219(1), 63–77.
    Dowe, D. L. (2008). Foreword re C. S. Wallace. Computer Journal, 51(5), 523–560. Christopher Stewart Wallace (1933–2004) memorial special issue.
    Dowe, D. L., & Hernández-Orallo, J. (2012). IQ tests are not for machines, yet. Intelligence, 40(2), 77–81.
    Du, D. Z., & Ko, K. I. (2011). Theory of computational complexity (Vol. 58). Wiley-Interscience.
    Elo, A. E. (1978). The rating of chessplayers, past and present (Vol. 3). Batsford.
    Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Lawrence Erlbaum.
    Fatès, N., & Chevrier, V. (2010). How important are updating schemes in multi-agent systems? An illustration on a multi-turmite model. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (Vol. 1, pp. 533–540). IFAAMAS.
    Ferber, J., & Müller, J. P. (1996). Influences and reaction: A model of situated multiagent systems. In Proceedings of the Second International Conference on Multi-Agent Systems (ICMAS-96) (pp. 72–79).
    Ferrando, P. J. (2009). Difficulty, discrimination, and information indices in the linear factor analysis model for continuous item responses. Applied Psychological Measurement, 33(1), 9–24.
    Ferrando, P. J. (2012). Assessing the discriminating power of item and test scores in the linear factor-analysis model. Psicológica, 33, 111–139.
    Gent, I. P., & Walsh, T. (1994). Easy problems are sometimes hard. Artificial Intelligence, 70(1), 335–345.
    Gershenson, C., & Fernandez, N. (2012). Complexity and information: Measuring emergence, self-organization, and homeostasis at multiple scales. Complexity, 18(2), 29–44.
    Gruner, S. (2010). Mobile agent systems and cellular automata. Autonomous Agents and Multi-Agent Systems, 20(2), 198–233.
    Hardman, D. K., & Payne, S. J. (1995). Problem difficulty and response format in syllogistic reasoning. The Quarterly Journal of Experimental Psychology, 48(4), 945–975.
    He, J., Reeves, C., Witt, C., & Yao, X. (2007). A note on problem difficulty measures in black-box optimization: Classification, realizations and predictability. Evolutionary Computation, 15(4), 435–443.
    Hernández-Orallo, J. (2000). Beyond the Turing test. Journal of Logic, Language and Information, 9(4), 447–466.
    Hernández-Orallo, J. (2000). On the computational measurement of intelligence factors. In A. Meystel (Ed.), Performance metrics for intelligent systems workshop (pp. 1–8). Gaithersburg, MD: National Institute of Standards and Technology.
    Hernández-Orallo, J. (2000). Thesis: Computational measures of information gain and reinforcement in inference processes. AI Communications, 13(1), 49–50.
    Hernández-Orallo, J. (2010). A (hopefully) non-biased universal environment class for measuring intelligence of biological and artificial systems. In M. Hutter et al. (Eds.), 3rd International Conference on Artificial General Intelligence (pp. 182–183). Atlantis Press. Extended report at http://users.dsic.upv.es/proy/anynt/unbiased.pdf
    Hernández-Orallo, J., & Dowe, D. L. (2010). Measuring universal intelligence: Towards an anytime intelligence test. Artificial Intelligence, 174(18), 1508–1539.
    Hernández-Orallo, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Insa-Cabrera, J. (2011). On more realistic environment distributions for defining, evaluating and developing intelligence. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), Artificial General Intelligence 2011 (LNAI, Vol. 6830, pp. 82–91). Springer.
    Hernández-Orallo, J., Dowe, D. L., & Hernández-Lloreda, M. V. (2014). Universal psychometrics: Measuring cognitive abilities in the machine kingdom. Cognitive Systems Research, 27, 50–74.
    Hernández-Orallo, J., Insa, J., Dowe, D. L., & Hibbard, B. (2012). Turing tests with Turing machines. In A. Voronkov (Ed.), The Alan Turing Centenary Conference, Turing-100 (EPiC Series, Vol. 10, pp. 140–156).
    Hernández-Orallo, J., & Minaya-Collado, N. (1998). A formal definition of intelligence based on an intensional variant of Kolmogorov complexity. In Proceedings of the International Symposium of Engineering of Intelligent Systems (EIS'98) (pp. 146–163). ICSC Press.
    Hibbard, B. (2009). Bias and no free lunch in formal measures of intelligence. Journal of Artificial General Intelligence, 1(1), 54–61.
    Hoos, H. H. (1999). SAT-encodings, search space structure, and local search performance. In 1999 International Joint Conference on Artificial Intelligence (Vol. 16, pp. 296–303).
    Insa-Cabrera, J., Benacloch-Ayuso, J. L., & Hernández-Orallo, J. (2012). On measuring social intelligence: Experiments on competition and cooperation. In J. Bach, B. Goertzel, & M. Iklé (Eds.), AGI (LNCS, Vol. 7716, pp. 126–135). Springer.
    Insa-Cabrera, J., Dowe, D. L., España-Cubillo, S., Hernández-Lloreda, M. V., & Hernández-Orallo, J. (2011). Comparing humans and AI agents. In J. Schmidhuber, K. R. Thórisson, & M. Looks (Eds.), Artificial General Intelligence 2011 (LNAI, Vol. 6830, pp. 122–132). Springer.
    Knuth, D. E. (1973). Sorting and searching (The Art of Computer Programming, Vol. 3). Addison-Wesley.
    Kotovsky, K., & Simon, H. A. (1990). What makes some problems really hard: Explorations in the problem space of difficulty. Cognitive Psychology, 22(2), 143–183.
    Legg, S. (2008). Machine super intelligence. PhD thesis, Department of Informatics, University of Lugano.
    Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444.
    Leonetti, M., & Iocchi, L. (2010). Improving the performance of complex agent plans through reinforcement learning. In Proceedings of the 2010 International Conference on Autonomous Agents and Multiagent Systems (Vol. 1, pp. 723–730). IFAAMAS.
    Levin, L. A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3), 265–266.
    Levin, L. A. (1986). Average case complete problems. SIAM Journal on Computing, 15, 285.
    Li, M., & Vitányi, P. (2008). An introduction to Kolmogorov complexity and its applications (3rd ed.). Springer.
    Low, C. K., Chen, T. Y., & Rönnquist, R. (1999). Automated test case generation for BDI agents. Autonomous Agents and Multi-Agent Systems, 2(4), 311–332.
    Madden, M. G., & Howley, T. (2004). Transfer of experience between reinforcement learning environments with progressive difficulty. Artificial Intelligence Review, 21(3), 375–398.
    Mellenbergh, G. J. (1994). Generalized linear item response theory. Psychological Bulletin, 115(2), 300.
    Michel, F. (2004). Formalisme, outils et éléments méthodologiques pour la modélisation et la simulation multi-agents. PhD thesis, Université des sciences et techniques du Languedoc, Montpellier.
    Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81.
    Orponen, P., Ko, K. I., Schöning, U., & Watanabe, O. (1994). Instance complexity. Journal of the ACM, 41(1), 96–121.
    Simon, H. A., & Kotovsky, K. (1963). Human acquisition of concepts for sequential patterns. Psychological Review, 70(6), 534.
    R Core Team (2013). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
    Whiteson, S., Tanner, B., & White, A. (2010). The reinforcement learning competitions. The AI Magazine, 31(2), 81–94.
    Wiering, M., & van Otterlo, M. (Eds.). (2012). Reinforcement learning: State-of-the-art. Springer.
    Wolfram, S. (2002). A new kind of science. Wolfram Media.
    Zatuchna, Z., & Bagnall, A. (2009). Learning mazes with aliasing states: An LCS algorithm with associative perception. Adaptive Behavior, 17(1), 28–57.
    Zenil, H. (2010). Compression-based investigation of the dynamical properties of cellular automata and other systems. Complex Systems, 19(1), 1–28.
    Zenil, H. (2011). Une approche expérimentale à la théorie algorithmique de la complexité. PhD thesis, Université de Lille.
    Zenil, H., Soler-Toscano, F., Delahaye, J. P., & Gauvrit, N. (2012). Two-dimensional Kolmogorov complexity and validation of the coding theorem method by compressibility. arXiv preprint arXiv:1212.6745.
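
    The population-based approach this abstract describes, evaluating many policies against a task and summarising the resulting performance distribution, can be illustrated with a toy sketch. The environment here (matching a hidden bit string) and the two indicators (difficulty as one minus mean performance, discriminating power as the spread of performance) are simplifications assumed for illustration, not the paper's actual environments or measures:

    ```python
    import random

    def evaluate(policy, target):
        """Mean reward of a fixed policy: fraction of positions where the
        policy's action matches the environment's hidden target string."""
        return sum(p == t for p, t in zip(policy, target)) / len(target)

    def difficulty_and_discrimination(target, n_policies=2000, seed=0):
        """Estimate difficulty (1 - mean performance) and discriminating power
        (standard deviation of performance) over a population of random policies."""
        rng = random.Random(seed)
        results = [evaluate([rng.randint(0, 1) for _ in target], target)
                   for _ in range(n_policies)]
        mean = sum(results) / len(results)
        var = sum((r - mean) ** 2 for r in results) / len(results)
        return 1 - mean, var ** 0.5

    difficulty, power = difficulty_and_discrimination([1, 0, 1, 1, 0, 0, 1, 0])
    print(round(difficulty, 3), round(power, 3))  # difficulty near 0.5 for random policies
    ```

    A response curve in the paper's sense would additionally slice this distribution by policy complexity and invert performance into an ability scale; the sketch stops at the raw distribution.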

    How Much Do Focal Infarcts Distort White Matter Lesions and Global Cerebral Atrophy Measures?

    BACKGROUND: White matter lesions (WML) and brain atrophy are important biomarkers in stroke and dementia. Stroke lesions, whether acute or old, symptomatic or silent, are common in older people. Such stroke lesions can have signals similar to WML and cerebrospinal fluid (CSF) on magnetic resonance (MR) images, and may be misclassified as WML or CSF by MR image processing algorithms, distorting the measured WML and brain atrophy volumes away from their true values. We evaluated the effect that acute or old stroke lesions at baseline, and new stroke lesions occurring during follow-up, could have on measurement of WML volume, cerebral atrophy and their longitudinal progression.

    METHODS: We used MR imaging data from patients who had originally presented with acute lacunar or minor cortical ischaemic stroke symptoms, recruited prospectively, who were scanned at baseline and about 3 years later. We measured WML and CSF volumes (ml) semi-automatically. We manually outlined the acute index stroke lesion (ISL), any old stroke lesions present at baseline, and new lesions appearing de novo during follow-up. We compared baseline and follow-up WML volume, cerebral atrophy and their longitudinal progression excluding and including the acute ISL, old and de novo stroke lesions. A non-parametric test (Wilcoxon's signed-rank test) was used to compare the effects.

    RESULTS: Among 46 patients (mean age 72 years), 33 had an ISL visible on MR imaging (median volume 2.05 ml, IQR 0.88–8.88) and 7 of the 33 had old lacunes at baseline: WML volume was 8.54 ml (IQR 5.86–15.80) excluding versus 10.98 ml (IQR 6.91–24.86) including the ISL (p < 0.001). At follow-up, a median of 39 months later (IQR 30–45), 3 patients had a de novo stroke lesion; total stroke lesion volume had decreased in 11 and increased in 22 patients: WML volume was 12.17 ml (IQR 8.54–19.86) excluding versus 14.79 ml (IQR 10.02–38.03) including total stroke lesions (p < 0.001). Including or excluding lacunes at baseline or follow-up also made small differences. Twenty-two of the 33 patients had tissue loss due to stroke lesions between baseline and follow-up, resulting in a net median brain tissue volume loss (i.e. atrophy) during follow-up of 24.49 ml (IQR 12.87–54.01) excluding versus 24.61 ml (IQR 15.54–54.04) including tissue loss due to stroke lesions (p < 0.001). Including stroke lesions in the WML volume added substantial noise, reduced statistical power, and thus increased the sample size estimated for a clinical trial.

    CONCLUSIONS: Failure to exclude even small stroke lesions distorts WML volume, cerebral atrophy and their longitudinal progression measurements. This has important implications for the design and sample size calculations of observational studies and randomised trials using WML volume, WML progression or brain atrophy as outcome measures. Improved methods of discriminating between stroke lesions and WML, and between tissue loss due to stroke lesions and true brain atrophy, are required.
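
    The paired comparison this study relies on, a Wilcoxon signed-rank test on volumes measured including versus excluding stroke lesions, can be sketched as follows. This implementation uses the normal approximation without zero/tie corrections, and the paired volumes are invented for illustration, not the study's per-patient data:

    ```python
    import math

    def wilcoxon_signed_rank(x, y):
        """Paired Wilcoxon signed-rank test (normal approximation, no tie/zero
        variance correction): returns (W+, two-sided p) for paired samples."""
        diffs = [b - a for a, b in zip(x, y) if b != a]   # drop zero differences
        ranked = sorted((abs(d), d > 0) for d in diffs)
        n = len(ranked)
        ranks = {}
        i = 0
        while i < n:                                      # average ranks over ties
            j = i
            while j < n and ranked[j][0] == ranked[i][0]:
                j += 1
            for k in range(i, j):
                ranks[k] = (i + 1 + j) / 2
            i = j
        w_pos = sum(ranks[k] for k in range(n) if ranked[k][1])
        mu = n * (n + 1) / 4
        sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
        z = (w_pos - mu) / sigma
        p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
        return w_pos, p

    # Hypothetical paired WML volumes (ml): excluding vs including stroke lesions
    excl = [8.5, 5.9, 15.8, 10.1, 7.2, 12.4, 9.0, 11.3]
    incl = [11.0, 6.9, 24.9, 12.2, 7.9, 15.1, 10.4, 14.0]
    w, p = wilcoxon_signed_rank(excl, incl)
    print(w, round(p, 4))  # all differences positive → W+ = 36.0 (sum of ranks 1..8)
    ```

    With n as small as in this sketch, the exact permutation distribution (as in scipy.stats.wilcoxon) would normally be preferred over the normal approximation.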

    A xandarellid artiopodan from Morocco – a middle Cambrian link between soft-bodied euarthropod communities in North Africa and South China

    NB. A corrigendum [correction] for this article was published online on 09 May 2017; this has been attached to this article as an additional file. This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ © The Author(s) 2017. The attached file is the published version of the article

    Estimating retention benchmarks for salvage logging to protect biodiversity

    S.T. was supported by the Humboldt Foundation and by a MOST (Ministry of Science and Technology) Taiwan Research Fellowship to work with A.C. at National Tsing Hua University, Taiwan. S.T. received funds from the Gregor Louisoder Environmental Foundation. A.B.L. received funds from the Humboldt Foundation.

    Forests are increasingly affected by natural disturbances. Subsequent salvage logging, a widespread management practice conducted predominantly to recover economic capital, produces further disturbance and impacts biodiversity worldwide. Hence, naturally disturbed forests are among the most threatened habitats in the world, with consequences for their associated biodiversity. However, there are no evidence-based benchmarks for the proportion of the area of naturally disturbed forests to be excluded from salvage logging to conserve biodiversity. We apply a mixed rarefaction/extrapolation approach to a global multi-taxa dataset from disturbed forests, including birds, plants, insects and fungi, to close this gap. We find that 75 ± 7% (mean ± SD) of a naturally disturbed area of a forest needs to be left unlogged to maintain 90% richness of its unique species, whereas retaining 50% of a naturally disturbed forest unlogged maintains 73 ± 12% of its unique species richness. These values do not change with the time elapsed since disturbance but vary considerably among taxonomic groups.

    Open Access funding enabled and organized by Projekt DEAL.
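
    The idea of a retention benchmark, the smallest unlogged fraction of a disturbed forest that maintains a target share of its species richness, can be illustrated with a toy resampling sketch. The incidence data below are synthetic, and plain resampling over plots stands in for the paper's rarefaction/extrapolation estimators:

    ```python
    import random

    def retained_richness(plots, frac_unlogged, trials=300, seed=1):
        """Mean fraction of the total species pool still present when only a
        random subset of plots (frac_unlogged of them) is left unlogged."""
        rng = random.Random(seed)
        pool = set().union(*plots)
        k = max(1, round(frac_unlogged * len(plots)))
        total = 0.0
        for _ in range(trials):
            kept = rng.sample(plots, k)
            total += len(set().union(*kept)) / len(pool)
        return total / trials

    def retention_benchmark(plots, target=0.90):
        """Smallest unlogged fraction (in 5% steps) whose expected retained
        richness meets the target."""
        for pct in range(5, 101, 5):
            if retained_richness(plots, pct / 100) >= target:
                return pct / 100
        return 1.0

    # Synthetic disturbed-forest plots, each a set of species codes:
    plots = [{f"sp{i}" for i in random.Random(p).sample(range(40), 8)}
             for p in range(20)]
    print(retention_benchmark(plots))
    ```

    Because rare species occur in few plots, retained richness rises steeply at first and then saturates, which is why high targets such as 90% demand large unlogged fractions.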

    Gla-rich protein function as an anti-inflammatory agent in monocytes/macrophages: implications for calcification-related chronic inflammatory diseases

    Calcification-related chronic inflammatory diseases are multifactorial pathological processes, involving a complex interplay between inflammation and calcification events in a positive feedback loop driving disease progression. Gla-rich protein (GRP) is a vitamin K dependent protein (VKDP) shown to function as a calcification inhibitor in cardiovascular and articular tissues, and proposed as an anti-inflammatory agent in chondrocytes and synoviocytes, acting as a new crosstalk factor between these two interconnected events in osteoarthritis. However, a possible function of GRP in the immune system has never been studied. Here we focused our investigation on the involvement of GRP in cell inflammatory response mechanisms, using a combination of freshly isolated human leucocytes and the undifferentiated/differentiated THP-1 cell line. Our results demonstrate that VKDPs such as GRP and matrix gla protein (MGP) are synthesized and gamma-carboxylated in the majority of human immune system cells involved in either innate or adaptive immune responses. Stimulation of THP-1 monocytes/macrophages with LPS or hydroxyapatite (HA) up-regulated GRP expression, and treatments with GRP or GRP-coated basic calcium phosphate crystals resulted in the down-regulation of mediators of inflammation and inflammatory cytokines, independently of the protein's gamma-carboxylation status. Moreover, overexpression of GRP in THP-1 cells rescued the inflammation induced by LPS and HA, through down-regulation of the proinflammatory cytokines TNFα and IL-1β and of NF-κB. Interestingly, GRP was detected at protein and mRNA levels in extracellular vesicles released by macrophages, which may act as vehicles for extracellular trafficking and release. Our data indicate GRP as an endogenous mediator of inflammatory responses acting as an anti-inflammatory agent in monocytes/macrophages. We propose that in the context of chronic inflammation and calcification-related pathologies, GRP might act as a novel molecular mediator linking inflammation and calcification events, with potential therapeutic application.

    Portuguese Science and Technology Foundation (FCT) [PTDC/SAU-ORG/117266/2010, PTDC/BIM-MEC/1168/2012, UID/Multi/04326/2013]; FCT fellowships [SFRH/BPD/70277/2010, SFRH/BD/111824/2015].

    Evaluation in artificial intelligence: From task-oriented to ability-oriented measurement

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10462-016-9505-7

    The evaluation of artificial intelligence systems and components is crucial for the progress of the discipline. In this paper we describe and critically assess the different ways AI systems are evaluated, and the role of components and techniques in these systems. We first focus on the traditional task-oriented evaluation approach. We identify three kinds of evaluation: human discrimination, problem benchmarks and peer confrontation. We describe some of the limitations of the many evaluation schemes and competitions in these three categories, and follow the progression of some of these tests. We then focus on a less customary (and challenging) ability-oriented evaluation approach, where a system is characterised by its (cognitive) abilities, rather than by the tasks it is designed to solve. We discuss several possibilities: the adaptation of cognitive tests used for humans and animals, the development of tests derived from algorithmic information theory, or more integrated approaches under the perspective of universal psychometrics. We analyse some evaluation tests from AI that are better positioned for an ability-oriented evaluation and discuss how their problems and limitations can possibly be addressed with some of the tools and ideas that appear within the paper. Finally, we enumerate a series of lessons learnt and generic guidelines to be used when an AI evaluation scheme is under consideration.

    I thank the organisers of the AEPIA Summer School on Artificial Intelligence, held in September 2014, for giving me the opportunity to give a lecture on 'AI Evaluation'. This paper was born out of and evolved through that lecture. The information about many benchmarks and competitions discussed in this paper has been contrasted with information from and discussions with many people: M. Bedia, A. Cangelosi, C. Dimitrakakis, I. García-Varea, Katja Hofmann, W. Langdon, E. Messina, S. Mueller, M. Siebers and C. Soares. Figure 4 is courtesy of F. Martinez-Plumed. Finally, I thank the anonymous reviewers, whose comments have helped to significantly improve the balance and coverage of the paper. This work has been partially supported by the EU (FEDER) and the Spanish MINECO under Grants TIN 2013-45732-C4-1-P and TIN 2015-69175-C4-1-R, and by Generalitat Valenciana PROMETEOII/2015/013.

    José Hernández-Orallo (2016). Evaluation in artificial intelligence: From task-oriented to ability-oriented measurement. Artificial Intelligence Review, 1–51. https://doi.org/10.1007/s10462-016-9505-7
NIST Special Publication SP, pp 101–104Everitt T, Lattimore T, Hutter M (2014) Free lunch for optimisation under the universal distribution. In: 2014 IEEE Congress on evolutionary computation (CEC), IEEE, pp 167–174Falkenauer E (1998) On method overfitting. J Heuristics 4(3):281–287Feldman J (2003) Simplicity and complexity in human concept learning. Gen Psychol 38(1):9–15Ferrando PJ (2009) Difficulty, discrimination, and information indices in the linear factor analysis model for continuous item responses. Appl Psychol Meas 33(1):9–24Ferrando PJ (2012) Assessing the discriminating power of item and test scores in the linear factor-analysis model. PsicolĂłgica 33:111–139Ferri C, HernĂĄndez-Orallo J, Modroiu R (2009) An experimental comparison of performance measures for classification. Pattern Recogn Lett 30(1):27–38Ferrucci D, Brown E, Chu-Carroll J, Fan J, Gondek D, Kalyanpur AA, Lally A, Murdock J, Nyberg E, Prager J et al (2010) Building Watson: an overview of the DeepQA project. AI Mag 31(3):59–79Fogel DB (1991) The evolution of intelligent decision making in gaming. Cybern Syst 22(2):223–236Gaschnig J, Klahr P, Pople H, Shortliffe E, Terry A (1983) Evaluation of expert systems: issues and case studies. Build Exp Syst 1:241–278Geissman JR, Schultz RD (1988) Verification & validation. AI Exp 3(2):26–33Genesereth M, Love N, Pell B (2005) General game playing: overview of the AAAI competition. AI Mag 26(2):62GerĂłnimo D, LĂłpez AM (2014) Datasets and benchmarking. In: Vision-based pedestrian protection systems for intelligent vehicles. Springer, pp 87–93Goertzel B, Pennachin C (eds) (2007) Artificial general intelligence. Springer, New YorkGoertzel B, Arel I, Scheutz M (2009) Toward a roadmap for human-level artificial general intelligence: embedding HLAI systems in broad, approachable, physical or virtual contexts. Artif Gen Intell Roadmap InitiatGoldreich O, Vadhan S (2007) Special issue on worst-case versus average-case complexity editors’ foreword. 
Comput complex 16(4):325–330Gordon BB (2007) Report on panel discussion on (re-)establishing or increasing collaborative links between artificial intelligence and intelligent systems. In: Messina ER, Madhavan R (eds) Proceedings of the 2007 workshop on performance metrics for intelligent systems, pp 302–303Gulwani S, HernĂĄndez-Orallo J, Kitzelmann E, Muggleton SH, Schmid U, Zorn B (2015) Inductive programming meets the real world. Commun ACM 58(11):90–99Hand DJ (2004) Measurement theory and practice. A Hodder Arnold Publication, LondonHernĂĄndez-Orallo J (2000a) Beyond the Turing test. J Logic Lang Inf 9(4):447–466HernĂĄndez-Orallo J (2000b) On the computational measurement of intelligence factors. In: Meystel A (ed) Performance metrics for intelligent systems workshop. National Institute of Standards and Technology, Gaithersburg, pp 1–8HernĂĄndez-Orallo J (2000c) Thesis: computational measures of information gain and reinforcement in inference processes. AI Commun 13(1):49–50HernĂĄndez-Orallo J (2010) A (hopefully) non-biased universal environment class for measuring intelligence of biological and artificial systems. In: Artificial general intelligence, 3rd International Conference. Atlantis Press, Extended report at http://users.dsic.upv.es/proy/anynt/unbiased.pdf , pp 182–183HernĂĄndez-Orallo J (2014) On environment difficulty and discriminating power. Auton Agents Multi-Agent Syst. 29(3):402–454. doi: 10.1007/s10458-014-9257-1HernĂĄndez-Orallo J, Dowe DL (2010) Measuring universal intelligence: towards an anytime intelligence test. Artif Intell 174(18):1508–1539HernĂĄndez-Orallo J, Dowe DL (2013) On potential cognitive abilities in the machine kingdom. Minds Mach 23:179–210HernĂĄndez-Orallo J, Minaya-Collado N (1998) A formal definition of intelligence based on an intensional variant of Kolmogorov complexity. 
In: Proceedings of international symposium of engineering of intelligent systems (EIS’98), ICSC Press, pp 146–163HernĂĄndez-Orallo J, Dowe DL, España-Cubillo S, HernĂĄndez-Lloreda MV, Insa-Cabrera J (2011) On more realistic environment distributions for defining, evaluating and developing intelligence. In: Schmidhuber J, ThĂłrisson K, Looks M (eds) Artificial general intelligence, LNAI, vol 6830. Springer, New York, pp 82–91HernĂĄndez-Orallo J, Flach P, Ferri C (2012a) A unified view of performance metrics: translating threshold choice into expected classification loss. J Mach Learn Res 13(1):2813–2869HernĂĄndez-Orallo J, Insa-Cabrera J, Dowe DL, Hibbard B (2012b) Turing Tests with Turing machines. In: Voronkov A (ed) Turing-100, EPiC Series, vol 10, pp 140–156HernĂĄndez-Orallo J, Dowe DL, HernĂĄndez-Lloreda MV (2014) Universal psychometrics: measuring cognitive abilities in the machine kingdom. Cogn Syst Res 27:50–74HernĂĄndez-Orallo J, MartĂ­nez-Plumed F, Schmid U, Siebers M, Dowe DL (2016) Computer models solving intelligence test problems: progress and implications. Artif Intell 230:74–107Herrmann E, Call J, HernĂĄndez-Lloreda MV, Hare B, Tomasello M (2007) Humans have evolved specialized skills of social cognition: the cultural intelligence hypothesis. Science 317(5843):1360–1366Hibbard B (2009) Bias and no free lunch in formal measures of intelligence. J Artif Gen Intell 1(1):54–61Hingston P (2010) A new design for a Turing Test for bots. In: 2010 IEEE symposium on computational intelligence and games (CIG), IEEE, pp 345–350Hingston P (2012) Believable bots: can computers play like people?. Springer, New YorkHo TK, Basu M (2002) Complexity measures of supervised classification problems. IEEE Trans Pattern Anal Mach Intell 24(3):289–300Hutter M (2007) Universal algorithmic intelligence: a mathematical top →\rightarrow → down approach. In: Goertzel B, Pennachin C (eds) Artificial general intelligence, cognitive technologies. 
Springer, Berlin, pp 227–290Igel C, Toussaint M (2005) A no-free-lunch theorem for non-uniform distributions of target functions. J Math Model Algorithms 3(4):313–322Insa-Cabrera J (2016) Towards a universal test of social intelligence. Ph.D. thesis, Departament de Sistemes InformĂĄtics i ComputaciĂł, UPVInsa-Cabrera J, Dowe DL, España-Cubillo S, HernĂĄndez-Lloreda MV, HernĂĄndez-Orallo J (2011a) Comparing humans and ai agents. In: Schmidhuber J, ThĂłrisson K, Looks M (eds) Artificial general intelligence, LNAI, vol 6830. Springer, New York, pp 122–132Insa-Cabrera J, Dowe DL, HernĂĄndez-Orallo J (2011) Evaluating a reinforcement learning algorithm with a general intelligence test. In: Lozano JA, Gamez JM (eds) Current topics in artificial intelligence. CAEPIA 2011, LNAI series 7023. Springer, New YorkInsa-Cabrera J, Benacloch-Ayuso JL, HernĂĄndez-Orallo J (2012) On measuring social intelligence: experiments on competition and cooperation. In: Bach J, Goertzel B, IklĂ© M (eds) AGI, lecture notes in computer science, vol 7716. Springer, New York, pp 126–135Jacoff A, Messina E, Weiss BA, Tadokoro S, Nakagawa Y (2003) Test arenas and performance metrics for urban search and rescue robots. In: Proceedings of 2003 IEEE/RSJ international conference on intelligent robots and systems, 2003 (IROS 2003), IEEE, vol 4, pp 3396–3403Japkowicz N, Shah M (2011) Evaluating learning algorithms. Cambridge University Press, CambridgeJiang J (2008) A literature survey on domain adaptation of statistical classifiers. http://sifaka.cs.uiuc.edu/jiang4/domain_adaptation/surveyJohnson M, Hofmann K, Hutton T, Bignell D (2016) The Malmo platform for artificial intelligence experimentation. In: International joint conference on artificial intelligence (IJCAI)Keith TZ, Reynolds MR (2010) Cattell–Horn–Carroll abilities and cognitive tests: what we’ve learned from 20 years of research. 
Psychol Schools 47(7):635–650Ketter W, Symeonidis A (2012) Competitive benchmarking: lessons learned from the trading agent competition. AI Mag 33(2):103Khreich W, Granger E, Miri A, Sabourin R (2012) A survey of techniques for incremental learning of HMM parameters. Inf Sci 197:105–130Kim JH (2004) Soccer robotics, vol 11. Springer, New YorkKitano H, Asada M, Kuniyoshi Y, Noda I, Osawa E (1997) Robocup: the robot world cup initiative. In: Proceedings of the first international conference on autonomous agents, ACM, pp 340–347Kleiner K (2011) Who are you calling bird-brained? An attempt is being made to devise a universal intelligence test. Economist 398(8723, 5 March 2011):82Knuth DE (1973) Sorting and searching, volume 3 of the art of computer programming. Addison-Wesley, ReadingKoza JR (2010) Human-competitive results produced by genetic programming. Genet Program Evolvable Mach 11(3–4):251–284Krueger J, Osherson D (1980) On the psychology of structural simplicity. In: Jusczyk PW, Klein RM (eds) The nature of thought: essays in honor of D. O. Hebb. Psychology Press, London, pp 187–205Langford J (2005) Clever methods of overfitting. Machine Learning (Theory). http://hunch.netLangley P (1987) Research papers in machine learning. Mach Learn 2(3):195–198Langley P (2011) The changing science of machine learning. Mach Learn 82(3):275–279Langley P (2012) The cognitive systems paradigm. Adv Cogn Syst 1:3–13Lattimore T, Hutter M (2013) No free lunch versus Occam’s razor in supervised learning. Algorithmic Probability and Friends. Springer, Bayesian Prediction and Artificial Intelligence, pp 223–235Leeuwenberg ELJ, Van Der Helm PA (2012) Structural information theory: the simplicity of visual form. Cambridge University Press, CambridgeLegg S, Hutter M (2007a) Tests of machine intelligence. In: Lungarella M, Iida F, Bongard J, Pfeifer R (eds) 50 Years of Artificial Intelligence, Lecture Notes in Computer Science, vol 4850, Springer Berlin Heidelberg, pp 232–242. 
doi: 10.1007/978-3-540-77296-5_22Legg S, Hutter M (2007b) Universal intelligence: a definition of machine intelligence. Minds Mach 17(4):391–444Legg S, Veness J (2013) An approximation of the universal intelligence measure. Algorithmic Probability and Friends. Springer, Bayesian Prediction and Artificial Intelligence, pp 236–249Levesque HJ (2014) On our best behaviour. Artif Intell 212:27–35Levesque HJ, Davis E, Morgenstern L (2012) The winog

    Observation of associated near-side and away-side long-range correlations in √s_NN = 5.02 TeV proton-lead collisions with the ATLAS detector

    Get PDF
    Two-particle correlations in relative azimuthal angle (Δϕ) and pseudorapidity (Δη) are measured in √s_NN = 5.02 TeV p+Pb collisions using the ATLAS detector at the LHC. The measurements are performed using approximately 1 Όb⁻¹ of data as a function of transverse momentum (p_T) and the transverse energy (ÎŁE_T^Pb) summed over 3.1 < η < 4.9 in the direction of the Pb beam. The correlation function, constructed from charged particles, exhibits a long-range (2 < |Δη| < 5) “near-side” (Δϕ ∌ 0) correlation that grows rapidly with increasing ÎŁE_T^Pb. A long-range “away-side” (Δϕ ∌ π) correlation, obtained by subtracting the expected contributions from recoiling dijets and other sources estimated using events with small ÎŁE_T^Pb, is found to match the near-side correlation in magnitude, shape (in Δη and Δϕ) and ÎŁE_T^Pb dependence. The resultant Δϕ correlation is approximately symmetric about π/2, and is consistent with a dominant cos 2Δϕ modulation for all ÎŁE_T^Pb ranges and particle p_T.
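    The "dominant cos 2Δϕ modulation" above means the correlation function is well described by a constant plus a second Fourier harmonic. A minimal sketch of extracting that harmonic — using synthetic values, not ATLAS data, and a plain least-squares decomposition rather than the collaboration's analysis chain:

    ```python
    import math

    # Hypothetical correlation values in Delta-phi bins; NOT ATLAS data.
    n = 12
    dphi = [math.pi * (i + 0.5) / n for i in range(n)]   # bin centres in [0, pi]
    y = [1.0 + 0.08 * math.cos(2 * p) for p in dphi]     # synthetic C(dphi)

    # Least-squares fit of C(dphi) = a0 + a2*cos(2*dphi).
    # Over these equally spaced bin centres the basis {1, cos(2*dphi)} is
    # orthogonal, so the coefficients reduce to simple averages:
    a0 = sum(y) / n
    a2 = 2 * sum(yi * math.cos(2 * p) for yi, p in zip(y, dphi)) / n

    print(round(a0, 3), round(a2 / a0, 3))   # mean level and relative cos(2*dphi) amplitude
    ```

    The ratio a2/a0 plays the role of the relative modulation amplitude whose approximate symmetry about π/2 the abstract describes.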

    Sample size considerations for trials using cerebral white matter hyperintensity progression as an intermediate outcome at 1 year after mild stroke: Results of a prospective cohort study

    Get PDF
    Background: White matter hyperintensities (WMHs) are commonly seen on brain imaging and are associated with stroke and cognitive decline. Therefore, they may provide a relevant intermediate outcome in clinical trials. WMH can be measured as a volume or visually on the Fazekas scale. We investigated predictors of WMH progression and the design of efficient studies using WMH volume and Fazekas score as an intermediate outcome. Methods: We prospectively recruited 264 patients with mild ischaemic stroke and measured WMH volume, Fazekas score, age and cardiovascular risk factors at baseline and 1 year. We modelled predictors of WMH burden at 1 year and used the results in sample size calculations for hypothetical randomised controlled trials with different analysis plans and lengths of follow-up. Results: Follow-up WMH volume was predicted by baseline WMH: a 0.73-ml (95% CI 0.65-0.80, p < 0.0001) increase per 1-ml baseline volume increment, and a 2.93-ml increase (95% CI 1.76-4.10, p < 0.0001) per point on the Fazekas scale. Using a mean difference of 1 ml in WMH volume between treatment groups, 80% power and 5% alpha, adjusting for all predictors and 2-year follow-up produced the smallest sample size (n = 642). Other study designs produced sample sizes from 2054 to 21,270. Sample size calculations using Fazekas score as an outcome with the same power and alpha, as well as an OR corresponding to a 1-ml difference, were sensitive to assumptions and ranged from 2504 to 18,886. Conclusions: Baseline WMH volume and Fazekas score predicted follow-up WMH volume. Study size was smallest using volumes and longer-term follow-up, but this must be balanced against the resources required to measure volumes versus Fazekas scores, bias due to dropout, and scanner drift. Sample sizes based on Fazekas scores may be best estimated with simulation studies.
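    The kind of calculation behind these figures can be sketched with the standard two-sample formula for a continuous outcome, n per group = 2(z₁₋α/₂ + z₁₋ÎČ)ÂČσÂČ/ΔÂČ. The 1-ml difference, 80% power and 5% alpha come from the abstract; the outcome SD used below is an arbitrary placeholder, not a value reported in the study:

    ```python
    from math import ceil
    from statistics import NormalDist

    def n_per_group(delta, sd, alpha=0.05, power=0.80):
        """Two-sample sample size for a difference in means (normal approximation)."""
        z = NormalDist().inv_cdf
        z_a = z(1 - alpha / 2)   # ~1.96 for alpha = 0.05
        z_b = z(power)           # ~0.84 for 80% power
        return ceil(2 * (z_a + z_b) ** 2 * (sd / delta) ** 2)

    # 1-ml mean difference in WMH volume (from the abstract); SD = 9 ml is a
    # placeholder assumption, not a figure from the study.
    print(n_per_group(delta=1.0, sd=9.0))
    ```

    Because n scales with (σ/Δ)ÂČ, modelling choices that shrink the residual SD — such as adjusting for baseline WMH, the strongest predictor reported above — reduce the required sample size, which is why the adjusted 2-year design is the smallest.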

    Anatomy and dietary specialization influence sensory behaviour among sympatric primates

    Get PDF
    Senses form the interface between animals and environments, and provide a window into the ecology of past and present species. However, research on sensory behaviours by wild frugivores is sparse. Here, we examine fruit assessment by three sympatric primates (Alouatta palliata, Ateles geoffroyi and Cebus imitator) to test the hypothesis that dietary and sensory specialization shape foraging behaviours. Ateles and Cebus groups comprise both dichromats and trichromats, while all Alouatta are trichromats. We use anatomical proxies to examine smell, taste and manual touch, and opsin genotyping to assess colour vision. We find that the frugivorous spider monkeys (Ateles geoffroyi) sniff fruits most often; omnivorous capuchins (Cebus imitator), the species with the highest manual dexterity, use manual touch most often; and main olfactory bulb volume is a better predictor of sniffing behaviour than nasal turbinate surface area. We also identify an interaction between colour vision phenotype and use of other senses. Controlling for species, dichromats sniff and bite fruits more often than trichromats, and trichromats use manual touch to evaluate cryptic fruits more often than dichromats. Our findings reveal new relationships among dietary specialization, anatomical variation and foraging behaviour, and promote understanding of sensory system evolution.
    • 

    corecore