523 research outputs found

    Topic model visualization with IPython

    The paper introduces an approach to topic model visualization that offers a wide choice of visualization methods, a user-friendly model representation, and simple implementation in applications. Existing approaches to topic model visualization are analyzed, and a system has been developed that allows choosing the data source for topic models, changing modeling parameters, and visualizing the results of topic modeling with IPython. An example topic model visualization has been built using the SCTM-en corpus of original news text.

    Prepare for Citizen Science Challenges at CERN

    To inspire more people to contribute to science, and to educate the public about science, two Citizen Science "challenges" were prepared during summer 2013: the CERN Summer Webfest 2013 and the Virtual LHC Challenge. The first part of this report summarizes how to organize a Webfest at CERN and the outcome of the CERN Summer Webfest 2013. The second part gives an introduction to the current state of the Virtual LHC Challenge: a development of the LHC@Home Test4Theory project planned to attract many unskilled volunteers. This work was supported by a grant from the EU Citizen Cyberlab project, with assistance from the Citizen Cyberscience Centre (CCC).

    Teaching Data Science

    We describe an introductory data science course, entitled Introduction to Data Science, offered at the University of Illinois at Urbana-Champaign. The course introduced general programming concepts by using the Python programming language with an emphasis on data preparation, processing, and presentation. The course had no prerequisites, and students were not expected to have any programming experience. This introductory course was designed to cover a wide range of topics, from the nature of data, to storage, to visualization, to probability and statistical analysis, to cloud and high performance computing, without becoming overly focused on any one subject. We conclude this article with a discussion of lessons learned and our plans to develop new data science courses. Comment: 10 pages, 4 figures, International Conference on Computational Science (ICCS 2016).

    Hardware-accelerated interactive data visualization for neuroscience in Python.

    Large datasets are becoming increasingly common in science, particularly in neuroscience, where experimental techniques are rapidly evolving. Obtaining interpretable results from raw data can sometimes be done automatically; however, there are numerous situations where the data needs to be visualized interactively at every processing stage. This enables the scientist to gain intuition, discover unexpected patterns, and find guidance about subsequent analysis steps. Existing visualization tools mostly focus on static publication-quality figures and do not support interactive visualization of large datasets. While working on Python software for visualization of neurophysiological data, we developed techniques to leverage the computational power of modern graphics cards for high-performance interactive data visualization. We were able to achieve very high performance despite the interpreted and dynamic nature of Python, by using state-of-the-art, fast libraries such as NumPy, PyOpenGL, and PyTables. We present applications of these methods to visualization of neurophysiological data. We believe our tools will be useful in a broad range of domains, in neuroscience and beyond, where there is an increasing need for scalable and fast interactive visualization.

    Coding oriented learning in economics, business and finance

    As the relationship between students, teachers, and information technology evolves, new tools are required to improve learning and teaching in the social sciences. Economics, business, and finance are mainly based on data, and dealing with data requires specific skills and techniques, such as computer programming, to realize the full potential of most quantitative models. In this paper, we propose a coding-oriented learning method based on Python Notebooks, designed specifically for students of degrees in economics, business, and finance. We follow a learning-by-doing strategy that encourages students to implement economic models as a way to improve their understanding of fundamental concepts. As an illustrative example, we also describe a case study in which Python Notebooks are the key tool for teaching cash management in a Master in Business Administration program. Since the students of today are the decision-makers of tomorrow, a further advantage of using a programming language as a teaching tool is the possibility of connecting theory to practice by enabling students to implement their own decision support tools. Salas-Molina, F.; Pla-Santamaria, D. (2018). Coding oriented learning in economics, business and finance. Modelling in Science Education and Learning, 11(1):55-64. doi:10.4995/msel.2018.9152

    API design for machine learning software: experiences from the scikit-learn project

    Scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
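
A shared interface of this kind can be illustrated with a small pure-Python sketch. This is not scikit-learn code: the classes `MeanCenterer`, `ThresholdClassifier`, and `SimplePipeline` are hypothetical toys written for this illustration. Only the `fit` / `transform` / `predict` method names and the trailing-underscore convention for learned attributes mirror the library's conventions.

```python
# Toy illustration of a uniform fit/transform/predict interface.
# All class names here are hypothetical; none come from scikit-learn.

class MeanCenterer:
    """Toy transformer: subtracts the mean learned during fit."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)  # learned state ends with "_"
        return self  # returning self allows call chaining

    def transform(self, X):
        return [x - self.mean_ for x in X]


class ThresholdClassifier:
    """Toy predictor: threshold at the midpoint of the two class means."""

    def fit(self, X, y):
        pos = [x for x, label in zip(X, y) if label == 1]
        neg = [x for x, label in zip(X, y) if label == 0]
        self.threshold_ = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, X):
        return [1 if x > self.threshold_ else 0 for x in X]


class SimplePipeline:
    """Chains transformers and a final predictor behind the same interface."""

    def __init__(self, transformers, predictor):
        self.transformers = transformers
        self.predictor = predictor

    def fit(self, X, y):
        for step in self.transformers:
            X = step.fit(X, y).transform(X)
        self.predictor.fit(X, y)
        return self

    def predict(self, X):
        for step in self.transformers:
            X = step.transform(X)
        return self.predictor.predict(X)


if __name__ == "__main__":
    model = SimplePipeline([MeanCenterer()], ThresholdClassifier())
    model.fit([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
    print(model.predict([0.0, 10.0]))
```

Because the pipeline itself exposes `fit` and `predict`, it can be used anywhere a single predictor is expected, which is one way to read the composition and reusability advantages the abstract mentions.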

    Report on the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3)

    This report records and discusses the Third Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE3). The report includes a description of the keynote presentation of the workshop, which served as an overview of sustainable scientific software. It also summarizes a set of lightning talks in which speakers highlighted to-the-point lessons and challenges pertaining to sustaining scientific software. The final and main contribution of the report is a summary of the discussions, future steps, and future organization for a set of self-organized working groups on topics including developing pathways to funding scientific software; constructing useful common metrics for crediting software stakeholders; identifying principles for sustainable software engineering design; reaching out to research software organizations around the world; and building communities for software sustainability. For each group, we include a point of contact and a landing page that can be used by those who want to join that group's future activities. The main challenge left by the workshop is to see whether the groups will carry out the activities they have scheduled, and how the WSSSPE community can encourage this to happen.

    The Connectome Viewer Toolkit: An Open Source Framework to Manage, Analyze, and Visualize Connectomes

    Advanced neuroinformatics tools are required for methods of connectome mapping, analysis, and visualization. The inherent multi-modality of connectome datasets poses new challenges for data organization, integration, and sharing. We have designed and implemented the Connectome Viewer Toolkit – a set of free and extensible open source neuroimaging tools written in Python. The key components of the toolkit are as follows: (1) The Connectome File Format is an XML-based container format to standardize multi-modal data integration and structured metadata annotation. (2) The Connectome File Format Library enables management and sharing of connectome files. (3) The Connectome Viewer is an integrated research and development environment for visualization and analysis of multi-modal connectome data. The Connectome Viewer's plugin architecture supports extensions with network analysis packages and an interactive scripting shell, to enable easy development and community contributions. Integration with tools from the scientific Python community allows the leveraging of numerous existing libraries for powerful connectome data mining, exploration, and comparison. We demonstrate the applicability of the Connectome Viewer Toolkit using Diffusion MRI datasets processed by the Connectome Mapper. The Connectome Viewer Toolkit is available from http://www.cmtk.org

    An Introduction to Programming for Bioscientists: A Python-based Primer

    Computing has revolutionized the biological sciences over the past several decades, such that virtually all contemporary research in the biosciences utilizes computer programs. The computational advances have come on many fronts, spurred by fundamental developments in hardware, software, and algorithms. These advances have influenced, and even engendered, a phenomenal array of bioscience fields, including molecular evolution and bioinformatics; genome-, proteome-, transcriptome- and metabolome-wide experimental studies; structural genomics; and atomistic simulations of cellular-scale molecular assemblies as large as ribosomes and intact viruses. In short, much of post-genomic biology is increasingly becoming a form of computational biology. The ability to design and write computer programs is among the most indispensable skills that a modern researcher can cultivate. Python has become a popular programming language in the biosciences, largely because (i) its straightforward semantics and clean syntax make it a readily accessible first language; (ii) it is expressive and well-suited to object-oriented programming, as well as other modern paradigms; and (iii) the many available libraries and third-party toolkits extend the functionality of the core language into virtually every biological domain (sequence and structure analyses, phylogenomics, workflow management systems, etc.). This primer offers a basic introduction to coding, via Python, and it includes concrete examples and exercises to illustrate the language's usage and capabilities; the main text culminates with a final project in structural bioinformatics. A suite of Supplemental Chapters is also provided. 
Starting with basic concepts, such as that of a 'variable', the Chapters methodically advance the reader to the point of writing a graphical user interface to compute the Hamming distance between two DNA sequences. Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables, numerous exercises, and 19 pages of Supporting Information; currently in press at PLOS Computational Biology
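
For readers unfamiliar with the Hamming distance mentioned in the abstract, a minimal Python sketch follows. It is not taken from the primer: the function name `hamming_distance` and the example sequences are assumptions made for illustration, and the graphical user interface is omitted. The Hamming distance counts the positions at which two equal-length sequences differ.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        # The Hamming distance is only defined for equal-length sequences.
        raise ValueError("sequences must have equal length")
    # Each mismatch contributes True (counted as 1) to the sum.
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    # Hypothetical example sequences; they differ at positions 3 and 6.
    print(hamming_distance("GATTACA", "GACTATA"))  # prints 2
```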