23,537 research outputs found
πBUSS:a parallel BEAST/BEAGLE utility for sequence simulation under complex evolutionary scenarios
Background: Simulated nucleotide or amino acid sequences are frequently used
to assess the performance of phylogenetic reconstruction methods. BEAST, a
Bayesian statistical framework that focuses on reconstructing time-calibrated
molecular evolutionary processes, supports a wide array of evolutionary models,
but lacked matching machinery for simulation of character evolution along
phylogenies.
Results: We present a flexible Monte Carlo simulation tool, called piBUSS,
that employs the BEAGLE high performance library for phylogenetic computations
within BEAST to rapidly generate large sequence alignments under complex
evolutionary models. piBUSS sports a user-friendly graphical user interface
(GUI) that allows combining a rich array of models across an arbitrary number
of partitions. A command-line interface mirrors the options available through
the GUI and facilitates scripting in large-scale simulation studies. Analogous
to BEAST model and analysis setup, more advanced simulation options are
supported through an extensible markup language (XML) specification, which in
addition to generating sequence output, also allows users to combine simulation
and analysis in a single BEAST run.
Conclusions: piBUSS offers a unique combination of flexibility and
ease-of-use for sequence simulation under realistic evolutionary scenarios.
Through different interfaces, piBUSS supports simulation studies ranging from
modest endeavors for illustrative purposes to complex and large-scale
assessments of evolutionary inference procedures. The software aims at
implementing new models and data types that are continuously being developed as
part of BEAST/BEAGLE.Comment: 13 pages, 2 figures, 1 tabl
RELEASE: A High-level Paradigm for Reliable Large-scale Server Software
Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale the Erlang’s radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state of the art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene
Digital Preservation Services : State of the Art Analysis
Research report funded by the DC-NET project.An overview of the state of the art in service provision for digital preservation and curation. Its focus is on the areas where bridging the gaps is needed between e-Infrastructures and efficient and forward-looking digital preservation services. Based on a desktop study and a rapid analysis of some 190 currently available tools and services for digital preservation, the deliverable provides a high-level view on the range of instruments currently on offer to support various functions within a preservation system.European Commission, FP7peer-reviewe
Application of Supercomputer Technologies for Simulation of Socio-Economic Systems
To date, an extensive experience has been accumulated in investigation of problems related to quality, assessment of management systems, modeling of economic system sustainability. The studies performed have created a basis for formation of a new research area — Economics of Quality. Its tools allow to use opportunities of model simulation for construction of the mathematical models adequately reflecting the role of quality in natural, technical, social regularities of functioning of the complex socioeconomic systems. Extensive application and development of models, and also system modeling with use of supercomputer technologies, on our deep belief, will bring the conducted researches of social and economic systems to essentially new level. Moreover, the current scientific research makes a significant contribution to model simulation of multi-agent social systems and that isn’t less important, it belongs to the priority areas in development of science and technology in our country. This article is devoted to the questions of supercomputer technologies application in public sciences, first of all, — regarding technical realization of the large-scale agent-focused models (AFM). The essence of this tool is that owing to increase in power of computers it became possible to describe the behavior of many separate fragments of a difficult system, as social and economic systems represent. The article also deals with the experience of foreign scientists and practicians in launching the AFM on supercomputers, and also the example of AFM developed in CEMI RAS, stages and methods of effective calculating kernel display of multi-agent system on architecture of a modern supercomputer will be analyzed. The experiments on the basis of model simulation on forecasting the population of St. Petersburg according to three scenarios as one of the major factors influencing the development of social and economic system and quality of life of the population are presented in the conclusion
C to O-O Translation: Beyond the Easy Stuff
Can we reuse some of the huge code-base developed in C to take advantage of
modern programming language features such as type safety, object-orientation,
and contracts? This paper presents a source-to-source translation of C code
into Eiffel, a modern object-oriented programming language, and the supporting
tool C2Eif. The translation is completely automatic and supports the entire C
language (ANSI, as well as many GNU C Compiler extensions, through CIL) as used
in practice, including its usage of native system libraries and inlined
assembly code. Our experiments show that C2Eif can handle C applications and
libraries of significant size (such as vim and libgsl), as well as challenging
benchmarks such as the GCC torture tests. The produced Eiffel code is
functionally equivalent to the original C code, and takes advantage of some of
Eiffel's object-oriented features to produce safe and easy-to-debug
translations
Proactive Empirical Assessment of New Language Feature Adoption via Automated Refactoring: The Case of Java 8 Default Methods
Programming languages and platforms improve over time, sometimes resulting in
new language features that offer many benefits. However, despite these
benefits, developers may not always be willing to adopt them in their projects
for various reasons. In this paper, we describe an empirical study where we
assess the adoption of a particular new language feature. Studying how
developers use (or do not use) new language features is important in
programming language research and engineering because it gives designers
insight into the usability of the language to create meaning programs in that
language. This knowledge, in turn, can drive future innovations in the area.
Here, we explore Java 8 default methods, which allow interfaces to contain
(instance) method implementations.
Default methods can ease interface evolution, make certain ubiquitous design
patterns redundant, and improve both modularity and maintainability. A focus of
this work is to discover, through a scientific approach and a novel technique,
situations where developers found these constructs useful and where they did
not, and the reasons for each. Although several studies center around assessing
new language features, to the best of our knowledge, this kind of construct has
not been previously considered.
Despite their benefits, we found that developers did not adopt default
methods in all situations. Our study consisted of submitting pull requests
introducing the language feature to 19 real-world, open source Java projects
without altering original program semantics. This novel assessment technique is
proactive in that the adoption was driven by an automatic refactoring approach
rather than waiting for developers to discover and integrate the feature
themselves. In this way, we set forth best practices and patterns of using the
language feature effectively earlier rather than later and are able to possibly
guide (near) future language evolution. We foresee this technique to be useful
in assessing other new language features, design patterns, and other
programming idioms
Database independent Migration of Objects into an Object-Relational Database
This paper reports on the CERN-based WISDOM project which is studying the
serialisation and deserialisation of data to/from an object database
(objectivity) and ORACLE 9i.Comment: 26 pages, 18 figures; CMS CERN Conference Report cr02_01
Critters in the Classroom: A 3D Computer-Game-Like Tool for Teaching Programming to Computer Animation Students
The brewing crisis threatening computer science education is a well documented fact. To counter this and to increase enrolment and retention in computer science related degrees, it has been suggested to make programming "more fun" and to offer "multidisciplinary and cross-disciplinary programs" [Carter 2006]. The Computer Visualisation and Animation undergraduate degree at the National Centre for Computer Animation (Bournemouth University) is such a programme. Computer programming forms an integral part of the curriculum of this technical arts degree, and as educators we constantly face the challenge of having to encourage our students to engage with the subject.
We intend to address this with our C-Sheep system, a reimagination of the "Karel the Robot" teaching tool [Pattis 1981], using modern 3D computer game graphics that today's students are familiar with. This provides a game-like setting for writing computer programs, using a task-specific set of instructions which allow users to take control of virtual entities acting within a micro world, effectively providing a graphical representation of the algorithms used. Whereas two decades ago, students would be intrigued by a 2D top-down representation of the micro world, the lack of the visual gimmickry found in modern computer games for representing the virtual world now makes it extremely difficult to maintain the interest of students from today's "Plug&Play generation". It is therefore especially important to aim for a 3D game-like representation which is "attractive and highly motivating to today's generation of media-conscious students" [Moskal et al. 2004].
Our system uses a modern, platform independent games engine, capable of presenting a visually rich virtual environment using a state of the art rendering engine of a type usually found in entertainment systems. Our aim is to entice students to spend more time programming, by providing them with an enjoyable experience.
This paper provides a discussion of the 3D computer game technology employed in our system and presents examples of how this can be exploited to provide engaging exercises to create a rewarding learning experience for our students
Semantic Source Code Models Using Identifier Embeddings
The emergence of online open source repositories in the recent years has led
to an explosion in the volume of openly available source code, coupled with
metadata that relate to a variety of software development activities. As an
effect, in line with recent advances in machine learning research, software
maintenance activities are switching from symbolic formal methods to
data-driven methods. In this context, the rich semantics hidden in source code
identifiers provide opportunities for building semantic representations of code
which can assist tasks of code search and reuse. To this end, we deliver in the
form of pretrained vector space models, distributed code representations for
six popular programming languages, namely, Java, Python, PHP, C, C++, and C#.
The models are produced using fastText, a state-of-the-art library for learning
word representations. Each model is trained on data from a single programming
language; the code mined for producing all models amounts to over 13.000
repositories. We indicate dissimilarities between natural language and source
code, as well as variations in coding conventions in between the different
programming languages we processed. We describe how these heterogeneities
guided the data preprocessing decisions we took and the selection of the
training parameters in the released models. Finally, we propose potential
applications of the models and discuss limitations of the models.Comment: 16th International Conference on Mining Software Repositories (MSR
2019): Data Showcase Trac
- …
