High Performance Computing of Gene Regulatory Networks using a Message-Passing Model
Gene regulatory network reconstruction is a fundamental problem in
computational biology. We recently developed an algorithm, called PANDA
(Passing Attributes Between Networks for Data Assimilation), that integrates
multiple sources of 'omics data and estimates regulatory network models. This
approach was initially implemented in the C++ programming language and has
since been applied to a number of biological systems. In our current research
we are beginning to expand the algorithm to incorporate larger and more diverse
datasets, to reconstruct networks that contain increasing numbers of elements,
and to build not only single network models, but sets of networks. In order to
accomplish these "Big Data" applications, it has become critical that we
increase the computational efficiency of the PANDA implementation. In this
paper we show how to recast PANDA's similarity equations as matrix operations.
This allows us to implement a highly readable version of the algorithm using
the MATLAB/Octave programming language. We find that the resulting M-code is much
shorter (103 lines compared to 1,128) and more easily modifiable for potential
future applications. The new implementation also runs significantly faster,
with increasing efficiency as the network models increase in size. Tests
comparing the C-code and M-code versions of PANDA demonstrate that this
speed-up is on the order of 20-80 times for networks of dimensions similar to
those found in current biological applications.
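The core idea of the rewrite, expressing pairwise similarity computations as matrix products rather than explicit loops, can be sketched generically. The example below is an illustration of that vectorisation pattern, not PANDA's actual similarity equations; the function names and the plain inner-product similarity are assumptions for demonstration.

```python
import numpy as np

# Hypothetical illustration: a pairwise similarity between the columns of two
# matrices, computed first with a naive double loop (as a direct C-style port
# might) and then as a single matrix multiplication (as in an M-code rewrite).

def similarity_loops(X, Y):
    """Element-by-element computation: O(n*m) explicit iterations."""
    n, m = X.shape[1], Y.shape[1]
    S = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            S[i, j] = X[:, i] @ Y[:, j]
    return S

def similarity_matrix(X, Y):
    """The same quantity expressed as one matrix operation."""
    return X.T @ Y

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 30))
Y = rng.standard_normal((50, 40))

# Both formulations agree; the matrix form delegates the loops to optimized
# BLAS routines, which is where the reported speed-up comes from.
assert np.allclose(similarity_loops(X, Y), similarity_matrix(X, Y))
```

The speed gap between the two formulations widens with matrix size, consistent with the paper's observation that efficiency gains grow as the network models grow.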
Inductive Visual Localisation: Factorised Training for Superior Generalisation
End-to-end trained Recurrent Neural Networks (RNNs) have been successfully
applied to numerous problems that require processing sequences, such as image
captioning, machine translation, and text recognition. However, RNNs often
struggle to generalise to sequences longer than the ones encountered during
training. In this work, we propose to optimise neural networks explicitly for
induction. The idea is to first decompose the problem in a sequence of
inductive steps and then to explicitly train the RNN to reproduce such steps.
Generalisation is achieved as the RNN is not allowed to learn an arbitrary
internal state; instead, it is tasked with mimicking the evolution of a valid
state. In particular, the state is restricted to a spatial memory map that
tracks parts of the input image which have been accounted for in previous
steps. The RNN is trained for single inductive steps, where it produces updates
to the memory in addition to the desired output. We evaluate our method on two
different visual recognition problems involving visual sequences: (1) text
spotting, i.e. joint localisation and reading of text in images containing
multiple lines (or a block) of text, and (2) sequential counting of objects in
aerial images. We show that inductive training of recurrent models enhances
their generalisation ability on challenging image datasets. Comment: In BMVC 2018 (spotlight).
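The inductive decomposition described above can be made concrete with a toy sketch. The code below is not the paper's RNN; it is an illustrative assumption showing the shape of a single inductive step: consume one item of the input, emit an output, and record the item in a spatial memory so the state always has a fixed, interpretable meaning. The greedy left-to-right rule and all names are hypothetical.

```python
import numpy as np

def inductive_step(items, memory):
    """One toy inductive step: return (next_output, updated_memory).

    items  : 1-D array of item positions (e.g. object x-coordinates)
    memory : boolean mask of positions already accounted for
    """
    remaining = np.flatnonzero(~memory)
    if remaining.size == 0:
        return None, memory            # nothing left: sequence terminates
    nxt = remaining[np.argmin(items[remaining])]  # leftmost unvisited item
    new_memory = memory.copy()
    new_memory[nxt] = True             # mark this item as accounted for
    return items[nxt], new_memory

# Unrolling the step generalises to any sequence length, because the state
# (the memory mask) is constrained to a valid meaning at every step rather
# than being an arbitrary learned internal vector.
items = np.array([3.0, 1.0, 2.0])
memory = np.zeros(3, dtype=bool)
outputs = []
while True:
    out, memory = inductive_step(items, memory)
    if out is None:
        break
    outputs.append(out)
# outputs is now the items in left-to-right order: [1.0, 2.0, 3.0]
```

In the paper's setting the update to the memory map is produced by a trained RNN rather than a hand-written rule; the point of the sketch is only the factorisation into per-step updates.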
A Review of integrity constraint maintenance and view updating techniques
Two interrelated problems may arise when updating a database. On one
hand, when an update is applied to the database, integrity constraints
may become violated. In such a case, the integrity constraint maintenance
approach tries to obtain additional updates to keep integrity
constraints satisfied. On the other hand, when updates of derived or
view facts are requested, a view updating mechanism must be applied to
translate the update request into correct updates of the underlying base
facts.
This survey reviews the research performed on integrity constraint
maintenance and view updating. We propose a general framework to
classify and compare methods that tackle integrity constraint
maintenance and/or view updating. We then analyze some of these methods
in more detail to identify their actual contribution and the main
limitations they may present.
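The view-updating problem described above can be made concrete with SQLite's INSTEAD OF triggers, which act as a hand-written view-updating mechanism: an update requested on a derived view is translated into updates of the underlying base facts. The schema, names, and translation rule below are illustrative assumptions, not taken from the survey.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Ada', 100), (2, 'Bob', 96);

-- Derived (view) facts: well-paid employees.
CREATE VIEW well_paid AS SELECT id, name FROM employee WHERE salary >= 95;

-- The view-updating rule: deleting a view fact is translated into a base
-- update that makes the view's derivation condition false for that row.
CREATE TRIGGER del_well_paid INSTEAD OF DELETE ON well_paid
BEGIN
    UPDATE employee SET salary = 94 WHERE id = OLD.id;
END;
""")

# Request an update on the derived fact; the trigger translates it.
con.execute("DELETE FROM well_paid WHERE id = 1")
rows = con.execute("SELECT name FROM well_paid ORDER BY id").fetchall()
# Ada's base fact no longer satisfies the view definition; only Bob remains.
```

Note that the translation is not unique (the trigger could instead have deleted the base row), which is exactly the ambiguity that the view-updating methods surveyed here must resolve.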
Transitions and shifting understandings of writing: Building rich pictures of how moving from school to university is experienced through exploration of students’ discourses of writing
In a time of economic constraints and increasing competition for places, negotiating “the transition” from school to university has become crucial for students’ educational success. Writing holds a dominant place in the academy as a mechanism of assessment. Therefore, exploring the writing practices of students as they move from school to university offers a valuable lens into how students negotiate the complex and multiple demands of moving between educational and disciplinary contexts. This paper will explore what insights an analysis of instantiations of students’ discourses of writing (Ivanič, 2004) can offer to develop a rich picture of how students experience their writing “in transition”. The data presented are taken from an ethnographic-style project that followed a group of British students from A-levels (HSC equivalent) to their second year of university study. Ivanič’s framework of discourses of writing offers a useful analytic tool, allowing analysis of the sets of beliefs and assumptions that students draw on when engaging in and talking about writing, and it can be applied to different kinds of data collected around students’ writing. Discourses of writing also provide an organising frame for exploring how students’ understandings of writing change as they move between educational and disciplinary contexts. This analysis shows that the ways students understand their writing are not only influenced by various discourses, which can change as students move between school and university, but these understandings are also individual, situated and context-dependent. The role of emotions, students’ “face work” (Goffman, 1967) and the dominant force of assessment emerge as significant areas for further development.
Somoclu: An Efficient Parallel Library for Self-Organizing Maps
Somoclu is a massively parallel tool for training self-organizing maps on
large data sets written in C++. It builds on OpenMP for multicore execution,
and on MPI for distributing the workload across the nodes in a cluster. It is
also able to boost training by using CUDA if graphics processing units are
available. A sparse kernel is included, which is useful for high-dimensional
but sparse data, such as the vector spaces common in text mining workflows.
Python, R and MATLAB interfaces facilitate interactive use. Apart from fast
execution, memory use is highly optimized, enabling training large emergent
maps even on a single computer. Comment: 26 pages, 9 figures. The code is
available at https://peterwittek.github.io/somoclu