Data linkage algebra, data linkage dynamics, and priority rewriting
We introduce an algebra of data linkages. Data linkages are intended for
modelling the states of computations in which dynamic data structures are
involved. We present a simple model of computation in which states of
computations are modelled as data linkages and state changes take place by
means of certain actions. We describe the state changes and replies that result
from performing those actions by means of a term rewriting system with rule
priorities. The model in question is an upgrade of molecular dynamics. The
upgrading is mainly concerned with the features to deal with values and the
features to reclaim garbage.
Comment: 48 pages, typos corrected, phrasing improved, definition of services
replaced; presentation improved; presentation improved and appendix added
SLIM : Scalable Linkage of Mobility Data
We present a scalable solution to link entities across mobility datasets using their spatio-temporal information. This is a fundamental problem in many applications such as linking user identities for security, understanding privacy limitations of location-based services, or producing a unified dataset from multiple sources for urban planning. Such integrated datasets are also essential for service providers to optimise their services and improve business intelligence. In this paper, we first propose a mobility-based representation and similarity computation for entities. An efficient matching process is then developed to identify the final linked pairs, with an automated mechanism to decide when to stop the linkage. We scale the process with a locality-sensitive hashing (LSH) based approach that significantly reduces candidate pairs for matching. To realize the effectiveness and efficiency of our techniques in practice, we introduce an algorithm called SLIM. In the experimental evaluation, SLIM outperforms the two existing state-of-the-art approaches in terms of precision and recall. Moreover, the LSH-based approach brings two to four orders of magnitude speedup.
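The candidate reduction described in this abstract can be illustrated with a minimal sketch. This is not the SLIM algorithm itself: it assumes each entity is summarised as a set of (grid cell, time window) tokens and uses MinHash with banding, a standard LSH scheme for Jaccard similarity. The function names, the token format, and the parameter values are all hypothetical.

```python
import hashlib
from collections import defaultdict


def minhash_signature(tokens, num_hashes=32):
    """Return a MinHash signature: for each seed, the minimum hash over the token set."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}|{t}".encode(), digest_size=8).digest(), "big"
            )
            for t in tokens
        ))
    return tuple(signature)


def candidate_pairs(left, right, num_hashes=32, bands=16):
    """Return cross-dataset (left_id, right_id) pairs that share at least one band bucket.

    `left` and `right` map an entity id to its set of tokens (assumed representation).
    """
    rows = num_hashes // bands
    # Each bucket keeps the left-side ids and right-side ids that hashed into it.
    buckets = defaultdict(lambda: (set(), set()))
    for side, dataset in enumerate((left, right)):
        for entity_id, tokens in dataset.items():
            sig = minhash_signature(tokens, num_hashes)
            for b in range(bands):
                key = (b, sig[b * rows:(b + 1) * rows])
                buckets[key][side].add(entity_id)
    pairs = set()
    for left_ids, right_ids in buckets.values():
        pairs.update((l, r) for l in left_ids for r in right_ids)
    return pairs
```

Only the pairs returned by `candidate_pairs` would then be passed to a full similarity computation, which is where the reduction in matching cost comes from. Entities with identical token sets always collide; dissimilar entities collide only with low probability, controlled by the number of bands and rows.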
Linking routinely collected social work, education and health data to enable monitoring of the health and health care of school-aged children in state care (‘looked after children’) in Scotland: a national demonstration project
Background and objectives: Children in state care (‘looked after children’) have poorer health than children who are not looked after. Recent developments in Scotland and elsewhere have aimed to improve services and outcomes for looked after children. Routine monitoring of the health outcomes of looked after children compared to those of their non-looked after peers is currently lacking. Developing capacity for comparative monitoring of population based outcomes based on linkage of routinely collected administrative data has been identified as a priority. To our knowledge there are no existing population based data linkage studies providing data on the health of looked after and non-looked after children at national level. Smaller scale studies that are available generally provide very limited information on linkage methods and hence do not allow scrutiny of bias that may be introduced through the linkage process. Study design and methods: National demonstration project testing the feasibility of linking routinely collected looked after children, education, and health data. Participants: All children in publicly funded school in Scotland in 2011/12. Results: Linkage between looked after children data and the national pupil census classified 10,009 (1.5%) and 1,757 (0.3%) of 670,952 children as, respectively, currently and previously looked after. Recording of the unique pupil identifier (Scottish Candidate Number, SCN) on looked after children returns is incomplete, with 66% of looked after records for 2011/12 for children of possible school age containing a valid SCN. This will have resulted in some under-ascertainment of currently and, particularly, previously looked after children within the general pupil population. 
Further linkage of the pupil census to the NHS Scotland master patient index demonstrated that a safe link to the child’s unique health service (Community Health Index, CHI) number could be obtained for a very high proportion of children in each group (94%, 95%, and 95% of children classified as currently, previously, and non-looked after respectively). In general, linkage rates were higher for older children and those living in more affluent areas. Within the looked after group, linkage rates were highest for children with the fewest placements and for those in permanent fostering. Conclusions: This novel data linkage demonstrates the feasibility of monitoring population based health outcomes of school aged looked after and non-looked after children using linked routine administrative data. Improved recording of the unique pupil identifier number on looked after data returns would be beneficial. Extending the range of personal identifiers on looked after children returns would enable linkage to health data for looked after children who are not in publicly funded schooling (i.e. those who are pre- or post-school, home schooled, or in independent schooling).
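The classification percentages reported in the Results can be checked with a short calculation. The counts below are taken from the abstract; the helper name `pct` and the one-decimal rounding are assumptions made for this sketch.

```python
# Counts as reported in the abstract for the 2011/12 pupil census linkage.
TOTAL_PUPILS = 670_952
CURRENTLY_LOOKED_AFTER = 10_009
PREVIOUSLY_LOOKED_AFTER = 1_757


def pct(count, total, digits=1):
    """Return count as a percentage of total, rounded to `digits` decimal places."""
    return round(100 * count / total, digits)


currently_pct = pct(CURRENTLY_LOOKED_AFTER, TOTAL_PUPILS)    # 1.5
previously_pct = pct(PREVIOUSLY_LOOKED_AFTER, TOTAL_PUPILS)  # 0.3
```

Both values agree with the 1.5% and 0.3% quoted in the abstract.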
De novo construction of polyploid linkage maps using discrete graphical models
Linkage maps are used to identify the location of genes responsible for
traits and diseases. New sequencing techniques have created opportunities to
substantially increase the density of genetic markers. Such revolutionary
advances in technology have given rise to new challenges, such as creating
high-density linkage maps. Current multiple testing approaches based on
pairwise recombination fractions are underpowered in the high-dimensional
setting and do not extend easily to polyploid species. We propose to construct
linkage maps using graphical models either via a sparse Gaussian copula or a
nonparanormal skeptic approach. Linkage groups (LGs), typically chromosomes,
and the order of markers in each LG are determined by inferring the conditional
independence relationships among large numbers of markers in the genome.
Through simulations, we illustrate the utility of our map construction method
and compare its performance with other available methods, both when the data
are clean and contain no missing observations and when data contain genotyping
errors and are incomplete. We apply the proposed method to two genotype
datasets: barley and potato from diploid and polyploid populations,
respectively. Our comprehensive map construction method makes full use of the
dosage SNP data to reconstruct linkage maps for any bi-parental diploid and
polyploid species. We have implemented the method in the R package netgwas.
Comment: 25 pages, 7 figures
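The nonparanormal skeptic idea mentioned in this abstract replaces the sample correlation with a rank-based estimate: with Kendall's tau, the latent correlation between two variables is estimated as sin(π/2 · τ). The sketch below illustrates only that step, in plain Python. It is not the netgwas implementation, it uses tau-a (which ignores ties, although dosage data typically contain many), and the function names are hypothetical.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant pairs - discordant pairs) / total pairs."""
    n = len(x)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tied
    return score / (n * (n - 1) / 2)


def skeptic_correlation(columns):
    """Rank-based correlation matrix with entries sin(pi/2 * tau) for each marker pair.

    `columns` is a list of equal-length sequences, one per marker (assumed layout).
    """
    p = len(columns)
    corr = [[1.0] * p for _ in range(p)]
    for a, b in combinations(range(p), 2):
        r = math.sin(math.pi / 2 * kendall_tau(columns[a], columns[b]))
        corr[a][b] = corr[b][a] = r
    return corr
```

In a graphical-model pipeline, a sparse inverse of this matrix would then be estimated to read off conditional independence between markers; that step is omitted here.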
Linking science to technology: using bibliographic references in patents to build linkage schemes.
In this paper, we develop and discuss a method to design a linkage scheme that links the systems of science and technology through the use of patent citation data. After conceptually embedding the linkage scheme in the current literature on science-technology interactions and associations, the methodology and algorithms used to develop the linkage scheme are discussed in detail. The method is subsequently tested on and applied to subsets of USPTO patents. The results point to highly skewed citation distributions, enabling us to distinguish between those fields of technology that are highly science-interactive and those fields where technology development is highly independent from the scientific literature base.
Keywords: Science; Patents; Systems; Data; Algorithms; Distribution
