Scalable Spatial Framework for NoSQL Databases - Haslam Scholars Program Undergraduate Thesis
The spatial frameworks used for knowledge discovery in “Big Data” areas such as urban information systems (UIS) are well-developed in SQL databases but are not as extensive within certain NoSQL databases. The focus of this project is to develop this framework for emerging search systems (ESS) in UIS by utilizing NoSQL databases, notably the document-based MongoDB. Such a framework includes spatial functions for the most fundamental spatial queries. An ESS in UIS can take advantage of these new and attractive features of scalability within MongoDB to provide a robust approach to spatial search that differs from SQL relations and scalability. MongoDB, which is at a relatively early stage of spatial search compared with PostgreSQL, will require contributions to its spatial “toolbox”. Many of the operations present in SQL packages, such as PostGIS, are not in MongoDB. Thus, there is an opportunity to contribute to MongoDB’s ongoing geospatial evolution by developing, testing, and optimizing the spatial utilities used for large NoSQL datasets. Within UIS, these core operations can prove to be an important starting point for detailed geospatial analysis and high-impact data production. We hope that, by open sourcing this framework (as an extension), it can serve the research community as the foundation for scalable NoSQL platforms for big geospatial data analytics and be the next stage for open source contributions to MongoDB.
Using background knowledge for ontology evolution
One of the current bottlenecks for automating ontology evolution is resolving the right links between newly arising information and the existing knowledge in the ontology. Most existing approaches rely mainly on the user when it comes to capturing and representing new knowledge. Our ontology evolution framework intends to reduce or even eliminate user input through the use of background knowledge. In this paper, we show how various sources of background knowledge could be exploited for relation discovery. We perform a relation discovery experiment focusing on the use of WordNet and Semantic Web ontologies as sources of background knowledge. We back our experiment with a thorough analysis that highlights various issues on how to improve and validate relation discovery in the future, which will directly improve the task of automatically performing ontology changes during evolution.
The Neuroscience Information Framework: A Data and Knowledge Environment for Neuroscience
With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience’s Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come online.
Innovative in silico approaches to address avian flu using grid technology
Recent years have seen the emergence of diseases that have spread very
quickly around the world, either through human travel, as with SARS, or animal
migration, as with avian flu. Among the biggest challenges raised by emerging
infectious diseases, one is the constant mutation of the viruses, which
turns them into continuously moving targets for drug and vaccine discovery.
Another challenge is related to the early detection and surveillance of the
diseases as new cases can appear just anywhere due to the globalization of
exchanges and the circulation of people and animals around the earth, as
recently demonstrated by the avian flu epidemics. For 3 years now, a
collaboration of teams in Europe and Asia has been exploring some innovative in
silico approaches to better tackle avian flu taking advantage of the very large
computing resources available on international grid infrastructures. Grids were
used to study the impact of mutations on the effectiveness of existing drugs
against H5N1 and to find potentially new leads active on mutated strains. Grids
also allow the integration of distributed data in a completely secured way. The
paper presents how we are currently exploring how to integrate the existing
data sources towards a global surveillance network for molecular epidemiology.
Comment: 7 pages, submitted to Infectious Disorders - Drug Target
A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction
This paper proposes a genetic programming (GP) framework for two major data mining tasks, namely classification and generalized rule induction. The framework emphasizes the integration between a GP algorithm and relational database systems. In particular, the fitness of individuals is computed by submitting SQL queries to a (parallel) database server. Some advantages of this integration from a data mining viewpoint are scalability, data-privacy control and automatic parallelization.
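As a rough illustration of what computing fitness by submitting SQL queries to a database can look like, the sketch below scores one candidate classification rule against a table of labelled examples. It is not taken from the paper: the in-memory SQLite database, the `examples` table and its columns, the `rule_fitness` helper, and the choice of rule confidence (correctly classified rows divided by covered rows) as the fitness measure are all assumptions made for the example.

```python
import sqlite3


def rule_fitness(conn, antecedent_sql, target_class):
    """Score the rule 'IF antecedent THEN label = target_class'.

    Hypothetical fitness: the fraction of rows matching the antecedent
    whose label equals target_class. The counting is done by the
    database, so the GP process never loads the training rows itself.
    antecedent_sql is assumed to come from the GP individual, not from
    untrusted user input, since it is concatenated into the query.
    """
    query = (
        "SELECT COUNT(*), COALESCE(SUM(label = ?), 0) "
        "FROM examples WHERE " + antecedent_sql
    )
    covered, correct = conn.execute(query, (target_class,)).fetchone()
    # A rule that covers no rows gets the lowest possible fitness.
    return correct / covered if covered else 0.0


# Toy training data standing in for a real (possibly parallel) server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE examples (age INTEGER, income INTEGER, label TEXT)")
conn.executemany(
    "INSERT INTO examples VALUES (?, ?, ?)",
    [(25, 30, "no"), (40, 80, "yes"), (35, 60, "yes"), (50, 20, "no")],
)

# Each GP individual would encode an antecedent such as these.
for antecedent in ("income > 50", "age > 30"):
    print(antecedent, rule_fitness(conn, antecedent, "yes"))
```

Because each evaluation is a single aggregate query, the database server decides how to scan and parallelize the work, which is one way to read the scalability and automatic parallelization advantages the abstract mentions.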