    Scalable Spatial Framework for NoSQL Databases - Haslam Scholars Program Undergraduate Thesis

    The spatial frameworks used for knowledge discovery in “Big Data” areas such as urban information systems (UIS) are well developed in SQL databases but are not as extensive within certain NoSQL databases. The focus of this project is to develop such a framework for emerging search systems (ESS) in UIS by utilizing NoSQL databases, notably the document-based MongoDB. The framework includes spatial functions for the most fundamental spatial queries. An ESS in UIS can take advantage of MongoDB’s scalability features to provide a robust approach to spatial search that differs from SQL relations and scalability. MongoDB, which is at a relatively early stage of spatial search compared with PostgreSQL, will require contributions to its spatial “toolbox”. Many of the operations present in SQL packages, such as PostGIS, are not in MongoDB. Thus, there is an opportunity to contribute to MongoDB’s ongoing geospatial evolution by developing, testing, and optimizing the spatial utilities used for large NoSQL datasets. Within UIS, these core operations can prove to be an important starting point for detailed geospatial analysis and high-impact data production. We hope that, by open sourcing this framework (as an extension), it can serve the research community as the foundation for scalable NoSQL platforms for big geospatial data analytics and be the next stage for open source contributions to MongoDB.

    The Neuroscience Information Framework: A Data and Knowledge Environment for Neuroscience

    With support from the Institutes and Centers forming the NIH Blueprint for Neuroscience Research, we have designed and implemented a new initiative for integrating access to and use of Web-based neuroscience resources: the Neuroscience Information Framework. The Framework arises from the expressed need of the neuroscience community for neuroinformatic tools and resources to aid scientific inquiry, builds upon prior development of neuroinformatics by the Human Brain Project and others, and directly derives from the Society for Neuroscience’s Neuroscience Database Gateway. Partnered with the Society, its Neuroinformatics Committee, and volunteer consultant-collaborators, our multi-site consortium has developed: (1) a comprehensive, dynamic inventory of Web-accessible neuroscience resources, (2) an extended and integrated terminology describing resources and contents, and (3) a framework accepting and aiding concept-based queries. Evolving instantiations of the Framework may be viewed at http://nif.nih.gov, http://neurogateway.org, and other sites as they come online.

    Innovative in silico approaches to address avian flu using grid technology

    Recent years have seen the emergence of diseases that have spread very quickly around the world, either through human travel, as with SARS, or through animal migration, as with avian flu. Among the biggest challenges raised by emerging infectious diseases, one is the constant mutation of the viruses, which turns them into continuously moving targets for drug and vaccine discovery. Another challenge is the early detection and surveillance of the diseases, as new cases can appear anywhere due to the globalization of exchanges and the circulation of people and animals around the earth, as recently demonstrated by the avian flu epidemics. For 3 years now, a collaboration of teams in Europe and Asia has been exploring innovative in silico approaches to better tackle avian flu, taking advantage of the very large computing resources available on international grid infrastructures. Grids were used to study the impact of mutations on the effectiveness of existing drugs against H5N1 and to find potential new leads active on mutated strains. Grids also allow the integration of distributed data in a completely secured way. The paper presents our ongoing work on integrating the existing data sources towards a global surveillance network for molecular epidemiology. Comment: 7 pages, submitted to Infectious Disorders - Drug Target

    A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction

    This paper proposes a genetic programming (GP) framework for two major data mining tasks, namely classification and generalized rule induction. The framework emphasizes the integration between a GP algorithm and relational database systems. In particular, the fitness of individuals is computed by submitting SQL queries to a (parallel) database server. Some advantages of this integration from a data mining viewpoint are scalability, data-privacy control, and automatic parallelization.
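The idea of computing fitness through SQL queries can be illustrated with a minimal sketch. It is not the paper's implementation: the `customers` table, its columns, the helper names, and the choice of rule confidence (correctly classified rows divided by covered rows) as the fitness measure are all assumptions made for illustration, and an in-memory SQLite database stands in for the (parallel) database server.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory table (hypothetical data, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (age INTEGER, income INTEGER, buys TEXT)")
    rows = [
        (25, 30000, "no"),
        (35, 60000, "yes"),
        (45, 80000, "yes"),
        (22, 20000, "no"),
        (50, 40000, "no"),
        (38, 75000, "yes"),
    ]
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return conn


def rule_fitness(conn, antecedent, predicted_class):
    """Score a rule "IF antecedent THEN buys = predicted_class".

    `antecedent` is a SQL boolean expression assumed to be generated by the
    GP algorithm itself (trusted input, not user-supplied text). The fitness
    used here is rule confidence, an assumed measure: rows that satisfy the
    antecedent and have the predicted class, divided by rows that satisfy
    the antecedent.
    """
    covered = conn.execute(
        f"SELECT COUNT(*) FROM customers WHERE {antecedent}"
    ).fetchone()[0]
    if covered == 0:
        return 0.0  # a rule that covers no rows gets the lowest score
    correct = conn.execute(
        f"SELECT COUNT(*) FROM customers WHERE ({antecedent}) AND buys = ?",
        (predicted_class,),
    ).fetchone()[0]
    return correct / covered


if __name__ == "__main__":
    conn = build_demo_db()
    for antecedent in ("income > 50000", "age > 30", "age > 100"):
        print(antecedent, rule_fitness(conn, antecedent, "yes"))
```

Because only counts leave the database, the GP process never has to load the raw rows, which is one way to read the scalability and data-privacy advantages the abstract mentions.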