
    Network Community Detection on Metric Space

    Community detection in a complex network is an important problem that has attracted much interest in recent years. In general, a community detection algorithm chooses an objective function that captures the community structure of the network and then applies various heuristics to optimize it and extract the communities of interest to the user. In this article, we demonstrate a procedure for transforming a graph into points of a metric space and develop community detection methods based on a metric defined on pairs of points. We also study and analyze the resulting community structure of the network. The results obtained with our approach are very competitive with most of the well-known algorithms in the literature, as demonstrated over a large collection of datasets. Moreover, the time taken by our algorithm is considerably lower than that of other methods, which is consistent with our theoretical findings.
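
    The abstract does not specify the metric or the optimization heuristic used. Purely as an illustration of the general idea of mapping graph vertices into a metric space and grouping them by pairwise distance, the sketch below uses hop-count (shortest-path) distance and a greedy grouping rule; the function names, the radius threshold, and the toy graph are assumptions, not the authors' method.

```python
# Illustrative sketch (not the paper's algorithm): embed graph nodes in a
# metric space via shortest-path distance, then group nodes whose mutual
# distance stays within a threshold.
from collections import deque

def bfs_distances(adj, source):
    """Hop-count (shortest-path) distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def metric_communities(adj, radius=1):
    """Greedy grouping: a node joins a community only if it is within
    `radius` of every current member (a crude single-metric criterion)."""
    distances = {u: bfs_distances(adj, u) for u in adj}
    communities = []
    for u in adj:
        placed = False
        for comm in communities:
            if all(distances[u].get(v, float("inf")) <= radius for v in comm):
                comm.add(u)
                placed = True
                break
        if not placed:
            communities.append({u})
    return communities

# Two triangles joined by a single bridge edge (2-3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(metric_communities(adj, radius=1))   # [{0, 1, 2}, {3, 4, 5}]
```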

    The Global Ant Biodiversity Informatics (GABI) database: synthesizing data on the geographic distribution of ant species (Hymenoptera: Formicidae)

    The global distribution patterns of most vertebrate groups and several plant groups have been described and analyzed over the past few years, a development facilitated by the compilation of important databases. Similar efforts are needed for large insect groups that constitute the majority of global biodiversity. As a result of this lack of information, invertebrate taxa are often left out of both large-scale analyses of biodiversity patterns and large-scale efforts in conservation planning and prioritization. Here, we introduce the first comprehensive global database of ant species distributions, the Global Ant Biodiversity Informatics (GABI) database, based on the compilation of 1.72 million records extracted from over 8811 publications and 25 existing databases. We first present the main goals of the database and the methodology used to build it, as well as its limitations and challenges. Then, we discuss how different fields of ant biology may benefit from this tool. Finally, we emphasize the importance of future participation by myrmecologists to improve the database and use it to identify and fill gaps in our knowledge of ant biodiversity.

    Knowledge sharing and collaboration in translational research, and the DC-THERA Directory

    Biomedical research relies increasingly on large collections of data sets and knowledge whose generation, representation and analysis often require large collaborative and interdisciplinary efforts. This dimension of ‘big data’ research calls for the development of computational tools to manage such vast amounts of data, as well as tools that can improve communication and access to information among collaborating researchers and the wider community. Whenever research projects have a defined temporal scope, an additional data-management issue arises, namely how the knowledge generated within the project can be made available beyond its boundaries and lifetime. DC-THERA is a European ‘Network of Excellence’ (NoE) that spawned a very large collaborative and interdisciplinary research community focusing on the development of novel immunotherapies derived from fundamental research in dendritic cell immunobiology. In this article we introduce the DC-THERA Directory, an information system designed to support knowledge management for this research community and beyond. We present how the use of metadata and Semantic Web technologies can effectively help organize the knowledge generated by modern collaborative research, how these technologies can enable effective data management solutions during and beyond the project lifecycle, and how resources such as the DC-THERA Directory fit into the larger context of e-science.
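
    As a hedged illustration of how metadata and Semantic Web technologies can organize such project knowledge, the sketch below records a data set and its protocol as RDF triples and retrieves them with a SPARQL query (using rdflib). The namespace, properties, and resource names are invented for this example and are not the DC-THERA schema.

```python
# Minimal sketch of metadata-driven knowledge organisation with RDF.
# The vocabulary below is invented for illustration only.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/directory#")

g = Graph()
g.bind("ex", EX)

dataset = EX["dc_maturation_assay"]
protocol = EX["maturation_protocol_v2"]

g.add((dataset, RDF.type, EX.Dataset))
g.add((dataset, EX.topic, Literal("dendritic cell maturation")))
g.add((dataset, EX.producedBy, EX["lab_42"]))
g.add((protocol, RDF.type, EX.Protocol))
g.add((dataset, EX.followsProtocol, protocol))

# Which datasets exist for a given topic, and which protocol do they follow?
query = """
PREFIX ex: <http://example.org/directory#>
SELECT ?dataset ?protocol WHERE {
    ?dataset a ex:Dataset ;
             ex:topic "dendritic cell maturation" ;
             ex:followsProtocol ?protocol .
}
"""
for row in g.query(query):
    print(row.dataset, row.protocol)
```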

    Deep Learning Training and Benchmarks for Earth Observation Images: Data Sets, Features, and Procedures

    Deep learning methods are often used for image classification or local object segmentation. The corresponding test and validation data sets are an integral part of the learning process as well as of algorithm performance evaluation. High and very high-resolution Earth observation (EO) applications based on satellite images primarily aim at the semantic labeling of land cover structures or objects, as well as of temporal evolution classes. However, one of the main EO objectives is the retrieval of physical parameters such as temperature, precipitation, and crop yield. Therefore, we need reliably labeled data sets and tools to train the developed algorithms and to assess the performance of our deep learning paradigms. Generally, imaging sensors generate a visually understandable representation of the observed scene. However, this does not hold for many EO images, where the recorded images only depict a spectral subset of the scattered light field, thus generating an indirect signature of the imaged object. This makes EO image understanding a new and particular challenge for Machine Learning (ML) and Artificial Intelligence (AI). This chapter reviews and analyses new approaches to EO imaging that leverage recent advances in physical process-based ML and AI methods and signal processing.
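
    The chapter itself covers training and benchmarking procedures in detail; as a minimal, hedged sketch of the kind of supervised training loop involved in semantic labeling of EO patches, the PyTorch snippet below classifies small multispectral patches into land-cover classes. The band count, class count, model, and random stand-in data are placeholders rather than a benchmark setup.

```python
# Toy patch-classification training loop, standing in for an EO semantic
# labeling benchmark. Real work would use labeled satellite patches and the
# spectral bands relevant to the sensor.
import torch
import torch.nn as nn

NUM_BANDS, NUM_CLASSES = 4, 6   # assumed multispectral bands / land-cover classes

model = nn.Sequential(
    nn.Conv2d(NUM_BANDS, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, NUM_CLASSES),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for a labeled EO patch data set: 32 random 16x16 patches.
patches = torch.randn(32, NUM_BANDS, 16, 16)
labels = torch.randint(0, NUM_CLASSES, (32,))

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(patches), labels)
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.3f}")
```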

    Location, location, location: utilizing pipelines and services to more effectively georeference the world's biodiversity data

    Background: Increasing the quantity and quality of data is a key goal of biodiversity informatics, leading to increased fitness for use in scientific research and beyond. This goal is impeded by a legacy of geographic locality descriptions associated with biodiversity records that are often heterogeneous and not in a map-ready format. The biodiversity informatics community has developed best practices and tools that provide the means to do retrospective georeferencing (e.g., the BioGeomancer toolkit), a process that converts heterogeneous descriptions into geographic coordinates and a measurement of spatial uncertainty. Even with these methods and tools, data publishers are faced with the immensely time-consuming task of vetting georeferenced localities. Furthermore, it is likely that overlap in georeferencing effort is occurring across data publishers. Solutions are needed that help publishers more effectively georeference their records, verify their quality, and eliminate the duplication of effort across publishers.

    Results: We have developed a tool called BioGeoBIF, which incorporates the high-throughput and standardized georeferencing methods of BioGeomancer into a beginning-to-end workflow. Custodians who publish their data to the Global Biodiversity Information Facility (GBIF) can use this system to improve the quantity and quality of their georeferences. BioGeoBIF harvests records directly from the publishers' access points, georeferences the records using the BioGeomancer web service, and makes results available to data managers for inclusion at the source. Using a web-based, password-protected, group management system for each data publisher, we leave data ownership, management, and vetting responsibilities with the managers and collaborators of each data set. We also minimize the georeferencing task by combining and storing unique textual localities from all registered data access points, and dynamically linking that information to the password-protected record information for each publisher.

    Conclusion: We have developed one of the first examples of services that can help create higher quality data for publishers mediated through the Global Biodiversity Information Facility and its data portal. This service is one step towards solving many problems of data quality in the growing field of biodiversity informatics. We envision future improvements to our service that include faster result returns and inclusion of more georeferencing engines.
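
    The workflow described above (harvest locality strings, georeference each unique one once, and return coordinates plus uncertainty to the publisher) can be sketched roughly as follows. The georeference() stub stands in for a call to a service such as the BioGeomancer web service, whose actual API is not shown here; the example locality and coordinates are illustrative values only.

```python
# Rough sketch of a deduplicating georeferencing pipeline: each unique
# textual locality is resolved once, and its coordinates plus uncertainty
# radius are attached to every record sharing that locality string.
from typing import Optional

def georeference(locality: str) -> Optional[dict]:
    """Placeholder: a real implementation would call a georeferencing
    service and return coordinates with a spatial uncertainty estimate."""
    gazetteer = {
        "5 km N of Townsville, QLD": {"lat": -19.21, "lon": 146.80, "uncertainty_m": 5000},
    }
    return gazetteer.get(locality)

def georeference_records(records: list[dict]) -> list[dict]:
    cache = {}  # unique locality string -> georeference result
    for rec in records:
        loc = rec["locality"]
        if loc not in cache:
            cache[loc] = georeference(loc)
        if cache[loc]:
            rec.update(cache[loc])
    return records

records = [
    {"id": "A1", "locality": "5 km N of Townsville, QLD"},
    {"id": "A2", "locality": "5 km N of Townsville, QLD"},
]
print(georeference_records(records))
```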

    The consistent representation of scientific knowledge: investigations into the ontology of karyotypes and mitochondria

    PhD Thesis. Ontologies are widely used in the life sciences to model scientific knowledge. The engineering of these ontologies is well studied, and there are a variety of methodologies and techniques, some of which have been re-purposed from software engineering. However, due to the complex nature of bio-ontologies, they are not resistant to errors and mistakes. This is especially true for more expressive and/or larger ontologies. To improve on this issue, we explore a variety of software engineering techniques that were re-purposed to aid ontology engineering. This exploration is driven by the construction of two light-weight ontologies, The Mitochondrial Disease Ontology and The Karyotype Ontology. These ontologies have specific and useful computational goals, as well as providing exemplars for our methodology. This thesis discusses the modelling decisions undertaken as well as the overall success of each ontological model. Due to the added knowledge capture steps required for the mitochondrial knowledge, The Karyotype Ontology is further developed than The Mitochondrial Disease Ontology. Specifically, this thesis explores the use of a pattern-driven and programmatic approach to biomedical ontology engineering. During the engineering of our biomedical ontologies, we found that many of the components of each model were similar in their logical and textual definitions. This was especially true for The Karyotype Ontology. In software engineering, a common technique to avoid replication is to abstract through the use of patterns. Therefore, we utilised localised patterns to model these highly repetitive models. There are a variety of possible tools for encoding these patterns, but we found ontology development using Graphical User Interface (GUI) tools to be time-consuming due to the necessity of manual GUI interaction whenever the ontology needed updating. With the development of Tawny-OWL, a programmatic tool for ontology construction, we are able to overcome this issue, with the added benefit of using a single syntax to express both simple and patternised parts of the ontology. Lastly, we briefly discuss how other methodologies and tools from software engineering, namely unit tests, diffing, version control and Continuous Integration (CI), were re-purposed and how they aided the engineering of our two domain ontologies. Together, this knowledge increases our understanding of ontology engineering techniques. By re-purposing software engineering methodologies, we have aided the construction, quality and maintainability of two novel ontologies, and have demonstrated their applicability more generally.
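
    Tawny-OWL itself is a Clojure library, so the Python sketch below only illustrates the pattern-driven idea the thesis describes: many near-identical class definitions (here, one per human chromosome) are generated from a single localised pattern instead of being written by hand in a GUI editor. The class and field names are invented for illustration.

```python
# Pattern-driven generation of repetitive ontology classes (illustrative only;
# Tawny-OWL expresses the same idea programmatically in Clojure).
from dataclasses import dataclass

@dataclass
class OntologyClass:
    name: str
    parent: str
    definition: str

def chromosome_pattern(identifier: str) -> OntologyClass:
    """One localised pattern, instantiated once per chromosome."""
    return OntologyClass(
        name=f"HumanChromosome{identifier}",
        parent="HumanChromosome",
        definition=f"A human chromosome with identifier {identifier}.",
    )

# 22 autosomes plus the two sex chromosomes, all from the same pattern.
karyotype_classes = [chromosome_pattern(str(n)) for n in range(1, 23)]
karyotype_classes += [chromosome_pattern(s) for s in ("X", "Y")]

for cls in karyotype_classes[:3]:
    print(cls.name, "subClassOf", cls.parent)
```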

    Enhanced Laboratory Learning

    Innovation in the field of information sharing has the potential to increase the educational value of material through more convenient and easier-to-use formats. This project sought to implement an informational database using Web 2.0 wiki technology to improve student comprehension of Biology Lab material and procedures. The collected results showed that the wiki generally improved the students' ability to comprehend and access the necessary material. Related areas of study for future projects could include enabling all users to edit pages, creating a peer-edited informational database.