
    Implementing the WiPo Architecture


    High-quality Web information provisioning and quality-based data pricing

    Today, information can be considered a production factor. This is attributed to the technological innovations the Internet and the Web have brought about. Now, a plethora of information is available, making it hard to find the most relevant information. Subsequently, the issue of finding and purchasing high-quality data arises. Addressing these challenges, this work first examines how high-quality information provisioning can be achieved with an approach called WiPo that exploits the idea of curation, i.e., the selection, organisation, and provisioning of information with human involvement. The second part of this work investigates the issue that there is little understanding of what the value of data is and how it can be priced, despite the fact that data is already being traded on data marketplaces. To overcome this, a pricing approach based on the Multiple-Choice Knapsack Problem is proposed that allows for utility maximisation for customers and profit maximisation for vendors.
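    To make the pricing idea concrete, here is a minimal sketch (not the thesis' actual model) of the Multiple-Choice Knapsack Problem applied to data purchasing: each data product is offered in several quality versions with a price and a utility, and a customer buys at most one version per product to maximise total utility within an integer budget. All numbers are invented for illustration.

    ```python
    def max_utility(groups, budget):
        """groups: list of products, each a list of (price, utility) versions.

        Dynamic program over the budget; buying nothing from a product is
        always allowed, so best[b] is monotone in b.
        """
        # best[b] = max utility achievable while spending at most b
        best = [0.0] * (budget + 1)
        for versions in groups:
            new = best[:]  # option: buy no version of this product
            for b in range(budget + 1):
                for price, utility in versions:
                    if price <= b:
                        new[b] = max(new[b], best[b - price] + utility)
            best = new
        return best[budget]
    ```

    With two products, each in a cheap and a premium version, the customer's optimal basket falls out of a single call such as `max_utility([[(3, 10), (5, 14)], [(2, 4), (4, 9)]], 7)`.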

    The eNanoMapper database for nanomaterial safety information

    Background: The NanoSafety Cluster, a cluster of projects funded by the European Commission, identified the need for a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs). Ontologies, open standards, and interoperable designs were envisioned to empower a harmonized approach to European research in nanotechnology. This setting provides a number of opportunities and challenges in the representation of nanomaterials data and the integration of ENM information originating from diverse systems. Within this cluster, eNanoMapper works towards supporting the collaborative safety assessment for ENMs by creating a modular and extensible infrastructure for data sharing, data analysis, and building computational toxicology models for ENMs. Results: The eNanoMapper database solution builds on the previous experience of the consortium partners in supporting diverse data through flexible data storage, open source components and web services. We have recently described the design of the eNanoMapper prototype database along with a summary of challenges in the representation of ENM data and an extensive review of existing nano-related data models, databases, and nanomaterials-related entries in chemical and toxicogenomic databases. This paper continues with a focus on the database functionality exposed through its application programming interface (API), and its use in visualisation and modelling. Considering the preferred community practice of using spreadsheet templates, we developed a configurable spreadsheet parser facilitating user-friendly data preparation and data upload. We further present a web application able to retrieve the experimental data via the API and analyze it with multiple data preprocessing and machine learning algorithms.
Conclusion: We demonstrate how the eNanoMapper database is used to import and publish online ENM and assay data from several data sources, how the “representational state transfer” (REST) API enables building user-friendly interfaces and graphical summaries of the data, and how these resources facilitate the modelling of reproducible quantitative structure–activity relationships for nanomaterials (NanoQSAR).
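The idea of a configurable spreadsheet parser can be illustrated with a small sketch. This is hypothetical: the real eNanoMapper parser and its configuration format are different, and the field and column names below are invented. The point is that a config maps template column headers to canonical field names, so supporting a new spreadsheet template only requires a new config, not new parsing code.

```python
import csv
import io

# Invented example config: canonical field name -> template column header.
CONFIG = {
    "material": "Nanomaterial name",
    "endpoint": "Assay endpoint",
    "value": "Result value",
}

def parse_template(csv_text, config):
    """Parse a CSV export of a spreadsheet template into canonical records."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{field: row[column] for field, column in config.items()}
            for row in reader]
```

A different template with, say, a "Material" column instead of "Nanomaterial name" would only need a different `CONFIG` dictionary.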

    The first IEEE workshop on the Future of Research Curation and Research Reproducibility

    This report describes perspectives from the Workshop on the Future of Research Curation and Research Reproducibility that was collaboratively sponsored by the U.S. National Science Foundation (NSF) and IEEE (Institute of Electrical and Electronics Engineers) in November 2016. The workshop brought together stakeholders including researchers, funders, and notably, leading science, technology, engineering, and mathematics (STEM) publishers. The overarching objective was a deep dive into new kinds of research products and how the costs of creation and curation of these products can be sustainably borne by the agencies, publishers, and researcher communities that were represented by workshop participants. National Science Foundation Award #164101.

    Capturing mobile security policies precisely

    The security policies of mobile devices that describe how we should use these devices are often informally specified. Users have preferences for some apps over others. Some users may avoid apps which can access large amounts of their personal data, whilst others may not care. A user is unlikely to write down these policies or describe them using a formal policy language. This is unfortunate, as without a formal description of the policy we cannot precisely reason about it. We cannot help users to pick the apps they want if we cannot describe their policies. Companies have mobile security policies that define how an employee should use smartphones and tablet computers brought from home at work. A company might describe the policy in a natural language document for employees to read and agree to. It might also use some software installed on employees’ devices to enforce the company rules. Without a link between the specification of the policy in the natural language document and the implementation of the policy with the tool, understanding how they are related can be hard. This thesis looks at developing an authorisation logic, called AppPAL, to capture the informal security policies of the mobile ecosystem, which we define as the interactions surrounding the use of mobile devices in a particular setting. This includes the policies of the users, the devices, the app stores, and the environments the users bring the devices into. Whilst earlier work has looked at checking and enforcing policies with low-level controls, this work aims to capture these informal policies’ intents and the trust relationships within them, separating the policy specification from its enforcement. This allows us to analyse the informal policies precisely, and reason about how they are used. We show how AppPAL instantiates SecPAL, a policy language designed for access control in distributed environments. We describe AppPAL’s implementation as an authorisation logic for mobile ecosystems.
We show how we can check AppPAL policies for common errors. Using AppPAL we show that policies describing users’ privacy preferences do not seem to match the apps users install. We explore the differences between app stores and how to create new ones based on policy. We look at five BYOD policies and discover previously unexamined idioms within them. This suggests aspects of BYOD policies not managed by current BYOD tools.
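To give a flavour of the SecPAL-style reasoning AppPAL builds on, here is a toy sketch. It is heavily simplified and the predicates are invented: facts are (speaker, predicate, object) triples, and a "trusts" fact acts like a can-say delegation, so that statements made by a trusted principal count as the truster's own conclusions.

```python
def derive(facts):
    """Forward-chain one delegation rule: if U trusts S and S says an app is
    safe, then U also concludes the app is safe. Returns the closed fact set."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for speaker, pred, obj in list(derived):
            if pred != "safe":
                continue
            for truster, rel, target in list(derived):
                if rel == "trusts" and target == speaker:
                    new_fact = (truster, "safe", obj)
                    if new_fact not in derived:
                        derived.add(new_fact)
                        changed = True
    return derived
```

Starting from `{("user", "trusts", "store"), ("store", "safe", "GoodApp")}`, the closure contains `("user", "safe", "GoodApp")`; a claim by a principal the user does not trust would not propagate. Real SecPAL/AppPAL conditions delegation per-predicate and supports parameterised assertions, which this toy omits.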

    Quantifying prey availability using the foraging plasticity of a marine predator, the little penguin

    Detecting changes in marine food webs is challenging, but top predators can provide information on lower trophic levels. However, many commonly measured predator responses can be decoupled from prey availability by plasticity in predator foraging effort. This can be overcome by directly measuring foraging effort and success and integrating these into a measure of foraging efficiency analogous to the catch per unit effort (CPUE) index employed by fisheries. We extended existing CPUE methods so that they would be applicable to the study of generalist foragers, which introduce another layer of complexity through dietary plasticity. Using this method, we inferred species-specific patterns in prey availability and estimated taxon-specific biomass consumption. We recorded foraging trip duration and body mass change of breeding little penguins Eudyptula minor and combined these with diet composition identified via non-invasive faecal DNA metabarcoding to derive CPUE indices for individual prey taxa. We captured weekly patterns of availability of key fish prey in the penguins’ diet and identified a major prey shift from sardine Sardinops sagax to red cod Pseudophycis bachus between years. In each year, predation on a dominant fish species (~150 g/day) was replaced by greater diversity of fish in the diet as the breeding season progressed. We estimated that the colony extracted ~1,300 tonnes of biomass from their coastal ecosystem over two breeding seasons, including 219 tonnes of the commercially important sardine and 215 tonnes of red cod. This enhanced pCPUE is applicable to most central-place foragers and offers a valuable alternative to existing metrics. Informed prey-species biomass estimates extracted by apex and meso predators will be a useful input for mass-balance ecosystem models and for informing ecosystem-based management. A free Plain Language Summary can be found within the Supporting Information of this article.
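A back-of-envelope sketch of a prey-specific catch-per-unit-effort index in the spirit described above (the paper's actual formulation is richer): mass gained on a foraging trip is apportioned among prey taxa by their diet proportions from DNA metabarcoding, then divided by trip duration. All numbers below are invented for illustration.

```python
def pcpue(mass_gain_g, trip_hours, diet_proportions):
    """Return grams of each prey taxon caught per hour of foraging.

    diet_proportions: taxon -> fraction of the diet (should sum to ~1).
    """
    return {taxon: mass_gain_g * share / trip_hours
            for taxon, share in diet_proportions.items()}
```

For a trip gaining 120 g over 12 h with a diet of 75% sardine and 25% red cod, the index attributes 7.5 g/h to sardine and 2.5 g/h to red cod; tracking these values week by week is what reveals prey shifts.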

    Methods to Improve Applicability and Efficiency of Distributed Data-Centric Compute Frameworks

    The success of modern applications depends on the insights they collect from their data repositories. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size, as they collect data from varied sources: web applications, mobile phones, sensors and other connected devices. Distributed storage and data-centric compute frameworks have been invented to store and analyze these large datasets. This dissertation focuses on extending the applicability and improving the efficiency of distributed data-centric compute frameworks.

    Anomaly-based Filtering of Application-Layer DDoS Against DNS Authoritatives

    Authoritative DNS infrastructures are at the core of the Internet ecosystem. But how resilient are typical authoritative DNS name servers against application-layer Denial-of-Service attacks? In this paper, with the help of a large country-code TLD operator, we assess the expected attack load and DoS countermeasures. We find that standard botnets or even single-homed attackers can overload the computational resources of authoritative name servers, even if redundancy such as anycast is in place. To prevent the resulting devastating DNS outages, we assess how effective upstream filters can be as a last resort. We propose an anomaly detection defense that allows both well-behaved high-volume DNS resolvers and low-volume clients to continue name lookups while blocking most of the attack traffic. Upstream ISPs or IXPs can deploy our scheme and drop attack traffic to reasonable query loads at or below 100k queries per second, at a false positive rate of 1.2% to 5.7% (median 2.4%).
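A simplified sketch of the kind of upstream filter evaluated here; the paper's actual anomaly detector is more elaborate, and the allow-list and budget values below are invented. Known well-behaved high-volume resolvers always pass; other sources are dropped once they exceed a per-window query budget, so low-volume clients keep resolving names while bulk attack traffic is capped.

```python
from collections import Counter

KNOWN_RESOLVERS = {"192.0.2.53"}   # e.g. learned from pre-attack traffic
PER_CLIENT_BUDGET = 100            # queries per time window (illustrative)

def filter_window(query_sources):
    """query_sources: iterable of source IPs seen in one time window.

    Returns the sources of queries that are forwarded to the name server.
    """
    counts = Counter()
    passed = []
    for src in query_sources:
        counts[src] += 1
        if src in KNOWN_RESOLVERS or counts[src] <= PER_CLIENT_BUDGET:
            passed.append(src)
    return passed
```

An unknown source flooding 150 queries in a window gets only its first 100 through, while an allow-listed resolver is never throttled; false positives correspond to legitimate unknown clients that exceed the budget.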

    Water and Nutrition: Harmonizing actions for the United Nations Decade of Action on Nutrition and the United Nations Water Action Decade

    Progress for both SDG 2 and SDG 6 has been unsatisfactory, with several indicators worsening over time, including an increase in the number of undernourished, overweight and obese people, as well as rapid increases in the number of people at risk of severe water shortages. This lack of progress is exacerbated by climate change and growing regional and global inequities in food and water security, including access to good quality diets, leading to increased violation of the human rights to water and food. Reversing these trends will require a much greater effort on the part of water, food security, and nutrition communities, including stronger performances by the United Nations Decade of Action on Nutrition and the United Nations International Decade for Action on Water for Sustainable Development. To date, increased collaboration by these two landmark initiatives is lacking, as neither work program has systematically explored linkages or possibilities for joint interventions. Collaboration is especially imperative given the fundamental challenges that characterize the promotion of one priority over another. Without coordination across the water, food security, and nutrition communities, actions toward achieving SDG 2 on zero hunger may contribute to further degradation of the world’s water resources and, as such, further derail achievement of the UN Decade of Action on Water and SDG 6 on water and sanitation. Conversely, actions to enhance SDG 6 may well reduce progress on the UN Decade of Action on Nutrition and SDG 2. This paper reviews these challenges as part of a broader analysis of the complex web of pathways that link water, food security and nutrition outcomes. Climate change and the growing demand for water resources are also considered, given their central role in shaping future water and nutrition security. The main conclusions are presented as three recommendations focused on potential avenues to deal with the complexity of the water-nutrition nexus, and to optimize outcomes.

    Algorithmic Techniques in Gene Expression Processing. From Imputation to Visualization

    The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data is rapidly growing, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially that obtained from gene expression microarray experiments. First, we study ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values. Missing value imputation is a method commonly used to make the original incomplete data complete, making it easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Secondly, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. It was observed that missing value imputation indeed is a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but there was also a need for more advanced imputation methods, such as Bayesian principal component analysis (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are an example of the outcome of multiple biological experiments, such as those using gene microarray techniques. Such networks are typically very large and highly connected, so there is a need for fast algorithms for producing visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
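A minimal sketch of the k-NN missing-value imputation mentioned above (simplified: real implementations typically weight neighbours by distance and normalise columns). `None` marks a missing entry; it is replaced by the mean of that column over the k rows most similar on the mutually observed columns.

```python
import math

def knn_impute(rows, k=2):
    """Impute missing entries (None) in a list of numeric rows via k-NN."""
    def dist(a, b):
        # Euclidean distance over columns observed in both rows,
        # normalised by the number of shared columns.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return float("inf")
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is None:
                neighbours = sorted(
                    (dist(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None)
                nearest = [v for _, v in neighbours[:k]]
                filled[i][j] = sum(nearest) / len(nearest)
    return filled
```

For gene expression matrices, rows would be genes and columns arrays; the thesis' contribution of guiding imputation with curated external biological information would change how neighbours are selected, which this plain sketch does not model.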
