3,719 research outputs found

    AiiDA: Automated Interactive Infrastructure and Database for Computational Science

    Over the last decades, computational science has seen a spectacular rise in the scope, breadth, and depth of its efforts. Notwithstanding this prevalence and impact, it is often still performed using the renaissance model of individual artisans gathered in a workshop, under the guidance of an established practitioner. Great benefits could follow instead from adopting concepts and tools from computer science to manage, preserve, and share these computational efforts. We illustrate here our paradigm for sustaining such a vision, based on the four pillars of Automation, Data, Environment, and Sharing. We then discuss its implementation in the open-source AiiDA platform (http://www.aiida.net), which has been tuned first to the demands of computational materials science. AiiDA's design is based on directed acyclic graphs to track the provenance of data and calculations, and to ensure preservation and searchability. Remote computational resources are managed transparently, and automation is coupled with data storage to ensure reproducibility. Last, complex sequences of calculations can be encoded into scientific workflows. We believe that AiiDA's design and its sharing capabilities will encourage the creation of social ecosystems to disseminate codes, data, and scientific workflows. Comment: 30 pages, 7 figures
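The abstract above says that provenance is tracked with a directed acyclic graph of data and calculations. A minimal sketch of that idea follows. It is not AiiDA's API: the class `ProvenanceGraph`, its methods, and the node names are hypothetical and chosen only for illustration.

```python
class ProvenanceGraph:
    """Toy directed acyclic graph linking data nodes and calculation nodes.

    Hypothetical illustration only; this is not the AiiDA API.
    """

    def __init__(self):
        self.parents = {}   # node -> set of nodes it was directly derived from
        self.children = {}  # node -> set of nodes directly derived from it

    def add_link(self, source, target):
        """Record that `target` was produced from `source`, refusing cycles."""
        if source == target or source in self.descendants(target):
            raise ValueError(f"link {source} -> {target} would create a cycle")
        self.parents.setdefault(target, set()).add(source)
        self.children.setdefault(source, set()).add(target)

    def _walk(self, start, edges):
        # Depth-first traversal collecting every node reachable from `start`.
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def ancestors(self, node):
        """Everything `node` depends on, directly or indirectly."""
        return self._walk(node, self.parents)

    def descendants(self, node):
        """Everything derived from `node`, directly or indirectly."""
        return self._walk(node, self.children)


# Assumed example chain: input data -> calculation -> output data -> ...
g = ProvenanceGraph()
g.add_link("structure", "relax_calc")
g.add_link("relax_calc", "relaxed_structure")
g.add_link("relaxed_structure", "bands_calc")
g.add_link("bands_calc", "band_gap")

lineage = g.ancestors("band_gap")
```

Because every link is checked against the existing descendants of its target, the graph stays acyclic, so the full lineage of any result can be recovered with a single traversal.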

    Semantically Resolving Type Mismatches in Scientific Workflows

    Scientists are increasingly utilizing Grids to manage large data sets and execute scientific experiments on distributed resources. Scientific workflows are used as a means of modeling and enacting scientific experiments. Windows Workflow Foundation (WF) is a major component of Microsoft’s .NET technology that offers lightweight support for long-running workflows. It provides a comfortable graphical and programmatic environment for the development of extended BPEL-style workflows. WF’s visual features ease the syntactic composition of Web services into scientific workflows, but they do nothing to ensure that information passed between services has consistent semantic types or representations, or that deviant flows, errors, and compensations are handled meaningfully. In this paper we introduce SAWSDL-compliant annotations for WF and use them with a semantic reasoner to guarantee semantic type correctness in scientific workflows. Examples from bioinformatics are presented.

    EcoGIS – GIS tools for ecosystem approaches to fisheries management

    Executive Summary: The EcoGIS project was launched in September 2004 to investigate how Geographic Information Systems (GIS), marine data, and custom analysis tools can better enable fisheries scientists and managers to adopt Ecosystem Approaches to Fisheries Management (EAFM). EcoGIS is a collaborative effort between NOAA’s National Ocean Service (NOS) and National Marine Fisheries Service (NMFS), and four regional Fishery Management Councils. The project has focused on four priority areas: Fishing Catch and Effort Analysis, Area Characterization, Bycatch Analysis, and Habitat Interactions. Of these four functional areas, the project team first focused on developing a working prototype for catch and effort analysis: the Fishery Mapper Tool. This ArcGIS extension creates time-and-area summarized maps of fishing catch and effort from logbook, observer, or fishery-independent survey data sets. Source data may come from Oracle, Microsoft Access, or other file formats. Feedback from beta-testers of the Fishery Mapper was used to debug the prototype, enhance performance, and add features. This report describes the four priority functional areas, the development of the Fishery Mapper tool, and several themes that emerged through the parallel evolution of the EcoGIS project, the concept and implementation of the broader field of Ecosystem Approaches to Management (EAM), data management practices, and other EAM toolsets. In addition, a set of six succinct recommendations is proposed on page 29. One major conclusion from this work is that there is no single “super-tool” to enable Ecosystem Approaches to Management; as such, tools should be developed for specific purposes, with attention given to interoperability and automation. Future work should be coordinated with other GIS development projects in order to provide “value added” and minimize duplication of effort.
    In addition to custom tools, the development of cross-cutting Regional Ecosystem Spatial Databases will enable access to quality data to support the analyses required by EAM. GIS tools will be useful in developing Integrated Ecosystem Assessments (IEAs) and providing pre- and post-processing capabilities for spatially explicit ecosystem models. Continued funding will enable the EcoGIS project to develop GIS tools that are immediately applicable to today’s needs. These tools will enable simplified and efficient data query, the ability to visualize data over time, and ways to synthesize multidimensional data from diverse sources. These capabilities will provide new information for analyzing issues from an ecosystem perspective, which will ultimately result in a better understanding of fisheries and better support for decision-making. (PDF file contains 45 pages.)

    NEXT LEVEL: A COURSE RECOMMENDER SYSTEM BASED ON CAREER INTERESTS

    Skills-based hiring is a talent management approach that empowers employers to align recruitment around business results, rather than around credentials and titles. It starts with employers identifying the particular skills required for a role, and then screening and evaluating candidates’ competencies against those requirements. With the recent rise in employers adopting skills-based hiring practices, it has become essential for students to take courses that improve their marketability and support their long-term career success. A 2017 survey of over 32,000 students at 43 randomly selected institutions found that only 34% of students believe they will graduate with the skills and knowledge required to be successful in the job market. Furthermore, the study found that while 96% of chief academic officers believe that their institutions are very or somewhat effective at preparing students for the workforce, only 11% of business leaders strongly agree [11]. An implication of this misalignment is that college graduates lack the skills that companies need and value. Fortunately, the rise of skills-based hiring provides an opportunity for universities and students to establish and follow clearer classroom-to-career pathways. To this end, this paper presents a course recommender system that aims to improve students’ career readiness by suggesting relevant skills and courses based on their unique career interests.

    Multi-level Policy-aware Privacy Analysis

    The NAPLES (Novel Tools for Analysing Privacy Leakages) project is a research collaboration between Cybernetica AS and the University of Tartu, funded by the Brandeis program of the Defense Advanced Research Projects Agency (DARPA). The project has produced the theory and a set of tools for analysing privacy-related concerns and determining potential data leakage from information systems. Specifically, PLEAK is a tool that takes as input business processes specified in the Business Process Model and Notation (BPMN), where model entities are annotated with computational details and privacy-enhancing technologies, enabling the analysis of privacy concerns at different levels of granularity. Over time, the NAPLES project has produced several analyzers. These analyzers target SQL collaborative workflows, that is, BPMN collaboration models whose tasks and data objects are specified by SQL queries and table schemas, respectively. The simple (binary) disclosure analysis performs a high-level data reachability analysis that reveals potential data leakages in the privacy-enhanced model of a business process: it tells whether a data object is visible to a given party. Other analyzers, such as Leaks-When and Guessing Advantage, provide finer-grained qualitative and quantitative measures of data leakage to stakeholders. My work was part of the NAPLES project, and my contributions are manifold. First, I added the concept of global and local privacy policies to SQL collaborative workflows; a policy endows a party of the business process with access rights to selected SQL entities under defined constraints. Second, I designed an integrated multi-level approach to disclosure analysis: from the high-level declarative disclosure (What data might leak?) to the conditional disclosure (When does data leak?) and the quantitative measure (How much data leaks?). This approach builds on the existing PLEAK analyzers, which I refined to accept a more unified set of inputs, and I integrated the privacy policies with the Leaks-When and Guessing Advantage analyzers. Finally, I developed a case study that showcases the integrated multi-level approach to disclosure analysis and serves as a proof of concept for the NAPLES tools.
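The highest level of the analysis described above asks which data might become visible to which party, and whether a privacy policy grants that access. The following is a rough sketch of such a reachability check. It is not PLEAK's algorithm: the task tuples, the `policy` mapping, and all function and table names are hypothetical, and the check ignores privacy-enhancing technologies entirely, so it over-approximates what leaks.

```python
def derived_from(tasks):
    """Map each data object to the source tables it depends on.

    `tasks` is a list of (party, inputs, outputs) tuples in execution order.
    Hypothetical simplification: every output depends on every input.
    """
    origin = {}
    for _party, inputs, outputs in tasks:
        sources = set()
        for obj in inputs:
            # An object that no earlier task produced is treated as a source table.
            sources |= origin.setdefault(obj, {obj})
        for obj in outputs:
            origin[obj] = set(sources)
    return origin


def disclosure(tasks):
    """Return {party: source tables reachable through objects the party handles}."""
    origin = derived_from(tasks)
    seen = {}
    for party, inputs, outputs in tasks:
        for obj in list(inputs) + list(outputs):
            seen.setdefault(party, set()).update(origin[obj])
    return seen


def violations(tasks, policy):
    """Return {party: source tables disclosed to it without a policy grant}."""
    result = {}
    for party, tables in disclosure(tasks).items():
        extra = tables - policy.get(party, set())
        if extra:
            result[party] = extra
    return result


# Assumed two-party workflow: the insurer consumes statistics derived from patient data.
tasks = [
    ("Hospital", ["patients"], ["age_stats"]),
    ("Insurer", ["age_stats", "claims"], ["risk_report"]),
]
policy = {"Hospital": {"patients"}, "Insurer": {"claims"}}

leaks = violations(tasks, policy)
```

Here the insurer is flagged because `age_stats` is derived from `patients`, which the assumed policy does not grant to it. A conditional or quantitative analysis would then refine when and how much of that table is actually revealed.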