6 research outputs found

    Workflow environments for advanced cyberinfrastructure platforms

    Progress in science is deeply bound to the effective use of high-performance computing infrastructures and to the efficient extraction of knowledge from vast amounts of data. Such data comes from different sources and follows a cycle composed of pre-processing steps for data curation and preparation for subsequent computing steps, followed by analysis and analytics steps applied to the results. However, scientific workflows are currently fragmented into multiple components, with different processes for computing and data management, and with gaps in the viewpoints of the user profiles involved. Our vision is that future workflow environments and tools for the development of scientific workflows should follow a holistic approach, where both data and computing are integrated in a single flow built on simple, high-level interfaces. The research topics that we propose involve novel ways to express workflows that integrate the different data and compute processes, and dynamic runtimes that support the execution of these workflows on complex and heterogeneous computing infrastructures efficiently, in terms of both performance and energy. These infrastructures include highly distributed resources, from sensors, instruments and devices at the edge to High-Performance Computing and Cloud computing resources. This paper presents our vision for developing these workflow environments and the steps we are currently following to achieve it.
    This work has been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2014-SGR-1051). Javier Conejero's postdoctoral contract is co-financed by the Ministry of Economy and Competitiveness under Juan de la Cierva Formación postdoctoral fellowship number FJCI-2015-24651. This work is supported by the H2020 mF2C project (730929) and the CLASS project (780622). The participation of Rosa M Badia in the BDEC2 meetings is supported by the EXDCI project (800957). The dislib library developments are partially funded under the project agreement between BSC and FUJITSU.
    Peer Reviewed. Postprint (author's final draft)

    COMP Superscalar, an interoperable programming framework

    COMPSs is a programming framework that aims to facilitate the parallelization of existing applications written in Java, C/C++ and Python. For that purpose, it offers a simple programming model based on sequential development, in which the user is mainly responsible for identifying the functions to be executed as asynchronous parallel tasks and marking them with annotations or standard Python decorators. A runtime system is in charge of exploiting the inherent concurrency of the code, automatically detecting and enforcing the data dependencies between tasks and spawning these tasks to the available resources, which can be nodes in a cluster, clouds or grids. In cloud environments, COMPSs provides scalability and elasticity features that allow the dynamic provision of resources.
    This work has been supported by the following institutions: the Spanish Government with grant SEV-2011-00067 of the Severo Ochoa Program and contract Computación de Altas Prestaciones VI (TIN2012-34557); by the SGR programme (2014-SGR-1051) of the Catalan Government; by the project The Human Brain Project, funded by the European Commission under contract 604102; by the ASCETiC project funded by the European Commission under contract 610874; by the EUBrazilCloudConnect project funded by the European Commission under contract 614048; and by the Intel-BSC Exascale Lab collaboration.
    Peer Reviewed. Postprint (published version)
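
The programming model described in this abstract (sequential code, functions marked as asynchronous tasks, a runtime that enforces data dependencies) can be illustrated with a minimal sketch. It uses only the Python standard library and does not use the COMPSs API: the `task` decorator, the thread pool and the example functions `square`, `add` and `sum_of_squares` are all hypothetical stand-ins chosen for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# Hypothetical stand-in for a task runtime: a small local thread pool.
_pool = ThreadPoolExecutor(max_workers=4)


def task(func):
    """Hypothetical decorator: calling the function submits it as an async task."""
    @wraps(func)
    def submit(*args):
        def run():
            # Enforce data dependencies: wait for arguments that are
            # results of earlier tasks before running this one.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)
        # Tasks are submitted in program order, so every dependency is
        # dequeued before the task that waits on it.
        return _pool.submit(run)
    return submit


@task
def square(x):
    return x * x


@task
def add(a, b):
    return a + b


def sum_of_squares(values):
    """Sequential-looking code; expects a non-empty list of numbers."""
    partial = [square(v) for v in values]  # independent tasks, may overlap
    total = partial[0]
    for p in partial[1:]:
        total = add(total, p)  # each add depends on the previous result
    return total.result()  # synchronize only when the value is needed


if __name__ == "__main__":
    print(sum_of_squares([1, 2, 3, 4]))
```

The main program never mentions threads or ordering; the dependency chain is recovered from the futures passed between calls, which is the general idea the abstract attributes to the runtime.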

    OMWS: A Web Service Interface for Ecological Niche Modelling

    Ecological niche modelling (ENM) experiments often involve a high number of tasks to be performed. Such tasks may consume a significant amount of computing resources and take a long time to complete, especially when using personal computers. OMWS is a Web service interface that allows more powerful computing back-ends to be remotely exploited by other applications to carry out ENM tasks. Its latest version includes a new operation that can be used to specify complex workflows in a single request, adding the possibility of using workflow management systems on a parallel computing back-end. In this paper we describe the OMWS protocol and compare its most recent version with the previous one by running the same ENM experiment using two functionally equivalent clients, each designed for one of the OMWS interface versions. Different back-end configurations were used to investigate how the performance scales for each protocol version when more processing power is made available. Results show that the new version outperforms the previous one by a factor of 2 when more computing resources are used.
    The latest version of OMWS contains improvements arising from different sets of requirements originating from two projects that funded their corresponding implementation: EUBrazilOpenBio, with grants from the European Commission and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science and Technology (MCT), and BioVeL, with grants from the European Commission. Server infrastructure was operated through a provisioning system developed within the Spanish project CLUVIEM (TIN2013-44390-R) funded by the "Ministerio de Economía y Competitividad".
    Giovanni, RD.; Torres Serrano, E.; Amaral, RB.; Blanquer Espert, I.; Rebello, V.; Canhos, VP. (2015). OMWS: A Web Service Interface for Ecological Niche Modelling. Biodiversity Informatics. 10:35-44. https://doi.org/10.17161/bi.v10i0.4853
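
To make concrete why describing a whole workflow in one request can help a parallel back-end, the sketch below groups the jobs of a small workflow into stages whose members do not depend on each other. It is an illustration only and does not reproduce the OMWS protocol or its message format: the job names, the `workflow` dictionary and the `parallel_stages` helper are assumptions introduced here, built on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow description: job name -> set of jobs it depends on.
# The names are illustrative ENM steps, not OMWS operation names.
workflow = {
    "sample_points": set(),
    "create_model": {"sample_points"},
    "test_model": {"create_model"},
    "project_model": {"create_model"},
}


def parallel_stages(jobs):
    """Group jobs into stages; jobs inside one stage can run concurrently."""
    sorter = TopologicalSorter(jobs)
    sorter.prepare()  # raises graphlib.CycleError if the workflow has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are done
        stages.append(ready)
        sorter.done(*ready)  # mark the whole stage as finished
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(parallel_stages(workflow), start=1):
        print(number, stage)
```

With the full dependency graph available up front, a back-end can schedule `test_model` and `project_model` together, whereas a client issuing one request per task has to drive that ordering itself.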

    Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure

    EUBrazilOpenBio is a collaborative initiative addressing strategic barriers in biodiversity research by integrating open access data and user-friendly tools widely available in Brazil and Europe. The project deploys the EU-Brazil Hybrid Data Infrastructure that allows the sharing of hardware, software and data on-demand. This infrastructure provides access to several integrated services and resources to seamlessly aggregate taxonomic, biodiversity and climate data, used by processing services implementing checklist cross-mapping and ecological niche modelling. A Virtual Research Environment was created to provide users with a single entry point to processing and data resources. This article describes the architecture, demonstration use cases and some experimental results and validation.
    EUBrazilOpenBio - Open Data and Cloud Computing e-Infrastructure for Biodiversity (2011-2013) is a Small or medium-scale focused research project (STREP) funded by the European Commission under the Cooperation Programme, Framework Programme Seven (FP7) Objective FP7-ICT-2011-EU-Brazil Research and Development cooperation, and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science, Technology and Innovation (MCTI) under the corresponding matching Brazilian Call for proposals MCT/CNPq 066/2010. BSC authors also acknowledge the support of the grant SEV-2011-00067 of Severo Ochoa Program, awarded by the Spanish Government and the Spanish Ministry of Science and Innovation under contract TIN2012-34557 and the Generalitat de Catalunya (contract 2009-SGR-980).
    Amaral, R.; Badia, RM.; Blanquer Espert, I.; Braga-Neto, R.; Candela, L.; Castelli, D.; Flann, C.... (2015). Supporting biodiversity studies with the EUBrazilOpenBio Hybrid Data Infrastructure. Concurrency and Computation: Practice and Experience. 27(2):376-394.
    https://doi.org/10.1002/cpe.3238

    Supporting biodiversity studies with the EUBrazilOpenBio hybrid data infrastructure

    No full text
    EUBrazilOpenBio is a collaborative initiative addressing strategic barriers in biodiversity research by integrating open access data and user-friendly tools widely available in Brazil and Europe. The project deploys the EU-Brazil Hybrid Data Infrastructure that allows the sharing of hardware, software and data on-demand. This infrastructure provides access to several integrated services and resources to seamlessly aggregate taxonomic, biodiversity and climate data, used by processing services implementing checklist cross-mapping and ecological niche modelling. A Virtual Research Environment was created to provide users with a single entry point to processing and data resources. This article describes the architecture, demonstration use cases and experimental results.
    EUBrazilOpenBio - Open Data and Cloud Computing e-Infrastructure for Biodiversity (2011-2013) is a Small or medium-scale focused research project (STREP) funded by the European Commission under the Cooperation Programme, Framework Programme Seven (FP7) Objective FP7-ICT-2011-EU-Brazil Research and Development cooperation, and the National Council for Scientific and Technological Development of Brazil (CNPq) of the Brazilian Ministry of Science, Technology and Innovation (MCTI) under the corresponding matching Brazilian Call for proposals MCT/CNPq 066/2010.
    Peer Reviewed