1,225 research outputs found
Boosting analyses in the life sciences via clusters, grids and clouds
In the last 20 years, computational methods have become an important part of developing emerging technologies for the fields of bioinformatics and biomedicine. Those methods rely heavily on large-scale computational resources, as they need to manage Tbytes or Pbytes of data with large-scale structural and functional relationships, TFlops or PFlops of computing power for simulating highly complex models, or many-task processes and workflows for processing and analyzing data. This special issue contains papers showing existing solutions and latest developments in Life Sciences and Computing Sciences to collaboratively explore new ideas and approaches to successfully apply distributed IT systems in translational research, clinical intervention, and decision-making. (C) 2016 Published by Elsevier B.V.
A new generation of Parsec-Colibri stellar isochrones including the TP-AGB phase
We introduce a new generation of PARSEC-COLIBRI stellar isochrones that includes a detailed treatment of the thermally pulsing asymptotic giant branch (TP-AGB) phase, covering a wide range of initial metallicities (0.0001 < Z_i < 0.06). Compared to previous releases, the main novelties and improvements are: use of new TP-AGB tracks and related atmosphere models and spectra for M and C-type stars; inclusion of the surface H+He+CNO abundances in the isochrone tables, accounting for the effects of diffusion, dredge-up episodes and hot-bottom burning; inclusion of complete thermal pulse cycles, with a complete description of the in-cycle changes in the stellar parameters; new pulsation models to describe the long-period variability in the fundamental and first-overtone modes; and new dust models that follow the growth of the grains during the AGB evolution, in combination with radiative transfer calculations for the reprocessing of the photospheric emission. Overall, these improvements are expected to lead to a more consistent and detailed description of properties of TP-AGB stars expected in resolved stellar populations, especially in regard to their mean photometric properties from optical to mid-infrared wavelengths. We illustrate the expected numbers of TP-AGB stars of different types in stellar populations covering a wide range of ages and initial metallicities, providing further details on the "C-star island" that appears at intermediate values of age and metallicity, and about the AGB-boosting effect that occurs at ages close to 1.6 Gyr for populations of all metallicities. The isochrones are available through a new dedicated web server.
Georeferenced analysis of urban nightlife and noise based on mobile phone data
Urban environments are characterized by a complex soundscape that varies across different periods and geographical zones. This paper presents a novel approach for analyzing nocturnal urban noise patterns and identifying distinct zones using mobile phone data. Traditional noise-monitoring methods often require specialized equipment and are limited in scope. Our methodology involves gathering audio recordings from city sensors and localization data from mobile phones placed in urban areas over extended periods, with a focus on nighttime, when noise profiles shift significantly. By leveraging machine learning techniques, the developed system processes the audio data to extract noise features indicative of different sound sources and intensities. These features are correlated with geographic location data to create comprehensive city noise maps during nighttime hours. Furthermore, this work employs clustering algorithms to identify distinct noise zones within the urban landscape, characterized by their unique noise signatures, reflecting the mix of anthropogenic and environmental noise sources. Our results demonstrate the effectiveness of using mobile phone data for nocturnal noise analysis and zone identification. The derived noise maps and zone identification provide insights into noise pollution patterns and offer valuable information for policymakers, urban planners, and public health officials to make informed decisions about noise mitigation efforts and urban development. This work was supported by the Fundação para a Ciência e Tecnologia under Grant [UIDB/00315/2020]; and by the project “BLOCKCHAIN.PT (RE-C05-i01.01 - Agendas/Alianças Mobilizadoras para a Reindustrialização, Plano de Recuperação e Resiliência de Portugal” in its component 5 - Capitalization and Business Innovation and with the Regulation of the Incentive System “Agendas for Business Innovation”, approved by Ordinance No. 43-A/2022 of 19 January 2022).
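The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the clustering algorithm or the extracted features, so plain k-means, the two features (mean level in dB and low-frequency energy share), and all sample values and coordinates below are assumptions chosen only for illustration.

```python
import math

def kmeans(points, k, iters=20):
    """Plain k-means. The first k points seed the centroids, so the
    result is deterministic (an illustrative choice, not a tuned one)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical night-time samples: (mean level in dB, low-frequency energy share).
samples = [(72.0, 0.80), (35.0, 0.20), (70.0, 0.75),
           (38.0, 0.25), (74.0, 0.85), (36.0, 0.22)]
# Hypothetical (latitude, longitude) of the phone that recorded each sample.
locations = [(38.710, -9.140), (38.750, -9.200), (38.711, -9.141),
             (38.751, -9.201), (38.712, -9.142), (38.752, -9.202)]

labels, centroids = kmeans(samples, k=2)

# Group locations by cluster label to obtain candidate noise zones.
zones = {}
for loc, lab in zip(locations, labels):
    zones.setdefault(lab, []).append(loc)
```

The features are left unscaled here, so the dB level dominates the distance; a real analysis would normally standardize features before clustering.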
A study on earth science data generation through numerical modeling and machine learning in a cloud computing environment
Doctoral dissertation -- Seoul National University Graduate School: College of Natural Sciences, School of Earth and Environmental Sciences, August 2022. 조양기. To investigate changes and phenomena on Earth, many scientists use high-resolution model results based on numerical models or develop and utilize machine learning-based prediction models with observed data. As information technology advances, there is a need for a practical methodology for generating local and global high-resolution numerical modeling and machine learning-based earth science data.
This study proposes that data generation and processing based on high-resolution numerical models of earth science and machine learning-based prediction models can be implemented effectively in a cloud environment.
To verify the reproducibility and portability of high-resolution numerical ocean model implementation on cloud computing, I simulated and analyzed the performance of a numerical ocean model at various resolutions in the model domain, including the Northwest Pacific Ocean, the East Sea, and the Yellow Sea. With the containerization method, it was possible to respond to changes in various infrastructure environments and achieve computational reproducibility effectively.
To verify the application of machine learning-based data generation, subsurface temperature data were augmented using generative models to prepare large datasets for training a model that predicts the vertical temperature distribution in the ocean. The augmentation targeted observed data, which are relatively scarce compared to satellite datasets.
In addition to observation data, HYCOM datasets were used for performance comparison, and the data distribution of augmented data was similar to the input data distribution. The ensemble method, which combines stand-alone predictive models, improved the performance of the predictive model compared to that of the model based on the existing observed data. Large amounts of computational resources were required for data synthesis, and the synthesis was performed in a cloud-based graphics processing unit environment.
High-resolution numerical ocean model simulation, predictive model development, and the data generation method can improve predictive capabilities in the field of ocean science. The numerical modeling and generative models based on cloud computing used in this study can be broadly applied to various fields of earth science.
1. General Introduction
2. Performance of numerical ocean modeling on cloud computing
2.1. Introduction
2.2. Cloud Computing
2.2.1. Cloud computing overview
2.2.2. Commercial cloud computing services
2.3. Numerical model for performance analysis of commercial clouds
2.3.1. High Performance Linpack Benchmark
2.3.2. Benchmark Sustainable Memory Bandwidth and Memory Latency
2.3.3. Numerical Ocean Model
2.3.4. Deployment of Numerical Ocean Model and Benchmark Packages on Cloud Clusters
2.4. Simulation results
2.4.1. Benchmark simulation
2.4.2. Ocean model simulation
2.5. Analysis of ROMS performance on commercial clouds
2.5.1. Performance of ROMS according to H/W resources
2.5.2. Performance of ROMS according to grid size
2.6. Summary
3. Reproducibility of numerical ocean model on the cloud computing
3.1. Introduction
3.2. Containerization of numerical ocean model
3.2.1. Container virtualization
3.2.2. Container-based architecture for HPC
3.2.3. Container-based architecture for hybrid cloud
3.3. Materials and Methods
3.3.1. Comparison of traditional and container based HPC cluster workflows
3.3.2. Model domain and datasets for numerical simulation
3.3.3. Building the container image and registration in the repository
3.3.4. Configuring a numeric model execution cluster
3.4. Results and Discussion
3.4.1. Reproducibility
3.4.2. Portability and Performance
3.5. Conclusions
4. Generative models for the prediction of ocean temperature profile
4.1. Introduction
4.2. Materials and Methods
4.2.1. Model domain and datasets for predicting the subsurface temperature
4.2.2. Model architecture for predicting the subsurface temperature
4.2.3. Neural network generative models
4.2.4. Prediction Models
4.2.5. Accuracy
4.3. Results and Discussion
4.3.1. Data Generation
4.3.2. Ensemble Prediction
4.3.3. Limitations of this study and future works
4.4. Conclusion
5. Summary and conclusion
6. References
7. Abstract (in Korean)
Environment determinants in business adoption of Cloud Computing
Purpose – The purpose of this paper is to analyze the influence of Technology Providers, Public Administrations and R&D Institutions on Cloud Computing adoption. This research also considers Killer Applications and Success Cases as other environmental factors.
Design/methodology/approach – Factorial analyses and structural equation models were used on a sample of high-technology firms located in technological parks in Southern Europe, with more than ten employees and sustained investments in R&D.
Findings – Results show that Technology Providers and Success Cases are determinant in Cloud Computing adoption. Moreover, Killer Applications are a forerunner for Success Cases.
Practical implications – An appropriate fit between the tools and resources provided by suppliers and the internal resources of the company is needed to create competitive advantages. Firms should evaluate Technology Providers, identify Success Cases of Cloud Computing adoption and implement technological benchmarking.
Originality/value – This study contributes to the Cloud Computing adoption literature because it includes Technology Providers, Public Administrations and R&D Institutions simultaneously, as well as other variables such as Killer Applications and Success Cases. The importance of external agents in information technology (IT) adoption, especially when the technologies to be adopted are new and at an emergent stage, together with the lack of prior investigations focusing on specific environmental factors affecting the adoption of these new, emerging IT, justifies the value of this research.
Scheduling, Characterization and Prediction of HPC Workloads for Distributed Computing Environments
As High Performance Computing (HPC) has grown considerably and is expected to grow even more, effective resource management for distributed computing systems is needed more than ever. As computational workloads grow in quantity, it is becoming more crucial to apply efficient resource management and workload scheduling to use resources efficiently while keeping computational performance reasonably good. The problem of efficiently scheduling workloads on resources while meeting performance standards is hard. Additionally, non-clairvoyance of job dimensions makes resource management even harder in real-world scenarios. Our research investigates the scheduling problem for HPC and the challenges of deploying scheduling in real-world scenarios using state-of-the-art machine learning and data science techniques. To this end, this Ph.D. dissertation makes the following core contributions: a) We perform a theoretical analysis of space-sharing, non-preemptive scheduling: we studied this scheduling problem and proposed scheduling algorithms with polynomial computation time. We also proved constant upper bounds for the performance of these algorithms. b) We studied the sensitivity of scheduling algorithms to the accuracy of runtime predictions and devised a meta-learning approach to estimate prediction accuracy for jobs newly submitted to the HPC system. c) We studied the runtime prediction problem for HPC applications. For this purpose, we studied the distribution of available public workloads and proposed two different solutions that can predict multi-modal distributions: switching state-space models and Mixture Density Networks. d) We studied the effectiveness of recent recurrent neural network models for CPU usage trace prediction for individual VM traces as well as aggregate CPU usage traces.
In this dissertation, we explore solutions to improve the performance of scheduling workloads on distributed systems. We begin by looking at the problem from the theoretical perspective. Modeling the problem mathematically, we first propose a scheduling algorithm that finds a constant approximation of the optimal solution in polynomial time. We prove that the performance of the algorithm (average completion time) is a constant approximation of the performance of the optimal schedule. We next look at the problem in real-world scenarios. Considering High-Performance Computing (HPC) environments as the closest real-world equivalent of our mathematical model, we explore the problem of predicting application runtime. We propose an algorithm to handle the uncertainties that exist in the real world and showcase its effectiveness in terms of response time and resource utilization. After looking at the uncertainty problem, we focus on improving the accuracy of existing prediction approaches for HPC application runtime. We propose two solutions, one based on Kalman filters and one based on deep mixture density networks. We showcase the effectiveness of our prediction approaches by comparing them with previous approaches in terms of prediction accuracy and impact on scheduling performance. In the end, we focus on predicting resource usage for individual applications during their execution. We explore the application of recurrent neural networks for predicting the resource usage of applications deployed on individual virtual machines. To validate our proposed models and solutions, we performed extensive trace-driven simulations and measured the effectiveness of our approaches.
Calibrating the TP-AGB Phase through Resolved Stellar Populations in the Small Magellanic Cloud
This thesis is a starting point for the TP-AGB phase calibration based on observations
of resolved stellar populations in the Small Magellanic Cloud. Using a spatially-resolved SFH,
infrared color-magnitude diagrams have been simulated with the population synthesis code
TRILEGAL in order to reproduce observed star counts, luminosity functions, and color distributions, and
thereby constrain TP-AGB lifetimes, third dredge-up (3rd DU) and mass-loss efficiencies, dust production
- …