12 research outputs found
Fast ocean data assimilation and forecasting using a neural-network reduced-space regional ocean model of the north Brazil current
Data assimilation is computationally demanding, typically many times slower than model forecasts. Fast and reliable ocean assimilation methods are attractive for multiple applications such as emergency situations, search and rescue, and oil spills. A novel framework which performs fast data assimilation with sufficient accuracy is proposed for the first time for the open ocean. Speed improvement is achieved by performing the data assimilation on a reduced-space rather than on a full-space. A surface 10km resolution hindcast of the North Brazil current from the Regional Ocean Modelling System (ROMS) serves as the full-space state. The target variables are sea surface height, sea surface temperature, and surface currents. A dimension reduction of the full-state is made by an Empirical Orthogonal Function analysis while retaining most of the explained variance. The dynamics are replicated by a state-of-the-art neural network trained on the truncated principal components of the full-state. An Ensemble Kalman filter assimilates the data in the reduced-space, where the trained neural network produces short-range forecasts from perturbed ensembles. The Ensemble Kalman filter of the reduced-space is successful in reducing the root mean squared error by ∼ 45% and increases the correlations between state variables and data. The performance is similar to other full-space data assimilation studies. However, the computations are three to four orders of magnitude faster than for other full-space data assimilation schemes. The forecast of ocean variables is a computationally demanding task in terms of speed and accuracy. This framework manages to create fast forecasts in ∼ 30 seconds, once data have been assimilated. The forecasts are obtained using the trained neural network. We performed additional experiments using data and forecasts from July 2015 and January 2016. 
The analysis and forecasts in our framework yield a higher skill score and high spatial correlation when compared to the operational dataset Global Ocean Physics Analysis and Forecast by the UK Met Office. Forcing the neural network with 10 m surface winds to improve the total surface currents forecast was also considered. Wind forcing adds no skill to the forecasts because the Ekman component is small compared to the dominant geostrophic currents. The reduced model approach could be a useful tool when full physics regional models are not available to make a forecast.
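The reduced-space workflow described in this abstract (EOF truncation followed by an Ensemble Kalman filter update on the principal components) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `eof_reduce` and `enkf_analysis`, the stochastic (perturbed-observation) EnKF variant, the linear observation operator, and the diagonal observation error are all assumptions made for the example. The neural-network forecast step is omitted.

```python
import numpy as np

def eof_reduce(snapshots, n_modes):
    """Project snapshots of shape (n_times, n_grid) onto the leading EOFs.

    Hypothetical helper: EOFs are computed by an SVD of the anomalies.
    """
    mean = snapshots.mean(axis=0)
    anomalies = snapshots - mean
    # Rows of vt are the EOFs (spatial patterns); s holds the singular values.
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                    # (n_modes, n_grid)
    pcs = anomalies @ eofs.T               # (n_times, n_modes)
    explained = (s[:n_modes] ** 2).sum() / (s ** 2).sum()
    return mean, eofs, pcs, explained

def enkf_analysis(ensemble, obs, obs_operator, obs_var, rng):
    """Stochastic EnKF update for an ensemble of shape (n_members, n_state).

    Assumes a linear observation operator (n_obs, n_state) and uncorrelated
    observation errors with a common variance obs_var.
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    predicted = ensemble @ obs_operator.T              # (n_members, n_obs)
    pred_anom = predicted - predicted.mean(axis=0)
    cov_xy = anomalies.T @ pred_anom / (n_members - 1)
    cov_yy = pred_anom.T @ pred_anom / (n_members - 1)
    cov_yy = cov_yy + obs_var * np.eye(obs.size)
    # Kalman gain K = cov_xy @ inv(cov_yy), computed with a linear solve.
    gain = np.linalg.solve(cov_yy, cov_xy.T).T         # (n_state, n_obs)
    # Each member assimilates its own perturbed copy of the observations.
    noise = rng.normal(0.0, np.sqrt(obs_var), size=predicted.shape)
    return ensemble + (obs + noise - predicted) @ gain.T
```

In a reduced-space setting, `ensemble` would hold perturbed principal components, and `obs_operator` would combine the retained EOFs with the mapping from grid points to observation locations, so the matrices involved have only `n_modes` columns rather than one per grid point.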
Forecasting Tropical Cyclones with Cascaded Diffusion Models
As cyclones become more intense due to climate change, the rise of AI-based
modelling provides a more affordable and accessible approach compared to
traditional methods based on mathematical models. This work leverages diffusion
models to forecast cyclone trajectories and precipitation patterns by
integrating satellite imaging, remote sensing, and atmospheric data, employing
a cascaded approach that incorporates forecasting, super-resolution, and
precipitation modelling, with training on a dataset of 51 cyclones from six
major basins. Experiments demonstrate that the final forecasts from the
cascaded models show accurate predictions up to a 36-hour rollout, with SSIM
and PSNR values exceeding 0.5 and 20 dB, respectively, for all three tasks.
This work also highlights the promising efficiency of AI methods such as
diffusion models for high-performance needs, such as cyclone forecasting, while
remaining computationally affordable, making them ideal for highly vulnerable
regions with critical forecasting needs and financial limitations. Code
accessible at https://github.com/nathzi1505/forecast-diffmodels.
Comment: 6 pages, 3 figures
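For readers unfamiliar with the PSNR figure quoted above, the standard definition can be sketched as below. The function name `psnr` and the default `data_range` of 1.0 are assumptions for illustration; the abstract does not say how the images were scaled or which implementation was used.

```python
import numpy as np

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape.

    data_range is the span of valid pixel values (assumed 1.0 here).
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Under this definition, a uniform error of 0.1 on images scaled to [0, 1] corresponds to 20 dB, the threshold mentioned in the abstract.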
Enhancing Microdroplet Image Analysis with Deep Learning
Microfluidics is a highly interdisciplinary field where the integration of deep-learning models has the potential to streamline processes and increase precision and reliability. This study investigates the use of deep-learning methods for the accurate detection and measurement of droplet diameters and the image restoration of low-resolution images. This study demonstrates that the Segment Anything Model (SAM) provides superior detection and lower droplet diameter measurement error compared to the Circular Hough Transform, which is widely implemented and used in microfluidic imaging. SAM droplet detections prove to be more robust to image quality and to microfluidic images with low contrast between the fluid phases. In addition, this work shows that a deep-learning super-resolution network, MSRN-BAM, can be trained on a dataset comprising droplets in a flow-focusing microchannel to super-resolve images for scales ×2, ×4, ×6, ×8. Super-resolved images obtain comparable detection and segmentation results to those obtained using high-resolution images. Finally, the potential of deep learning in other computer vision tasks, such as denoising for microfluidic imaging, is shown. The results show that a DnCNN model can effectively denoise microfluidic images with additive Gaussian noise up to σ = 4. This study highlights the potential of employing deep-learning methods for the analysis of microfluidic images.
Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
The scarcity of data presents a critical obstacle to the efficacy of medical
vision-language pre-training (VLP). A potential solution lies in the combination
of datasets from various language communities. Nevertheless, the main challenge
stems from the complexity of integrating diverse syntax and semantics,
language-specific medical terminology, and culture-specific implicit knowledge.
Therefore, one crucial aspect to consider is the presence of community bias
caused by different languages. This paper presents a novel framework named
Unifying Cross-Lingual Medical Vision-Language Pre-Training (Med-UniC),
designed to integrate multimodal medical data from the two most prevalent
languages, English and Spanish. Specifically, we propose Cross-lingual Text
Alignment Regularization (CTR) to explicitly unify cross-lingual semantic
representations of medical reports originating from diverse language
communities. CTR is optimized through latent language disentanglement,
so that our optimization objective does not depend on negative samples, thereby
significantly mitigating the bias from determining positive-negative sample
pairs within analogous medical reports. Furthermore, it ensures that the
cross-lingual representation is not biased toward any specific language
community. Med-UniC reaches superior performance across 5 medical image tasks
and 10 datasets encompassing over 30 diseases, offering a versatile framework
for unifying multi-modal medical data within diverse linguistic communities.
The experimental outcomes highlight the presence of community bias in
cross-lingual VLP. Reducing this bias enhances the performance not only in
vision-language tasks but also in uni-modal visual tasks.
Comment: NeurIPS 2023 Main track
An efficient digital twin based on machine learning SVD autoencoder and generalised latent assimilation for nuclear reactor physics
• A real-time operational digital twin is proposed for the prediction of power field in the core.
• A non-intrusive forward model is built based on SVD-autoencoder and machine learning prediction methods.
• An inverse model is realised based on a generalised latent assimilation method to overcome the bottleneck of efficient parameter identification.
Parameter Flexible Wildfire Prediction Using Machine Learning Techniques: Forward and Inverse Modelling
Parameter identification for wildfire forecasting models often relies on case-by-case tuning or posterior diagnosis/analysis, which can be computationally expensive due to the complexity of the forward prediction model. In this paper, we introduce an efficient parameter flexible fire prediction algorithm based on machine learning and reduced order modelling techniques. Using a training dataset generated by physics-based fire simulations, the method forecasts burned area at different time steps with a low computational cost. We then address the bottleneck of efficient parameter estimation by developing a novel inverse approach relying on data assimilation techniques (latent assimilation) in the reduced order space. The forward and inverse models are tested on two recent large wildfire events in California. Satellite observations are used to validate the forward prediction approach and identify the model parameters. By combining these forward and inverse approaches, the system manages to integrate real-time observations for parameter adjustment, leading to more accurate future predictions.
Surfactant-laden droplet size prediction in a flow-focusing microchannel: a data-driven approach
The control of droplet formation and size using microfluidic devices is a critical operation for both laboratory and industrial applications, e.g. in micro-dosage. Surfactants can be added to improve the stability and control the size of the droplets by modifying their interfacial properties. In this study, a large-scale data set of droplet size was obtained from high-speed imaging experiments conducted on a flow-focusing microchannel where aqueous surfactant-laden droplets were generated in silicone oil. Three types of surfactants were used (anionic, cationic and non-ionic) at concentrations below and above the critical micelle concentration (CMC). To predict the final droplet size as a function of flow rates, surfactant type and concentration of surfactant, two data-driven models were built. Using a Bayesian regularised artificial neural network and XGBoost, these models were initially based on four inputs (flow rates of the two phases, interfacial tension at equilibrium and the normalised surfactant concentration). The mean absolute percentage errors (MAPE) show that data-driven models are more accurate (MAPE = 3.9%) compared to semi-empirical models (MAPE = 11.4%). To overcome experimental difficulties in acquiring accurate interfacial tension values under some conditions, both models were also trained with reduced inputs by removing the interfacial tension. The results again show a very good prediction of the droplet diameter. Finally, over 10 000 synthetic data points were generated, based on the initial data set, with a Variational Autoencoder (VAE). The high fidelity of the extended synthetic data set highlights that this method can be a quick and low-cost alternative to study microdroplet formation in future lab-on-a-chip applications, where experimental data may not be readily available.
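The MAPE metric used to compare the models above follows the usual definition, sketched here for reference. The function name `mape` and the example values are hypothetical; the abstract does not describe the authors' evaluation code.

```python
import numpy as np

def mape(measured, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every measured value is non-zero (true for droplet diameters).
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((measured - predicted) / measured))

# Hypothetical droplet diameters in micrometres, for illustration only.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is normalised by the measured value, MAPE weights a 10 µm miss on a small droplet more heavily than the same miss on a large one.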
Data learning: integrating data assimilation and machine learning
Data Assimilation (DA) is the approximation of the true state of some physical system by combining observations with a dynamic model. DA incorporates observational data into a prediction model to improve forecasted results. These models have increased in sophistication to better fit application requirements and circumvent implementation issues. Nevertheless, these approaches are incapable of fully overcoming their unrealistic assumptions. Machine Learning (ML) shows great capability in approximating nonlinear systems and extracting meaningful features from high-dimensional data. ML algorithms are capable of assisting or replacing traditional forecasting methods. However, the data used during training in any ML algorithm include numerical, approximation and round-off errors, which are trained into the forecasting model. Integration of ML with DA increases the reliability of prediction by including information with a physical meaning. This work provides an introduction to Data Learning, a field that integrates Data Assimilation and Machine Learning to overcome limitations in applying these fields to real-world data. The fundamental equations of DA and ML are presented and developed to show how they can be combined into Data Learning. We present a number of Data Learning methods and results for some test cases, though the equations are general and can easily be applied elsewhere.
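The abstract does not reproduce the equations it refers to. As a point of orientation only, the textbook variational form of the DA problem and its solution for a linear observation operator are given below. The notation is an assumption and may differ from the paper's: $x_b$ is the background state, $y$ the observations, $H$ the observation operator, and $B$ and $R$ the background and observation error covariances.

```latex
% Variational (3D-Var) cost function
J(x) = \tfrac{1}{2}\,(x - x_b)^{\top} B^{-1} (x - x_b)
     + \tfrac{1}{2}\,(y - Hx)^{\top} R^{-1} (y - Hx)

% Minimiser for linear H (analysis state and gain)
x_a = x_b + K\,(y - H x_b), \qquad
K = B H^{\top} \left( H B H^{\top} + R \right)^{-1}
```

The two assumptions most often called unrealistic in this setting are that the errors are Gaussian and that $B$ and $R$ are known.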