Probabilistic Page Replacement Policy in Buffer Cache Management for Flash-Based Cloud Databases
In the fast evolution of storage systems, the newly emerged flash-memory-based Solid State Drives (SSDs) are becoming an important part of the computer storage hierarchy. Among the several advantages of flash-based SSDs, high read performance and low power consumption are of primary importance. Among their few disadvantages, the asymmetric I/O latencies for read, write, and erase operations are the most crucial for overall performance. In this paper, we propose two novel probabilistic adaptive algorithms that compute the future probability of reference based on the recency, frequency, and periodicity of past page references. Page replacement is performed by considering both the probability of reference of cached pages and the asymmetric read-write-erase properties of flash devices. The experimental results show that our proposed method succeeds in minimizing the performance overheads of flash-based systems while maintaining a good hit ratio. The results also justify the utility of a genetic algorithm in maximizing the overall performance gains.
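A probability-weighted, flash-aware replacement policy of this general shape can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the class name, the exponential recency/frequency score, and the `write_penalty` parameter are all assumptions.

```python
class ProbabilisticCache:
    """Toy sketch of probability-based, flash-aware page replacement:
    each cached page carries a reference-probability score built from
    recency and frequency, and eviction prefers the page that is both
    unlikely to be referenced again and cheap to evict (clean pages
    avoid a costly flash write-back)."""

    def __init__(self, capacity, write_penalty=2.0, decay=0.9):
        self.capacity = capacity
        self.write_penalty = write_penalty  # assumed relative cost of a flash write-back
        self.decay = decay                  # exponential forgetting of old references
        self.pages = {}                     # page_id -> {"score": float, "dirty": bool}

    def access(self, page_id, is_write=False):
        if page_id not in self.pages and len(self.pages) >= self.capacity:
            self.evict()
        entry = self.pages.setdefault(page_id, {"score": 0.0, "dirty": False})
        # recency/frequency update: decay every score, then boost the touched page
        for e in self.pages.values():
            e["score"] *= self.decay
        entry["score"] += 1.0
        entry["dirty"] = entry["dirty"] or is_write

    def evict(self):
        # retention value = reference probability proxy + write-back penalty if dirty;
        # the victim is the page cheapest to lose
        def retention(item):
            _, e = item
            return e["score"] + (self.write_penalty if e["dirty"] else 0.0)
        victim, _ = min(self.pages.items(), key=retention)
        del self.pages[victim]
```

Note how a dirty page survives eviction even with a lower reference score, which is the read/write asymmetry the abstract refers to.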
Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality
Knowledge discovery on biomedical data can be based on on-line data-stream analyses or on retrospective, timestamped, off-line datasets. In both cases, changes over time in the processes that generate the data, or in their quality features, may hinder either the knowledge discovery process or the generalization of past knowledge. These problems can be seen as a lack of data temporal stability. This work establishes temporal stability as a data quality dimension and proposes new methods for its assessment based on a probabilistic framework. Concretely, methods are proposed for (1) monitoring changes, and (2) characterizing changes and trends and detecting temporal subgroups. First, a probabilistic change detection algorithm is proposed based on Statistical Process Control of the posterior Beta distribution of the Jensen-Shannon distance, with a memoryless forgetting mechanism. This algorithm (PDF-SPC) classifies the degree of current change into three states: In-Control, Warning, and Out-of-Control. Second, a novel method is proposed to visualize and characterize the temporal changes of data based on the projection of a non-parametric information-geometric statistical manifold of time windows. This projection facilitates the exploration of temporal trends using the proposed IGT-plot and, by means of unsupervised learning methods, the discovery of conceptually related temporal subgroups. Methods are evaluated using real and simulated data based on the National Hospital Discharge Survey (NHDS) dataset.
The work by C. Sáez has been supported by an Erasmus Lifelong Learning Programme 2013 Grant. This work has been supported by own IBIME funds. The authors thank Dr. Gregor Stiglic, from the University of Maribor, Slovenia, for his support on the NHDS data.
Sáez Silvestre, C.; Pereira Rodrigues, P.; Gama, J.; Robles Viejo, M.; García Gómez, J.M. (2014). Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality. Data Mining and Knowledge Discovery. 28:1-1. doi:10.1007/s10618-014-0378-6
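The monitoring loop behind such a three-state detector can be illustrated with a simplified sketch. It uses the Jensen-Shannon distance and exponential forgetting as in PDF-SPC, but the fixed `warn`/`out` thresholds below stand in for the paper's Beta-posterior control limits and are purely illustrative.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    return math.sqrt(0.5 * kl(p, m) + 0.5 * kl(q, m))

class ChangeDetector:
    """Rough sketch inspired by PDF-SPC: monitor the JS distance between
    a smoothed reference distribution and each incoming window, apply
    memoryless exponential forgetting, and report an SPC state."""

    def __init__(self, alpha=0.3, warn=0.2, out=0.4):
        self.alpha = alpha          # forgetting factor for the reference
        self.warn, self.out = warn, out
        self.reference = None

    def update(self, window_dist):
        if self.reference is None:
            self.reference = list(window_dist)
            return "In-Control"
        d = js_distance(self.reference, window_dist)
        # memoryless forgetting: blend the new window into the reference
        self.reference = [(1 - self.alpha) * r + self.alpha * w
                          for r, w in zip(self.reference, window_dist)]
        if d >= self.out:
            return "Out-of-Control"
        return "Warning" if d >= self.warn else "In-Control"
```

A stable stream keeps the detector In-Control; a sudden shift in the window distribution pushes it through Warning into Out-of-Control.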
Offline and Online Density Estimation for Large High-Dimensional Data
Density estimation has wide applications in machine learning and data analysis, including clustering, classification, multimodality analysis, bump hunting, and anomaly detection. In high-dimensional spaces, the sparsity of data in any local neighborhood renders many parametric and nonparametric density estimation methods inefficient.
This work presents the development of computationally efficient algorithms for high-dimensional density estimation based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of the marginal and joint densities, with the purpose of reducing both computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is presented, along with example applications of high-dimensional density estimation to density-based classification and clustering.
Another challenge in density estimation arises in dealing with online sources of data, where data arrive over an open-ended and non-stationary stream. This calls for efficient algorithms for online density estimation. An online density estimator needs to be capable of providing up-to-date estimates of the density within the available computing resources and the requirements of the application. In response, the BBSP method for online density estimation is introduced. It works by collecting and processing the data in blocks of fixed size, followed by a weighted averaging over the block-wise density estimates. The proper choice of block size is discussed via simulations on streams of synthetic and real datasets.
Further, to improve the efficiency of both offline and online density estimation, a progressive update of the binary partitions in BBSP is proposed which, as simulation results show, leads to improved accuracy as well as speed-ups for various block sizes.
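The block-wise estimate-and-average idea can be illustrated with a toy one-dimensional version. The fixed-grid histogram and the `forget` weight below are simplifying assumptions; BBSP itself builds adaptive Bayesian sequential partitions rather than fixed bins.

```python
import bisect

class BlockwiseDensityEstimator:
    """Toy sketch of block-based online density estimation in the spirit
    of BBSP: data arrive in fixed-size blocks, each completed block
    yields its own histogram estimate on a fixed grid, and the running
    estimate is a weighted average that discounts older blocks."""

    def __init__(self, edges, block_size=100, forget=0.5):
        self.edges = edges            # histogram bin edges, e.g. [0, 0.5, 1]
        self.block_size = block_size
        self.forget = forget          # weight retained by the older estimate
        self.buffer = []
        self.density = None           # current per-bin density estimate

    def _block_density(self, block):
        counts = [0] * (len(self.edges) - 1)
        for x in block:
            i = bisect.bisect_right(self.edges, x) - 1
            if 0 <= i < len(counts):
                counts[i] += 1
        widths = [b - a for a, b in zip(self.edges, self.edges[1:])]
        n = len(block)
        return [c / (n * w) for c, w in zip(counts, widths)]

    def add(self, x):
        self.buffer.append(x)
        if len(self.buffer) == self.block_size:
            new = self._block_density(self.buffer)
            self.buffer = []
            if self.density is None:
                self.density = new
            else:
                # weighted averaging over block-wise estimates
                self.density = [self.forget * old + (1 - self.forget) * cur
                                for old, cur in zip(self.density, new)]
```

A small `block_size` tracks non-stationarity quickly at the cost of noisier per-block estimates, which is exactly the trade-off the abstract says is studied via simulation.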
Towards outlier detection for high-dimensional data streams using projected outlier analysis strategy
Outlier detection is an important research problem in data mining that aims to discover useful abnormal and irregular patterns hidden in large data sets. Most existing outlier detection methods only deal with static data of relatively low dimensionality. Recently, outlier detection for high-dimensional stream data has become an emerging research problem. A key observation that motivates this research is that outliers in high-dimensional data are projected outliers, i.e., they are embedded in lower-dimensional subspaces. Detecting projected outliers from high-dimensional stream data is a very challenging task for several reasons. First, detecting projected outliers is difficult even for high-dimensional static data: the exhaustive search for the outlying subspaces in which projected outliers are embedded is an NP-hard problem. Second, algorithms for handling data streams are constrained to a single pass over the streaming data, under conditions of limited space and time criticality. The currently existing methods for outlier detection are found to be ineffective for detecting projected outliers in high-dimensional data streams.
In this thesis, we present a new technique, called the Stream Projected Outlier deTector (SPOT), which detects projected outliers in high-dimensional data streams. SPOT employs an innovative window-based time model for capturing dynamic statistics from stream data, and a novel data structure containing a set of top sparse subspaces to detect projected outliers effectively. SPOT also employs a multi-objective genetic algorithm as an effective search method for finding the outlying subspaces in which most projected outliers are embedded. The experimental results demonstrate that SPOT is efficient and effective in detecting projected outliers in high-dimensional data streams. The main contribution of this thesis is that it provides a backbone for tackling the challenging problem of outlier detection in high-dimensional data streams. SPOT can facilitate the discovery of useful abnormal patterns and can potentially be applied to a variety of high-demand applications, such as sensor-network data monitoring and online transaction protection.
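The core scoring step, judging a point by the occupancy of its grid cell in a projected subspace, can be sketched as follows. The grid resolution, the fixed candidate subspace, and the `min_count` sparsity threshold are illustrative assumptions; SPOT itself discovers the sparse subspaces with a multi-objective genetic algorithm.

```python
from collections import Counter

def subspace_cell(point, dims, n_bins=4, lo=0.0, hi=1.0):
    """Map a point to its grid cell in the projected subspace `dims`."""
    width = (hi - lo) / n_bins
    return tuple(min(int((point[d] - lo) / width), n_bins - 1) for d in dims)

def projected_outliers(data, dims, n_bins=4, min_count=2):
    """Toy sketch of projected-outlier scoring: project the data onto a
    candidate subspace, grid it, and flag the points whose cell holds
    fewer than `min_count` points. Here the subspace is given; finding
    good subspaces is the hard search problem discussed above."""
    cells = Counter(subspace_cell(p, dims, n_bins) for p in data)
    return [i for i, p in enumerate(data)
            if cells[subspace_cell(p, dims, n_bins)] < min_count]
```

The same point can be perfectly ordinary in the full space yet isolated in a two-dimensional projection, which is what makes it a projected outlier.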
Inferring Social and Internal Context Using a Mobile Phone
This dissertation is composed of research studies that contribute to three research areas: social context-aware computing, internal context-aware computing, and human behavioral data mining. In social context-aware computing, four studies are conducted. First, mobile phone users' calling behavioral patterns are characterized in terms of their randomness level, and the relationships among them are identified. Next, a study is conducted to investigate the relationship between calling behavior and organizational groups. Third, a method is presented to quantitatively define mobile social closeness and social groups, which are then used to identify social group sizes and the scaling ratio. Last, based on the mobile social grouping framework, the significant role of social ties in communication patterns is revealed. In internal context-aware computing, two studies are conducted, where the notions of internal context are intention and situation. For intentional context, the goal is to sense the user's intention in placing calls. A model is thus presented for predicting future calls, envisaged as a call predicted list (CPL), which makes use of call history to build a probabilistic model of calling behavior. As an incoming call predictor, the CPL is a list of numbers/contacts that are the most likely callers within the next hour(s), which is useful for scheduling and daily planning. As an outgoing call predictor, the CPL is generated as a list of numbers/contacts that are the most likely to be dialed when the user attempts to make an outgoing call (e.g., by flipping open or unlocking the phone). This feature saves the time of searching through a lengthy phone book. For situational context, a model is presented for sensing the user's situation (e.g., in a library, driving a car, etc.) based on embedded sensors.
The sensed context is then used to switch the phone into a suitable alert mode (e.g., vibrate mode while in a library, hands-free mode while driving). Inferring (social and internal) context introduces a challenging research problem in human behavioral data mining. Context is determined by the current state of mind (internal), relationships (social), and surroundings (physical). Thus, the current state of context is important and can be derived from recent behavior and patterns. In the data mining research area, therefore, two frameworks are developed for detecting recent patterns, one following a model-driven approach and the other a data-driven approach.
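A minimal frequency-based version of the call predicted list can be sketched as follows. The hour-of-day conditioning and the back-off to overall counts are illustrative assumptions, not the dissertation's full probabilistic model.

```python
from collections import Counter, defaultdict

class CallPredictor:
    """Toy sketch of a call predicted list (CPL): estimate
    P(contact | hour of day) from call history by simple counting and
    return the k most probable contacts for a given hour."""

    def __init__(self):
        self.by_hour = defaultdict(Counter)   # hour -> per-contact call counts
        self.overall = Counter()              # unconditional call counts

    def record_call(self, contact, hour):
        self.by_hour[hour][contact] += 1
        self.overall[contact] += 1

    def predicted_list(self, hour, k=3):
        # back off to overall frequencies when this hour has no history
        counts = self.by_hour[hour] or self.overall
        return [c for c, _ in counts.most_common(k)]
```

Used as an outgoing-call predictor, `predicted_list` would be invoked when the user unlocks the phone; used as an incoming-call predictor, it gives the likely callers for the coming hour.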
Recent Advances in Signal Processing
Signal processing is a critical task in the majority of new technological inventions and challenges, across a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five areas depending on the application at hand: image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.
Probability density estimation over evolving data streams using tilted Parzen window
Probability density estimation is a very important technique that has been widely used in data mining and data analysis. In this paper, we generalize the traditional Parzen window method to data streams and propose a new method, the tilted Parzen window (TPW), for probability density estimation. To adapt to the evolution of the data stream, we use a tilted window size that is proportional to the data's arrival time, instead of a fixed window size. Theoretical analysis shows that the tilted Parzen window method is a valid method for estimating the probability density function (pdf) of a data stream. We also propose a new strategy for discarding historical data in data streams, and prove that this strategy describes probability density changes more accurately than the conventional discarding strategy. Empirical results on a synthetic data set demonstrate the effectiveness and efficiency of this method. Shen Hong & Yan Xiao-Long. http://www.ieee-iscc.org/2008
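The time-dependent kernel idea can be illustrated with a small sketch. Here the per-sample bandwidth grows linearly with the sample's age, which is only one plausible reading of the tilting rule; `base_h` and `rate` are assumed parameters, not the TPW paper's.

```python
import math

def tilted_parzen_density(x, samples, t_now, base_h=0.1, rate=0.05):
    """Illustrative tilted Parzen window estimate over a stream: each
    sample (value, arrival_time) carries a Gaussian kernel whose
    bandwidth grows with the sample's age, so old observations
    contribute a flatter, less influential bump than recent ones."""
    total = 0.0
    for xi, ti in samples:
        h = base_h * (1.0 + rate * (t_now - ti))   # bandwidth tilted by age
        total += math.exp(-0.5 * ((x - xi) / h) ** 2) / (h * math.sqrt(2 * math.pi))
    return total / len(samples)
```

With a fixed bandwidth this reduces to the classical Parzen window; the tilt is what lets the estimate track an evolving stream.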
The Application of Adaptive Linear and Non-Linear Filters to Fringe Order Identification in White-Light Interferometry Systems
Conventional optical interferometry systems driven by highly coherent light sources have a very short unambiguous operating range, a direct consequence of the flatness of the interference fringes' visibility profile at the output of the system.
The range can be extended by using a white-light interferometer (WLI), which is driven by a low-coherence source and produces a Gaussian visibility profile with a unique maximum at the central fringe.
Due to system and/or measurement noise, however, the position of the maximum (from which an accurate measurement of the measurand - displacement, temperature, pressure, flow, etc. - can be derived) is not easily detectable, and can lead to large measurement errors. This is especially true in a multiplexing scheme, where the source power is distributed evenly among various sensors, with a corresponding drop in the overall signal-to-noise ratio. The inclusion of a signal processing scheme at the receiver end is thus a necessity.
As the fringe pattern at the output of a WLI system is basically a noisy sine wave amplitude-modulated by a Gaussian envelope, it can be classified as a non-stationary, narrow-band, linear but non-Gaussian signal. So far, no attempt has been made to apply digital filtering techniques, as understood in the signal processing community, to the output signal of a WLI system. This thesis constitutes a first step in that direction.
Since the only measurable information given by the system is contained in the output signal, the system is modelled as a "black box" driven by the system and measurement noise processes and containing an unknown set of parameters. Standard least squares techniques can then be applied to estimate the parameters of the model, as is usually done in the field of system identification when only noisy output measurements are available.
It is shown that identification of the model parameters is equivalent to finding a set of coefficients for an inverse filter which takes the WLI signal at its input and delivers the unknown noise process at the output.
The non-stationarity of the signal is accounted for by allowing for time variations of the model parameters; this justifies the use of adaptive filters with time-varying coefficients. A new central fringe identification scheme is proposed, based on a modification of the standard least mean square (LMS) adaptive filtering algorithm in combination with amplitude thresholding of the fringe pattern. The new scheme is shown to offer considerable improvement in the identification rate when tested against current schemes over comparable operating ranges, while retaining the computational simplicity and operational speed of the standard LMS. Its performance is also shown to be largely independent of the step-size parameter controlling the rate of convergence and tracking in the standard LMS, which is known to be the main obstacle for a successful application of the algorithm in a practical setting.
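For reference, the standard LMS adaptive predictor that the proposed scheme modifies can be sketched as follows. This is the textbook baseline only, not the thesis's modified algorithm or its amplitude-thresholding stage.

```python
import math

def lms_filter(signal, order=4, mu=0.05):
    """Plain LMS adaptive linear predictor: at each step, predict the
    current sample from the previous `order` samples and nudge the
    time-varying weights along the negative gradient of the squared
    prediction error. Returns the final weights and the error history."""
    w = [0.0] * order
    errors = []
    for n in range(order, len(signal)):
        x = signal[n - order:n]                        # regressor: past samples
        y = sum(wi * xi for wi, xi in zip(w, x))       # prediction
        e = signal[n] - y                              # prediction error
        w = [wi + 2 * mu * e * xi for wi, xi in zip(w, x)]
        errors.append(e)
    return w, errors
```

On a predictable narrow-band input the prediction error shrinks as the weights converge; the step-size `mu` controls the convergence/tracking trade-off that the thesis identifies as the standard LMS's main practical obstacle.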
The non-Gaussianity of the signal is explored and an attempt is made to apply higher-order statistics (HOS) algorithms to central fringe identification. The effectiveness of Gaussianity tests on pilot Gaussian data is seen to depend not only on the number and length of the records available but, perhaps more importantly, on the bandwidth of the process. Violation of the stationarity assumption is shown to lead to mis-classification of a seemingly non-Gaussian signal as a Gaussian one, as the visibility profile may alter the distribution of the underlying sinusoid, making it appear Gaussian even when beam diffraction and wavefront aberrations combine to produce a non-Gaussian profile. HOS-based adaptive algorithms may still be of some benefit, however, if processing is confined to the region of the fringe pattern where sufficient non-Gaussianity is allowed to develop.
Non-linear adaptive filters based on Volterra theory are finally applied to compensate for possible non-linearities introduced by mismatches in optical components, chromatic aberrations, and analogue-to-digital converters. It is shown that although a Volterra filter is able to reproduce the low-amplitude distortions of the fringe pattern better than a linear filter does, the identification rate does not improve. Reasons are given for this behaviour.
Advancements in mesoscale ensemble prediction strategies: Application to Mediterranean high-impact weather
[cat] La predictibilitat d'esdeveniments d'alt impacte a la regi o Mediterr ania ha millorat substancialment al llarg de les darreres d ecades. No obstant aix o, una representaci o precisa d'aspectes dels sistemes convectius rellevants per la societat, tals com el moment en qu e es produeixen, i la seva localitzaci o i intensitat encara suposen un repte. Aquestes febleses de la predicci o a escala convectiva provenen d'imprecisions a l'estimaci o de l'estat atmosf eric inicial, la formulaci o de processos f sics rellevants i la natura ca otica dels sistema associada a la seva no linealitat. En el marc probabil stic imposat per les incerteses intr nseques implicades en la predicci o num erica del temps, l'entitat matem atica que quanti ca la incertesa en l'estat atmosf eric es la funci o densitat de probabilitat. Malgrat aix o, el c alcul de la seva evoluci o temporal es inviable per situacions realistes amb els recursos computacionals disponibles actualment. La modesta aproximaci o habitual per estimar aquesta evoluci o es l' us d'un discret i petit nombre de mostres de l'estat del sistema, que es coneix com a predicci o per conjunts (ensemble forecasting). L'objectiu general d'aquesta Tesi es entendre millor els l mits de la predictibilitat i contribuir a una millora de la predicci o de temps sever a la regi o Mediterr ania. En primer lloc, s'avalua l'evoluci o temporal de les funcions densitat de probabilitat per sistemes de baixa complexitat amb un cert grau de realisme adoptant el formalisme de Liouville. En segon lloc, es dissenya una estrat egia de mostreig per crear pertorbacions a les condicions inicials per abastos de predicci o curts (24-36 h). La t ecnica es basa en el m etode de breeding, que utilitza la din amica completa no lineal per identi car modes de creixement r apid. La modi caci o proposada est a dirigida a ajustar l'escala de les pertorbacions per tal de cobrir l'ample rang d'escales rellevants per la predicci o de curt abast. 
[eng] The predictability of meteorological high-impact events in the Mediterranean region has substantially improved over the last decades.
Nevertheless, a precise representation of socially relevant aspects of convective systems, such as their timing, location, and intensity, is still challenging. These weaknesses of convective-scale forecasting stem from inaccuracies in the estimation of the initial atmospheric state, the formulation of the relevant physical processes, and the chaotic nature of the system associated with its nonlinearity. In the probabilistic framework imposed by the intrinsic uncertainties involved in numerical weather prediction, the mathematical entity that quantifies the uncertainty in the atmospheric state is the probability density function. However, computing its time evolution is unfeasible for realistic situations with currently available computational resources. The usual, modest approach to estimating this evolution is to use a discrete and small number of samples of the state of the system, which is known as ensemble forecasting. The general aim of this Thesis is to better understand the limits of predictability and to contribute towards the improvement of severe weather forecasting in the Mediterranean region. Firstly, the time evolution of probability density functions for low-complexity systems with a certain degree of realism is evaluated by adopting the Liouville formalism. Secondly, a sampling strategy to create initial-condition perturbations for the short range (24-36 h) is designed. The technique is based on the breeding method, which uses the full nonlinear dynamics to identify fast-growing modes. The proposed modification is aimed at tailoring the scale of the perturbations in order to cover the wide range of scales relevant for short-range forecasting. Thirdly, the potential of several methods to account for model uncertainty is investigated for a recent heavy precipitation and flash flood episode that occurred along the Spanish Mediterranean coast (12-13 September 2019).
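The breeding cycle mentioned above (a control and a perturbed forecast run with the full nonlinear model, with their difference periodically rescaled to a fixed amplitude) can be sketched on a toy chaotic system. This is a generic illustration, not the thesis's implementation: the Lorenz-63 model, the forward-Euler integrator, and all amplitudes are illustrative assumptions.

```python
import numpy as np

def lorenz63(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz-63 system, a standard
    low-complexity chaotic model (stand-in for the full NWP model)."""
    x, y, z = state
    deriv = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return state + dt * deriv

def breeding_cycle(x0, n_cycles=20, steps_per_cycle=50, bred_size=0.1, seed=0):
    """Classic breeding: advance control and perturbed states with the full
    nonlinear dynamics, then rescale their difference (the bred vector)."""
    rng = np.random.default_rng(seed)
    control = np.asarray(x0, dtype=float)
    perturbed = control + bred_size * rng.standard_normal(3)
    for _ in range(n_cycles):
        for _ in range(steps_per_cycle):
            control = lorenz63(control)
            perturbed = lorenz63(perturbed)
        bred = perturbed - control
        bred *= bred_size / np.linalg.norm(bred)  # rescale to fixed amplitude
        perturbed = control + bred                # re-seed the perturbed run
    return bred  # approximates a fast-growing direction at the current state

bv = breeding_cycle([1.0, 1.0, 1.0])
```

Repeated rescaling lets the nonlinearly growing error directions dominate the bred vector, which is the property the thesis's modification then exploits by adjusting the perturbation scale.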
Multiple stochastic strategies are evaluated against the ordinary multiphysics approach in terms of ensemble diversity and skill. The considered techniques include stochastically perturbed physics tendencies and perturbations to influential parameters within the microphysics scheme. Finally, these ensemble generation strategies are used as the meteorological forcing for a hydrological model in order to investigate the hydrometeorological predictability of the 12-13 September 2019 episode. The developed techniques, along with data assimilation by means of the Ensemble Kalman Filter, are compared to other popular strategies, such as downscaling from a global model and the multiphysics approach. The results of this Thesis are relevant from a theoretical perspective, as the solution of the Liouville equation reveals complex structures in the probability density function that could compromise the compactness and smoothness hypotheses assumed by most current ensemble interpretation and postprocessing tools. On the other hand, the ensemble generation strategies developed show potential to improve the forecasting of high-impact events, as demonstrated by higher ensemble diversity and skill compared to the benchmark strategies. These encouraging results lay the foundations for an advanced warning system in the Mediterranean region to deal with severe weather events.
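The two diagnostics used above to rank generation strategies, ensemble diversity and skill, are commonly measured as the ensemble spread and the RMSE of the ensemble mean against a reference. The sketch below uses synthetic data and assumed function names; it is not tied to the thesis's verification setup.

```python
import numpy as np

def ensemble_spread(ens):
    """Diversity: member standard deviation about the ensemble mean,
    averaged over all grid points (ens has shape [members, points])."""
    return float(np.mean(np.std(ens, axis=0, ddof=1)))

def ensemble_skill(ens, reference):
    """Skill proxy: RMSE of the ensemble mean against a reference field
    (observations or an analysis)."""
    err = np.mean(ens, axis=0) - reference
    return float(np.sqrt(np.mean(err ** 2)))

# Toy 20-member ensemble: reference field plus independent member noise.
rng = np.random.default_rng(42)
reference = rng.standard_normal(100)
members = reference + 0.5 * rng.standard_normal((20, 100))

spread = ensemble_spread(members)
skill = ensemble_skill(members, reference)
```

In a well-calibrated ensemble the spread should match the error of the ensemble mean; under-dispersion (spread below error) is the usual symptom that motivates the stochastic model-uncertainty strategies compared in the abstract.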
2nd Edition of Health Emergency and Disaster Risk Management (Health-EDRM)
Disasters such as earthquakes, cyclones, floods, heat waves, nuclear accidents, and large-scale pollution incidents take lives and incur major health problems. The majority of large-scale disasters affect the most vulnerable populations, often people at the extremes of age, those living in remote areas, those in endemic poverty, and those with low literacy. Health emergency and disaster risk management (Health-EDRM) refers to the systematic analysis and management of health risks surrounding emergencies and disasters, and plays an important role in reducing hazards and vulnerability while strengthening preparedness, response, and recovery measures. This concept encompasses risk analyses and interventions, such as accessible early warning systems, the timely deployment of relief workers, and the provision of suitable drugs and medical equipment, that decrease the impact of disasters on people before, during, and after an event (or events). Currently, there is a major gap in the scientific literature regarding Health-EDRM to facilitate major global policies and initiatives for disaster risk reduction worldwide