Graph-based techniques for compression and reconstruction of sparse sources
The main goal of this thesis is to develop lossless compression schemes for analog and binary sources. All the compression schemes considered share a common feature: the encoder can be represented by a graph, so they can be studied with tools from modern coding theory.
In particular, the thesis focuses on two compression problems: group testing and noiseless compressed sensing. Although the two problems may seem unrelated, the thesis shows that they are closely connected. Furthermore, group testing has the same mathematical formulation as non-linear binary source compression schemes built on the OR operator. The thesis exploits the similarities between these problems.
The group testing problem aims to identify the defective subjects of a population with as few tests as possible. Group testing schemes fall into two classes: adaptive and non-adaptive. Adaptive schemes generate tests sequentially and exploit partial decoding results to reduce the overall number of tests required to label every member of the population, whereas non-adaptive schemes perform all the tests in parallel and attempt to label as many subjects as possible.
Our contributions to the group testing problem are both theoretical and practical. We propose a novel adaptive scheme that performs the testing process efficiently. Furthermore, we develop tools to predict the performance of both adaptive and non-adaptive schemes when the number of subjects to be tested is large; these tools characterize the performance of group testing schemes without the need to simulate them.
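The adaptive idea can be illustrated with classical binary-splitting group testing (a generic textbook scheme, not the thesis's algorithm): each pooled test reports whether a group contains at least one defective subject, clean groups are cleared with a single test, and positive groups are split recursively. All names and parameters below are illustrative.

```python
def adaptive_group_test(population, is_defective):
    """Identify defective subjects by recursive binary splitting.

    `is_defective` maps subject -> bool; each pooled test reports
    whether a group contains at least one defective subject.
    """
    tests = 0

    def pooled_test(group):
        nonlocal tests
        tests += 1
        return any(is_defective[s] for s in group)

    def split(group):
        if not pooled_test(group):
            return []              # whole group is clean: one test suffices
        if len(group) == 1:
            return list(group)     # a positive singleton is defective
        mid = len(group) // 2
        return split(group[:mid]) + split(group[mid:])

    return split(list(population)), tests
```

For a population of 100 with 2 defectives, this typically labels everyone with far fewer than 100 pooled tests, which is the saving adaptivity buys over individual testing.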
The goal of noiseless compressed sensing is to retrieve a signal from its linear projection onto a lower-dimensional space. This is possible only when the original signal has enough zero components. Compressed sensing deals with the design of sampling schemes and reconstruction algorithms that recover the original signal vector from as few samples as possible.
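The setting can be sketched with a classical greedy decoder, orthogonal matching pursuit (OMP), rather than the message-passing algorithms the thesis proposes; the dimensions, random Gaussian matrix, and sparsity level below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 32, 16, 3                           # signal length, samples, nonzeros

A = rng.standard_normal((m, n)) / np.sqrt(m)  # random sampling matrix
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)  # k-sparse source
y = A @ x                                     # noiseless lower-dimensional projection

def omp(A, y, k):
    """Greedy reconstruction: pick the column most correlated with the
    residual, refit by least squares on the chosen support, repeat."""
    residual, chosen = y.copy(), []
    for _ in range(k):
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in chosen:
            chosen.append(j)
        coef, *_ = np.linalg.lstsq(A[:, chosen], y, rcond=None)
        residual = y - A[:, chosen] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[chosen] = coef
    return x_hat

x_hat = omp(A, y, k)
```

With Gaussian matrices and enough samples relative to the sparsity, OMP typically recovers the sparse vector exactly; the thesis's message-passing decoders exploit sparse matrix structure to do this at lower complexity.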
In this thesis we pose the compressed sensing problem within a probabilistic framework, as opposed to the classical compressed sensing formulation. Recent results in the literature show that this approach is more efficient than the classical one.
Our contributions to noiseless compressed sensing are both theoretical and practical. We derive a necessary and sufficient condition on the matrix design that guarantees lossless reconstruction. On the practical side, we propose two novel reconstruction algorithms based on message passing over the sparse representation of the matrix, one of them with very low computational complexity.
Which health and biomedical topics generate the most Facebook interest and the strongest citation relationships?
This is an accepted manuscript of an article published by Elsevier in Information Processing and Management on 26/02/2020, available online: https://doi.org/10.1016/j.ipm.2020.102230
The accepted version of the publication may differ from the final published version.
Although more than a million academic papers have been posted on Facebook, there is little detailed research about which fields or cross-field issues are involved and whether there are field or public interest relationships between Facebook mentions and future citations. In response, we identified health and biomedical scientific papers mentioned on Facebook and assigned subjects to them using the MeSH and Science-Metrix journal classification schemes. Multistage adaptive LASSO and unpenalized least-squares regressions were used to model Facebook mentions by fields and MeSH terms. The fields Science and Technology, General and Internal Medicine, Complementary and Alternative Medicine, and Sport Sciences produced higher Facebook mention counts than average. However, no MeSH cross-field issue differences were found in the rate of attracting Facebook mentions. The relationship between Facebook mentions and citations varies between both fields and MeSH cross-field issues. General and Internal Medicine, Cardiovascular System and Hematology, and Developmental Biology have the strongest correlations between Facebook mentions and citations, probably due to high citation rates and high Facebook visibility in these areas.
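The LASSO step can be sketched in miniature: penalized regression that zeroes out uninformative predictors, leaving the fields that genuinely move mention counts. The study uses multistage adaptive LASSO on MeSH/field indicators; the synthetic data and the plain coordinate-descent solver below are illustrative stand-ins.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=200):
    """LASSO by cyclic coordinate descent with soft-thresholding."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            # partial residual with feature j's contribution removed
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# synthetic "mention counts": only the first two predictors matter
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.standard_normal(200)

w = lasso_cd(X, y, lam=20.0)   # a large penalty zeroes irrelevant coefficients
```

The soft-threshold kills coefficients whose correlation with the residual falls below the penalty, which is how LASSO separates fields with a real mention effect from noise.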
Cuckoo search epistasis: a new method for exploring significant genetic interactions
The advent of high-throughput sequencing technology has resulted in the ability to measure millions of single-nucleotide polymorphisms (SNPs) from thousands of individuals. Although these high-dimensional data have paved the way for better understanding of the genetic architecture of common diseases, they have also given rise to challenges in developing computational methods for learning epistatic relationships among genetic markers. We propose a new method, named cuckoo search epistasis (CSE), for identifying significant epistatic interactions in population-based association studies with a case-control design. This method combines a computationally efficient Bayesian scoring function with an evolutionary heuristic search algorithm, and can be applied efficiently to high-dimensional genome-wide SNP data. Experimental results on synthetic data sets show that CSE outperforms existing methods, including multifactor dimensionality reduction and Bayesian epistasis association mapping. In addition, on a real genome-wide data set related to Alzheimer's disease, CSE identified SNPs consistent with previously reported results, showing the utility of CSE for application to genome-wide data. © 2014 Macmillan Publishers Limited. All rights reserved.
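The search component can be sketched with a minimal generic cuckoo search: heavy-tailed exploration steps plus abandonment of the worst nests each generation. The sphere objective below stands in for CSE's Bayesian scoring function, and all parameters (nest count, step scale, abandonment fraction) are illustrative assumptions.

```python
import numpy as np

def cuckoo_search(score, dim, n_nests=15, n_iter=200, pa=0.25, seed=0):
    """Minimise `score` with a basic cuckoo search.

    Each nest is a candidate solution; new eggs are laid via
    heavy-tailed (Levy-style) steps, and a fraction `pa` of the
    worst nests is abandoned and re-seeded each generation.
    """
    rng = np.random.default_rng(seed)
    nests = rng.uniform(-5, 5, size=(n_nests, dim))
    fitness = np.array([score(x) for x in nests])
    best = nests[np.argmin(fitness)].copy()
    for _ in range(n_iter):
        # heavy-tailed step relative to the current best nest
        steps = rng.standard_cauchy(size=(n_nests, dim)) * 0.05
        new = nests + steps * (nests - best)
        new_fit = np.array([score(x) for x in new])
        improved = new_fit < fitness
        nests[improved], fitness[improved] = new[improved], new_fit[improved]
        # abandon a fraction pa of the worst nests and re-seed them
        n_drop = int(pa * n_nests)
        worst = np.argsort(fitness)[-n_drop:]
        nests[worst] = rng.uniform(-5, 5, size=(n_drop, dim))
        fitness[worst] = np.array([score(x) for x in nests[worst]])
        best = nests[np.argmin(fitness)].copy()
    return best, float(fitness.min())

# toy objective: sphere function, minimum 0 at the origin
best, best_val = cuckoo_search(lambda x: float((x ** 2).sum()), dim=3)
```

In CSE the candidate solutions would be SNP subsets scored by the Bayesian function rather than real-valued vectors; the exploration/abandonment loop is the shared skeleton.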
Building a scalable and interpretable Bayesian deep learning framework for quality control of free form surfaces
Deep learning has demonstrated high accuracy for the 3D object shape error modeling needed to estimate dimensional and geometric quality defects in multi-station assembly systems (MAS). Increasingly, deep learning-driven Root Cause Analysis (RCA) is used for decision-making when planning corrective actions for quality defects. However, given the current absence of models enabling scalability, training deep learning models for each individual MAS is exceedingly time-consuming, as it requires large amounts of labelled data and multiple computational cycles. Additionally, understanding and interpreting how deep learning produces final predictions while quantifying various uncertainties remains a fundamental challenge. To address these gaps, a novel closed-loop in-process (CLIP) diagnostic framework underpinned by an algorithm portfolio is proposed, which simultaneously enhances the scalability and interpretability of the current Bayesian deep learning approach, Object Shape Error Response (OSER), to isolate root cause(s) of quality defects in MAS. OSER-MAS leverages a Bayesian 3D U-Net architecture integrated with Computer-Aided Engineering simulations to estimate root causes. The CLIP diagnostic framework shortens OSER-MAS model training time by developing: (i) closed-loop training that enables faster convergence for a single MAS by leveraging the uncertainty estimates of the Bayesian 3D U-Net model; and (ii) a transfer/continual learning-based scalability model that transmits meta-knowledge from the trained model to a new MAS, achieving convergence with comparatively fewer training samples.
Additionally, CLIP increases the transparency of quality-related root cause predictions through an interpretability model based on 3D Gradient-based Class Activation Maps (3D Grad-CAMs), which entails: (a) linking elements of the MAS model with functional elements of the U-Net architecture; and (b) relating features extracted by the architecture to elements of the MAS model and, further, to the object shape error patterns of root cause(s) that occur in the MAS. Benchmarking studies are conducted using six automotive MAS of varying complexity. Results highlight a reduction in training samples of up to 56% with a loss in performance of at most 2.1%.
Computational Intelligence in Healthcare
This book is a printed edition of the Special Issue Computational Intelligence in Healthcare that was published in Electronic
Computational Intelligence in Healthcare
The amount of patient health data was estimated to reach 2,314 exabytes by 2020. Traditional data analysis techniques are unsuitable for extracting useful information from such a vast quantity of data. Thus, intelligent data analysis methods combining human expertise and computational models for accurate and in-depth data analysis are necessary. The technological revolution and medical advances made by combining vast quantities of available data, cloud computing services, and AI-based solutions can provide expert insight and analysis on a mass scale and at relatively low cost. Computational intelligence (CI) methods, such as fuzzy models, artificial neural networks, evolutionary algorithms, and probabilistic methods, have recently emerged as promising tools for the development and application of intelligent systems in healthcare practice. CI-based systems can learn from data and evolve according to changes in their environment, taking into account the uncertainty characterizing health data, including omics data, clinical data, sensor data, and imaging data. The use of CI in healthcare can improve the processing of such data to develop intelligent solutions for prevention, diagnosis, treatment, and follow-up, as well as for the analysis of administrative processes. The present Special Issue on computational intelligence for healthcare is intended to show the potential and the practical impact of CI techniques in challenging healthcare applications.
Co-management in colorectal surgery
ABSTRACT: Increased life expectancy means that surgical patients are increasingly older and carry greater multimorbidity.
Post-surgical complications are still an important cause of mortality, although most individual surgical procedures carry a very low risk. The quality of care of surgical in-patients thus depends critically on the medical teams' capacity to integrate the interactions of their different pathological conditions, warranting multidisciplinary teams for the more complex cases.
The collaboration between medical and surgical specialties may follow different
models, with different advantages and disadvantages. The most common approaches
are (i) the traditional “on-call” model and (ii) co-management (CM). Co-management
is a modern approach to the problem, which has been proven favourable in complex
situations.
We focused on designing the clinical-care pathway for complex patients under CM between Internal Medicine (IM) and General Surgery at Hospital da Luz Lisboa (HLL), in order to clarify the role of the different specialties involved and to develop predictive models that support the clinical decision on which patients to select for CM.
The clinical-care pathway for complex patients at HLL was modelled graphically using Business Process Modeling Notation, highlighting points for potential improvement as well as critical decision nodes for CM.
The criteria for inclusion in the CM pathway were made explicit using retrospective data from patients who underwent colorectal surgery at HLL between January 2012 and December 2015 (48 months).
Two different mathematical/statistical predictive models were developed, each using the technique most suited to the pre-processing of the data at each stage of the study. The first model, to be used in the preoperative anaesthesiology clinic, uses only preoperative data; the second, to be used upon discharge from the recovery facility, uses both preoperative and postoperative data.
Both decision support tools showed good performance. The pre-operative model attained an area under the receiver-operating characteristic (ROC) curve (AUC) of 0.81, a sensitivity of 0.74, a specificity of 0.78 and a negative predictive value (NPV) of 0.93, with a satisfactory calibration plot. The model with both pre-operative and post-operative data attained an AUC of 0.86, a sensitivity of 0.80, a specificity of 0.82 and an NPV of 0.95, with good predictive capacity as gauged by the calibration plot.
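Metrics of this kind can be computed directly from predicted probabilities at a chosen cut-off; the sketch below is generic (the labels, probabilities, and cut-off are toy values, not the study's data).

```python
import numpy as np

def threshold_metrics(y_true, p_hat, cutoff):
    """Sensitivity, specificity and NPV at a probability cut-off."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(p_hat) >= cutoff
    tp = np.sum(y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    return tp / (tp + fn), tn / (tn + fp), tn / (tn + fn)

def auc(y_true, p_hat):
    """AUC as the probability a positive outranks a negative (Mann-Whitney)."""
    y_true, p_hat = np.asarray(y_true), np.asarray(p_hat)
    pos, neg = p_hat[y_true == 1], p_hat[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))
```

With `y_true = [1, 1, 0, 0]`, `p_hat = [0.9, 0.2, 0.1, 0.3]` and a cut-off of 0.25, sensitivity, specificity and NPV are each 0.5 and the AUC is 0.75, which is easy to verify by hand.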
The pre-operative data model was internally validated and the pre- and post-operative data model was externally validated, albeit without geographical transportability.
The clinical-pathway graphics were then revised, integrating the decision support tools at two critical points.
This clinical decision support tool makes it possible to:
i. codify explicitly the implicit knowledge typically held by the clinicians,
ii. establish a method that can be extrapolated to other diseases, other procedures and other hospital settings, and
iii. add managerial reasoning to hospital practice.
We acknowledge that the diversity of patients and hospital settings makes it impossible to define an exact formula for building a standard successful CM program.
However, based on our experience at HLL and on the literature, we describe the main factors and variables involved and suggest a simple approach. The integration of objective data with clinical expertise is key to achieving high standards in clinical practice.
Data fusion for system modeling, performance assessment and improvement
Due to rapid advancements in sensing and computation technology, multiple types of sensors have been embedded in various applications, automatically collecting massive amounts of production information online. Although this data-rich environment provides great opportunities for more effective process control, it also raises new research challenges in data analysis and decision making due to complex data structures, such as heterogeneous data dependency and large-volume, high-dimensional characteristics.
This thesis contributes to the area of System Informatics and Control (SIAC) by developing systematic data fusion methodologies for effective quality control and performance improvement in complex systems. These methodologies enable (1) better handling of the rich data environment communicated by complex engineering systems, (2) closer monitoring of system status, and (3) more accurate forecasting of future trends and behaviors. The research bridges methodological gaps among advanced statistics, engineering domain knowledge and operations research, and links closely to application areas such as manufacturing, health care, energy and service systems.
This thesis starts by investigating optimal sensor system design and multiple-sensor data fusion for process monitoring and diagnosis in different applications. In Chapter 2, we first study the couplings between the optimal design of a sensor system in a Bayesian network and the quality management of a manufacturing system, which can improve cost-effectiveness and production yield by considering sensor cost, process change detection speed, and fault diagnosis accuracy in an integrated manner. An algorithm named "Best Allocation Subsets by Intelligent Search" (BASIS), with an optimality proof, is developed to obtain the optimal sensor allocation at minimum cost under different user-specified detection requirements.
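The flavor of the allocation problem can be shown with a toy brute-force search for the cheapest sensor subset meeting a detection requirement; BASIS itself is an intelligent search with an optimality proof, and the sensor costs and fault-coverage sets below are invented for illustration.

```python
from itertools import combinations

# toy instance: each sensor has a cost and detects a set of process faults
costs = {"s1": 3, "s2": 2, "s3": 4, "s4": 1}
covers = {"s1": {"f1", "f2"}, "s2": {"f2", "f3"},
          "s3": {"f1", "f3"}, "s4": {"f3"}}
faults = {"f1", "f2", "f3"}

def cheapest_allocation(costs, covers, faults):
    """Exhaustively find the minimum-cost sensor subset covering all faults."""
    best, best_cost = None, float("inf")
    sensors = list(costs)
    for r in range(1, len(sensors) + 1):
        for subset in combinations(sensors, r):
            covered = set().union(*(covers[s] for s in subset))
            total = sum(costs[s] for s in subset)
            if covered >= faults and total < best_cost:
                best, best_cost = subset, total
    return set(best), best_cost

allocation, cost = cheapest_allocation(costs, covers, faults)
```

Exhaustive enumeration is exponential in the number of sensors, which is precisely why an intelligent search with pruning is needed at realistic scale.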
Chapter 3 extends this line of research by proposing a novel adaptive sensor allocation framework, which greatly improves the monitoring and diagnosis capabilities of the previous method. A max-min criterion is developed to manage sensor reallocation and process change detection in an integrated manner. The methodology was tested and validated on a hot forming process and a cap alignment process.
Next, in Chapter 4, we propose a Scalable-Robust-Efficient Adaptive (SERA) sensor allocation strategy for online high-dimensional process monitoring in a general network. A monitoring scheme using the sum of the top-r local detection statistics is developed; it is scalable, effective and robust in detecting a wide range of possible shifts in all directions. This research provides generic guidelines for practitioners on determining (1) the appropriate sensor layout; (2) the "ON" and "OFF" states of different sensors; and (3) which part of the acquired data should be transmitted to and analyzed at the fusion center when only limited resources are available.
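The top-r statistic itself has a very small core: sort the local detection statistics and sum the r largest, so that a shift touching only a few sensors still dominates the global statistic. The in-control and shifted values below are invented for illustration.

```python
import numpy as np

def top_r_statistic(local_stats, r):
    """Global monitoring statistic: the sum of the r largest local statistics."""
    s = np.sort(np.asarray(local_stats, dtype=float))
    return float(s[-r:].sum())

# invented example: 50 sensors with in-control local statistics at 1.0;
# a shift affecting only sensors 3 and 17 raises their local statistics
base = np.full(50, 1.0)
shifted = base.copy()
shifted[[3, 17]] += 5.0

in_control = top_r_statistic(base, r=5)         # 5.0
out_of_control = top_r_statistic(shifted, r=5)  # 15.0
```

Summing only the top r (rather than all 50) keeps the statistic sensitive to sparse shifts while ignoring the noise contributed by the many unaffected sensors.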
To improve the accuracy of remaining-lifetime prediction, Chapter 5 proposes a data-level fusion methodology for degradation modeling and prognostics. When multiple sensors measure the degradation of the same system, determining which sensors to use and how to combine them becomes a high-dimensional and challenging problem. To address this, we first define two essential properties that, if present in a degradation signal, enhance its effectiveness for prognostics. We then propose a generic data-level fusion algorithm that constructs a composite health index achieving those two properties. The methodology was tested on degradation signals from aircraft gas turbine engines and demonstrated much better prognostic results than relying on data from any individual sensor.
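A minimal data-level fusion sketch, assuming a simple least-squares composite: weight the sensor channels so their combination tracks a smooth monotone degradation target. The thesis's algorithm is driven by its two identified signal properties; the synthetic sensors and the normalised-time target here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 100)          # normalised operating time

# three synthetic degradation sensors: two informative, one pure noise
s1 = 2.0 * t + 0.05 * rng.standard_normal(100)
s2 = t ** 2 + 0.05 * rng.standard_normal(100)
s3 = rng.standard_normal(100)           # uninformative channel
S = np.column_stack([s1, s2, s3])

# data-level fusion: least-squares weights so the composite health
# index tracks the smooth monotone target (here, normalised time)
w, *_ = np.linalg.lstsq(S, t, rcond=None)
health_index = S @ w
```

The composite index follows the underlying degradation trend more smoothly than any single noisy channel, which is the intuition behind fusing sensors before fitting a prognostic model.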
In summary, this thesis draws attention to data fusion as a means of effectively employing the underlying data-gathering capabilities for system modeling, performance assessment and improvement. Fundamental data fusion methodologies are developed and applied across various applications, facilitating resource planning, real-time monitoring, diagnosis and prognostics.