Recursive Parameter Estimation of Non-Gaussian Hidden Markov Models for Occupancy Estimation in Smart Buildings
An enormous volume of data is now being produced, so accurately modeling these data for further analysis and the extraction of meaningful patterns has become a major concern in a wide variety of real-life applications. Smart buildings are one of the areas most urgently demanding such analysis: managing the intelligent systems in smart homes reduces energy consumption and enhances users' comfort. In this context, the Hidden Markov Model (HMM), a learnable finite stochastic model, has consistently been a powerful tool for data modeling. Given the importance of indoor occupancy estimation in automating environmental settings, we propose occupancy estimation frameworks for smart buildings based on HMMs. One of the key factors in modeling data with an HMM is the choice of the emission probability distribution. In this thesis, we propose novel HMM extensions with Generalized Dirichlet (GD), Beta-Liouville (BL), Inverted Dirichlet (ID), Generalized Inverted Dirichlet (GID), and Inverted Beta-Liouville (IBL) distributions as emission distributions. These distributions are investigated for their capability to model a variety of non-Gaussian data, overcoming the limited covariance structure of alternatives such as the Dirichlet distribution. Once the emission distribution is chosen, the next step is estimating optimal parameters for it. We therefore develop a recursive parameter estimation procedure based on maximum likelihood estimation (MLE). Owing to the linear complexity of the proposed recursive algorithm, the developed models can handle real-time data, allowing them to be used in an extensive range of practical applications.
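The recursive, linear-complexity flavor of this approach can be illustrated with a minimal sketch: a two-state (vacant/occupied) HMM filtered over sensor counts, with each state's emission parameter updated online by a stochastic-approximation MLE step. This is only an illustration of the recursive idea, not the thesis's actual GD/BL/ID/GID/IBL derivations; Poisson emissions stand in for those non-Gaussian families, and the transition matrix, initial rates, and step size `eta` are assumptions chosen for the example.

```python
import numpy as np

def forward_step(alpha_prev, A, lik):
    """One recursive forward-filter update: O(K^2) per observation."""
    alpha = lik * (A.T @ alpha_prev)
    return alpha / alpha.sum()

def run_filter(obs, A, lam, eta=0.05):
    """Filter a count sequence through a K-state HMM with (illustrative)
    Poisson emissions, updating each state's rate online via a
    stochastic-approximation MLE step. One pass, linear in len(obs)."""
    K = len(lam)
    alpha = np.full(K, 1.0 / K)          # uniform initial state belief
    lam = lam.astype(float).copy()       # per-state emission rates
    states = []
    for x in obs:
        lik = np.exp(-lam) * lam ** x    # unnormalized Poisson likelihood (x! cancels)
        alpha = forward_step(alpha, A, lik)
        # recursive MLE-style update: move each state's rate toward the
        # observation, weighted by that state's filtered responsibility
        lam += eta * alpha * (x - lam)
        states.append(alpha.argmax())    # MAP occupancy estimate at time t
    return np.array(states), lam

# Illustrative run: low counts = vacant (state 0), high counts = occupied (state 1)
A = np.array([[0.95, 0.05],
              [0.05, 0.95]])
states, rates = run_filter(np.array([1, 2, 1, 2, 12, 11, 10, 9, 1, 2]),
                           A, np.array([2.0, 10.0]))
print(states)
```

The filter touches each observation exactly once and keeps only the current belief and parameter estimates, which is what makes the recursive scheme suitable for real-time streams.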
A critical review of the current state of forensic science knowledge and its integration in legal systems
Forensic science has a significant historical and contemporary relationship with the criminal justice system. It is a relationship between two disciplines whose origins stem from different backgrounds. It is trite that effective communication assists in resolving underlying problems in any given context. However, a lack of communication continues to characterise the intersection between law and science.
As recently as 2019, a six-part symposium on the use of forensic science in the criminal justice system again posed the question of how the justice system could ensure the reliability of forensic science evidence presented during trials. While the law demands finality, science is always evolving and can never be considered finite or final. Legal systems do not always adapt to the nature of scientific knowledge, and are not willing to abandon finality when that scientific knowledge shifts.
Advocacy plays an important role in the promotion of forensic science, particularly advocacy to the broader scientific community for financial support, much needed research and more testing. However, despite its important function, advocacy should not be conflated with science. The foundation of advocacy is a cause; whereas the foundation of science is fact.
The objective of this research was to conduct a qualitative literature review of the field of forensic science; to identify gaps in the knowledge of forensic science and its integration in the criminal justice system. The literature review will provide researchers within the field of forensic science with suggested research topics requiring further examination and research. To achieve its objective, the study critically analysed the historical development of, and evaluated the use of forensic science evidence in legal systems generally, including its role regarding the admissibility or inadmissibility of the evidence in the courtroom.
In conclusion, it was determined that the breadth of forensic scientific knowledge is comprehensive but scattered. The foundational underpinning of the four disciplines discussed in this dissertation has been put to the legal test on countless occasions. Some gaps remain that require further research in order to strengthen the foundation of the disciplines. Human influence will always be present in examinations and interpretations and will lean towards subjective decision making.
Jurisprudence, D. Phil.
Consensus or Fusion of Segmentations for Several Detection or Classification Applications in Imaging
Recently, some true metrics in a criterion sense (with good asymptotic properties)
were introduced between data partitions (or clusterings) even for data spatially ordered
such as image segmentations. From these metrics, the notion of average clustering (or
consensus segmentation) was then proposed in image processing as the solution of an
optimization problem and a simple and effective way to improve the final result of segmentation
or classification obtained by averaging (or fusing) different segmentations of
the same scene which are roughly estimated from several simple segmentation models
(or obtained with the same model but with different internal parameters). This principle,
which can be conceived as a denoising of high abstraction data, has recently proved to
be an effective and very parallelizable alternative, compared to methods using ever more
complex and time-consuming segmentation models.
The principle of distance between segmentations, and of averaging segmentations in a criterion sense, can be exploited, directly or with easy adaptation, by all the algorithms or methods used in digital imaging where the data can be replaced by segmented images. This thesis aims to demonstrate this assertion and to present original applications in various fields of digital imagery: visualization and indexing in large image databases in terms of the segmented content of each image, rather than in the usual color-and-texture sense; image processing, to significantly and easily improve the performance of motion-detection methods in image sequences; and finally the analysis and classification of medical images, with an application enabling the automatic detection and quantification of Alzheimer's disease from magnetic resonance images of the brain.
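The averaging-of-segmentations idea can be sketched in miniature with a co-association consensus: count, over the input segmentations, how often each pair of pixels shares a segment, then link pixels whose agreement exceeds a threshold. This is a schematic stand-in for the metric-based optimization described above, not the thesis's method; the threshold `tau` and the greedy single-link grouping are illustrative assumptions.

```python
import numpy as np

def coassociation(segmentations):
    """Average co-association matrix: fraction of input segmentations that
    place each pair of pixels in the same segment (labels need not match
    across segmentations -- only co-membership matters)."""
    segs = [s.ravel() for s in segmentations]
    n = segs[0].size
    C = np.zeros((n, n))
    for s in segs:
        C += (s[:, None] == s[None, :])
    return C / len(segs)

def consensus_labels(C, tau=0.5):
    """Greedy consensus: connected components of the graph joining pixel
    pairs whose co-association exceeds tau."""
    n = C.shape[0]
    labels = -np.ones(n, dtype=int)
    cur = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        stack = [i]                      # flood-fill one component
        labels[i] = cur
        while stack:
            p = stack.pop()
            for q in np.where((C[p] > tau) & (labels < 0))[0]:
                labels[q] = cur
                stack.append(q)
        cur += 1
    return labels

# Three rough 6-pixel segmentations of the same "scene" (label ids differ freely)
segs = [np.array([0, 0, 0, 1, 1, 1]),
        np.array([1, 1, 1, 0, 0, 0]),
        np.array([0, 0, 1, 1, 2, 2])]
print(consensus_labels(coassociation(segs)))
```

The consensus recovers the two-region partition that the majority of input segmentations agree on, which is the "denoising of high-abstraction data" intuition in its simplest form; the quadratic co-association matrix is only practical for small images, whereas the thesis's criterion-based formulation scales further.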
Cost-Sensitive Boosting for Classification of Imbalanced Data
The classification of data with imbalanced class distributions poses a significant challenge to the performance attainable by most well-developed classification systems, which assume relatively balanced class distributions. This problem is especially crucial in many application domains of great importance in machine learning and data mining, such as medical diagnosis, fraud detection, and network intrusion detection.
This thesis explores meta-techniques applicable to most classifier learning algorithms, with the aim of advancing the classification of imbalanced data. Boosting is a powerful meta-technique that learns an ensemble of weak models with a promise of improving classification accuracy, and AdaBoost is widely regarded as the most successful boosting algorithm. This thesis starts by applying AdaBoost to an associative classifier for both learning-time reduction and accuracy improvement. However, the promise of accuracy improvement means little in the context of the class imbalance problem, where accuracy is a less meaningful measure. The insight gained from a comprehensive analysis of AdaBoost's boosting strategy leads to the investigation of cost-sensitive boosting algorithms, developed by introducing cost items into the learning framework of AdaBoost. The cost items denote the uneven identification importance among classes, so that the boosting strategies can intentionally bias learning towards classes of higher identification importance and eventually improve identification performance on them. For a given application domain, cost values for the different types of samples are usually not available in advance. To set up effective cost values, empirical methods are used for bi-class applications, and a heuristic search based on the genetic algorithm is employed for multi-class applications.
This thesis also covers the implementation of the proposed cost-sensitive boosting algorithms, and ends with a discussion of experimental results on the classification of real-world imbalanced data. Compared with existing algorithms, the new algorithms this thesis presents achieve better measurements with respect to the learning objectives.
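The cost-item idea can be sketched as follows: a minimal, illustrative variant in the spirit of the AdaC2 family, where per-sample costs enter both the weight update and the computation of alpha, so that misclassified high-cost (minority) samples gain the most weight. This is not the thesis's exact algorithm; the decision-stump weak learner, the cost values, and the round count are assumptions for the example.

```python
import numpy as np

def train_stump(X, y, w):
    """Weighted decision stump: exhaustive search over (feature, threshold, polarity)."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best, best_err = (j, t, pol), err
    return best

def stump_predict(X, stump):
    j, t, pol = stump
    return np.where(pol * (X[:, j] - t) >= 0, 1, -1)

def cost_boost_fit(X, y, costs, rounds=10):
    """Cost-sensitive boosting sketch: costs weight both the error used to
    compute alpha and the multiplicative sample-weight update."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        pred = stump_predict(X, stump)
        correct = pred == y
        acc = (w * costs)[correct].sum()
        err = (w * costs)[~correct].sum()
        alpha = 0.5 * np.log((acc + 1e-12) / (err + 1e-12))
        if alpha <= 0:
            break
        # misclassified high-cost samples gain the most weight next round
        w = w * costs * np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def cost_boost_predict(X, ensemble):
    score = sum(a * stump_predict(X, s) for a, s in ensemble)
    return np.where(score >= 0, 1, -1)

# Toy imbalanced problem: 6 majority (-1) vs 2 minority (+1) samples,
# with the minority class given twice the misclassification cost.
X = np.array([[0.], [1.], [2.], [3.], [4.], [5.], [6.], [7.]])
y = np.array([-1, -1, -1, -1, -1, -1, 1, 1])
costs = np.where(y == 1, 2.0, 1.0)
ens = cost_boost_fit(X, y, costs)
print(cost_boost_predict(X, ens))
```

Setting all costs to 1 recovers plain AdaBoost's behavior on this sketch, which makes the role of the cost items easy to isolate experimentally.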
Geospatial Modeling of Forest Density Variables
Purpose and Methods of Study: A methodological framework is presented for the geospatial modeling of forest density using moderate-resolution spectral data for a portion of the northern limit of the intertropical zone: San Luis Potosí, Mexico. Parametric and non-parametric statistical modeling algorithms, spatial analysis, and satellite image processing were employed.
Contributions and Conclusions: Chapter 1 presents a general overview of the methodologies used in reference countries such as Finland, Sweden, Germany, Canada, the United States, and Mexico, and introduces topics related to forest inventories carried out with geospatial analysis tools. Chapter 2 investigates spectral vegetation indices for modeling forest density, concluding that normalized indices sensitive to moisture content, with a nonlinear trend, best model above-ground tree biomass.
Chapter 3 explores statistical methods for estimating fractional cover at the subpixel level and concludes that the most appropriate is linear spectral mixture analysis for two pure classes: forest and shrubland. Chapter 4 compares variants of the non-parametric nearest-neighbor algorithm that relate forest density to spectral and auxiliary variables, with the inherent flexibility of retaining the structure of the reference data in the estimates. The methods employed in this research represent an effort toward pixel-by-pixel modeling of forest density variables in Mexico, especially for large study areas. This work proposes a methodology for modeling above-ground biomass/carbon to meet the information needs of the United Nations (UN) initiative for Reducing Emissions from Deforestation and Forest Degradation (REDD).
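Chapter 2's finding, that normalized, moisture-sensitive indices best track above-ground biomass, can be illustrated with the Normalized Difference Moisture Index (NDMI), computed from near-infrared and shortwave-infrared reflectance. The band values below are illustrative sample pixels, not data from the study.

```python
import numpy as np

def ndmi(nir, swir, eps=1e-9):
    """Normalized Difference Moisture Index: (NIR - SWIR) / (NIR + SWIR).
    A moisture-sensitive normalized index of the kind Chapter 2 finds most
    predictive of above-ground biomass; values lie in [-1, 1]."""
    nir = np.asarray(nir, dtype=float)
    swir = np.asarray(swir, dtype=float)
    return (nir - swir) / (nir + swir + eps)

# Illustrative reflectances for three pixels: dense forest, shrubland, bare soil
nir = np.array([0.45, 0.30, 0.20])
swir = np.array([0.15, 0.22, 0.30])
print(ndmi(nir, swir))  # higher values indicate moister, denser canopies
```

In a pixel-by-pixel workflow, such an index image would then serve as a predictor in the parametric regressions or nearest-neighbor estimators compared in Chapters 2 and 4.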