    Generation of a Land Cover Atlas of environmental critic zones using unconventional tools

    Toward Large Scale Semantic Image Understanding and Retrieval

    Semantic image retrieval is a multifaceted, highly complex problem. Not only does the solution to this problem require advanced image processing and computer vision techniques, but it also requires knowledge beyond what can be inferred from the image content alone. In contrast, traditional image retrieval systems, such as Google image search and Flickr search, are based upon keyword searches on filenames or metadata tags. These conventional systems do not analyze the image content, and their keywords are not guaranteed to represent the image. Thus, there is significant need for a semantic image retrieval system that can analyze and retrieve images based upon the content and relationships that exist in the real world. In this thesis, I present a framework that moves towards advancing semantic image retrieval in large scale datasets. At a conceptual level, semantic image retrieval requires the following steps: viewing an image, understanding the content of the image, indexing the important aspects of the image, connecting the image concepts to the real world, and finally retrieving the images based upon the indexed concepts or related concepts. My proposed framework addresses each of these components in my ultimate goal of improving image retrieval. The first task is the essential one of understanding the content of an image. Unfortunately, the only data a computer algorithm typically uses when analyzing images is the low-level pixel data. To achieve human-level comprehension, however, a machine must overcome the semantic gap, the disparity that exists between the image data and human understanding. This translation of the low-level information into a high-level representation is an extremely difficult problem that requires more than the image pixel information. I describe my solution to this problem through the use of an online knowledge acquisition and storage system. 
This system utilizes the extensible, visual, and interactive properties of Scalable Vector Graphics (SVG) combined with online crowdsourcing tools to collect high-level knowledge about visual content. I further describe the utilization of knowledge and semantic data for image understanding. Specifically, I seek to incorporate into various algorithms knowledge that cannot be inferred from the image pixels alone. This information comes from related images or structured data (in the form of hierarchies and ontologies) and is used to improve the performance of object detection and image segmentation tasks. These understanding tasks are crucial intermediate steps towards retrieval and semantic understanding. However, typical object detection and segmentation tasks require an abundance of training data for machine learning algorithms. This prior training data tells the algorithm which patterns and visual features to look for when processing an image. In contrast, my algorithm utilizes semantically related images to extract the visual properties of an object and also to decrease the search space of my detection algorithm. Furthermore, I demonstrate the use of related images in the image segmentation process. Again, without the use of prior training data, I present a method for foreground object segmentation by finding the shared area that exists in a set of images. I demonstrate the effectiveness of my method on structured image datasets that have defined relationships between classes, i.e. parent-child or sibling classes. Finally, I introduce my framework for semantic image retrieval. I enhance the proposed knowledge acquisition and image understanding techniques with semantic knowledge through linked data and web semantic languages. This is an essential step in semantic image retrieval. 
For example, an image processing algorithm that is not enhanced by external knowledge and classifies an object as a car would have no idea that a car is a type of vehicle, highly related to a truck and less related to other transportation methods such as a train. However, a query for modes of human transportation should return all of the mentioned classes. Thus, I demonstrate how to integrate information from both image processing algorithms and semantic knowledge bases to perform interesting queries that would otherwise be impossible. The key component of this system is a novel property reasoner that is able to translate low-level image features into semantically relevant object properties. I use a combination of XML-based languages such as SVG, RDF, and OWL in order to link to existing ontologies available on the web. My experiments demonstrate an efficient data collection framework and novel utilization of semantic data for image analysis and retrieval on datasets of people and landmarks collected from sources such as IMDB and Flickr. Ultimately, my thesis presents improvements to the state of the art in visual knowledge representation/acquisition and computer vision algorithms such as detection and segmentation toward the goal of enhanced semantic image retrieval.
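
    The car, truck, and train example in this abstract can be illustrated with a minimal sketch. It assumes a hand-written toy taxonomy stored in a plain Python dictionary; the class names, image filenames, and the path-based relatedness score are all hypothetical and are not taken from the thesis, which works with RDF and OWL ontologies rather than a dictionary.

```python
# Hypothetical toy taxonomy (child class -> parent class); not from the thesis.
PARENT = {
    "car": "road vehicle",
    "truck": "road vehicle",
    "road vehicle": "vehicle",
    "train": "rail vehicle",
    "rail vehicle": "vehicle",
    "vehicle": "mode of human transportation",
    "landmark": "place",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in PARENT:
        cls = PARENT[cls]
        chain.append(cls)
    return chain


def is_a(cls, concept):
    """True if cls equals concept or is a (transitive) subclass of it."""
    return cls == concept or concept in ancestors(cls)


def relatedness(a, b):
    """Path-based score: 1 / (1 + edges between a and b via their lowest common ancestor)."""
    path_a = [a] + ancestors(a)
    path_b = [b] + ancestors(b)
    for steps_a, node in enumerate(path_a):
        if node in path_b:
            return 1.0 / (1 + steps_a + path_b.index(node))
    return 0.0  # no common ancestor in the taxonomy


def retrieve(detections, concept):
    """Return the images whose detected class falls under the queried concept."""
    return sorted(img for img, label in detections.items() if is_a(label, concept))


# Hypothetical detector output: image name -> class label.
detections = {
    "img1.jpg": "car",
    "img2.jpg": "truck",
    "img3.jpg": "train",
    "img4.jpg": "landmark",
}

if __name__ == "__main__":
    print(retrieve(detections, "mode of human transportation"))
    print(relatedness("car", "truck"), relatedness("car", "train"))
```

    With the labels alone, a query for "mode of human transportation" matches nothing; walking the hierarchy is what lets the car, truck, and train images be returned together, and the path score ranks a truck as closer to a car than a train is.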

    A Systematic Survey of ML Datasets for Prime CV Research Areas-Media and Metadata

    The ever-growing capabilities of computers have made it possible to pursue Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received little attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view exists of the MLCV dataset tissue, and acquiring one is fundamental to enabling standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas, as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora that comprises the MLCV dataset tissue; their continuous growth in volume and complexity; the specificities of the evolution of their media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a global standardized (structuring, manipulating, and sharing) MLCV "library". Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration.

    Reduction of False Positives in Intrusion Detection Based on Extreme Learning Machine with Situation Awareness

    Protecting computer networks from intrusions is more important than ever for our privacy, economy, and national security. Hardly a month passes without news of a major data breach involving sensitive personal identity, financial, medical, trade secret, or national security data. Democratic processes can now potentially be compromised through breaches of electronic voting systems. As ever more devices, including medical machines, automobiles, and control systems for critical infrastructure, are networked, human life is also more at risk from cyber-attacks. Research into Intrusion Detection Systems (IDSs) began several decades ago, and IDSs are still a mainstay of computer and network protection and continue to evolve. However, detecting previously unseen, or zero-day, threats is still an elusive goal. Many commercial IDS deployments still use misuse detection based on known threat signatures. In academic research, systems utilizing anomaly detection have shown great promise in detecting previously unseen threats, but their success has been limited in large part by the excessive number of false positives that they produce. This research demonstrates that false positives can be reduced further, while maintaining detection accuracy, by combining Extreme Learning Machine (ELM) and Hidden Markov Models (HMM) as classifiers within the context of a situation awareness framework. This research was performed using the University of New South Wales - Network Based 2015 (UNSW-NB15) data set, which is more representative of contemporary cyber-attack and normal network traffic than older data sets typically used in IDS research. It is shown that this approach provides better results than either HMM or ELM alone and achieves a lower False Positive Rate (FPR) than other comparable approaches that also used the UNSW-NB15 data set.
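
    For readers unfamiliar with the metric this abstract centres on, the following is a minimal sketch of how a False Positive Rate is computed from binary predictions, using the standard definition FPR = FP / (FP + TN). The labels below are made-up illustration data, not UNSW-NB15 records, and the function names are hypothetical; the sketch does not reproduce the ELM/HMM classifiers themselves.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = attack, 0 = normal traffic)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1  # normal traffic flagged as an attack
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1  # attack that went undetected
    return tp, fp, tn, fn


def false_positive_rate(fp, tn):
    """Share of normal records that were wrongly flagged: FP / (FP + TN)."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def detection_rate(tp, fn):
    """Share of attack records that were caught: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0]

if __name__ == "__main__":
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    print("FPR:", false_positive_rate(fp, tn))
    print("Detection rate:", detection_rate(tp, fn))
```

    Reporting the detection rate next to the FPR matters because a detector can trivially reach an FPR of zero by never raising an alert, which is why the abstract stresses reducing false positives while maintaining detection accuracy.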

    Unified Implicit and Explicit Feedback for Multi-Application User Interest Modeling

    A user often interacts with multiple applications while working on a task. User models can be developed separately within each application, but there is no easy way to build a more complete user model from the user's activity distributed across them. To address this issue, this research studies the importance of combining various implicit and explicit relevance feedback indicators in a multi-application environment. It allows different applications, used for different purposes by the user, to contribute user activity and its context in order to mutually support users with unified relevance feedback. Using data collected from a web browser, Microsoft Word, Microsoft PowerPoint, Adobe Acrobat Writer, and VKB, combinations of implicit relevance feedback with semi-explicit relevance feedback were analyzed and compared with explicit user ratings. Our past research shows that multi-application interest models based on implicit feedback theoretically outperformed single-application interest models based on implicit feedback. In practice as well, a multi-application interest model based on semi-explicit feedback increased user attention to high-value documents. In the current dissertation study, we have incorporated topic modeling to represent interest in user models for textual content and compared similarity measures for improved recall and precision based on the text content. We also learned the relative value of features from content consumption applications and content production applications. Our experimental results show that incorporating implicit feedback in page-level user interest estimation resulted in significant improvements over the baseline models. Furthermore, incorporating semi-explicit content (e.g. annotated text) with the authored text is effective in identifying segment-level relevant content. 
We have evaluated the effectiveness of the recommendation support from both the semi-explicit model (authored/annotated text) and the unified model (implicit + semi-explicit) and have found that both help users locate content easily, because relevant details are selectively highlighted and documents and passages within documents are recommended based on the user’s indicated interest. Recommendations based on semi-explicit feedback were viewed the same as those from unified feedback, and they outperformed those from unified feedback in terms of matching post-task document assessments.

    Combining visual recognition and computational linguistics : linguistic knowledge for visual recognition and natural language descriptions of visual content

    Extensive efforts are being made to improve visual recognition and semantic understanding of language. However, surprisingly little has been done to exploit the mutual benefits of combining both fields. In this thesis we show how the different fields of research can profit from each other. First, we scale recognition to 200 unseen object classes and show how to extract robust semantic relatedness from linguistic resources. Our novel approach extends zero-shot to few-shot recognition and exploits unlabeled data by adopting label propagation for transfer learning. Second, we capture the high variability but low availability of composite activity videos by extracting the essential information from text descriptions. For this we recorded and annotated a corpus for fine-grained activity recognition. We show improvements in a supervised case, but we are also able to recognize unseen composite activities. Third, we present a corpus of videos and aligned descriptions. We use it for grounding activity descriptions and for learning how to automatically generate natural language descriptions for a video. We show that our proposed approach is also applicable to image description and that it outperforms baselines and related work. In summary, this thesis presents a novel approach for automatic video description and shows the benefits of extracting linguistic knowledge for object and activity recognition as well as the advantage of visual recognition for understanding activity descriptions.

    Learning for text mining : tackling the cost of feature and knowledge engineering.

    Over the last decade, the state of the art in text mining has moved towards the adoption of machine learning as the main paradigm at the heart of approaches. Despite significant advances, machine learning based text mining solutions remain costly to design, develop and maintain for real world problems. An important component of such cost (feature engineering) concerns the effort required to understand which features or characteristics of the data can be successfully exploited in inducing a predictive model of the data. Another important component of the cost (knowledge engineering) has to do with the effort in creating labelled data, and in eliciting knowledge about the mining systems and the data itself. I present a series of approaches, methods and findings aimed at reducing the cost of creating and maintaining document classification and information extraction systems. They address the following questions: Which classes of features lead to an improved classification accuracy in the document classification and entity extraction tasks? How to reduce the number of labelled examples needed to train machine learning based document classification and information extraction systems, so as to relieve domain experts from this costly task? How to effectively represent knowledge about these systems and the data that they manipulate, in order to make systems interoperable and results replicable? I provide the reader with the background information necessary to understand the above questions and the contributions to the state of the art contained herein. 
The contributions include: the identification of novel classes of features for the document classification task which exploit the multimedia nature of documents and lead to improved classification accuracy; a novel approach to domain adaptation for text categorization which outperforms standard supervised and semi-supervised methods while requiring considerably less supervision; and a well-founded formalism for declaratively specifying text and multimedia mining systems.

    Automatic management tool for attribution and monitorization of projects/internships

    In their final academic year, students at ISEP must complete a final project to obtain the degree they are pursuing. ISEP provides a digital platform that lists all the projects students can apply for. Despite its advantages, the platform also has some problems, notably the difficulty of choosing a suitable project given the excessive offering and the lack of filtering mechanisms. In addition, students face further indecision when selecting a supervisor compatible with the chosen project. Once the student has chosen the project and the supervisor, the monitoring phase begins, which has its own issues, such as the use of several different tools, which can lead to communication problems and makes it difficult to maintain a version history of the work done. To address these problems, an in-depth study of recommendation systems applied to Machine Learning and of Learning Management Systems was conducted. For each of these themes, similar systems that could solve the proposed problem were analysed, such as recommendation systems developed in scientific papers, commercial applications, and tools like ChatGPT. 
Through the analysis of the state of the art, it was concluded that the solution to the proposed problems would be the creation of a web application for students and supervisors that combines the two analysed themes. The developed recommendation system uses collaborative filtering with matrix factorization and content-based filtering with cosine similarity. The technologies used in the system are centred around Python on the back end (with TensorFlow and NumPy for Machine Learning functionality) and Svelte on the front end. The system was inspired by a microservices architecture, in which each service is represented by its own Docker container, and it was made available to the public through a public domain name. The system was evaluated on three metrics: performance, reliability, and usability. The Quantitative Evaluation Framework tool was used to define dimensions, factors, and requirements (and their respective scores). The students who tested the solution rated the recommendation system at approximately 7 on a scale of 1 to 10, and the precision, recall, false positive rate, and F-Measure values were evaluated at 0.51, 0.71, 0.23, and 0.59, respectively. Additionally, both groups rated the application as intuitive and easy to use, with ratings around 8 on a scale of 1 to 10.
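
    The content-based part of the recommender described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the system's implementation (which the abstract says uses TensorFlow and NumPy): the feature vectors, project names, and function names are hypothetical. The sketch also checks the reported metrics, assuming the F-Measure is the balanced F1 score, i.e. the harmonic mean of precision and recall.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero profile is similar to nothing
    return dot / (norm_u * norm_v)


def rank_projects(student, projects):
    """Order project names by similarity to the student's profile, best first."""
    return sorted(
        projects,
        key=lambda name: cosine_similarity(student, projects[name]),
        reverse=True,
    )


def f_measure(precision, recall):
    """Balanced F1 score (assumed to be the abstract's F-Measure)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical topic features: [machine learning, web, embedded, databases].
student = [1, 1, 0, 0]
projects = {
    "recommender": [1, 1, 0, 0],
    "iot-sensor": [0, 0, 1, 0],
    "web-shop": [0, 1, 0, 1],
}

if __name__ == "__main__":
    print(rank_projects(student, projects))
    # Precision 0.51 and recall 0.71 are the values reported in the abstract.
    print(round(f_measure(0.51, 0.71), 2))
```

    Under the F1 assumption, the reported precision of 0.51 and recall of 0.71 give 2 × 0.51 × 0.71 / 1.22 ≈ 0.594, which is consistent with the reported F-Measure of 0.59.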