76 research outputs found

    Using Taxonomy Tree to Generalize a Fuzzy Thematic Cluster

    D.F. and B.M. acknowledge continuing support by the Academic Fund Program at the National Research University Higher School of Economics (grant 19-04-019 in 2018-2019) and by the International Decision Choice and Analysis Laboratory (DECAN) NRU HSE, in the framework of a subsidy granted to the HSE by the Government of the Russian Federation for the implementation of the Russian Academic Excellence Project “5-100”. S.N. acknowledges the support by FCT/MCTES, NOVA LINCS (UID/CEC/04516/2019).
    This paper presents an algorithm, ParGenFS, for generalizing, or 'lifting', a fuzzy set of topics to higher ranks of a hierarchical taxonomy of a research domain. ParGenFS finds a globally optimal generalization of the topic set by minimizing a penalty function that balances the number of introduced 'head subjects' against the related errors, the 'gaps' and 'offshoots', each weighted differently. This leads to a generalization of the topic set in the taxonomy. The usefulness of the method is illustrated on a set of 17685 abstracts of research papers on Data Science published in Springer journals over the past 20 years. We extracted a taxonomy of Data Science from the international Association for Computing Machinery Computing Classification System 2012 (ACM-CCS). We find fuzzy clusters of leaf topics over the text collection, lift them in the taxonomy, and interpret the head subjects found to comment on the tendencies of current research.

    Information Extraction and Modeling from Remote Sensing Images: Application to the Enhancement of Digital Elevation Models

    To deal with high-complexity data such as remote sensing images that offer metric resolution over large areas, an innovative, fast and robust image processing system is presented. Increasing levels of information are modeled to extract, represent and link image features to semantic content. The potential of the proposed techniques is demonstrated with an application that enhances and regularizes digital elevation models using information collected from RS images.

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging at the intersection of technologies for effective visual features and the study of the human-brain cognition process. Effective visual features are made possible by rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures. Meanwhile, a better understanding of the human-brain cognition process broadens the ways in which a computer can perform pattern recognition tasks. The present book is intended to collect representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition.

    Remote Sensing

    This dual conception of remote sensing led us to the idea of preparing two different books; in addition to the first book, which displays recent advances in remote sensing applications, this book is devoted to new techniques for data processing, sensors and platforms. We do not intend this book to cover all aspects of remote sensing techniques and platforms, since that would be an impossible task for a single volume. Instead, we have collected a number of high-quality, original and representative contributions in those areas.

    Semi-supervised and unsupervised kernel-based novelty detection with application to remote sensing images

    The main challenge of new information technologies is to retrieve intelligible information from the large volume of digital data gathered every day. Among the variety of existing data sources, the satellites continuously observing the surface of the Earth are key to the monitoring of our environment. The new generation of satellite sensors is tremendously increasing the range of possible applications, but also the need for efficient processing methodologies that extract information relevant to the users' needs in an automatic or semi-automatic way. This is where machine learning comes into play: it transforms complex data into simplified products, such as maps of land-cover changes or classes, by learning from data examples annotated by experts. These annotations, also called labels, may be difficult or costly to obtain since they are established on the basis of ground surveys. As an example, it is extremely difficult to access a region recently flooded or affected by wildfires. In these situations, changes have to be detected with annotations from unaffected regions only. Similarly, it is difficult to have information on all the land-cover classes present in an image when only a single class is of interest. These challenging situations are called novelty detection or one-class classification in machine learning. In these situations, the learning phase has to rely on a very limited set of annotations, but can exploit the large set of unlabeled pixels available in the images. This setting, called semi-supervised learning, can significantly improve detection. In this thesis we address the development of methods for novelty detection and one-class classification with few or no labeled examples.
The proposed methodologies build upon kernel methods, which provide a principled but flexible framework for learning from data with potentially non-linear feature relations. The thesis is divided into two parts, each making a different assumption about the data structure and both addressing unsupervised (automatic) and semi-supervised (semi-automatic) learning settings. The first part assumes the data to be formed by arbitrarily shaped and overlapping clusters and studies the use of kernel machines, such as Support Vector Machines or Gaussian Processes. Emphasis is put on robustness to noise and outliers and on the automatic retrieval of parameters. Experiments on multi-temporal multispectral images for change detection are carried out using only information from unchanged regions or none at all. The second part assumes high-dimensional data to lie on multiple low-dimensional structures, called manifolds. We propose a method seeking a sparse and low-rank representation of the data mapped into a non-linear feature space. This representation allows us to build a graph, which is cut into several groups using spectral clustering. For the semi-supervised case, where a few labels of one class of interest are available, we study several approaches incorporating the graph information. The class labels can be propagated on the graph, used to constrain spectral clustering, or used to train a one-class classifier regularized by the given graph. Experiments on the unsupervised and one-class classification of hyperspectral images demonstrate the effectiveness of the proposed approaches.

    Cross-System Personalization (Cross-systems Personalisierung)

    The World Wide Web provides access to a wealth of information and services to a huge and heterogeneous user population on a global scale. One important and successful design mechanism in dealing with this diversity of users is to personalize Web sites and services, i.e. to customize system content, characteristics, or appearance with respect to a specific user. Each system independently builds up user profiles and uses this information to personalize the service offering. Such isolated approaches have two major drawbacks: firstly, investments of users in personalizing a system, either through explicit provision of information or through long and regular use, are not transferable to other systems. Secondly, users have little or no control over the information that defines their profile, since user data are deeply buried in personalization engines running on the server side. Cross system personalization (CSP) (Mehta, Niederee, & Stewart, 2005) allows for sharing information across different information systems in a user-centric way and can overcome the aforementioned problems. Information about users, which is originally scattered across multiple systems, is combined to obtain maximum leverage and reuse of information. Our initial approaches to cross system personalization relied on each user having a unified profile which different systems can understand. The unified profile contains facets modeling aspects of a multidimensional user and is stored inside a "Context Passport" that the user carries along in his/her journey across information space. The user's Context Passport is presented to a system, which can then understand the context in which the user wants to use the system. The basis of 'understanding' in this approach is of a semantic nature, i.e. the semantics of the facets and dimensions of the unified profile are known, so that the latter can be aligned with the profiles maintained internally at a specific site.
The results of the personalization process are then transferred back to the user's Context Passport via a protocol understood by both parties. The main challenge in this approach is to establish some common and globally accepted vocabulary and to create a standard every system will comply with. Machine Learning techniques provide an alternative approach to enable CSP without the need for accepted semantic standards or ontologies. The key idea is that one can try to learn dependencies between profiles maintained within one system and profiles maintained within a second system, based on data provided by users who use both systems and who are willing to share their profiles across systems, which we assume is in the interest of the user. Here, instead of requiring a common semantic framework, it is only required that a sufficient number of users cross between systems and that there is enough regularity among users that one can learn within a user population, a fact that is commonly exploited in collaborative filtering. In this thesis, we aim to provide a principled approach towards achieving cross system personalization. We describe both semantic and learning approaches, with a stronger emphasis on the learning approach. We also investigate the privacy and scalability aspects of CSP and provide solutions to these problems. Finally, we also explore in detail the aspect of robustness in recommender systems. We motivate several approaches for robustifying collaborative filtering and provide the best performing algorithm for detecting malicious attacks reported so far.
The personalization of software systems is of steadily growing importance, particularly in connection with Web applications such as search engines, community portals, or electronic commerce sites that address large, highly diverse user groups. Because explicit personalization typically demands considerable time from the user, many applications fall back on implicit techniques for automatic personalization, in particular recommender systems, which typically use methods such as collaborative or social filtering. While these methods do not require user profiles to be created explicitly by answering questions and giving explicit feedback, the quality of implicit personalization depends strongly on the volume of available data, such as transaction, query, or click logs. If little is known about a user in this sense, no reliable personal adaptations or recommendations can be made. The submitted dissertation addresses the question of how personalization across system boundaries ("cross system") can be enabled and supported, discussing mainly implicit personalization techniques but also, to a limited extent, explicit methods such as the semantic Context Passport. The dissertation thus addresses an important research question of high practical relevance, one that the recent scientific literature on the topic has resolved only incompletely and unsatisfactorily. Automatic recommender systems using social filtering techniques have been popular since about the mid-1990s, with the arrival of the first e-commerce wave, in particular through projects such as Information Tapestry, GroupLens and Firefly. In the late 1990s and at the beginning of this decade, the research literature then focused mainly on improved statistical methods and advanced inference methodologies that map implicit observations onto concrete adaptation or recommendation actions.
In recent years, attention has shifted above all to the question of how personalization systems can be better adapted to the practical requirements of particular applications, which mainly involves suitably adapting and extending existing techniques. In this context, the submitted work ...

    Object recognition using fractal geometry and fuzzy logic.

    Generation of a Land Cover Atlas of environmental critic zones using unconventional tools

    The abstract is in the attachment.

    Representation Learning for Natural Language Processing

    This open access book provides an overview of recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.