
    Green inter-cluster interference management in uplink of multi-cell processing systems

    This paper examines the uplink of cellular systems employing base station cooperation for joint signal processing. We consider clustered cooperation and investigate effective techniques for managing inter-cluster interference to improve users' performance in terms of both spectral and energy efficiency. We use information-theoretic analysis to establish general closed-form expressions for the system achievable sum rate and the users' bit-per-Joule capacity while adopting a realistic user device power consumption model. Two main inter-cluster interference management approaches are identified and studied: 1) spectrum re-use; and 2) users' power control. For the former, we show that isolating clusters by orthogonal resource allocation is the best strategy. For the latter, we introduce a mathematically tractable user power control scheme and observe that a green opportunistic transmission strategy can significantly reduce the adverse effects of inter-cluster interference while exploiting the benefits of cooperation. To compare the different approaches in the context of real-world systems and evaluate the effect of key design parameters on the users' energy-spectral efficiency relationship, we fit the analytical expressions to a practical macrocell scenario. Our results demonstrate that significant improvements in both energy and spectral efficiency can be achieved by energy-aware interference management.
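As a rough numeric sketch of the energy-spectral efficiency tradeoff discussed above (not the paper's closed-form expressions; the power-model parameters below are illustrative assumptions):

```python
import math

def spectral_efficiency(sinr_linear):
    # Shannon bound, bits/s/Hz
    return math.log2(1.0 + sinr_linear)

def energy_efficiency(bandwidth_hz, sinr_linear, p_tx_w,
                      pa_efficiency=0.35, p_circuit_w=0.1):
    # Bit-per-Joule capacity under a simple user-device power model:
    # consumed power = PA drain (p_tx / efficiency) + circuit power.
    # All parameter values here are illustrative assumptions.
    rate_bps = bandwidth_hz * spectral_efficiency(sinr_linear)
    p_total_w = p_tx_w / pa_efficiency + p_circuit_w
    return rate_bps / p_total_w  # bits per Joule

# Raising transmit power raises the rate but can lower bits-per-Joule,
# which is the tension green power control has to manage:
ee_low = energy_efficiency(10e6, sinr_linear=10.0, p_tx_w=0.1)
ee_high = energy_efficiency(10e6, sinr_linear=100.0, p_tx_w=1.0)
```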

    CROSS-LAYER CUSTOMIZATION FOR LOW POWER AND HIGH PERFORMANCE EMBEDDED MULTI-CORE PROCESSORS

    Due to physical limitations and design difficulties, computer processor architecture has shifted to multi-core and even many-core approaches in recent years. Such architectures offer the potential for sustainable performance scaling into future peta-scale/exa-scale computing platforms at an affordable power budget, design complexity, and verification effort. To date, multi-core processors have been replacing uni-core processors in almost every market segment, including embedded systems, general-purpose desktops and laptops, and supercomputers. However, many issues remain with multi-core processor architectures that must be addressed before their potential can be fully realized. Researchers in both academia and industry are still seeking proper ways to make efficient and effective use of these processors. The issues involve hardware architecture trade-offs, system software services, run-time management, and user application design, all of which demand further research. Given the architectural specialties of multi-core computers, this work proposes a Cross-Layer Customization framework that combines application-specific information and system platform features, along with the necessary operating system support, to achieve exceptional power and performance efficiency on targeted multi-core platforms. Several topics are covered with specific optimization goals, including the snoop cache coherence protocol, inter-core communication for producer-consumer applications, synchronization mechanisms, and off-chip memory bandwidth limitations. Benchmark program execution with conventional mechanisms is analyzed to reveal the overheads in terms of power and performance. Specific customizations are proposed to eliminate these overheads with support from hardware, system software, the compiler, and user applications.
Experiments show significant improvements in system performance and power efficiency.
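As a toy illustration of the producer-consumer inter-core communication pattern mentioned above, here is a minimal sketch using a bounded buffer; the small queue size standing in for a hardware FIFO is an assumption for illustration:

```python
import threading
import queue

def producer(q, items):
    for x in items:
        q.put(x)           # blocks when the bounded buffer is full
    q.put(None)            # sentinel: end of stream

def consumer(q, out):
    while True:
        x = q.get()
        if x is None:
            break
        out.append(x * x)  # stand-in for per-item work

buf = queue.Queue(maxsize=4)  # small bound mimics a hardware FIFO
results = []
t_p = threading.Thread(target=producer, args=(buf, range(8)))
t_c = threading.Thread(target=consumer, args=(buf, results))
t_p.start(); t_c.start()
t_p.join(); t_c.join()
```

Conventional software queues like this one incur locking and cache-coherence traffic on every `put`/`get`, which is exactly the kind of overhead hardware-supported customization aims to eliminate.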

    Adaptive heterogeneous parallelism for semi-empirical lattice dynamics in computational materials science

    With the variability in performance of the multitude of parallel environments available today, the conceptual overhead created by the need to anticipate runtime information to make design-time decisions has become overwhelming. Performance-critical applications and libraries carry implicit assumptions based on incidental metrics that are not portable to emerging computational platforms or even alternative contemporary architectures. Furthermore, the significance of runtime concerns such as makespan, energy efficiency and fault tolerance depends on the situational context. This thesis presents a case study in the application of both Mattson's prescriptive pattern-oriented approach and the more principled structured parallelism formalism to the computational simulation of inelastic neutron scattering spectra on hybrid CPU/GPU platforms. The original ad hoc implementation as well as new pattern-based and structured implementations are evaluated for relative performance and scalability. Two new structural abstractions are introduced to facilitate adaptation by lazy optimisation and runtime feedback. A deferred-choice abstraction represents a unified space of alternative structural program variants, allowing static adaptation through model-specific exhaustive calibration with regard to the extrafunctional concerns of runtime, average instantaneous power and total energy usage. Instrumented queues serve as a mechanism for structural composition and provide a representation of extrafunctional state that allows realisation of a market-based decentralised coordination heuristic for competitive resource allocation and the Lyapunov drift algorithm for cooperative scheduling.
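A minimal sketch of what a deferred-choice abstraction might look like, assuming calibration by wall-clock timing (the thesis also considers power and energy as extrafunctional concerns); the variant functions here are hypothetical stand-ins:

```python
import time

def variant_listcomp(xs):
    return [x * x for x in xs]

def variant_map(xs):
    return list(map(lambda x: x * x, xs))

class DeferredChoice:
    """A unified space of structural program variants; the binding is
    deferred until calibration picks the cheapest variant observed on
    the target platform. Cost here is runtime, but it could equally be
    average power or total energy under instrumentation."""
    def __init__(self, variants):
        self.variants = variants
        self.chosen = None

    def calibrate(self, sample):
        best, best_cost = None, float("inf")
        for v in self.variants:
            t0 = time.perf_counter()
            v(sample)
            cost = time.perf_counter() - t0
            if cost < best_cost:
                best, best_cost = v, cost
        self.chosen = best

    def __call__(self, xs):
        if self.chosen is None:
            self.calibrate(xs)   # lazy optimisation on first use
        return self.chosen(xs)

square_all = DeferredChoice([variant_listcomp, variant_map])
```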

    Bandwidth-aware distributed ad-hoc grids in deployed wireless sensor networks

    Nowadays, cost effective sensor networks can be deployed as a result of a plethora of recent engineering advances in wireless technology, storage miniaturisation, consolidated microprocessor design, and sensing technologies. Whilst sensor systems are becoming relatively cheap to deploy, two issues arise in their typical realisations: (i) the types of low-cost sensors often employed are capable of limited resolution and tend to produce noisy data; (ii) network bandwidths are relatively low and the energetic costs of using the radio to communicate are relatively high. To reduce the transmission of unnecessary data, there is a strong argument for performing local computation. However, this can require greater computational capacity than is available on a single low-power processor. Traditionally, such a problem has been addressed by using load balancing: fragmenting processes into tasks and distributing them amongst the least loaded nodes. However, the act of distributing tasks, and any subsequent communication between them, imposes a geographically defined load on the network. Because of the shared broadcast nature of the radio channels and MAC layers in common use, any communication within an area will be slowed by additional traffic, delaying the computation and reporting that relied on the availability of the network. In this dissertation, we explore the tradeoff between the distribution of computation, needed to enhance the computational abilities of networks of resource-constrained nodes, and the creation of network traffic that results from that distribution. We devise an application-independent distribution paradigm and a set of load distribution algorithms to allow computationally intensive applications to be collaboratively computed on resource-constrained devices. Then, we empirically investigate the effects of network traffic information on the distribution performance. 
We thus devise bandwidth-aware task-offload mechanisms that combine nodes' computational capabilities with local network conditions, and we investigate the impact of making informed offload decisions on system performance. The highly deployment-specific nature of radio communication means that simulations capable of producing validated, high-quality results are extremely hard to construct. Consequently, to produce meaningful results, our experiments have used empirical analysis based on a network of motes located at UCL, running a variety of I/O-bound, CPU-bound and mixed tasks. Using this setup, we have established that even relatively simple load sharing algorithms can improve performance over a range of different artificially generated scenarios, with more or less timely contextual information. In addition, we have taken a realistic application, based on location estimation, and implemented it across the same network, with results that support the conclusions drawn from the artificially generated traffic.
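A minimal sketch of a bandwidth-aware offload decision of the kind described above; the cost model (transfer time plus remote compute time versus local compute time) and all parameter names are simplifying assumptions, not the dissertation's algorithms:

```python
def offload_decision(task_cycles, payload_bits, local_hz, remote_hz, link_bps):
    """Offload only when shipping the task over the (possibly congested)
    radio link and computing remotely beats computing locally. Both cost
    terms are deliberately simplified: real decisions would also account
    for MAC contention from neighbouring traffic and radio energy."""
    t_local = task_cycles / local_hz
    t_remote = payload_bits / link_bps + task_cycles / remote_hz
    return "offload" if t_remote < t_local else "local"

# A fast link favours offloading; a congested link keeps work local:
fast = offload_decision(1e9, 1e6, local_hz=1e8, remote_hz=1e9, link_bps=1e6)
slow = offload_decision(1e9, 1e6, local_hz=1e8, remote_hz=1e9, link_bps=1e4)
```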

    System Support For Stream Processing In Collaborative Cloud-Edge Environment

    Stream processing is a critical technique for processing huge amounts of data in real time. Cloud computing has been used for stream processing due to its virtually unlimited computation resources. At the same time, we are entering the era of the Internet of Everything (IoE). Emerging edge computing benefits low-latency applications by leveraging computation resources in the proximity of data sources. Billions of sensors and actuators are being deployed worldwide, and the huge amounts of data they generate are immersed in our daily life. It has become essential for organizations to be able to stream and analyze data and to provide low-latency analytics on streaming data. However, cloud computing is inefficient for processing all data in a centralized environment in terms of network bandwidth cost and response latency. Although edge computing offloads computation from the cloud to the edge of the Internet, there is no data sharing and processing framework that efficiently utilizes computation resources in both the cloud and the edge. Furthermore, the heterogeneity of edge devices makes the development of collaborative cloud-edge applications even more difficult. To explore and attack the challenges of stream processing in the collaborative cloud-edge environment, in this dissertation we design and develop a series of systems to support stream processing applications in hybrid cloud-edge analytics. Specifically, we develop a hierarchical and hybrid outlier detection model for multivariate time series streams that automatically selects the best model for each time series. We optimize one of the stream processing systems (Spark Streaming) to reduce end-to-end latency. To facilitate the development of collaborative cloud-edge applications, we propose and implement a new computing framework, Firework, that allows stakeholders to share and process data by leveraging both the cloud and the edge.
A vision-based cloud-edge application is implemented to demonstrate the capabilities of Firework. By combining all these studies, we provide comprehensive system support for stream processing in collaborative cloud-edge environments.
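As a stand-in for the outlier detection component described above (not the dissertation's hierarchical, model-selecting detector), a simple streaming z-score detector using Welford's online variance update:

```python
import math

class StreamingOutlierDetector:
    """Flags points far from a running mean (z-score rule). Suitable for
    the edge because it keeps O(1) state per stream; threshold is an
    illustrative assumption."""
    def __init__(self, threshold=3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)
        self.threshold = threshold

    def update(self, x):
        is_outlier = False
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) / std > self.threshold:
                is_outlier = True
        # Welford's numerically stable online update
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_outlier

det = StreamingOutlierDetector()
flags = [det.update(v) for v in [10, 11, 10, 9, 10, 11, 10, 100]]
```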

    Symmetry-Adapted Machine Learning for Information Security

    Symmetry-adapted machine learning has shown encouraging ability to mitigate security risks in information and communication technology (ICT) systems. It is a subset of artificial intelligence (AI) that relies on the principle of predicting future events by learning from past events or historical data. The autonomous nature of symmetry-adapted machine learning supports effective data processing and analysis for security detection in ICT systems without the intervention of human authorities. Many industries are developing machine-learning-adapted solutions to support security for smart hardware, distributed computing, and the cloud. In our Special Issue book, we focus on the deployment of symmetry-adapted machine learning for information security in various application areas. This security approach can support effective methods for handling the dynamic nature of security attacks through the extraction and analysis of data to identify hidden patterns. The main topics of this Issue include malware classification, intrusion detection systems, image watermarking, color image watermarking, a battlefield target aggregation behavior recognition model, IP cameras, Internet of Things (IoT) security, service function chains, indoor positioning systems, and cryptanalysis.

    Design of static intercell interference coordination schemes for realistic LTE-based cellular networks

    Today, 3.5G and 4G systems, including Long Term Evolution (LTE) and LTE-Advanced (LTE-A), support packet-based services and provide mobile broadband access for bandwidth-hungry applications. In this context of fast evolution, new and challenging technical issues must be effectively addressed. The final target is to achieve a significant step forward in the improvement of the Quality of Experience (QoE). To that end, interference management has been recognized by the industry as a key enabler for cellular technologies based on OFDMA. Indeed, with a low frequency reuse factor, intercell interference (ICI) becomes a major concern since the Quality of Service (QoS) is not uniformly delivered across the network; it depends strongly on user position. Hence, cell edge performance is an important issue in LTE and LTE-A. Intercell Interference Coordination (ICIC) encompasses strategies whose goal is to keep ICI at cell edges as low as possible, alleviating the aforementioned situation. For this reason, the novelties presented in this Ph.D. thesis include not only developments of static ICIC mechanisms for data and control channels, but also efforts towards further improvements in energy efficiency. Based on a comprehensive review of the state of the art, a set of research opportunities was identified: in particular, the need for flexible performance evaluation methods and optimization frameworks for static ICIC strategies. These mechanisms are grouped into two families: schemes that define constraints in the frequency domain, and schemes that adjust power levels. Soft and Fractional Frequency Reuse (SFR and FFR, respectively) are thus identified as the base of the vast majority of static ICIC proposals. Consequently, during the first part of this Ph.D. thesis, interesting insights into the operation of SFR and FFR were identified beyond well-known facts.
These studies allow for the development of a novel statistical framework to evaluate the performance of these schemes in realistic deployments. The analysis reveals the poor performance of classic configurations of SFR and FFR in real-world contexts, and hence establishes the need for optimization. In addition, the importance of the interworking between static ICIC schemes and other network functionalities, such as CSI feedback, has also been identified. Therefore, novel CSI feedback schemes, suitable for operating in conjunction with SFR and FFR, have been developed. These mechanisms exploit the resource allocation pattern of these static ICIC techniques to improve the accuracy of the CSI feedback process. The second part focuses on the optimization of SFR and FFR. The use of multiobjective techniques is investigated as a tool for effective network-specific optimization. The approach offers interesting advantages. On the one hand, it allows simultaneous optimization of several conflicting criteria. On the other hand, its multiobjective nature yields outputs composed of several high-quality (Pareto-efficient) network configurations, all of them featuring a near-optimal tradeoff between the performance criteria. Multiobjective evolutionary algorithms allow employing complex mathematical structures without the need for relaxation, thus accurately capturing the system behavior in terms of ICI. The multiobjective optimization formulation of the problem aims at effective adjustment of the operational parameters of SFR and FFR both at cell level and network-wide. Moreover, the research was successfully extended to the control channels, both the PDCCH and the ePDCCH. Finally, in an effort to further improve network energy efficiency (an aspect considered throughout the thesis), the framework of Cell Switch Off (CSO), which has close connections with ICIC, is also introduced.
By means of the proposed method, significant improvements with respect to traditional approaches, baseline configurations, and previous proposals can be achieved. The gains are obtained in terms of energy consumption, network capacity, and cell edge performance.
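To make the SFR idea concrete, a minimal sketch of a static reuse-3 soft frequency reuse power mask; the subband count and power offsets are illustrative assumptions, not configurations from the thesis:

```python
def sfr_power_mask(cell_id, n_subbands=12, boost_db=3.0):
    """Soft Frequency Reuse: each cell boosts one third of the band
    (serving its cell-edge users) and de-emphasizes the remaining
    subbands (serving cell-centre users). With a reuse-3 pattern,
    neighbouring cells boost disjoint thirds, so their cell-edge
    transmissions avoid colliding."""
    edge = cell_id % 3            # which third this cell boosts
    third = n_subbands // 3
    mask = []
    for sb in range(n_subbands):
        if sb // third == edge:
            mask.append(boost_db)   # cell-edge subband: boosted power
        else:
            mask.append(-boost_db)  # cell-centre subband: reduced power
    return mask

# Neighbouring cells 0 and 1 boost disjoint thirds of the band:
m0, m1 = sfr_power_mask(0), sfr_power_mask(1)
```

Static ICIC optimization, as studied in the thesis, then amounts to tuning parameters like these power offsets and subband splits per cell or network-wide.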

    Aspects of knowledge mining on minimizing drive tests in self-organizing cellular networks

    The demand for mobile data traffic is about to explode, and this drives operators to find ways to further increase the offered capacity in their networks. If networks are deployed in the traditional way, this traffic explosion will be addressed by significantly increasing the number of network elements, which is expected to increase the costs and complexity of planning, operating and optimizing the networks. To ensure effective and cost-efficient operations, a higher degree of automation and self-organization is needed in next generation networks. For this reason, the concept of self-organizing networks was introduced in LTE, covering a multitude of use cases, specifically in the areas of self-configuration, self-optimization and self-healing of networks. From an operator's perspective, automated collection and analysis of field measurements, complementing the traditional drive test campaigns, is one of the top use cases that can provide significant cost savings in self-organizing networks. This thesis studies the Minimization of Drive Tests in self-organizing cellular networks from three different aspects. The first aspect is network operations, particularly the network fault management process, as the traditional drive tests are often conducted for troubleshooting purposes. The second aspect is network functionality, particularly the technical details of the specified measurement and signaling procedures in the different network elements that are needed to automate the collection of field measurement data. The third aspect concerns the analysis of the measurement databases, a process used to increase the degree of automation and self-awareness in the networks, and particularly the mathematical means for autonomously finding meaningful patterns of knowledge in huge amounts of data.
Although the above-mentioned technical areas have been widely discussed in the literature, they have been treated separately, and only a few papers discuss how, for example, knowledge mining can be employed to process field measurement data in a way that minimizes drive tests in self-organizing LTE networks. The objective of the thesis is to use well-known knowledge mining principles to develop novel self-healing and self-optimization algorithms. These algorithms analyze MDT databases to detect coverage holes, sleeping cells and other geographical areas of anomalous network behavior. The results of the research suggest that by employing knowledge mining to process the MDT databases, one can acquire knowledge for discriminating between different network problems and detecting anomalous network behavior. For example, downlink coverage optimization is enhanced by classifying RLF reports into coverage, interference and handover problems. Moreover, by incorporating a normalized power headroom report with the MDT reports, better discrimination between uplink coverage problems and parameterization problems is obtained. Knowledge mining is also used to detect sleeping cells by means of supervised and unsupervised learning. The detection framework is based on a novel approach in which diffusion mapping is used to learn network behavior in its healthy state. Sleeping cells are detected by observing an increase in the number of anomalous reports associated with a certain cell; the association is formed by correlating the geographical location of anomalous reports with the estimated dominance areas of the cells. Moreover, RF fingerprint positioning of the MDT reports is studied, and the results suggest that RF fingerprinting can provide quite detailed location estimates in dense heterogeneous networks.
In addition, self-optimization of the mobility state estimation parameters is studied in heterogeneous LTE networks, and the results suggest that by gathering MDT measurements and constructing statistical velocity profiles, MSE parameters can be adjusted autonomously, resulting in reasonably good classification accuracy. The overall outcome of the thesis is as follows. By automating the classification of measurement reports between certain problems, network engineers can acquire knowledge about the root causes of performance degradation in the networks. This saves time and resources and results in faster decision making, which in turn shortens network breaks and improves network quality. By taking the geographical locations of the anomalous field measurements into account in the network performance analysis, finer granularity in estimating the location of problem areas can be achieved. This can further improve the operational decision making that guides the corresponding actions, for example where to start the network optimization. Moreover, by automating the time- and resource-consuming task of tuning the mobility state estimation parameters, operators can enhance the mobility performance of high-velocity UEs in heterogeneous radio networks in a cost-efficient and backward-compatible manner.
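As an illustration of the RLF-report classification idea described above, a rule-of-thumb triage function; the thresholds and the use of only three features are assumptions for illustration, not values or methods from the thesis:

```python
def classify_rlf_report(rsrp_dbm, rsrq_db, neighbour_rsrp_dbm):
    """Triage a radio-link-failure (RLF) report into coverage,
    interference, or handover problems. Thresholds are illustrative
    assumptions; a real classifier would be learned from MDT data."""
    if rsrp_dbm < -110 and neighbour_rsrp_dbm < -110:
        return "coverage"       # no cell is strong: likely a coverage hole
    if neighbour_rsrp_dbm > rsrp_dbm + 3:
        return "handover"       # a clearly stronger neighbour existed:
                                # likely a late or failed handover
    if rsrq_db < -15:
        return "interference"   # adequate signal but poor quality
    return "unclassified"
```

Aggregating such labels per cell (and per geographical bin) is what lets an operator locate problem areas without dedicated drive tests.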

    Discriminative dimensionality reduction: variations, applications, interpretations

    Schulz A. Discriminative dimensionality reduction: variations, applications, interpretations. Bielefeld: Universität Bielefeld; 2017. The amount of digital data increases rapidly as a result of advances in information and sensor technology. Because data sets grow in size, complexity and dimensionality, they are no longer easily accessible to a human user. The framework of dimensionality reduction addresses this problem by aiming to visualize complex data sets in two dimensions while preserving the relevant structure. While these methods can provide significant insights, the problem formulation of structure preservation is ill-posed in general and can lead to undesired effects. In this thesis, the concept of discriminative dimensionality reduction is investigated as a particularly promising way to indicate relevant structure by specifying auxiliary data. The goal is to overcome challenges in data inspection and to investigate to what extent discriminative dimensionality reduction methods can yield an improvement. The main scientific contributions are the following: (I) The most popular techniques for discriminative dimensionality reduction are based on the Fisher metric. However, their applicability in complex settings is restricted: they can only be employed for fixed data sets, i.e., new data cannot be included in an existing embedding; only data provided in vectorial representation can be processed; and they are designed for discrete-valued auxiliary data and cannot be applied to real-valued auxiliary data. We propose solutions to overcome these challenges. (II) Besides the problem that complex data are not accessible to humans, the same holds for trained machine learning models, which often constitute black boxes. In order to provide an intuitive interface to such models, we propose a general framework that allows high-dimensional functions, such as regression or classification functions, to be visualized in two dimensions.
(III) Although nonlinear dimensionality reduction techniques illustrate the structure of the data very well, they suffer from the fact that there is no explicit relationship between the original features and the obtained projection. We propose a methodology to create such a connection, thus allowing the importance of the features to be understood. (IV) Although linear mappings constitute a very popular tool, a direct interpretation of their weights as feature relevance can be misleading. We propose a methodology that enables a valid interpretation by providing relevance bounds for each feature. (V) The problem of transfer learning without given correspondence information between the source and target space and without labels is particularly challenging. Here, we utilize the structure-preserving property of dimensionality reduction methods to transfer knowledge in a latent space given by dimensionality reduction.
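To sketch the intuition behind the Fisher metric used in discriminative dimensionality reduction, here is a crude per-feature discriminative weighting (between-class over within-class variance); this diagonal stand-in is an illustrative assumption, not the thesis's method:

```python
def fisher_feature_weights(X, y):
    """Per-feature discriminative weights: between-class variance over
    within-class variance. Features that separate the classes receive
    large weights; pure-noise features receive weights near zero. A
    full Fisher metric would instead rescale distances locally and
    anisotropically."""
    n_features = len(X[0])
    overall = [sum(row[j] for row in X) / len(X) for j in range(n_features)]
    weights = []
    for j in range(n_features):
        between, within = 0.0, 0.0
        for c in sorted(set(y)):
            col = [row[j] for row, label in zip(X, y) if label == c]
            mu = sum(col) / len(col)
            between += len(col) * (mu - overall[j]) ** 2
            within += sum((v - mu) ** 2 for v in col)
        weights.append(between / (within + 1e-12))  # guard zero within-class
    return weights

# Feature 0 separates the two classes; feature 1 is pure noise:
X = [[0.0, 5.0], [0.1, -5.0], [5.0, 5.1], [5.1, -5.1]]
y = [0, 0, 1, 1]
w = fisher_feature_weights(X, y)
```

Weighting distances with `w` before running a standard embedding would make the visualization emphasize the structure the labels indicate as relevant.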

    The 4th Conference of PhD Students in Computer Science
