745 research outputs found

    Probabilistic framework for image understanding applications using Bayesian Networks

    Machine learning algorithms have been successfully utilized in various systems and devices. They can improve the usability and quality of such systems in terms of intelligent user interfaces, fast performance, and, more importantly, high accuracy. In this research, machine learning techniques are applied to image understanding, a research area shared by image analysis and computer vision that involves higher-level processing of a target image to make sense of the scene captured in it. We discuss a general probabilistic framework for image understanding that covers (i) collecting images to build a comprehensive and valid database, (ii) generating an unbiased ground truth for that database, (iii) selecting classification features and eliminating redundant ones, and (iv) using this information to test a new sample set. Two research projects have been developed as examples of the general image understanding framework: identification of region(s) of interest, and image segmentation evaluation. These techniques, in addition to others, are combined in an object-oriented rendering system for printing applications. This doctoral dissertation explores the means of developing such a system from an image understanding and processing perspective. It is worth noting that this work does not aim to develop a printing system; it only proposes adding some essential features to current printing pipelines to achieve better visual quality when printing images and photos. Hence, we assume that image regions have been successfully extracted from the printed document. These images are used as input to the proposed object-oriented rendering algorithm, which employs methodologies for color image segmentation, region-of-interest identification, and semantic feature extraction.
Probabilistic approaches based on Bayesian statistics have been utilized to develop the proposed image understanding techniques.

    Drawing graphs for cartographic applications

    Graph drawing is a relatively young area that combines elements of graph theory, algorithms, (computational) geometry and (computational) topology. Research in this field concentrates on developing algorithms for drawing graphs while satisfying certain aesthetic criteria. These criteria are often expressed in properties like edge complexity, number of edge crossings, angular resolution, shapes of faces or graph symmetries, and in general aim at creating a drawing of a graph that conveys the information to the reader in the best possible way. Graph drawing has applications in a wide variety of areas, including cartography, VLSI design and information visualization. In this thesis we consider several graph drawing problems. The first problem we address is rectilinear cartogram construction. A cartogram, also known as a value-by-area map, is a technique used by cartographers to visualize statistical data over a set of geographical regions like countries, states or counties. The regions of a cartogram are deformed such that the area of a region corresponds to a particular geographic variable. The shapes of the regions depend on the type of cartogram. We consider rectilinear cartograms of constant complexity, that is, cartograms where each region is a rectilinear polygon with a constant number of vertices. Whether a cartogram is good is determined by how closely the cartogram resembles the original map and how precisely the areas of its regions describe the associated values. The cartographic error is defined for each region as $|A_c - A_s| / A_s$, where $A_c$ is the area of the region in the cartogram and $A_s$ is the specified area of that region, given by the geographic variable to be shown. In this thesis we consider the construction of rectilinear cartograms that have correct adjacencies of the regions and zero cartographic error.
We show that any plane triangulated graph admits a rectilinear cartogram in which every region has at most 40 vertices and which can be constructed in O(n log n) time. We also present experimental results showing that in practice the algorithm works significantly better than the complexity bounds suggest. In our experiments on real-world data we were always able to construct a cartogram where the average number of vertices per region does not exceed five. Since a rectangle has four vertices, this means that most of the regions of our rectilinear cartograms are in fact rectangles. Moreover, the maximum number of vertices of each region in these cartograms never exceeded ten. The second problem we address in this thesis concerns cased drawings of graphs. The vertices of a drawing are commonly marked with a disk, but differentiating between vertices and edge crossings in a dense graph can still be difficult. Edge casing is a well-known method (used, for example, in electrical drawings, when depicting knots, and, more generally, in information visualization) to alleviate this problem and to improve the readability of a drawing. A cased drawing orders the edges of each crossing and interrupts the lower edge in an appropriate neighborhood of the crossing. One can also envision that every edge is encased in a strip of the background color and that the casing of the upper edge covers the lower edge at the crossing. If there are no application-specific restrictions that dictate the order of the edges at each crossing, then we can in principle choose freely how to arrange them. However, certain orders will lead to a more readable drawing than others. In this thesis we formulate aesthetic criteria for a cased drawing as optimization problems and solve these problems. For most of the problems we either present a polynomial-time algorithm or demonstrate that the problem is NP-hard.
Finally, we consider a combinatorial question in computational topology concerning three types of objects: closed curves in the plane, surfaces immersed in the plane, and surfaces embedded in space. In particular, we study casings of closed curves in the plane to decide whether these curves can be embedded as the boundaries of certain special surfaces. We show that it is NP-complete to determine whether an immersed disk is the projection of a surface embedded in space, or whether a curve is the boundary of an immersed surface in the plane that is not constrained to be a disk. However, when a casing is supplied with a self-intersecting curve, describing which component of the curve lies above and which below at each crossing, we can determine in time linear in the number of crossings whether the cased curve forms the projected boundary of a surface in space. As a related result, we show that an immersed surface with a single boundary curve that crosses itself n times has at most $2^{n/2}$ combinatorially distinct spatial embeddings, and we discuss the existence of fixed-parameter tractable algorithms for related problems.
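
The cartographic error described in this abstract is the relative deviation of a region's drawn area from its specified area, that is, |A_c - A_s| / A_s. The following minimal Python sketch shows how it could be evaluated for one rectilinear region. The helper names `polygon_area` and `cartographic_error` and the L-shaped example polygon are illustrative assumptions, not taken from the thesis, and the sketch does not attempt the cartogram construction algorithm itself.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon (vertices in boundary order), shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def cartographic_error(cartogram_area: float, specified_area: float) -> float:
    """Relative area error |A_c - A_s| / A_s for a single region."""
    if specified_area <= 0:
        raise ValueError("specified area must be positive")
    return abs(cartogram_area - specified_area) / specified_area


# Hypothetical rectilinear region with six vertices (an L shape).
l_shape = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]

area = polygon_area(l_shape)            # 12.0
print(cartographic_error(area, 12.0))   # zero error: drawn area matches the target
print(cartographic_error(area, 10.0))   # nonzero error: region is drawn too large
```

A cartogram with zero cartographic error, as targeted in the thesis, would give an error of 0 for every region under this measure.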

    Demand-Responsive Transport: Models and Algorithms

    Demand-responsive transport is a form of public transport between bus and taxi services, involving flexible routing of small or medium-sized vehicles. This dissertation presents mathematical models for demand-responsive transport and methods that can be used to solve combinatorial problems related to vehicle routing and journey planning in a transport network. Public transport can be viewed as a market where demand affects supply and vice versa. In the first part of the dissertation, which concerns vehicle routing, we show how a given demand for transportation can be satisfied by using a fleet of vehicles, assuming that the demand is known at the individual level. In the second part, by considering the journey planning problem faced by commuters, we study how the demand adapts to the supply of transport services, assuming that the supply remains unchanged for a short period of time. We also present a stochastic network model for determining the economic equilibrium, that is, the point at which the demand meets the supply, by assuming that commuters attempt to minimize travel time and transport operators aim to maximize profit. The mathematical models proposed in this work can be used to simulate the operations of public transport services in a wide range of scenarios, from paratransit services for the elderly and disabled to large-scale demand-responsive transport services designed to compete with private car traffic. Such calculations can provide valuable information to public authorities and planners of transportation services regarding, for example, regulation and investments.
In addition to public transport, potential applications of the proposed methods for solving vehicle routing and journey planning problems include freight transportation, courier and food delivery services, military logistics, and air traffic.

    Sequencing by enumerative methods

    Gene set based ensemble methods for cancer classification

    Diagnosis of cancer very often depends on conclusions drawn after both clinical and microscopic examinations of tissues to study the manifestation of the disease in order to place tumors in known categories. One factor that determines the categorization of cancer is the tissue from which the tumor originates. Information gathered from clinical exams may be partial or not completely predictive of a specific category of cancer. Further complicating the problem of categorizing various tumors is that the histological classification of the cancer tissue and the description of its course of development may be atypical. Gene expression data gleaned from microarray analysis holds tremendous promise for more accurate cancer diagnosis. One hurdle in the classification of tumors based on gene expression data is that the data space is ultra-dimensional with relatively few points; that is, there are a small number of examples with a large number of genes. A second hurdle is expression bias caused by the correlation of genes. Analysis of subsets of genes, known as gene set analysis, provides a mechanism by which groups of differentially expressed genes can be identified. We propose an ensemble of classifiers whose base classifiers are ℓ1-regularized logistic regression models with restriction of the feature space to biologically relevant genes. Some researchers have already explored the use of ensemble classifiers to classify cancer, but the effect of the underlying base classifiers in conjunction with biologically derived gene sets on cancer classification has not been explored.
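
To make the idea of an ensemble of ℓ1-regularized logistic regression models restricted to gene sets more concrete, here is a minimal pure-Python sketch. Everything specific in it is an assumption for illustration: the toy expression matrix, the two hypothetical gene sets, the proximal-gradient (ISTA) solver, and the choice to combine base classifiers by averaging their predicted probabilities. The abstract does not state how the authors fit or combine their base classifiers.

```python
import math
from typing import Dict, List, Tuple

Model = Tuple[List[int], List[float], float]  # (gene indices, weights, bias)


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(w: float, t: float) -> float:
    """Proximal operator of t * |w|; this is what drives weights to exactly zero."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, epochs=300):
    """Fit l1-regularized logistic regression by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step on the smooth loss, then shrink the weights (bias is unpenalized).
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fit_gene_set_ensemble(X, y, gene_sets: Dict[str, List[int]], lam=0.05) -> Dict[str, Model]:
    """Train one base classifier per gene set, each seeing only that set's genes."""
    models = {}
    for name, idx in gene_sets.items():
        X_sub = [[row[j] for j in idx] for row in X]
        w, b = fit_l1_logistic(X_sub, y, lam=lam)
        models[name] = (idx, w, b)
    return models


def predict_proba(models: Dict[str, Model], x: List[float]) -> float:
    """Average the base classifiers' probabilities (an assumed combination rule)."""
    probs = [sigmoid(sum(wj * x[j] for wj, j in zip(w, idx)) + b)
             for idx, w, b in models.values()]
    return sum(probs) / len(probs)


# Toy data: 6 samples x 6 genes; only gene 0 separates the two classes.
X = [[1.0, 0.1, -0.2, 0.3, -0.1, 0.0],
     [1.2, -0.3, 0.1, -0.2, 0.2, 0.1],
     [0.8, 0.2, 0.0, 0.1, -0.3, -0.1],
     [-1.0, 0.0, 0.2, 0.2, 0.1, 0.0],
     [-1.1, -0.1, -0.1, -0.3, 0.0, 0.2],
     [-0.9, 0.3, 0.1, 0.0, -0.2, -0.1]]
y = [1, 1, 1, 0, 0, 0]
gene_sets = {"informative_set": [0, 1, 2], "noise_set": [3, 4, 5]}  # hypothetical sets

models = fit_gene_set_ensemble(X, y, gene_sets)
print(predict_proba(models, X[0]), predict_proba(models, X[3]))
```

In this toy setting the ℓ1 penalty leaves the base classifier trained on the uninformative gene set with all-zero weights, so it contributes a neutral probability of about 0.5 and the ensemble's decision is driven by the informative set.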

    Multilabel Classification through Structured Output Learning - Methods and Applications

    Multilabel classification is an important topic in machine learning that arises naturally from many real-world applications. For example, in document classification, a research article can be categorized as “science”, “drug discovery” and “genomics” at the same time. The goal of multilabel classification is to reliably predict multiple outputs for a given input. As multiple interdependent labels can be “on” and “off” simultaneously, the central problem in multilabel classification is how to best exploit the correlation between labels to make accurate predictions. Compared to previous flat multilabel classification approaches, which treat multiple labels as a flat vector, structured output learning relies on an output graph connecting multiple labels to model the correlation between labels in a comprehensive manner. The main question studied in this thesis is how to tackle multilabel classification through structured output learning. This thesis starts with an extensive review of classification learning, including both single-label and multilabel classification. The first problem we address is how to solve the multilabel classification problem when the output graph is observed a priori. We discuss several well-established structured output learning algorithms and study the network response prediction problem within the context of social network analysis. As the current structured output learning algorithms rely on the output graph to exploit the dependency between labels, the second problem we address is how to use structured output learning when the output graph is not known. Specifically, we examine the potential of learning on a set of random output graphs when the “real” one is hidden. This problem is relevant because in most multilabel classification problems there does not exist any output graph that reveals the dependency between labels. The third problem we address is how to analyze the proposed learning algorithms in a theoretical manner.
Specifically, we want to explain the intuition behind the proposed models and to study the generalization error. The main contributions of this thesis are several new learning algorithms that widen the applicability of structured output learning. For the problem with an observed output graph, the proposed algorithm “SPIN” is able to predict an optimal directed acyclic graph from an observed underlying network that best responds to an input. For general multilabel classification problems without any known output graph, we propose several learning algorithms that combine a set of structured output learners built on random output graphs. In addition, we develop a joint learning and inference framework based on max-margin learning over a random sample of spanning trees. The theoretical analysis also provides guarantees on the generalization error of the proposed methods.
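
The random output graphs mentioned in this abstract can be illustrated by drawing random spanning trees over the label set. The sketch below is only an assumption about how such trees might be generated: it shuffles the edges of the complete graph on the labels and keeps those that join different components (a randomized Kruskal procedure with union-find). This does not sample spanning trees uniformly, the function names are hypothetical, and the max-margin learning and inference steps of the thesis are not shown.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def random_spanning_tree(num_labels: int, rng: random.Random) -> List[Edge]:
    """Random spanning tree on the complete graph over labels 0..num_labels-1."""
    edges = [(i, j) for i in range(num_labels) for j in range(i + 1, num_labels)]
    rng.shuffle(edges)
    parent = list(range(num_labels))  # union-find forest

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree: List[Edge] = []
    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:          # edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v))
            if len(tree) == num_labels - 1:
                break
    return tree


def sample_random_trees(num_labels: int, num_trees: int, rng: random.Random) -> List[List[Edge]]:
    """Draw several random output graphs, one per base structured learner."""
    return [random_spanning_tree(num_labels, rng) for _ in range(num_trees)]


rng = random.Random(0)
for tree in sample_random_trees(num_labels=5, num_trees=3, rng=rng):
    print(tree)
```

Each sampled tree could then serve as the output graph of one structured output learner, with the learners' predictions combined afterwards. Trees are a convenient choice because exact inference on a tree is tractable.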