2,223 research outputs found

    Automatic Chinese Postal Address Block Location Using Proximity Descriptors and Cooperative Profit Random Forests.

    Locating the destination address block is key to automated mail sorting. Due to the characteristics of Chinese envelopes used in mainland China, we exploit proximity cues to describe the investigated regions on envelopes. We propose two proximity descriptors encoding the spatial distributions of the connected components obtained from binary envelope images. To locate the destination address block, these descriptors are used together with cooperative profit random forests (CPRFs). Experimental results show that the proposed proximity descriptors are superior to two component descriptors, which exploit only the shape characteristics of individual components, and that the CPRF classifier produces higher recall values than seven state-of-the-art classifiers. These promising results are due to the fact that the proposed descriptors encode the proximity characteristics of the binary envelope images and the CPRF classifier uses an effective tree node split approach.
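    The exact proximity descriptors and the cooperative profit random forest are specific to the paper; as a rough illustration of the overall pipeline (connected components → spatial proximity features → forest classifier), the sketch below uses SciPy connected-component labelling and a standard scikit-learn random forest as stand-ins.

```python
# Sketch of a proximity-descriptor pipeline for address-block location.
# The descriptor below (distances to the k nearest component centroids) and
# the plain RandomForestClassifier are illustrative stand-ins, not the paper's
# proximity descriptors or CPRF classifier.
import numpy as np
from scipy import ndimage
from sklearn.ensemble import RandomForestClassifier

def proximity_descriptor(binary_img, k=5):
    """Per connected component: distances to its k nearest component centroids."""
    labels, n = ndimage.label(binary_img)
    centroids = np.array(ndimage.center_of_mass(binary_img, labels, range(1, n + 1)))
    feats = []
    for c in centroids:
        d = np.sort(np.linalg.norm(centroids - c, axis=1))[1:k + 1]
        d = np.pad(d, (0, max(0, k - d.size)))   # pad if fewer than k neighbours
        feats.append(d)
    return centroids, np.asarray(feats)

# Training data (hypothetical): descriptors from labelled envelopes, where
# y = 1 marks components belonging to the destination address block.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
# clf.fit(X_train, y_train)
# Components predicted positive at test time are grouped into the address block.
```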

    Geospatial relationships of air pollution and acute asthma events across the Detroit–Windsor international border: Study design and preliminary results

    The Geospatial Determinants of Health Outcomes Consortium (GeoDHOC) study investigated ambient air quality across the international border between Detroit, Michigan, USA and Windsor, Ontario, Canada and its association with acute asthma events in 5- to 89-year-old residents of these cities. NO2, SO2, and volatile organic compounds (VOCs) were measured at 100 sites, and particulate matter (PM) and polycyclic aromatic hydrocarbons (PAHs) at 50 sites, during two 2-week sampling periods in 2008 and 2009. Acute asthma event rates across neighborhoods in each city were calculated using emergency room visits and hospitalizations and standardized to the overall age and gender distribution of the population in the two cities combined. Results demonstrate that intra-urban air quality variations are related to adverse respiratory events in both cities. Annual 2008 asthma rates exhibited statistically significant positive correlations with total VOCs and total benzene, toluene, ethylbenzene and xylene (BTEX) at 5-digit zip code scale spatial resolution in Detroit. In Windsor, NO2, VOCs, and PM10 concentrations correlated positively with 2008 asthma rates at a similar 3-digit postal forward sortation area scale. The study is limited by its coarse temporal resolution (comparing relatively short-term air quality measurements to annual asthma health data), and interpretation of findings is complicated by contrasts in population demographics and health-care delivery systems in Detroit and Windsor.
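    A minimal sketch of the kind of ecological correlation reported above (area-level asthma rates versus pollutant concentrations, e.g. by 5-digit zip code), assuming hypothetical column names; the study's direct age/gender standardization is not reproduced here.

```python
# Sketch: rank correlation of area-level asthma rates with pollutant levels.
# Column names ('asthma_rate', 'total_voc', ...) are hypothetical, and the
# rates are assumed to be pre-standardized as in the study.
import pandas as pd
from scipy.stats import spearmanr

def area_correlations(df, pollutants, rate_col="asthma_rate"):
    """Spearman rho and p-value of the asthma rate against each pollutant."""
    results = {}
    for p in pollutants:
        rho, pval = spearmanr(df[rate_col], df[p])
        results[p] = (rho, pval)
    return results

# Hypothetical usage:
# df = pd.read_csv("detroit_zip_level_2008.csv")
# print(area_correlations(df, ["total_voc", "btex", "no2", "pm10"]))
```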

    Statistical Machine Learning for Breast Cancer Detection with Terahertz Imaging

    Breast conserving surgery (BCS) is a common breast cancer treatment option, in which the cancerous tissue is excised while leaving most of the healthy breast tissue intact. The lack of in-situ margin evaluation unfortunately results in a re-excision rate of 20-30% for this type of procedure. This study aims to design statistical and machine learning segmentation algorithms for the detection of breast cancer in BCS by using terahertz (THz) imaging. Given the material characterization properties of the non-ionizing radiation in the THz range, we intend to employ the responses from the THz system to identify healthy and cancerous breast tissue in BCS samples. In particular, this dissertation covers the description of four segmentation algorithms for the detection of breast cancer in THz imaging. We first explore the performance of one-dimensional (1D) Gaussian mixture and t-mixture models with Markov chain Monte Carlo (MCMC). Second, we propose a novel low-dimension ordered orthogonal projection (LOOP) algorithm for the dimension reduction of the THz information through a modified Gram-Schmidt process. Once the key features within the THz waveform have been detected by LOOP, the segmentation algorithm employs a multivariate Gaussian mixture model with MCMC and expectation maximization (EM). Third, we explore the spatial information of each pixel within the THz image through a Markov random field (MRF) approach. Finally, we introduce a supervised multinomial probit regression algorithm with polynomial and kernel data representations. For evaluation purposes, this study makes use of fresh and formalin-fixed paraffin-embedded (FFPE) heterogeneous human and mouse tissue models for the quantitative assessment of the segmentation performance in terms of receiver operating characteristic (ROC) curves. Overall, the experimental results demonstrate that the proposed approaches represent a promising technique for tissue segmentation within THz images of freshly excised breast cancer samples.
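    The dissertation's MCMC, t-mixture, LOOP and MRF components go well beyond a plain EM fit; as a minimal sketch of the underlying idea (mixture-model segmentation of per-pixel THz responses, scored against pathology labels via ROC), the following uses scikit-learn's GaussianMixture.

```python
# Minimal sketch: Gaussian-mixture segmentation of per-pixel THz features and
# ROC evaluation against pathology ground truth. Plain EM via scikit-learn is
# used here; the dissertation's MCMC, t-mixture, LOOP and MRF variants are not
# reproduced.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import roc_curve, auc

def segment_thz(features, labels_true, n_components=3):
    """features: (H, W, D) per-pixel THz responses; labels_true: (H, W) binary mask."""
    H, W, D = features.shape
    X = features.reshape(-1, D)
    gmm = GaussianMixture(n_components=n_components, covariance_type="full",
                          random_state=0).fit(X)
    # Score each pixel by its posterior under one component; picking the
    # component with the largest-norm mean is an arbitrary stand-in for the
    # dissertation's tissue-assignment step.
    post = gmm.predict_proba(X)
    cancer_comp = int(np.argmax(np.linalg.norm(gmm.means_, axis=1)))
    scores = post[:, cancer_comp]
    fpr, tpr, _ = roc_curve(labels_true.reshape(-1), scores)
    return fpr, tpr, auc(fpr, tpr)
```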

    Deep 1.1 mm-wavelength imaging of the GOODS-S field by AzTEC/ASTE - I. Source catalogue and number counts

    [Abridged] We present the first results from a 1.1 mm confusion-limited map of the GOODS-S field taken with AzTEC on the ASTE telescope. We imaged a 270 sq. arcmin field to a 1σ depth of 0.48 - 0.73 mJy/beam, making this one of the deepest blank-field surveys at mm-wavelengths ever achieved. Although our GOODS-S map is extremely confused, we demonstrate that our source identification and number counts analyses are robust, and the techniques discussed in this paper are relevant for other deeply confused surveys. We find a total of 41 dusty starburst galaxies with S/N >= 3.5 within this uniformly covered region, where only two are expected to be false detections. We derive the 1.1 mm number counts from this field using both a "P(d)" analysis and a semi-Bayesian technique, and find that both methods give consistent results. Our data are well-fit by a Schechter function model with (S', N(3mJy), α) = (1.30+0.19 mJy, 160+27 (mJy/deg^2)^(-1), -2.0). Given the depth of this survey, we put the first tight constraints on the 1.1 mm number counts at S(1.1mm) = 0.5 mJy, and we find evidence that the faint end of the number counts at S(850 μm) < 2.0 mJy from various SCUBA surveys towards lensing clusters is biased high. In contrast to the 870 μm survey of this field with the LABOCA camera, we find no apparent under-density of sources compared to previous surveys at 1.1 mm. Additionally, we find a significant number of SMGs not identified in the LABOCA catalogue. We find that, in contrast to observations at wavelengths < 500 μm, MIPS 24 μm sources do not resolve the total energy density in the cosmic infrared background at 1.1 mm, demonstrating that a population of z > 3 dust-obscured galaxies that are unaccounted for at these shorter wavelengths potentially contributes to a large fraction (~2/3) of the infrared background at 1.1 mm. Comment: 21 pages, 9 figures. Accepted to MNRAS.
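    For reference, a generic Schechter parameterization of the differential number counts consistent with the values quoted above is written out below; the paper normalizes the counts at 3 mJy, so its exact functional form may differ slightly.

```latex
% Generic Schechter form for the differential counts, with the fitted values
% quoted in the abstract (normalization at 3 mJy as given there).
\[
  \frac{dN}{dS} \;\propto\; \left(\frac{S}{S'}\right)^{\alpha}
  \exp\!\left(-\frac{S}{S'}\right),
  \qquad
  S' = 1.30~\mathrm{mJy},\quad
  N(3~\mathrm{mJy}) = 160~(\mathrm{mJy}/\mathrm{deg}^{2})^{-1},\quad
  \alpha = -2.0 .
\]
```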

    RANK-BASED TEMPO-SPATIAL CLUSTERING: A FRAMEWORK FOR RAPID OUTBREAK DETECTION USING SINGLE OR MULTIPLE DATA STREAMS

    In recent decades, algorithms for disease outbreak detection have become one of the main interests of public health practitioners, who aim to identify and localize an outbreak as early as possible in order to warrant further public health response before a pandemic develops. Today's increased threat of biological warfare and terrorism provides an even stronger impetus to develop methods for outbreak detection based on symptoms as well as definitive laboratory diagnoses. In this dissertation work, I explore the problem of rapid disease outbreak detection using both spatial and temporal information. I develop a framework of non-parameterized algorithms that search for patterns of disease outbreak in spatial sub-regions of the monitored region within a certain period. Compared to existing spatial or tempo-spatial algorithms, the algorithms in this framework provide a methodology for fast searching of either univariate or multivariate data sets. The framework first measures which study area is more likely to have an outbreak occurring, given the baseline data and the currently observed data. It then applies a greedy searching mechanism to look for clusters with high posterior probabilities, using the risk measurement for each unit area as a heuristic. I also evaluate the performance of the proposed algorithms. From the perspective of predictive modeling, I adopt a Gamma-Poisson (GP) model to compute the probability of having an outbreak in each cluster when analyzing univariate data. I build a multinomial generalized Dirichlet (MGD) model to identify outbreak clusters from multivariate data, which include the OTC data streams collected by the national retail data monitor (NRDM) and the ED data streams collected by the RODS system. Key contributions of this dissertation are: 1) it introduces a rank-based tempo-spatial clustering algorithm, RSC, which utilizes greedy searching and a Bayesian GP model for disease outbreak detection with comparable detection timeliness, cluster positive predictive value (PPV) and improved running time; 2) it proposes a multivariate extension of RSC (MRSC) which applies the MGD model. The evaluation demonstrated that the MGD model can effectively suppress the false alarms caused by elevated signals that are not disease-relevant and occur in all the monitored data streams.
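    A minimal sketch of Gamma-Poisson cluster scoring in the spirit described above, assuming simple Gamma hyperparameters and omitting the rank-based greedy search and the multivariate MGD extension.

```python
# Sketch: Gamma-Poisson evidence that a candidate cluster's observed count is
# elevated over its baseline. The hyperparameters (alpha, beta) and the
# commented greedy-search driver are simplified stand-ins for the
# dissertation's RSC algorithm.
import numpy as np
from scipy.stats import nbinom, poisson

def cluster_score(observed, baseline, alpha=1.0, beta=1.0):
    """Log Bayes factor: elevated relative risk in the cluster vs. risk = 1."""
    y = int(np.sum(observed))    # observed case count in the candidate cluster
    b = float(np.sum(baseline))  # expected (baseline) count in the same cluster
    # H0: y ~ Poisson(b), i.e. relative risk q = 1.
    log_h0 = poisson.logpmf(y, b)
    # H1: y | q ~ Poisson(q * b) with q ~ Gamma(alpha, beta); marginally
    #     y ~ NegativeBinomial(alpha, beta / (beta + b)).
    log_h1 = nbinom.logpmf(y, alpha, beta / (beta + b))
    return log_h1 - log_h0

# Greedy search (sketch): start from the unit area with the highest score, then
# repeatedly add the neighbouring area that most increases the cluster score,
# stopping when no neighbour improves it.
```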

    Abstraction and cartographic generalization of geographic user-generated content: use-case motivated investigations for mobile users

    On a daily basis, a conventional internet user queries different internet services (available on different platforms) to gather information and make decisions. In most cases, knowingly or not, this user consumes data that has been generated by other internet users about his/her topic of interest (e.g. an ideal holiday destination for a family traveling by van for 10 days). Commercial service providers, such as search engines, travel booking websites, video-on-demand providers, food takeaway mobile apps and the like, have found it useful to rely on data provided by other users who have commonalities with the querying user. Examples of commonalities are demography, location, interests, internet address, etc. This practice has been in place for more than a decade and helps service providers tailor their results based on the collective experience of the contributors. There has also been interest in different research communities (including GIScience) in analyzing and understanding the data generated by internet users. The research focus of this thesis is on finding answers to real-world problems in which a user interacts with geographic information. The interactions can be in the form of exploration, querying, zooming and panning, to name but a few. We have aimed our research at investigating the potential of using geographic user-generated content to provide new ways of preparing and visualizing these data. Based on different scenarios that fulfill user needs, we have investigated the potential of finding new visual methods relevant to each scenario. The methods proposed are mainly based on pre-processing and analyzing data offered by data providers (both commercial and non-profit organizations). In all cases, however, the data was contributed actively by ordinary internet users (as opposed to passive data collection by sensors). The main contributions of this thesis are proposals for new ways of abstracting geographic information based on user-generated content contributions. Addressing different use-case scenarios and based on different input parameters, data granularities and, evidently, geographic scales, we have provided proposals for contemporary users (with a focus on users of location-based services, or LBS). The findings are based on different methods such as semantic analysis, density analysis and data enrichment. If the findings of this dissertation are realized, LBS users will benefit by being able to explore large amounts of geographic information in more abstract and aggregated ways and to obtain results based on the contributions of other users. The research outcomes sit at the intersection of cartography, LBS and GIScience. Based on our first use case, we have proposed the inclusion of an extended semantic measure directly in the classic map generalization process. In our second use case, we have focused on simplifying geographic data depiction by reducing the amount of information using a density-triggered method. Finally, the third use case was focused on summarizing and visually representing relatively large amounts of information by depicting geographic objects matched to the salient topics that emerged from the data.
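    As a small illustration of the density-triggered reduction mentioned in the second use case, the sketch below thins user-generated points on a regular grid; the cell size and per-cell ranking rule are illustrative assumptions, not the thesis's actual method.

```python
# Sketch: density-triggered thinning of user-generated points for display at a
# given zoom level. The cell size and the keep-highest-ranked-point rule are
# illustrative stand-ins for the thesis's generalization methods.
from collections import defaultdict

def thin_points(points, cell_size, max_per_cell=1, rank=lambda p: 0.0):
    """points: iterable of (x, y, payload) tuples; keep at most max_per_cell
    points per grid cell, preferring those with the highest rank value."""
    cells = defaultdict(list)
    for p in points:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells[key].append(p)
    kept = []
    for members in cells.values():
        members.sort(key=rank, reverse=True)
        kept.extend(members[:max_per_cell])
    return kept

# Hypothetical usage: keep the most-reviewed POI per 500 m cell.
# thinned = thin_points(pois, cell_size=500.0, rank=lambda p: p[2]["reviews"])
```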

    Multi-script handwritten character recognition: Using feature descriptors and machine learning


    Lightning Imaging Sensor (LIS) for the Earth Observing System

    Scientific objectives and instrument characteristics are given for a calibrated optical LIS, not only for the EOS but also for the Tropical Rainfall Measuring Mission (TRMM); the LIS was designed to acquire and study the distribution and variability of total lightning on a global basis. The LIS can be traced to a lightning mapper sensor planned for flight on the GOES meteorological satellites. The LIS consists of a staring imager optimized to detect and locate lightning. The LIS will detect and locate lightning with storm-scale resolution (i.e., 5 to 10 km) over a large region of the Earth's surface along the orbital track of the satellite, mark the time of occurrence of the lightning, and measure the radiant energy. The LIS will have a nearly uniform 90 percent detection efficiency within the area viewed by the sensor, and will detect intracloud and cloud-to-ground discharges during day and night conditions. Also, the LIS will monitor individual storms and storm systems long enough to obtain a measure of the lightning flashing rate while they are within its field of view. The LIS attributes include low cost, low weight and power, low data rate, and important science. The LIS will support studies of the hydrological cycle, general circulation and sea surface temperature variations, along with examinations of the electrical coupling of thunderstorms with the ionosphere and magnetosphere, and observations and modeling of the global electric circuit.

    An implicit, conservative, zonal-boundary scheme for Euler equation calculations

    Get PDF
    A zonal, or patched, grid approach is one in which the flow region of interest is divided into subregions which are then discretized independently, using existing grid generators. The equations of motion are integrated in each subregion in conjunction with zonal boundary schemes which allow proper information transfer across the interfaces that separate subregions. The zonal approach greatly simplifies the treatment of complex geometries and also the addition of grid points to selected regions of the flow. A conservative zonal boundary condition that could be used with explicit schemes was extended so that it can be used with existing second-order accurate implicit integration schemes such as the Beam-Warming and Osher schemes. In the test case considered, the implicit schemes increased the rate of convergence considerably (by a factor of about 30 over that of the explicit scheme). Results demonstrating the time accuracy of the zonal scheme and the feasibility of performing calculations on zones that move relative to each other are also presented.
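    A one-dimensional sketch of the conservation requirement at a zonal interface (the coarse-zone interface flux must equal the area-weighted sum of the fine-zone face fluxes it overlaps) is given below; the implicit Beam-Warming/Osher coupling itself is not reproduced.

```python
# Sketch of the conservation constraint at a zonal interface: the single
# coarse-face flux is the area-weighted average of the fine-face fluxes it
# overlaps, so the total flux leaving one zone equals the total entering the
# other. The paper's implicit zonal-boundary scheme is not reproduced here.
import numpy as np

def coarse_interface_flux(fine_fluxes, fine_areas):
    """fine_fluxes: (n_faces, n_equations) fluxes per unit area on the fine side;
    fine_areas: (n_faces,) face areas. Returns the coarse-face flux per unit area."""
    fine_fluxes = np.asarray(fine_fluxes, dtype=float)
    fine_areas = np.asarray(fine_areas, dtype=float)
    return (fine_areas[:, None] * fine_fluxes).sum(axis=0) / fine_areas.sum()

# Conservation check (the coarse face spans exactly the fine faces):
# F_c = coarse_interface_flux(F_f, A_f)
# np.allclose(F_c * A_f.sum(), (A_f[:, None] * F_f).sum(axis=0))  # -> True
```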