
    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
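The pairing of RSS feeds with HTML described above can be sketched in a few lines: an RSS item's text serves as a noisy reference for locating the matching content block in the blog page, so extraction rules can be learned without manual annotation. This is a minimal illustration of the idea, not the report's actual method; all names and data are made up.

```python
from difflib import SequenceMatcher

def best_matching_block(rss_text, html_blocks):
    """Return the HTML text block most similar to the RSS description."""
    def score(block):
        return SequenceMatcher(None, rss_text.lower(), block.lower()).ratio()
    return max(html_blocks, key=score)

blocks = [
    "Sidebar: recent posts and archives",
    "Today we release version 2.0 of our preservation crawler.",
    "Copyright 2012 - all rights reserved",
]
rss_summary = "Today we release version 2.0 of our preservation crawler"
# the post body, not the sidebar or footer, matches the feed entry
assert best_matching_block(rss_summary, blocks) == blocks[1]
```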

    Learning to Transform Time Series with a Few Examples

    We describe a semi-supervised regression algorithm that learns to transform one time series into another given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed into a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements and recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples than fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.
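The two ingredients named above, fitting a few labeled pairs while regularizing toward smooth output dynamics, can be shown with a drastically simplified linear stand-in. This is not the paper's algorithm, only a toy: the transform is a single scale factor, and the dynamics prior is a second-difference penalty on the predicted output over the whole (mostly unlabeled) series.

```python
import numpy as np

def fit_transform(x, labeled_idx, y_labeled, smooth=1.0):
    """Least-squares fit of f(x) = w*x to labeled pairs, plus a penalty on
    the second differences of the predicted output w*x (smooth dynamics)."""
    n = len(x)
    X = x[labeled_idx].reshape(-1, 1)
    # second-difference operator over the full series
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    Dx = (D @ x).reshape(-1, 1)          # the penalty acts through f(x) = w*x
    A = np.vstack([X, np.sqrt(smooth) * Dx])
    b = np.concatenate([y_labeled, np.zeros(n - 2)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w[0]

x = np.linspace(0.0, 1.0, 50)            # input series (a linear ramp)
y = 3.0 * x                              # true transformation: scale by 3
idx = [0, 10, 40]                        # only three labeled examples
w = fit_transform(x, idx, y[idx])
assert abs(w - 3.0) < 1e-6
```

Because the ramp already satisfies the smoothness prior, the penalty is inactive here and the three labeled pairs recover the scale exactly; the point is only the shape of the closed-form objective.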

    Towards accurate and efficient live cell imaging data analysis

    Live cell imaging based on time-lapse microscopy is used to study dynamic cellular behaviors such as the cell cycle, cell signaling, and transcription. Extracting cell lineage trees from a time-lapse video requires cell segmentation and cell tracking. For long-term live cell imaging, data analysis errors are particularly costly: even an extremely low error rate can be amplified by the large number of sampled time points and render the entire video useless. In this work, we adopt a straightforward but practical design that combines the merits of manual and automatic approaches. We present a live cell imaging data analysis tool, `eDetect', which uses post-editing to complement automatic segmentation and tracking. What makes this work special is that eDetect employs multiple interactive data visualization modules to guide and assist users, making the error detection and correction procedure rational and efficient. Specifically, two scatter plots and a heat map interactively visualize single cells' visual features. The scatter plots position similar results in close vicinity, making it easy to spot and correct a large group of similar errors with a few mouse clicks and minimizing repetitive human intervention. The heat map is aimed at exposing all overlooked errors and helping users progressively approach perfect accuracy in cell lineage reconstruction. Quantitative evaluation shows that eDetect largely improves accuracy within an acceptable time frame, and its performance surpasses the winners of most tasks in the `Cell Tracking Challenge', as measured by biologically relevant metrics.
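The batch-correction idea behind the scatter plots can be sketched simply: cells that are close in feature space are likely to share the same kind of error, so one corrective action can be applied to a whole neighborhood. The function and data below are illustrative only, not eDetect's API.

```python
import numpy as np

def neighbors_within(features, seed_index, radius):
    """Indices of all cells whose feature vectors lie within `radius`
    of the selected seed cell."""
    d = np.linalg.norm(features - features[seed_index], axis=1)
    return np.flatnonzero(d <= radius)

feats = np.array([[0.10, 0.20],   # three tightly clustered, similar errors
                  [0.12, 0.19],
                  [0.11, 0.22],
                  [0.90, 0.80]])  # one unrelated cell far away
group = neighbors_within(feats, 0, radius=0.1)
# one click on cell 0 selects the whole cluster of similar errors
assert set(group) == {0, 1, 2}
```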

    A methodology to produce geographical information for land planning using very-high resolution images

    Currently, Portuguese municipalities are required to produce homologated cartography under the Territorial Management Instruments framework. The Municipal Master Plan (PDM) has to be revised every 10 years, as do the topographic and thematic maps that describe the municipal territory. However, this period is inadequate for counties where urban pressure is high and land-use change is very dynamic. Consequently, a more efficient mapping process is needed, one that allows recent geographic information to be obtained more often. Several countries, including Portugal, continue to use aerial photography for large-scale mapping. Although this data enables highly accurate maps, its acquisition and visual interpretation are very costly and time consuming. Very-High Resolution (VHR) satellite imagery can be an alternative data source, without replacing aerial images, for producing large-scale thematic cartography. The focus of the thesis is the demand for updated geographic information in the land planning process. To better understand the value and usefulness of this information, a survey of all Portuguese municipalities was carried out. This step was essential for assessing the relevance and usefulness of introducing VHR satellite imagery into the chain of procedures for updating land information. The proposed methodology is based on the use of VHR satellite imagery and other digital data in a Geographic Information Systems (GIS) environment. Different feature extraction algorithms, which take into account variation in the texture, color, and shape of objects in the image, were tested. The trials aimed at automatic extraction of features of municipal interest from very-high resolution aerial and satellite data (orthophotos, QuickBird, and IKONOS imagery) as well as elevation data (altimetric information and LiDAR data). To evaluate the potential of geographic information extracted from VHR images, two areas of application were identified: mapping and analytical purposes. Four case studies that reflect different uses of geographic data at the municipal level, with different accuracy requirements, were considered.

    The first case study presents a methodology for periodically updating large-scale maps based on orthophotos, in the area of Alta de Lisboa. This is a situation where the positional and geometric accuracy of the extracted information is more demanding, since technical mapping standards must be complied with. In the second case study, an alarm system that indicates the location of potential changes in building areas, using a QuickBird image and LiDAR data, was developed for the area of Bairro da Madre de Deus; the goal of the system is to assist the updating of large-scale mapping by providing a layer that municipal technicians can use as the basis for manual editing. In the third case study, an analysis of the roof-tops most suitable for installing solar systems, using LiDAR data, was performed in the area of Avenidas Novas. In the fourth case study, a set of urban environment indicators for territorial monitoring, obtained from VHR imagery, is proposed; the concept is demonstrated for the entire city of Lisbon through IKONOS imagery processing. In this analytical application, the positional quality of the extraction is less relevant.

    Funding: GEOSAT – Methodologies to extract large scale GEOgraphical information from very high resolution SATellite images (PTDC/GEO/64826/2006); e-GEO – Centro de Estudos de Geografia e Planeamento Regional, Faculdade de Ciências Sociais e Humanas, no quadro do Grupo de Investigação Modelação Geográfica, Cidades e Ordenamento do Território.
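A minimal example of the kind of per-municipality indicator discussed above is a vegetated-area fraction derived from NDVI, which is computable from the red and near-infrared bands that VHR sensors such as IKONOS provide. The threshold and toy arrays below are illustrative only, not values from the thesis.

```python
import numpy as np

def vegetation_fraction(red, nir, threshold=0.3):
    """Fraction of pixels whose NDVI = (NIR - Red) / (NIR + Red)
    exceeds the given threshold."""
    ndvi = (nir - red) / (nir + red + 1e-9)   # guard against division by zero
    return float((ndvi > threshold).mean())

red = np.array([[0.1, 0.4],
                [0.1, 0.5]])
nir = np.array([[0.6, 0.4],
                [0.7, 0.5]])
# the two left-column pixels are vegetated (NDVI about 0.71 and 0.75)
assert vegetation_fraction(red, nir) == 0.5
```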

    Detecting, segmenting and tracking bio-medical objects

    Studying the behavior patterns of biomedical objects helps scientists understand the underlying mechanisms. With computer vision techniques, automated monitoring can be implemented for efficient and effective analysis in biomedical studies. Promising applications have been carried out in various research topics, including insect group monitoring, malignant cell detection and segmentation, human organ segmentation, and nano-particle tracking. In general, applications of computer vision techniques in monitoring biomedical objects include the following stages: detection, segmentation, and tracking. Challenges in each stage can lead to unsatisfactory automated monitoring results; these challenges include varying foreground-background contrast, fast motion blur, clutter, and object overlap. In this thesis, we investigate the challenges in each stage and propose novel computer vision solutions to overcome them, helping to automatically monitor biomedical objects with high accuracy in different cases --Abstract, page iii
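The three stages named above fit a simple pipeline shape, shown here on synthetic single-object data: detection by intensity thresholding, a toy centroid in place of real segmentation, and tracking by nearest-centroid association between frames. This is a generic sketch, not the thesis's methods.

```python
import numpy as np

def detect_centroids(frame, thresh=0.5):
    """Detect bright pixels and reduce them to one centroid (toy case:
    at most one object per frame)."""
    ys, xs = np.nonzero(frame > thresh)
    return [(ys.mean(), xs.mean())] if len(ys) else []

def associate(prev, curr):
    """Match each previous centroid to its nearest current centroid."""
    return [min(curr, key=lambda c: (c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2)
            for p in prev]

frame1 = np.zeros((5, 5)); frame1[1, 1] = 1.0
frame2 = np.zeros((5, 5)); frame2[1, 2] = 1.0   # object moved one pixel right
c1 = detect_centroids(frame1)
c2 = detect_centroids(frame2)
tracks = associate(c1, c2)
assert tracks[0] == (1.0, 2.0)
```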

    Visual pattern recognition using neural networks

    Neural networks have been widely studied in a number of fields, such as neural architectures, neurobiology, statistics of neural networks and pattern classification. In the field of pattern classification, neural network models are applied to numerous applications, for instance, character recognition, speech recognition, and object recognition. Among these, character recognition is commonly used to illustrate the feature and classification characteristics of neural networks. In this dissertation, the theoretical foundations of artificial neural networks are first reviewed and existing neural models are studied. The Adaptive Resonance Theory (ART) model is improved to achieve more reasonable classification results. Experiments in applying the improved model to image enhancement and printed character recognition are discussed and analyzed. We also study the theoretical foundation of Neocognitron in terms of feature extraction, convergence in training, and shift invariance. We investigate the use of multilayered perceptrons with recurrent connections as general purpose modules for image operations in parallel architectures. The networks are trained to carry out classification rules in image transformation. The training patterns can be derived from user-defined transformations or from loading the pair of a sample image and its target image when prior knowledge of the transformations is unknown. Applications of our model include image smoothing, enhancement, edge detection, noise removal, morphological operations, image filtering, etc. With a number of stages stacked together we are able to apply a series of operations on the image. That is, by providing various sets of training patterns the system can adapt itself to the concatenated transformation. We also discuss and experiment with applying existing neural models, such as the multilayered perceptron, to realize morphological operations and other commonly used imaging operations.
Some new neural architectures and training algorithms for implementing morphological operations are designed and analyzed. The algorithms are proven correct and efficient. The proposed morphological neural architectures are applied to construct the feature extraction module of a personal handwritten character recognition system. The system was trained and tested with scanned images of handwritten characters. The feasibility and efficiency are discussed along with the experimental results.
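The correspondence between morphology and neural units exploited above can be shown in a few lines: binary dilation is a convolution with the structuring element followed by a hard threshold, i.e. exactly one layer of threshold units. This naive loop version is for illustration, not an efficient implementation.

```python
import numpy as np

def dilate(img, se):
    """Binary dilation as thresholded correlation: an output unit fires
    if any pixel under the structuring element is set."""
    h, w = img.shape
    k = se.shape[0]
    pad = k // 2
    padded = np.pad(img, pad)
    out = np.zeros_like(img)
    for i in range(h):
        for j in range(w):
            # weighted sum followed by a step nonlinearity
            out[i, j] = int((padded[i:i + k, j:j + k] * se).sum() >= 1)
    return out

img = np.zeros((3, 3), dtype=int)
img[1, 1] = 1                      # a single foreground pixel
se = np.ones((3, 3), dtype=int)    # 3x3 structuring element
assert dilate(img, se).sum() == 9  # the pixel grows to fill the 3x3 image
```

Erosion follows the same pattern with the threshold raised to the number of ones in the structuring element, which is why both fit naturally into the thresholded-neuron framework.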

    Identification and Classification of Radio Pulsar Signals Using Machine Learning

    Automated single-pulse search approaches are necessary as the ever-increasing amount of observed data makes manual inspection impractical. Detecting radio pulsars using single-pulse searches, however, is a challenging problem for machine learning because pulsar signals often vary significantly in brightness, width, and shape and are only detected in a small fraction of observed data. The research work presented in this dissertation focuses on the development of machine learning algorithms and approaches for single-pulse searches in the time domain. Specifically, (1) we developed a two-stage single-pulse search approach, named Single-Pulse Event Group IDentification (SPEGID), which automatically identifies and classifies pulsars in radio pulsar search data. SPEGID first identifies pulse candidates as trial single-pulse event groups and then extracts features from the candidates and trains classifiers using supervised machine learning. SPEGID also addresses the challenges introduced by current data processing techniques and successfully identifies bright and dim candidates as well as other types of challenging pulsar candidates. (2) To address the lack of training data in the early stages of pulsar surveys, we explored cross-survey prediction. Our results showed that instance-based and parameter-based transfer learning methods improve the performance of pulsar classification across surveys. (3) We developed a hybrid recommender system aimed at detecting rare pulsar signals that are often missed by supervised learning. The proposed recommender system uses a target rare case to state users' requirements and ranks the candidates using a similarity function calculated as a weighted sum of individual feature similarities. Our hybrid recommender system successfully detects both low signal-to-noise ratio (S/N) pulsars and Fast Radio Bursts (FRBs).
    The approaches proposed in this dissertation were used to analyze data from the Green Bank Telescope 350 MHz drift (GBTDrift) pulsar survey and the Arecibo 327 MHz (AO327) drift pulsar survey, and discovered eight pulsars that were overlooked in previous analyses done with existing methods.
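The ranking rule described above, a weighted sum of per-feature similarities to a target rare case, can be sketched directly. Feature names, weights, and the similarity form (assuming features normalized to [0, 1]) are made up for the example, not taken from the dissertation.

```python
def similarity(candidate, target, weights):
    """Weighted sum of per-feature similarities; each feature is assumed
    normalized to [0, 1] so 1 - |difference| is a similarity."""
    return sum(w * (1.0 - abs(candidate[f] - target[f]))
               for f, w in weights.items())

target = {"width": 0.2, "snr": 0.1}            # a dim, narrow target pulse
weights = {"width": 0.5, "snr": 0.5}
candidates = [
    {"width": 0.9, "snr": 0.9},                # bright, broad interference
    {"width": 0.25, "snr": 0.15},              # dim, narrow: like the target
]
ranked = sorted(candidates,
                key=lambda c: similarity(c, target, weights),
                reverse=True)
# the candidate resembling the rare target is recommended first
assert ranked[0]["width"] == 0.25
```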

    Object Tracking

    Object tracking consists in estimating the trajectories of moving objects in a sequence of images. Automating computer object tracking is a difficult task: the dynamics of the many changing parameters that represent the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art of object tracking methods and the new trends in research are described in this book. Fourteen chapters are split into two sections: Section 1 presents new theoretical ideas, whereas Section 2 presents real-life applications. Despite the variety of topics it contains, this monograph constitutes a consistent body of knowledge in the field of computer object tracking. The editor's intention was to follow the very quick progress in the development of methods as well as the extension of their applications.
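One standard way to handle the temporary occlusions mentioned above is to coast a track on a constant-velocity motion model until the object reappears. This toy predictor is a generic sketch of that idea, not a method from the monograph.

```python
def predict_through_occlusion(last_pos, velocity, frames_occluded):
    """Extrapolate an occluded object's position assuming constant
    velocity (position units per frame)."""
    x, y = last_pos
    vx, vy = velocity
    return (x + vx * frames_occluded, y + vy * frames_occluded)

# object last seen at (10, 5), moving (2, 0) per frame, hidden for 3 frames
assert predict_through_occlusion((10, 5), (2, 0), 3) == (16, 5)
```

When the object reappears, the predicted position gives the data-association step a small search region, which is what lets a tracker survive short occlusions without spawning a new track.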