213 research outputs found

    Report on shape analysis and matching and on semantic matching

    No full text
    In GRAVITATE, two disparate specialities will come together in one working platform for the archaeologist: the fields of shape analysis, and of metadata search. These fields are relatively disjoint at the moment, and the research and development challenge of GRAVITATE is precisely to merge them for our chosen tasks. As shown in chapter 7 the small amount of literature that already attempts join 3D geometry and semantics is not related to the cultural heritage domain. Therefore, after the project is done, there should be a clear ‘before-GRAVITATE’ and ‘after-GRAVITATE’ split in how these two aspects of a cultural heritage artefact are treated.This state of the art report (SOTA) is ‘before-GRAVITATE’. Shape analysis and metadata description are described separately, as currently in the literature and we end the report with common recommendations in chapter 8 on possible or plausible cross-connections that suggest themselves. These considerations will be refined for the Roadmap for Research deliverable.Within the project, a jargon is developing in which ‘geometry’ stands for the physical properties of an artefact (not only its shape, but also its colour and material) and ‘metadata’ is used as a general shorthand for the semantic description of the provenance, location, ownership, classification, use etc. of the artefact. As we proceed in the project, we will find a need to refine those broad divisions, and find intermediate classes (such as a semantic description of certain colour patterns), but for now the terminology is convenient – not least because it highlights the interesting area where both aspects meet.On the ‘geometry’ side, the GRAVITATE partners are UVA, Technion, CNR/IMATI; on the metadata side, IT Innovation, British Museum and Cyprus Institute; the latter two of course also playing the role of internal users, and representatives of the Cultural Heritage (CH) data and target user’s group. CNR/IMATI’s experience in shape analysis and similarity will be an important bridge between the two worlds for geometry and metadata. The authorship and styles of this SOTA reflect these specialisms: the first part (chapters 3 and 4) purely by the geometry partners (mostly IMATI and UVA), the second part (chapters 5 and 6) by the metadata partners, especially IT Innovation while the joint overview on 3D geometry and semantics is mainly by IT Innovation and IMATI. The common section on Perspectives was written with the contribution of all

    Scene Segmentation and Object Classification for Place Recognition

    Get PDF
    This dissertation tries to solve the place recognition and loop closing problem in a way similar to human visual system. First, a novel image segmentation algorithm is developed. The image segmentation algorithm is based on a Perceptual Organization model, which allows the image segmentation algorithm to ‘perceive’ the special structural relations among the constituent parts of an unknown object and hence to group them together without object-specific knowledge. Then a new object recognition method is developed. Based on the fairly accurate segmentations generated by the image segmentation algorithm, an informative object description that includes not only the appearance (colors and textures), but also the parts layout and shape information is built. Then a novel feature selection algorithm is developed. The feature selection method can select a subset of features that best describes the characteristics of an object class. Classifiers trained with the selected features can classify objects with high accuracy. In next step, a subset of the salient objects in a scene is selected as landmark objects to label the place. The landmark objects are highly distinctive and widely visible. Each landmark object is represented by a list of SIFT descriptors extracted from the object surface. This object representation allows us to reliably recognize an object under certain viewpoint changes. To achieve efficient scene-matching, an indexing structure is developed. Both texture feature and color feature of objects are used as indexing features. The texture feature and the color feature are viewpoint-invariant and hence can be used to effectively find the candidate objects with similar surface characteristics to a query object. Experimental results show that the object-based place recognition and loop detection method can efficiently recognize a place in a large complex outdoor environment

    Discovery of Visual Semantics by Unsupervised and Self-Supervised Representation Learning

    Full text link
    The success of deep learning in computer vision is rooted in the ability of deep networks to scale up model complexity as demanded by challenging visual tasks. As complexity is increased, so is the need for large amounts of labeled data to train the model. This is associated with a costly human annotation effort. To address this concern, with the long-term goal of leveraging the abundance of cheap unlabeled data, we explore methods of unsupervised "pre-training." In particular, we propose to use self-supervised automatic image colorization. We show that traditional methods for unsupervised learning, such as layer-wise clustering or autoencoders, remain inferior to supervised pre-training. In search for an alternative, we develop a fully automatic image colorization method. Our method sets a new state-of-the-art in revitalizing old black-and-white photography, without requiring human effort or expertise. Additionally, it gives us a method for self-supervised representation learning. In order for the model to appropriately re-color a grayscale object, it must first be able to identify it. This ability, learned entirely self-supervised, can be used to improve other visual tasks, such as classification and semantic segmentation. As a future direction for self-supervision, we investigate if multiple proxy tasks can be combined to improve generalization. This turns out to be a challenging open problem. We hope that our contributions to this endeavor will provide a foundation for future efforts in making self-supervision compete with supervised pre-training.Comment: Ph.D. thesi

    Reactive Correction of Object Placement Errors for Robotic Arrangement Tasks

    Full text link
    When arranging objects with robotic arms, the quality of the end result strongly depends on the achievable placement accuracy. However, even the most advanced robotic systems are prone to positioning errors that can occur at different steps of the manipulation process. Ignoring such errors can lead to the partial or complete failure of the arrangement. In this paper, we present a novel approach to autonomously detect and correct misplaced objects by pushing them with a robotic arm. We thoroughly tested our approach both in simulation and on real hardware using a Robotiq two-finger gripper mounted on a UR5 robotic arm. In our evaluation, we demonstrate the successful compensation for different errors injected during the manipulation of regular shaped objects. Consequently, we achieve a highly reliable object placement accuracy in the millimeter range

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    Get PDF
    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).</p

    On the Automation and Diagnosis of Visual Intelligence

    Get PDF
    One of the ultimate goals of computer vision is to equip machines with visual intelligence: the ability to understand a scene at the level that is indistinguishable from human's. This not only requires detecting the 2D or 3D locations of objects, but also recognizing their semantic categories, or even higher level interactions. Thanks to decades of vision research as well as recent developments in deep learning, we are closer to this goal than ever. But to keep closing the gap, more research is needed on two themes. One, current models are still far from perfect, so we need a mechanism to keep proposing new, better models to improve performance. Two, while we are pushing for performance, it is also important to do careful analysis and diagnosis of existing models, to make sure we are indeed moving in the right direction. In this dissertation, I study either of the two research themes for various steps in the visual intelligence pipeline. The first part of the dissertation focuses on category-level understanding of 2D images, which is arguably the most critical step in the visual intelligence pipeline as it bridges vision and language. The theme is on automating the process of model improvement: in particular, the architecture of neural networks. The second part extends the visual intelligence pipeline along the language side, and focuses on the more challenging language-level understanding of 2D images. The theme also shifts to diagnosis, by examining existing models, proposing interpretable models, or building diagnostic datasets. The third part continues in the diagnosis theme, this time extending along the vision side, focusing on how incorporating 3D scene knowledge may facilitate the evaluation of image recognition models

    Enhancing Online Security with Image-based Captchas

    Get PDF
    Given the data loss, productivity, and financial risks posed by security breaches, there is a great need to protect online systems from automated attacks. Completely Automated Public Turing Tests to Tell Computers and Humans Apart, known as CAPTCHAs, are commonly used as one layer in providing online security. These tests are intended to be easily solvable by legitimate human users while being challenging for automated attackers to successfully complete. Traditionally, CAPTCHAs have asked users to perform tasks based on text recognition or categorization of discrete images to prove whether or not they are legitimate human users. Over time, the efficacy of these CAPTCHAs has been eroded by improved optical character recognition, image classification, and machine learning techniques that can accurately solve many CAPTCHAs at rates approaching those of humans. These CAPTCHAs can also be difficult to complete using the touch-based input methods found on widely used tablets and smartphones.;This research proposes the design of CAPTCHAs that address the shortcomings of existing implementations. These CAPTCHAs require users to perform different image-based tasks including face detection, face recognition, multimodal biometrics recognition, and object recognition to prove they are human. These are tasks that humans excel at but which remain difficult for computers to complete successfully. They can also be readily performed using click- or touch-based input methods, facilitating their use on both traditional computers and mobile devices.;Several strategies are utilized by the CAPTCHAs developed in this research to enable high human success rates while ensuring negligible automated attack success rates. One such technique, used by fgCAPTCHA, employs image quality metrics and face detection algorithms to calculate a fitness value representing the simulated performance of human users and automated attackers, respectively, at solving each generated CAPTCHA image. A genetic learning algorithm uses these fitness values to determine customized generation parameters for each CAPTCHA image. Other approaches, including gradient descent learning, artificial immune systems, and multi-stage performance-based filtering processes, are also proposed in this research to optimize the generated CAPTCHA images.;An extensive RESTful web service-based evaluation platform was developed to facilitate the testing and analysis of the CAPTCHAs developed in this research. Users recorded over 180,000 attempts at solving these CAPTCHAs using a variety of devices. The results show the designs created in this research offer high human success rates, up to 94.6\% in the case of aiCAPTCHA, while ensuring resilience against automated attacks

    Mothers\u27 Adaptation to Caring for a New Baby

    Get PDF
    To date, most research on parents\u27 adjustment after adding a new baby to their family unit has focused on mothers\u27 initial transition to parenthood. This past research has examined changes in mothers\u27 marital satisfaction and perceived well-being across the transition, and has compared their prenatal expectations to their postnatal experiences. This project assessed first-time and experienced mothers\u27 stress and satisfaction associated with parenting, their adjustment to competing demands, and their perceived well-being longitudinally before and after the birth of a baby. Additionally, how maternal and child-related variables influenced the trajectory of mothers\u27 postnatal adaptation was assessed. These variables included mothers\u27 age, their education level, their prenatal expectations and postnatal experiences concerning shared infant care, their satisfaction with the division of infant caregiving, and their perceptions of their infant\u27s temperament. Mothers (N = 136) completed an online survey during their third trimester and additional online surveys when their baby was approximately 2, 4, 6, and 8 weeks old.;First-time mothers prenatally expected a more equal division of infant caregiving between themselves and their partners than did experienced mothers. Both first-time and experienced mothers reported less assistance from their partners than they had prenatally expected. Additionally, they experienced almost twice as many violated expectations than met expectations. Growth curve modeling revealed that a cubic function of time best fit the trajectory of mothers\u27 postnatal parenting satisfaction. Mothers reported less parenting satisfaction at 4 weeks, compared to 2 and 6 weeks, and reported stability in their satisfaction between 6 and 8 weeks. A quadratic function of time best fit the trajectories of mothers\u27 postnatal parenting stress and adjustment to the demands of their baby. Mothers reported more stress and difficulty adjusting to their baby\u27s demands at 4 and 6 weeks, compared to 2 and 8 weeks. A linear function of time best fit the trajectories of mothers\u27 adjustment to home demands, generalized state anxiety, and depressive symptoms. Mothers reported less difficulty meeting home demands, less generalized anxiety, and fewer depressive symptoms across the postnatal period. Mothers\u27 violated expectations were associated with level differences in all aspects of mothers\u27 postnatal adaptation except their adjustment to home demands. Specifically, more violated expectations, in number or in magnitude, were associated with poorer postnatal adaptation. Mothers\u27 violated expectations were not associated with the slope of mothers\u27 postnatal adaptation trajectories. Exploratory models revealed that other maternal and child-related variables also impacted the level and slope of mothers\u27 postnatal adaptation.;Overall, first-time and experienced mothers were more similar than different in regards to their postnatal adaptation. This study suggests that prior findings concerning adults\u27 initial transition to parenthood may also apply to adults during each addition of a new baby into the family unit. Additionally, mothers who reported less of a mismatch between their expectations and experiences concerning shared infant care had fewer issues adapting the postnatal period. Thus, methods to increase the assistance mothers receive from their partner should be sought. Limitations of this study and suggestions for future research are also discussed

    Allocation of Two Dimensional Parts Using a Shape Reasoning Heuristic.

    Get PDF
    A technique is outlined for the allocation of irregular parts onto arbitrarily shaped resources. Placements are generated by matching complementary shapes between the unplaced parts and the remaining areas of the stock material. The part and resource profiles are characterized to varying levels of detail using geometric features . Information contained in the features is used at each stage of processing to intelligently select and place parts on the resource. Techniques for the efficient handling of complex profiles and other practical implementation issues are described. The utility of the proposed approach is verified using diverse problems from a marine fabrication facility. The formulation and performance of the method is contrasted to previously published works
    • …
    corecore