497 research outputs found

    Solving Visual Madlibs with Multiple Cues

    This paper focuses on answering fill-in-the-blank style multiple choice questions from the Visual Madlibs dataset. Previous approaches to Visual Question Answering (VQA) have mainly used generic image features from networks trained on the ImageNet dataset, despite the wide scope of questions. In contrast, our approach employs features derived from networks trained for specialized tasks of scene classification, person activity prediction, and person and object attribute prediction. We also present a method for selecting sub-regions of an image that are relevant for evaluating the appropriateness of a putative answer. Visual features are computed both from the whole image and from local regions, while sentences are mapped to a common space using a simple normalized canonical correlation analysis (CCA) model. Our results show a significant improvement over the previous state of the art, and indicate that answering different question types benefits from examining a variety of image cues and carefully choosing informative image sub-regions

    Recycling of epidermal growth factor-receptor complexes in A431 cells: identification of dual pathways

    The intracellular sorting of EGF-receptor complexes (EGF-RC) has been studied in human epidermoid carcinoma A431 cells. Recycling of EGF was found to occur rapidly after internalization at 37 degrees C. The initial rate of EGF recycling was reduced at 18 degrees C. A significant pool of internalized EGF was incapable of recycling at 18 degrees C but began to recycle when cells were warmed to 37 degrees C. The relative rate of EGF outflow at 37 degrees C from cells exposed to an 18 degrees C temperature block was slower (t1/2 approximately 20 min) than the rate from cells not exposed to a temperature block (t1/2 approximately 5-7 min). These data suggest that there might be both short- and long-time cycles of EGF recycling in A431 cells. Examination of the intracellular EGF-RC dissociation and dynamics of short- and long-time recycling indicated that EGF recycled as EGF-RC. Moreover, EGF receptors that were covalently labeled with a photoactivatable derivative of 125I-EGF recycled via the long-time pathway at a rate similar to that of 125I-EGF. Since EGF-RC degradation was also blocked at 18 degrees C, we propose that sorting to the lysosomal and long-time recycling pathway may occur after a highly temperature-sensitive step, presumably in the late endosomes

    Salient Objects in Clutter: Bringing Salient Object Detection to the Foreground

    We provide a comprehensive evaluation of salient object detection (SOD) models. Our analysis identifies a serious design bias in existing SOD datasets, which assume that each image contains at least one clearly outstanding salient object in low clutter. This design bias has led to saturated high performance for state-of-the-art SOD models when evaluated on existing datasets. The models, however, still perform far from satisfactorily when applied to real-world daily scenes. Based on our analyses, we first identify 7 crucial aspects that a comprehensive and balanced dataset should fulfill. Then, we propose a new high-quality dataset and update the previous saliency benchmark. Specifically, our SOC (Salient Objects in Clutter) dataset includes images with salient and non-salient objects from daily object categories. Beyond object category annotations, each salient image is accompanied by attributes that reflect common challenges in real-world scenes. Finally, we report attribute-based performance assessment on our dataset. Comment: ECCV 201

    Direct Image to Point Cloud Descriptors Matching for 6-DOF Camera Localization in Dense 3D Point Cloud

    We propose a novel concept to directly match feature descriptors extracted from RGB images with feature descriptors extracted from 3D point clouds. We use this concept to localize the position and orientation (pose) of the camera of a query image in dense point clouds. We generate a dataset of matching 2D and 3D descriptors, and use it to train a proposed Descriptor-Matcher algorithm. To localize a query image in a point cloud, we extract 2D keypoints and descriptors from the query image. The Descriptor-Matcher is then used to find corresponding pairs of 2D and 3D keypoints by matching the 2D descriptors with the pre-extracted 3D descriptors of the point cloud. This information is used in a robust pose estimation algorithm to localize the query image in the 3D point cloud. Experiments demonstrate that directly matching 2D and 3D descriptors is not only a viable idea but also achieves competitive accuracy compared to other state-of-the-art approaches for camera pose localization.

    The history of degenerate (bipartite) extremal graph problems

    This paper is a survey on Extremal Graph Theory, primarily focusing on the case when one of the excluded graphs is bipartite. On one hand we give an introduction to this field and also describe many important results, methods, problems, and constructions. Comment: 97 pages, 11 figures, many problems. This is the preliminary version of our survey presented in Erdos 100. In this version 2 only a citation was complete

    Asymptotic Limits and Zeros of Chromatic Polynomials and Ground State Entropy of Potts Antiferromagnets

    We study the asymptotic limiting function $W({G},q) = \lim_{n \to \infty} P(G,q)^{1/n}$, where $P(G,q)$ is the chromatic polynomial for a graph $G$ with $n$ vertices. We first discuss a subtlety in the definition of $W({G},q)$ resulting from the fact that at certain special points $q_s$, the following limits do not commute: $\lim_{n \to \infty} \lim_{q \to q_s} P(G,q)^{1/n} \ne \lim_{q \to q_s} \lim_{n \to \infty} P(G,q)^{1/n}$. We then present exact calculations of $W({G},q)$ and determine the corresponding analytic structure in the complex $q$ plane for a number of families of graphs ${G}$, including circuits, wheels, biwheels, bipyramids, and (cyclic and twisted) ladders. We study the zeros of the corresponding chromatic polynomials and prove a theorem that for certain families of graphs, all but a finite number of the zeros lie exactly on a unit circle, whose position depends on the family. Using the connection of $P(G,q)$ with the zero-temperature Potts antiferromagnet, we derive a theorem concerning the maximal finite real point of non-analyticity in $W({G},q)$, denoted $q_c$, and apply this theorem to deduce that $q_c(sq)=3$ and $q_c(hc) = (3+\sqrt{5})/2$ for the square and honeycomb lattices. Finally, numerical calculations of $W(hc,q)$ and $W(sq,q)$ are presented and compared with series expansions and bounds. Comment: 33 pages, Latex, 5 postscript figures, published version; includes further comments on large-q serie

    Single view silhouette fitting techniques for estimating tennis racket position

    Stereo camera systems have been used to track markers attached to a racket, allowing its position to be obtained in three-dimensional (3D) space. Typically, markers are manually selected on the image plane, but this can be time-consuming. A markerless system based on one stationary camera estimating 3D racket position data is desirable for research and play. The markerless method presented in this paper relies on a set of racket silhouette views in a common reference frame captured with a calibrated camera and a silhouette of a racket captured with a camera whose relative pose is outside the common reference frame. The aim of this paper is to provide validation of these single view fitting techniques to estimate the pose of a tennis racket. This includes the development of a calibration method to provide the relative pose of a stationary camera with respect to a racket. Mean static racket position was reconstructed to within ±2 mm. Computer generated camera poses and silhouette views of a full size racket model were used to demonstrate the potential of the method to estimate 3D racket position during a simplified serve scenario. From a camera distance of 14 m, 3D racket position was estimated providing a spatial accuracy of 1.9 ± 0.14 mm, similar to recent 3D video marker tracking studies of tennis

    Modelling search for people in 900 scenes: A combined source model of eye guidance

    How predictable are human eye movements during search in real world scenes? We recorded 14 observers’ eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real world scenes. Further improvements in modelling should capture mechanisms underlying the selectivity of observers’ fixations during search. National Eye Institute (Integrative Training Program in Vision grant T32 EY013935); Massachusetts Institute of Technology (Singleton Graduate Research Fellowship); National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER Award (0546262)); National Science Foundation (U.S.) (NSF contract (0705677)); National Science Foundation (U.S.) (Career Award (0747120)