LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, so suggesting alterations to the query in order to guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus.
Comment: Accepted to CVPR 201
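The core idea of perturbing a query by backpropagation can be sketched in miniature. The toy below is not the LiveSketch model: a simple linear map stands in for the triplet convnet encoder, and plain gradient descent moves the query so that its embedding approaches a chosen target (a cluster centre standing in for an inferred search intent). All names and values are illustrative.

```python
# Hedged sketch: gradient-based query perturbation toward a target embedding.
# A linear map W plays the role of the learned encoder.

def encode(query, W):
    # Toy linear embedding: e = W @ query
    return [sum(w * q for w, q in zip(row, query)) for row in W]

def perturb_query(query, target_emb, W, lr=0.1, steps=50):
    """Move the query so its embedding approaches target_emb,
    minimising L = ||encode(q) - target||^2 / 2 by gradient descent."""
    q = list(query)
    for _ in range(steps):
        e = encode(q, W)
        err = [ei - ti for ei, ti in zip(e, target_emb)]   # dL/de
        # Chain rule: dL/dq_j = sum_i err_i * W[i][j]
        grad = [sum(err[i] * W[i][j] for i in range(len(W)))
                for j in range(len(q))]
        q = [qj - lr * g for qj, g in zip(q, grad)]
    return q

W = [[1.0, 0.0], [0.0, 1.0]]                 # identity embedding for clarity
suggestion = perturb_query([0.0, 0.0], [1.0, 2.0], W)
# the suggestion's embedding now lies close to the target
```

In the paper the "query" is a stroke sequence fed through an RNN-based variational autoencoder, so the gradient flows back into stroke space and the perturbed strokes become visual suggestions; the mechanics of descending toward a target embedding are the same.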
Linking early geospatial documents, one place at a time: annotation of geographic documents with Recogito
Recogito is an open source tool for the semi-automatic annotation of place references in maps and texts. It was developed as part of the Pelagios 3 research project, which aims to build up a comprehensive directory of places referred to in early maps and geographic writing predating the year 1492. Pelagios 3 focuses specifically on sources from the Classical Latin, Greek and Byzantine periods; on Mappae Mundi and narrative texts from the European Medieval period; on Late Medieval Portolans; and on maps and texts from the early Islamic and early Chinese traditions. Since the start of the project in September 2013, the team has harvested more than 120,000 toponyms, manually verifying almost 60,000 of them. Furthermore, the team held two public annotation workshops supported through the Open Humanities Awards 2014. In these workshops, a mixed audience of students and academics of different backgrounds used Recogito to add several thousand contributions on each workshop day.
A number of benefits arise out of this work: on the one hand, the digital identification of places, and the names used for them, makes the documents' contents amenable to information retrieval technology, i.e. documents become more easily searchable and discoverable to users than through conventional metadata-based search alone. On the other hand, the documents are opened up to new forms of re-use. For example, it becomes possible to 'map' and compare the narrative of texts, and the contents of maps, with modern-day tools like Web maps and GIS; or to analyze and contrast documents' geographic properties, toponymy and spatial relationships. Seen in a wider context, we argue that initiatives such as ours contribute to the growing ecosystem of the 'Graph of Humanities Data' that is gathering pace in the Digital Humanities (linking data about people, places, events, canonical references, etc.), which has the potential to open up new avenues for computational and quantitative research in a variety of fields including History, Geography, Archaeology, Classics, Genealogy and Modern Languages.
Recognition by directed attention to recursively partitioned images
A learning/recognition model (and instantiating program, ZBT) is described which recursively combines the learning paradigms of conceptual clustering (Michalski, 1980) and learning-from-examples to resolve the ambiguities of real-world recognition. The model is based on neuropsychological and psychological evidence that the visual system is analytic, hierarchical, and composed of a parallel/serial dichotomy (many; see conclusions by Crick, 1984). Emulating the experimental evidence, parallel processes in the model decompose the image into components and cluster the constituents in much the same way as the image processing technique known as moment analysis (Alt, 1962). Serial, attentive mechanisms then reassemble the decompositions by investigating spatial relationships between components. The use of attentive mechanisms extends the moment analysis technique to handle alterations in structure and solves the contention problem created by combining the two learning paradigms. The contention results from a disagreement between the teacher and the model on what constitutes the salient features at the highest level of the symbol. There are four cases ZBT must handle, two of which result from the disagreement with the teacher. The parallel/serial dichotomy represents a vertical/horizontal tradeoff between the invariant and variant features of a domain. The resultant learned hierarchy allows ZBT to recognize structural differences while avoiding problems of exponential growth.
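The moment analysis that the parallel stage relies on reduces to summing coordinate powers over a binary component. The sketch below is a minimal, illustrative implementation of raw and central image moments (the names and the toy blob are my own, not from the paper); the centroid comes from first-order raw moments, and central moments are position-invariant shape descriptors.

```python
# Hedged sketch of moment analysis (Alt, 1962): raw and central moments
# of a binary image, represented as a list of rows of 0/1 pixel values.

def raw_moment(img, p, q):
    # m_pq = sum over pixels of x^p * y^q * I(x, y)
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

def centroid(img):
    # Centroid (x_bar, y_bar) = (m10/m00, m01/m00)
    m00 = raw_moment(img, 0, 0)
    return raw_moment(img, 1, 0) / m00, raw_moment(img, 0, 1) / m00

def central_moment(img, p, q):
    # mu_pq: moments taken about the centroid, hence translation-invariant
    cx, cy = centroid(img)
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

blob = [[0, 1, 1, 0],
        [0, 1, 1, 0]]
print(centroid(blob))               # (1.5, 0.5): centre of the 2x2 block
print(central_moment(blob, 1, 1))   # 0.0: symmetric about its centroid
```

Higher-order central moments (and normalised combinations of them) give the rotation- and scale-tolerant descriptors that such clustering stages typically operate on.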
RAPID WEBGIS DEVELOPMENT FOR EMERGENCY MANAGEMENT
The use of spatial data during emergency response and management helps to make faster and better decisions. Moreover, spatial data should be as up to date as possible and easy to access. To meet the challenge of rapid, up-to-date data sharing, the use of the internet is widely considered the most efficient solution, and the field of web mapping is constantly evolving. ITHACA (Information Technology for Humanitarian Assistance, Cooperation and Action) is a non-profit association founded by Politecnico di Torino and SITI (Higher Institute for the Environmental Systems) as a joint project with the WFP (World Food Programme). The collaboration with the WFP drives several projects related to Early Warning Systems (e.g. flood and drought monitoring) and Early Impact Systems (e.g. rapid mapping and assessment through remote sensing systems). The Web GIS team has built, and is continuously improving, a complex architecture based entirely on Open Source tools. This architecture comprises three main areas: the database environment, the server-side logic and the client-side logic. Each of them is implemented following the MVC (Model-View-Controller) pattern, i.e. the separation of the different logic layers (database interaction, business logic and presentation). The MVC architecture allows a Web GIS application for data viewing and exploration to be built quickly and easily. In case of emergency, data publication can be performed almost immediately, as soon as data production is completed. The server side is based on the Python language and the Django web development framework, while the client side is based on OpenLayers, GeoExt and Ext.js, which manage data retrieval and the user interface. The MVC pattern applied to JavaScript keeps the interface generation and data retrieval logic separate from the general application configuration, so the server-side environment can take care of generating the configuration file.
The web application building process is data-driven and can be considered a view of the current architecture, composed of data and data interaction tools. Once fully automated, the Web GIS application building process can be performed directly by the final user, who can customize the data layers and the controls used to interact with the data.
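The data-driven idea described above can be sketched as a server-side function that turns a layer catalogue into the JSON configuration the OpenLayers/GeoExt client reads to build its interface. The field names and catalogue below are illustrative assumptions, not the actual ITHACA schema.

```python
# Hedged sketch: server-side generation of a client map configuration from
# a catalogue of layer records, in the spirit of the architecture above.
import json

def build_client_config(layers):
    """Turn a list of layer records into a client-side map configuration."""
    return {
        "map": {"projection": "EPSG:4326"},       # illustrative default
        "layers": [
            {
                "name": lyr["name"],
                "type": lyr["type"],              # e.g. "WMS"
                "url": lyr["url"],
                "visible": lyr.get("visible", True),
            }
            for lyr in layers
        ],
    }

catalogue = [
    {"name": "flood_extent", "type": "WMS", "url": "/geoserver/wms"},
]
print(json.dumps(build_client_config(catalogue), indent=2))
```

Because the client only interprets this configuration, publishing a new emergency dataset reduces to adding a record to the catalogue; no client code changes are needed, which is what makes near-immediate publication possible.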
A Review of Neural Network Approach on Engineering Drawing Recognition and Future Directions
Engineering Drawing (ED) digitization is a crucial aspect of modern industrial processes, enabling efficient data management and facilitating automation. However, the accurate detection and recognition of ED elements pose significant challenges. This paper presents a comprehensive review of existing research on ED element detection and recognition, focusing on the role of neural networks in improving the analysis process. The study evaluates the performance of the YOLOv7 model in detecting ED elements through rigorous experimentation. The results indicate promising precision and recall rates of up to 87.6% and 74.4%, respectively, with a mean average precision (mAP) of 61.1% at an IoU threshold of 0.5. Despite these advancements, achieving 100% accuracy remains elusive due to factors such as symbol and text overlapping, limited dataset sizes, and variations in ED formats. Overcoming these challenges is vital to ensuring the reliability and practical applicability of ED digitization solutions. By comparing the YOLOv7 results with previous research, the study underscores the efficacy of neural network-based approaches in handling ED element detection tasks. However, further investigation is necessary to address these challenges effectively. Future research directions include exploring ensemble methods to improve detection accuracy, fine-tuning model parameters to enhance performance, and incorporating domain adaptation techniques to adapt models to specific ED formats and domains. To enhance the real-world viability of ED digitization solutions, this work highlights the importance of testing on diverse datasets representing different industries and applications. Additionally, fostering collaborations between academia and industry will enable the development of tailored solutions that meet specific industrial needs.
Overall, this research contributes to understanding the challenges in ED digitization and paves the way for future advancements in this critical field.
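The quoted precision, recall and mAP figures all rest on scoring detections at an IoU threshold of 0.5. A minimal sketch of that scoring, on toy boxes of my own invention (this is the standard detection-evaluation recipe, not code from the paper):

```python
# Hedged sketch: IoU matching of predicted boxes to ground truth at a
# threshold of 0.5, as used for the detection metrics quoted above.
# Boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    # Intersection-over-union of two axis-aligned boxes.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(preds, truths, thr=0.5):
    # Greedy one-to-one matching: each truth box may match at most once.
    matched = set()
    tp = 0
    for p in preds:
        best = max(range(len(truths)),
                   key=lambda i: iou(p, truths[i]), default=None)
        if best is not None and best not in matched and iou(p, truths[best]) >= thr:
            matched.add(best)
            tp += 1
    return tp / len(preds), tp / len(truths)

preds  = [(0, 0, 10, 10), (20, 20, 30, 30)]
truths = [(1, 1, 10, 10)]
p, r = precision_recall(preds, truths)
# the first prediction matches the truth box; the second is a false positive
```

mAP then averages precision over recall levels (and, for mAP@0.5, fixes the IoU threshold at 0.5); overlapping symbols and text hurt exactly this matching step, which is why they cap accuracy.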
New trends on digitisation of complex engineering drawings
Engineering drawings are commonly used across different industries such as oil and gas, mechanical engineering and others. Digitising these drawings is becoming increasingly important, mainly due to the legacy of drawings and documents that may provide a rich source of information for these industries. Analysing these drawings often requires applying a set of digital image processing methods to detect and classify symbols and other components. Despite the recent significant advances in image processing, and in particular in deep neural networks, automatic analysis and processing of these engineering drawings is still far from complete. This paper presents a general framework for complex engineering drawing digitisation. A thorough and critical review of relevant literature, methods and algorithms in machine learning and machine vision is presented. A real-life industrial scenario on how to contextualise the digitised information from a specific type of these drawings, namely piping and instrumentation diagrams, is discussed in detail. A discussion of how new trends in machine vision, such as deep learning, could be applied to this domain is presented, with conclusions and suggestions for future research directions.
Forest cover mask from historical topographic maps based on image processing
This study aimed to obtain accurate binary forest masks which might be directly used in analysis of land cover changes over large areas. A sequence of image processing operations was conceived, parameterized and tested using various topographic maps from mountain areas in Poland and Switzerland. First, the input maps were filtered and binarized by thresholding in Hue-Saturation-Value colour space. The second step consisted of a set of morphological image analysis procedures leading to final forest masks. The forest masks were then assessed and compared to manual forest boundary vectorization. The Polish topographical map published in the 1930s showed low accuracy which could be attributed to methods of cartographic presentation used and degradation of original colour prints. For maps published in the 1970s, the automated forest extraction performed very well, with accuracy exceeding 97%, comparable to accuracies of manual vectorization of the same maps performed by nontrained operators. With this method, we obtained a forest cover mask for the entire area of the Polish Carpathians, easily readable in any Geographic Information System software.
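The two-step pipeline described above can be sketched compactly: threshold pixels in HSV space to get a rough forest mask, then clean it with a morphological opening (erosion followed by dilation). The colour bounds and the 3x3 structuring element below are illustrative assumptions, not the values tuned for the Polish and Swiss map series.

```python
# Hedged sketch: HSV thresholding followed by a 3x3 morphological opening.
# Images are lists of rows of (R, G, B) tuples; masks are rows of 0/1.
import colorsys

def hsv_threshold(rgb_img, h_lo=0.20, h_hi=0.45, s_min=0.2, v_min=0.2):
    # Keep pixels whose hue falls in a greenish band (illustrative bounds).
    mask = []
    for row in rgb_img:
        out = []
        for (r, g, b) in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            out.append(1 if (h_lo <= h <= h_hi and s >= s_min and v >= v_min) else 0)
        mask.append(out)
    return mask

def erode(mask):
    # 3x3 erosion: a pixel survives only if all 9 neighbours are set.
    h, w = len(mask), len(mask[0])
    return [[1 if all(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def dilate(mask):
    # 3x3 dilation: a pixel is set if any of its 9 neighbours is set.
    h, w = len(mask), len(mask[0])
    return [[1 if any(0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def forest_mask(rgb_img):
    # Threshold, then open (erode + dilate) to remove isolated specks.
    return dilate(erode(hsv_threshold(rgb_img)))
```

The opening removes speckle smaller than the structuring element (map symbols, print noise) while preserving large forest polygons, which is why the morphological stage matters as much as the colour threshold.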