200 research outputs found

    Next Generation of Product Search and Discovery

    Online shopping has become an important part of people’s daily life with the rapid development of e-commerce. In some domains such as books, electronics, and CDs/DVDs, online shopping has surpassed or even replaced traditional shopping. Compared with traditional retailing, e-commerce is information intensive. One of the key factors for success in e-business is making it easy for consumers to discover products. Conventionally, a product search engine based on keyword search or a category browser is provided to help users find the product information they need. The general goal of a product search system is to enable users to quickly locate information of interest and to minimize their effort in search and navigation. Human factors play a significant role in this process. Finding product information can be a tricky task that may require intelligent use of search engines and non-trivial navigation of multilayer categories. Searching for useful product information can be frustrating for many users, especially inexperienced ones. This dissertation focuses on developing a new visual product search system that effectively extracts the properties of unstructured products and presents potentially attractive items so that users can quickly locate the ones they are most likely to be interested in. We designed and developed a feature extraction algorithm that retains product color and local pattern features, and experimental evaluation on the benchmark dataset demonstrated that it is robust against common geometric and photometric visual distortions. In addition, instead of ignoring product text information, we investigated and developed a ranking model learned via a unified probabilistic hypergraph that is capable of capturing correlations among product visual content and textual content. 
Moreover, we proposed and designed a fuzzy hierarchical co-clustering algorithm for collaborative filtering product recommendation. With this method, users can be automatically grouped into different interest communities based on their behaviors. A customized recommendation can then be performed according to these implicitly detected relations. In summary, the developed search system performs much better in visual unstructured product search than state-of-the-art approaches. With the comprehensive ranking scheme and the collaborative filtering recommendation module, the user’s overhead in locating valuable information is reduced, and the user’s experience of seeking useful product information is optimized.

    Computer Vision System for Tactode Programming

    Tangible programming, when applied to robotics, makes programming more understandable and straightforward. This type of programming helps children develop their programming and computational thinking skills interactively and at an earlier age. From this idea came Tactode: a tangible programming system composed of puzzle-like pieces and a web application aimed at robot programming. The target users of this system are children who, using the pieces, build a tangible code, take a picture of it, and then upload it to the application to be tested and executed on the robot later. The Tactode project is built on ArUco markers: each piece has a marker of this type that facilitates its detection and distinction in the tangible code. Therefore, this dissertation continues this project through the development of a computer vision system capable of detecting and identifying each piece in photographs of Tactode codes without depending on the ArUco markers.

    Autonomous vehicle guidance in unknown environments

    Benefiting from significant performance advances brought by technological evolution, autonomous vehicles are rapidly expanding into new fields of application. From operations in hostile, dangerous environments (military use in removing unexploded projectiles, surveys of nuclear power and chemical industrial plants following accidents) to repetitive 24-hour tasks (border surveillance), from power multipliers that help in production to less exotic commercial applications in household activities (cleaning robots as consumer electronics products), the combination of autonomy and motion now offers impressive options. An autonomous vehicle can be equipped with a number of sensors, actuators, and devices that enable it to carry out a large number of tasks. However, to attain these results, the vehicle must be able to navigate its path in different, sometimes unknown environments. This is the goal of this dissertation: to analyze and, mainly, to propose a suitable solution for the guidance of autonomous vehicles. The research builds on the activity carried out at the Guidance and Navigation Lab of Sapienza – Università di Roma, hosted at the School of Aerospace Engineering. The proposed solution therefore has an intrinsic, though not limiting, bias towards possible space applications, which will become obvious in some of the following content. A second bias dictated by the Guidance and Navigation Lab activities is the choice of a sample platform. It would be difficult to perform a meaningful study at a very general level, independent of the characteristics of the targeted kind of vehicle: the rough list of applications cited above shows that these characteristics are extremely varied. 
Even before the beginning of this thesis activity, the Lab hosted a simple, in-house designed and manufactured model of a small yet sufficiently capable autonomous vehicle, called RAGNO (Rover for Autonomous Guidance Navigation and Observation). It was an obvious choice to select that rover as the reference platform to identify solutions for guidance, and to use it, while contributing to its improvement, for the test activities that should be considered mandatory in this kind of thesis work to validate the suggested approaches. The draft of the thesis includes four main chapters, plus introduction, final remarks and future perspectives, and the list of references. The first chapter (“Autonomous Guidance Exploiting Stereoscopic Vision”) investigates in detail the technique deemed the most interesting for small vehicles. The current availability of low-cost, high-performance cameras suggests adopting stereoscopic vision as an effective technique, which can also give a remote crew a view of the scenario similar to the one humans would have. Several advanced image analysis techniques have been investigated for extracting features from left- and right-eye images, with the SURF and BRISK algorithms selected as the most promising. In short, SURF is a blob detector with an associated descriptor of 64 elements, where the generic feature is extracted by applying sequential box filters to the surrounding area. The features are then localized at the points of the image where the determinant of the Hessian matrix H(x,y) is maximum. The descriptor vector is then determined by calculating the Haar wavelet response in a sampling pattern centered on the feature. BRISK is instead a corner detector with an associated binary descriptor of 512 bits. 
The generic feature is identified as the brightest point in a circular sampling area of N pixels, while the descriptor vector is calculated by computing the brightness gradient of each of the N(N-1)/2 pairs of sampling points. Once left and right features have been extracted, their descriptors are compared in order to determine the corresponding pairs. The matching criterion consists of finding the two descriptors whose relative distance (Euclidean norm for SURF, Hamming distance for BRISK) is minimum. The matching process is computationally expensive: to reduce the required time, the thesis successfully exploited epipolar geometry, based on the geometric constraint existing between the left and right projections of the scene point P, thereby limiting the space to be searched. Overall, the selected techniques require between 200 and 300 ms on a 2.4 GHz CPU for feature extraction and matching in a single (left+right) capture, making it a feasible solution for slow-moving vehicles. Once the matching phase has been completed, a disparity map can be prepared highlighting the position of the identified objects, and by means of triangulation (the baseline between the two cameras is known, and the size of the targeted object is measured in pixels in both images) the position and distance of the obstacles can be obtained. The second chapter (“A Vehicle Prototype and its Guidance System”) is devoted to the implementation of stereoscopic vision onboard a small test vehicle, the previously cited RAGNO rover. 
A description of the vehicle is included first, covering the chassis, the propulsion system with four electric motors powering the wheels, the good roadside performance attainable, and the commanding options (fully autonomous, partly autonomous with remote monitoring, or fully remotely controlled via TCP/IP on mobile networks), with a focus on the different sensors that, depending on the scenario, can complement the stereoscopic vision system. The intelligence side of the guidance subsystem, exploiting the navigation information provided by the camera, is then detailed. Two guidance techniques have been studied and implemented to identify the optimal trajectory in a field with scattered obstacles: the artificial potential guidance, based on the Lyapunov approach, and the A-star algorithm, which looks for the minimum of a cost function built on graphs joining the cells of a mesh superimposed on the scenario. The performance of the two techniques is assessed for two specific test cases, and the possibility of unstable behavior of the artificial potential guidance, bouncing among local minima, has been highlighted. Overall, A-star guidance is the suggested solution in terms of time, cost and reliability. Note that, to cope with the noise affecting information from sensors, an estimation process based on Kalman filtering has also been included to improve the smoothness of the targeted trajectory. The third chapter (“Examples of Possible Missions and Applications”) reports two experimental campaigns adopting RAGNO for the detection of dangerous gases. In the first one, the rover carries a specific sensor and autonomously moves in open fields, avoiding possible obstacles, to take measurements at given time intervals. 
The same configuration for RAGNO is also used in the second campaign: this time, however, the path of the rover is autonomously computed on the basis of the waypoints communicated by a drone which flies above the measurement area and identifies possible targets of interest. The fourth chapter (“Guidance of Fleet of Autonomous Vehicles”) builds on this successful idea of a fleet of vehicles, and numerically investigates, by means of algorithms purposely written in Matlab, the performance of a simple swarm of two rovers exploring an unknown scenario, intended, as an example, to represent a case of planetary surface exploration. The awareness of the surrounding environment is dictated by the characteristics of the sensors accommodated onboard, which have been assumed on the basis of the experience gained with the material of the previous chapter. Moreover, the communication issues that would likely affect real-world cases are included in the scheme through the possibility of modeling the communication link, and by running the simulation in a multi-task configuration where the two rovers are assigned to two different computer processes, each having a different TCP/IP address, with a behavior actually depending on the flow of information received from the other explorer. Even if at a simulation level only, this final step is deemed to bring together the different aspects investigated during the PhD period: feasible sensor characteristics (obviously focusing on stereoscopic vision), guidance technique, coordination among autonomous agents, and possible interesting application cases.
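
The matching and triangulation steps described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes rectified images (so the epipolar constraint reduces to "search the same image row"), binary descriptors stored as plain integers, and the pinhole relation depth = focal length × baseline / disparity. The names `match_features`, `depth_from_disparity`, `max_row_diff`, and `max_dist` are hypothetical.

```python
from typing import List, Tuple

# (x, y, descriptor): pixel position plus a binary descriptor packed in an int.
Feature = Tuple[float, float, int]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def match_features(
    left: List[Feature],
    right: List[Feature],
    max_row_diff: float = 1.0,   # assumed tolerance on the epipolar line (pixels)
    max_dist: int = 64,          # assumed upper bound on an acceptable distance
) -> List[Tuple[int, int]]:
    """Pair each left feature with the right feature of minimum Hamming distance.

    Only candidates on (nearly) the same row and with positive disparity are
    examined, which is the rectified-image form of the epipolar constraint.
    """
    matches = []
    for i, (xl, yl, dl) in enumerate(left):
        best_j, best_d = None, max_dist + 1
        for j, (xr, yr, dr) in enumerate(right):
            if abs(yl - yr) > max_row_diff:
                continue  # off the epipolar line: skip without comparing
            if xl - xr <= 0:
                continue  # a point in front of the cameras has disparity > 0
            d = hamming(dl, dr)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
    return matches


def depth_from_disparity(
    x_left: float, x_right: float, focal_px: float, baseline_m: float
) -> float:
    """Triangulate depth (metres) for a rectified stereo pair."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity
```

With a hypothetical 700-pixel focal length and 0.12 m baseline, a feature at x = 120 in the left image matched to x = 100 in the right image would be placed about 4.2 m away. Restricting the search to one row is what cuts the cost of the otherwise all-pairs descriptor comparison.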

    A systemic approach to automatic metadata extraction from multimedia content

    There is a need for automatic processing and extraction of meaningful metadata from multimedia information, especially in the audiovisual industry. This higher-level information is used in a variety of practices, such as enriching multimedia content with external links, clickable objects, and useful related information in general. This paper presents a system for efficient multimedia content analysis and automatic annotation within a multimedia processing and publishing framework. The system comprises three modules: the first provides detection of faces and recognition of known persons; the second provides generic object detection, based on a deep convolutional neural network topology; the third provides automated location estimation and landmark recognition based on state-of-the-art technologies. The results are exported as meaningful metadata that can be utilized in various ways. The system has been successfully tested in the framework of the EC Horizon 2020 Mecanex project, targeting advertising and production markets.

    ASSESSMENT OF ELECTRO-OPTICAL IMAGING TECHNOLOGY FOR UNMANNED AERIAL SYSTEM NAVIGATION IN A GPS-DENIED ENVIRONMENT

    Navigation systems of unmanned aircraft systems (UAS) are heavily dependent on the availability of Global Positioning Systems (GPS) or other Global Navigation Satellite Systems (GNSS). Although inertial navigation systems (INS) can provide position and velocity of an aircraft based on acceleration measurements, the information degrades over time and reduces the capability of the system. In a GPS-denied environment, a UAS must utilize alternative sensor sources for navigating. This thesis presents preliminary evaluation results on the usage of onboard down-looking electro-optical sensors and image matching techniques to assist in GPS-free navigation of aerial platforms. Following the presentation of the fundamental mathematics behind the proposed concept, the thesis analyzes the key results from three flight campaign experiments that use different sets of sensors to collect data. Each of the flight experiments explores different sensor setups, assesses a variety of image processing methods, looks at different terrain environments, and reveals limitations related to the proposed approach. In addition, an attempt to incorporate navigational aid solutions into a navigation system using a Kalman filter is demonstrated. The thesis concludes with recommendations for future research on developing an integrated navigation system that relies on inertial measurement unit data complemented by the positional fixes from the image-matching technique. Outstanding Thesis. Civilian, DSO National Laboratories, Singapore. Approved for public release. Distribution is unlimited.
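
The idea of complementing drifting inertial data with occasional positional fixes through a Kalman filter can be sketched as follows. This is a one-dimensional illustration under assumed simplifications (state of position and velocity only, acceleration as the control input, position fixes as the only measurement), not the navigation system from the thesis. The class `PositionVelocityFilter` and its parameters `accel_var` and `meas_var` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PositionVelocityFilter:
    """1-D Kalman filter with state [position, velocity] (illustrative only)."""

    pos: float = 0.0
    vel: float = 0.0
    # 2x2 state covariance, stored row by row.
    cov: List[List[float]] = field(
        default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]]
    )

    def predict(self, accel: float, dt: float, accel_var: float) -> None:
        """Propagate the state with one inertial acceleration sample."""
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        (p00, p01), (p10, p11) = self.cov
        # P <- F P F^T with F = [[1, dt], [0, 1]]
        n00 = p00 + dt * (p10 + p01) + dt * dt * p11
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11
        # Add process noise Q driven by the acceleration uncertainty.
        self.cov = [
            [n00 + accel_var * dt ** 4 / 4.0, n01 + accel_var * dt ** 3 / 2.0],
            [n10 + accel_var * dt ** 3 / 2.0, n11 + accel_var * dt ** 2],
        ]

    def update(self, measured_pos: float, meas_var: float) -> None:
        """Correct the state with a position fix (e.g. from image matching)."""
        (p00, p01), (p10, p11) = self.cov
        innovation = measured_pos - self.pos
        s = p00 + meas_var              # innovation variance
        k0, k1 = p00 / s, p10 / s       # Kalman gain for H = [1, 0]
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # P <- (I - K H) P
        self.cov = [
            [(1.0 - k0) * p00, (1.0 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
```

In use, `predict` would be called at the inertial sample rate and `update` only when a position fix is available. Between fixes the position variance grows, which reflects the degradation of inertial-only navigation mentioned in the abstract, and each fix shrinks it again.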

    Cable Tension Monitoring using Non-Contact Vision-based Techniques

    In cable-stayed bridges, the structural systems of tensioned cables play a critical role in structural and functional integrity. Consequently, tensile forces in the cables are one of the essential indicators in structural health monitoring (SHM). In this thesis, a video image processing technology integrated with cable dynamic analysis is proposed as a non-contact vision-based measurement technique, which provides a user-friendly, cost-effective, and computationally efficient solution to displacement extraction, frequency identification, and cable tension monitoring. In contrast to conventional contact sensors, the vision-based system is capable of taking remote measurements of cable dynamic response while having flexible sensing capability. Since cable detection is a substantial step in displacement extraction, a comprehensive study on the feasibility of the adopted feature detector is conducted under various testing scenarios. The performance of the feature detector is quantified by developing evaluation parameters. Enhancement methods for the feature detector in cable detection are also investigated under complex testing environments. Threshold-dependent image matching approaches, which optimize the functionality of the feature-based video image processing technology, are proposed for noise-free and noisy background scenarios. The vision-based system is validated through experimental studies of free vibration tests on a single undamped cable in laboratory settings. The maximum percentage difference of the identified cable fundamental frequency is found to be 0.74% compared with accelerometer readings, while the maximum percentage difference of the estimated cable tensile force is 4.64% compared to direct measurement by a load cell.
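
The chain from extracted displacement to frequency to tension can be sketched in a few lines of Python. The abstract does not state which cable model the thesis uses, so this sketch assumes the classical taut-string relation T = 4 m L² (f_n / n)², which neglects sag and bending stiffness, and identifies the frequency as the largest peak of a plain discrete Fourier transform. The function names and parameters are hypothetical.

```python
import cmath
import math
from typing import Sequence


def dominant_frequency(signal: Sequence[float], sample_rate_hz: float) -> float:
    """Return the frequency (Hz) of the largest DFT peak, ignoring the mean.

    Uses a direct O(N^2) DFT, which is adequate for short displacement records.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate_hz / n


def taut_string_tension(
    freq_hz: float, length_m: float, mass_per_length: float, mode: int = 1
) -> float:
    """Tension (N) from the n-th natural frequency of an ideal taut string."""
    return 4.0 * mass_per_length * length_m ** 2 * (freq_hz / mode) ** 2
```

For example, a hypothetical 10 m cable with 1.5 kg/m linear mass vibrating at a fundamental frequency of 2.5 Hz would carry 3750 N under this model. Because tension scales with the square of frequency, a small relative error in the identified frequency roughly doubles in the tension estimate.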