8,150 research outputs found

    Framework for extracting and solving combination puzzles

    This thesis investigates how computer vision and stereo vision algorithms can be applied to the problem of object detection, and in particular whether computer vision can aid puzzle solving. The idea of using a computer application for puzzle solving came from the observation that all solution techniques are ultimately algorithms. Machines execute algorithms well: a machine needs milliseconds to compute a solution that may take a human minutes or hours. However, machines cannot see puzzles from a human perspective and therefore cannot analyze them. Hence, the contribution of this thesis is to study computer vision approaches drawn from unrelated solutions and apply them to the problem of translating a physical puzzle model into an abstract structure that a machine can understand and solve. Little has been written on this subject, so there is considerable room to contribute. This is achieved through empirical research, in the form of a set of experiments, to determine which approaches are suitable. To accomplish these goals, a large body of computer vision theory was studied. In addition, the relevance of real-time operation was taken into account through a study of real-time Structure from Motion algorithms (SLAM, PTAM), which have been successfully applied to navigation and augmented reality problems but not to the extraction of object characteristics. This thesis examines how these different approaches can be applied to the given problem to help inexperienced users solve combination puzzles.
Moreover, as a side effect, it makes it possible to track an object's movement (rotation, translation), which can be used to manipulate a rendered game puzzle and increase the user's interactivity and engagement.

    Estimating Epipolar Geometry With The Use of a Camera Mounted Orientation Sensor

    Context: Image processing and computer vision are rapidly becoming commonplace, and the amount of information about a scene, such as its 3D geometry, that can be obtained from one or more images is steadily increasing, driven by higher resolutions, wider availability of imaging sensors, and an active research community. In parallel, advances in hardware design and manufacturing allow devices such as gyroscopes, accelerometers, magnetometers and GPS receivers to be included alongside imaging devices at a consumer level. Aims: This work investigates the use of orientation sensors in computer vision as sources of data to aid image processing and the determination of a scene’s geometry, in particular the epipolar geometry of a pair of images. It devises a hybrid methodology from two sets of previous works in order to exploit the information available from orientation sensors alongside data gathered from image processing techniques. Method: A readily available consumer-level orientation sensor was used alongside a digital camera to capture images of a set of scenes and record the orientation of the camera. The fundamental matrix of each pair of images was calculated using a variety of techniques, both with and without data from the orientation sensor. Results: Some methodologies could not produce an acceptable fundamental matrix for certain image pairs. A method described in the literature that used an orientation sensor always produced a result; however, in cases where the hybrid or purely computer vision methods also produced a result, the sensor-based method was found to be the least accurate.
Conclusion: Results from this work show that using an orientation sensor to capture information alongside an imaging device can improve both the accuracy and reliability of calculations of the scene’s geometry. However, noise from the orientation sensor can limit this accuracy, and further research is needed to determine the magnitude of this problem and methods of mitigation.

    Probabilistic framework for image understanding applications using Bayesian Networks

    Machine learning algorithms have been successfully utilized in various systems/devices. They can improve the usability and quality of such systems in terms of intelligent user interfaces, fast performance, and, more importantly, high accuracy. In this research, machine learning techniques are used in the field of image understanding, a research area shared by image analysis and computer vision, to apply higher-level processing to a target image in order to make sense of the scene captured in it. A general probabilistic framework for image understanding is discussed, covering topics associated with (i) collection of images to generate a comprehensive and valid database, (ii) generation of an unbiased ground truth for that database, (iii) selection of classification features and elimination of redundant ones, and (iv) use of this information to test a new sample set. Two research projects have been developed as examples of the general image understanding framework: identification of region(s) of interest, and image segmentation evaluation. These techniques, in addition to others, are combined in an object-oriented rendering system for printing applications. The discussion in this doctoral dissertation explores the means for developing such a system from an image understanding/processing perspective. It is worth noting that this work does not aim to develop a printing system. It only proposes adding some essential features to current printing pipelines to achieve better visual quality when printing images/photos. Hence, we assume that image regions have been successfully extracted from the printed document. These images are used as input to the proposed object-oriented rendering algorithm, which employs methodologies for color image segmentation, region-of-interest identification and semantic feature extraction.
Probabilistic approaches based on Bayesian statistics have been used to develop the proposed image understanding techniques.