Robust approach to object recognition through fuzzy clustering and hough transform based methods
This work considers object detection from two-dimensional intensity images and three-dimensional range images. The emphasis is on the robust detection of shapes such as cylinders, spheres, cones, and planar surfaces, which are typical in mechanical and manufacturing engineering applications. Based on an analysis of different Hough transform (HT) methods, a novel method called the Fast Randomized Hough Transform (FRHT) is proposed. The key idea of FRHT is to divide the original image into multiple regions and apply random sampling to map data points from the image space into the parameter (feature) space, and then to obtain the parameters of the true clusters. This yields several desirable characteristics: high computation speed, low memory requirements, high result resolution, and an infinite parameter space. The project also considers fuzzy clustering techniques, in particular the Fuzzy C Quadric Shells (FCQS) clustering algorithm, which is combined with the concept of a noise prototype to form the Noise FCQS (NFCQS) clustering algorithm, a variant that is robust against noise. A novel integrated clustering algorithm combining the advantages of the FRHT and NFCQS methods is then proposed. It is shown to be a robust clustering algorithm with distinct advantages: the number of clusters need not be known in advance, the results are independent of initialization, the detection accuracy is greatly improved, and the computation is very fast. Recent concepts from robust statistics, such as least trimmed squares (LTS) estimation, the minimum volume ellipsoid (MVE) estimator, and the generalized MVE, are also used to develop a new robust algorithm called the generalized LTS for Quadric Surfaces (GLTS-QS) algorithm. The experimental results indicate that the clustering method combining FRHT and GLTS-QS can improve clustering performance.
Moreover, a new cluster validity method for circular clusters is proposed, which considers the distribution of the points on the circular edge. Different methods for computing the distance of a point from a cluster boundary, a common issue in all range image clustering algorithms, are also discussed. The performance of all these algorithms is tested using various real and synthetic range and intensity images. An application of the robust clustering methods to experimental granular flow research is also included.
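The random-sampling idea described in this abstract can be illustrated with a minimal sketch of a generic randomized Hough transform for circles. This is not the FRHT of the abstract: the region subdivision step is omitted, and the function names, the bin size, and the trial count are all assumptions chosen for illustration. Each trial samples three edge points, solves for the one circle through them, and casts a single vote in a sparse accumulator.

```python
import math
import random
from collections import Counter


def circle_from_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:
        return None  # (nearly) collinear sample: no unique circle
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def randomized_hough_circle(points, trials=2000, bin_size=1.0, seed=0):
    """Vote for circle parameters using random 3-point samples (hypothetical helper).

    The accumulator is a sparse Counter keyed by quantized (cx, cy, r), so
    memory grows with the number of distinct votes, not with the size of the
    parameter space.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(trials):
        circle = circle_from_points(*rng.sample(points, 3))
        if circle is None:
            continue
        votes[tuple(round(v / bin_size) for v in circle)] += 1
    if not votes:
        return None
    best, _ = votes.most_common(1)[0]
    return tuple(k * bin_size for k in best)
```

Because every sample maps to a single point in parameter space, no dense accumulator array has to be allocated in advance, which is the usual motivation for randomized variants of the Hough transform.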
A Vision-Based Algorithm for UAV State Estimation During Vehicle Recovery
A computer vision-based algorithm for Unmanned Aerial Vehicle (UAV) state estimation during vehicle recovery is presented. The algorithm is intended to augment or back up the Global Positioning System (GPS) as the primary means of navigation during vehicle recovery for UAVs. The method requires a clearly visible recovery target with markers placed on the corners, in addition to known target geometry. The algorithm uses clustering techniques to identify the markers, a Canny edge detector and a Hough transform to verify that these markers actually lie on the recovery target, an optimizer to match the detected markers with coordinates in three-space, a non-linear transformation and projection solver to observe the position and orientation of the camera, and an Extended Kalman Filter (EKF) to improve the tracking of the state estimate. It must be acknowledged that the resolution of the test images is much higher than that of the images used in previous algorithms, and that the test images are either synthetic or taken in static conditions. Even so, the algorithm presented gives much better state estimates than previously developed vision systems.
Automatic Main Road Extraction from High Resolution Satellite Imagery
Road information is essential for automatic GIS (geographical information system) data acquisition, transportation, and urban planning. Automatic road network detection from high-resolution satellite imagery holds great potential for significantly reducing database development and updating cost and turnaround time. Many algorithms and methodologies have been presented for this purpose, ranging from low-level feature detection to high-level context-supported grouping. However, no practical system can yet extract road networks from space imagery fully automatically for the purpose of automatic mapping. This paper presents a methodology for automatic main road detection from high-resolution IKONOS satellite imagery. The strategies include a multiresolution (image pyramid) method, Gaussian blurring, a line finder using a 1-dimensional template correlation filter, line segment grouping, and multi-layer result integration. The multi-layer (multi-resolution) method for road extraction is a very effective strategy for saving processing time and improving robustness. To realize the strategy, the original IKONOS image is compressed to several resolutions so that an image pyramid is generated; the 1-dimensional template correlation line finder is then applied, after Gaussian blurring, to detect the road centerline. Extracted centerline segments may or may not belong to roads. There are two ways to identify the attributes of the segments. The first uses segment grouping to form longer line segments and assigns each segment a possibility that depends on its length and other geometric and photometric attributes; for example, a longer segment has a higher possibility of being a road. A perceptual-grouping-based method is used for road segment linking, with a possibility model that takes multiple sources of information into account; here the clues present in the gaps are considered.
The second way to identify the segments is feature detection carried back to a higher-resolution layer of the image pyramid.
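The pyramid-plus-line-finder strategy described in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's implementation: 2x2 block averaging stands in for Gaussian blurring followed by decimation, the bar-shaped templates are made up, and the helper names `downsample`, `build_pyramid`, and `correlate_row` are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging 2x2 blocks (one pyramid level)."""
    h, w = len(image) // 2 * 2, len(image[0]) // 2 * 2
    return [[(image[r][c] + image[r][c + 1]
              + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


def build_pyramid(image, levels):
    """Return [full resolution, half resolution, ...] with `levels` entries."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def correlate_row(row, template):
    """Slide a zero-mean 1-D template along a row.

    Returns (centre column, score) of the best match. A bright bar template
    responds most strongly where a bright road crosses the row.
    """
    n = len(template)
    t_mean = sum(template) / n
    t = [v - t_mean for v in template]
    best_col, best_score = None, float("-inf")
    for start in range(len(row) - n + 1):
        window = row[start:start + n]
        w_mean = sum(window) / n
        score = sum((w - w_mean) * tv for w, tv in zip(window, t))
        if score > best_score:
            best_col, best_score = start + n // 2, score
    return best_col, best_score
```

A road that is two pixels wide at full resolution becomes one pixel wide at the next pyramid level, so a shorter template can be used on the coarse layer and the match can then be verified on the finer one.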
Electrode Stimulation Strategy for Cochlear Implants Based on a Human Auditory Model (Auf einem menschlichen Gehörmodell basierende Elektrodenstimulationsstrategie für Cochleaimplantate)
Cochlear implants (CIs) combined with professional rehabilitation have
enabled several hundreds of thousands of hearing-impaired individuals to
re-enter the world of verbal communication. Though very successful, current
CI systems seem to have reached their peak potential. The fact that most
recipients claim not to enjoy listening to music and are not capable of
carrying on a conversation in noisy or reverberant environments shows
that there is still room for improvement. This dissertation presents a new
cochlear implant signal processing strategy called Stimulation based on
Auditory Modeling (SAM), which is completely based on a computational model
of the human peripheral auditory system. SAM has been evaluated through
simplified models of CI listeners, with five cochlear implant users, and
with 27 normal-hearing subjects using an acoustic model of CI perception.
Results have always been compared to those acquired using Advanced
Combination Encoder (ACE), which is today’s most prevalent CI strategy.
First simulations showed that speech intelligibility of CI users fitted
with SAM should be just as good as that of CI listeners fitted with ACE.
Furthermore, it has been shown that SAM provides more accurate binaural
cues, which can potentially enhance the sound source localization ability
of bilaterally fitted implantees. Simulations have also revealed an
increased amount of temporal pitch information provided by SAM. The
subsequent pilot study, which ran smoothly, revealed several benefits of
using SAM. First, there was a significant improvement in pitch
discrimination of pure tones and sung vowels. Second, CI users fitted with
a contralateral hearing aid reported a more natural sound of both speech
and music. Third, all subjects became accustomed to SAM in a very short
period of time (on the order of 10 to 30 minutes), which is particularly
important given that a successful CI strategy change typically takes weeks
to months. An additional test with 27 normal-hearing listeners using an
acoustic model of CI perception delivered further evidence for improved
pitch discrimination ability with SAM as compared to ACE. Although SAM is
not yet a market-ready alternative, it strives to pave the way for future
strategies based on auditory models and it is a promising candidate for
further research and investigation.
High-level environment representations for mobile robots
In most robotic applications we are faced with the problem of building
a digital representation of the environment that allows the robot to
autonomously complete its tasks. This internal representation can be
used by the robot to plan a motion trajectory for its mobile base
and/or end-effector. For most man-made environments we do not have
a digital representation or it is inaccurate. Thus, the robot must
have the capability of building it autonomously. This is done by
integrating into an internal data structure incoming sensor
measurements. For this purpose, a common solution consists in solving
the Simultaneous Localization and Mapping (SLAM) problem. The map
obtained by solving a SLAM problem is called "metric" and it
describes the geometric structure of the environment. A metric map is
typically made up of low-level primitives (like points or
voxels). This means that even though it represents the shape of the
objects in the robot workspace it lacks the information of which
object a surface belongs to. Having an object-level representation of
the environment has the advantage of augmenting the set of possible
tasks that a robot may accomplish. To this end, in this thesis we
focus on two aspects. We propose a formalism to represent in a uniform
manner 3D scenes consisting of different geometric primitives,
including points, lines and planes. Consequently, we derive a local
registration and a global optimization algorithm that can exploit this
representation for robust estimation. Furthermore, we present a
Semantic Mapping system capable of building an object-based
map that can be used for complex task planning and execution. Our
system exploits effective reconstruction and recognition techniques
that require no a-priori information about the environment and can be
used under general conditions.
Advanced Technique and Future Perspective for Next Generation Optical Fiber Communications
The optical fiber communication industry has gained unprecedented opportunities and achieved rapid progress in recent years. However, as data transmission volumes and transmission demands grow, the optical communication field still needs to be upgraded to meet future challenges. Artificial intelligence (AI) technology in optical communication and optical networks is still in its infancy, but existing results show great application potential. As AI technology develops further, AI algorithms that combine channel characteristics and physical properties are expected to play a prominent role in optical communication. This reprint introduces some recent advances in optical fiber communication and optical networks, and suggests alternative directions for the development of next-generation optical fiber communication technology.
Bio-inspired log-polar based color image pattern analysis in multiple frequency channels
The main topic of this thesis is the implementation of color image pattern recognition based on the lateral inhibition subtraction phenomenon combined with a complex log-polar mapping in multiple spatial frequency channels. It is shown that the individual red, green and blue channels have different recognition performances when evaluated in the context of earlier work by Dragan Vidacic. The green channel performs better than the other two channels, and the blue channel performs worst. After a contrast stretching function is applied, object recognition performance improves in all channels. Multiple spatial frequency filters were designed to simulate the filtering channels of the human visual system. Following these preprocessing steps, Dragan Vidacic's methodology is applied in order to determine the benefits obtained from the preprocessing steps under investigation. It is shown that these preprocessing steps yield performance gains.
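As a brief illustration of why a complex log-polar mapping is attractive for pattern recognition, the sketch below maps an image coordinate to w = log(z), where z is the position relative to a chosen centre. Under this mapping, scaling about the centre becomes a shift along the real axis and rotation becomes a shift along the imaginary axis. The function name `to_log_polar` and the choice of centre are assumptions for illustration; the thesis's actual sampling grid and filtering stages are not reproduced here.

```python
import cmath


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map (x, y) to (log radius, angle) about the centre (cx, cy).

    Uses the complex logarithm w = log(z) with z = (x - cx) + i(y - cy);
    cmath.log raises ValueError at the centre itself, where the mapping
    is undefined.
    """
    w = cmath.log(complex(x - cx, y - cy))
    return w.real, w.imag
```

For example, the points (3, 4) and (6, 8) differ by a factor of two in radius, so their log-radius coordinates differ by log 2 while their angles are identical.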
Real-Time Statistics for Padel Tennis Using Artificial Intelligence
The sport of Padel, known for its explosive growth and exciting gameplay, is on the verge
of a technological revolution. With the goal of transforming the game of Padel through the
creative use of object detection and deep learning techniques, this master's thesis investigates the junction of Artificial Intelligence (AI) and Padel. The main goal is to use AI to
produce real-time statistics that will give players, coaches and fans a better knowledge of the
complexities of Padel and the means to take the game to new heights.
This dissertation explores the real-time tracking and localization of players and the ball
within the court by utilizing cutting-edge computer vision algorithms. Convolutional Neural
Networks (CNNs), a type of deep learning model, are essential for the precise recognition
of important gaming events and actions.
The creation of an AI-driven system that produces in-the-moment data for Padel matches
is the central innovation of this dissertation. These statistics offer a detailed and analytical
view of each game by taking into account player movements, ball trajectories, and game
dynamics. This dissertation not only advances the sport of Padel but also creates new opportunities for the use of AI in other sports analytics.