    Vision for Looking at Traffic Lights: Issues, Survey, and Perspectives

    Ongoing Work on Traffic Lights: Detection and Evaluation

    Vision-Based Object Recognition and Localisation by a Wireless-Connected Distributed Robotic System

    Object recognition and localisation are important processes in computer vision and robotics. Advances in computer vision have produced many object recognition techniques, but most are computationally intensive and require robots with powerful processing systems. For small robots, these techniques are impractical because of execution-time constraints. In this study, an optimised implementation of a SURF-based recognition technique is presented. Suitable image pre-processing techniques were developed that reduced the recognition time on small robots with limited processing resources from 39 seconds to 780 milliseconds. This recognition technique was adopted by a team of small robots trained beforehand to search for objects of interest in the environment. For localisation of the robots and objects, a new template designed for passive-marker-based tracking was introduced. These markers were placed on top of each robot and tracked by two ceiling-mounted cameras. The information from both sources, the ceiling-mounted cameras and the team of robots, was used collectively to localise the objects in the environment. The objects were localised with an error ranging from 2.8 cm to 5.2 cm from their actual positions in a test arena measuring 150 × 163 cm.
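The pipeline described above maps naturally onto OpenCV. Below is a minimal sketch of a SURF match step with a downscaling pre-processing stage of the kind the abstract credits for the speed-up; the scale factor, thresholds, and function names are illustrative, and SURF itself requires the opencv-contrib build (cv2.xfeatures2d).

```python
import cv2

# SURF lives in opencv-contrib (cv2.xfeatures2d) and is patented;
# cv2.ORB_create() is a free drop-in alternative for this sketch.
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
matcher = cv2.BFMatcher(cv2.NORM_L2)

def preprocess(img, scale=0.5):
    """Downscale and grayscale the frame; shrinking the input is the
    simplest way to cut SURF's runtime on a resource-limited robot."""
    img = cv2.resize(img, None, fx=scale, fy=scale)
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def recognise(object_img, scene_img, min_matches=10):
    """Return True if the trained object appears in the scene frame."""
    _, des1 = surf.detectAndCompute(preprocess(object_img), None)
    _, des2 = surf.detectAndCompute(preprocess(scene_img), None)
    if des1 is None or des2 is None:
        return False
    # Lowe's ratio test keeps only unambiguous matches.
    good = []
    for pair in matcher.knnMatch(des1, des2, k=2):
        if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance:
            good.append(pair[0])
    return len(good) >= min_matches
```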

    Detecting, Tracking, And Recognizing Activities In Aerial Video

    In this dissertation, we address the problem of detecting humans and vehicles, tracking them in crowded scenes, and finally determining their activities in aerial video. Even though this is a well-explored problem in the field of computer vision, many challenges remain when one is presented with realistic data. These challenges include large camera motion, strong scene parallax, fast object motion, large object density, strong shadows, and insufficiently large action datasets. Therefore, we propose a number of novel methods based on exploiting scene constraints from the imagery itself to aid in the detection and tracking of objects. We show, via experiments on several datasets, that superior performance is achieved with the use of the proposed constraints.

    First, we tackle the problem of detecting moving, as well as stationary, objects in scenes that contain parallax and shadows. We do this on both regular aerial video and the new and challenging domain of wide area surveillance. This problem poses several challenges: large camera motion, strong parallax, a large number of moving objects, a small number of pixels on target, single-channel data, and a low video frame rate. We propose a method for detecting moving and stationary objects that overcomes these challenges, and evaluate it on the CLIF and VIVID datasets. To find moving objects, we use median background modelling, which requires few frames to obtain a workable model and is very robust when a large number of objects move through the scene while the model is being constructed. We then remove false detections caused by parallax and registration errors using gradient information from the background image.

    Relying merely on motion to detect objects in aerial video may not be sufficient to provide complete information about the observed scene. First of all, permanently stationary objects may be of interest as well, for example to determine how long a particular vehicle has been parked at a certain location. Secondly, moving vehicles that are being tracked through the scene may stop and remain stationary at traffic lights and railroad crossings; these prolonged periods of non-motion make it very difficult for the tracker to maintain the identities of the vehicles. There is therefore a clear need for a method that can detect stationary pedestrians and vehicles in UAV imagery. This is a challenging problem due to the small number of pixels on target, which makes it difficult to distinguish objects from background clutter and results in a much larger search space. We propose a method for constraining the search based on a number of geometric constraints obtained from the metadata. Specifically, we obtain the orientation of the ground plane normal, the orientation of the shadows cast by out-of-plane objects in the scene, and the relationship between object heights and the size of their corresponding shadows. We utilize the above information in a geometry-based shadow and ground-plane-normal blob detector, which provides an initial estimate of the locations of shadow-casting out-of-plane (SCOOP) objects in the scene. These SCOOP candidate locations are then classified as either human or clutter using a combination of wavelet features and a support vector machine. Additionally, we combine regular and inverted SCOOP candidates to obtain vehicle candidates. We show impressive results on sequences from the VIVID and CLIF datasets, and provide comparative quantitative and qualitative analysis. We also show that the SCOOP detection method can be extended to automatically estimate the orientation of the shadow in the image without relying on metadata, which is useful in cases where metadata is either unavailable or erroneous.

    Simply detecting objects in every frame does not provide sufficient understanding of the nature of their existence in the scene. It may be necessary to know how the objects have travelled through the scene over time and which areas they have visited; hence, there is a need to maintain the identities of the objects across different time instances. Object tracking can be very challenging in videos that have a low frame rate, high density, and a very large number of objects, as is the case in the WAAS data. We therefore propose a novel method for tracking a large number of densely moving objects in aerial video. To keep the complexity of the tracking problem manageable, we divide the scene into grid cells, solve the tracking problem optimally within each cell using bipartite graph matching, and then link the tracks across the cells. Besides tractability, grid cells also allow us to define a set of local scene constraints, such as road orientation and object context, which we incorporate into the cost function used to solve the tracking problem. This allows us to track fast-moving objects in low frame-rate videos.

    In addition to moving through the scene, the humans present may be performing individual actions that should be detected and recognized by the system. A number of different approaches exist for action recognition in both aerial and ground-level video. Most of them require a sizeable dataset of examples of a particular action from which a model of the action can be constructed. Such a luxury is not always available in aerial scenarios, since it may be difficult to fly a large number of missions to observe a particular event multiple times. We therefore propose a method for recognizing human actions in aerial video from as few examples as possible (a single example in the extreme case). We use the bag-of-words action representation and a one-vs-all multi-class classification framework. We assume that most of the classes have many examples, and construct support vector machine models for each class. We then use the models trained on classes with many examples to improve the decision function of the model trained with few examples, via late weighted fusion of decision values.
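Two of the building blocks named in the abstract, median background modelling and per-cell bipartite matching, are easy to sketch. The following is a minimal illustration, not the dissertation's actual implementation: the real cost function also encodes road orientation and object context, whereas plain Euclidean distance stands in here, and the thresholds are invented.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def median_background(frames):
    """Per-pixel temporal median over a short frame buffer; robust to
    many simultaneously moving objects while the model is built."""
    return np.median(np.stack(frames), axis=0).astype(np.uint8)

def match_detections(prev_pts, curr_pts, max_dist=30.0):
    """Frame-to-frame data association inside one grid cell.

    prev_pts, curr_pts: (N, 2) and (M, 2) arrays of object centroids.
    Returns (i, j) index pairs from an optimal bipartite matching.
    """
    cost = np.linalg.norm(prev_pts[:, None, :] - curr_pts[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # optimal assignment per cell
    # Reject pairings that moved implausibly far between frames.
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] <= max_dist]
```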

    A computer vision approach to drone-based traffic analysis of road intersections

    In recent years, there has been interest in detailed monitoring of road traffic, particularly at intersections, in order to obtain a statistical model of the flow of vehicles through them. While conventional methods, sensors at each of the intersection's entrances/exits, allow vehicles to be counted, they are limited in the sense that a vehicle cannot be tracked from origin to destination. This data is invaluable for understanding how the dynamics of a city's mobility work, and how they can be improved, so new techniques must be developed that provide that kind of information. One possible approach to this problem is to analyse video footage of such intersections by means of computer vision algorithms, in order to identify and track individual vehicles. One possible way to obtain this footage is to fly a drone, a small unmanned aerial vehicle (UAV), carrying a camera over an intersection. Some work has been done with this solution in mind, but the use of a top-down birds-eye perspective, obtained by flying the drone directly above the intersection rather than at an angle, is limited or nonexistent. This approach is interesting because it circumvents the problem of occlusions present in other footage-capture setups. The focus of this dissertation is, then, to develop and apply computer vision algorithms to footage obtained in this way in order to identify and track vehicles across intersections, so that a statistical model may be extracted. This model is based on the aforementioned association of an origin with a destination. Based on the implementation that was developed, this approach seems to be useful for at least some types of vehicles.
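Given vehicle tracks extracted from the top-down footage, the origin-destination model reduces to mapping each track's first and last zone crossings. Below is a minimal sketch under assumed image coordinates; the zone geometry and names are hypothetical, not taken from the dissertation.

```python
from collections import Counter

# Hypothetical entry/exit zones of a four-arm intersection, as axis-aligned
# boxes (x0, y0, x1, y1) in image coordinates of the birds-eye footage.
ZONES = {"N": (400, 0, 600, 120), "S": (400, 880, 600, 1000),
         "E": (880, 400, 1000, 600), "W": (0, 400, 120, 600)}

def zone_of(point):
    """Name of the zone containing the point, or None if outside all."""
    x, y = point
    for name, (x0, y0, x1, y1) in ZONES.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

def origin_destination(track):
    """Map one vehicle track (sequence of centroids) to an O-D pair."""
    hits = [z for z in (zone_of(p) for p in track) if z is not None]
    return (hits[0], hits[-1]) if len(hits) >= 2 else None

def od_matrix(tracks):
    """Accumulate the statistical turning-movement model over all tracks."""
    return Counter(od for od in map(origin_destination, tracks) if od)
```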

    A system for traffic violation detection

    This paper describes the framework and components of an experimental platform for an advanced driver assistance system (ADAS) aimed at giving drivers feedback about traffic violations they have committed while driving. The system is able to detect some specific traffic violations, record data associated with these faults in a local database, and allow visualization of the spatial and temporal information of these violations on a geographical map using the standard Google Earth tool. The test bed is composed of two main parts: a computer vision subsystem for traffic sign detection and recognition, which operates during both day and nighttime, and an event data recorder (EDR) for recording data related to some specific traffic violations. The paper first describes the hardware architecture and then presents the policies used for handling traffic violations.
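The Google Earth visualization implies exporting the recorded violations as KML. The paper does not show its export format, so the sketch below is only one plausible shape for it; the event-tuple layout is assumed rather than taken from the EDR's actual schema.

```python
from datetime import datetime

KML_TMPL = """<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2"><Document>
{placemarks}
</Document></kml>"""

def placemark(violation, lat, lon, when):
    """One KML Placemark: violation name, timestamp, and position.
    Note KML orders coordinates as lon,lat."""
    return (f"<Placemark><name>{violation}</name>"
            f"<TimeStamp><when>{when.isoformat()}</when></TimeStamp>"
            f"<Point><coordinates>{lon},{lat},0</coordinates></Point>"
            f"</Placemark>")

def write_kml(events, path="violations.kml"):
    """events: iterable of (violation_name, lat, lon, datetime) tuples,
    e.g. as read back from the EDR's local database."""
    body = "\n".join(placemark(*e) for e in events)
    with open(path, "w") as f:
        f.write(KML_TMPL.format(placemarks=body))

write_kml([("speed_limit_exceeded", 40.4168, -3.7038, datetime.now())])
```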

    Vehicle Classification For Automatic Traffic Density Estimation

    Automatic traffic light control at intersections has recently become one of the most active research areas in the development of intelligent transportation systems (ITS). Due to the massive growth in urbanization and traffic congestion, an intelligent vision-based traffic light controller is needed to reduce traffic delay and travel time, especially in developing countries, where current fixed-time control is unrealistic and sensor-based traffic light controllers are unreliable. A vision-based traffic light controller depends mainly on estimating traffic congestion at crossroads, because a city's main road junctions are where most time is lost. Most previous studies on this topic do not take unattended vehicles into consideration when estimating traffic density or traffic flow. In this study we aim to improve the performance of vision-based traffic light control by detecting stationary and unattended vehicles and giving them higher weights, using image processing and pattern recognition techniques for more effective and efficient traffic congestion estimation.
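The weighting idea can be made concrete with a small sketch: flag a vehicle as stationary when its tracked centroid stops drifting, then count it with a higher weight in the congestion score. The thresholds and weights below are illustrative, not values from the study.

```python
import numpy as np

STILL_PX = 2.0      # max centroid drift per frame still counted as "not moving"
STILL_FRAMES = 30   # frames of stillness before a vehicle is flagged stationary

def update_stillness(track, counter):
    """Advance one vehicle's stillness counter from its centroid track."""
    if len(track) >= 2 and np.linalg.norm(
            np.subtract(track[-1], track[-2])) < STILL_PX:
        return counter + 1
    return 0  # the vehicle moved, so reset

def congestion_score(counters, w_moving=1.0, w_stationary=2.0):
    """Weighted density estimate: stationary/unattended vehicles count
    more, per the abstract's idea; the weights here are illustrative."""
    return sum(w_stationary if c >= STILL_FRAMES else w_moving
               for c in counters)
```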

    Advanced traffic video analytics for robust traffic accident detection

    Automatic traffic accident detection is an important task in traffic video analysis due to its key applications in developing intelligent transportation systems. Reducing the time delay between the occurrence of an accident and the dispatch of the first responders to the scene may help lower the mortality rate and save lives. Since 1980, many approaches have been presented for the automatic detection of incidents in traffic videos. In this dissertation, some challenging problems for accident detection in traffic videos are discussed, and a new framework is presented to automatically detect single-vehicle and intersection traffic accidents in real time. First, a new foreground detection method is applied to detect the moving vehicles and subtract the ever-changing background in traffic video frames captured by static or non-stationary cameras. For traffic videos captured during daytime, cast shadows degrade the performance of foreground detection and road segmentation, so a novel cast shadow detection method is presented to detect and remove the shadows cast by moving vehicles as well as those cast by static objects on the road. Second, a new method is presented to detect the region of interest (ROI): it uses the locations of the moving vehicles and initial road samples, and extracts discriminating features to segment the road region. After detecting the ROI, the moving direction of the traffic is estimated, based on the rationale that crashed vehicles often change direction rapidly. Lastly, single-vehicle traffic accidents and trajectory conflicts are detected using a first-order-logic decision-making system. Experimental results using publicly available videos and a dataset provided by the New Jersey Department of Transportation (NJDOT) demonstrate the feasibility of the proposed methods. Additionally, the main challenges and future directions are discussed regarding (i) improving the performance of the foreground segmentation, (ii) reducing the computational complexity, and (iii) detecting other types of traffic accidents.
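For the foreground step, a stock OpenCV baseline (not the dissertation's novel detector) already illustrates the shadow problem the abstract raises: MOG2 can label shadow pixels with the value 127, which a simple threshold then discards.

```python
import cv2

# Stock OpenCV baseline, shown only for illustration: MOG2 adapts to a
# changing background and labels shadow pixels with the value 127.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                detectShadows=True)

def foreground_mask(frame):
    """Binary mask of moving vehicles with cast shadows removed."""
    mask = subtractor.apply(frame)
    # Keep confident foreground only; drop the 127-valued shadow pixels.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```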

    Vision-based portuguese sign language recognition system

    Vision-based hand gesture recognition is an area of active research in computer vision and machine learning. Being a natural mode of human interaction, it is an area where many researchers are working, with the goal of making human-computer interaction (HCI) easier and more natural, without the need for any extra devices. The primary goal of gesture recognition research is thus to create systems that can identify specific human gestures and use them, for example, to convey information. To that end, vision-based hand gesture interfaces require fast and extremely robust hand detection and gesture recognition in real time. Hand gestures are a powerful human communication modality with many potential applications, and in this context we have sign language recognition, the communication method of deaf people. Sign languages are neither standard nor universal, and their grammars differ from country to country. In this paper, a real-time system able to interpret Portuguese Sign Language is presented and described. Experiments showed that the system was able to reliably recognize the vowels in real time, with an accuracy of 99.4% with one dataset of features and an accuracy of 99.6% with a second dataset of features. Although the implemented solution was only trained to recognize the vowels, it is easily extended to recognize the rest of the alphabet, providing a solid foundation for the development of any vision-based sign language recognition user interface system.
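The abstract does not name the classifier behind the reported accuracies, so the sketch below uses a support vector machine over the extracted hand-feature vectors purely as one plausible choice; X, y, and the hyperparameters are assumptions, not values from the paper.

```python
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def train_vowel_classifier(X, y):
    """X: hand-shape feature vectors extracted from segmented hand images
    (the paper evaluates two such feature sets); y: vowel labels A/E/I/O/U."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0)
    clf = SVC(kernel="rbf", C=10, gamma="scale")  # illustrative hyperparams
    clf.fit(X_tr, y_tr)
    print("held-out accuracy:", accuracy_score(y_te, clf.predict(X_te)))
    return clf
```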