111,408 research outputs found

    PRF: A Framework for Building Automatic Program Repair Prototypes for JVM-Based Languages

    Full text link
    PRF is a Java-based framework that allows researchers to build prototypes of test-based generate-and-validate automatic program repair techniques for JVM languages simply by extending it with their patch generation plugins. The framework also provides other useful components for constructing automatic program repair tools, e.g., a fault localization component that provides spectrum-based fault localization information at different levels of granularity, a configurable and safe patch validation component that is 11+X faster than vanilla testing, and a customizable post-processing component to generate fix reports. A demo video of PRF is available at https://bit.ly/3ehduSS.
    Comment: Proceedings of the 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '20)
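    The spectrum-based fault localization information such a component provides is typically derived from per-test coverage. As a rough illustration of the underlying technique only (not PRF's actual API; the function and the coverage numbers below are assumptions), the widely used Ochiai metric ranks program elements by how strongly their coverage correlates with failing tests:

        import math

        def ochiai_suspiciousness(covered_by_failed, covered_by_passed, total_failed):
            """Ochiai score for one program element (statement, method, ...).

            covered_by_failed: failing tests that execute the element
            covered_by_passed: passing tests that execute the element
            total_failed:      failing tests in the whole suite
            """
            denom = math.sqrt(total_failed * (covered_by_failed + covered_by_passed))
            return covered_by_failed / denom if denom else 0.0

        # Hypothetical coverage spectrum: element -> (failing hits, passing hits)
        spectrum = {"Foo.java:42": (3, 1), "Foo.java:57": (1, 9), "Bar.java:10": (0, 7)}
        ranking = sorted(spectrum, key=lambda e: ochiai_suspiciousness(*spectrum[e], 3),
                         reverse=True)
        print(ranking)  # most suspicious program element first

    Repair tools then try candidate patches at the highest-ranked locations first, which is why supporting different granularities (statement vs. method vs. class) matters.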

    Smart video sensors for 3D scene reconstruction of large infrastructures

    Full text link
    The final publication is available at Springer via http://dx.doi.org/10.1007/s11042-012-1184-z. This paper introduces a new 3D-based surveillance solution for large infrastructures. Our proposal is based on an accurate 3D reconstruction using the rich information obtained from a network of intelligent video-processing nodes. In this manner, if the scenario to cover is modeled in 3D with high precision, it will be possible to locate the detected objects in the virtual representation. Moreover, as an improvement over previous 2D solutions, having the possibility of modifying the viewpoint enables the application to choose the perspective that best suits the current state of the scenario. In this sense, the contextualization of the events detected in a 3D environment can offer a much better understanding of what is happening in the real world and where exactly it is happening. Details of the video-processing nodes are given, as well as of the 3D reconstruction tasks performed afterwards. The possibilities of such a system are described and the performance obtained is analyzed. This work has been partially supported by the ViCoMo project (ITEA2 project IP08009 funded by the Spanish MICINN with project TSI-020400-2011-57), the Spanish Government (TIN2009-14103-C03-03, DPI2008-06737-C02-01/02 and DPI 2011-28507-C02-02) and European FEDER funds.
    Ripollés Mateu, ÓE.; Simó Ten, JE.; Benet Gilabert, G.; Vivó Hernando, RA. (2014). Smart video sensors for 3D scene reconstruction of large infrastructures. Multimedia Tools and Applications 73(2):977-993. https://doi.org/10.1007/s11042-012-1184-z
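    Placing a detection from one of the calibrated camera nodes into the shared 3D model essentially means back-projecting the image point and intersecting the viewing ray with the scene geometry. Below is a minimal sketch of that step, assuming a pinhole camera with known intrinsics K and world-to-camera pose (R, t) and objects standing on the ground plane z = 0 (all values are illustrative, not the paper's calibration):

        import numpy as np

        def locate_on_ground(u, v, K, R, t):
            """Back-project pixel (u, v) and intersect the ray with the plane z = 0.

            A world point X projects to the image as K @ (R @ X + t).
            """
            ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
            ray_world = R.T @ ray_cam                           # same ray, world frame
            cam_center = -R.T @ t                               # camera center, world frame
            s = -cam_center[2] / ray_world[2]                   # ray parameter at z = 0
            return cam_center + s * ray_world

        K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
        R = np.diag([1.0, -1.0, -1.0])     # camera looking straight down
        t = np.array([0.0, 0.0, 4.0])      # camera center 4 m above the ground
        print(locate_on_ground(320, 240, K, R, t))  # [0. 0. 0.]: directly below the camera

    With every node reporting world coordinates instead of pixel coordinates, the 3D viewer can render all detections in one consistent scene regardless of which camera saw them.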

    Movie Description

    Get PDF
    Audio Description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset containing transcribed ADs that are temporally aligned to full-length movies. In addition, we collected and aligned movie scripts used in prior work and compare the two sources of descriptions. In total the Large Scale Movie Description Challenge (LSMDC) contains a parallel corpus of 118,114 sentences and video clips from 202 movies. First we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are indeed more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in a challenge organized in the context of the ICCV 2015 workshop "Describing and Understanding Video & The Large Scale Movie Description Challenge (LSMDC)".
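    Benchmarking on such a parallel corpus means scoring each generated sentence against its temporally aligned reference AD with n-gram overlap metrics. As a toy illustration only (the clipped unigram precision below is a simplification of full captioning metrics such as BLEU, and the example sentences are made up):

        from collections import Counter

        def ngram_precision(candidate, reference, n=1):
            """Clipped n-gram precision of a candidate against one reference."""
            def ngrams(sentence):
                tokens = sentence.lower().split()
                return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
            cand, ref = ngrams(candidate), ngrams(reference)
            overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
            return overlap / max(sum(cand.values()), 1)

        # Generated description vs. the aligned AD sentence
        print(ngram_precision("someone walks into the room",
                              "a man walks into the dark room"))  # 0.8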

    The Long-Short Story of Movie Description

    Full text link
    Generating descriptions for videos has many applications, including assisting blind people and human-robot interaction. The recent advances in image captioning, as well as the release of large-scale movie description datasets such as MPII Movie Description, make it possible to study this task in more depth. Many of the proposed methods for image captioning rely on pre-trained object classifier CNNs and Long Short-Term Memory recurrent networks (LSTMs) for generating descriptions. While image description focuses on objects, we argue that it is important to distinguish verbs, objects, and places in the challenging setting of movie description. In this work we show how to learn robust visual classifiers from the weak annotations of the sentence descriptions. Based on these visual classifiers we learn how to generate a description using an LSTM. We explore different design choices to build and train the LSTM and achieve the best performance to date on the challenging MPII-MD dataset. We compare and analyze our approach and prior work along various dimensions to better understand the key challenges of the movie description task.
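    The two-stage design described here (visual classifiers for verbs, objects, and places whose scores condition a recurrent language model) can be sketched roughly as follows. This is a minimal PyTorch sketch under assumed sizes; the layer dimensions, vocabulary, and exact conditioning scheme are illustrative, not the paper's architecture:

        import torch
        import torch.nn as nn

        class DescriptionDecoder(nn.Module):
            """Toy LSTM decoder conditioned on visual classifier scores."""

            def __init__(self, num_classifiers=100, vocab_size=5000,
                         embed_dim=256, hidden_dim=512):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, embed_dim)
                # Map verb/object/place classifier scores to the LSTM's initial state.
                self.init_h = nn.Linear(num_classifiers, hidden_dim)
                self.init_c = nn.Linear(num_classifiers, hidden_dim)
                self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
                self.out = nn.Linear(hidden_dim, vocab_size)

            def forward(self, classifier_scores, word_ids):
                # classifier_scores: (batch, num_classifiers); word_ids: (batch, seq_len)
                h0 = torch.tanh(self.init_h(classifier_scores)).unsqueeze(0)
                c0 = torch.tanh(self.init_c(classifier_scores)).unsqueeze(0)
                hidden, _ = self.lstm(self.embed(word_ids), (h0, c0))
                return self.out(hidden)  # next-word logits, (batch, seq_len, vocab_size)

        model = DescriptionDecoder()
        scores = torch.rand(2, 100)             # classifier scores for two clips
        words = torch.randint(0, 5000, (2, 7))  # teacher-forced word ids
        print(model(scores, words).shape)       # torch.Size([2, 7, 5000])

    In the paper's setting the conditioning scores would come from the robust visual classifiers trained on the weak sentence annotations.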