7 research outputs found

    Semantic web technologies for video surveillance metadata

    Video surveillance systems are growing in size and complexity. Such systems typically consist of integrated modules from different vendors to cope with the increasing demands on network and storage capacity, intelligent video analytics, picture quality, and enhanced visual interfaces. Within a surveillance system, relevant information (such as technical details of the video sequences, or analysis results for the monitored environment) is described using metadata standards. However, different modules typically use different standards, resulting in metadata interoperability problems. In this paper, we introduce the application of Semantic Web Technologies to overcome such problems. We present a semantic, layered metadata model and integrate it within a video surveillance system. Besides addressing the metadata interoperability problem, we show the advantages of using Semantic Web Technologies and their inherent rule support. A practical use case scenario is presented to illustrate the benefits of our approach.

    Video Compression using Neural Weight Step and Huffman Coding Techniques

    Background: This paper proposes a Hierarchical Video Compression Scheme (HVCS) with three hierarchical quality layers and a Recurrent Quality Enhancement Network (RQEN). Image compression techniques are used to compress frames in the first layer, where frames have the highest quality. Using a high-quality frame as a reference, a Bi-Directional Deep Compression (BDC) network is proposed to compress frames in the second layer at considerable quality. In the third layer, frames are compressed at low quality using an adopted Single Motion Compression (SMC) network, which uses a single motion map for motion estimation across multiple frames. As a result, SMC provides motion information using fewer bits. In the decoding stage, a weighted RQEN is developed that takes both the bit stream and the compressed frames as inputs. In the RQEN cell, the update signal and memory are weighted using quality features so that multi-frame information positively influences enhancement. HVCS adopts hierarchical quality to improve the efficiency of frame coding: high-quality information improves frame compression at the encoding stage and enhances the low-quality frames at the decoding stage. Experimental results validate that the proposed HVCS approach outperforms state-of-the-art compression methods.
Materials and Methods: Tables 1 and 2 present the rate-distortion values obtained on both video datasets. PSNR and MS-SSIM are used for quality evaluation, and bit-rates are calculated in bits per pixel (bpp). Table 1 shows that the proposed compression model achieves better PSNR performance than other methods such as Chao et al. [7] or optimized methods [1], and that it outperforms H.265 on the standard JCT-VC dataset. The proposed scheme also yields better bit-rate performance than H.265 on UVG. As shown in Table 2, the proposed scheme achieves better MS-SSIM performance than all other learned approaches and better performance than H.264 and H.265. In terms of bit-rate performance on UVG, Lee et al. [11] has comparable performance, and Guo et al. [10] yields lower performance than H.265. On JCT-VC, DVC [10] is only comparable with H.265. In contrast, the rate-distortion performance of HVCS is clearly better than that of H.265. Furthermore, the Bjøntegaard Delta Bit-Rate (BDBR) [47] is computed with H.265 as the anchor. BDBR measures the average bit-rate difference relative to the H.265 anchor, where lower BDBR values indicate better performance [48]. Table 3 reports BDBR in terms of PSNR and MS-SSIM; negative numbers indicate a bit-rate reduction relative to the anchor. These results outperform H.265, and bold numbers mark the best results among learned methods. Table 3 provides a fair comparison with the (MS-SSIM and PSNR) optimized DVC techniques [10], with H.265 as the anchor.
Conclusion: This work proposes a learned video compression scheme that uses hierarchical frame quality with recurrent enhancement. Specifically, frames are divided into hierarchical levels 1, 2, and 3 in decreasing quality. Image compression methods are proposed for the first layer, and BDC and SMC for layers 2 and 3, respectively. The RQEN is developed with compressed frames, frame quality, and bit-rate information as inputs for multi-frame enhancement. Experimental results validate the efficiency of the proposed HVCS. As with other compression techniques, the frame structure is set manually in this scheme. A promising direction for future work is to develop DNNs that automatically learn the prediction and hierarchy.
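
The BDBR measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used formulation (a cubic fit of log bit-rate against PSNR, averaged over the overlapping quality range) and is not taken from the paper itself. The function name `bd_rate` and all sample rate and PSNR values are hypothetical and chosen only for illustration.

```python
import numpy as np

def bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):
    """Approximate Bjontegaard delta bit-rate in percent (negative = bit savings)."""
    psnr_a = np.asarray(anchor_psnr, dtype=float)
    psnr_t = np.asarray(test_psnr, dtype=float)
    log_a = np.log(np.asarray(anchor_rates, dtype=float))
    log_t = np.log(np.asarray(test_rates, dtype=float))

    # Fit log(rate) as a cubic polynomial of quality for each codec.
    poly_a = np.polyfit(psnr_a, log_a, 3)
    poly_t = np.polyfit(psnr_t, log_t, 3)

    # Integrate both fits over the quality range the two curves share.
    lo = max(psnr_a.min(), psnr_t.min())
    hi = min(psnr_a.max(), psnr_t.max())
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    area_a = np.polyval(int_a, hi) - np.polyval(int_a, lo)
    area_t = np.polyval(int_t, hi) - np.polyval(int_t, lo)

    # Average log-rate difference, converted back to a percentage.
    avg_log_diff = (area_t - area_a) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# Hypothetical anchor curve (bits per pixel vs. PSNR in dB), not data from the paper.
anchor_psnr = [30.0, 33.0, 36.0, 39.0]
anchor_rates = [0.05, 0.10, 0.20, 0.40]
# Hypothetical test codec that needs 20% fewer bits at every quality point.
test_psnr = anchor_psnr
test_rates = [0.8 * r for r in anchor_rates]

print(round(bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr), 2))
```

A negative value means the test codec needs fewer bits than the anchor for the same quality, which matches the abstract's statement that lower BDBR is better.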

    Detection and representation of moving objects for video surveillance

    In this dissertation, two new approaches are introduced for the automatic detection of moving objects (such as people and vehicles) in video surveillance sequences. The first technique analyzes the original video and exploits spatial and temporal information to find the pixels in the images that correspond to moving objects. The second technique analyzes video sequences that have been encoded according to a recent video coding standard (H.264/AVC); only the compressed features are analyzed to find moving objects. The latter technique results in very fast and accurate detection (up to 20 times faster than related work). Lastly, we investigated how different XML-based metadata standards can be used to represent information about these moving objects. We proposed the use of Semantic Web Technologies to combine information described according to different metadata standards.

    Estabilização de vídeos com base em descritores H.264

    Dissertation submitted for the degree of Master in Electrical and Computer Engineering. In recent years, the quality of images and videos from cameras has improved, and digital image stabilization has come to play an important role in obtaining stable video in a variety of situations, namely amateur videos (shaky movies) and certain areas of video surveillance. Several stabilization methods exist for this kind of processing, ranging from 2D approaches to the more complex 3D ones. The work presented here aims to apply stabilization models to video surveillance, but it also proves quite effective for any kind of amateur video. It demonstrates a 2D method based on robust homography estimation that covers four distinct image transformation models: translational, Euclidean, affine, and projective, listed in order of increasing complexity. These four models differ mainly in the level of stabilization to be applied to a video; that is, the more parameters to be stabilized, the more complex the model must be. The translational model stabilizes only unwanted movements along the horizontal and vertical axes; the Euclidean model additionally stabilizes unwanted rotational movements; the affine model introduces considerably more parameters than the previous ones and also stabilizes scaling, compression, and distortion of objects; finally, the projective model adds the removal of horizontal and/or vertical perspective present in the images. The method extracts keypoints frame by frame and compares the position of each keypoint in consecutive frames, thereby computing the inverse homography to apply to the images under each model.
With this in mind, digital image stabilization can become one of the slowest and most computationally demanding tasks in computer vision when dealing with very high video quality. Therefore, in addition to the stabilization method developed, this work presents an effective way to access the visual descriptors of H.264-compressed videos and extract the information they contain, thereby speeding up the entire stabilization process.
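
The four transformation models named in this abstract can be written as 3x3 homogeneous matrices with 2, 3, 6, and 8 free parameters respectively. The sketch below is a generic illustration under that standard parameterization, not the dissertation's implementation. The function names, the example shake values, and the corner coordinates are all hypothetical.

```python
import numpy as np

def translation(tx, ty):
    """Translational model: 2 parameters (horizontal and vertical shift)."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def euclidean(theta, tx, ty):
    """Euclidean model: 3 parameters (rotation angle plus shift)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]])

def affine(a11, a12, a21, a22, tx, ty):
    """Affine model: 6 parameters (adds scaling and shear)."""
    return np.array([[a11, a12, tx], [a21, a22, ty], [0.0, 0.0, 1.0]])

def projective(h):
    """Projective model: 8 parameters, with the last matrix entry fixed to 1."""
    return np.append(np.asarray(h, dtype=float), 1.0).reshape(3, 3)

def apply_model(H, points):
    """Map Nx2 points through a 3x3 matrix using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical example: a small unwanted rotation and shift between two frames.
corners = np.array([[0.0, 0.0], [640.0, 0.0], [640.0, 480.0], [0.0, 480.0]])
shake = euclidean(0.02, 3.0, -2.0)
shaken = apply_model(shake, corners)

# Stabilization applies the inverse of the estimated transformation.
restored = apply_model(np.linalg.inv(shake), shaken)
print(np.allclose(restored, corners))
```

In practice the matrix would be estimated from matched keypoints rather than known in advance; here it is given directly so that the effect of applying the inverse is easy to check.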

    Deliverable D1.1 State of the art and requirements analysis for hypervideo

    This deliverable presents a state-of-the-art and requirements analysis report for hypervideo, authored as part of WP1 of the LinkedTV project. Initially, we present some use-case (viewer) scenarios in the LinkedTV project, and through the analysis of the distinctive needs and demands of each scenario we point out the technical requirements from a user-side perspective. Subsequently, we study methods for the automatic and semi-automatic decomposition of audiovisual content in order to effectively support the annotation process. Considering that multimedia content comprises different types of information, i.e., visual, textual, and audio, we report various methods for the analysis of these three streams. Finally, we present various annotation tools that could integrate the developed analysis results so as to effectively support users (video producers) in the semi-automatic linking of hypervideo content, and based on them we report on the initial progress in building the LinkedTV annotation tool. For each class of techniques discussed in the deliverable, we present the evaluation results from applying one such method from the literature to a dataset well suited to the needs of the LinkedTV project, and we indicate the future technical requirements that should be addressed in order to achieve higher levels of performance (e.g., in terms of accuracy and time-efficiency), as necessary.