
    Entrega de conteúdos multimédia em over-the-top: caso de estudo das gravações automáticas (Over-the-top multimedia content delivery: a Catch-up TV case study)

    Doctoral thesis in Electrical Engineering (Doutoramento em Engenharia Eletrotécnica). Over-The-Top (OTT) multimedia delivery is a very appealing approach for providing ubiquitous, flexible, and globally accessible services capable of low-cost and unrestrained device targeting. In spite of its appeal, the underlying delivery architecture must be carefully planned and optimized to maintain a high Quality-of-Experience (QoE) and rational resource usage, especially when migrating from services running on managed networks with established quality guarantees. To address the lack of holistic research on OTT multimedia delivery systems, this Thesis focuses on an end-to-end optimization challenge, considering a migration use-case of a popular Catch-up TV service from managed IP Television (IPTV) networks to OTT. A global study is conducted on the importance of Catch-up TV and its impact on today's society, demonstrating the growing popularity of this time-shift service, its relevance in the multimedia landscape, and its fitness as an OTT migration use-case. Catch-up TV consumption logs are obtained from a Pay-TV operator's live production IPTV service with over 1 million subscribers to characterize demand and extract insights from service utilization at a scale and scope not yet addressed in the literature. This characterization is used to build demand forecasting models relying on machine learning techniques to enable static and dynamic optimization of OTT multimedia delivery solutions; these models produce accurate bandwidth and storage requirement forecasts and may be used to achieve considerable power and cost savings whilst maintaining a high QoE. A novel caching algorithm, Most Popularly Used (MPU), is proposed, implemented, and shown to outperform established caching algorithms in both simulation and experimental scenarios. The need for accurate QoE measurements in OTT scenarios supporting HTTP Adaptive Streaming (HAS) motivates the creation of a new QoE model capable of taking into account the impact of key HAS aspects. By addressing the complete content delivery pipeline in the envisioned content-aware OTT Content Delivery Network (CDN), this Thesis demonstrates that significant improvements are possible in next-generation multimedia delivery solutions.
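    The abstract names MPU but does not spell out its bookkeeping, so the sketch below is only a generic popularity-count cache in Python for orientation: the class name, the capacity parameter, and evicting the least-requested item are illustrative assumptions of this sketch, not the thesis's actual MPU design.

        from collections import Counter

        class PopularityCache:
            """Schematic popularity-driven cache (not the thesis's MPU).

            Tracks per-content request counts and, on a miss with a full
            cache, evicts the currently least-requested item.
            """

            def __init__(self, capacity: int):
                self.capacity = capacity
                self.store = {}            # content_id -> cached object
                self.requests = Counter()  # popularity, kept across evictions

            def get(self, content_id, fetch_from_origin):
                self.requests[content_id] += 1
                if content_id in self.store:
                    return self.store[content_id]        # cache hit
                obj = fetch_from_origin(content_id)      # miss: fetch upstream
                if len(self.store) >= self.capacity:
                    victim = min(self.store, key=self.requests.__getitem__)
                    del self.store[victim]               # evict least popular
                self.store[content_id] = obj
                return obj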

    A bag of words description scheme for image quality assessment

    Every day millions of images are obtained, processed, compressed, saved, transmitted and reproduced. All these operations can cause distortions that affect their quality. The quality of these images can be measured subjectively, but this has the disadvantage of requiring a considerable number of tests with individuals in order to produce a statistical analysis of an image's perceptual quality. Several objective metrics have been developed that try to model the human perception of quality. However, in most applications the representation of human quality perception given by these metrics is far from the desired one. Therefore, this work proposes the use of machine learning models that allow for a better approximation. In this work, definitions for image and quality are given, and some of the difficulties of the study of image quality are mentioned. Moreover, three metrics are initially explained: one uses the image's original quality as a reference (SSIM), while the other two are no-reference metrics (BRISQUE and QAC). A comparison is made between them, showing a large discrepancy of values between the two kinds of metrics. The database used for the tests is TID2013, chosen for its size and for covering a large number of distortions; a study of each type of distortion in this database is made. Furthermore, some concepts of machine learning are introduced along with algorithms relevant in the context of this dissertation, notably K-means, KNN and SVM. Descriptor aggregation algorithms such as "bag of words" and "Fisher vectors" are also discussed. This dissertation studies a new model that combines machine learning and a quality metric for quality estimation. The model is based on the division of images into cells, in which a specific metric is computed. With this division, it is possible to obtain local quality descriptors that are then aggregated using "bag of words". An SVM with an RBF kernel is trained and tested on the same database, and the results of the model are evaluated using cross-validation. The results are analysed using the Pearson, Spearman and Kendall correlations and the RMSE, which measure how well the model matches the subjective results. The model improves on the results of the underlying metric and shows a new path for applying machine learning to quality evaluation.
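    As a rough illustration of the pipeline the abstract describes (cell-wise local quality values, a bag-of-words encoding over a learned codebook, and an RBF-kernel SVM), here is a hedged scikit-learn sketch; the cell size, codebook size, per-cell statistics, and the use of SVR for score regression are assumptions of this sketch, not the dissertation's exact settings.

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.svm import SVR

        def cell_descriptors(quality_map, cell=16):
            """Split a per-pixel quality map into cells and summarize each cell.

            quality_map would come from a local metric (e.g., an SSIM map);
            the cell size of 16 and the mean/std summary are illustrative.
            """
            h, w = quality_map.shape
            descs = []
            for y in range(0, h - cell + 1, cell):
                for x in range(0, w - cell + 1, cell):
                    block = quality_map[y:y + cell, x:x + cell]
                    descs.append([block.mean(), block.std()])
            return np.asarray(descs)

        def bag_of_words(descs, kmeans):
            """Encode local descriptors as a normalized visual-word histogram."""
            words = kmeans.predict(descs)
            hist = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
            return hist / hist.sum()

        def train(maps, mos, n_words=64):
            """maps: per-image quality maps; mos: subjective quality scores."""
            all_descs = np.vstack([cell_descriptors(m) for m in maps])
            kmeans = KMeans(n_clusters=n_words, n_init=10).fit(all_descs)
            X = np.vstack([bag_of_words(cell_descriptors(m), kmeans) for m in maps])
            return kmeans, SVR(kernel="rbf").fit(X, mos)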

    Real-time neural network based video super-resolution as a service: design and implementation of a real-time video super-resolution service using public cloud services

    Despite the advancements in video streaming, we still find limitations when real-time video must be streamed at a higher resolution (e.g., in super-resolution) to mobile devices with limited resources. This thesis aims to address this challenge through a cloud service. Two main code components were used to create this service. The first was aiortc (the Python implementation of WebRTC), the streaming protocol. The second was the Efficient Sub-Pixel Convolutional Neural Network (ESPCN) model, one of the outstanding methods for upscaling video at the present time. These two components were implemented in a virtual machine in the Microsoft Azure cloud environment with a customized configuration. Qualitative as well as quantitative results were obtained and analyzed. To obtain the qualitative results, two versions of the ESPCN model were developed; for the quantitative outcomes, three different configurations of HW/SW codecs and CPU/GPU utilization were produced and analyzed. Besides establishing the code components mentioned above as a sound basis for an efficient cloud-based real-time video super-resolution service, another conclusion of this project is that transferring frames between the CPU and the GPU has a very large negative impact on the efficiency of the whole service. Hence, limiting this CPU-GPU interaction, or using only the GPU (e.g., with the NVIDIA Video Processing Framework [VPF]), is critical for an efficient service. As the quantitative results show, this issue can be avoided if a codec that only makes use of the GPU (e.g., an NVIDIA HW codec) is employed. Furthermore, the Azure cloud environment enables efficient execution of the service on diverse mobile devices. Measuring the quality of the video super-resolution performed by the ESPCN model is suggested as future work.
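    Since ESPCN's published architecture (Shi et al., 2016) is compact, a sketch helps make the second component concrete. The PyTorch version below follows the commonly cited three-convolution layout with a final sub-pixel shuffle; the thesis's two model variants and their exact hyperparameters are not reproduced here.

        import torch
        import torch.nn as nn

        class ESPCN(nn.Module):
            """Sub-pixel super-resolution network in the spirit of Shi et al. (2016).

            Operates on a single channel (e.g., the Y channel of YCbCr);
            `scale` is the upscaling factor r, realized by PixelShuffle.
            """

            def __init__(self, scale: int = 2):
                super().__init__()
                self.body = nn.Sequential(
                    nn.Conv2d(1, 64, kernel_size=5, padding=2), nn.Tanh(),
                    nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.Tanh(),
                    nn.Conv2d(32, scale * scale, kernel_size=3, padding=1),
                    nn.PixelShuffle(scale),  # rearranges channels into an upscaled frame
                )

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return self.body(x)

        # A low-resolution frame upscaled by a factor of 2:
        frame = torch.rand(1, 1, 270, 480)
        print(ESPCN(scale=2)(frame).shape)  # torch.Size([1, 1, 540, 960])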

    Monte Carlo Method with Heuristic Adjustment for Irregularly Shaped Food Product Volume Measurement

    Volume measurement plays an important role in the production and processing of food products. Various methods have been proposed to measure the volume of irregularly shaped food products based on 3D reconstruction. However, 3D reconstruction comes with a high computational cost, and some of the volume measurement methods based on it have low accuracy. Another approach measures the volume of objects using the Monte Carlo method, which performs volume measurement using random points: it only requires information on whether each random point falls inside or outside the object, and it does not require a 3D reconstruction. This paper proposes volume measurement of irregularly shaped food products using a computer vision system, without 3D reconstruction, based on the Monte Carlo method with a heuristic adjustment. Five images of a food product were captured using five cameras and processed to produce binary images. Monte Carlo integration with the heuristic adjustment was then performed to measure the volume based on the information extracted from the binary images. The experimental results show that the proposed method provides high accuracy and precision compared to the water displacement method. In addition, the proposed method is more accurate and faster than the space carving method.
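    The core Monte Carlo step can be sketched directly from the description: sample random points in a bounding box, project each into every camera's binary image, and count a point as inside only if all silhouettes agree; the volume estimate is the inside fraction times the box volume. The `project` callback is a placeholder for calibrated camera projection, and the paper's heuristic adjustment is not reproduced, so this is a simplified stand-in rather than the published method.

        import numpy as np

        def mc_volume(silhouettes, project, bbox_min, bbox_max,
                      n_samples=100_000, rng=np.random.default_rng(0)):
            """Estimate object volume from multi-view binary silhouettes.

            silhouettes : list of 2D boolean arrays (True = object pixel).
            project     : project(point_3d, view_index) -> (row, col) pixel;
                          a placeholder for real camera calibration. Bounds
                          checking of projected pixels is omitted here.
            """
            lo = np.asarray(bbox_min, dtype=float)
            hi = np.asarray(bbox_max, dtype=float)
            points = rng.uniform(lo, hi, size=(n_samples, 3))
            inside = 0
            for p in points:
                # A point counts as inside only if every view's silhouette
                # contains its projection (visual-hull membership test).
                if all(silhouettes[i][project(p, i)] for i in range(len(silhouettes))):
                    inside += 1
            box_volume = np.prod(hi - lo)
            return box_volume * inside / n_samples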

    Data-driven visual quality estimation using machine learning

    Today a lot of visual content is produced and accessible, due to improvements in technology such as smartphones and the internet. This creates a need to assess the quality perceived by users in order to further improve the experience. However, only a few of the state-of-the-art quality models are specifically designed for higher resolutions, predict more than the mean opinion score, or use machine learning. One goal of this thesis is to train and evaluate such machine learning models for higher resolutions on several datasets. First, an objective evaluation of image quality at higher resolutions is performed. The images are compressed using video encoders, and it is shown that AV1 performs best in terms of quality and compression. This evaluation is followed by the analysis of a crowdsourcing test on image quality in comparison with a lab test. Afterward, deep learning-based models for image quality prediction and an extension for video quality are proposed. However, the deep learning-based video quality model is not practically usable because of performance constraints. For this reason, pixel-based video quality models using well-motivated features covering image and motion aspects are proposed and evaluated. These models can be used to predict mean opinion scores for videos, or even other video quality-related information, such as rating distributions. The introduced model architecture can be applied to other video problems, such as video classification, gaming video quality prediction, gaming genre classification or encoding parameter estimation. Furthermore, one important aspect is the processing time of such models. Hence, a generic approach to speed up state-of-the-art video quality models is introduced, which shows that a significant amount of processing time can be saved while achieving similar prediction accuracy. The models have been made publicly available as open source so that the developed frameworks can be used for further research. Moreover, the presented approaches may be usable as building blocks for newer media formats.
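    To make the pixel-based recipe concrete, a hedged sketch follows: pool cheap per-frame image and motion statistics over time and fit a regressor to mean opinion scores. The three statistics and the random-forest regressor are illustrative stand-ins; the thesis's actual feature set and learner are not specified in this abstract.

        import numpy as np
        from sklearn.ensemble import RandomForestRegressor

        def video_features(frames):
            """Pool simple image and motion features over a grayscale video.

            frames: array of shape (T, H, W) with values in [0, 1]. Contrast,
            spatial gradient energy, and temporal difference only stand in for
            the better-motivated feature set used in the thesis.
            """
            frames = np.asarray(frames, dtype=float)
            contrast = frames.std(axis=(1, 2))                         # per-frame contrast
            grad = np.abs(np.diff(frames, axis=2)).mean(axis=(1, 2))   # spatial activity
            motion = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2)) # temporal activity
            return np.asarray([contrast.mean(), contrast.std(),
                               grad.mean(), grad.std(),
                               motion.mean(), motion.std()])

        def train_quality_model(videos, mos):
            """videos: list of (T, H, W) frame arrays; mos: subjective scores."""
            X = np.vstack([video_features(v) for v in videos])
            return RandomForestRegressor(n_estimators=200).fit(X, mos)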

    AXMEDIS 2008

    The AXMEDIS International Conference series aims to explore all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, protection and rights management, and to address the latest developments and future trends of these technologies and their applications, impacts and exploitation. The AXMEDIS events offer venues for exchanging concepts, requirements, prototypes, research ideas, and findings that can contribute to academic research and also benefit business and industrial communities. In the Internet and digital era, cross-media production and distribution are key developments and innovations, fostered by emergent technologies to ensure better value for money while optimising productivity and market coverage.

    Persönliche Wege der Interaktion mit multimedialen Inhalten (Personal ways of interacting with multimedia content)

    Today the world of multimedia is almost completely device- and content-centered. It focuses its energy nearly exclusively on technical issues such as computing power, network specifics, or content and device characteristics and capabilities. In most multimedia systems, the presentation of multimedia content and the basic controls for playback are the main concerns, which most often results in a very passive user experience, comparable to that of traditional TV. In the face of recent developments and changes in the realm of multimedia and mass media, this "traditional" focus seems outdated. The increasing use of multimedia content on mobile devices, along with the continuous growth in the amount and variety of content available, makes an urgent re-orientation of this domain necessary. Given the increasingly difficult situation faced by users of such systems, it is only logical that these individuals be brought to the center of attention. In this thesis we address these trends and developments by applying concepts and mechanisms to multimedia systems that were first introduced in the domain of user-centrism. Central to the concept of user-centrism is that devices should provide users with an easy way to access services and applications. Thus, the current challenge is to combine mobility, additional services and easy access in a single, user-centric approach. This thesis presents a framework for introducing and supporting several of the key concepts of user-centrism in multimedia systems; additionally, a new definition of a user-centric multimedia framework has been developed and implemented. To satisfy the user's need for mobility and flexibility, our framework makes seamless media and service consumption possible. The main aim of session mobility is to help people cope with the increasing number of devices in use: using a mobile agent system, multimedia sessions can be transferred between different devices in a context-sensitive way, and the use of the international standard MPEG-21 guarantees extensibility and the integration of content adaptation mechanisms. Furthermore, a concept is presented that allows for individualized and personalized content selection and addresses the need to find appropriate content, all in an easy and intuitive way. Especially in the realm of television, the demand that such systems cater to the needs of the audience is constantly growing. Our approach combines content-filtering methods, state-of-the-art classification techniques and mechanisms well known from the areas of information retrieval and text mining, all utilized for the generation of recommendations in a promising new way; concepts from the area of collaborative tagging systems are used as well. An extensive experimental evaluation yielded several interesting findings and proves the applicability of our approach. In contrast to the "lean-back" experience of traditional media consumption, interactive media services offer a way to enable the active participation of the audience. Thus, we present a concept which enables the use of interactive media services on mobile devices in a personalized way. Finally, a use case for enriching TV with additional content and services demonstrates the feasibility of this concept.
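    As a minimal sketch of how content-filtering and collaborative-tagging signals might be blended into one recommendation score: the linear blend, the TF-IDF content profiles, and the Jaccard tag overlap below are assumptions of this sketch, not the thesis's published design.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def hybrid_scores(item_texts, item_tags, profile_text, profile_tags, alpha=0.7):
            """Blend content-based and tag-based relevance for each item.

            alpha weights the content term; 0.7 is an arbitrary illustrative choice.
            """
            # Content side: cosine similarity between TF-IDF vectors of item
            # descriptions (e.g., EPG texts) and the user's interest profile.
            vec = TfidfVectorizer()
            tfidf = vec.fit_transform(list(item_texts) + [profile_text])
            content = cosine_similarity(tfidf[:-1], tfidf[-1]).ravel()

            # Tag side: Jaccard overlap between an item's community tags and
            # the tags the user has applied or favored.
            def jaccard(a, b):
                a, b = set(a), set(b)
                return len(a & b) / len(a | b) if a | b else 0.0

            tags = np.array([jaccard(t, profile_tags) for t in item_tags])
            return alpha * content + (1 - alpha) * tags

        scores = hybrid_scores(
            ["football highlights and analysis", "cooking show with italian recipes"],
            [["sports", "soccer"], ["food", "cooking"]],
            profile_text="soccer match analysis",
            profile_tags=["sports"],
        )
        print(scores.argmax())  # index of the best-matching item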

    A comparison of statistical machine learning methods in heartbeat detection and classification

    In health care, patients with heart problems require quick responsiveness in a clinical setting or in the operating theatre. Towards that end, automated classification of heartbeats is vital, as some heartbeat irregularities are time-consuming to detect. Analysis of electrocardiogram (ECG) signals is therefore an active area of research. The methods proposed in the literature depend on the structure of a heartbeat cycle. In this paper, we use interval- and amplitude-based features together with a few samples from the ECG signal as a feature vector. We study a variety of classification algorithms, focusing especially on a type of arrhythmia known as the ventricular ectopic beat (VEB). We compare the performance of the classifiers against algorithms proposed in the literature and make recommendations regarding features, sampling rate, and the choice of classifier to apply in a real-time clinical setting. The extensive study is based on the MIT-BIH arrhythmia database. Our main contributions are the evaluation of existing classifiers over a range of sampling rates, the recommendation of a detection methodology to employ in a practical setting, and the extension of the notion of a mixture of experts to a larger class of algorithms
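    The paper's feature vector (interval- and amplitude-based features plus a few raw ECG samples) can be sketched as follows; the window size, the specific choice of pre/post RR intervals and R-peak amplitude, and the SVM classifier are illustrative assumptions rather than the study's recommended configuration.

        import numpy as np
        from sklearn.svm import SVC

        def beat_features(signal, r_peaks, i, n_samples=8):
            """Feature vector for the i-th beat: intervals, amplitude, raw samples.

            signal  : 1D ECG array; r_peaks : sample indices of detected R-peaks.
            Uses the pre- and post-beat RR intervals, the R-peak amplitude, and
            n_samples raw points around the peak (window size is an assumption).
            """
            pre_rr = r_peaks[i] - r_peaks[i - 1]
            post_rr = r_peaks[i + 1] - r_peaks[i]
            amplitude = signal[r_peaks[i]]
            half = n_samples // 2
            window = signal[r_peaks[i] - half:r_peaks[i] + half]
            return np.concatenate(([pre_rr, post_rr, amplitude], window))

        def train_beat_classifier(signal, r_peaks, labels):
            """Fit a classifier on all beats that have both neighbours available.

            labels: array of per-beat classes (e.g., normal vs. VEB, following
            the AAMI beat classes used with the MIT-BIH database).
            """
            X = np.vstack([beat_features(signal, r_peaks, i)
                           for i in range(1, len(r_peaks) - 1)])
            y = np.asarray(labels)[1:-1]
            return SVC(kernel="rbf").fit(X, y)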