2,349 research outputs found

    Single-Channel Data Broadcasting under Small Waiting Latency

    Get PDF
    Due to the advancement of network technology, video-on-demand (VoD) services are growing in popularity. However, allocating an individual stream to each client request can easily overload a VoD system when its network and disk bandwidth cannot keep pace with client growth. This study thus presents a fundamentally different approach by focusing solely on a class of applications identified as latency-tolerant applications. Because video broadcasting does not provide interactive (i.e., VCR) functions, a client is able to tolerate playback latency from a video server. One efficient broadcasting method is periodic broadcasting, which divides a video into smaller segments and broadcasts these segments periodically on multiple channels. However, numerous practical systems, such as digital video broadcasting-handheld (DVB-H), do not allow clients to download video data from multiple channels because clients usually have only one tuner. To resolve this problem in multiple-channel broadcasting, this study proposes a novel single-channel broadcasting scheme, which further leverages segment-broadcasting capability for more efficient video delivery. The comparison results show that, under the same broadcasting bandwidth settings, the proposed scheme outperforms the alternative broadcasting scheme, the hopping insertion scheme, SingBroad, PAS, and the reverse-order scheduling scheme in terms of maximal waiting time.
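
    To make the periodic-broadcasting idea concrete, here is a toy Python sketch (illustrative only; the schedules are made up and this is not the paper's single-channel scheme) that measures the worst-case start-up wait of a repeating single-channel slot schedule, which is governed by how often the first segment recurs:

    # Toy single-channel periodic broadcast. A schedule is a list of
    # segment ids sent in consecutive unit-length slots, repeated forever.
    # A client can begin playback at the next broadcast of segment 0
    # (real schemes must also verify later segments arrive in time).

    def max_startup_wait(schedule):
        """Worst-case number of slots a client waits for segment 0."""
        n = len(schedule)
        slots0 = [i for i, seg in enumerate(schedule) if seg == 0]
        gaps = [((slots0[(k + 1) % len(slots0)] - slots0[k]) % n) or n
                for k in range(len(slots0))]
        return max(gaps)

    round_robin = [0, 1, 2, 3]            # every segment once per cycle
    skewed      = [0, 1, 0, 2, 0, 3]      # segment 0 sent more often
    print(max_startup_wait(round_robin))  # 4 slots
    print(max_startup_wait(skewed))       # 2 slots

    Repeating early segments more often, at the cost of a longer overall cycle, is the basic trade-off that periodic broadcasting schemes exploit.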

    AI-Generated Content (AIGC): A Survey

    Full text link
    To address the challenges of digital intelligence in the digital economy, artificial intelligence-generated content (AIGC) has emerged. AIGC uses artificial intelligence to assist or replace manual content generation by generating content based on user-inputted keywords or requirements. The development of large model algorithms has significantly strengthened the capabilities of AIGC, which makes AIGC products a promising generative tool and adds convenience to our lives. As an upstream technology, AIGC has unlimited potential to support different downstream applications. It is important to analyze AIGC's current capabilities and shortcomings to understand how it can be best utilized in future applications. Therefore, this paper provides an extensive overview of AIGC, covering its definition, essential conditions, cutting-edge capabilities, and advanced features. Moreover, it discusses the benefits of large-scale pre-trained models and the industrial chain of AIGC. Furthermore, the article explores the distinctions between auxiliary generation and automatic generation within AIGC, providing examples of text generation. The paper also examines the potential integration of AIGC with the Metaverse. Lastly, the article highlights existing issues and suggests some future directions for application. Comment: Preprint; 14 figures, 4 tables.

    Highly efficient low-level feature extraction for video representation and retrieval.

    Get PDF
    PhD thesis. Witnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current content-based video indexing and retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval in order to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from compressed-domain features and on robust, scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm, which runs in real time while maintaining high precision and recall in the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates a hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords.
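
    For flavour, a minimal key-frame selection sketch in Python (a generic histogram-difference baseline, assuming frames arrive as numpy arrays; the thesis's own algorithm works on compressed-domain prediction information, which this toy version does not use):

    import numpy as np

    def histogram(frame, bins=32):
        """Normalised intensity histogram of one frame."""
        h, _ = np.histogram(frame, bins=bins, range=(0, 255))
        return h / h.sum()

    def key_frames(frames, threshold=0.25):
        """Indices of frames whose histogram differs from the previous
        key frame by more than `threshold` (L1 distance)."""
        keys = [0]
        ref = histogram(frames[0])
        for i, frame in enumerate(frames[1:], start=1):
            h = histogram(frame)
            if np.abs(h - ref).sum() > threshold:
                keys.append(i)
                ref = h
        return keys

    # Synthetic example: three "shots" of 10 frames each, with
    # different mean brightness standing in for real shot changes.
    rng = np.random.default_rng(0)
    frames = [np.clip(rng.normal(mean, 10, (48, 64)), 0, 255)
              for mean in (60, 120, 200) for _ in range(10)]
    print(key_frames(frames))  # e.g. [0, 10, 20]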

    Computational inference and control of quality in multimedia services

    Get PDF
    Quality is the degree of excellence we expect of a service or a product; it is also one of the key factors that determine its value. For multimedia services, understanding the experienced quality means understanding how the delivered fidelity, precision and reliability correspond to the users' expectations. Yet the quality of multimedia services is inextricably linked to the underlying technology. It is developments in video recording, compression and transport, as well as display technologies, that enable high-quality multimedia services to become ubiquitous. The constant evolution of these technologies delivers a steady increase in performance, but also a growing level of complexity. As new technologies stack on top of each other, the interactions between them and their components become more intricate and obscure. In this environment, optimizing the delivered quality of multimedia services becomes increasingly challenging. The factors that affect the experienced quality, or Quality of Experience (QoE), tend to have complex non-linear relationships. The subjectively perceived QoE is hard to measure directly and continuously evolves with the user's expectations. Faced with the difficulty of designing an expert system for QoE management that relies on painstaking measurements and intricate heuristics, we turn to an approach based on learning, or inference. The set of solutions presented in this work rely on computational intelligence techniques that perform inference over the large set of signals coming from the system to deliver QoE models based on user feedback. We furthermore present solutions for the inference of optimized control in systems with no guarantees for resource availability. This approach offers the opportunity to be more accurate in assessing the perceived quality, to incorporate more factors and to adapt as technology and user expectations evolve. In a similar fashion, the inferred control strategies can uncover more intricate patterns coming from the sensors and therefore implement farther-reaching decisions. As in natural systems, this continuous adaptation and learning makes these systems more robust to perturbations in the environment and gives them longer-lasting accuracy and higher efficiency in dealing with increased complexity. Overcoming this increasing complexity and diversity is crucial for addressing the challenges of future multimedia systems. Through experiments and simulations, this work demonstrates that adopting a learning-based approach can improve subjective and objective QoE estimation and enable the implementation of efficient and scalable QoE management as well as efficient control mechanisms.
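
    As a minimal sketch of the learning-based QoE approach described above (the feature names, data and model choice here are hypothetical, not taken from the thesis):

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(1)
    # Hypothetical system signals: [bitrate_mbps, packet_loss_pct, rebuffer_s]
    X = np.column_stack([rng.uniform(0.5, 8.0, 500),
                         rng.uniform(0.0, 5.0, 500),
                         rng.uniform(0.0, 10.0, 500)])
    # Synthetic "user feedback" (MOS on a 1-5 scale) with a non-linear
    # dependence on the signals, standing in for real subjective ratings.
    mos = np.clip(1 + 2 * np.log1p(X[:, 0]) - 0.4 * X[:, 1] - 0.2 * X[:, 2]
                  + rng.normal(0, 0.3, 500), 1, 5)

    # Learn a QoE model from feedback; non-linear relationships between
    # signals and perceived quality are captured without hand heuristics.
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X, mos)
    print(model.predict([[4.0, 0.5, 1.0]]))  # predicted MOS for one session

    Retraining such a model as new feedback arrives is one way to realize the continuous adaptation the thesis argues for.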

    The harmonious coexistence of sound and image for efficient audiovisual communication: the case of Kairos Communications LTD

    Get PDF
    Master's internship report in Communication Sciences (specialization in Audiovisual and Multimedia). The present report is the result of a three-month traineeship at Kairos Communications LTD in Maynooth, Ireland, which has long experience in cultural and religious sound and video productions, and was an opportunity to put into practice the audiovisual and multimedia knowledge acquired at the University of Minho. Although the traineeship was focused on the production and post-production of content, the Kairos Outside Broadcasting Unit was a great asset for improving technical skills. Based on the empirical experience acquired in the internship, this report focuses on the harmonious coexistence of sound and image for efficient audiovisual communication. Sound and image correlate with each other as complementary elements in much of the digital media content on today's different digital platforms and applications. The increasing access to smartphones with great capabilities to record, edit and share sound and image has turned most users into content producers, so that today any cultural, political or social event most probably has someone capturing sound or image. From the knowledge acquired through the research, an overview of the development of communication in the Democratic Republic of Congo is presented. Oral tradition has been the instrument for passing on knowledge to younger generations and conveying information to the public. The development of communication in the Democratic Republic of Congo is linked to the former colonial power (Belgium) and to France. France provided training as well as equipment to bring former radio journalists up to date with television, which had been spreading across most of the world as an instrument of national pride. In fact, the development of the media, mostly radio broadcasting at first and then television, was a great instrument of political propaganda for the newly independent African countries. Every country set up a radio broadcaster to free itself from any other dependency. It was conceived as a great instrument for disseminating ideologies to the population. The forms of communication (verbal, non-verbal, written, etc.), the power and revolution of words and images in the world and in Africa, and the evolution in Congo from oral to digital communication are the focus of this report, which tries to understand what led the Democratic Republic of Congo to the new media environment in which word and image intermingle. Part of this work was supervised in the scope of the project "Audire - Audio Repository: saving sonic-based memories", co-funded by the Operational Programme for Competitiveness and Internationalization and by the Portuguese Foundation of Science and Technology (PTDC-COM-CSS/32159/2017). This has informed the theoretical framework of the present work, specifically on the role of sound and its relationship with image in the evolution of communication models and the respective emancipation of communities in the Democratic Republic of Congo.

    Scalable on-demand streaming of stored complex multimedia

    Get PDF
    Previous research has developed a number of efficient protocols for streaming popular multimedia files on-demand to potentially large numbers of concurrent clients. These protocols can achieve server bandwidth usage that grows much slower than linearly with the file request rate, and with the inverse of the client start-up delay. This thesis makes the following three main contributions to the design and performance evaluation of such protocols. The first contribution is an investigation of the network bandwidth requirements for scalable on-demand streaming. The results suggest that the minimum required network bandwidth for scalable on-demand streaming typically scales as K/ln(K) as the number of client sites K increases for a fixed request rate per client site, and as ln(N/(N·D+1)) as the total file request rate N increases or the client start-up delay D decreases, for a fixed number of sites. Multicast delivery trees configured to minimize network bandwidth usage rather than latency are found to only modestly reduce the minimum required network bandwidth. Furthermore, it is possible to achieve close to the minimum possible network and server bandwidth usage simultaneously with practical scalable delivery protocols. Second, the thesis addresses the problem of scalable on-demand streaming of a more complex type of media than is typically considered, namely variable bit rate (VBR) media. A lower bound on the minimum required server bandwidth for scalable on-demand streaming of VBR media is derived. The lower bound analysis motivates the design of a new immediate-service protocol termed VBR bandwidth skimming (VBRBS) that uses constant bit rate streaming when sufficient client storage space is available, yet fruitfully exploits the knowledge of a VBR profile. Finally, the thesis proposes non-linear media containing parallel sequences of data frames, among which clients can dynamically select at designated branch points, and investigates the design and performance issues in scalable on-demand streaming of such media. Lower bounds on the minimum required server bandwidth for various non-linear media scalable on-demand streaming approaches are derived, practical non-linear media scalable delivery protocols are developed, and, as a proof of concept, a simple scalable delivery protocol is implemented in a non-linear media streaming prototype system.
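
    For intuition, the quoted logarithmic scaling can be recovered from a standard segment-sharing argument (a sketch under simplifying assumptions of Poisson request arrivals and fluid delivery, not necessarily the thesis's exact derivation). A client arriving at time t needs file position x only by time t + d + x, where d is the start-up delay, so one multicast of position x can serve all arrivals in a window of length x + d:

    \[
      B_{\min} = \int_0^T \frac{\lambda \, dx}{1 + \lambda (x + d)}
               = \ln \frac{1 + \lambda (T + d)}{1 + \lambda d}
               \approx \ln \frac{N}{N D + 1},
    \]

    where T is the file duration, \lambda the request rate, N = \lambda T, and D = d/T; the approximation holds for large N and small D, matching the stated ln(N/(N·D+1)) behaviour.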

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

    Anomaly Detection, Rule Adaptation and Rule Induction Methodologies in the Context of Automated Sports Video Annotation.

    Get PDF
    Automated video annotation is a topic of considerable interest in computer vision due to its applications in video search, object-based video encoding and enhanced broadcast content. The domain of sport broadcasting, in particular, is the subject of current research attention due to its fixed, rule-governed content. This research work aims to develop, analyze and demonstrate novel methodologies that can be useful in the context of adaptive and automated video annotation systems. In this thesis, we present methodologies for addressing the problems of anomaly detection, rule adaptation and rule induction for court-based sports such as tennis and badminton. We first introduce an HMM induction strategy for a court-model based method that uses the court structure, in the form of a lattice, for two related modalities of singles and doubles tennis to tackle the problems of anomaly detection and rectification. We also introduce another anomaly detection methodology that is based on the disparity between low-level vision-based classifiers and a high-level contextual classifier. A further approach to the problem of rule adaptation is proposed that employs convex hulling of the anomalous states. We also investigate a number of novel hierarchical HMM-generating methods for the stochastic induction of game rules. These methodologies include Cartesian-product Label-based Hierarchical Bottom-up Clustering (CLHBC), which employs prior information within the label structures. A new constrained variant of the classical Chinese Restaurant Process (CRP) is also introduced that is relevant to sports games. We also propose two hybrid methodologies in this context, and a comparative analysis is made against the flat Markov model. Finally, we show that these methods are generalizable to other rule-based environments.
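
    A minimal sketch of likelihood-based HMM anomaly detection in Python (in the spirit of flagging sequences that a learned game model explains poorly; the model parameters below are made up and are not the thesis's induced rules):

    import numpy as np

    def log_likelihood(obs, pi, A, B):
        """Scaled forward algorithm: log P(obs | HMM)."""
        alpha = pi * B[:, obs[0]]
        ll = np.log(alpha.sum())
        alpha /= alpha.sum()
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]
            ll += np.log(alpha.sum())
            alpha /= alpha.sum()
        return ll

    pi = np.array([0.5, 0.5])                # initial state distribution
    A  = np.array([[0.9, 0.1], [0.1, 0.9]])  # "sticky" state transitions
    B  = np.array([[0.8, 0.2], [0.2, 0.8]])  # emission probabilities
    typical   = [0, 0, 0, 1, 1, 1]
    anomalous = [0, 1, 0, 1, 0, 1]
    for seq in (typical, anomalous):
        # A lower per-symbol log-likelihood suggests an anomalous sequence.
        print(log_likelihood(seq, pi, A, B) / len(seq))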

    ATOM : a distributed system for video retrieval via ATM networks

    Get PDF
    The convergence of high-speed networks, powerful personal computer processors and improved storage technology has led to the development of video-on-demand services to the desktop that provide interactive controls and deliver Client-selected video information on a Client-specified schedule. This dissertation presents the design of a video-on-demand system for Asynchronous Transfer Mode (ATM) networks, incorporating an optimised topology for the nodes in the system and an architecture for Quality of Service (QoS). The system is called ATOM, which stands for Asynchronous Transfer Mode Objects. Real-time video playback over a network consumes large bandwidth and requires strict bounds on delay and error in order to satisfy the visual and auditory needs of the user. Streamed video is a fundamentally different type of traffic from conventional IP (Internet Protocol) data, since files are viewed in real time rather than downloaded and then viewed. This streaming data must arrive at the Client decoder when needed or it loses its interactive value. Characteristics of multimedia data are investigated, including the use of compression to reduce the excessive bit rates and storage requirements of digital video. The suitability of MPEG-1 for video-on-demand is presented. Having considered the bandwidth, delay and error requirements of real-time video, the next step in designing the system is to evaluate current models of video-on-demand. The distributed nature of four such models is considered, focusing on how Clients discover Servers and locate videos. This evaluation eliminates a centralized approach, in which Servers have no logical or physical connection to any other Servers in the network, and also introduces the concept of a selection strategy for finding alternative Servers when Servers are fully loaded. During this investigation, it becomes clear that another entity (called a Broker) could provide a central repository for Server information: Clients gain logical access to all videos on every Server simply by connecting to a Broker. The ATOM model for distributed video-on-demand is then presented by way of a diagram of the topology showing the interconnection of Servers, Brokers and Clients; a description of each node in the system; a list of the connectivity rules; a description of the protocol; a description of the Server selection strategy; and the protocol to follow if a Broker fails. A sample network is provided with an example of video selection, and design issues are raised and solved, including how nodes discover each other, a justification for using a mesh topology for the Broker connections, how Connection Admission Control (CAC) is achieved, how customer billing is achieved and how information security is maintained. A calculation of the number of Servers and Brokers required to service a particular number of Clients is presented. The advantages of ATOM are described. The underlying distributed connectivity is abstracted away from the Client. Redundant Server/Broker connections are eliminated and the total number of connections in the system is minimized by the rule stating that Clients and Servers may only connect to one Broker at a time. This reduces the total number of Switched Virtual Circuits (SVCs), which are a performance hindrance in ATM. ATOM can be easily scaled by adding more Servers, which increases the total system capacity in terms of storage and bandwidth. In order to transport video satisfactorily, a guaranteed end-to-end Quality of Service architecture must be in place.
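
    A minimal Python sketch of the Broker idea just described (class and field names are illustrative, not taken from the ATOM implementation): the Broker holds a central repository of Server information and applies a selection strategy, here least relative load among Servers holding the requested video, returning None when all candidates are fully loaded:

    from dataclasses import dataclass, field

    @dataclass
    class Server:
        name: str
        capacity: int        # maximum concurrent streams
        videos: set          # video ids stored on this Server
        active: int = 0      # streams currently being served

    @dataclass
    class Broker:
        servers: list = field(default_factory=list)

        def select(self, video):
            """Pick the least-loaded Server holding `video`, or None."""
            candidates = [s for s in self.servers
                          if video in s.videos and s.active < s.capacity]
            if not candidates:
                return None  # a real system would try another strategy
            best = min(candidates, key=lambda s: s.active / s.capacity)
            best.active += 1
            return best

    broker = Broker([Server("s1", 2, {"film_a"}),
                     Server("s2", 4, {"film_a", "film_b"})])
    print(broker.select("film_a").name)  # "s1" (first of the idle Servers)
    print(broker.select("film_b").name)  # "s2" (only holder of film_b)
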
The design methodology for such an architecture is investigated, starting with a review of current QoS architectures in the literature, which highlights important definitions including a flow, a service contract and flow management. A flow is a single media source which traverses resource modules between Server and Client. The concept of a flow is important because it enables the identification of the areas requiring consideration when designing a QoS architecture. It is shown that ATOM adheres to the principles motivating the design of a QoS architecture, namely the Integration, Separation and Transparency principles. The issue of mapping human requirements to network QoS parameters is investigated and the action of a QoS framework is introduced, including several possible causes of QoS degradation. The design of the ATOM Quality of Service Architecture (AQOSA) is then presented. AQOSA consists of 11 modules which interact to provide end-to-end QoS guarantees for each stream. Several important results arise from the design. It is shown that an intelligent choice of stored videos with respect to peak bandwidth can improve overall system capacity. The concept of disk striping over a disk array is introduced and a Data Placement Strategy is designed which eliminates disk hot spots (i.e., overuse of some disks whilst others lie idle). A novel parameter (the B-P Ratio) is presented which can be used by the Server to predict future bursts from each video stream. The use of Traffic Shaping to decrease the load each stream places on the network is presented. After an investigation of four rewind and fast-forward algorithms from the literature, a rewind and fast-forward algorithm is presented. The method produces a significant decrease in bandwidth, and the resultant stream is very constant, reducing the chance that the stream will add to network congestion. The C++ classes of the Server, Broker and Client are described, emphasizing the interaction between classes. The use of ATOM in the Virtual Private Network and the multimedia teaching laboratory is considered. Conclusions and recommendations for future work are presented. It is concluded that digital video applications require high-bandwidth, low-error, low-delay networks; that a video-on-demand system supporting large Client volumes must be distributed, not centralized; that control and operation (transport) must be separated; that the number of ATM Switched Virtual Circuits (SVCs) must be minimized; that the increased number of connections caused by the Broker mesh is justified by the distributed information gain; and that a Quality of Service solution must address end-to-end issues. It is recommended that a web front-end for Brokers be developed; that the system be tested in a wide-area ATM network; that the Broker protocol be tested by forcing the failure of a Broker; and that a proprietary file format for disk striping be implemented.
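
    A minimal sketch of round-robin disk striping as described above (parameters are illustrative): consecutive blocks of each video are spread across the disk array, and staggering each video's starting disk keeps the frequently read opening blocks of different videos off the same disk, avoiding hot spots:

    def place_blocks(video_id, num_blocks, num_disks, start_offset=0):
        """Map block k of a video to disk (start_offset + k) % num_disks."""
        return {k: (start_offset + k) % num_disks for k in range(num_blocks)}

    layout_a = place_blocks("film_a", 8, num_disks=4, start_offset=0)
    layout_b = place_blocks("film_b", 8, num_disks=4, start_offset=1)
    print(layout_a)  # {0: 0, 1: 1, 2: 2, 3: 3, 4: 0, 5: 1, 6: 2, 7: 3}
    print(layout_b)  # film_b's opening block lands on a different disk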