We present a framework for the analysis of the decoding delay in multiview video coding (MVC). We show that in real-time applications, an accurate estimation of the decoding delay is essential to achieve a minimum communication latency. As opposed to single-view codecs, the complexity of the multiview prediction structure and the parallel decoding of several views requires a systematic analysis of this decoding delay, which we solve using graph theory and a model of the decoder hardware architecture. Our framework assumes a decoder implementation in general purpose multi-core processors with multi-threading capabilities. For this hardware model, we show that frame processing times depend on the computational load of the decoder and we provide an iterative algorithm to compute jointly frame processing times and decoding delay. Finally, we show that decoding delay analysis can be applied to design decoders with the objective of minimizing the communication latency of the MVC system.
Introduction
For several years, video technologies have targeted the development of systems that provide immersive viewing experiences. Advances in three-dimensional (3D) display technologies are now turning 3D video into an emerging market that is expected to be sustainable in the near future. 3D video (3DV) and free viewpoint video (FVV) are new types of visual media that expand the user's experience beyond what is offered by 2D video [1], providing a 3D depth impression of the scene and interactive viewpoint selection. These types of visual media systems are beginning to enter consumer markets, such as entertainment and mobile applications [2]. For those systems, a data format richer than a single 2D video signal is needed. The spectrum of data formats for 3D video ranges from purely image-based formats, like multiview video (multiple views of the same scene), to formats related to computer graphics, like 3D meshes and their corresponding textures [3]. A widely adopted approach combines multiview video with depth sequences as additional scene geometry information, which makes it possible to generate additional views at virtual camera positions [4]. Nevertheless, the size of this multiview video grows linearly with the number of views, while the available bandwidth is generally limited. Thus, an efficient compression scheme for multiview video is needed.
Multiview video coding (MVC) [5] is an extension of the H.264/MPEG-4 Advanced Video Coding (AVC) standard [6] that provides efficient coding of such multiview video. In addition, as depth signals can be represented as monochromatic video signals, MVC has also been commonly used to compress them [4]. As an extension of AVC, MVC makes use of the set of AVC coding tools. The key additional feature of the MVC design, which increases the coding efficiency specifically for multiview video, is a new prediction relationship between frames of different views that exploits the interview redundancy. This prediction relationship is known as interview prediction. Fig. 1 shows a sample prediction structure in which temporal and interview predictions are used.
MVC allows a wide range of applications and scenarios [7] . Here, we address real-time applications such as live broadcasting, videoconferencing or interactive streaming [8] where constraints on the end-to-end delay are imposed. The one-way delay between both ends of the conversation is known as communication latency, i.e., the delay between the instant when a frame is captured and the instant when it is displayed at the receiver. In bidirectional applications, the constraint on communication latency is stricter. For those, typical recommendations on maximum communication latency generally state that there is none or little impact below 150 ms, while a serious impact may be observed above 400 ms [9] .
Fig. 1. Example of a multiview prediction structure for two cameras. Horizontal arrows correspond to temporal prediction and vertical arrows to interview prediction.

Each element (encoder, transmission channel and decoder) contributes to the delay between the instant when a frame is captured and the instant when it is decoded at the receiver: the system delay. For each frame, the value of the system delay varies due to different factors, such as the required encoding time or the nature of the transmission channel (variable or constant bitrate, packet losses, etc.). Since frames have to be displayed at a constant rate, receivers generally use an output buffer for decoded frames to guarantee a constant communication latency. In practice, this buffer results in an additional variable delay for each decoded frame: the display delay. Therefore, the communication latency is the sum of the system delay and the display delay. In real-time applications, the design of this output buffer, and of the display delay, is a challenging issue. On the one hand, the display delay should be as small as possible, since it increases the communication latency. On the other hand, it has to be high enough to absorb the variability of the system delay so that frames are displayed at a constant rate. While in non-live services, such as video on demand, this display delay may be over-dimensioned with little impact on the service, the requirement is stricter in the case of real-time bidirectional services. Thus, an accurate computation of the system delay is essential to design a system with a minimum valid display delay.
In [10] , we presented a framework for the analysis of the encoding delay for MVC. Now, in this paper, we focus on the analysis of the contribution of MVC decoders to the system delay: the decoding delay. Our purpose here is to provide tools for an accurate evaluation of the decoding delay in order to complete the analysis of the contribution of the MVC codec processes to the system delay. The decoding delay in MVC decoders depends on two different but related factors:
1. The multiview prediction structure: temporal and interview prediction relationships among frames establish decoding order dependencies for a frame.
2. The hardware architecture and implementation of the decoder: specific architectural features of multiview decoders (e.g., number of processors, use of threads, etc.) influence the time needed to decode a given frame, and therefore they affect the decoding delay performance.
Whereas in single-view decoders the decoding delay can be easily approximated as the decoding time of one frame, in the case of MVC the complexity of multiview prediction structures, and the presence of several views that need to be decoded simultaneously, increase the complexity of the decoding delay analysis. Thus, we present here a framework for the systematic analysis of the decoding delay in MVC decoders. This framework evaluates the decoding delay taking into account: (i) the multiview prediction structure and (ii) the hardware implementation of the decoder. Current decoders support several parallel streams and different codecs, and the general tendency is to incorporate general purpose processors, in which the decoders are implemented in software, instead of traditional dedicated hardware processors. This tendency is particularly interesting for handling MVC streams due to their inherent parallelization characteristics [11, 12]. Therefore, our framework assumes a hardware platform for the decoder based on a multi-core processor with multi-threading capabilities. We define a decoding process as the set of operations that are needed to decode a frame. Our model assumes that any decoding process can run on an exclusively dedicated core (processor from now on) or on one of the threads that share the processing power of one of the processors. The time required to run that process will be higher if several processes share the same processor. Analogously to the encoding latency analysis [10], we rely on graph theory to compute the decoding delay for this hardware model. A graph is constructed from the multiview prediction structure, in which the frames are the nodes and the prediction dependencies are the edges. Each edge has an associated cost that represents the contribution of the prediction dependency to the decoding delay.
We show that frame processing times depend on the computational load of the decoder and we provide an iterative algorithm to compute jointly frame processing times and the decoding delay by an iterative analysis of the graph.
In our results, we use the decoding delay analysis to characterize the communication latency of a complete MVC system. We show that this analysis can be used to determine hardware requirements of MVC decoders, such as the number of processors or the processor throughput (number of frames that one processor is able to decode per second), with the objective of achieving a target communication latency. For example, we show that for a given processor throughput, the decoding delay can be reduced by increasing the number of processors in the decoder, up to a certain limit that we can identify. Increasing the number of processors above that limit does not further decrease the decoding delay. This paper is organized as follows: in Section 2, we discuss the communication latency of an MVC system and the role of the decoding delay in it. In Section 3, we present our framework for the decoding delay analysis in a multi-thread decoder architecture. In Section 4, we present the iterative algorithm for the computation of processing times and decoding delay. In Section 5, we show the experimental results and in Section 6 we present the conclusions.
Discussion on communication latency of MVC systems
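As mentioned above, the communication latency is the time elapsed between the instant when a frame is captured and the instant when that frame is displayed. A block diagram of an MVC system, and of the elements that add to the communication delay between its two ends, is depicted in Fig. 2. For frame $x_j^i$ (frame $j$ of view $i$), $t_{\mathrm{capt}_j^i}$ is the instant when $x_j^i$ is captured, $t_{\mathrm{cod}_j^i}$ is the instant when $x_j^i$ is completely coded, $t_{\mathrm{RX}_j^i}$ is the instant when $x_j^i$ is received at the decoder, and $t_{\mathrm{dec}_j^i}$ is the instant when $x_j^i$ is completely decoded. The encoding delay $\delta_{\mathrm{cod}_j^i}$ is:

\[ \delta_{\mathrm{cod}_j^i} = t_{\mathrm{cod}_j^i} - t_{\mathrm{capt}_j^i} \tag{1} \]

the transmission delay $\delta_{\mathrm{TX}_j^i}$ is:

\[ \delta_{\mathrm{TX}_j^i} = t_{\mathrm{RX}_j^i} - t_{\mathrm{cod}_j^i} \tag{2} \]

and the decoding delay $\delta_{\mathrm{dec}_j^i}$ is:

\[ \delta_{\mathrm{dec}_j^i} = t_{\mathrm{dec}_j^i} - t_{\mathrm{RX}_j^i} \tag{3} \]

We define the system delay $\delta_{\mathrm{sys}_j^i}$ as the time elapsed between the capture time of $x_j^i$ and the instant when it is completely decoded in the receiver. Formally, it can be expressed as:

\[ \delta_{\mathrm{sys}_j^i} = t_{\mathrm{dec}_j^i} - t_{\mathrm{capt}_j^i} = \delta_{\mathrm{cod}_j^i} + \delta_{\mathrm{TX}_j^i} + \delta_{\mathrm{dec}_j^i} \tag{4} \]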
This system delay varies from frame to frame, due to the variability of the encoding and decoding delays introduced by the characteristics of the encoding process (different frame types, the prediction structure, etc.) and to the variable nature of the delay on transmission channels. In order to maintain a constant display frame rate, the communication latency must have a constant value for all frames. Therefore, to absorb the variability of the delay added by the previous blocks, the receiver uses a buffer for decoded frames that adds a display delay, $\delta_{\mathrm{disp}_j^i}$, which is variable for each frame. Thus, in practice, the communication latency can be expressed as:

\[ \mathrm{Lat} = \delta_{\mathrm{sys}_j^i} + \delta_{\mathrm{disp}_j^i} \tag{5} \]

In real-time applications, the optimal communication latency is the minimum delay that allows the receiver to maintain a constant display rate. This latter condition means that the communication latency cannot be lower than the system delay of any frame, since this would assign a negative display delay to some frame. That is, for a given MVC system, valid communication latency values have to fulfill the following condition:

\[ \mathrm{Lat} \geq \max_{i,j} \delta_{\mathrm{sys}_j^i} \tag{6} \]

Therefore, to achieve the minimum valid communication latency value, the following condition must hold:

\[ \mathrm{Lat}_{\min} = \max_{i,j} \delta_{\mathrm{sys}_j^i} \tag{7} \]

where $N$ is the number of views and $M$ is the number of frames per view, so that the maximum is taken over all $N \times M$ frames. If the condition in (7) holds, the decoder does not add any display delay to the frame with the highest system delay. Formally:

\[ \delta_{\mathrm{disp}_{j^*}^{i^*}} = 0, \quad (i^*, j^*) = \arg\max_{i,j} \delta_{\mathrm{sys}_j^i} \tag{8} \]

Therefore, identifying the frame with the highest system delay, and the accurate value of that delay, is essential to obtain the minimum communication latency in well-designed real-time applications. In the next section, we present the framework for the characterization of one of the elements that contribute to that system delay, the decoding delay $\delta_{\mathrm{dec}_j^i}$, in order to complete the analysis of the contribution of MVC codecs to the communication latency.

Analysis of decoding delay in MVC

In this section, we present the elements and the algorithms for the systematic analysis of the decoding delay. Firstly, we discuss the general time relationships in the decoder that allow us to evaluate the decoding delay. Secondly, we present the main elements to compute the decoding delay for any multiview prediction structure in a multi-processor decoder: (i) the parallel multi-processor decoder model and (ii) the directed acyclic graph model. In the next section we present the iterative algorithm to compute jointly the frame processing times and the decoding delay in that decoder architecture.

In order to develop a systematic analysis of the decoding delay in multiview decoders, we make the following assumptions:

1. All views have to be decoded, either because all of them will be displayed or because the receiver will be able to choose any of the received views for display at any time.
2. A frame is the basic decoding unit and is decoded sequentially, i.e., different decoding operations for a given frame cannot be performed in parallel at the same time in several processors.
3. The decoding of a new frame does not start until its reference frames have been completely decoded.

The decoding delay $\delta_{\mathrm{dec}_j^i}$, as defined in (3), is the difference between the instant when $x_j^i$ is completely decoded, $t_{\mathrm{dec}_j^i}$, and the instant when $x_j^i$ arrives at the decoder, $t_{\mathrm{RX}_j^i}$. From (1)-(3), $t_{\mathrm{RX}_j^i}$ can be computed as:

\[ t_{\mathrm{RX}_j^i} = t_{\mathrm{capt}_j^i} + \delta_{\mathrm{cod}_j^i} + \delta_{\mathrm{TX}_j^i} \tag{9} \]

where the capture time $t_{\mathrm{capt}_j^i}$ is known, $\delta_{\mathrm{cod}_j^i}$ can be estimated using [10], and $\delta_{\mathrm{TX}_j^i}$ can be estimated with a transmission channel model [13].

Regarding the computation of $t_{\mathrm{dec}_j^i}$, we define $t_{\mathrm{start}_j^i}$ as the instant when the decoding process of $x_j^i$ starts. Then:

\[ t_{\mathrm{dec}_j^i} = t_{\mathrm{start}_j^i} + \Delta t_{\mathrm{proc}_j^i} \tag{10} \]

where $\Delta t_{\mathrm{proc}_j^i}$ is the processing time devoted to decoding the coded version of $x_j^i$. We also need to define another relevant time instant, $t_{\mathrm{ready}_j^i}$, as the instant when $x_j^i$ is ready to be decoded, i.e., when it has been received and all its reference frames have been completely decoded. $t_{\mathrm{ready}_j^i}$ is computed as follows:

\[ t_{\mathrm{ready}_j^i} = \max\Big( t_{\mathrm{RX}_j^i},\ \max_{x_l^k \in L(i,j)} t_{\mathrm{dec}_l^k} \Big) \tag{11} \]

where $L(i,j)$ is the set of reference frames for $x_j^i$. While (3), (10) and (11) only depend on the coding order relationships imposed by the prediction structure, and therefore are valid for all hardware decoder architectures, the relationship between $t_{\mathrm{start}_j^i}$ and $t_{\mathrm{ready}_j^i}$ depends on the specific hardware decoder architecture being used (e.g., number of processors, sequential or parallel processing, etc.). Nevertheless, for any hardware decoder architecture, if we assume that a given frame cannot be decoded before its reference frames have been decoded, then:

\[ t_{\mathrm{start}_j^i} \geq t_{\mathrm{ready}_j^i} \tag{12} \]

The decoding process of $x_j^i$ cannot start until all frames in $L(i,j)$ have been decoded, but the start of the decoding of $x_j^i$ may be delayed further if no processing resources are available at $t_{\mathrm{ready}_j^i}$. Thus, the analysis has to be individualized for a given decoder hardware platform.
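As a small illustration of how the minimum valid communication latency and the per-frame display delays follow from the system delays, the sketch below applies the condition that the latency equals the largest system delay and that each display delay is the difference between the latency and the system delay of the frame. The function name and the delay values (in milliseconds) are illustrative assumptions, not measurements from this work.

```python
def min_latency_and_display_delays(system_delays):
    """Return (Lat_min, display delays) for a dict {(view, frame): system delay}.

    Lat_min is the largest system delay; every display delay is
    Lat_min minus the system delay of that frame, so none is negative.
    """
    lat_min = max(system_delays.values())
    display = {frame: lat_min - delay for frame, delay in system_delays.items()}
    return lat_min, display


# Hypothetical system delays in ms for two views with two frames each.
delays = {(0, 0): 120.0, (0, 1): 95.0, (1, 0): 140.0, (1, 1): 110.0}
lat, disp = min_latency_and_display_delays(delays)
```

In this toy case the frame with the highest system delay receives a display delay of zero, and every other frame waits in the output buffer for the remaining time.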
Parallel multi-processor decoder model
Due to the inherent parallel characteristics of MVC, parallelization is an essential factor for an efficient implementation of MVC decoders. This may be done by the utilization of multi-core processors and/or parallelization techniques such as multi-threading.
As the use of software decoders implemented in general purpose multi-core processors has grown in recent years, we propose a decoder model that simulates the characteristics of those decoders. We name it parallel multi-processor decoder (Parallel MPD) model. It considers a set of K processors with multi-task decoding (one processor can decode several frames at a time, by means of parallelization techniques).
The characteristics of the Parallel MPD model are the following:
1. The decoding operations for any frame from any of the N views can be performed in any of the K processors.
2. The processors can decode their assigned frames in parallel, i.e., if at a given time all the processors are busy and a new frame is ready to be decoded, its decoding process starts immediately in one of the processors, in parallel with the ongoing processes.
3. The decoder manages the assignment of each of the decoding processes to each of the available processors.
Directed acyclic graph model
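In decoder architectures with multitask processors, such as the Parallel MPD model, the decoder assigns the decoding of newly received frames to one of the processors without having to wait for an idle processor. Thus, the decoding of $x_j^i$ starts at $t_{\mathrm{ready}_j^i}$. Formally:

\[ t_{\mathrm{start}_j^i} = t_{\mathrm{ready}_j^i} \tag{13} \]

With this condition and (11):

\[ t_{\mathrm{start}_j^i} = \max\Big( t_{\mathrm{RX}_j^i},\ \max_{x_l^k \in L(i,j)} t_{\mathrm{dec}_l^k} \Big) \tag{14} \]

and using (10):

\[ t_{\mathrm{dec}_j^i} = \max\Big( t_{\mathrm{RX}_j^i},\ \max_{x_l^k \in L(i,j)} t_{\mathrm{dec}_l^k} \Big) + \Delta t_{\mathrm{proc}_j^i} \tag{15} \]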
Under the condition in (13), (15) can be solved using an approach similar to the one in [10], which relies on graph theory. We describe it in the following. For any feasible MVC prediction structure, we can extract a directed acyclic graph (DAG) [14], in which the frames are the nodes of the DAG and the prediction dependencies are its edges. Due to the directed nature of the dependencies (one frame is predicted from the reference frame but not vice versa), the graph is directed. Each directed edge links a reference node (parent) to the node that is predicted from it (child). A path is a sequence of nodes linked by directed edges. Fig. 3 shows an example prediction structure and its associated DAG. Each edge of the DAG has an associated cost value that indicates the single contribution of its parent node to the decoding delay of its child node. The cost value $\omega_{j,l}^{i,k}$ of the edge that links $x_j^i$ with $x_l^k$ is:

\[ \omega_{j,l}^{i,k} = \max\Big( 0,\ t_{\mathrm{RX}_j^i} + \Delta t_{\mathrm{proc}_j^i} - t_{\mathrm{RX}_l^k} \Big) \tag{16} \]

In (16) we assume that $x_j^i$ starts its decoding process at $t_{\mathrm{RX}_j^i}$ (the start of the decoding process of $x_j^i$ is not delayed by the decoding processes of its parent frames) to capture the isolated contribution of $x_j^i$ to the decoding delay of $x_l^k$. As only positive delay values have a realistic meaning, $\omega_{j,l}^{i,k}$ is restricted to non-negative values. Fig. 4 illustrates the computation of $\omega_{j,l}^{i,k}$ with a time chronogram in which the decoding process of the parent frame $x_j^i$ delays the decoding start of the child frame $x_l^k$. Note that $x_l^k$ is received at $t_{\mathrm{RX}_l^k}$, but its decoding process cannot start until $x_j^i$ is completely decoded.

The cost of a path is the sum of the costs of the edges that link the nodes in the path. The cost of the path that ends on a given frame sums up the contributions of all parent frames on that path to the decoding delay of that frame. Among the set of paths ending on the same node $x_j^i$, we call the one with the highest cost value the delay path to $x_j^i$. Its associated cost, $p_{\mathrm{del}_j^i}$, is the following:

\[ p_{\mathrm{del}_j^i} = \max_{u \in U} \big\{ p_j^i(u) \big\} \]

where $U$ is the set of paths ending in node $x_j^i$ and $p_j^i(u)$ is the cost of the $u$-th path. Given that, and the condition in (13), (10) becomes:

\[ t_{\mathrm{dec}_j^i} = t_{\mathrm{RX}_j^i} + p_{\mathrm{del}_j^i} + \Delta t_{\mathrm{proc}_j^i} \]

and therefore the decoding delay is:

\[ \delta_{\mathrm{dec}_j^i} = p_{\mathrm{del}_j^i} + \Delta t_{\mathrm{proc}_j^i} \tag{20} \]

Therefore, provided that the values of $\Delta t_{\mathrm{proc}_j^i}$ are known, $\delta_{\mathrm{dec}_j^i}$ and the decoding chronogram can be computed systematically for all the frames using the DAG.
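To illustrate how decoding instants propagate through the prediction dependencies, the following sketch evaluates the recursive relation "decoding end = max(reception time, decoding end of every reference frame) + processing time" by visiting the frames in a topological order of the DAG. It is a minimal sketch under stated assumptions: processing times are fixed and known in advance, decoding starts as soon as a frame is ready, and the frame identifiers, function name and numeric values are hypothetical. It evaluates the recursion directly rather than summing edge costs along paths.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def decoding_times(t_rx, dt_proc, refs):
    """Compute the decoding end instant of every frame.

    t_rx:    {frame: reception instant}
    dt_proc: {frame: processing time}
    refs:    {frame: list of reference (parent) frames}
    """
    graph = {frame: refs.get(frame, ()) for frame in t_rx}
    t_dec = {}
    # static_order() yields every frame after all of its reference frames.
    for frame in TopologicalSorter(graph).static_order():
        t_ready = max([t_rx[frame]] + [t_dec[r] for r in refs.get(frame, ())])
        t_dec[frame] = t_ready + dt_proc[frame]
    return t_dec


# Hypothetical structure: frames are (view, frame index) pairs, times in ms.
t_rx = {(0, 0): 0, (1, 0): 5, (1, 1): 6}
dt_proc = {(0, 0): 10, (1, 0): 6, (1, 1): 8}
refs = {(1, 0): [(0, 0)], (1, 1): [(1, 0)]}

t_dec = decoding_times(t_rx, dt_proc, refs)
dec_delay = {f: t_dec[f] - t_rx[f] for f in t_rx}
```

Here the last frame is received at 6 ms but cannot start before its reference frame finishes at 16 ms, so its decoding delay (18 ms) is much larger than its own processing time (8 ms).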
Frame processing time model
In our previous work [10, 18], we used a frame processing time model for an MVC encoder in which the time devoted to encoding a given frame depends on the number of reference frames. However, in the case of the decoder we assume that the time devoted to the decoding process of a frame depends on the frame type, i.e., I, P or B.

We define the computational load of the decoding process of a frame as the processing time devoted to decoding that frame in an exclusively dedicated processor ($\Delta t^{(\cdot)}_{\mathrm{sim1}}$, where $(\cdot)$ can be I, P or B). We take as a reference the computational load of the decoding process of an I-frame ($\Delta t^{I}_{\mathrm{sim1}}$), as no extra motion compensation operations are involved in the decoding process. Then, in our model, we consider that the computational loads of the decoding process of a P-frame, $\Delta t^{P}_{\mathrm{sim1}}$, and of a B-frame, $\Delta t^{B}_{\mathrm{sim1}}$, are proportional to $\Delta t^{I}_{\mathrm{sim1}}$ and computed as follows:

\[ \Delta t^{P}_{\mathrm{sim1}} = \alpha_P \, \Delta t^{I}_{\mathrm{sim1}}, \qquad \Delta t^{B}_{\mathrm{sim1}} = \alpha_B \, \Delta t^{I}_{\mathrm{sim1}} \tag{21} \]

where $\alpha_P$ and $\alpha_B$ are scalar values.

To estimate the parameters in our frame processing time model ($\Delta t^{I}_{\mathrm{sim1}}$, $\Delta t^{P}_{\mathrm{sim1}}$, $\Delta t^{B}_{\mathrm{sim1}}$, $\alpha_P$ and $\alpha_B$), we have performed a series of experiments using a JMVC 8.5 decoder [19] running on a general purpose PC: a four-core processor working at 2.40 GHz, with 3.25 GB of RAM. Using this reference software decoder, we have estimated the average decoding time for each type of frame when each frame is decoded in an exclusively dedicated core. From those results we have computed the values of $\alpha_P$ and $\alpha_B$. The results for several tested sequences are shown in Table 1. It can be seen that the values of $\alpha_P$ and $\alpha_B$ depend on the video sequence: $\alpha_P$ varies from 0.56 to 0.70 and $\alpha_B$ varies from 0.74 to 0.89.

The translation of the computational load into the time devoted to decoding a frame, $\Delta t_{\mathrm{proc}_j^i}$, clearly depends on the hardware characteristics of the MVC decoder. For multi-task processors such as the ones in the Parallel MPD model, $\Delta t_{\mathrm{proc}_j^i}$ depends on the computational load conditions of the set of processors. Thus, if a processor has to decode a single frame, its processing time is $\Delta t_{\mathrm{proc}_j^i} = \Delta t^{(\cdot)}_{\mathrm{sim1}}$. Otherwise, if the processor has to deal with several frames in parallel, the processing time of each frame will increase, i.e., $\Delta t_{\mathrm{proc}_j^i} > \Delta t^{(\cdot)}_{\mathrm{sim1}}$. For a given decoder with K processors, parallel decoding will occur when the number of frames simultaneously decoded at a given time, $n_{\mathrm{sim}}$, is higher than K. $\Delta t_{\mathrm{proc}_j^i}$ also depends on how the decoder manages the assignment of frames to the available processors. For instance, in a decoder with K = 2 and $n_{\mathrm{sim}} = 3$, $\Delta t_{\mathrm{proc}_j^i}$ would differ if: (a) the three frames are decoded in different threads of only one processor or (b) two frames are decoded in two threads in one of the processors and the other frame is decoded in the other processor.

In our model we make the following assumption: if $n_{\mathrm{sim}}$ frames are decoded simultaneously in one processor, the processing time required to decode those frames is $n_{\mathrm{sim}}$ times the computational load of those frames. Formally, their frame processing time $\Delta t^{(\cdot)}_{\mathrm{sim}n}$ is:

\[ \Delta t^{(\cdot)}_{\mathrm{sim}n} = n_{\mathrm{sim}} \, \Delta t^{(\cdot)}_{\mathrm{sim1}} \]

To assess this assumption, we have performed the following experiment: we have decoded different numbers of frames simultaneously in a single processor and evaluated the frame processing times in each case for the different types of frames. We have used JMVC 8.5 decoders [19] running in one of the cores of a general purpose PC. Table 2 shows the estimated frame processing times from $n_{\mathrm{sim}} = 1$ to $n_{\mathrm{sim}} = 4$, i.e., $\Delta t^{(\cdot)}_{\mathrm{sim1}}$ to $\Delta t^{(\cdot)}_{\mathrm{sim4}}$. The results show that our assumption is sufficiently good, as the time devoted to decoding $n_{\mathrm{sim}}$ frames simultaneously is close to $n_{\mathrm{sim}} \times \Delta t^{(\cdot)}_{\mathrm{sim1}}$. In any case, we have verified that the computational overhead that may occur when processing $n_{\mathrm{sim}}$ frames in $n_{\mathrm{sim}}$ threads of one processor does not result in a processing time higher than $n_{\mathrm{sim}} \times \Delta t^{(\cdot)}_{\mathrm{sim1}}$.

To sum up, the estimation of $\Delta t_{\mathrm{proc}_j^i}$ requires: (i) the computation of the number of frames that are simultaneously decoded at any time, (ii) the determination of the time intervals when parallel decoding of several frames occurs, and (iii) the policy on the assignment of simultaneously decoded frames to the available processors. In the next section we describe the iterative algorithm that we have employed to compute jointly $\Delta t_{\mathrm{proc}_j^i}$ and the decoding chronogram.
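The frame processing time model described in this section can be sketched in a few lines. The helper below scales the reference I-frame load by a frame-type factor and by the number of frames sharing one processor. The function and constant names are hypothetical, the scalar values 0.6 and 0.8 are the ones used later in the experiments, and the 50 ms reference load is an arbitrary illustrative value.

```python
# Frame-type factors relative to the I-frame load (I is the reference).
ALPHA = {"I": 1.0, "P": 0.6, "B": 0.8}


def processing_time(frame_type, dt_i_sim1, n_sim=1):
    """Processing time of one frame when n_sim frames share one processor.

    dt_i_sim1 is the decoding time of an I-frame on a dedicated processor;
    the model assumes the time grows linearly with n_sim.
    """
    if n_sim < 1:
        raise ValueError("n_sim must be at least 1")
    return n_sim * ALPHA[frame_type] * dt_i_sim1


# Illustrative reference load of 50 ms for an I-frame.
p_alone = processing_time("P", 50.0)            # P-frame on a dedicated processor
b_shared = processing_time("B", 50.0, n_sim=3)  # B-frame sharing with two others
```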
Iterative computation of the decoding delay in the Parallel MPD model
We have shown that in our decoder model, $\delta_{\mathrm{dec}_j^i}$ can be computed with (20) and the DAG model. However, the values of $\Delta t_{\mathrm{proc}_j^i}$ are not known a priori, as they depend on the computational load conditions of the decoder, which vary with time. To solve that, we use an iterative computation of the DAG. On each iteration, $\delta_{\mathrm{dec}_j^i}$, and thus the decoding chronogram, are computed using the DAG with the values of $\Delta t_{\mathrm{proc}_j^i}$ obtained from the previous iteration. Then, the values of $\Delta t_{\mathrm{proc}_j^i}$ are updated depending on the computational load conditions observed in the current decoding chronogram.

In this iterative algorithm, we make the following assumption: as the sequential operations needed to decode a frame can be computed in different processors, we assume that, at any time, the remaining operations of a decoding process can be assigned to any of the processors. For example, consider a decoder with two processors ($P_0$ and $P_1$) that are decoding three frames ($x_0$ in $P_0$, and $x_1$ and $x_2$ in $P_1$). If at a given time the decoding of $x_0$ ends, we can assign $x_1$ to $P_0$ and maintain $x_2$ in $P_1$.
Policy on the assignment of frames to processors
With the aim of developing a decoder model that has a fairly balanced processor usage, we implement an assignment policy on the MPD model that assigns frames with higher processing times to less loaded processors. This assignment policy is the following: consider that at a given time and algorithm iteration, $n_{\mathrm{sim}}$ frames are being decoded simultaneously. If $n_{\mathrm{sim}} \leq K$, each frame is assigned to one of the K processors and there is no simultaneous decoding within any processor. If $n_{\mathrm{sim}} > K$, frames are assigned to the available processors by the following rules:

• The $n_{\mathrm{sim}}$ frames are distributed in K groups, in such a way that there will be groups with $n^{-} = \lfloor n_{\mathrm{sim}}/K \rfloor$ frames and groups with $n^{+} = \lceil n_{\mathrm{sim}}/K \rceil$ frames, i.e., the maximum difference in number of frames among groups is one frame. The number of groups with $n^{+}$ frames, $n^{*}$, is $n^{*} = n_{\mathrm{sim}} \bmod K$, while the number of groups with $n^{-}$ frames, $n'$, is $n' = K - n^{*}$.

• Frames are distributed into the groups so that frames with higher $\Delta t_{\mathrm{proc}_j^i}$ are assigned to groups with $n^{-}$ frames. Thus, frames with higher $\Delta t_{\mathrm{proc}_j^i}$ will be decoded in processors with lower computational load, limiting the extra decoding delay caused by parallel processing.

• Then, each of the K groups of frames is assigned to one of the K processors.

An example of this assignment policy for $n_{\mathrm{sim}} = 6$ and K = 4 is shown in Fig. 5. Fig. 5(a) shows, for a given time instant, a group of frames (frames 0 to 5) ready to be decoded, with different values of $\Delta t_{\mathrm{proc}_j^i}$. Fig. 5(b) shows those frames sorted by increasing processing time. It can be seen in Fig. 5(c) that the first $n^{+} \times n^{*}$ frames are assigned in an alternate order to the first $n^{*}$ processors and the last $n^{-} \times n'$ frames to the last $n'$ processors.
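The assignment policy above can be sketched as follows. This is one possible reading of the rules, not a reference implementation: frames are sorted by increasing processing time, the frames with the lowest times fill the more heavily loaded processors in round-robin ("alternate") order, and the frames with the highest times go to the remaining, less loaded processors. The function name and the processing time values are hypothetical.

```python
def assign_frames(proc_times, k):
    """Distribute frames over k processors following the balancing rules.

    proc_times: {frame id: current processing time}
    Returns a list of k lists, one per processor.
    """
    frames = sorted(proc_times, key=proc_times.get)  # increasing processing time
    n_sim = len(frames)
    if n_sim <= k:
        # One frame per processor, no sharing; remaining processors stay idle.
        return [[f] for f in frames] + [[] for _ in range(k - n_sim)]

    n_high = -(-n_sim // k)   # ceil(n_sim / k): size of the larger groups
    n_star = n_sim % k        # number of processors that hold n_high frames
    split = n_high * n_star   # frames going to the more loaded processors

    groups = [[] for _ in range(k)]
    for idx, frame in enumerate(frames[:split]):
        groups[idx % n_star].append(frame)  # alternate over the first n_star
    rest = k - n_star
    for idx, frame in enumerate(frames[split:]):
        groups[n_star + idx % rest].append(frame)  # alternate over the others
    return groups


# Six frames and four processors, with illustrative processing times in ms.
times = {0: 30, 1: 50, 2: 20, 3: 40, 4: 60, 5: 10}
groups = assign_frames(times, 4)
```

With these values the two frames with the longest processing times (frames 1 and 4) each get a processor of their own, while the four shorter frames share the first two processors in pairs.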
Details of the iterative algorithm
This algorithm computes frame processing times and the decoding chronogram iteratively. At initialization, we assume that each frame is decoded in an exclusively dedicated processor. Then, on each iteration, the decoding chronogram is computed and we identify the time intervals in which the number of frames decoded simultaneously is higher than the number of processors. For the frames decoded in those intervals, we modify the frame processing times according to the processor occupancy conditions. To update the frame processing times in the cases of simultaneous processing in one processor, we follow the frame processing time model in Section 3.3. The flow diagram of this iterative algorithm is shown in Fig. 6.
• Iteration 0 (initialization of variables assuming that each frame is decoded in an exclusively dedicated processor):

1. The initial value of the frame processing time of each frame is set to $\Delta t_{\mathrm{proc}_j^i}|_0 = \Delta t^{(\cdot)}_{\mathrm{sim1}}, \ \forall i,j$.
2. The initial value of the number of frames that are decoded simultaneously over time is $n_{\mathrm{sim}}(t)|_0 = 1$.

where $\Delta t_{\mathrm{proc}}$ is:

For $t > t_1$, $n_{\mathrm{sim}}(t)|_k$ and $n^{p}_{\mathrm{sim}}(t)|_k$ are reset to the initialization value for the next iteration (variables are reset to 1 for non-evaluated time periods).

Go back to 1.
In order to illustrate the iterative algorithm, we show the example in Fig. 7. For simplicity, Fig. 7(a) shows a very simple GOP structure for one view with no prediction relationships. Fig. 7(b)-(e) show the decoding chronograms on several iterations of the algorithm for a decoder with K = 2 (processors $P_0$ and $P_1$). Fig. 7(b) shows the decoding chronogram as obtained in iteration 1 (after iteration 0) with $n_{\mathrm{sim}}(t)|_0 = 1$ (initially, all the frames are assigned to $P_0$). By evaluating the decoding chronogram, we find the interval $\Delta t_1$, in which $n_{\mathrm{sim}}(\Delta t_1)|_1 = 2$. This means that during $\Delta t_1$ two frames are decoded simultaneously. As K = 2, the second frame is assigned to $P_1$ and processing times are not modified. After the second step of iteration 2 (Fig. 7(c)), the interval $\Delta t_2$ is found, for which $n_{\mathrm{sim}}(\Delta t_2)|_2 = 3$. Thus, the frame processing times of the three frames decoded during $\Delta t_2$ need to be updated: one frame is assigned to $P_0$, the other two frames are assigned to $P_1$, and the two corresponding values of $\Delta t_{\mathrm{proc}}$ are updated by adding $\Delta t_2 / 2$, as shown in Fig. 7(d); for $t > t_1$, $n_{\mathrm{sim}}(t)|_2 = 1$. On iteration 3 (Fig. 7(e)), the interval $\Delta t_3$ is found, for which $n_{\mathrm{sim}}(\Delta t_3)|_3 = 2$. Thus, the remaining decoding operations of the frame still sharing a processor are assigned to $P_1$ and frame processing times are not further modified. On iteration 4 (Fig. 7(f)) there are no differences between $n_{\mathrm{sim}}(t)|_3$ and $n_{\mathrm{sim}}(t)|_4$ and the algorithm ends.
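One step that every iteration relies on is finding the time intervals in which more frames are being decoded than there are processors. The sketch below shows one way to do this with a sweep over the start and end instants of a decoding chronogram. It covers only this detection step, not the full iterative algorithm; the function name, the chronogram representation and the values are assumptions made for illustration.

```python
def overload_intervals(chronogram, k):
    """Find intervals where more than k frames are decoded simultaneously.

    chronogram: list of (start, end) decoding intervals, one per frame.
    Returns a list of (t0, t1, n_sim) segments with n_sim > k.
    Consecutive segments with the same n_sim are not merged.
    """
    events = []
    for start, end in chronogram:
        events.append((start, 1))
        events.append((end, -1))
    # At equal instants, ends (-1) sort before starts (+1), so a frame that
    # starts exactly when another one ends is not counted as overlapping.
    events.sort()

    segments, active, prev = [], 0, None
    for t, step in events:
        if prev is not None and t > prev and active > k:
            segments.append((prev, t, active))
        active += step
        prev = t
    return segments


# Three frames with overlapping decoding intervals (illustrative values, ms).
chrono = [(0, 10), (5, 15), (8, 20)]
two_proc = overload_intervals(chrono, 2)
one_proc = overload_intervals(chrono, 1)
```

With two processors only the interval from 8 to 10 ms requires sharing, whereas with a single processor sharing is needed from 5 to 15 ms.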
Experimental results
We have evaluated the communication latency of an MVC system, such as the one depicted in Fig. 2, for different multiview prediction structures and MVC decoders with different processing capacities. We focus our analysis on the decoding delay and on the parameters of the MVC decoder within the Parallel MPD model that have an influence on it: the number of processors and the processor throughput. Our approach in the tests is the following: given a prediction structure and a target value of the communication latency, we find possible combinations of those parameters that achieve the target latency value. To characterize the processor throughput, we use the processing time of an I-frame in an exclusively dedicated processor ($\Delta t^{I}_{\mathrm{sim1}}$) as the parameter under analysis. Note that $\Delta t^{I}_{\mathrm{sim1}}$ and the processor throughput (number of frames decoded per time unit) are inversely proportional. We have performed this evaluation for different multiview GOP structures of the Joint Multiview Video Model (JMVM) with an IBP prediction scheme for the interview prediction [20]. We have evaluated prediction structures with three and five views, and GOP sizes of two, four and eight frames. For all experiments, as we are not focusing on the encoder delay, an MVC encoder with unlimited processing capacity was assumed [10]. The frame processing time parameters for the encoder have been estimated on a general purpose PC: a four-core processor working at 2.40 GHz, with 3.25 GB of RAM. Also, for simplicity, the values of the transmission delay $\delta_{\mathrm{TX}_j^i}$ are not considered in our simulations. The encoder time parameter values are shown in Table 3. For the frame processing time model in the decoder, we have used the following parameters: $\alpha_P = 0.6$, $\alpha_B = 0.8$. Fig. 8 shows, for a GOP of three views and a size of two frames, the evolution of the maximum value of $\Delta t^{I}_{\mathrm{sim1}}$ that guarantees a communication latency below a target value, for different numbers of processors.

Results are shown for several target communication latency values. For example, considering the graph of Lat = 500 ms and a decoder with two processors, $\Delta t^{I}_{\mathrm{sim1}}$ must be under 60 ms to obtain a communication latency below 500 ms. Alternatively, if $\Delta t^{I}_{\mathrm{sim1}} = 100$ ms, the decoder needs at least four processors to obtain a communication latency below 500 ms. It can be seen that each of the graphs reaches a saturation value of $\Delta t^{I}_{\mathrm{sim1}}$, denoted $\Delta t^{I}_{\mathrm{sim1},\max}$. This result indicates that there is a limit on the frame processing time that guarantees the target latency value, regardless of the number of processors. This value is reached when the number of processors is equal to the maximum number of frames that have to be decoded in parallel at any time. Fig. 9 shows the same type of results for different JMVM prediction structures. These results show that the proposed framework allows us to solve design problems on MVC decoders such as the following: given a target communication latency, a prediction structure and a certain processor throughput, we can find the minimum number of processors that achieves a communication latency under that target value. Alternatively, given a number of processors, we can compute the maximum value of the frame processing time that guarantees the target communication latency.
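The first design problem mentioned above can be phrased as a simple search over the number of processors. The sketch below assumes that some function returns the communication latency for a given number of processors (in the framework this would come from the decoding delay analysis). Here that function is replaced by a toy stand-in that saturates at three processors, which is purely an illustrative assumption and does not reproduce any result of this work.

```python
def min_processors(latency_for, target_ms, k_max):
    """Smallest number of processors whose latency is below the target.

    latency_for: callable mapping a number of processors to a latency in ms.
    Returns None if no k in 1..k_max reaches the target.
    """
    for k in range(1, k_max + 1):
        if latency_for(k) < target_ms:
            return k
    return None


def toy_latency(k):
    """Toy latency model (assumption): no gain beyond three processors."""
    return 300.0 + 400.0 / min(k, 3)


needed = min_processors(toy_latency, target_ms=450.0, k_max=8)
unreachable = min_processors(toy_latency, target_ms=400.0, k_max=8)
```

The second call returns no solution because the toy latency never drops below its saturation value, which mirrors the observation that adding processors beyond the maximum number of frames decoded in parallel brings no further benefit.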
Conclusions
We have presented a framework for the systematic analysis of the decoding delay of multiview decoders. We have shown that in real-time applications, an accurate estimation of the decoding delay is an essential factor to achieve a minimum communication latency.
Thus, the proposed framework completes the analysis of the contribution of the MVC codec to that communication latency. The delay on the decoder depends on: (i) the multiview prediction structure and (ii) the hardware architecture of the decoder. We have considered a multi-processor platform with multi-threading capabilities, whose main characteristics are captured in the Parallel MPD model. We have shown that the contribution of the multiview prediction structure to the decoding delay can be computed using graph theory, given that the frame processing times are known. As in the Parallel MPD model the frame processing times depend on the computational load of the processors in the decoder, we have provided an iterative algorithm to compute jointly frame processing times and the decoding delay in such a decoder platform.
Finally, we have shown that this framework can be applied to design decoders with the aim of minimizing the communication latency. It provides a tool for an efficient design of characteristics of the decoder that have an influence on the decoding delay performance, such as the number of processors or the processor throughput. We have shown that given a prediction structure and a given processor throughput we are able to find the minimum number of decoder processors to achieve a target communication latency value. Alternatively, given the number of processors we find the minimum processor throughput to achieve that target value.
