In general computing systems, a job (process/task) may suspend itself whilst it is waiting for some activity to complete, e.g., an accelerator to return data. In real-time systems, such self-suspension can cause substantial performance/schedulability degradation. This observation, first made in 1988, has led to the investigation of the impact of self-suspension on timing predictability, and many relevant results have been published since. Unfortunately, as it has recently come to light, a number of the existing results are flawed. To provide a correct platform on which future research can be built, this paper reviews the state of the art in the design and analysis of scheduling algorithms and schedulability tests for self-suspending tasks in real-time systems. We provide (1) a systematic description of how self-suspending tasks can be handled in both soft and hard real-time systems; (2) an explanation of the existing misconceptions and their potential remedies; (3) an assessment of the influence of such flawed analyses on partitioned multiprocessor fixed-priority scheduling when tasks synchronize access to shared resources; and (4) a discussion of the computational complexity of analyses for different self-suspension task models.
Real Time Systems Jane Liu Pdf Download
Allowing tasks to self-suspend, meaning that computations can cease to progress despite being incomplete, conversely has the effect that key insights underpinning the analysis of non-self-suspending tasks no longer hold. As an example, consider the execution scenario in Fig. 1. Figure 1a illustrates the worst-case execution scenario for non-self-suspending tasks, i.e., where the longest interval between the arrival time and the finishing time of an instance of a task occurs. This worst case, termed critical instant, occurs when a job release coincides with the release of all higher priority tasks and all followup jobs of the higher-priority tasks are released as early as possible by satisfying the inter-arrival-time constraint. However, if a higher-priority task is allowed to suspend its execution, Fig. 1b shows that it is possible that a lower-priority task misses its deadline even if its deadline can be met under the critical-instant scenario defined above. The classical critical instant theorem (Liu and Layland 1973) thus does not apply to self-suspending task systems.
In contrast, the introduction of suspension behavior has a negative impact on the timing predictability and causes intractability in hard real-time systems (Ridouard et al. 2004). It was shown by Ridouard et al. (2004) that finding an optimal schedule (to meet all deadlines) is \(\mathcal NP\)-hard in the strong sense even when the suspending behavior is known a priori.
One specific problem due to self-suspending behavior is the deferrable execution phenomenon. In the ordinary sporadic and periodic task model, the critical instant theorem by Liu and Layland (1973) provides concrete worst-case scenarios for fixed-priority scheduling. That is, the critical instant of a task defines an instant at which, considering the state of the system, an execution request for the task will generate the worst-case response time (if the job completes before next jobs of the task are released). However, with self-suspensions, no critical instant theorem has yet been established. This makes it difficult to efficiently test the schedulability. Even worse, the effective scheduling strategies for non-self-suspending tasks may not work very well for self-suspending tasks. For example, it is known that EDF (RM, respectively) has a \(100\%\) (\(69.3\%\), respectively) utilization bound for ordinary periodic real-time task systems on uniprocessor systems, as provided by Liu and Layland (1973). However, with self suspensions, it was shown in Ridouard et al. (2004) and Chen and Liu (2014) that most existing scheduling strategies, including EDF and RM, do not provide any bounded performance guarantees.
Much prior work has explored the design of scheduling algorithms and schedulability analyses of task systems when self-suspending tasks are present. Motivated by the proliferation of self-suspending scenarios in modern real-time systems, the topic has received renewed attention in recent years and several results have been re-examined. Unfortunately, we have found that large parts of the literature on real-time scheduling with self-suspensions has been seriously flawed by misconceptions. Several errors were discovered, including:
Incorrect quantification of jitter for dynamic self-suspending task systems (Audsley and Bletsas 2004a, b; Kim et al. 1995; Ming 1994). This misconception was unfortunately carried forward in Zeng and di Natale (2011), Brandenburg (2013), Yang et al. (2013), Kim et al. (2014), Han et al. (2014), Carminati et al. (2014), Yang et al. (2014), and Lakshmanan et al. (2009) in the analysis of worst-case response times under partitioned multiprocessor real-time locking protocols.
During the preparation of this review paper, several reports (Chen et al. 2016b; Chen and Brandenburg 2017; Liu and Anderson 2015; Bletsas et al. 2018) have been filed to discuss the flaws, limits, and proofs of individual papers and results. In the interest of brevity, these reports are summarized here only at a high level, as including them in full detail is beyond the scope of this already long paper. The purpose of this review is thus not to present the individual discussions, evaluations and comparisons of the results in the literature. Rather, our focus is to provide a systematic picture of this research area, common misconceptions, and the state of the art of self-suspending task scheduling. Although it is unfortunate that many of the early results in this area were flawed, we hope that this review will serve as a solid foundation for future research on self-suspensions in real-time systems.
This also applies to systems with scratchpad memories, where the scratchpad memory allocated to a task is dynamically updated during its execution. In such a case, a job of a task executes for a certain amount of time, then initiates a scratchpad memory update to push its content from the scratchpad memory to the main memory and to pull some content from the main memory to the scratchpad memory, often using DMA. During the DMA transfers to update the scratchpad memory, the job suspends itself. Such memory access latency can become much more dynamic and larger when we consider multicore platforms with shared memory, due to bus contention and competition for memory resources.
Example 2: multiprocessor synchronization Under a suspension-based locking protocol, tasks that are denied access to a shared resource (i.e., that block on a lock) are suspended. Interestingly, on uniprocessors, the resulting suspensions can be accounted for more efficiently than general self-suspensions by considering the blocking time due to the lower-priority job(s) that hold(s) the required shared resource(s). More detailed discussions about the reason why uniprocessor synchronization does not have to be considered to be self-suspension can be found in Sect. 6.1. In multiprocessor systems, self-suspensions can arise (for instance) under partitioned scheduling (in which each task is assigned statically on a dedicated processor) when the tasks have to synchronize their access to shared resources (e.g., shared I/O devices, communication buffers, or scheduler locks) with suspension-based locks (e.g., binary semaphores).
Since modern embedded systems are designed to execute complicated applications, the limited resources, such as the battery capacity, the memory size, and the processor speed, may not satisfy the required computation demand. Offloading heavy computation to some powerful computing servers has been shown as an attractive solution, including optimizations for system performance and energy saving. Computation offloading with real-time constraints has been specifically studied in two categories. In the first category, computation offloading always takes place at the end of a job and the post-processing time to process the result from the computing server is negligible. Such offloading scenarios do not incur self-suspending behavior (Nimmagadda et al. 2010; Toma and Chen 2013). In the second category, non-negligible computation time after computation offloading is needed. For example, the computation offloading model studied in Liu et al. (2014b) defines three segments of a task: (1) the first segment is the local computation time to encrypt, extract, or compress the data, (2) the second segment is the worst-case waiting time to receive the result from the computing server, and (3) the third segment is either the local compensation if the result from the computing server is not received in time or the post processing if the result from the computing server is received in time.
Other examples Self-suspension behavior has been observed in other applications. Examples are scheduling of parallel tasks where each subtask is statically assigned on one designated processor (Fonseca et al. 2016), real-time tasks in multicore systems with shared memory (Huang et al. 2016), timing analysis of deferrable servers (Chen et al. 2015), and dynamic reconfigurable FPGAs for real-time applications (Biondi et al. 2016).
The sporadic task model characterizes a task \(\tau _i\) as a three-tuple \((C_i,T_i,D_i)\). Each sporadic task \(\tau _i\) can release an infinite number of jobs (also called task instances) under the given minimum inter-arrival time (also called period) constraint \(T_i\). Each job released by a sporadic task \(\tau _i\) has a relative deadline \(D_i\). That is, if a job of task \(\tau _i\) arrives at time t, it must (in hard real-time systems), or should (in soft real-time systems) be finished before its absolute deadline at time \(t+D_i\), and the next instance of the task must arrive no earlier than time \(t + T_i\). The worst-case execution time of task \(\tau _i\) is \(C_i\). That is, the execution time of a job of task \(\tau _i\) is at most \(C_i\). The utilization of task \(\tau _i\) is defined as \(U_i=C_i/T_i\). 2ff7e9595c
Comments