Automatic parallelization of nested loop programs for non-manifest real-time stream processing applications
Bijlsma, Tjerk (2011) Automatic parallelization of nested loop programs for non-manifest real-time stream processing applications. thesis.
|Abstract:||This thesis is concerned with the automatic parallelization of real-time stream processing applications, such that they can be executed on embedded multiprocessor systems. Stream processing applications can be found in the video and channel decoding domain. These applications often have temporal requirements and can contain non-manifest conditions and expressions. For non-manifest conditions and expressions the results cannot be evaluated at compile time. Current parallelization approaches have difficulties with the extraction of function parallelism from stream processing applications. Some of these approaches require applications with manifest behavior and affine index-expressions. For these applications, they can derive data dependencies and insert inter-task communication via FIFO buffers. But, these approaches cannot support stream processing applications with non-manifest loops. Furthermore, to the best of our knowledge current approaches can only extract a temporal analysis model from applications with manifest behavior and without cyclic data dependencies.
To address the issues mentioned above, we present an automatic parallelization approach to extract function parallelism from sequential descriptions of real-time stream processing applications. We introduce a language to describe stream processing applications. The key property of this language is that all dependencies can be derived at compile time. In our language we support non-manifest loops, if-statements, and index-expressions. We introduce a new buffer type that can always be used to replace the array communication. This buffer supports multiple reading and writing tasks. Because we can always derive the data dependencies and always replace the array communication by communication via a buffer, we can always extract the available function parallelism. Furthermore, our parallelization approach uses an underlying temporal analysis model, in which we capture the inter-task synchronization. With this analysis model, we can compute system settings and perform optimizations. Our parallelization approach is implemented in a multiprocessor compiler. We evaluated our approach, by extracting parallelism from a WLAN channel decoder application and a JPEG decoder application with our multiprocessor compiler.
Electrical Engineering, Mathematics and Computer Science (EEMCS)
|Link to this item:||http://purl.utwente.nl/publications/77591|
|Export this item as:||BibTeX|
Daily downloads in the past month
Monthly downloads in the past 12 months
Repository Staff Only: item control page