Fully Dynamic Partitioning: Handling Data Skew in Parallel Data Cube Computation
Lu, H.J. and Yu, J.X. and Feng, L. and Li, Z.X. (2003) Fully Dynamic Partitioning: Handling Data Skew in Parallel Data Cube Computation. International Journal of Distributed and Parallel Databases, 13 (2). pp. 181-202. ISSN 0926-8782
| Abstract: | Parallel data processing is a promising approach for efficiently computing data cubes in relational databases, because most aggregate functions used in OLAP (On-Line Analytical Processing) are distributive functions. This paper studies the issues of handling data skew in parallel data cube computation. We present a fully dynamic partitioning approach that can effectively distribute workload among processing nodes without a priori knowledge of the data distribution. As a supplement, a simple and effective dynamic load balancing mechanism is also incorporated into our algorithm, which further improves the overall performance. Our experimental results indicate that the proposed techniques are effective even when high data skew exists. The results of scale-up and speedup tests are also satisfactory. |
| Item Type: | Article |
| Faculty: | Electrical Engineering, Mathematics and Computer Science (EEMCS) |
| Link to this item: | http://purl.utwente.nl/publications/63245 |
| Official URL: | http://dx.doi.org/10.1023/A:1021567425133 |