Efficient Mid-query Reoptimization of Sub-optimal Query Execution Plans
by Navin Kabra and David J. DeWitt

In this paper, the authors emphasize the need to reconsider a query execution plan while it is still running, in light of observations made at run time. These observations include updated values and improved estimates for a number of the statistics on which the query was initially optimized. The paper makes the following significant contributions.

Innovative contributions
  • The paper nicely captures the limitations of the query optimizer that are caused by outdated or inaccurate estimates in the system catalogues. It clearly explains how effective it can be to redesign the same plan using metrics observed at run time.
  • It makes a serious attempt to present algorithms that are easy to implement in practice and that follow the authors' dynamic reoptimization philosophy.
  • The overhead of collecting the metrics at run time is kept below an acceptable threshold. The algorithms are clever enough to collect run-time statistics only for the most effective parameters.
  • This approach may make an even more significant contribution to object-relational databases, where the presence of user-defined functions leaves optimizers helpless and ineffective when making decisions about the query plan.
  • I was thinking that the optimizer should include confidence levels in its annotations, and this was addressed later in the paper with the concept of an inaccuracy level.
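
The run-time statistics collection mentioned in the points above can be illustrated with a minimal sketch. This is not the paper's algorithm or its actual decision rule. The names `Collector`, `estimate_error`, and `should_reoptimize`, and the fixed `threshold`, are all assumptions made for illustration. The sketch only shows the general idea of a pass-through operator that counts rows and a check that compares the observation with the optimizer's estimate.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class Collector:
    """Hypothetical pass-through operator that counts the rows flowing through it."""
    estimated_rows: int      # the optimizer's estimate at this point in the plan
    observed_rows: int = 0   # filled in while the query runs

    def run(self, rows: Iterable[Tuple]) -> Iterator[Tuple]:
        for row in rows:
            self.observed_rows += 1
            yield row  # rows are forwarded unchanged to the consumer

    def estimate_error(self) -> float:
        """Factor (>= 1.0) by which estimate and observation differ."""
        est = max(self.estimated_rows, 1)
        obs = max(self.observed_rows, 1)
        return max(est / obs, obs / est)


def should_reoptimize(collector: Collector, threshold: float = 2.0) -> bool:
    """Assumed trigger: flag the plan when the estimate is off by more than `threshold`x."""
    return collector.estimate_error() > threshold


# Usage: the optimizer expected 100 rows, but 1000 arrive at run time.
collector = Collector(estimated_rows=100)
forwarded = list(collector.run((i,) for i in range(1000)))
print(collector.observed_rows, should_reoptimize(collector))
```

A real system would also have to weigh the cost of reoptimizing against the expected gain, which this sketch leaves out.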
Scope for improvement
  • Adding collector operators to the data flow paths may disturb the synchronization between producers and consumers in some cases. Are there cases where this hurts the system badly?
  • The results of the collector operators could be written back to the system catalogues for the benefit of the optimizer on future queries. The dynamic reoptimizer may itself benefit from this while working on another query that is scheduled concurrently.
  • It would be interesting to study the system in the context of multi-query optimization or concurrent query execution. The paper handles queries in isolation, but there may be considerable reuse of computations or data across queries. The significant advantages of that overlap may be lost if a query plan is reoptimized. This is another dimension worth examining before a plan is reoptimized.
  • I am a little skeptical about the performance gains. The proposed solution reports a significant performance advantage for only a certain class of queries. The authors should have commented on what matters to DBAs, namely the circumstances in which they should adopt this approach.
  • A decent performance study on object-relational databases would make the paper more appealing.
Conclusion
This paper presents some very interesting ideas regarding dynamic query optimization. Even though there is scope for improvement in many aspects, the paper in its current form is very interesting.