Using and suggesting indexes in security-aware query optimisation

Relational databases frequently store sensitive information and therefore need fine-grained access control mechanisms to regulate data access. Evaluating database queries efficiently and securely under fine-grained access control is known as security-aware query optimisation. It typically builds on cost-based query optimisation, in which each operator in a query tree is assigned an execution cost, estimated during optimisation from factors such as result cardinality and the number of I/O operations [1].
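As a rough illustration of how such cost estimates drive the choice of access path, the sketch below compares a sequential scan with an unclustered index scan. The formulas are simplified textbook-style assumptions rather than the model of [1] or of any particular RDBMS, and every name in it (TableStats, seq_scan_cost, index_scan_cost, cheapest_access_path, the default index height of 3) is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TableStats:
    rows: int           # estimated number of rows in the table
    rows_per_page: int  # assumed number of rows stored per disk page


def seq_scan_cost(stats: TableStats) -> int:
    """Page reads needed to read the whole table once."""
    return math.ceil(stats.rows / stats.rows_per_page)


def index_scan_cost(matching_rows: int, index_height: int = 3) -> int:
    """Page reads for an unclustered index scan (toy assumption):
    descend the index, then fetch one page per matching row."""
    return index_height + matching_rows


def cheapest_access_path(stats: TableStats, matching_rows: int):
    """Return the name of the cheapest access path and all estimated costs."""
    costs = {
        "seq_scan": seq_scan_cost(stats),
        "index_scan": index_scan_cost(matching_rows),
    }
    return min(costs, key=costs.get), costs


stats = TableStats(rows=1_000_000, rows_per_page=100)

# A selective predicate (few matching rows) versus a broad one.
selective = cheapest_access_path(stats, matching_rows=1_000)
broad = cheapest_access_path(stats, matching_rows=200_000)

print(selective)
print(broad)
```

Under these assumptions the index is only worthwhile for the selective predicate. A security predicate that changes the estimated cardinality can therefore change which access path an optimiser prefers.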

Fine-grained database access control allows the specification of policies that regulate access to individual cells in the database. Moreover, this type of access control stores the data to be protected (protection state) together with the data used to make access control decisions (application state). Because of this overlap, the same table may appear several times in a query, either purely as part of the application state or as part of both the application and protection state.

The design and use of indexes in relational database management systems (RDBMS) is a well-established research field [2]. Likewise, systems exist that recommend possible index improvements to users as part of query-optimisation strategies [3]. However, in the access control domain, a table may require different indexes depending on whether it acts as part of the protection state or the application state. Furthermore, indexes used to evaluate a query without access control may no longer be used once security predicates are added [4]. In this project, we would like to investigate how best to consider and suggest indexes when optimising queries under access control.
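To make the role of indexes on access control tables more concrete, the following minimal sketch uses SQLite's EXPLAIN QUERY PLAN to inspect a query before and after a security predicate is added, and again after an index is created on the table consulted by that predicate. The schema (records, policy), the index names and the EXISTS-based rewriting are assumptions made purely for illustration; they are not the rewriting used in [4], and other database systems may plan these queries differently.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "\n".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data table, indexed for the base query.
    CREATE TABLE records (id INTEGER PRIMARY KEY, patient_id INTEGER, body TEXT);
    CREATE INDEX idx_records_patient ON records (patient_id);
    -- Hypothetical policy table consulted by the security predicate.
    CREATE TABLE policy (record_id INTEGER, grantee TEXT);
""")

base_query = "SELECT r.body FROM records r WHERE r.patient_id = ?"

# Assumed rewriting: a row is visible only if a matching policy row exists.
secured_query = base_query + """
    AND EXISTS (SELECT 1 FROM policy p
                WHERE p.record_id = r.id AND p.grantee = ?)"""

base_plan = plan(conn, base_query, (42,))
plan_before = plan(conn, secured_query, (42, "alice"))

# Index chosen for how the security predicate probes the policy table.
conn.execute("CREATE INDEX idx_policy ON policy (record_id, grantee)")
plan_after = plan(conn, secured_query, (42, "alice"))

for title, text in [("base query", base_plan),
                    ("secured, no policy index", plan_before),
                    ("secured, with policy index", plan_after)]:
    print(f"-- {title}\n{text}\n")
```

The exact plan text depends on the SQLite version, so no expected output is given. The point of the sketch is that the index serving the base query and the index serving the security predicate sit on different tables and are driven by different access patterns.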

This project aims to answer questions including, but not limited to, the following:

  • What is the relationship between indexing techniques used for access control and those used in base query tables?
  • Which indexing techniques can we apply to speed up security-aware query evaluation?
  • How do we integrate index-aware query rewriting into a cost-based query optimisation strategy?
  • Can we recommend adjustments to tables used in access control to include indexes, and how do we best communicate this to the user?

Useful reading for this project:

[1] Chaudhuri, S. (1998, May). An overview of query optimization in relational systems. In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems (pp. 34-43).

[2] Finkelstein, S., Schkolnick, M., & Tiberio, P. (1988). Physical database design for relational databases. ACM Transactions on Database Systems (TODS), 13(1), 91-128.

[3] Valentin, G., Zuliani, M., Zilio, D. C., Lohman, G., & Skelley, A. (2000, February). DB2 advisor: An optimizer smart enough to recommend its own indexes. In Proceedings of 16th International Conference on Data Engineering (Cat. No. 00CB37073) (pp. 101-110). IEEE.

[4] Pappachan, P., Yus, R., Mehrotra, S., & Freytag, J. C. (2020). Sieve: a middleware approach to scalable access control for database management systems. Proceedings of the VLDB Endowment, 13(12), 2424-2437.