Relational databases frequently store sensitive information, so access control mechanisms are needed to regulate who can read which data. Evaluating database queries efficiently and securely under access control is the task of a security-aware query optimiser. Such an optimiser ensures that query answers satisfy the desirable properties of soundness, security, and maximality [1]. It also applies cost-based optimisation strategies, choosing a plan based on statistics such as output cardinality and the number of I/O operations [2].
The set-difference operation is commonly used in database queries. It is non-monotonic: adding tuples to its right-hand input can remove tuples from its result. As a consequence, queries that use this operator are not guaranteed to be sound under security predicates. This problem stems from the three-valued logic that relational databases use to handle NULL values [3]. Proposals for a context-aware set-difference operator have therefore been developed [1,3]; these change how query sub-expressions are evaluated depending on which side of the set difference they lie on.
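The soundness problem can be illustrated with a small sketch. It makes several assumptions that are not taken from [1] or [3]: relations are lists of tuples, a cell hidden by the access-control policy is represented by `None` (named `MASKED` here), and soundness is read informally as "every returned tuple is also in the answer computed without access control". The functions `may_match`, `naive_difference`, and `context_aware_difference` are hypothetical names for illustration, and the context-aware variant is only one possible reading of "evaluate each side differently", not the operator defined in the literature.

```python
MASKED = None  # stand-in for a cell hidden by the access-control policy (assumption)


def may_match(r, s):
    """Over-approximating comparison: a masked cell may equal any value."""
    return all(a is MASKED or b is MASKED or a == b for a, b in zip(r, s))


def naive_difference(left, right):
    """Plain set difference: masked cells are compared like ordinary values."""
    return [r for r in left if r not in right]


def context_aware_difference(left, right):
    """Keep a left row only if it is fully visible (under-approximate the left
    side) and no right row could possibly equal it (over-approximate the right)."""
    return [
        r for r in left
        if MASKED not in r and not any(may_match(r, s) for s in right)
    ]


# Hypothetical data: the true inputs, and what the user is allowed to see.
r_true = [("alice", 10), ("bob", 20)]
s_true = [("bob", 20)]
s_seen = [("bob", MASKED)]  # the policy hides bob's second attribute in S

true_answer = naive_difference(r_true, s_true)           # no access control
naive_answer = naive_difference(r_true, s_seen)          # masked input, plain operator
aware_answer = context_aware_difference(r_true, s_seen)  # masked input, context-aware

print(true_answer)   # [('alice', 10)]
print(naive_answer)  # [('alice', 10), ('bob', 20)]
print(aware_answer)  # [('alice', 10)]
```

With the plain operator, hiding a cell on the right-hand side lets `('bob', 20)` through even though it is not part of the true answer, so the result is unsound. The context-aware variant drops any left tuple that a masked right tuple could match, which keeps the result within the true answer in this example, at the possible cost of returning fewer tuples than a maximal answer would.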
However, it is not clear how such a context-aware operator would be implemented in a cost-based query optimiser that relies on equivalence-preserving rewrite rules, nor whether using such an operator would guarantee the security and maximality of the resulting plan. This project therefore aims to answer questions including, but not limited to, the following:
- What are the current domains in which context-aware operators are used?
- Which security guarantees does a context-aware set-difference operator offer, and can we improve upon these?
- How can a context-aware set-difference operator be integrated into an existing cost-based query optimiser, using state-of-the-art tools such as Apache Calcite and PgCuckoo?
Useful reading for this project:
[1] Wang, Q., Yu, T., Li, N., Lobo, J., Bertino, E., Irwin, K., & Byun, J. W. (2007, September). On the correctness criteria of fine-grained access control in relational databases. In Proceedings of the 33rd international conference on Very large data bases (pp. 555-566).
[2] Chaudhuri, S. (1998, May). An overview of query optimization in relational systems. In Proceedings of the seventeenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems (pp. 34-43).
[3] Libkin, L. (2016). SQL’s three-valued logic and certain answers. ACM Transactions on Database Systems (TODS), 41(1), 1-28.