Make Your Database Dream of Electric Sheep: Designing for Autonomous Operation

History

Background – modern systems are complex, and DBAs account for 50% of the cost.

1970s-2000s – Self-adaptive, self-tuning databases (index selection, partitioning/sharding). The loop still depended on people: human input, workload trace analysis, then a human decision.

Automating knob configuration. Database knob – a configuration parameter you can set to modify runtime behavior. Complexity keeps growing for human DBAs – systems now have hundreds of knobs.
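To make the knob definition concrete, here is a minimal sketch of a knob as a machine-readable record. The `Knob` class, its fields, and the `buffer_pool_mb` example are all hypothetical and do not come from the talk or from any particular DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knob:
    """Hypothetical description of one configuration parameter."""
    name: str
    default: int
    minimum: int
    maximum: int

    def clamp(self, value: int) -> int:
        # Keep a proposed setting inside the knob's legal range.
        return max(self.minimum, min(self.maximum, value))


# Illustrative knob; the name and bounds are assumptions, not real defaults.
buffer_pool_mb = Knob("buffer_pool_mb", default=128, minimum=16, maximum=65536)
```

With hundreds of such records, a tuner can enumerate and bound its search space without a human reading the manual for each knob.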

2010s – Cloud databases. Automating the deployment of large numbers of databases in cloud environments.

Why is this all insufficient?

What’s different now? Better hardware, better machine learning tools, better appreciation for data. We seek to complete the circle in autonomous databases. Make decisions and learn from them automatically.

2 research projects

OtterTune

For existing databases – black boxes – we only see what they’re exposing to us.

Database Tuning-as-a-Service

Target database -> Controller (collector) -> Tuning manager (metric analyzer, knob analyzer, configuration recommender) -> Controller (install agent) -> Target database

Repeat: collect more runtime metrics, feed back into ML algorithms, see if we get better performance, make recommendation, over and over.
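The repeat loop can be sketched roughly as follows. This is a toy illustration only: `run_workload` is a hypothetical stand-in for the target database, the single `buffer_mb` knob is invented, and the recommender is a simple random hill-climb rather than whatever ML algorithms OtterTune actually uses.

```python
import random


def run_workload(config):
    """Hypothetical target database: returns runtime metrics for a config."""
    # Toy model (assumption): throughput peaks when buffer_mb is 512.
    return {"throughput": 1000 - abs(config["buffer_mb"] - 512)}


def recommend(history, rng):
    """Propose a new config by nudging the best one observed so far."""
    best_config, _ = max(history, key=lambda h: h[1]["throughput"])
    candidate = dict(best_config)
    step = rng.choice([-64, 64])
    candidate["buffer_mb"] = max(64, min(2048, candidate["buffer_mb"] + step))
    return candidate


def tune(initial_config, iterations, seed=0):
    """Collect metrics, recommend, apply, and repeat."""
    rng = random.Random(seed)
    history = [(initial_config, run_workload(initial_config))]
    for _ in range(iterations):
        candidate = recommend(history, rng)   # configuration recommender
        metrics = run_workload(candidate)     # install config, collect metrics
        history.append((candidate, metrics))  # feed results back in
    return max(history, key=lambda h: h[1]["throughput"])
```

Calling `tune({"buffer_mb": 128}, 50)` returns the best configuration seen and its metrics. Because the starting point stays in the history, the result is never worse than the initial configuration.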

Objective function – latency or throughput, or whatever you want to optimize.
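One way to express a pluggable objective is a function that turns collected metrics into a single score where higher is always better. This is a sketch under assumptions: `make_objective` and the metric names `throughput_tps` and `p99_latency_ms` are hypothetical, not OtterTune's interface.

```python
def make_objective(metric, maximize):
    """Build a scoring function over a dict of runtime metrics."""
    sign = 1.0 if maximize else -1.0

    def score(metrics):
        # Negate metrics we want to minimize so that higher is always better.
        return sign * metrics[metric]

    return score


# Hypothetical metric names, chosen for illustration only.
throughput_objective = make_objective("throughput_tps", maximize=True)
latency_objective = make_objective("p99_latency_ms", maximize=False)
```

The tuner then only compares scores, so switching from throughput to latency changes one line rather than the tuning logic.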

Another key difference with OtterTune – the only thing the algorithms need to see is benign information about runtime metrics. They don’t need to see your data or user queries.

Live demo of OtterTune.

Comparison with DBAs. Efficiency. Scalable.

Peloton

(in the process of being renamed)

What you can do better than OtterTune if you control the entire software stack.

Self-Driving Database System

ML components can observe all aspects of database runtime behavior, and then make recommendations. Goes beyond just knob configuration like OtterTune – all other things like picking indexes, picking partition schemes, scaling up, scaling out.

Self-driving cars comparison

Being built from scratch because existing systems are too slow (e.g. adding an index is slow). The goal is to apply changes and learn as quickly as possible what the right configuration is.

The first problem (predicting workload) is solved. The big problem not yet solved is the action catalog: how to determine whether one action is better than another, what happens before and after you deploy it, how to reverse it, etc. Also how to interact internally and with the outside world. This is all ongoing research.
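To illustrate the questions around deploying and reversing actions, here is a minimal sketch of a catalog entry that carries its own undo step. Everything in it is hypothetical (the `Action` type, `try_action`, the index name, and the toy throughput measure); it is not how Peloton models actions.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, frozenset]


@dataclass(frozen=True)
class Action:
    """One catalog entry: a change plus the step that reverses it."""
    name: str
    apply: Callable[[State], State]
    undo: Callable[[State], State]


def try_action(state, action, measure):
    """Deploy an action, keep it only if the measured score improves."""
    before = measure(state)
    candidate = action.apply(state)
    if measure(candidate) > before:
        return candidate, True
    return action.undo(candidate), False  # reverse the change


# Hypothetical example action: create one index.
INDEX = "idx_orders_customer"
create_index = Action(
    name="create " + INDEX,
    apply=lambda s: {**s, "indexes": s["indexes"] | {INDEX}},
    undo=lambda s: {**s, "indexes": s["indexes"] - {INDEX}},
)


def toy_throughput(state):
    # Assumption for illustration: the index doubles throughput.
    return 10 if INDEX in state["indexes"] else 5
```

A real system would also need to estimate an action's cost before deploying it, which this sketch leaves out.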

Story – an automatic index selection tool: the customer kept undoing its indexes, so many times that the tool got stuck and didn’t know what to do. It turned out the customer didn’t like the names of the indexes.

Design Considerations for Autonomous Operation

Configuration knobs

Internal metrics

Action engineering

Oracle rant

Main takeaways

True autonomous DBMSs are achievable in the next decade.

For anybody who’s working on database systems – you should think about how each new feature can be controlled by a machine.