Scalable Oversight
1. Core Content and Discussions
1.1 An Intro to Scalable Oversight (20 minutes)
1.2 Iterated Amplification (20 minutes)
- Read sections 1, 2, and 5.
1.3 Safety Via Debate (30 minutes)
- Read sections 1, 2.0, 2.1, 2.3, 4, and 5.
1.4 Weak-to-Strong Generalization (40 minutes)
- Read sections 1, 3, 4, and 6.