Scalable Oversight

1. Core Content and Discussions

1.1 An Intro to Scalable Oversight (20 minutes)

1.2 Iterated Amplification (20 minutes)

  • Read sections 1, 2, and 5.

1.3 Safety Via Debate (30 minutes)

  • Read sections 1, 2.0, 2.1, 2.3, 4, and 5.

1.4 Weak-to-Strong Generalization (40 minutes)

  • Read sections 1, 3, 4, and 6.