AI, AGI, AI Safety

1. Welcome and Introduction (5 minutes)

  1. Introductions from Lawrence and Ida
    • Our backgrounds and motivation in AI safety
  2. Discussion Guidelines
    • State your name before speaking (no formal introductions)
    • Active participation encouraged
    • Questions welcome - we’re here to learn together
    • Laptops closed during discussion (exceptions for note-taking/quick searches)

2. Program Structure (5 minutes)

  • 8-week program (excluding finals week)
  • No required preparation outside the 2-hour sessions
  • Weekly Topics:
    • Week 1: What is AI, AI safety, and alignment?
    • Week 2: Alignment
    • Week 3: RLHF and other approaches to alignment
    • Week 4: Scalable oversight
    • Week 5: Robustness, unlearning
    • Week 6: Mechanistic interpretability
    • Week 7: Technical governance
    • Week 8: AI control
  • Food provided at future meetings
  • Guest facilitators include PhD students such as Ida and Andy Zou
  • Curriculum roughly based on AISF (AI Safety Fundamentals)

3. Expectations Discussion (5 minutes)

  • Group discussion of participant expectations and goals

4. Initial Survey (5 minutes)

5. Core Content and Discussions

5.1 Introduction to AI

5.2 Intelligence and Goals

5.3 More on Catastrophic AI Risks (if time permits)

  • Reading: 80,000 Hours - AI Problem Profile (20 minutes)
  • Partner Discussion: Timeline Perspectives (20 minutes)
    • What is a timeline?
    • What’s your perspective on AI development timelines?
  • Reading: An Overview of Catastrophic AI Risks (5 minutes)
    • Focus: Section 3 - AI Race
  • Summarize one risk from the readings (5 minutes)
  • Further Discussion
    • Key Questions:
      • Multipolar vs. unipolar development scenarios
      • Private vs. nationalized AI development
      • Open source vs. closed source approaches

Things that came up in discussion: