The Dangers and (potential) Incoherence of AI Alignment

Speaker: Prof Herman Cappelen, The University of Hong Kong

Abstract: 

This paper makes three main claims: 1) There is no universal set of core values for AI systems to align with; moral disagreements run too deep. 2) The increased causal potency that creates the problem of AI risk also implies that imperfect safety mechanisms provide effectively no safety at all. Since all safety mechanisms will have some imperfections, any degree of safety is effectively unattainable. 3) Risk from advanced AI can therefore only be constrained using strategies similar to those for nuclear and biological weapons: by restricting access to the underlying hardware and capabilities.
