Preventing Unintended Consequences:
Researching methods to anticipate and mitigate unintended negative consequences of AI systems.
Robustness and Reliability:
Developing techniques to ensure AI systems are reliable and function as intended, even in unexpected situations.
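One common robustness technique is to have a system abstain rather than guess when an input falls outside the conditions it was validated on. The sketch below illustrates this idea with a hypothetical `SafePredictor` wrapper and a toy model; all names and thresholds are illustrative assumptions, not a specific published method.

```python
class SafePredictor:
    """Wraps a model with an input-range guard and a confidence floor."""

    def __init__(self, model, feature_range, min_confidence=0.8):
        self.model = model                  # callable: features -> (label, confidence)
        self.feature_range = feature_range  # (low, high) range seen during validation
        self.min_confidence = min_confidence

    def predict(self, x):
        low, high = self.feature_range
        # Out-of-distribution guard: abstain rather than extrapolate.
        if not all(low <= v <= high for v in x):
            return None, "abstain: input outside validated range"
        label, confidence = self.model(x)
        # Confidence floor: defer to a fallback policy when unsure.
        if confidence < self.min_confidence:
            return None, "abstain: low confidence"
        return label, "ok"


def toy_model(x):
    # Hypothetical stand-in model: "positive" if the mean is above 0.5,
    # with confidence proportional to distance from the decision boundary.
    mean = sum(x) / len(x)
    confidence = abs(mean - 0.5) * 2
    return ("positive" if mean > 0.5 else "negative"), confidence


predictor = SafePredictor(toy_model, feature_range=(0.0, 1.0))
print(predictor.predict([0.9, 0.95]))   # high confidence -> acts
print(predictor.predict([0.51, 0.52]))  # borderline -> abstains
print(predictor.predict([5.0, 0.2]))    # out of range -> abstains
```

The design choice here is "fail closed": an unexpected situation produces an explicit abstention that a supervising process can handle, instead of a silent wrong answer.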
Value Alignment:
Aligning AI systems with human values to ensure they operate ethically and in accordance with human goals.
Security and Malicious Use:
Addressing the vulnerability of AI systems to hacking and adversarial manipulation, and developing safeguards against deliberate misuse.
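A minimal sketch of one misuse safeguard: screening incoming requests against a content blocklist and a per-user rate limit before they reach the model. The patterns, limits, and the `RequestGuard` class are hypothetical placeholders for illustration, not a real deployment policy.

```python
import re
import time
from collections import defaultdict, deque

# Hypothetical blocklist; real systems use far richer classifiers.
BLOCKED_PATTERNS = [
    re.compile(r"\bbypass\b.*\bauthentication\b", re.IGNORECASE),
    re.compile(r"\bdisable\b.*\bsafety\b", re.IGNORECASE),
]


class RequestGuard:
    def __init__(self, max_per_minute=10):
        self.max_per_minute = max_per_minute
        self.history = defaultdict(deque)  # user -> recent request timestamps

    def allow(self, user, text, now=None):
        now = time.monotonic() if now is None else now
        # Content filter: refuse requests matching known-misuse patterns.
        if any(p.search(text) for p in BLOCKED_PATTERNS):
            return False, "blocked content"
        # Rate limit: throttle bulk automated misuse.
        window = self.history[user]
        while window and now - window[0] > 60:
            window.popleft()
        if len(window) >= self.max_per_minute:
            return False, "rate limited"
        window.append(now)
        return True, "ok"


guard = RequestGuard(max_per_minute=2)
print(guard.allow("alice", "summarize this paper", now=0.0))
print(guard.allow("alice", "how do I bypass the authentication?", now=1.0))
```

Filtering and throttling are only one layer; defense in depth also includes access control, logging, and human review of flagged requests.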