Let’s make sure AI development happens safely.

We are a group of students, researchers, and professionals interested in and working on the long-term safety of artificial intelligence. Our goal is to help ensure that future powerful AI systems are safe. For those who are new to AI safety, we offer an AI Safety Fundamentals course. We also run a seminar where we read and discuss recent research papers, sometimes collaborate on research projects, and more!

Opportunity to become a ZAIS organizer!

Zurich AI Safety is looking for volunteer organizers for the next academic year (fall + spring semester). We'll assemble the team over the summer so we can hit the ground running in September. 🚀

Why join? Several past ZAIS organizers have successfully transitioned into full-time jobs working on making AI go well. In our view, organizing is the single best way to stay engaged with AI safety and build a real network in this community.

We're very excited about Zurich, its universities, and the talent here, and we expect this community to grow a lot next year. As an organizer, you'll work closely with the incoming ZAIS director, who will focus full-time on building out this community. If you want to be part of that:

📝 Apply here: https://forms.gle/fycUvR5yN4rY2NB99

⏰ Deadline: June 28 — but we review applications on a rolling basis, so earlier is better.

Not sure if it's for you? Totally fine — fill out the form anyway and leave a comment with your questions or uncertainty, and we'll get back to you.

Get involved!

Join our WhatsApp Community to stay informed about future events and programs.

Some cool projects from our members

You Didn’t Have to Say It like That: Subliminal Learning from Faithful Paraphrases

Isaia Gisler, Zhonghao He, Tianyi Qiu

GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory

Pepijn Cobben, Xuanqiang Angelo Huang, Thao Amelia Pham, Isabel Dahlgren, Terry Jingchen Zhang, Zhijing Jin

Evaluating Superhuman Models with Consistency Checks

Lukas Fluri, Daniel Paleka, Florian Tramèr

Intent-aligned AI systems deplete human agency: the need for agency foundations research in AI safety

Catalin Mitelut, Ben Smith, Peter Vamplew

Red-Teaming the Stable Diffusion Safety Filter

Javier Rando, Daniel Paleka, David Lindner, Lennart Heim, Florian Tramèr

Training Language Models with Natural Language Feedback

Jérémy Scheurer, Jon Ander Campos, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez

Exploring Adversarial Attacks and Defenses in Vision Transformers trained with DINO

Javier Rando, Nasib Naimi, Thomas Baumann, Max Mathys

Challenges for Using Impact Regularizers to Avoid Negative Side Effects

David Lindner, Kyle Matoba, Alexander Meulemans