KIA ORA / TALOFA

I'm Ma'alona Mafaufau, a Full Stack Developer and AI Safety Researcher based in Auckland, New Zealand.

I came to AI safety research through an unconventional path: statistics, data analytics, cybersecurity, and now software development. That winding journey taught me to look at AI systems from multiple angles, not just the technical, but the creative, cultural, and psychological dimensions too.

WHAT I'VE BUILT

I placed 3rd in the Palisade Research AI Misalignment Bounty, demonstrating reproducible misalignment behaviors in frontier AI models including GPT-5 and o3. That work was published on arXiv and released on Hugging Face.

To support that research, I built REDD, a full-stack platform for systematic AI safety evaluation. Built with React, FastAPI, and Google Cloud Platform, it runs evaluations across 12+ frontier models.

I'm genuinely curious about the gap between how AI safety systems are designed in theory versus how they actually behave in practice. The most important discoveries often come from asking questions that feel slightly uncomfortable, or approaching problems from unexpected angles.

Currently, I'm developing mechanistic interpretability skills to understand the mechanisms behind the bypass patterns I discovered, turning intuition-based methodology into data-driven insight.

THE RESEARCH PHILOSOPHY