KIA ORA / TALOFA

I'm Ma'alona Mafaufau, a Full Stack Developer and AI Safety Researcher based in Auckland, New Zealand.

I came to AI safety research through an unconventional path: statistics, data analytics, cybersecurity, and now software development. That winding journey taught me to look at AI systems from multiple angles, not just the technical, but the creative, cultural, and psychological dimensions too.

WHAT I'VE BUILT

I placed 3rd in the Palisade Research AI Misalignment Bounty, demonstrating reproducible misalignment behaviors in frontier AI models including GPT-5 and o3. That work was published on arXiv and released on Hugging Face.

To support that research, I built REDD, a full-stack platform for systematic AI safety evaluation. Built with React, FastAPI, and Google Cloud Platform, it runs evaluations across 12+ frontier models.

I'm genuinely curious about the gap between how AI safety systems are designed in theory versus how they actually behave in practice. The most important discoveries often come from asking questions that feel slightly uncomfortable, or approaching problems from unexpected angles.

Currently, I'm developing mechanistic interpretability skills to understand the mechanisms behind the bypass patterns I discovered, turning intuition-based methodology into data-driven insight.

THE RESEARCH PHILOSOPHY