About | Safe AGI Forum

Safe AGI Forum (SAGIF) is a research organization dedicated to the technical problems that must be solved before advanced AI can be safely deployed.

We do not focus on policy, ethics, or high-level governance. Our work is empirical and technical. We study model internals, develop alignment algorithms, and build adversarial testing suites for frontier models.

Our core thesis is that safety cannot be added to a model after training. It must be built into the training objective, the architecture, and the evaluation methodology from the ground up.

Our Approach

Empirical Verification

We verify safety claims through rigorous testing on model internals. If a safety mechanism cannot be traced back to specific circuits or activation patterns, we consider it unverified.

Proactive Alignment

Instead of patching failures after they occur, we work on "scalable oversight" — mechanisms that allow humans to supervise models even when those models are performing tasks beyond human capability.

Adversarial Rigor

A model is not safe because it follows instructions; it is safe because it cannot be forced to deviate from them. We develop advanced red-teaming techniques to find the "hidden" capabilities of frontier models.

The India–UK
Research Corridor

AI safety is a global coordination problem. SAGIF serves as the primary bridge between the technical safety clusters in London and the growing research ecosystem in India.

By fostering joint research fellowships, technical working groups, and shared evaluation infrastructure, we are building a distributed safety corridor that operates at the speed of the frontier.

"We have a short window to build the technical infrastructure that will keep advanced AI aligned with human values. We cannot afford to miss it."