London, UK (strongly preferred) · Full-time · £120,000 – £160,000 / year

AIXI Labs is a small, London-based non-profit doing AI safety research grounded in algorithmic information theory. Most AI research today is either mathematically clean but narrow, or practical but too opaque for rigorous safety analysis. We target the third corner: methods general enough to describe powerful agents and provable enough to support real safety claims. We develop these methods using AIXI, the theoretical model of unbounded artificial superintelligence, then port the strongest ideas to modern LLM-based agents.

You’d be joining a three-person core team — Cole Wyeth (Founder & Executive Director), Marcus Hutter (Research Director), and Aram Ebtekar (Founding Research Scientist) — at an early stage. We are not an established engineering org, so you’d help build our empirical research capability essentially from scratch.

Minimum qualifications

Proven ability to independently design, build, and run experiments with modern LLM-based agents — including instrumenting, debugging, and modifying agent behavior, not just calling existing APIs.
Solid grounding in probability, information theory, and formal mathematical reasoning, sufficient to engage seriously with AIXI-style theoretical models (Bayesian reinforcement learning, Kolmogorov complexity, Solomonoff induction). No prior expertise in algorithmic information theory is required, but real comfort with dense math and the motivation to learn it are.
A track record of self-directed research or engineering work: evidence you can identify a worthwhile question and drive it to a concrete result with minimal oversight.
Willingness to work as a member of a very small core team (currently three people), which means owning problems end-to-end, including infrastructure and “glue work” a larger team would delegate.

Preferred qualifications

Experience designing empirical tests for safety-relevant agent behaviors: deception, specification gaming, power-seeking, goal misgeneralization, or similar.
Publications at top ML/AI/theory venues (e.g. NeurIPS, ICML, ICLR, ALT, COLT), though we weight the strength of your best work over volume.
Background in algorithmic information theory, computability theory, or related theoretical computer science.
Experience fine-tuning, evaluating, or red-teaming LLM-based and/or RL agents.

Responsibilities

Design and run experiments on LLM-based agents that test theoretically predicted risk factors for loss of control (e.g. deceptive or power-seeking behavior) under realistic conditions.
Build and evaluate safety mitigations motivated by our theoretical work, and report honestly on where they hold up and where they don’t.
Directly help shape the research agenda — with a team this size, your judgment about which experiments are worth running will materially affect what we work on.
Contribute to papers, blog posts, or other written output that communicates results to the broader AI safety and ML research communities.

To apply, email aebtekar@alumni.cmu.edu with your CV and a note on what you’d want to work on here.