A few months back I posted the first part of an introduction to the Free Energy Principle. I did this because it’s fascinating and important, and yet sometimes comically difficult to learn about. A big part of the problem is probably Research Debt - a dearth of digestible translations of cutting-edge research into something accessible to non-specialists. My FEP Explained posts are aimed at this problem, but this post is aimed at another: not knowing where to start or where to go!

Alice Lost In The Forest

For most things humans care to learn, someone has trod the path before and shown the way to others. There is a common starting point (usually the end of high school), an endpoint (doctor, physicist, architect, etc.), and a routine path to take between these two. The FEP has none of these. The researchers tend to come from lots of different backgrounds, so everyone has things they know well and things they need to learn. There isn’t a clear endpoint aside from “understand what’s going on, maybe add to our understanding someday”, and the path is not routine enough that there are any university degrees or courses entirely about the FEP.

What we need is a syllabus! A list of material that gives you the necessary background to understand the full Free Energy Principle (and Active Inference), and which also functions as a reasonably complete map of the territory, so that people can quickly see what the field encompasses and how the parts relate. Our syllabus should cater to the kinds of people who end up interested in the FEP, so it should include different starting points and different levels of technical detail. It should gradually improve you, filling in details as you become able to digest them. Lastly, it should challenge you to act and solve new problems - problems you couldn’t have solved before you went through it.

This is a first attempt to do some of those things. Each section has a short motivation, and some of the materials have descriptions where I thought it useful. At the end, there’s a section on Open Problems and Future Research directions which I need your help filling out, so please comment on this post or reply to the Twitter thread!

How to use this syllabus

Generally, dive in. Don’t wait to finish all the prerequisites before you try getting your head around the theory. Jeremy Howard from Fast.ai has this great line about getting a feel for something by playing around with the full construct before you fully understand it; only once you have that feel do you go and learn the technical details. So even though this list starts with “prerequisites”, treat that as a placeholder: go until you get stuck, then come back and see if one of the prerequisites resolves the stuck-feeling.

I’ve also tried to have a hierarchy of materials within each category, starting with short, intuitive primers and ending with longer, more technical material that only becomes relevant as your interests deepen. There isn’t a strict order to the materials, which is why it’s probably better to go iteratively deeper across all the categories, instead of linearly down the list.

Prerequisites

Neuroscience

The FEP came out of Karl Friston’s neuroimaging work, so it shouldn’t be surprising that neuroscience is a key competency.

Predictive Coding/Predictive Processing

Predictive coding is a theory of how the brain works that inverts the traditional “bottom-up” feature detection view and replaces it with the brain generating top-down predictions instead. When a prediction does not match what is actually observed, an error signal is passed up through the brain and changes take place to minimise the prediction error in the future.

Predictive processing is closely associated with the FEP, and naturally “emerges” from the equations governing the FEP. In this sense, understanding PP helps you build an intuitive view of what the FEP achieves in the brain.
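To make that error-minimisation loop concrete, here’s a tiny toy sketch in Python of a single predictive coding “unit”. All the values and learning rates are invented for illustration - this is the flavour of the idea, not anyone’s actual model:

```python
import numpy as np

# A minimal, hypothetical sketch of predictive coding in a single "unit":
# a belief mu generates a top-down prediction, and the bottom-up prediction
# error drives updates to both the belief and the generative weight.

rng = np.random.default_rng(0)
W = 1.0     # generative weight: prediction is x_hat = W * mu
mu = 0.0    # current belief about the hidden cause
lr = 0.05   # learning rate

for step in range(500):
    x = 2.0 + 0.1 * rng.normal()  # noisy sensory observation of a hidden cause
    x_hat = W * mu                # top-down prediction
    error = x - x_hat             # prediction error, passed "up" the hierarchy
    mu += lr * W * error          # revise the belief to reduce the error...
    W += lr * mu * error          # ...and slowly adapt the generative model too

print(f"belief: {mu:.2f}, prediction: {W * mu:.2f}, observation: ~2.0")
```

Both updates are just gradient descent on the squared prediction error - which is exactly the pattern that keeps reappearing, dressed up in fancier clothes, throughout the FEP literature.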

  • It’s Bayes All The Way Up by Scott Alexander [blog]
    • Scott has a great series of posts on Bayesian brain hypotheses. A nice way to get some intuitions!
  • Bit of a Tangent [podcast]
    • This is cheating, but we did a series of three podcasts on PP, so check those out if you prefer audio
  • Surfing Uncertainty by Andy Clark [book]
    • This technical book gives a deep introduction to much of the theory behind PP, but still manages to be accessible to non-neuroscientists (I’m not one). It’s really good for getting an overview of the PP literature and the flavour of this view of the brain!
    • Scott Alexander also has a great review of the book here

Computational Neuro and modelling techniques

Part of the appeal of the FEP is that it might lead to improved models of the brain and biological intelligence. Learning to set up experiments, design simulations, and implement models of neural systems in code is a big part of turning the theoretical results of the FEP into a powerful set of new tools and applications!

  • Neuromatch Academy [summer school/course][virtual]
    • This is a fully-online Summer School with all the materials freely available. It covers everything you need to get started in computational neuroscience, including deep learning and reinforcement learning content too! The content has been created and curated by a great team of researchers, and their explicit goal is to train people from diverse backgrounds in computational neuroscience!
  • The Imbizo [summer school/course][in-person]
    • If/when international travel ever becomes possible again, I cannot recommend the annual Computational Neuroscience ‘Imbizo’ in Muizenberg, South Africa strongly enough. It’s basically a three week crash course in computational neuroscience with a bunch of awesome speakers and students. It’s easily the most fun I had in 2020, and I learnt a tonne!
  • Computational Psychiatry course [course]
    • One of the promises of the FEP is a new understanding of the computational basis of psychiatric disorders such as depression, schizophrenia, ADHD, bipolar disorder, and others. This course is an introduction to understanding these conditions computationally and mathematically. The course materials are all online, and Karl Friston himself has given some of the lectures in previous years!

Control Theory/Cybernetics

A huge part of the FEP has to do with acting on the world! We need to keep our physiological parameters within tolerable limits, using feedback loops and the techniques of control theory. Understanding the history and techniques of these fields helps us contextualise the problems the FEP and active inference set out to solve.
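As a toy illustration, here’s the simplest possible negative feedback loop: a proportional controller pulling a “body temperature” back to a set point. Nothing here is a real physiological model - all the constants are invented:

```python
# A minimal sketch of negative feedback control, the bread and butter of
# cybernetics: act in proportion to the error between where you are and
# where you want to be, and the error shrinks over time.

set_point = 37.0      # desired temperature (degrees C)
temperature = 39.0    # perturbed starting state
gain = 0.3            # proportional gain of the controller

for t in range(20):
    error = set_point - temperature   # how far are we from the set point?
    action = gain * error             # act in proportion to the error
    temperature += action             # the action changes the world (trivially, here)
    print(f"t={t:2d}  temperature={temperature:.2f}")
```

Real control theory piles integral and derivative terms, stability analysis, and much more on top of this, but the homeostatic intuition - sensing an error and acting to cancel it - is already all there.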

Physics

The Free Energy Principle wants to explain how we, as biological organisms obeying the laws of physics, can self-organise into complicated creatures that maintain our complexity despite (or because of) dissipative forces. The FEP rests solidly on the ideas of modern physics, so the more you know here the better!

Statistical Mechanics

Entropy, Free Energy, non-equilibrium steady states - these ideas are front and centre in the FEP literature and all come from stat-mech. If you were going to spend your time learning about only one area of physics to really have a good basis in the FEP, this is probably the one you’d pick.
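For orientation, here are the central objects you’ll bump into constantly, written out: the partition function, and the (Helmholtz) free energy it generates:

```latex
Z(\beta) = \sum_i e^{-\beta E_i}, \qquad
F = -\frac{1}{\beta}\ln Z, \qquad
F = \langle E \rangle - TS,
```

where $\beta = 1/k_B T$. That last identity - free energy as average energy minus temperature times entropy - is the shape of trade-off that reappears, in information-theoretic dress, in the FEP’s variational free energy.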

  • Statistical Mechanics in The Theoretical Minimum by Leonard Susskind [lecture videos]
    • Leonard Susskind is a hero to me. I’ve heard more than one person say that they did multiple stat-mech courses and the derivation of the partition function was never as magically clear as Susskind’s. If you’ve never done any stat-mech, this is a good place to start!
  • Statistical Mechanics: A Set of Lectures by Richard Feynman [textbook]
    • The Feynman Lectures on Physics are rightly famous for their explanatory clarity, but less well-known are his lectures on statistical mechanics! Worth reading just because it’s Feynman!

Non-equilibrium Statistical Mechanics

Of course, the FEP deals with living systems, and living means being far from thermodynamic equilibrium, so we’ll want to understand how to describe these systems mathematically. The big ideas that come up often are non-equilibrium steady states, as well as the Langevin and Fokker-Planck equations.
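If you want to get hands-on, here’s a minimal sketch (all parameters invented for illustration) of simulating a Langevin equation with the Euler-Maruyama method. The long-run statistics of the trajectory approximate the steady-state density that the corresponding Fokker-Planck equation describes analytically:

```python
import numpy as np

# Euler-Maruyama simulation of a simple Langevin equation:
#   dx = -theta * x dt + sigma dW
# For this linear drift the stationary density is a Gaussian with
# variance sigma^2 / (2 * theta), which we can check against the samples.

rng = np.random.default_rng(1)
theta, sigma, dt = 1.0, 0.5, 0.01
x, xs = 0.0, []

for _ in range(100_000):
    x += -theta * x * dt + sigma * np.sqrt(dt) * rng.normal()
    xs.append(x)

print(f"sample variance = {np.var(xs):.3f} (theory: {sigma**2 / (2 * theta):.3f})")
```

(Strictly, this toy example relaxes to an equilibrium steady state; the genuinely non-equilibrium steady states in the FEP literature add solenoidal, circulating flows on top of this picture.)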

Classical Mechanics, Hamiltonian Mechanics, and the Principle of Least Action

Friston sometimes frames the FEP in terms of a principle of least action, where action is an integral over a nice object called a Lagrangian. Lagrangians and Least Action principles give us a really powerful way to describe the equations of motion, symmetries, and conservation laws of our system, so knowing about this approach is really worthwhile. That being said, this approach doesn’t feature prominently in the basic formulation of the FEP, so don’t get stuck here!
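For reference, the central objects look like this: the action is the time-integral of the Lagrangian, and demanding that the action be stationary yields the Euler-Lagrange equations of motion:

```latex
S[q] = \int_{t_0}^{t_1} L(q, \dot{q}, t)\, dt, \qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0.
```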

  • Classical Mechanics by Leonard Susskind [lecture videos]
    • Like his Stat-Mech course, Susskind does a good job of getting the main ideas across with just enough maths to allow you to understand more formal treatments later.

Gauge theory

Gauge theory is a way of describing how certain symmetries in our system can lead to new properties/forces. It’s not a big part of the core theory, but I include it here because it’s a personal favourite, and Friston and collaborators have dabbled in applying some gauge-theoretic ideas to the FEP.

Maths

I kind of want to say that I’m taking it for granted that you know at least a bit of multivariable calculus, some linear algebra, and some probability theory (the FEP people definitely assume this, and more). But I didn’t know any of those things when I started (not so long ago!) and felt really helpless when they were just assumed, so the above links go to the MIT OpenCourseWare courses I used to get going!

Bayesian inference

The FEP fundamentally deals with agents trying to infer the state of their environment, given their current sensory data. Whenever you have a sentence like that, Bayes’ Theorem can’t be far behind! Understanding, on a gut level, the mechanics of how Bayes works, and being able to fluidly work with the different forms of the formula (knowing some basic identities in terms of joint distributions and marginal distributions) generally makes a lot of things in the FEP clearer!
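Here’s a tiny worked example (the numbers are invented) of exactly that inference pattern - a hidden state inferred from noisy evidence:

```python
# A minimal worked example of Bayes' theorem: how likely is a hidden state
# ("it rained") given some noisy evidence ("the grass is wet")?

p_rain = 0.2                       # prior: P(rain)
p_wet_given_rain = 0.9             # likelihood: P(wet | rain)
p_wet_given_dry = 0.1              # P(wet | no rain)

# marginal likelihood (the "model evidence" you'll meet all over the FEP):
p_wet = p_wet_given_rain * p_rain + p_wet_given_dry * (1 - p_rain)

# Bayes' theorem: P(rain | wet) = P(wet | rain) P(rain) / P(wet)
p_rain_given_wet = p_wet_given_rain * p_rain / p_wet
print(f"P(rain | wet grass) = {p_rain_given_wet:.3f}")  # ~0.692
```

The denominator - the model evidence - is the thing that becomes intractable in realistic models, which is precisely why variational inference (below) exists.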

  • An Intuitive Explanation of Bayes’ Theorem by Eliezer Yudkowsky [blog]
    • I link to this article all over the place because it’s just so good. It won’t teach you to use Bayesian techniques in your machine learning model, but it will teach you why you’d want to!
  • Probability Theory: The Logic of Science by E.T. Jaynes [textbook]
    • Rederiving probability theory from scratch might be overkill, but this book sort of reads like the Feynman Lectures, but for probability theory. Be sure to grab the unofficial errata
  • Machine Learning: A Probabilistic Perspective by Kevin Murphy [textbook]
    • This is pretty lengthy, but the first 3 chapters (reintroducing key ideas in machine learning, probability theory, and generative models) are a quick way to get up to speed with a lot of the jargon, if you’ve already done a bit of probability theory beforehand! Chapter 5 on Bayesian statistics and chapter 21 on variational inference (see below) are especially relevant to the FEP!

Information theory

Entropy, confusingly, moonlights as a term in information theory. Not only that, but since much of the FEP is about a system inferring the state of a hidden (latent) variable through its noisy sensory signals, the techniques of information theory (invented by Claude Shannon for almost exactly that kind of problem) are key. Information theory also teaches us about the Kullback-Leibler divergence, which is worth knowing just on its own!
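To make the two key quantities concrete, here’s a minimal sketch (the distributions are invented for illustration) computing the Shannon entropy and the KL divergence:

```python
import numpy as np

# Shannon entropy of p, and the Kullback-Leibler divergence from q to p,
# for two small discrete distributions.

p = np.array([0.7, 0.2, 0.1])
q = np.array([0.4, 0.4, 0.2])

entropy = -np.sum(p * np.log(p))   # H(p), in nats
kl = np.sum(p * np.log(p / q))     # D_KL(p || q): >= 0, and 0 iff p == q

print(f"H(p) = {entropy:.3f} nats, KL(p || q) = {kl:.3f} nats")
```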

Information geometry

Depending on your feelings about math, information geometry is either a sort of sublime union of differential geometry and statistics, or a Frankenstein’s Monster of two other monsters. Information geometry lets us do statistics on manifolds, which seems arbitrary (especially if you don’t know what a manifold is), but might be useful, and is at least very cool to say to your friends. Friston occasionally mentions the term, so this is here as a reference you can return to for when he does.

  • Information Geometry by John Baez [notes]
    • These notes are packed with insights about geometry, entropy, dissipative forces, and more! Most importantly, they feature a neat derivation of the Fisher Information Metric, which is a big part of the field (I’ve written the metric out just after this list, for reference). I really enjoy the way that John C. Baez explains maths - he manages to make me feel like I really could have done that when he derives something. A bonus of these notes is Baez riffing on how information geometry applies to evolutionary systems and the relative entropy (aka the KL-divergence!), so you can see a lot of the bread and butter of the FEP infused into this field!
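For reference, the Fisher information metric that Baez derives is the expected outer product of the score function - a way of measuring distance between nearby probability distributions in parameter space:

```latex
g_{ij}(\theta) = \mathbb{E}_{x \sim p(x \mid \theta)}\!\left[
  \frac{\partial \log p(x \mid \theta)}{\partial \theta_i}\,
  \frac{\partial \log p(x \mid \theta)}{\partial \theta_j}
\right].
```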

Computer science and machine learning

The FEP will remain an intriguing bit of mathematical theory if good programmers don’t start to find ways to apply the ideas to problems in artificial intelligence and machine learning. On the other side, ideas from AI/ML are key to understanding where the FEP and Active Inference can fit in and what the current state-of-the-art is.

Deep Learning

Deep learning has been so damn successful that it’s worth knowing about just for that. Neural networks are great function-approximators, which can be useful in the FEP. Also, the standard deep learning libraries (PyTorch/TensorFlow etc.) are worth learning because they make it really easy to build flexible models using current best-practices, and they make things like automatic differentiation and optimisation easy!
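Here’s the two-line version of what automatic differentiation buys you, using PyTorch (the function itself is arbitrary - purely an illustration):

```python
import torch

# Define a function of x, call .backward(), and the gradient appears
# "for free" via reverse-mode automatic differentiation.

x = torch.tensor(2.0, requires_grad=True)
y = x ** 3 + 2 * x            # y = x^3 + 2x

y.backward()                  # compute dy/dx by autodiff
print(x.grad)                 # dy/dx = 3x^2 + 2 = 14 at x = 2
```

Gradients of arbitrary computations with no pen-and-paper calculus: that’s what makes gradient-based model fitting (including variational schemes) so easy to prototype in these libraries.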

  • fast.ai by Jeremy Howard and Rachel Thomas [course]
    • An excellent, practical introduction to state of the art techniques in modern deep learning. Emphasis on deploying models. Worth it just for Jeremy Howard’s wisdom.
  • Advanced Deep Learning and Reinforcement Learning by DeepMind [lecture videos]
    • A more detailed course which picks up a lot of the detail that Fast.ai leaves out.
  • Deep Learning Book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville [textbook]
    • Not so relevant to the FEP, but so so good and freely available! If you ever want to know more about deep learning, it also has a great set of introductory chapters on calculus, linear algebra, and probability theory, which just give enough to get started in the general area, rather than doing entire university courses!

Reinforcement Learning

RL is currently the most established field for creating autonomous, intelligent agents that can interact with their environments. Knowing the basics of how RL has approached intelligence, what kinds of techniques are available, and their limitations, helps us put the FEP in context. RL also deals heavily with Markov chains and the Markov property, and has established techniques for programming these kinds of agents. All of this can be transferred into the design of FEP-style agents.

A nice side-effect is that RL has a bunch of established benchmarks and environments for testing how intelligent an agent is (like OpenAI Gym) and I think it’s key to the future of the FEP/Active Inference that their techniques can be shown to compete with (and eventually outperform) pure-RL models.
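For flavour, here’s the canonical agent-environment loop those benchmarks are built around - a random “agent” on CartPole, written against the classic Gym API (note that newer Gym/Gymnasium versions return slightly different tuples from reset() and step()):

```python
import gym  # classic OpenAI Gym API; the newer 'gymnasium' package differs slightly

# The agent-environment loop that RL (and, arguably, active inference)
# is built around: observe, act, receive feedback, repeat.

env = gym.make("CartPole-v1")
obs = env.reset()
total_reward, done = 0.0, False

while not done:
    action = env.action_space.sample()          # a real agent would choose here
    obs, reward, done, info = env.step(action)  # the world pushes back
    total_reward += reward

print(f"episode return: {total_reward}")
env.close()
```

An active inference agent slots into exactly the same loop - it just selects actions to minimise expected free energy rather than to maximise reward.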

Variational inference

At the start of your journey into the FEP, you’ll keep hearing about ‘surprisal’, ‘the ELBO’ (evidence lower bound), ‘variational Bayes’, and ‘model evidence’. For whatever reason, I took too long to just go and find out that all of these ideas are well established and well explained in the field of variational inference. Variational inference involves approximating an intractable probability distribution by using the techniques of mathematical optimisation to move a tractable starting distribution closer to that target.
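Here’s a tiny numerical sketch (a made-up two-state latent variable) showing the key fact: the ELBO lower-bounds the log model evidence, with equality exactly when the variational distribution matches the true posterior:

```python
import numpy as np

# Discrete latent z in {0, 1}, one observation x; all numbers illustrative.

p_z = np.array([0.5, 0.5])           # prior p(z)
p_x_given_z = np.array([0.8, 0.3])   # likelihood p(x | z) for our observed x

p_xz = p_x_given_z * p_z             # joint p(x, z)
log_evidence = np.log(p_xz.sum())    # log p(x): minus the "surprisal"
posterior = p_xz / p_xz.sum()        # exact posterior p(z | x)

def elbo(q):
    # ELBO = E_q[log p(x, z)] - E_q[log q(z)] = log p(x) - KL(q || posterior)
    return np.sum(q * (np.log(p_xz) - np.log(q)))

q_crude = np.array([0.5, 0.5])
print(f"log evidence      : {log_evidence:.4f}")
print(f"ELBO (crude q)    : {elbo(q_crude):.4f}   <= log evidence")
print(f"ELBO (q=posterior): {elbo(posterior):.4f}   == log evidence")
```

Maximising the ELBO over q is therefore the same as minimising the KL divergence to the true posterior - and (up to a sign) the ELBO is exactly the variational free energy the FEP is named after.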

Generative models

Close your eyes and imagine a red bus! If you can do it, maybe that counts as evidence to you that your brain has some sort of generative model (i.e. can imagine/synthesise plausible data points). More generally, generative modelling tries to explain the data we’re observing as being generated by some smaller set of (latent) variables. The FEP deals heavily with the language and ideas of generative models, so reading up on them directly is helpful!
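Here’s about the smallest generative model I can write down - a linear-Gaussian model with invented parameters, where a couple of latent variables “generate” higher-dimensional observations:

```python
import numpy as np

# A minimal linear-Gaussian generative model: a low-dimensional latent z
# "explains" higher-dimensional observations x = Wz + noise.

rng = np.random.default_rng(42)
W = rng.normal(size=(5, 2))        # maps 2 latent causes to 5 observed dimensions

def generate(n):
    z = rng.normal(size=(n, 2))                   # latent causes: z ~ N(0, I)
    x = z @ W.T + 0.1 * rng.normal(size=(n, 5))   # observations: x = Wz + noise
    return z, x

z, x = generate(1000)
print(x.shape)  # (1000, 5): data "generated" from just 2 hidden causes
```

Inference is this process run backwards - given x, work out which z probably produced it - which is exactly the problem the FEP says brains are solving all the time.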

  • Modern Latent Variable Models by DeepMind & UCL [video]
  • Probabilistic Graphical Models by Eric Xing (also see the course website here) [course]
    • I enjoy this course for taking a different perspective on ML/DL. There’s a lot of variety, and the course has videos on variational inference and generative models. There are also slides and course notes here

The Free Energy Principle and Active Inference

The main event!

General introductions

Technical introductions

Key papers

This list is as much a list of some of the most fascinating directions of research in the field as it is a general overview. The hope is that the ideas in some of these papers sound so cool that you can’t help but want to take the ideas further (that was my experience, and that’s why I’m still here!). Rereading this list, I can see it’s skewed by my own research interests, so let me know what other sections/papers should be added!

Open Problems

What are the current open problems in the FEP/Active Inference framework? What research directions are there? Suggestions are encouraged, so comment here or add to this ongoing twitter thread

Conclusion

I hope this encourages more people to get stuck into the FEP/Active Inference research-space. It’s so interdisciplinary that it welcomes people from all kinds of backgrounds, and there’s important work that needs doing in math, neurobiology, philosophy, ecology, physics, software engineering, machine learning, and more! There are so many friendly people who are willing to think out loud, explain, answer questions, and offer support - seriously, just try to tweet some of the people mentioned in this post!

Good luck with your studies, and even more so with what you create with that knowledge - perception-and-action are two sides of the same coin, after all!

Contributors

Thanks to these people for suggesting resources to add to the post!

  • @InferenceActive
  • Beren Millidge


Thank you to Gianluca and @InferenceActive for their helpful comments on drafts of this post!