In this lecture we introduce another type of entropy, this time for a measure-preserving dynamical system $f$ on a probability space $(X, \mathscr{A}, \mu)$. This entropy is (unsurprisingly enough) called the measure-theoretic entropy of $f$, and is denoted by $\mathsf{h}_{\mu}(f)$.

The definition of measure-theoretic entropy is very similar to the definition of topological entropy via open covers in Lecture 10, except that partitions now play the role that open covers played there.

The definition is a three-stage process:

1. Given a partition $\xi$, let $\xi^k_f$ denote the partition
$$\xi_f^k := \xi \vee f^{-1}\xi \vee \dots \vee f^{-(k-1)} \xi.$$
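2. The sequence $k \mapsto \mathsf{H}(\xi^k_f)$ is subadditive, since $\mathsf{H}(\xi \vee \eta) \le \mathsf{H}(\xi) + \mathsf{H}(\eta)$ and $\mu$ is $f$-invariant, and hence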
$$\mathsf{h}_{\mu}(f, \xi) := \lim_{k \to \infty}\frac{1}{k} \mathsf{H}(\xi^k_f)$$
exists.
3. Now define
$$\mathsf{h}_{\mu}(f) := \sup_{ \xi \in \mathscr{P}} \mathsf{h}_{\mu}(f, \xi).$$
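The first two stages can be checked numerically in a simple case. The following is a minimal sketch under assumptions that are not part of the lecture: $X = \{0,1\}^{\mathbb{N}}$, $f$ is the left shift, $\mu$ is the Markov measure given by a hypothetical transition matrix `P` with stationary vector `pi`, and $\xi$ is the partition according to the zeroth symbol. The elements of $\xi^k_f$ are then the cylinders of length $k$, so $\mathsf{H}(\xi^k_f)$ is a finite sum. The function names are illustrative only, and stage 3 (the supremum over all partitions) is not attempted.

```python
import itertools
import math

# Hypothetical two-state transition matrix (an assumption for illustration).
P = [[0.9, 0.1],
     [0.5, 0.5]]

# Stationary vector pi (pi P = pi); explicit for a 2x2 chain.
_total = P[0][1] + P[1][0]
pi = [P[1][0] / _total, P[0][1] / _total]


def cylinder_measure(word):
    """Measure of the cell of xi^k_f labelled by `word` (a length-k cylinder)."""
    m = pi[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a][b]
    return m


def join_entropy(k):
    """H(xi^k_f) = -sum of mu(C) log mu(C) over the cells C of xi^k_f."""
    total = 0.0
    for word in itertools.product(range(2), repeat=k):
        m = cylinder_measure(word)
        if m > 0:
            total -= m * math.log(m)
    return total


def markov_entropy_rate():
    """Closed form -sum_i pi_i sum_j P_ij log P_ij for this Markov measure."""
    return -sum(pi[i] * P[i][j] * math.log(P[i][j])
                for i in range(2) for j in range(2) if P[i][j] > 0)


if __name__ == "__main__":
    # Stage 2: the averages H(xi^k_f)/k should approach the closed form.
    for k in (1, 2, 4, 8, 12):
        print(k, join_entropy(k) / k)
    print("closed form:", markov_entropy_rate())
```

For this measure $\mathsf{H}(\xi^k_f)$ grows by the same amount at each step after the first, so the averages $\frac{1}{k}\mathsf{H}(\xi^k_f)$ decrease towards the closed-form value; the enumeration is exponential in $k$ and is only meant for small $k$.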

Recall from the last lecture that if we think of a partition $\xi$ as recording the possible outcomes of an “experiment” on our probability space, then the entropy $\mathsf{H}(\xi)$ can be thought of as measuring the uncertainty of this experiment. If $f$ is a dynamical system which governs the behaviour of the system over time, then the partition $\xi_f^k$ represents the combined experiment of performing $\xi$ on $k$ consecutive “days” (or whatever time unit an application of $f$ represents).

The quantity $\mathsf{h}_{\mu}(f, \xi)$ can be thought of as the average uncertainty of performing the experiment $\xi$ on a given day, given that we already know what happened on all the previous days.
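One way to make this reading precise is sketched below. It assumes that the conditional entropy $\mathsf{H}(\xi \mid \eta)$ of one partition given another is available, together with the chain rule $\mathsf{H}(\xi \vee \eta) = \mathsf{H}(\eta) + \mathsf{H}(\xi \mid \eta)$, and that $\mathsf{H}(\xi) < \infty$. Since $\xi^{k+1}_f = \xi^k_f \vee f^{-k}\xi$, the chain rule gives
$$\mathsf{H}(\xi^{k+1}_f) = \mathsf{H}(\xi^{k}_f) + \mathsf{H}\big(f^{-k}\xi \,\big|\, \xi^{k}_f\big),$$
and summing these identities yields
$$\frac{1}{k}\mathsf{H}(\xi^k_f) = \frac{1}{k}\Big( \mathsf{H}(\xi) + \sum_{j=1}^{k-1} \mathsf{H}\big(f^{-j}\xi \,\big|\, \xi^{j}_f\big)\Big).$$
Each term $\mathsf{H}(f^{-j}\xi \mid \xi^{j}_f)$ is the uncertainty of the experiment on day $j$ given the outcomes on days $0, \dots, j-1$. These terms are non-increasing in $j$ (conditioning on more information cannot increase entropy, and $\mu$ is $f$-invariant), so they converge, and their averages converge to the same limit:
$$\mathsf{h}_{\mu}(f, \xi) = \lim_{k \to \infty} \mathsf{H}\big(f^{-k}\xi \,\big|\, \xi^{k}_f\big).$$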

Thus $\mathsf{h}_{\mu}(f)$ can be thought of as measuring the supremum (over all possible experiments) of the average uncertainty of performing a given experiment every day, forever. In other words, we look for the “least accurate” experiment we can find in our system, test it every single day, and see how many mistakes we make on average in our predictions.