Bayesian Probability

Working through the basics before getting onto networks
Published March 19, 2023

This is an exploration of Bayes’ theorem. I’m going to work through the theorem and its derivation in different ways. The Wikipedia page has the different derivations that I will be working through.

The Equation

The core equation is:

\[ P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)} \]

To write this out in English: the probability of \(A\) given that \(B\) is true can be derived from the probability of \(B\) given that \(A\) is true, together with the unconditional probabilities of \(A\) and \(B\).

The Drug Testing Example

The Wikipedia page uses a drug test to show how to work through this. The probabilities that it states are:

  • True positive rate: \(P(\text{Positive} \mid \text{User}) = 0.9\)
  • True negative rate: \(P(\text{Negative} \mid \text{Non-user}) = 0.8\) or \(P(\lnot \text{Positive} \mid \lnot \text{User}) = 0.8\)
  • Prevalence of use: \(P(\text{User}) = 0.05\)

With this we can then calculate the value of \(P(\text{Positive} \mid \text{Non-user})\) (i.e. the chance that a non-user will be flagged by the test):

\[ \begin{aligned} P(\text{Positive} \mid \text{Non-user}) &= P(\text{Positive} \mid \lnot \text{User}) \\ &= 1 - P(\lnot \text{Positive} \mid \lnot \text{User}) \\ &= 1 - 0.8 \\ &= 0.2 \end{aligned} \]

That’s easy and doesn’t require Bayes’ theorem.

If we are interested in the chance that a person is a user given a positive test result (\(P(\text{User} \mid \text{Positive})\)) then we can use the rule. To do so we first need the overall probability \(P(\text{Positive})\), which we can calculate by summing over the user and non-user cases:

\[ \begin{aligned} P(\text{Positive}) &= ( P(\text{Positive} \mid \text{User}) \cdot P(\text{User}) ) + ( P(\text{Positive} \mid \lnot \text{User}) \cdot P(\lnot \text{User}) ) \\ &= ( 0.9 \cdot 0.05 ) + ( ( 1 - 0.8 ) \cdot ( 1 - 0.05 ) ) \\ &= ( 0.9 \cdot 0.05 ) + ( 0.2 \cdot 0.95 ) \\ &= 0.045 + 0.19 \\ &= 0.235 \end{aligned} \]

Then with this we can calculate \(P(\text{User} \mid \text{Positive})\):

\[ \begin{aligned} P(\text{User} \mid \text{Positive}) &= \frac{P(\text{Positive} \mid \text{User}) \cdot P(\text{User})}{P(\text{Positive})} \\ &= \frac{0.9 \cdot 0.05}{0.235} \\ &= \frac{0.045}{0.235} \\ &= \frac{9}{47} \\ &= 0.19148\dots \end{aligned} \]

The Wikipedia page agrees with this result. As a drug test this is not encouraging though! A positive result is far more likely to come from a non-user than from a user.
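As a quick check, here is the same calculation as a minimal JavaScript sketch (the variable names are mine and are not used by the solver later in the post):

const pPositiveGivenUser = 0.9;    // true positive rate
const pNegativeGivenNonUser = 0.8; // true negative rate
const pUser = 0.05;                // prevalence of use

// P(Positive) by summing over the user and non-user cases
const pPositive = pPositiveGivenUser * pUser
  + (1 - pNegativeGivenNonUser) * (1 - pUser); // 0.045 + 0.19 = 0.235

// Bayes' theorem
const pUserGivenPositive = (pPositiveGivenUser * pUser) / pPositive; // ≈ 0.19149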

Precision and Recall

I’ve often seen classifier accuracy described in terms of precision and recall. Breaking these down, there are four different classification outcomes:

  • True Positive (TP) - the classification is positive and correct
  • True Negative (TN) - the classification is negative and correct
  • False Positive (FP) - the classification is positive and incorrect
  • False Negative (FN) - the classification is negative and incorrect

Precision is the fraction of positive drug tests that correctly identify drug users (\(\frac{TP}{TP + FP}\)). Recall is the fraction of drug users that the test correctly flags as positive (\(\frac{TP}{TP + FN}\)).

We can break down the drug test into these four classifications.

              User                   Non-user
Positive      0.9 * 0.05 = 0.045     0.2 * 0.95 = 0.19
Negative      0.1 * 0.05 = 0.005     0.8 * 0.95 = 0.76
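These entries follow directly from the stated rates; as a small sketch (again with my own variable names):

const pUser = 0.05;
const tp = 0.9 * pUser;       // P(Positive ∩ User)   = 0.045
const fn = 0.1 * pUser;       // P(¬Positive ∩ User)  = 0.005
const fp = 0.2 * (1 - pUser); // P(Positive ∩ ¬User)  = 0.19
const tn = 0.8 * (1 - pUser); // P(¬Positive ∩ ¬User) = 0.76

const precision = tp / (tp + fp); // ≈ 0.19149, the same as P(User | Positive)
const recall = tp / (tp + fn);    // = 0.9, the same as P(Positive | User)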

We can express these as probabilities:

\[ \begin{aligned} TP &= P(\text{Positive} \cap \text{User}) \\ TN &= P(\lnot \text{Positive} \cap \lnot \text{User}) \\ FP &= P(\text{Positive} \cap \lnot \text{User}) \\ FN &= P(\lnot \text{Positive} \cap \text{User}) \end{aligned} \]

If we have these, can we determine the different probabilities that we were using above? It seems trivial to me:

\[ \begin{aligned} P(\text{User}) &= P( \{ \text{Positive}, \lnot \text{Positive} \} \cap \text{User} ) \\ &= P(\text{Positive} \cap \text{User}) + P(\lnot \text{Positive} \cap \text{User}) \\ &= 0.045 + 0.005 \\ &= 0.05 \\ \\ P(\lnot \text{User}) &= P( \{ \text{Positive}, \lnot \text{Positive} \} \cap \lnot \text{User} ) \\ &= P(\text{Positive} \cap \lnot \text{User}) + P(\lnot \text{Positive} \cap \lnot \text{User}) \\ &= 0.19 + 0.76 \\ &= 0.95 \\ \\ P(\text{Positive}) &= P( \text{Positive} \cap \{ \text{User}, \lnot \text{User} \} ) \\ &= P(\text{Positive} \cap \text{User}) + P(\text{Positive} \cap \lnot \text{User}) \\ &= 0.045 + 0.19 \\ &= 0.235 \\ \\ P(\lnot \text{Positive}) &= P( \lnot \text{Positive} \cap \{ \text{User}, \lnot \text{User} \} ) \\ &= P(\lnot \text{Positive} \cap \text{User}) + P(\lnot \text{Positive} \cap \lnot \text{User}) \\ &= 0.005 + 0.76 \\ &= 0.765 \end{aligned} \]

This is basically summing the columns or rows.

If we consider the ratio within a column or row then we are calculating the conditional probability:

\[ \begin{aligned} P(\text{Positive} \mid \text{User}) &= \frac{P(\text{Positive} \cap \text{User})}{P(\text{Positive} \cap \text{User}) + P(\lnot \text{Positive} \cap \text{User})} \\ &= \frac{0.045}{0.045 + 0.005} \\ &= 0.9 \\ \\ P(\text{Positive} \mid \lnot \text{User}) &= \frac{P(\text{Positive} \cap \lnot \text{User})}{P(\text{Positive} \cap \lnot \text{User}) + P(\lnot \text{Positive} \cap \lnot \text{User})} \\ &= \frac{0.19}{0.19 + 0.76} \\ &= 0.2 \\ \\ P(\lnot \text{Positive} \mid \text{User}) &= \frac{P(\lnot \text{Positive} \cap \text{User})}{P(\lnot \text{Positive} \cap \text{User}) + P(\text{Positive} \cap \text{User})} \\ &= \frac{0.005}{0.005 + 0.045} \\ &= 0.1 \\ \\ P(\lnot \text{Positive} \mid \lnot \text{User}) &= \frac{P(\lnot \text{Positive} \cap \lnot \text{User})}{P(\lnot \text{Positive} \cap \lnot \text{User}) + P(\text{Positive} \cap \lnot \text{User})} \\ &= \frac{0.76}{0.76 + 0.19} \\ &= 0.8 \\ \\ \end{aligned} \]
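In code, these sums and ratios are only a few lines (another sketch, separate from the solver):

// joint probabilities from the table above
const joint = {
  positiveUser: 0.045,
  positiveNonUser: 0.19,
  negativeUser: 0.005,
  negativeNonUser: 0.76,
};

// marginals: sum along a row or column
const pUser = joint.positiveUser + joint.negativeUser;        // 0.05
const pPositive = joint.positiveUser + joint.positiveNonUser; // 0.235

// conditionals: the ratio within a row or column
const pPositiveGivenUser = joint.positiveUser / pUser;        // 0.9
const pUserGivenPositive = joint.positiveUser / pPositive;    // ≈ 0.19149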

If we have the precision and recall figures, can we get back to these probabilities?

\[ \begin{aligned} Precision &= \frac{TP}{TP + FP} \\ &= \frac{P(\text{Positive} \cap \text{User})}{P(\text{Positive} \cap \text{User}) + P(\text{Positive} \cap \lnot \text{User})} \\ \\ Recall &= \frac{TP}{TP + FN} \\ &= \frac{P(\text{Positive} \cap \text{User})}{P(\text{Positive} \cap \text{User}) + P(\lnot \text{Positive} \cap \text{User})} \end{aligned} \]

I do not think it is possible. The key problem here is that we have two known values, \(Precision\) and \(Recall\) and three unknown values, \(TP\), \(FP\) and \(FN\). If one of the unknown values were known then it would be possible to calculate the others.

We can demonstrate this inability by showing two sets of values that produce the same precision and recall values.

For this first table we can see that precision and recall are both 0.5:

              Predicted True    Predicted False
Actual True   0.25              0.25
Actual False  0.25              0.25

\[ \begin{aligned} Precision &= \frac{TP}{TP + FP} \\ &= \frac{0.25}{0.25 + 0.25} \\ &= 0.5 \\ \\ Recall &= \frac{TP}{TP + FN} \\ &= \frac{0.25}{0.25 + 0.25} \\ &= 0.5 \\ \end{aligned} \]

If we then use the values from this second table we can see that precision and recall remain 0.5:

              Predicted True    Predicted False
Actual True   0.05              0.05
Actual False  0.05              0.85

\[ \begin{aligned} Precision &= \frac{TP}{TP + FP} \\ &= \frac{0.05}{0.05 + 0.05} \\ &= 0.5 \\ \\ Recall &= \frac{TP}{TP + FN} \\ &= \frac{0.05}{0.05 + 0.05} \\ &= 0.5 \\ \end{aligned} \]

This is a simultaneous equation problem and my rule of thumb is that you need at least as many equations as there are unknown variables.
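To make that concrete: adding one more known value, say the prevalence \(P(\text{User})\), is enough to recover the drug-test figures (a sketch, again with my own names):

const precision = 9 / 47; // ≈ 0.19149
const recall = 0.9;
const pUser = 0.05;       // the extra known value

const tp = recall * pUser;      // recall = TP / (TP + FN) and TP + FN = P(User), so TP = 0.045
const fn = pUser - tp;          // 0.005
const fp = tp / precision - tp; // from precision = TP / (TP + FP), so FP = 0.19
const tn = 1 - tp - fn - fp;    // the remaining probability mass, 0.76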

Interactive Expression Solver

So since I went to the trouble of learning Observable JS last time, is it possible to create an interactive Bayesian solver? Ideally you would be able to express some facts and then ask it to derive another expression.

Let’s imagine the input. The original challenge was to calculate \(P(\text{User} \mid \text{Positive})\) given \(P(\text{Positive} \mid \text{User})\), \(P(\lnot \text{Positive} \mid \lnot \text{User})\) and \(P(\text{User})\). We can imagine providing the values as separate inputs and then asking it to calculate a given probability. Implementing this should be a fun way to explore the underlying problem.

To make the parser simpler I have separated the input into two types of expression - facts and queries. The facts are statements that include the probability, such as \(P(\text{Positive} \mid \text{User}) = 0.9\). The queries are statements that do not include the probability, such as \(P(\text{User} \mid \text{Positive})\).

When working like this the original drug testing example can be stated as:

P(Positive | User) = 0.9 ;
P(¬Positive | ¬User) = 0.8 ;
P(User) = 0.05 ;
P(User | Positive)

There is also a cancer example which can be stated as:

P(Symptoms | Cancer) = 1 ;
P(Cancer) = 0.00001 ;
P(Symptoms | ¬Cancer) = 0.0001000010000100001;
P(Cancer | Symptoms)

If you paste this into the box below then it will solve it for you!

Code
viewof expressions = Inputs.textarea({label: "Expressions", placeholder: "P(A | B) = 0.3 ;\nP(A | ¬B) = 0.3 ;\nP(A)"})

function handle(expressions) {
  if (expressions.length === 0) {
    return "time is a flat circle"
  }
  
  const parsed = parse(expressions)
  if (parsed.error !== undefined) {
    return `PARSER ERROR: ${parsed.error}`
  }

  const solved = solve(parsed)
  if (solved.error !== undefined) {
    return `SOLVER ERROR: ${solved.error}`
  }

  // split each "P(A) = ... = 0.5" working string into aligned MathJax rows
  const formatExpression = (exp) => {
    const parts = exp.split("=").map(part => part.trim())
    if (parts.length === 1) {
      return parts
    }
    const [first, second, ...rest] = parts
    const start = `${first} &= ${second} \\\\`
    return [start, ...rest.map(row => `&= ${row} \\\\`)]
  }

  const rows = [
    '$$',
    '\\begin{aligned}',
    ...solved.expressions.flatMap(formatExpression),
    '\\end{aligned}',
    '$$'
  ]
  const content = rows.join("\n")
  // re-typeset once the returned HTML has been inserted into the page
  setTimeout(MathJax.typeset, 100);
  return html`${content}`
}
handle(expressions)

WARNING: This uses invalid operations, see below

Broadly this works by parsing the expressions that you have entered, and then deriving new facts until all of the queries are answered by a fact. To derive a new fact the following transformations can be applied.

Fact Inversion

We can invert a fact by negating the condition:

\[ P(\lnot \text{Positive} \mid \text{User}) = 1 - P(\text{Positive} \mid \text{User}) \]
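With the drug-test numbers this is simply:

\[ P(\lnot \text{Positive} \mid \text{User}) = 1 - P(\text{Positive} \mid \text{User}) = 1 - 0.9 = 0.1 \]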

If there are multiple conditions then we can negate them one by one:

\[ P(\lnot \text{Cloudy}, \text{Raining}) = 1 - P(\text{Cloudy}, \text{Raining}) \]

Now that I write this out I realise that this doesn’t feel right. The inverse should be all of the things that are not expressed by the original, so the expression should be:

\[ P(\lnot \text{Cloudy}, \text{Raining}) + P(\text{Cloudy}, \lnot \text{Raining}) + P(\lnot \text{Cloudy}, \lnot \text{Raining}) = 1 - P(\text{Cloudy}, \text{Raining}) \]

That is why writing this out is good. So this solver is incorrect for negating terms with multiple conditions.

A friend has pointed me towards De Morgan’s laws, which express the negation of the original in a more precise form.
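Writing out what De Morgan’s laws give for this case: the complement of the joint event is a union, which splits into the three cells not covered by the original:

\[ \begin{aligned} 1 - P(\text{Cloudy}, \text{Raining}) &= P(\lnot (\text{Cloudy} \cap \text{Raining})) \\ &= P(\lnot \text{Cloudy} \cup \lnot \text{Raining}) \\ &= P(\lnot \text{Cloudy}, \text{Raining}) + P(\text{Cloudy}, \lnot \text{Raining}) + P(\lnot \text{Cloudy}, \lnot \text{Raining}) \end{aligned} \]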

Given Removal

We can remove a given by summing over the different given states and weighting by the probability of that outcome:

\[ P(\text{Positive}) = (P(\text{Positive} \mid \text{User}) \cdot P(\text{User})) + (P(\text{Positive} \mid \lnot \text{User}) \cdot P(\lnot \text{User})) \]

Since \(P(\lnot \text{User})\) can be derived by negating \(P(\text{User})\), this means we need \(P(\text{Positive} \mid \text{User})\), \(P(\text{Positive} \mid \lnot \text{User})\) and \(P(\text{User})\) (or their inverses) to calculate this.

We can remove a single given from a list:

\[ P(\text{Positive} \mid \text{Young}) = (P(\text{Positive} \mid \text{User}, \text{Young}) \cdot P(\text{User})) + (P(\text{Positive} \mid \lnot \text{User}, \text{Young}) \cdot P(\lnot \text{User})) \]

This extension feels more legitimate, although strictly the weights should be \(P(\text{User} \mid \text{Young})\) and \(P(\lnot \text{User} \mid \text{Young})\) rather than the overall probabilities.

Bayes Theorem

This is the point of the overall post so the solver must be able to do this. To remind you, the equation is:

\[ P(\text{User} \mid \text{Positive}) = \frac{P(\text{Positive} \mid \text{User}) \cdot P(\text{User})}{P(\text{Positive})} \]

I have extended this in two ways. The first is that there can be additional unrelated givens that are not altered by the operation:

\[ P(\text{User} \mid \text{Positive}, \text{Young}) = \frac{P(\text{Positive} \mid \text{User}, \text{Young}) \cdot P(\text{User})}{P(\text{Positive})} \]

I thought that this was legit and now I’m torn. The example shows the problem - the distribution of users could differ between the young and the old, and so the multiplication using the overall distribution doesn’t relate to the subset distribution. So, once again, this solver is incorrect when operating over multiple givens.

Then to double down on this I also allowed it to work with unrelated conditions that are not altered by the operation:

\[ P(\text{User}, \text{Young} \mid \text{Positive}) = \frac{P(\text{Positive}, \text{Young} \mid \text{User}) \cdot P(\text{User})}{P(\text{Positive})} \]

I feel that this has the same problem in that the distribution differs so the division using the overall distribution is wrong.

The same friend pointed me to the chain rule for probability, which is the missing part here. We can use that to derive Bayes’ theorem:

\[ \begin{aligned} P(\text{A}, \text{B}) &= P(\text{B}, \text{A}) \\ P(\text{A} \mid \text{B}) \cdot P(\text{B}) &= P(\text{B} \mid \text{A}) \cdot P(\text{A}) \\ P(\text{A} \mid \text{B}) &= \frac{P(\text{B} \mid \text{A}) \cdot P(\text{A})}{P(\text{B})} \end{aligned} \]

Easy. So if we use the chain rule in a more general solver then it should be possible to operate accurately over terms with multiple conditions or givens.
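For reference, the general form of the chain rule, which is what a more general solver would build on:

\[ P(A_1, A_2, \dots, A_n) = P(A_1 \mid A_2, \dots, A_n) \cdot P(A_2 \mid A_3, \dots, A_n) \cdots P(A_n) \]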

Conclusion

I came up with what I thought were reasonable extensions to the equations to operate over multiple conditions or givens. Now that I think about it I can negate each one.

It’s unfortunate that my initial instincts around this were inaccurate. I’m more pleased about my ability to disprove each extension. I know I need to work on this, and finding the areas where I am weak is the best way to do so. Onward!

Why leave it?

I’ve worked on this post quite a bit. I have learned by doing so. Changing the solver is more of a programming problem than a data science problem.

I might well work on the solver to change it for another post. To fix these operations would be reasonably involved and I want to move to a better way to express the operations that can be performed.

Observable JS Insanity

So how does the solver work? How does it understand what you put in and create a series of steps to solve it? Here I am going to explain how it all works.

This blog post is on a website and websites use JavaScript. In my evaluation of different interactivity solutions I found that Observable JS was the best way to add interactivity. That means this solver is written in JavaScript.

The flow of the solver is: parse the input into facts and queries, validate them, and then repeatedly derive new facts from the existing ones until every query is answered by a fact.

The full code is available below. It is quite long.

Parsed Representation

The parser creates Queries and Facts out of what you write. A statement that has a probability is a Fact, while a statement without a probability is a Query. I named them this way as the solver is finding solutions to the Queries that you make based on the Facts you enter and what can be derived from them. I created a class to represent each of them.

The words used for the condition or given within a probability are Terms. They can be negated so they get a class too.

Here are the classes that the parser issues:

Code
class Term {
  constructor({ term, not }) {
    this.term = term;
    this.not = not;
  }

  equals({ term, not }) {
    return this.term === term && this.not === not;
  }

  invert() {
    return new Term({ term: this.term, not: !this.not });
  }

  toId() {
    return `${this.term}-${this.not}`;
  }

  toString() {
    if (this.not) {
      return `\\lnot ${this.term}`;
    }
    return this.term;
  }
}

function deduplicate(terms) {
  const deduplicated = new Map(
    terms.map((term) => [term.toId(), term]),
  );
  const keys = [...deduplicated.keys()].sort();
  return keys.map((key) => deduplicated.get(key));
}

function termsEqual(left, right) {
  // condition and given always sorted by deduplicate, can use this invariant to avoid lookup
  const zip = (l, r) => l.map((entry, index) => [entry, r[index]]);
  const eq = ([l, r]) => r !== undefined && l.equals(r);
  const all = (accumulated, value) => accumulated && value;
  return left.length === right.length && zip(left, right).map(eq).reduce(all, true);
}

function find({ condition, given = [], facts }) {
  return facts.filter((fact) => fact.equals({ condition, given }))[0];
}

function replace({ original, replacement, terms }) {
  return deduplicate([
    replacement,
    ...terms.filter((term) => !term.equals(original)),
  ]);
}

function round({ value, places = 5 }) {
  const rounded = value.toFixed(places).replace(/0+$/, '');
  const unrounded = `${value}`;
  if (unrounded.length <= rounded.length) {
    return unrounded;
  }
  return rounded;
}

class Base {
  constructor({ condition, given }) {
    this.condition = deduplicate(condition);
    this.given = deduplicate(given);
  }

  equals({ condition, given }) {
    return termsEqual(this.condition, condition) && termsEqual(this.given, given);
  }

  toId() {
    const condition = this.condition.map((term) => term.toId()).join(', ');
    if (this.given.length === 0) {
      return `P(${condition})`;
    }
    const given = this.given.map((term) => term.toId()).join(', ');
    return `P(${condition} \\mid ${given})`;
  }

  toString() {
    const condition = this.condition.map((term) => term.toString()).join(', ');
    if (this.given.length === 0) {
      return `P(${condition})`;
    }
    const given = this.given.map((term) => term.toString()).join(', ');
    return `P(${condition} \\mid ${given})`;
  }
}

class Fact extends Base {
  constructor({ condition, given, probability }) {
    super({ condition, given });
    this.probability = probability;
  }

  solves(query) {
    const conditionMatches = termsEqual(this.condition, query.condition);
    const givenMatches = termsEqual(this.given, query.given);

    return conditionMatches && givenMatches;
  }

  invert({ expressions }) {
    // a fact with multiple conditions is inverted by inverting each condition individually:
    // P(A, B) != 1 - P(¬A, ¬B)
    // instead each derived fact flips one term: P(¬A, B) = 1 - P(A, B) and P(A, ¬B) = 1 - P(A, B)
    // NOTE: as discussed above, this is not a valid operation when there are multiple conditions
    const invertIndex = (conditions, index) => conditions.map((term, termIndex) => {
      if (termIndex === index) {
        return term.invert();
      }
      return term;
    });
    const invert = (ignored, index) => {
      const condition = invertIndex(this.condition, index);
      const fact = new Fact({ condition, given: this.given, probability: 1 - this.probability });
      const expression = (
        `${fact.toString({ probability: false })} = `
        + `1 - ${this.toString({ probability: false })} = `
        + `1 - ${this.probability} = ${round({ value: fact.probability })}`
      );
      return { fact, expressions: expressions.concat([expression]) };
    };
    return this.condition.map(invert);
  }

  removeGiven({ facts, expressions }) {
    // P(A | B) and P(A | ¬B) can be turned into P(A) if P(B) is known:
    // P(A) = ( P(A | B) * P(B) ) + ( P(A | ¬B) * P(¬B) )
    // P(¬B) = 1 - P(B)

    const findAll = (given) => {
      const thisMultiplicand = find({ condition: [given], facts });
      const inverse = find({
        condition: this.condition,
        given: replace({ original: given, replacement: given.invert(), terms: this.given }),
        facts,
      });
      const inverseMultiplicand = find({ condition: [given.invert()], facts });

      return {
        given,
        thisMultiplicand,
        inverse,
        inverseMultiplicand,
      };
    };
    function valid({ thisMultiplicand, inverse, inverseMultiplicand }) {
      return thisMultiplicand !== undefined
        && inverse !== undefined
        && inverseMultiplicand !== undefined;
    }
    const expression = ({
      fact, thisMultiplicand, inverse, inverseMultiplicand,
    }) => (
      `${fact.toString({ probability: false })} = `
        + `(${this.toString({ probability: false })} \\cdot ${thisMultiplicand.toString({ probability: false })})`
        + ` + (${inverse.toString({ probability: false })} \\cdot ${inverseMultiplicand.toString({ probability: false })}) = `
      + `(${this.probability} \\cdot ${round({ value: thisMultiplicand.probability })}) + (${round({ value: inverse.probability })} \\cdot ${round({ value: inverseMultiplicand.probability })}) = `
      + `${round({ value: fact.probability })}`
    );
    const toFact = ({
      given, thisMultiplicand, inverse, inverseMultiplicand,
    }) => {
      const fact = new Fact({
        condition: this.condition,
        given: this.given.filter((term) => !term.equals(given)),
        probability:
            (this.probability * thisMultiplicand.probability)
            + (inverse.probability * inverseMultiplicand.probability),
      });
      return {
        fact,
        expressions: expressions.concat([expression({
          fact, thisMultiplicand, inverse, inverseMultiplicand,
        })]),
      };
    };

    return this.given
      .map(findAll)
      .filter(valid)
      .map(toFact);
  }

  bayes({ ungivenFacts, expressions }) {
    const withProbability = (terms) => terms
      .map((term) => ({ term, id: term.toId() }))
      .filter(({ id }) => ungivenFacts.has(id))
      .map(({ term, id }) => ({ term, probability: ungivenFacts.get(id).probability }));
    const conditions = withProbability(this.condition);
    const givens = withProbability(this.given);
    const toFact = ({ term, probability }) => new Fact({
      condition: [term], given: [], probability,
    });
    const expression = ({ fact, condition, given }) => {
      const conditionFact = toFact(condition);
      const givenFact = toFact(given);
      return (
        `${fact.toString({ probability: false })} = `
        + `\\frac{${this.toString({ probability: false })} \\cdot ${givenFact.toString({ probability: false })}}{${conditionFact.toString({ probability: false })}} = `
        + `\\frac{${round({ value: this.probability })} \\cdot ${round({ value: given.probability })}}{${round({ value: condition.probability })}} = `
        + `${round({ value: fact.probability })}`
      );
    };
    const derive = ({ condition, given }) => {
      const fact = new Fact({
        condition: replace({
          original: condition.term, replacement: given.term, terms: this.condition,
        }),
        given: replace({ original: given.term, replacement: condition.term, terms: this.given }),
        probability: (this.probability * given.probability) / condition.probability,
      });
      return { fact, expressions: expressions.concat([expression({ fact, condition, given })]) };
    };

    return conditions.flatMap((condition) => givens.map((given) => derive({ condition, given })));
  }

  toString({ probability = true } = {}) {
    const start = super.toString(this);
    if (!probability) {
      return start;
    }
    return `${start} = ${round({ value: this.probability })}`;
  }
}

class Query extends Base { }
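As a small illustration of how these classes fit together, here is a hand-constructed fact and query (purely illustrative; the parser below normally builds these):

const fact = new Fact({
  condition: [new Term({ term: 'Positive', not: false })],
  given: [new Term({ term: 'User', not: false })],
  probability: 0.9,
});

const query = new Query({
  condition: [new Term({ term: 'User', not: false })],
  given: [new Term({ term: 'Positive', not: false })],
});

fact.toString();    // "P(Positive \mid User) = 0.9"
fact.solves(query); // false - the condition and given are swapped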

Parser

It’s a copy of the combinator parser in this post which is really nice to work with. Combinator parsers are cool.

It works by consuming the input a single character at a time. A parser can either accept that input, returning some encoded representation of it, or reject it by returning undefined.

Some parsers are created by combining other parsers. The seq parser takes several child parsers which are run in sequence over the input. If any of them fail then the created seq parser fails.

Combinations like this are how you create the overall parser. You are combining functions until you end up with something that can consume the entire input. After working with it for a while I find it easy to use, and it leads to code like this:

const FACT = seq(
    P,
    OPEN,
    or(WITH_GIVEN, WITHOUT_GIVEN),
    CLOSE,
    EQUALS,
    NUMBER,
);

This shows how a fact is matched. You can see the entire code below:

Code
// inspired by https://github.com/dabeaz/blog/blob/main/2023/three-problems.md

parse = {
  /**
   * the signature of the parser is input -> (x, xs) | undefined
   * if the parser accepts the input then the output will be of the form (x, xs)
   * if the parser rejects the input then the output will be undefined
   */
  
  /**
   * Shift is a parser that just tests if the input is acceptable at all,
   * consuming a single letter from it if it is.
   */
  const shift = ([input, index]) => {
    if (index < input.length) {
      return [input[index], [input, index + 1]];
    }
    return undefined;
  };
  
  /**
   * Nothing just returns undefined as the first element.
   * This is the inverse of shift.
   */
  const nothing = (input) => [undefined, input];
  
  /**
   * Applies the predicate to the first element issued by the parser.
   * If the predicate passes then the parser output is returned.
   */
  const filter = (predicate) => (parser) => (input) => {
    const parsed = parser(input);
    if (parsed === undefined) {
      return undefined;
    }
    if (predicate(parsed[0])) {
      return parsed;
    }
    return undefined;
  };
  
  /**
   * Applies the function to the first element issued by the parser.
   */
  const map = (f) => (parser) => (input) => {
    const parsed = parser(input);
    if (parsed === undefined) {
      return undefined;
    }
    return [f(parsed[0]), parsed[1]];
  };
  
  /**
   * This repeatedly applies a parser until it rejects the input.
   * If the parser is applied at least once then this returns all of the accepted values.
   * Otherwise it returns undefined.
   */
  const oneOrMore = (parser) => (input) => {
    const result = [];
    let current = input;
    let match;
    while ((match = parser(current)) !== undefined) {
      let value;
      [value, current] = match;
      result.push(value);
    }
    if (result.length === 0) {
      return undefined;
    }
    return [result, current];
  };
  
  /**
   * This applies each parser in turn unless one rejects the input.
   */
  const seq = (...parsers) => (input) => {
    const result = [];
    let current = input;
    for (let i = 0, count = parsers.length; i < count; i += 1) {
      const parser = parsers[i];
      const match = parser(current);
      if (match === undefined) {
        return undefined;
      }
      let value;
      [value, current] = match;
      result.push(value);
    }
    return [result, current];
  };
  
  /**
   * This applies each parser in turn, returning the first that accepts the input.
   */
  const or = (...parsers) => (input) => {
    for (let i = 0, count = parsers.length - 1; i < count; i += 1) {
      const parser = parsers[i];
      const match = parser(input);
      if (match !== undefined) {
        return match;
      }
    }
    return parsers[parsers.length - 1](input);
  };
  
  const zeroOrMore = (parser) => or(oneOrMore(parser), seq());
  
  const join = (letters) => letters.join('');
  
  const left = (first, second) => map((pair) => pair[0])(seq(first, second));
  const right = (first, second) => map((pair) => pair[1])(seq(first, second));
  
  const maybe = (parser) => or(parser, nothing);
  
  const isDigit = (char) => char >= '0' && char <= '9';
  const isLetter = (char) => (char >= 'a' && char <= 'z') || (char >= 'A' && char <= 'Z');
  const isLiteral = (value) => (char) => char === value;
  
  const digit = filter(isDigit)(shift);
  const letter = filter(isLetter)(shift);
  const literal = (value) => filter(isLiteral(value))(shift);
  const dot = literal('.');
  
  const digits = oneOrMore(digit);
  const letters = oneOrMore(letter);
  
  const number = map(parseFloat)(
    map(join)(or(
      map((result) => [...result[0], result[1], ...result[2]])(seq(digits, dot, digits)),
      map((result) => [...result[0], result[1]])(seq(digits, dot)),
      map((result) => [result[0], ...result[1]])(seq(dot, digits)),
      digits,
    )),
  );
  const word = map(join)(letters);
  const whitespace = zeroOrMore(or(literal(' '), literal('\t'), literal('\r'), literal('\n')));
  
  const token = (parser) => right(whitespace, parser);
  
  const P = token(literal('P'));
  const OPEN = token(literal('('));
  const CLOSE = token(literal(')'));
  const EQUALS = token(literal('='));
  const NOT = token(literal('¬'));
  const WORD = token(word);
  const GIVEN = token(literal('|'));
  const COMMA = token(literal(','));
  const NUMBER = token(number);
  const SEMICOLON = token(literal(';'));
  const TERM = or(
    map((term) => (new Term({ term, not: true })))(right(NOT, WORD)),
    map((term) => (new Term({ term, not: false })))(WORD),
  );
  
  const combine = (result) => [result[0], ...result[1]];
  const TERM_LIST = map(combine)(seq(
    TERM,
    zeroOrMore(right(COMMA, TERM)),
    maybe(COMMA),
  ));
  
  const extractWithGiven = (result) => ({ condition: result[0], given: result[2] });
  const WITH_GIVEN = map(extractWithGiven)(seq(TERM_LIST, GIVEN, TERM_LIST));
  
  const extractWithoutGiven = (result) => ({ condition: result, given: [] });
  const WITHOUT_GIVEN = map(extractWithoutGiven)(TERM_LIST);
  
  const extractFact = (result) => (new Fact({ ...result[2], probability: result[5] }));
  const FACT = seq(
    P,
    OPEN,
    or(WITH_GIVEN, WITHOUT_GIVEN),
    CLOSE,
    EQUALS,
    NUMBER,
  );
  
  const extractQuery = (result) => (new Query(result[2]));
  const QUERY = seq(
    P,
    OPEN,
    or(WITH_GIVEN, WITHOUT_GIVEN),
    CLOSE,
  );
  
  const FACT_OR_QUERY = or(map(extractFact)(FACT), map(extractQuery)(QUERY));
  
  function parse(text) {
    const grammar = map(combine)(seq(
      FACT_OR_QUERY,
      zeroOrMore(right(SEMICOLON, FACT_OR_QUERY)),
      maybe(SEMICOLON),
    ));
  
    const match = grammar([text, 0]);
    if (match === undefined) {
      return { facts: [], queries: [], error: `unable to parse '${text}'` };
    }
  
    const [expressions, [_text, index]] = match;
    const remaining = _text.substring(index);
    const facts = expressions.filter((expression) => expression.probability !== undefined);
    const queries = expressions.filter((expression) => expression.probability === undefined);
    if (remaining.trim().length > 0) {
      return { facts, queries, error: `unable to parse '${remaining}'` };
    }
    return { facts, queries, error: undefined };
  }

  return parse;
}
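As a quick sketch of what the parser produces for the drug-testing input (field names as defined in the parse function above):

const { facts, queries, error } = parse(
  'P(Positive | User) = 0.9 ; P(¬Positive | ¬User) = 0.8 ; P(User) = 0.05 ; P(User | Positive)'
);
// facts   -> three Fact instances with probabilities 0.9, 0.8 and 0.05
// queries -> one Query instance for P(User | Positive)
// error   -> undefined, since the whole input was consumed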

Solver

The solver is relatively simple. It works as a breadth-first search, where a given set of facts is expanded according to the three operations above. The generated facts are tested to see if they have been considered before, and those that have not are added to a queue for evaluation. Before expanding the facts the current set of facts is tested to see if it answers the queries posed.

While the process is conceptually simple the implementation is quite messy. One reason that I do not want to address the issues with the expansions is that it is quite involved to work with these representations.

Code
function getMissingTerms({ facts, queries }) {
  const getTerms = ({ condition, given }) => condition.concat(given).map(({ term }) => term);
  const termsInFacts = new Set(facts.flatMap(getTerms));
  const termsInQueries = [...new Set(queries.flatMap(getTerms))];
  const missingTerms = termsInQueries.filter((term) => !termsInFacts.has(term));
  return missingTerms;
}

function getDuplicates(entries) {
  const seen = new Set();
  const duplicates = entries.filter((entry) => {
    const identifier = entry.toId();
    if (seen.has(identifier)) {
      return true;
    }
    seen.add(identifier);
    return false;
  });
  return duplicates;
}

function validate({ facts, queries }) {
  if (queries.length === 0) {
    return { error: 'no query to solve' };
  }

  const missingTerms = getMissingTerms({ facts, queries });
  if (missingTerms.length > 0) {
    return { error: `cannot establish value of query terms ${missingTerms.join(', ')} as they do not appear in the conditions` };
  }

  const duplicatedFacts = getDuplicates(facts);
  if (duplicatedFacts.length > 0) {
    return { error: `duplicate facts stated: ${duplicatedFacts.map((fact) => fact.toString()).join(', ')}` };
  }

  const duplicatedQueries = getDuplicates(queries);
  if (duplicatedQueries.length > 0) {
    return { error: `duplicate queries stated: ${duplicatedQueries.map((query) => query.toString()).join(', ')}` };
  }

  return { error: undefined };
}

/**
 * This creates a comparable key that encodes the current state.
 * The key is used to determine if this state has been visited before.
 */
function toStateIdentifier({ facts }) {
  const identifiers = facts.map((fact) => fact.toId());
  return identifiers.sort().join(', ');
}

function makeNextStates({ facts, expressions }) {
  const hasGiven = (fact) => fact.given.length > 0;
  const singleCondition = (fact) => fact.given.length === 0 && fact.condition.length === 1;
  const ungivenFacts = new Map(
    facts.filter(singleCondition).map((fact) => [fact.condition[0].toId(), fact]),
  );

  const invert = (fact) => fact.invert({ expressions });
  const removeGiven = (fact) => fact.removeGiven({ facts, expressions });
  const bayes = (fact) => fact.bayes({ ungivenFacts, expressions });

  const current = facts.map((fact) => fact.toId());
  const nextFacts = [
    ...facts.flatMap(invert),
    ...facts.filter(hasGiven).flatMap(bayes),
    ...facts.flatMap(removeGiven),
  ];
  return nextFacts
    .filter(({ fact }) => !current.includes(fact.toId()))
    .map((row) => ({ facts: [...facts, row.fact], expressions: row.expressions }));
}

function getSolution({ facts, queries }) {
  return queries
    .map((query) => facts.filter((fact) => fact.solves(query)))
    .filter((solutions) => solutions.length > 0);
}

function solve({ facts, queries }) {
  const { error } = validate({ facts, queries });
  if (error !== undefined) {
    return { error, solutions: undefined, expressions: undefined };
  }

  const expressions = facts.map((fact) => fact.toString());
  let states = [{ facts, queries, expressions }];
  const evaluated = [];
  while (states.length > 0) {
    const state = states.shift();
    const solutions = getSolution(state);
    if (solutions.length === queries.length) {
      return {
        solutions: solutions.map((list) => list[0]),
        error: undefined,
        expressions: state.expressions,
      };
    }

    evaluated.push(toStateIdentifier(state));
    const current = states.map(toStateIdentifier);
    const children = makeNextStates(state)
      .map((child) => ({ ...child, queries: state.queries }))
      .filter((childState) => {
        const id = toStateIdentifier(childState);
        return !(evaluated.includes(id) || current.includes(id));
      });
    states = states.concat(children);
  }
  return { error: 'no solution found', solutions: undefined, expressions };
}
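And a matching sketch for the solver, fed with the parser output for the same example (this is essentially what the interactive cell above does):

const result = solve(parse(
  'P(Positive | User) = 0.9 ; P(¬Positive | ¬User) = 0.8 ; P(User) = 0.05 ; P(User | Positive)'
));
// result.solutions   -> a Fact for P(User | Positive) with probability ≈ 0.19149
// result.expressions -> the list of derivation steps rendered by the interactive cell
// result.error       -> undefined once every query is answered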

If I were to do this again I would write a language for the substitutions which is then parsed into the expander. It would also be nice to have a directed search, although that might be a bit much.