## Puzzle from Reddit (link)

*04/19/2015*

My friend linked me a programming puzzle on Reddit and I found it interesting. I didn't get anywhere thinking about it, but I brought it up with my friends and we came up with a neat solution. This post outlines our approach to solving the puzzle (solution below, pardon my artwork as always).

**Assumes all information given is consistent!**

### Problem description

As a brief problem description, you are given *N* input statements giving the probabilities of various combined events. For instance, in the default example, the input/output reads:

*P(!A & B) = 0.01*

*P(!A & !B) = 0.85*

*P(B) = 0.12*

*P(A & !B)*?

or, the event "Not A and B" happens with probability 1%. Given many probability assignments like this, the challenge is to find the probability of a target event; in the example above, this is

*P(A & !B)*.

### Renaming variables for convenience in linear algebra

The input statements are not conducive to linear arithmetic manipulation in their undoctored form. For instance, if we are given

*P(A) = 0.1*

*P(B) = 0.4*

we can't combine those equations to get

*P(A & B)*,

*P(A || B)*,

*P(A & !B)*, or really any meaningful combination of the two events; in order to do so we need some quantitative metric of how related *A* and *B* are. Notably, however, if we knew that *A* and *B* were mutually exclusive, then we could set up linear systems where the probability of one event or the other is simply the sum of the two individual probabilities. And that's the first trick - rephrasing the inputs as combinations of mutually exclusive events.

In a universe of *N* boolean events, there are *2^N* different, mutually exclusive outcomes. Specifically, let's label the original events *X_0, X_1, ..., X_{N-1}*, and label the entire universe of outcomes *Y_0, Y_1, ..., Y_{2^N - 1}*. Let's assume that *Y_0* corresponds to the specific outcome *!X_0 & !X_1 & ... & !X_{N-2} & !X_{N-1}*, that *Y_1* corresponds to *!X_0 & !X_1 & ... & !X_{N-2} & X_{N-1}*, that *Y_2* corresponds to *!X_0 & !X_1 & ... & X_{N-2} & !X_{N-1}*, etc, "incrementing" from "not *X_i*" to "*X_i*" like counting in binary.

If we rephrase the problem inputs in this grammar, using *Y* instead of *X*, we now get the benefit of knowing that each of our "events" is mutually exclusive of the others; only one element of the *Y_0, ..., Y_{2^N - 1}* sequence can be true. Because of this, we can add and subtract probabilities and use linear algebraic reasoning. To make this transition, define *Y'_i* as *P(Y_i)* for all *i* in *[0, 2^N - 1]*.
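As a quick illustration (a sketch with my own helper names, not from the original puzzle), here is how the *Y* indices line up with truth assignments of the *X* events for *N = 2*, counting in binary with *X_0* as the most significant bit:

```python
# Enumerate the 2**N mutually exclusive outcomes Y_0 ... Y_(2**N - 1)
# for N boolean events X_0 ... X_(N-1). Bit i of an outcome's index
# (reading X_0 as the most significant bit) says whether X_i is true,
# so incrementing the index "counts in binary" from all-false to all-true.

def outcome(index, n):
    """Truth values (X_0, ..., X_(n-1)) encoded by outcome Y_index."""
    return tuple(bool((index >> (n - 1 - i)) & 1) for i in range(n))

N = 2
for y in range(2 ** N):
    print(f"Y_{y} <-> {outcome(y, N)}")

# An event over the X's corresponds to the set of Y indices consistent
# with it; e.g. the event "X_1 is true" is the union Y_1 || Y_3:
x1_true = [y for y in range(2 ** N) if outcome(y, N)[1]]
print(x1_true)  # [1, 3]
```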

### Solving the example

Bringing this back to a concrete example, let's start with the example given above, except let's rename *A* to be *X_0* and *B* to be *X_1*, to be consistent with our naming conventions. We then have

*P(!X_0 & X_1) = 0.01*

*P(!X_0 & !X_1) = 0.85*

*P(X_1) = 0.12*

*P(X_0 & !X_1)*?

This can be rephrased in the new "Y"-grammar defined above as follows:

*P(Y_1) = 0.01*

*P(Y_0) = 0.85*

*P(Y_1 || Y_3) = 0.12*

*P(Y_2)*?

Note that *X_1* actually corresponds to the union of the two events *X_1 & !X_0* and *X_1 & X_0*, which is why *X_1* corresponds to *Y_1 || Y_3*. Anyway - we can again rephrase in the new "Y'"-grammar:

_{1}|| Y_{3}*Y'*

_{1}= 0.01*Y'*

_{0}= 0.85*Y'*

_{1}+ Y'_{3}= 0.12*Y'*?

_{2}As you may have realized, there's an additional equality which goes unsaid - the sum all of everything in the

*Y*sequence adds up to 1. Recall that the universe is exactly partitioned into the

_{i}*Y*sequence. So, our final linear system is

_{i}*Y'*

_{1}= 0.01*Y'*

_{0}= 0.85*Y'*

_{1}+ Y'_{3}= 0.12*Y'*

_{0}+ Y'_{1}+ Y'_{2}+ Y'_{3}= 1*Y'*?

_{2}which we can easily solve with Gaussian row reduction.

### Incomplete information

If we take a step back, we solved the example problem by finding the probability of every single true/false assignment to the input events. If we're given enough information, this is a fine approach. But what if the question were just

*P(!X_0 & X_1) = 0.01*

*P(!X_0 & !X_1) = 0.85*

*P(X_0)*?

In this case, we don't have enough information to solve for all of *{Y'_0, Y'_1, Y'_2, Y'_3}*, but we can still evaluate *P(X_0)*, which is *Y'_2 + Y'_3*. Rewritten in the "Y'"-grammar,

_{2}+ Y'_{3}*Y'*

_{1}= 0.01*Y'*

_{0}= 0.85*Y'*

_{0}+ Y'_{1}+ Y'_{2}+ Y'_{3}= 1*Y'*?

_{2}+ Y'_{3}Re-written again as a matrix,

*M*,

This is the second trick of this problem. How can we solve one particular linear combination of variables without solving for all of the variables? The naive Gaussian elimination hinges on being able to solve for all variables and is therefore not a valid solution. One solution is to find some combination of the input expressions which gives you the specific linear combination you're looking for. In the example above, note that

*1 · (Y'_0 + Y'_1 + Y'_2 + Y'_3) - 1 · (Y'_1) - 1 · (Y'_0) = Y'_2 + Y'_3*

So the problem has changed from solving for *Y'_i* to solving for the coefficients of the input equations which you can combine to get the desired linear combination!

To set up the new linear system (which solves for the coefficients of the input equations, not for *Y'_i*) we can take the __transpose of the unaugmented *M* matrix__, and augment it with the coefficients of the desired linear combination. Neat, right? We are now solving for the coefficients we can use to linearly combine the input equations to get our desired probability. Here is the resultant augmented matrix: the left side is the transpose of *M* and the right side is the coefficients of *Y'_i* representing our desired probability, which is *Y'_2 + Y'_3*:

```
      [ 0  1  1 | 0 ]
M^T = [ 1  0  1 | 0 ]
      [ 0  0  1 | 1 ]
      [ 0  0  1 | 1 ]
```

which, using Gaussian row reduction, we can simplify to

```
[ 1  0  0 | -1 ]
[ 0  1  0 | -1 ]
[ 0  0  1 |  1 ]
[ 0  0  0 |  0 ]
```

The right-side coefficients here, *(-1, -1, 1)*, can be used to linearly combine our input equations' right-hand sides to get our desired answer. In matrix form, this is

*(-1) · 0.01 + (-1) · 0.85 + 1 · 1 = 0.14*

to give our final answer of *0.14* for this problem. And that's all there is to it!
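Putting the whole second trick together, here is a self-contained sketch (helper names are mine, not from the post) that reproduces the 0.14 answer for the incomplete-information example:

```python
# The transpose trick: instead of solving for the Y'_i themselves, solve
# (M^T | d) for the coefficients c that combine the input equations into
# the target expression, then dot c with the inputs' right-hand sides.
# Fractions keep the arithmetic exact.

from fractions import Fraction as F

def rref(m):
    """Reduce an augmented matrix to reduced row echelon form."""
    m = [row[:] for row in m]
    lead = 0
    for r in range(len(m)):
        if lead >= len(m[0]) - 1:
            break
        # Find a pivot row for the current leading column.
        pivot = next((i for i in range(r, len(m)) if m[i][lead]), None)
        while pivot is None:
            lead += 1
            if lead >= len(m[0]) - 1:
                return m
            pivot = next((i for i in range(r, len(m)) if m[i][lead]), None)
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][lead] for x in m[r]]
        # Clear the leading column from every other row.
        for i in range(len(m)):
            if i != r and m[i][lead]:
                m[i] = [a - m[i][lead] * b for a, b in zip(m[i], m[r])]
        lead += 1
    return m

# Input equations over Y'_0..Y'_3: coefficient rows and right-hand sides.
M = [[0, 1, 0, 0],   # Y'_1 = 0.01
     [1, 0, 0, 0],   # Y'_0 = 0.85
     [1, 1, 1, 1]]   # Y'_0 + Y'_1 + Y'_2 + Y'_3 = 1
b = [F("0.01"), F("0.85"), F(1)]
d = [0, 0, 1, 1]     # target expression: Y'_2 + Y'_3

# Transpose M and augment each row with the target coefficient.
aug = [[F(M[r][c]) for r in range(len(M))] + [F(d[c])]
       for c in range(len(d))]
reduced = rref(aug)

# Pivots land in columns 0..2 here, so the last column of the first
# three rows holds the combining coefficients c = (-1, -1, 1).
coeffs = [row[-1] for row in reduced[:len(M)]]
answer = sum(c * rhs for c, rhs in zip(coeffs, b))
print(answer)  # 7/50, i.e. 0.14
```

Note that a solution for the coefficients exists only when the target expression lies in the row space of *M*; if the reduction leaves an inconsistent row like *[0 0 0 | 1]*, the given information simply doesn't determine the target probability.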