Continues from Part 1 - the problem

I hope you had fun trying to come up with your own solution. I hope you have been successful. It is a tricky little problem.

The story continues

The next morning, Alice wakes up and starts going back to the basics. She draws the probabilities she wants on a sheet of paper:

Diagram showing cum sum over probabilites

So, how is she going to choose two values from this? She needs to have two draws, with each draw choosing one friend.

She is puzzled and goes into the kitchen to make herself a cup of tea to help thinking. Opening the drawer, she spots a pair of scissors and has an idea. She picks them up and goes back to her paper, cutting it into two parts.

Diagram showing cumsum cut into two parts

This works all right, she can have two independent draws to select a value from the first and then a value from the second part, using the draw with probabilities (from part 1) that she already has.

But hold on a moment: this does not work! Charlie is on both parts, so he might get selected to do the dishes with himself. No, that is not good enough.

But then she rearranges the bars on the table, and notices something strange.

Diagram re-arranging probabilities

If the bars are placed below each other, the top and bottom area of Charlie do not overlap. Indeed, it is impossible to overlap for any probabily, as the width of the bar is 1.0, and the maximum probability for a single friend is 1.0 as well.

So, all she needs to do is to select a single number that is applied to all bars, and nobody will ever get selected twice!

So, on the next dinner, she implements her algorithm:

from numpy import cumsum
from bisect import bisect_left

def choose_n_with_probabilities(n, values, probabilities):
    sums = cumsum(probabilities)
    i = random()
    return [values[bisect_left(sums, x + i)] for x in range(0, n)]

She runs it:

on_duty = choose_n_with_probabilities(2, ['Alice', 'Bob', 'Charlie', 'Dave'], [0.2, 0.6, 0.8, 0.4])
print("Today, %s and %s do the dishes." % tuple(on_duty))
>>> Today, Bob and Dave do the dishes.

Looks like it is Bob’s and Dave’s turn to clean up afterwards.

Diagram choosing two friends with one draw

She shows her solution to the friends, and everybody is delighted. To check that everything is fair, they run their sample function again:

print(sample(10000, lambda: choose_n_with_probabilities(2, ['Alice', 'Bob', 'Charlie', 'Dave'], [0.2, 0.6, 0.8, 0.4])))
>>> {'Bob': 6026, 'Dave': 4034, 'Charlie': 7951, 'Alice': 1989}

The numbers are roughly double of what they had in their old appartment, which looks right considering that two people have to do the dishes now and is in line with the doubling of probabilities. Problem solved!

(this is indeed a fully universal solution to the problem of part 1)

Improvement: Reducing correlalations

All is good until after a while Bob and Dave start complaining that they never do the dishes together with Alice. Only Charlie ever does the dishes together with her. This is not good.

This has to do with the fact that the probilites are always set up the same way. A way to solve this is to shuffle the order of the friends in the probability bar. Thankfully, numpy already has a helpful function for this.

from numpy.random import shuffle

def choose_n_with_probabilities_shuffled(n, values, probabilities):
    pairs = list(zip(values, probabilities))
    shuffle(pairs)
    (values, probabilities) = zip(*pairs)
    sums = cumsum(probabilities)
    i = random()
    return [values[bisect_left(sums, x + i)] for x in range(0, n)]

Now everybody is happy and they live happily ever after …

(Clarification after feedback: this clearly reduces correlations. I do believe that this creates a solution with minimal correlations, and I have an approach to a proof - but I might also be wrong).

Epilogue

While this problem seems to be rather artifical, I can confirm that it comes from a real-world problem in production code that I was facing a few years ago. It had to do with the fair treament of customers and trade allocations. It is quite funny how this now has become an interesting puzzle to solve. I hope you enjoyed it wrapped in a little story.

The story continues#

Improvement: Reducing correlalations#

Epilogue#

The story continues

Improvement: Reducing correlalations

Epilogue