Alice and Bob exchange data – Probably Overthinking It
Two questions crossed my desktop this week, and I think I can answer both of them with a single example.
On Twitter, Kareem Carr asked, “If Alice believes an event has a 90% probability of occurring and Bob also believes it has a 90% chance of occurring, what does it mean to say they have the same degree of belief? What would we expect to observe about both Alice’s and Bob’s behavior?”
And on Reddit, a reader of /r/statistics asked, “I have three coefficients from three different studies that measure the same effect, along with their 95% CIs. Is there an easy way to combine them into a single estimate of the effect?”
So let me tell you a story:
One day Alice tells her friend, Bob, “I bought a random decision-making box. Every time you press this button, it says ‘yes’ or ‘no’. I’ve tried it a few times, and I think it says ‘yes’ 90% of the time.”
Bob says he has some important decisions to make and asks if he can borrow the box. The next day, he returns the box to Alice and says, “I used the box several times, and I also think it says ‘yes’ 90% of the time.”
Alice says, “It sounds like we agree, but just to make sure, we should compare our predictions. Suppose I press the button twice; what do you think is the probability it says ‘yes’ both times?”
Bob does some calculations and reports the predictive probability 81.56%.
Alice says, “That’s interesting. I got a slightly different result, 81.79%. So maybe we don’t agree after all.”
Bob says, “Well, let’s see what happens if we combine our data. I can tell you how many times I pressed the button and how many times it said ‘yes’.”
Alice says, “That’s okay, I don’t actually need your data; it’s enough if you tell me what prior distribution you used.”
Bob tells her he used a Jeffreys prior.
Alice does some calculations and says, “Okay, I’ve updated my beliefs to take into account your data as well as mine. Now I think the probability of ‘yes’ is 91.67%.”
Bob says, “That’s interesting. Based on your data, you thought the probability was 90%, and based on my data, I thought it was 90%, but when we combine the data, we get a different result. Tell me what data you saw, and let me see what I get.”
Alice tells him she pressed the button eight times and it always said ‘yes’.
“So,” says Bob, “I guess you used a uniform prior.”
Bob does some calculations and reports, “Taking into account all of the data, I think the probability of ‘yes’ is 93.45%.”
Alice says, “So when we started, we had seen different data, but we came to the same conclusion.”
“Sort of,” says Bob, “we had the same posterior mean, but our posterior distributions were different; that’s why we made different predictions for pressing the button twice.”
Alice says, “And now we’re using the same data, but we have different posterior means. Which makes sense, because we started with different priors.”
“That’s true,” says Bob, “but if we collect enough data, eventually our posterior distributions will converge, at least approximately.”
“Well, that’s good,” says Alice. “Anyway, how did those decisions work out yesterday?”
“Mostly bad,” says Bob. “It turns out that saying ‘yes’ 93% of the time is a terrible way to make decisions.”
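The numbers in the story can be approximated with conjugate beta-binomial updates. The sketch below is not the notebook's code. It assumes closed-form Beta posteriors, and it assumes Bob saw 13 ‘yes’ results in 14 presses: the story never states that count, but it is consistent with a Jeffreys prior and his reported 90% and 81.56%. The helper names (`update`, `mean`, `prob_two_yes`) are made up for illustration.

```python
# Beta(a, b) distributions are represented as (a, b) tuples.

def update(prior, yes, no):
    """Conjugate update: add the observed counts to the Beta parameters."""
    a, b = prior
    return (a + yes, b + no)

def mean(dist):
    """Posterior mean: the probability that the next press says 'yes'."""
    a, b = dist
    return a / (a + b)

def prob_two_yes(dist):
    """Predictive probability of two 'yes' results in a row, E[x^2]."""
    a, b = dist
    return a * (a + 1) / ((a + b) * (a + b + 1))

UNIFORM = (1.0, 1.0)    # Alice's prior
JEFFREYS = (0.5, 0.5)   # Bob's prior

alice_data = (8, 0)     # stated in the story: 8 presses, all 'yes'
bob_data = (13, 1)      # ASSUMPTION: not stated in the story

alice = update(UNIFORM, *alice_data)    # Beta(9, 1)
bob = update(JEFFREYS, *bob_data)       # Beta(13.5, 1.5)
print(mean(alice), mean(bob))                   # both about 0.9
print(prob_two_yes(alice), prob_two_yes(bob))   # about 0.8182 and 0.8156

# Same pooled data, different priors
alice_all = update(alice, *bob_data)    # Beta(22, 2)
bob_all = update(bob, *alice_data)      # Beta(21.5, 1.5)
print(mean(alice_all), mean(bob_all))           # about 0.9167 and 0.9348
```

Under these assumptions the closed forms reproduce the 90%, 81.56%, and 91.67% figures. They give about 81.82% and 93.48% where the story quotes 81.79% and 93.45%; the small gaps may come from numerical approximation in the original calculations, but that is a guess.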
If you would like to know how any of these calculations work, you can see the details in a Jupyter notebook:
And if you don’t want the details, here is the summary:
- If two people have different priors OR they see different data, they will generally have different posterior distributions.
- If two posterior distributions have the same mean, some of their predictions will be the same, but many others will not.
- If you are given summary statistics from a posterior distribution, you might be able to figure out the rest of the distribution, depending on what other information you have. For example, if you know the posterior is a two-parameter beta distribution (or is well-modeled by one), you can recover it from the mean and second moment, or the mean and a credible interval, or almost any other pair of statistics.
- If someone has done a Bayesian update using data you don’t have access to, you might be able to “back out” their likelihood function by dividing their posterior distribution by the prior.
- If you are given a posterior distribution and the data used to compute it, you can back out the prior by dividing the posterior by the likelihood of the data (unless the prior contains values with zero likelihood).
- If you are given summary statistics from two posterior distributions, you might be able to combine them. In general, you need enough information to recover both posterior distributions and at least one prior.
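For beta distributions, the recovery and “backing out” steps in this summary reduce to arithmetic on the parameters. Here is a minimal sketch, assuming the posterior is exactly Beta(a, b) and the prior is also a beta distribution. The function names are hypothetical, and the inputs are Bob's reported 90% and 81.56%, with the second taken as the unrounded value 0.815625 (an assumption about rounding).

```python
def beta_from_moments(m1, m2):
    """Recover Beta(a, b) from its mean m1 and second moment m2."""
    var = m2 - m1 ** 2
    total = m1 * (1 - m1) / var - 1     # a + b
    return (m1 * total, (1 - m1) * total)

def back_out_counts(posterior, prior):
    """Dividing a Beta posterior by a Beta prior leaves a likelihood
    proportional to x**yes * (1 - x)**no; return (yes, no)."""
    return (posterior[0] - prior[0], posterior[1] - prior[1])

JEFFREYS = (0.5, 0.5)   # the prior Bob says he used

# Step 1: recover Bob's posterior from his two summary statistics.
bob_post = beta_from_moments(0.9, 0.815625)

# Step 2: divide by his prior to get the likelihood of his data.
yes, no = back_out_counts(bob_post, JEFFREYS)
print(bob_post, (yes, no))      # approximately (13.5, 1.5) and (13, 1)

# Step 3: Alice multiplies her own posterior, Beta(9, 1), by that likelihood.
alice_post = (9.0, 1.0)
combined = (alice_post[0] + yes, alice_post[1] + no)
print(combined[0] / sum(combined))   # approximately 0.9167
```

Alice never needs Bob's raw data: his posterior mean, one prediction, and his prior are enough, which is why she asks only for the prior in the story.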