Some probability puzzles

1. A certain family contains two children. One of them is a girl. What is the probability that the other is a girl?
The controversy over the first problem seems to stem from ambiguity. Reading back I see that even you reversed your answer on it at least twice, Calboner. For the sake of a definitive answer, could you restate the problem without ambiguities?

From the set of all families with two children, a child is selected at random and is found to be a girl. What is the probability that the other child of the family is a girl?

or

From the set of all families with two children, a family is selected at random and is found to have a girl. What is the probability that the other child of the family is a girl?
I can see why you think there is an ambiguity, but I don't think there really is one. The original question has to be interpreted the second way you stated it. The statement "One of them is a girl" has to mean "at least one of them is a girl." The only other way you could interpret it is "exactly one is a girl" in which case there is zero chance the other child is a girl.

If you told me you had 2 children and I asked, "Is one of them a girl?" you wouldn't select one of them at random and then answer based on that. You would answer yes if at least one was a girl.
 
I appreciate your taking the trouble to read back over previous posts, Irish. My first answer to problem no. 1 (post #11 above), which I offered with express uncertainty ("I may be making a mistake here, but this is how I reasoned"), was that the probability of the other child being a girl was 1/3. After I posted that, several others argued for the same answer that you offer, 1/2. I was initially persuaded by those arguments, and so accepted that answer (post #24), but eventually I was persuaded by Jovial's response (post #42) that my initial solution was in fact the correct one (as I stated in post #43). Since that time, I have only become more confident in that solution. So that makes two reversals, but no more than two.

You say that the initial statement of the problem was ambiguous, in that it could be interpreted in either of the two ways that you state. The initial statement of the problem was:
A certain family contains two children. One of them is a girl. What is the probability that the other is a girl?
And your two interpretations are:
1. From the set of all families with two children, a child is selected at random and is found to be a girl. What is the probability that the other child of the family is a girl?

2. From the set of all families with two children, a family is selected at random and is found to have a girl. What is the probability that the other child of the family is a girl?
I have difficulty grasping the distinction that you are trying to make here. For one thing, your version no. 2 is itself ambiguous. The phrase "a family is . . . found to have a girl" could mean either (a) that the family has exactly one girl or (b) that it has at least one girl. Now interpretation (a) makes nonsense of the problem, since if it were stipulated at the outset that the family has exactly one girl, then the answer to the question "What is the probability that the other child of the family is a girl?" is simply "zero." But if we rewrite (2) to adopt interpretation (b), we end up with a question that doesn't make sense:
From the set of all families with two children, a family is selected at random and is found to have at least one girl. What is the probability that the other child of the family is a girl?
The question makes no sense, because the phrase "the other child" has no reference when no child has been specified in the first place. ("Other" child than what child? Than the one or two children who are girls?) The question cannot be "What is the probability that the other child of the family is a girl?", but must be "What is the probability that both children of the family are girls?" So the proper statement of (2) should be as follows:
2*. From the set of all families with two children, a family is selected at random and is found to have at least one girl. What is the probability that both children of the family are girls?
But now, I don't see how this is substantively different from version 1. Won't both have exactly the same solution?

Edited to add: Jovial's post appeared while I was writing this one.
 
I can see why you think there is an ambiguity, but I don't think there really is one. The original question has to be interpreted the second way you stated it. The statement "One of them is a girl" has to mean "at least one of them is a girl." The only other way you could interpret it is "exactly one is a girl" in which case there is zero chance the other child is a girl.
I'm moderately decent at spelling and I know a few rules about grammar, but interpreting the English language has never been a strong suit of mine.

I feel that the two restatements of the problem are distinctly different, since they use different sampling methods and spell them out, though I'm not asking anyone to agree with that part. But knowing the scope of the problem now, I agree with the answer of 1/3.

If you told me you had 2 children and I asked, "Is one of them a girl?" you wouldn't select one of them at random and then answer based on that. You would answer yes if at least one was a girl.
You're making two assumptions about me, and while you happen to be correct in this instance, I could not make the same assumptions in reference to this problem.

1. That I won't lie to you.
2. That I am not a math problem written in a language you've never liked dealing with - especially for math.

Won't both have exactly the same solution?
The logic I used to present my answer the first time (which results in 50%) applies to picking a child; the logic you used (which results in 33%) applies to picking a family. As you said earlier, it's the difference between sets and permutations.

It really should be obvious given you mention the family, but I did not interpret it that way. Again, probably more a lack of reading comprehension on my part.
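
For anyone who wants to check the two readings by brute force, here is a minimal Python sketch. It assumes that each child is independently a boy or a girl with probability 1/2, which the problem never actually states, and the function names are made up for illustration. It lists the four equally likely birth orders and then conditions in the two ways described above.

```python
from fractions import Fraction
from itertools import product

# Assumption: the four ordered families BB, BG, GB, GG are equally likely.
FAMILIES = list(product("BG", repeat=2))


def pick_family():
    """Pick a family at random; condition on it having at least one girl."""
    with_girl = [f for f in FAMILIES if "G" in f]
    both_girls = [f for f in with_girl if f == ("G", "G")]
    return Fraction(len(both_girls), len(with_girl))


def pick_child():
    """Pick a family, then one of its two children, at random;
    condition on the picked child being a girl."""
    # Every (family, child index) pair is equally likely.
    picks = [(f, i) for f in FAMILIES for i in (0, 1)]
    girl_picked = [(f, i) for f, i in picks if f[i] == "G"]
    has_sister = [(f, i) for f, i in girl_picked if f[1 - i] == "G"]
    return Fraction(len(has_sister), len(girl_picked))


print("picking a family:", pick_family())  # 1/3
print("picking a child: ", pick_child())   # 1/2
```

Under those assumptions, picking a family leaves three equally likely cases (BG, GB, GG), while picking a child leaves four equally likely girls, two of whom have a sister.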
 
The logic I used to present my answer the first time (which results in 50%) applies to picking a child; the logic you used (which results in 33%) applies to picking a family. As you said earlier, it's the difference between sets and permutations.

It really should be obvious given you mention the family, but I did not interpret it that way. Again, probably more a lack of reading comprehension on my part.

I think I'm beginning to get the idea, and also to see what is wrong with this way of interpreting the problem. Here again is the disputed interpretation:
1. From the set of all families with two children, a child is selected at random and is found to be a girl. What is the probability that the other child of the family is a girl?
Only now do I see what is wrong with this formulation, namely that the first sentence is logically incoherent. It begins, "From the set of all families with two children, a child is selected at random"; but this makes no sense. By definition, the members of a set of families are simply families, not children. From a set of families, you can only select a family; you cannot, as a matter of logic, select a child. A child is a member of a family, of course, but it is not a member of a set of families.

This may sound like nitpicking, but it really isn't; for look what happens when we try to rewrite (1) to eliminate its logical incoherence. I can see only two possible ways of doing so:
1a. From the set of all families with two children, a family is selected at random and is found to contain at least one girl. What is the probability that both children are girls?

1b. From the set of all children having exactly one sibling (or, equivalently: the set of all children belonging to families of two children), a child is selected at random and is found to be a girl. What is the probability that the other child of the family is a girl?
Now it is obvious that (1a) is equivalent to (2*), the corrected version of (2) that I posted earlier; in fact, it is almost identical verbatim:
2*. From the set of all families with two children, a family is selected at random and is found to have at least one girl. What is the probability that both children of the family are girls?
What, then, of (1b)? OOPS! I posted something here, but saw an error in it immediately after posting it! To be continued.
 
Okay, I think I've got (1b) figured out. Here it is again, for easy reference:
1b. From the set of all children having exactly one sibling (or, equivalently: the set of all children belonging to families of two children), a child is selected at random and is found to be a girl. What is the probability that the other child of the family is a girl?
Consider the following sets (the notation may not be mathematically proper, but I hope that it is clear enough):
F2: the set of families of two children
C2: the set of children belonging to families of two children
Clearly, C2 will have exactly twice the membership (I mean the number of members; I don't remember what the proper set-theoretic term is for this) that F2 has. Now consider these sets:
F2G: the set of families of two children, both of which are girls
C2G: the set of children belonging to families of two children, both of which are girls
Clearly, F2G is a subset of F2 and C2G is a subset of C2. Moreover, their memberships will stand in the same numerical ratio as the first two: C2G will have exactly twice the number of members that F2G has. Now the proportion of the membership of F2G to that of F2 will be identical with the probability that a randomly chosen family of two children is a family of two children, both of which are girls (this is what (1a) asks), and the proportion of the membership of C2G to that of C2 will be identical with the probability that a randomly chosen child belonging to a two-child family will belong to a two-child family, both of the children in which are girls (i.e., will have a sister; this is what (1b) asks). From what has gone before, it follows that these probabilities will be identical.

There is probably a much simpler way of demonstrating this result, but the implication should be clear: the solution to (1b) will be identical to the solution to (1a), which has already been shown to be identical to the solution to (2*), which has been shown to be 1/3.

Hmm. This was not supposed to be so complicated. But I was bothered by your contention that my original statement of the problem was ambiguous. I think it is clear that the problem can only be interpreted as (1a) or (2*), but this further analysis shows that even if we interpret it, rather implausibly, as (1b), the solution is the same.
 
There don't seem to be any further takers for my problem no. 4 (which actually should have been numbered 5, because, as I only remembered too late, Jovial offered a fourth problem a while back), so I will comment on it myself. The problem was:
4. There is a lottery. Exactly 1,000 tickets are sold. Two tickets will be drawn at random: the holder of the first ticket drawn will win first prize; the holder of the second ticket drawn will win second prize. If you hold one ticket, what is your chance of winning second prize?
What I found interesting about this problem is that there is a temptation to think that the probability of winning second prize is 1/999 rather than 1/1,000. In fact, I initially thought that that was the correct answer; for after all, the second-prize-winning ticket is drawn from 999 tickets, not from 1,000. I had hoped that someone would offer that answer to the problem, so that I could then offer the following reply:

The reasoning is that, if we start out with N tickets and draw one to get first prize, then the second ticket is drawn from a pool of N-1 tickets; therefore, the chance of winning second prize is 1/(N-1). But suppose that there were only two tickets (i.e., N = 2). By this reasoning, the chance of winning second prize is then 1/(2-1) = 1/1 = 1; in other words, one cannot but win second prize. But that conclusion is obviously false: one's chance of winning second prize in such a lottery is 1/2, not 1. So obviously, that line of reasoning is faulty.
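
The two-ticket case is small enough to check by listing every possible draw. Here is a rough Python sketch of that check; the function name is just a placeholder, and it assumes that every ordered pair of distinct tickets is equally likely to be drawn.

```python
from fractions import Fraction
from itertools import permutations


def chance_of_second_prize(n, my_ticket=0):
    """Chance that my_ticket is drawn second when two of n tickets
    are drawn one after the other without replacement."""
    draws = list(permutations(range(n), 2))  # all (first, second) pairs
    wins = [d for d in draws if d[1] == my_ticket]
    return Fraction(len(wins), len(draws))


# Compare the enumerated chance with the faulty 1/(N-1) for a few small N.
for n in (2, 3, 10):
    print(n, chance_of_second_prize(n), "versus the faulty", Fraction(1, n - 1))
```

With N = 2 there are only two possible draws, and my ticket comes second in exactly one of them.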

The interesting question is then: where does this line of reasoning go wrong? I say that the error in it is that it conflates two different questions. The question posed in the problem is "What is the chance of winning second prize?" The reasoning that leads to the answer "1/999" answers a different question, namely: "What is the chance of winning second prize, given that your ticket was not drawn for first prize?" In the terms used in the theory of probability, the error in the latter line of reasoning is that it conflates a conditional probability with a categorical one. Understood in these terms, the error is the very same one that leads to the wrong answer to problem no. 1 (about the two children).

I may be just talking to myself at this point. I don't know if anyone finds this interesting any more. But I do.

In case there is anyone still interested in this, here is a variation (call this problem no. 4B): Suppose that the way the lottery works is that after a ticket is drawn for first prize, it is returned to the pool before a second drawing is made. Thus, in this lottery, it is possible for the same ticket to be drawn twice. Everything else is the same as in problem no. 4: exactly 1,000 tickets are sold altogether, and you hold just one of them. What, then, is your chance of winning second prize in this lottery?
 
Well, I'm still reading this thread, Calboner, so your typing isn't wasted. I'm impressed that you can explain the answers with such rigor.

I knew the answer to #4 was 1/1000, but I wanted to see if anyone else responded.

For 4b, I would say the chance of winning second is 1/1000. And you have a 1 in 1,000,000 chance of winning both first and second.
 
That's the answer that I get for no. 4B as well, 1/1000; but that result surprised me, because it seems as though throwing the first ticket back into the lot should make a difference to the probability of getting your ticket drawn second! In #4 (where the first-drawn ticket is kept out), the second ticket is drawn from 999, while in #4B (where the first-drawn ticket is thrown back in), it's drawn from 1,000. But once again, I think that the cause of perplexity is the tendency to confuse a categorical probability (the probability of having one's ticket drawn second) with a conditional probability (the probability of having one's ticket drawn second given that it was not drawn first). For the conditional probabilities do differ: in #4, given that the first ticket has been drawn and was not one's own, the probability of one's ticket getting drawn second is 1/999; while in #4B, given that the first ticket has been drawn and thrown back (regardless of whether it was one's own or not), the probability of one's ticket getting drawn second is 1/1,000.

The differences in the probabilities of winning in the two games can be stated thus:

No. 4 (first-drawn ticket is not replaced):
Probability of winning first prize: 1/1,000
Probability of winning second prize: 1/1,000
Probability of winning both prizes: 0
Probability of winning second prize, given that one's ticket was not drawn first: 1/999

No. 4B (first-drawn ticket is replaced):
Probability of winning first prize: 1/1,000
Probability of winning second prize: 1/1,000
Probability of winning both prizes: 1/1,000,000
Probability of winning second prize, given that one's ticket was not drawn first: 1/1,000 (having one's ticket drawn first or not drawn first has no effect on the probability of its being drawn second)
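
These figures can be checked by counting every possible pair of draws. The following Python sketch does that for both games; the function and key names are my own invention, and it assumes that all ordered draws are equally likely. It counts rather than simulates, so the fractions come out exact.

```python
from fractions import Fraction
from itertools import permutations, product


def lottery(n, replace, mine=0):
    """Count outcomes for ticket `mine` over every (first, second) draw.

    replace=False is game no. 4; replace=True is game no. 4B.
    """
    tickets = range(n)
    draws = product(tickets, repeat=2) if replace else permutations(tickets, 2)
    total = first = second = both = not_first = second_not_first = 0
    for a, b in draws:
        total += 1
        first += int(a == mine)
        second += int(b == mine)
        both += int(a == mine and b == mine)
        not_first += int(a != mine)
        second_not_first += int(a != mine and b == mine)
    return {
        "first": Fraction(first, total),
        "second": Fraction(second, total),
        "both": Fraction(both, total),
        "second given not first": Fraction(second_not_first, not_first),
    }


for label, replace in (("No. 4", False), ("No. 4B", True)):
    print(label, {k: str(v) for k, v in lottery(1000, replace).items()})
```

Only the last two entries differ between the two games; the chance of second prize itself is 1/1,000 either way.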
 
I would say the Law of Total Probability applies for problem 4.
Using the notation:
P(A) = probability of A
P(~A) = probability that A did not occur
P(A|B) = probability of A given B occurred
P(A|~B) = probability of A given that B did not occur

Then P(A) = P(A|B) * P(B) + P(A|~B) * P(~B)

Let
A = the event that your ticket is drawn second
B = the event that your ticket is drawn first

Then
P(A|B) = 0
P(B) = 1/1000

P(A|~B) = 1/999
P(~B) = 999/1000

So by the Law of Total Probability you get

P(A) = 0 * (1/1000) + (1/999) * (999/1000) = 1/1000

Which is basically how someone calculated it previously.

Edit: Just to clarify in words, the Law of Total Probability states that P(A) is the weighted average of the probability of A given that B has occurred and given that it has not occurred.
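
For what it's worth, the same arithmetic can be done with exact fractions in Python. This is only a sketch: the helper name is hypothetical, and the 4B inputs are the conditional probabilities given earlier in the thread for the game where the first ticket is put back.

```python
from fractions import Fraction


def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """P(A) = P(A|B) * P(B) + P(A|~B) * P(~B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)


n = 1000
p_b = Fraction(1, n)  # P(B): your ticket is drawn first

# Problem 4 (no replacement): P(A|B) = 0, P(A|~B) = 1/(n - 1)
print(total_probability(Fraction(0), Fraction(1, n - 1), p_b))  # 1/1000

# Problem 4B (with replacement): P(A|B) = P(A|~B) = 1/n
print(total_probability(Fraction(1, n), Fraction(1, n), p_b))  # 1/1000
```

In problem 4 the 999 in the weight cancels the 999 in the conditional probability, which is why the answer comes back to 1/1000.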
 
Edit: Just to clarify in words, the Law of Total Probability states that P(A) is the weighted average of the probability of A given that B has occurred and given that it has not occurred.

The calculus of conditional probability is not difficult to learn, just as a way of twiddling with formulas, but maintaining a grasp of what it means and applying it correctly can be very difficult.

I think we've probably driven everybody away by now with our nerdiness.