MishaEE
Junior Boarder
Blog Posts: 0
Forum Posts: 28
Rating: 0  
|
I've never been able to figure out just how the solution to the Sultan's Dowry problem (short version: a Sultan has 100 daughters, and you have to select one to be your wife, but you have to select the one with the largest dowry; as each one is presented to you, one at a time, the Sultan tells you her dowry, and you have to decide then and there whether she is the one you want before you can see any others).
Is there a website, or 'text explanation', anywhere that explains how the solution (as well as the solution to the 'follow-up' problem, 'what if you know that the 'optimal solution' is incorrect?'  was
|
|
The topic has been locked.
|
JohnC
Junior Boarder
Blog Posts: 0
Forum Posts: 25
Rating: 0  
|
|
Following are my thoughts on the problem, as well as a different 'follow-up' which to my knowledge has not been solved (although see the conjecture below):
Eric Farmer
Problem: You are responsible for interviewing n applicants for a job position. You will interview one applicant at a time. After each interview, you must decide whether to accept or reject the applicant; you cannot return to a previously rejected applicant. Assuming you know only the relative order of preference of the applicants interviewed at any given time, what is the strategy for selecting the best applicant with the greatest probability?
Solution: Let n be the total number of applicants, m be the number of applicants interviewed so far, and k be the rank of the current (m-th) applicant (1 <= k <= m <= n). Define accept(n,m,k) to be the probability that the current applicant is the best, and reject(n,m,k) (for m < n) to be the probability that, after rejecting the current applicant, the best applicant will be accepted using the optimal strategy. (Note that reject(n,m,k) is actually independent of k; this 'extra' notation will be useful in the follow-up problem below.) The strategy then consists of accepting or rejecting the m-th applicant, whose current rank is k, accordin g to whether accept(n,m,k) or reject(n,m,k), respectively, is greater.
Note that accept(n,m,k) = 0 for all k > 1, and accept(n,m,1) = m/n, which increases with m. Also, reject(n,m,1) does not increase with m, since nothing is known about the relative order of preference of applicants waiting to be interviewed. Thus, the optimal strategy consists of rejecting the first j applicants (for some j) outright, then accepting the next applicant better than all those interviewed so far.
Given n > 1, reject(n,n-1,1) = 1/n. For m < n-1 and accept(n,m+1,1) >= reject(n,m+1,1):
reject(n,m,1) = (m*reject(n,m+1,1) + accept(n,m+1,1)) / (m+1) = (m*reject(n,m+1,1) + (m+1)/n) / (m+1) = m/(m+1)*reject(n,m+1,1) + 1/n
= m/n * (1/m + ... + 1/(n-1))
by induction. The optimal strategy can then be described as follows: find the least m such that
m/n > m/n * (1/m + ... + 1/(n-1))
Reject the first m-1 applicants, then accept the next applicant better than all those interviewed so far.
Note that the right side of the above inequality is approximately m/n * ln(n/m), so that the critical number of applicants is approximately n/e, and the resulting probability of accepting the best applicant is approximately 1/e.
The objective in this problem is rather unrealistic. For example, suppose the second-best (overall) applicant is the first to be interviewed, and you just interviewed the ninth (next to last) applicant, which is the third-best overall, or the second-best so far. There are only two chances in ten that the last applicant will be better, but optimal strategy in this case suggests rejecting the current applicant. Thus, the following variant is proposed:
*******************************************************
********************* ************************
Problem: You are responsible for interviewing n applicants for a job position. You will interview one applicant at a time. After each interview, you must decide whether to accept or reject the applicant; you cannot return to a previously rejected applicant. Assuming you know only the relative order of preference of the applicants interviewed at any given time, what is the strategy for selecting the applicant with the best average rank?
Solution: Let n, m, and k be as above, and define accept(n,m,k) and reject(n,m,k) to be the expected rank of the applicant selected by accepting or rejecting the current applicant, respectively. The optimal strategy consists of accepting or rejecting the m-th applicant, whose current rank is k, according to whether accept(n,m,k) or reject(n,m,k), respectively, is lesser (i.e. results in better average overall rank).
Note that accept(n,m,k) = k + (n-m)*k/(m+1) = k*(n+1)/(m+1), since the overall rank of the current applicant is k plus n-m {0,1}-indicator variables, indicating whether each of the n-m subsequent applicants has better overall rank. Each subsequent applicant has better rank with probability k/(m+1); by linearity of expectation, accept(n,m,k) = k + (n-m)*k/(m+1) = k*(n+1)/(m+1).
(Equivalently, accept(n,m,k) = Sum[j * C(j-1,k-1) * C(n-j,m-k), {j,1,n}] / C(n,m), since among the C(n,m) equally likely sets of overall ranks of the applicants interviewed so far, the overall rank of the current applicant being j implies that the k-1 currently higher-ranked applicants are among the top j-1 applicants overall, and the m-k currently lower-ranked applicants are among the lowest n-j applicants overall.)
Also, reject(n,m,k) = Sum[min{accept(n,m+1,j), reject(n,m+1,j)}, {j,1,m+1}] / (m+1). Recall that reject(n,m,k) is independent of k, and so is constant for given n and m. Since accept(n,m,k) is increasing with k, the optimal strategy consists of accepting the m-th of n total applicants iff the applicant's rank is at most some critical k(n,m).
Empirical results based on these formulae suggest the following:
Conjecture: The optimal strategy consists of accepting the m-th of n total applicants, whose current rank is k, iff k is at most (approximately) n/(n-m+1).
For example, the following table describes the optimal strategy for n = 10; an x in position (m,k) indicates that if the m-th applicant has current rank k, then he/she should be accepted.
k
|
|
The topic has been locked.
|
richmondphil
Fresh Boarder
Blog Posts: 0
Forum Posts: 18
Rating: 0  
|
|
If you are conservative, you will want to reject the first 30 daughters (unless one is exceptionally well-endowed), while compiling statistics, like mean and standard deviation for a dowry.
After that, accept the first one whose dowry has at least a 50% chance of not being surpassed by a subsequent dowry. For example, if the 50th daughter's dowry is 3 standard deviations above average, each of the subsequent 50 daughters has about a 1/1000 chance of exceeding that dowry. So the odds that you will *not* find a better catch is (999/1000)^50, or about 95.12%, and you should accept daughter number 50.
Dave
|
|
The topic has been locked.
|
Transhumanist
Junior Boarder
Blog Posts: 0
Forum Posts: 23
Rating: 0  
|
|
I think you are solving a different problem than the one asked. The object is to get the largest dowry, not to get the as large a dowry as possible, or to get the average value of your dowry as large as possible. The way I often see this formulated is that you only get to marry her if she has the largest dowry, making you indifferent between the 2nd and 80th largest dowries.
With this phrasing, your strategy obviously fails in one circumstance. Suppose that you have gotten almost to the end, and are at the 99th daughter. The 99th daughter is well above average, but her dowry is not as large as the 65th daughter's. By your strategy, since #99 is above average, #100 has a less than 50% chance of surpassing her, so you should take #99. But there is absolutely no chance that #99 has the largest dowry, so in reality, you should go on to #100 and keep your finger's crossed.
|
|
The topic has been locked.
|
Jim
Junior Boarder
Blog Posts: 0
Forum Posts: 30
Rating: 0  
|
|
Yes, I agree. The wording indicates I must select the one with the largest dowry or I do not get to marry any of the daughters.
(Then naturally it follows that I would not accept a dowry that I already knew was not the largest.)
Here is my revised solution (which in more logical than conservative):
Always reject the first daughter.
For the second to hundredth daughter, always reject her if you have already rejected one with a dowry greater than or equal to hers (assuming that only one daughter has the 'largest dowry', as quoted in the premise).
Otherwise (for the second to hundredth daughter), select her if her dowry has at least a 50% chance of not being surpassed by a subsequent dowry.
On encountering each daughter, you must calculate the following statistics for all known daughters:
MeanOfKnownDowries StandardDeviationOfKnownDowries NumberOfKnownDowries
Then, using the above statistics, calculate:
ProbabilityCurrentDowryWillNotBeSurpassedByAnotherUnkno
wnDowry (statistics algorithm available upon request)
Finally, calculate:
ProbabilityCurrentDowryWillNotBeSurpassedByAnySubsequen
tDowry = ProbabilityCurrentDowryWillNotBeSurpassedByAnotherUnkno
wnDowry ^ SubsequentDowries
For example, if you just found out the 10th Dowry, and calculated ProbabilityCurrentDowryWillNotBeSurpassedByAnotherUnkno
wnDowry = 0.999, then ProbabilityCurrentDowryWillNotBeSurpassedByAnySubsequen
tDowry = (0.999)^90 = 0.913890031855952, which is > 50%, so you should select Daughter number 10.
|
|
The topic has been locked.
|
quest_marsman
Junior Boarder
Blog Posts: 0
Forum Posts: 25
Rating: 0  
|
|
Aren't you making an implicit assumption that the dowries are normally distributed when you calculate the probability of a dowry being surpassed based on the mean and standard deviation? Would a non-parametric method work better?
|
|
The topic has been locked.
|
paydayuscf
Junior Boarder
Blog Posts: 0
Forum Posts: 29
Rating: 0  
|
|
Yes, I am assuming normal distribution. You are right, it would be better to observe the distribution of known samples, and adjust my calculations accordingly. So the optimal solution would be extremely complex.
|
|
The topic has been locked.
|
Mathew
Junior Boarder
Blog Posts: 0
Forum Posts: 25
Rating: 0  
|
|
***** [SPOILER on “follow-up” problem is at end of this posting.]
The well-known solution is to pass on the first 100/e (37) daughters and then select the next dowry (if any) that is the highest so far. The probability that you will be successful in selecting the very highest is 1/e.
Eric Farmer posted a method of calculating this result. Here is another method that I posted recently in reply to a similar problem, using a very large deck of cards whose values can range from 0 to oo.
Ln[1/ (1-q] I will digress for a minute to discuss ln(a), which is the integral of 1/x between x=1 and x=a. For 0<a<1, let 1-q=a and flip the graph between 0 and 1. Then we can get the ln[1/(1-q) as follows:
Expanding: 1/(1-q) = 1 + q + q^2 + q^3 +.....
Integrating (from 0): Ln[1/ (1-q)] = q + q^2/2 + q^3/3 + ... for 0<q<1 Thus ln(3) = 2/3 + (2/3)^2/2 + (2/3)^3/3 +.....
This series converges even without the increasing denominators for 0<q<1. *****
Since the values can range from 0 to oo, no number is “small” or “large” except by comparison with a group of other cards. Accordingly, you should first look at (“pass”) a number of items keeping in mind only the largest of the pass group. Then you should select the first card (if any) that is higher than the highest in the pass group.
The strategy question is how many to pass initially to optimize your chance of selecting the highest value in the deck. Call group P the p% of the items you pass. Group Q is the remaining (1-p)% = q% of the deck. Hope for one of the following for large n: (Probabilities approach those shown as n increases.)
(1) The 2nd highest is in group P and the highest is in group Q. Probability of winning is p*q.
Or: (2) The 3rd highest is in P, and the two highest are in Q and he highest is the first one of the two high cards in Q. Probability is p*q^2/2.
Or: (3) The 4th highest is in P and the three highest are in Q and the first higher one in Q is highest in the deck. Probability is p*q^3/3.
Or: Etc. for 5th, 6th .. to be in P and the higher cards all in Q.
The total probability of winning is p(q + q^2/2 + q^3/3 +... ) From the formula given above the value in ( ) is ln[1/(1-q)]. Since p=(1-q) this equals ln(1/p) = -ln(p). Thus, we want to optimize -p*ln(p).
The derivative (fg)’ = f’g + fg’. The derivative of ln(p) is 1/p. This yields the derivative of -p*ln(p) = 1 + ln(p). Setting this equal to 0 yields ln(p) = -1 or p=e^-1 = 1/e (36.79%) as the optimum portion to pass initially.
Substituting p=1/e in the formula -p*ln(p) for the probability of winning yields 1/e as the probability of winning with this strategy.
For “small” n, the above calculations are inexact. But since the pass group will be an integer, I believe that rounding n/e is likely optimum anyway. I have made no detailed analysis of this.
***** 'FOLLOW-UP' PROBLEM
Someone (#1) using this strategy fails. You are next. You know that: - (1) the largest dowry is in the P group (0-37), or - (2) the two (or more) largest dowries are in the Q group (38-100) and the first one in this group is not the largest of all.
The sultan presents the daughters to you in the same order as he did for #1.
You should pass on the first 19.55% [e^(1/e -2)] of the daughters, noting only the largest in that group. If you find a largest-so-far (LSF) in the 20-37 group, you should select her.
If you fail to find one, you should then pass on the first LSF you may find after #37 (you know it fails) and wait (hope) for an even larger dowry and select that one.
As you might gather from the formula for 19.55%, the math is rather involved. Let’s look separately at the two possible reasons for #1 failing:
(1) The largest dowry is in the P group.
As noted earlier, if you pass on p% of a large group of any size, and then select the next LSF (if any), your probability of selecting the largest in the group is -p*ln(p). Suppose you pass on a range of r% of the P group and select the next LSF (if any) in that group.
Since the P group contains 1/e of all dowries, your overall chance of success is -r*ln(r)/e.
(2) The two (or more) largest dowries are in the Q group (38-100) and the first one in this group is not the largest of all.
We know that #1’s strategy succeeds 1/e (36.79%) of the time. Most of the successes occur when the largest is in the Q group (1-1/e) and the 2nd largest is in the P group (1/e). This probability is (1-1/e)*(1/e) = 1/e-1/e^2 (23.25%).
Subtracting this from the total probability of 1/e of #1’s strategy succeeding gives a probability of 1/e^2 (13.53%) that #1’s strategy wins when there are two or more of the largest in the Q group and the first of these happens to be the largest overall. But there is an equal probability that the largest overall is the 2nd LSF in Q.
You will get a chance to select from the Q group only if the largest in the P group is in the first r% of that group. (Otherwise you will select from the (1-r%) of the later dowries in that group). This makes your chance of success of picking the largest overall from the Q group equal to r/e^2 (.1353r). ***** Combining (1) and (2), your total chance of success is: -r*ln(r)/e + r/e^2.
Differentiating: f’(r) = -1/e -ln(r)/e +1/e^2 At f’(r)=0, ln(r) = 1/e- 1; r= e^(1/e -1)
This expresses r as a percent of P, which is only 1/e of the total dowries. Your pass group is r/e = e^(1/e -2) percent of all daughters.
r = 53.15% if the P group, or 19.55% of all daughters.
Given that #1 fails, your chance of success is 30.93%. (Actually, your best strategy is to try to talk the sultan into changing the order in which he presents his daughters. If he agrees, then use #1’s strategy, where you have a 36.79% chance.)
Bill Ryan
|
|
The topic has been locked.
|
ScottNash
Junior Boarder
Blog Posts: 0
Forum Posts: 24
Rating: 0  
|
|
s p o i l e r
Tricky? I found it pretty easy.
Pick a value before it begins. If the first daughter has a dowry less than that, pass her. If greater, take her.
If both daughters are less than that, it takes the second and wins 50% If both daughters are greater, it takes the first and wins 50%. If one is less and one greater, it takes the greater and wins 100%.
So if you choose reasonably, so case 3 happens with >0 probability, it wins > 50%.
|
|
The topic has been locked.
|
garyncurtis
Junior Boarder
Blog Posts: 0
Forum Posts: 32
Rating: 0  
|
|
We can do better than this:
Pick a strictly increasing function f from the possible dowry sizes to ]0;1[. You can substitute ]-inf;inf[ for the possible dowry sizes if you want to be on the safe side (like me, who doesn't know what a dowry is).
Choose a random number x in ]0;1[, and choose the first daughter if her dowry size d satisfies f(d)>1
This strategy wins with probability 50% + ABS( f(d1) - f(d2) ) / 2
which is strictly larger than 50% for any dowry sizes d-inf;inf[ for the possible dowry sizes if you want to be on the safe side (like me, who doesn't know what a dowry is).
Choose a random number x in ]0;1[, and choose the first daughter if her dowry size d satisfies f(d)>1
This strategy wins with probability 50% + ABS( f(d1) - f(d2) ) / 2
which is strictly larger than 50% for any dowry sizes d1 <> d2.
|
|
The topic has been locked.
|
jugherffere
Junior Boarder
Blog Posts: 0
Forum Posts: 34
Rating: 0  
|
|
Hmmm....
This actually doesn't satisfy the conditions I stated, which was that you should have a greater than 50% chance of winning for ANY size dowries. As you state above, for some sizes of dowries, you have an exactly 50% chance of winning.
Yes, this works, using the range ]-inf;infmmm....
This actually doesn't satisfy the conditions I stated, which was that you should have a greater than 50% chance of winning for ANY size dowries. As you state above, for some sizes of dowries, you have an exactly 50% chance of winning.
Yes, this works, using the range ]-inf;inf[
|
|
The topic has been locked.
|
|