Issue
I don't know how to describe this concisely. Say you have X groups of items, each item in the same group is identical. All the groups have infinite size.
You want to choose Y items from the X groups completely at random, you want exactly Y items, and you can choose all from 1 group, 2 groups, 3 groups... all groups.
It is hard for me to describe so I will give a simple example. Say you want to choose 9 items from 3 groups, call these groups A, B, C.
You can choose (9A, 0B, 0C) or (0A, 9B, 0C) or (0A, 0B, 9C) or (1A, 2B, 6C)...
You can choose either from one group, from two groups, or from all groups.
The task is to find the number of ways distinct ways to partition the Y items, or the sets of counts of items from the same groups, and the composition doesn't matter.
To make it more clear, (9A, 0B, 0C) and (0A, 9B, 0C) and (0A, 0B, 9C) are treated the same, they are all counted (0, 0, 9) and there are 3 ways to obtain (0, 0, 9), 54 ways to obtain (0, 1, 8).
I tried to do it using a smart way but I got the WRONG results:
from collections import Counter
from itertools import product
groups = Counter()
for i in range(10):
for k in range(j := 9 - i):
groups[tuple(sorted((i, k, j - k)))] += 1
Counter({(1, 2, 6): 6, (1, 3, 5): 6, (2, 3, 4): 6, (0, 1, 8): 4, (0, 2, 7): 4, (0, 3, 6): 4, (0, 4, 5): 4, (1, 1, 7): 3, (1, 4, 4): 3, (2, 2, 5): 3, (0, 0, 9): 1, (3, 3, 3): 1})
So I did it the bruteforce way and iterated through all the possibilities to get the CORRECT result:
new_groups = Counter()
for item in product('abc', repeat=9):
a = b = c = 0
for e in item:
if e == 'a':
a += 1
elif e == 'b':
b += 1
else:
c += 1
new_groups[tuple(sorted((a, b, c)))] += 1
Counter({(2, 3, 4): 7560,
(1, 3, 5): 3024,
(2, 2, 5): 2268,
(1, 4, 4): 1890,
(3, 3, 3): 1680,
(1, 2, 6): 1512,
(0, 4, 5): 756,
(0, 3, 6): 504,
(0, 2, 7): 216,
(1, 1, 7): 216,
(0, 1, 8): 54,
(0, 0, 9): 3})
The result is correct, but this method is extremely inefficient, it is in exponential time, for 9 items and 3 groups it takes 19683 iterations, and 9 items 4 groups 262144 iterations!
What is a better method?
Perhaps I didn't make it clear but I thought the code made it pretty clear what I was after.
Now I will state it explicitly, say you have Y balls and you want to throw all of them randomly into X containers. All balls are thrown into containers, no misses (this is mathematics).
The task is to find all sets of the counts of number of balls in the bins, which bins have the count doesn't matter, the sets are unordered (like Python set
s), and the corresponding count of each set.
The output should be the same as my correct but inefficient method's output. I am NOT trying to calculate just ONE number.
The correct output is the output from the bruteforce approach. The second one.
I just wrote another piece of code using the bruteforce method to illustrate what the numbers mean.
groups2 = {}
for item in product('abc', repeat=9):
a = b = c = 0
for e in item:
if e == 'a':
a += 1
elif e == 'b':
b += 1
else:
c += 1
key = tuple(sorted((a, b, c)))
if key in groups2:
groups2[key].append(''.join(item))
else:
groups2[key] = [''.join(item)]
To answer why (0, 1, 8) is 54:
In [43]: groups2[(0, 1, 8)]
Out[43]:
['aaaaaaaab',
'aaaaaaaac',
'aaaaaaaba',
'aaaaaaaca',
'aaaaaabaa',
'aaaaaacaa',
'aaaaabaaa',
'aaaaacaaa',
'aaaabaaaa',
'aaaacaaaa',
'aaabaaaaa',
'aaacaaaaa',
'aabaaaaaa',
'aacaaaaaa',
'abaaaaaaa',
'abbbbbbbb',
'acaaaaaaa',
'acccccccc',
'baaaaaaaa',
'babbbbbbb',
'bbabbbbbb',
'bbbabbbbb',
'bbbbabbbb',
'bbbbbabbb',
'bbbbbbabb',
'bbbbbbbab',
'bbbbbbbba',
'bbbbbbbbc',
'bbbbbbbcb',
'bbbbbbcbb',
'bbbbbcbbb',
'bbbbcbbbb',
'bbbcbbbbb',
'bbcbbbbbb',
'bcbbbbbbb',
'bcccccccc',
'caaaaaaaa',
'caccccccc',
'cbbbbbbbb',
'cbccccccc',
'ccacccccc',
'ccbcccccc',
'cccaccccc',
'cccbccccc',
'ccccacccc',
'ccccbcccc',
'cccccaccc',
'cccccbccc',
'ccccccacc',
'ccccccbcc',
'cccccccac',
'cccccccbc',
'cccccccca',
'ccccccccb']
Solution
FWIW, if there are no repeated inputs (e.g. 1,2,3 OK, 1,1,3 not OK), then the number of permutations is
P = n! * (a1 + ... + an)! / ((a1!) * ... * (an!))
where n is the number of groups, and the ai are the inputs. I don't yet know how to avoid double counting when there are repeated input values.
The equation comes from counting the number of permutations of a set with duplicates, (see here https://math.stackexchange.com/questions/4038010/counting-permutation-of-duplicate-items) and multiplying by the number of ways of permuting n groups which is n!. This is valid when there are no repeated inputs because there's no way that permuting the groups can produce the same list of items.
Empirical testing appears to show that this works when only one input value is repeated, but I can't explain why.
P = n! * (a1 + ... + an)! / ((a1!) * ... * (an!) * D!)
where D is the number of times the repeated value is repeated. E.g. for 1, 2, 3, 3, D = 2, for 2, 3, 3, 3, D = 3 etc.
Further empirical testing appears to show that this works but I can't explain why.
P = n! * (a1 + ... + an)! / ((a1!) * ... * (an!) * D1! * ... *Dm!)
where there are m distinct input values which occur Di times. E.g. for 1, 2, 3, 3, [D1, D2, D3] = [1, 1, 2], for 2, 3, 3, 3, [D1, D2] = [1, 3].
While this isn't proof, it does appear to work. 7^11 = 1977326743.
7 : 0 0 0 0 0 0 11
462 : 0 0 0 0 0 1 10
2310 : 0 0 0 0 0 2 9
6930 : 0 0 0 0 0 3 8
13860 : 0 0 0 0 0 4 7
19404 : 0 0 0 0 0 5 6
11550 : 0 0 0 0 1 1 9
103950 : 0 0 0 0 1 2 8
277200 : 0 0 0 0 1 3 7
485100 : 0 0 0 0 1 4 6
291060 : 0 0 0 0 1 5 5
207900 : 0 0 0 0 2 2 7
970200 : 0 0 0 0 2 3 6
1455300 : 0 0 0 0 2 4 5
970200 : 0 0 0 0 3 3 5
1212750 : 0 0 0 0 3 4 4
138600 : 0 0 0 1 1 1 8
1663200 : 0 0 0 1 1 2 7
3880800 : 0 0 0 1 1 3 6
5821200 : 0 0 0 1 1 4 5
5821200 : 0 0 0 1 2 2 6
23284800 : 0 0 0 1 2 3 5
14553000 : 0 0 0 1 2 4 4
19404000 : 0 0 0 1 3 3 4
5821200 : 0 0 0 2 2 2 5
29106000 : 0 0 0 2 2 3 4
12936000 : 0 0 0 2 3 3 3
831600 : 0 0 1 1 1 1 7
11642400 : 0 0 1 1 1 2 6
23284800 : 0 0 1 1 1 3 5
14553000 : 0 0 1 1 1 4 4
52390800 : 0 0 1 1 2 2 5
174636000 : 0 0 1 1 2 3 4
38808000 : 0 0 1 1 3 3 3
87318000 : 0 0 1 2 2 2 4
174636000 : 0 0 1 2 2 3 3
43659000 : 0 0 2 2 2 2 3
2328480 : 0 1 1 1 1 1 6
34927200 : 0 1 1 1 1 2 5
58212000 : 0 1 1 1 1 3 4
174636000 : 0 1 1 1 2 2 4
232848000 : 0 1 1 1 2 3 3
349272000 : 0 1 1 2 2 2 3
52390800 : 0 1 2 2 2 2 2
2328480 : 1 1 1 1 1 1 5
34927200 : 1 1 1 1 1 2 4
23284800 : 1 1 1 1 1 3 3
174636000 : 1 1 1 1 2 2 3
87318000 : 1 1 1 2 2 2 2
Total 1977326743 :
Answered By - Simon Goater
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.