Instancing random objects preserving frequency

Hi all. After the post talking about the math involving frequency for procedural animation, it is now time to talk about the randomness.

During the production in studio we had to deal with the following question: instantiate some base objects along particles randomly to form a crowd, using the Maya Instancer, but having the control over the count of the instances for each object (to have some characters less frequent than others).

The Gaussian approach

The most natural attempt would be using a Gaussian distribution (or Normal distribution), which results in a non-uniform distribution of the samples, being more dense around the mean value, having the variance as measure of the width of  distribution.

Following the Gaussian approach, we can use the result of a random.gauss() function as the index of the element extracted from the list of our objects to instance:


import random

numObjs =10
numSamples=600
dev=2

random.seed(2233)
l= range(numObjs)
count = [0 for x in l]

for i in range(numSamples-1):
    while True:        
        v = int (abs(random.gauss(0,dev)))
        if v < len(l):
            count[v] +=1
            break
print count
        

The list “l” contains the objects we want to instance (for the sake of simplicity, I filled it out with numbers as [0, 1, .... numObjs] ), wile numSamples reprents the numbers of object we want to instance (i.e. the number of extractions).
To get stable results, I initialized the seed of the random function (!Important).
What we are doing in the two nested loops is extracting #numSamples times a random value from the Gaussian distribution, having as mean value 0 (the bell centered at origin) and dev as the width of the bell curve. Since we are using the result of the extraction as index for the list containing our elements, the result must be an integer value and equal or greater than zero. Because our list has fixed size (equal to len(l) ) and the bell goes from -infinite to infinite, we repeat the estraction in case the resulting value is out of the list indices range.
The code above produce the following count:

[209, 175, 112, 61, 37, 3, 2, 0, 0, 0]

As we can see, the values on the left are extracted more frequently than the one on the right.
Because we are instancing objects, it would be nice to get some instances for each object.
To achieve this, we can enlarge the width of the bell curve by increasing dev:

dev=3: [141, 126, 117, 82, 53, 38, 27, 12, 1, 2]

dev=5: [92, 94, 78, 86, 68, 64, 36, 35, 25, 21]

dev=10: [67, 69, 63, 65, 67, 52, 61, 59, 43, 53]

As the numbers above suggests, the higher the deviation, the more the values tend to be uniform. By playing with the dev value, we have a certain amount of freedom on driving the extraction frequency of each sample, thus the frequency of instancing of each object.
Even if this approach is mathematically correct, it lacks on control over the real count (you can try to change the seed and you will easily get different values) plus it gives unpredictable results when the population (the number of base object) is numerically comparable to the number of extractions (the object to instance).
For example, let’s see this case:

numObjs= 20, numSamples=30, dev=2:

[13, 7, 4, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

We see that the bell is not large enough to give at least one instance to the last elements. Let’s try to increase dev:

dev=10: [3, 6, 1, 2, 1, 2, 1, 0, 2, 2, 2, 1, 1, 0, 0, 0, 0, 1, 1, 3]

Now we get a more even distribution, but we see some 0s around: that means some objects will be never instanced and we can even see that the higher number is the second (while we expect to be the first one): that means the element we want to be more frequent is less frequent than the second one, by changing the seed, we could be more lucky, but for sure we don’t have full control of what’s going on.

The Modulo approach

Let’s follow another approach, using the Modulo operation.
If we want evenly distribute some objects extracted from a list, we can exctract the element at index:

index = #numObjs mod n

Where n represents the number of the current extraction:


import random

l = ["Obj1", "Obj2", "Obj3", "Obj4", "Obj5"]
count = [0,0,0,0,0]

numInstances = 10

for i in range(numInstances):
    index = i %len(l)
   
    #instance object l[index]
    #...
   
    print l[index]
    count[index] +=1

print count

This code produces the following extractions: Obj1, Obj2, .., Obj5, Obj1, Obj2, .. and obviously the count would be equal for all objects (where the number of extractions is a multiple of the number of elements).
To get a random and numerically stable extraction, we can shuffle our list:


import random

random.seed(10)
l = ["Obj1", "Obj2", "Obj3", "Obj4", "Obj5"]
random.shuffle(l)
count = [0,0,0,0,0]

numInstances = 10

for i in range(numInstances):
    index = i %len(l)
   
    #instance object l[index]
    #...
   
    print l[index]
    count[index] +=1

The last step is adding the information about the frequency.
Let’s set the frequency for each element and rebuild the list l by adding each element # times as much as its frequency. For instance:

f['Obj1'] = 3
f['Obj2'] = 1
f['Obj3'] = 2

Our list would be: l=['Obj1', 'Obj1', 'Obj1', 'Obj2', 'Obj3', 'Obj3']
Now by shuffling this list, we can use the modulo function for the estraction, preserving the frequencies set above:


import random

random.seed(10)

f=dict()

f["Obj1"] = 3
f["Obj2"] = 1
f["Obj3"] = 2

l = list()

for k, v in f.items():
    for i in range(v):
        l.append(k)

print "Base list: ", l

random.shuffle(l)

print "Shuffled list: ", l

count = dict()

count["Obj1"] = 0
count["Obj2"] = 0
count["Obj3"] = 0


numInstances = 6

for i in range(numInstances):
    index = i %len(l)
    
    #instance object l[index]
    #...
    
    count[l[index]] +=1

print "Objects count: ", count

The above code produces the following output:

Base list: ['Obj1', 'Obj1', 'Obj1', 'Obj3', 'Obj3', 'Obj2']
Shuffled list: ['Obj2', 'Obj1', 'Obj1', 'Obj3', 'Obj1', 'Obj3']
Objects count: {‘Obj1′: 3, ‘Obj3′: 2, ‘Obj2′: 1}

Let’s raise the number of instances (numInstances=600):

Objects count: {‘Obj1′: 300, ‘Obj3′: 200, ‘Obj2′: 100}

We can see that the frequency is preserved. Everything works so far, except one last thing: we are getting a pseudo-random extractions that repeats the sequence everytime we visit the whole list l.
To get a more random behaviour, we can shuffle the list everytime we restart by extracting the first element, i.e.: index == 0:


import random

random.seed(10)

f=dict()

f["Obj1"] = 3
f["Obj2"] = 1
f["Obj3"] = 2

l = list()

for k, v in f.items():
    for i in range(v):
        l.append(k)

print "Base list: ", l

random.shuffle(l)

print "Shuffled list: ", l

count = dict()

count["Obj1"] = 0
count["Obj2"] = 0
count["Obj3"] = 0


numInstances = 12

for i in range(numInstances):
    index = i %len(l)
    if index == 0:
        random.seed(i)
        random.shuffle(l)
        
    #instance object l[index]
    #...
    print l[index]
    count[l[index]] +=1

print "Objects count: ", count

And don’t forget to change your seed before the shuffle() operation, or you’ll get the same list over and over.
The code above produces the following sequence:

Obj1, Obj1, Obj2, Obj1, Obj3, Obj3, Obj1, Obj2, Obj1, Obj1, Obj3, Obj3

With Objects count: {‘Obj1′: 6, ‘Obj3′: 4, ‘Obj2′: 2}.

Finally we got a way to instantiate objects randomly but preserving control over their frequency.
I suggest to use the Gaussian approach only in case our base population is not numerically comparable to the number of instances.