Python Data Science NumPy Random Zipf Data Distribution
Zipf Distribution: The Zipf distribution is used to sample data based on Zipf's law.
Zipf's Law: states that in a collection, the nth most common term occurs about 1/n times as often as the most common term.
E.g. the 5th most common word in English occurs roughly 1/5 as often as the most used word.
The random.zipf() method has the following two parameters.
1. a - distribution parameter. Must be greater than 1.
2. size - The shape of the returned array.
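To make the role of the parameter a more concrete, here is a minimal sketch. It assumes NumPy's newer Generator interface (default_rng), which the original examples do not use, and the seed and sample size are arbitrary choices. NumPy's Zipf sampler gives the value k a probability proportional to k**(-a), so with a=2 the value 2 should appear about 1/4 as often as the value 1, and the value 3 about 1/9 as often. The plain 1/n form of Zipf's law corresponds to a approaching 1, which the sampler does not accept.

```python
import numpy as np

# Seeded generator so the sketch is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(0)

a = 2.0
samples = rng.zipf(a, size=200_000)

# Count how often the values 1, 2 and 3 were drawn.
counts = np.array([(samples == k).sum() for k in (1, 2, 3)])

# Observed frequency of each value relative to the value 1.
observed = counts / counts[0]

# Ratios implied by a probability proportional to k**(-a).
expected = np.array([1.0, 2.0, 3.0]) ** -a

print(observed)
print(expected)
```

The two printed arrays should be close to each other, and a larger a makes the frequencies fall off faster.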
Example 1: Draw a sample from a Zipf distribution with a distribution parameter of 2 and a size of 2x4.
Code
from numpy import random
x = random.zipf(a=2, size=(2, 4))
print(x)
The output will be
[[26  1 25  1]
 [ 1  1  2 33]]
Note: The output may differ on every run because the values are generated randomly.
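If repeatable output is needed, one option is to seed the random number generator. The sketch below is an illustration rather than part of the original example: it assumes NumPy's Generator interface (default_rng), and the seed 42 is an arbitrary choice.

```python
import numpy as np

# Two generators created with the same seed produce the same stream of values.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

first = rng_a.zipf(a=2, size=(2, 4))
second = rng_b.zipf(a=2, size=(2, 4))

print(first)
# Element-wise comparison of the two samples.
print(np.array_equal(first, second))
```

Running the script again with the same seed should print the same array each time.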
Visualization of Zipf Distribution
Sample 100000 points, but plot only those with a value < 10 for a more meaningful chart.
Example 2
Code
from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns
x = random.zipf(a=2, size=100000)
# histplot replaces distplot, which is deprecated since seaborn 0.11
sns.histplot(x[x < 10], discrete=True)
plt.show()
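The output will be a histogram in which the bar for the value 1 is by far the tallest and the bars shrink quickly as the value increases.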
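The same information can also be inspected as plain numbers, without a plotting library. The sketch below assumes NumPy's Generator interface with an arbitrary seed, and counts how many of the sampled points fall on each value from 1 to 9.

```python
import numpy as np

# Seeded generator (arbitrary seed) so the counts are repeatable.
rng = np.random.default_rng(0)
x = rng.zipf(a=2, size=100_000)

# Keep only the values below 10, as in the chart.
small = x[x < 10]

# bincount returns one count per integer starting at 0; drop the unused 0 slot.
counts = np.bincount(small, minlength=10)[1:]

for value, count in enumerate(counts, start=1):
    print(value, count)
```

Each printed row is one bar of the histogram: the value and the number of samples equal to it.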