Introduction to complex systems

Probability distribution function (PDF)

The tail of a PDF is linked to its kurtosis. Kurtosis describes how the values are shared between the central part of the law and its tails: the larger the kurtosis, the more probability is carried by extreme values, that is, values far from the mean. In this exercise we will compare the tail of an empirical probability distribution with that of a normal distribution, using the following data set:

data.odt
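To make the link between kurtosis and tails concrete, here is a minimal sketch that estimates the excess kurtosis (fourth standardized moment minus 3, which is zero for a normal law). The function name excess_kurtosis and the two synthetic samples are assumptions made for illustration; they are not part of the exercise data.

```python
import numpy as np

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a normal distribution)."""
    values = np.asarray(values, dtype=float)
    centered = values - values.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

# Synthetic samples, used only for illustration (not the exercise data)
rng = np.random.default_rng(0)
gaussian_sample = rng.normal(size=100_000)
heavy_sample = rng.laplace(size=100_000)  # Laplace law: heavier tails than a Gaussian

print("Gaussian sample:", excess_kurtosis(gaussian_sample))
print("Laplace sample: ", excess_kurtosis(heavy_sample))
```

The Gaussian sample should give a value close to zero, while the heavier-tailed sample should give a clearly positive value.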

Question

Using the data set, calculate:

  • The mean value

  • The variance

  • The standard deviation
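As a small illustration of these three quantities, the sketch below computes them with NumPy on a short hypothetical array (not the exercise data). Note that np.var divides by n by default; passing ddof=1 divides by n - 1 instead.

```python
import numpy as np

# Hypothetical values standing in for the exercise data
sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(sample)
variance_population = np.var(sample)           # divides by n (NumPy default, ddof=0)
variance_sample = np.var(sample, ddof=1)       # divides by n - 1
std_population = np.sqrt(variance_population)  # same value as np.std(sample)

print(mean, variance_population, variance_sample, std_population)
```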

Question

Using the given data set, compare the empirical PDF with the normal distribution.

Hint

The empirical PDF is calculated using the Weibull distribution.

Solution

Python script

"""
@author: yacine.mezemate
"""
import numpy as np
import matplotlib.pyplot as plt

# Import data
data = np.genfromtxt("data.txt")  # whitespace-delimited by default
data = np.diff(data)  # use the differences between consecutive values

# Statistics
mu = np.mean(data)  # mean
Var = np.var(data)  # variance
std = np.sqrt(Var)  # standard deviation

# Calculate and plot the empirical PDF using Weibull distribution
neg = np.sort(data[data < 0])  # sort negative values
pos = np.sort(data[data > 0])  # sort positive values
P_neg = np.arange(len(neg), dtype=np.double) / len(neg)  # probability of negative values
P_pos = np.arange(len(pos), dtype=np.double) / len(pos)  # probability of positive values
# Skip the first point: its probability is 0, so its logarithm is undefined
plt.plot(neg[1:], np.log(P_neg[1:]), marker='+', label="Empirical Estimation")  # plot PDF

# Calculate and plot Normal PDF
mn = np.min(data)  # min value
mx = np.max(data)  # max value
x = np.linspace(mn, 0, 100)  # 100 evaluation points from the minimum up to 0
# Log of the normal density, written out with NumPy because
# matplotlib.mlab.normpdf has been removed from recent Matplotlib releases
Gauss = -0.5 * ((x - mu) / std) ** 2 - np.log(std * np.sqrt(2 * np.pi))
plt.plot(x, Gauss, label="Normal distribution")  # plot PDF

# Plot information
plt.legend(loc='lower left')
plt.title("Probability distribution", fontsize=18)
plt.ylabel(r"$\log(Pr(\Delta v))$", fontsize=14)
plt.xlabel(r"$\Delta v$", fontsize=14)
plt.show()

Comparison between the empirical probability distribution and a normal distribution for a geophysical field

The plot is logarithmic so as to emphasize the heavy tail of the distribution. It shows that the normal distribution does not fit the empirical one. In complex systems, such as those studied in geophysics, extreme values cannot be captured by a Gaussian distribution.
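The same conclusion can be checked numerically by comparing the observed frequency of an extreme value with the probability a Gaussian of the same mean and standard deviation would assign to it. The sketch below is only an illustration: the increments are synthetic (drawn from a Student t law with 3 degrees of freedom, an assumption standing in for a heavy-tailed field), and the helper gaussian_tail_probability and the 5-standard-deviation threshold are hypothetical choices.

```python
import math
import numpy as np

def gaussian_tail_probability(threshold, mu, sigma):
    """P(X < threshold) for a normal law with mean mu and standard deviation sigma."""
    return 0.5 * math.erfc((mu - threshold) / (sigma * math.sqrt(2.0)))

# Synthetic heavy-tailed increments, used only for illustration (not the exercise data)
rng = np.random.default_rng(1)
increments = rng.standard_t(df=3, size=200_000)

mu = increments.mean()
sigma = increments.std()
threshold = mu - 5.0 * sigma  # an "extreme" negative value, 5 standard deviations below the mean

empirical = np.mean(increments < threshold)                 # observed frequency
gaussian = gaussian_tail_probability(threshold, mu, sigma)  # Gaussian prediction

print("Empirical tail probability:", empirical)
print("Gaussian tail probability: ", gaussian)
```

For this synthetic sample the observed frequency is several orders of magnitude larger than the Gaussian prediction, which is the kind of gap the logarithmic plot makes visible.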
