API
Sobol
- pyscenarios.sobol(size, d0=0, *, chunks=None)
Sobol points generator based on Gray code order
This is a Python reimplementation of a C++ algorithm by Stephen Joe and Frances Y. Kuo, using directions from the file new-joe-kuo-6.21201 linked above.
- Parameters:
size (int | tuple[int, int]) – number of samples (cannot be greater than \(2^{32}\)) to extract from a single dimension, or tuple (samples, dimensions). To guarantee uniform distribution, the number of samples should always be \(2^{n} - 1\).
d0 (int) – first dimension. This can be used as a functional equivalent of a random seed. dimensions + d0 can’t be greater than max_sobol_dimensions() - 1.
chunks –
If omitted or None, return a NumPy array.
If not None, return a Dask array with the given chunk size. It can be anything accepted by Dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape (see result below). For example, either (16384, 50) or ((16384, 16383), (50, 50, 50)) could be used together with size=(32767, 150).
- Returns:
If size is an int, a 1-dimensional array of samples. If size is a tuple, a 2-dimensional array POINTS, where POINTS[i, j] is the ith sample of the jth dimension. Each dimension is a uniform (0, 1) distribution.
- Return type:
If chunks is not None, dask.array.Array; else numpy.ndarray
Note
This function will try accelerating the calculation with numba if installed, and fall back to a slower pure NumPy implementation otherwise.
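The two chunk specifications quoted for size=(32767, 150) describe the same layout, which can be checked by hand. The following is a minimal sketch, not part of pyscenarios or Dask: expand_chunks is a hypothetical helper that only handles the case of one integer chunk size per axis.

```python
def expand_chunks(shape, chunk_sizes):
    """Expand one chunk size per axis into explicit per-axis chunk lengths.

    Hypothetical helper for illustration; it is not part of pyscenarios or Dask.
    """
    result = []
    for length, size in zip(shape, chunk_sizes):
        full, remainder = divmod(length, size)
        axis = (size,) * full  # as many full-size chunks as fit
        if remainder:
            axis += (remainder,)  # one shorter trailing chunk, if needed
        result.append(axis)
    return tuple(result)


n = 15
samples = 2**n - 1  # sample count of the form 2^n - 1
explicit = expand_chunks((samples, 150), (16384, 50))
print(samples, explicit)
```

With n = 15 the sample count is 32767, and the compact form (16384, 50) expands to ((16384, 16383), (50, 50, 50)).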
Copulas
- pyscenarios.gaussian_copula(cov, samples, *, seed=0, chunks=None, rng='Mersenne Twister')
Gaussian Copula scenario generator.
Simplified algorithm:
>>> import numpy
>>> l = numpy.linalg.cholesky(cov)
>>> y = numpy.random.standard_normal(size=(samples, cov.shape[0]))
>>> p = (l @ y.T).T
- Parameters:
cov – covariance matrix, a.k.a. correlation matrix. It must be a Hermitian, positive-definite matrix in any square array-like format. The width of cov determines the number of dimensions of the output.
samples (int) –
Number of random samples to generate
Note
When using Sobol, to obtain a uniform distribution one must use \(2^{n} - 1\) samples (for any n > 0).
chunks –
Chunk size for the return array, which has shape (samples, dimensions). It can be anything accepted by Dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape.
Omit or set to None to return a NumPy array.
Warning
When using the Mersenne Twister random generator, the chunk size changes the random sequence. To guarantee repeatability, it must be fixed together with the seed. chunks=None also produces different results from using Dask.
seed (int) –
Random seed.
With rng='Sobol', this is the initial dimension; when generating multiple copulas with different seeds, one should never use seeds that are less than cov.shape[0] apart from each other.
The maximum seed when using Sobol is:
pyscenarios.max_sobol_dimensions() - cov.shape[0] - 1
rng (str) – Either Mersenne Twister or Sobol
- Returns:
array of shape (samples, dimensions), with all series being normal (0, 1) distributions.
- Return type:
If chunks is not None, dask.array.Array; else numpy.ndarray
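The simplified Gaussian copula algorithm can be run end to end with plain NumPy to see what the output looks like. This is a sketch under stated assumptions, not the library's implementation: the 2x2 matrix, the sample count, and the use of numpy.random.default_rng are illustrative choices, and neither chunking nor Sobol is involved.

```python
import numpy as np

# Illustrative inputs (assumptions, not taken from the documentation)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
samples = 100_000

rng = np.random.default_rng(0)  # plain NumPy generator, not the library's rng option
l = np.linalg.cholesky(cov)  # lower-triangular factor, cov == l @ l.T
y = rng.standard_normal(size=(samples, cov.shape[0]))  # independent N(0, 1) draws
p = (l @ y.T).T  # correlated draws, shape (samples, dimensions)

# The sample correlation should be close to the off-diagonal entry of cov
sample_corr = np.corrcoef(p, rowvar=False)
print(p.shape, sample_corr[0, 1])
```

Each column of p stays approximately normal (0, 1) because cov has a unit diagonal, while the correlation between columns approaches the requested 0.8 as the sample count grows.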
- pyscenarios.t_copula(cov, df, samples, seed=0, *, chunks=None, rng='Mersenne Twister')
Student T Copula / IT Copula scenario generator.
Simplified algorithm:
>>> import numpy
>>> import scipy.stats
>>> l = numpy.linalg.cholesky(cov)
>>> y = numpy.random.standard_normal(size=(samples, cov.shape[0]))
>>> p = (l @ y.T).T  # Gaussian Copula
>>> r = numpy.random.uniform(size=(samples, 1))
>>> s = scipy.stats.chi2.ppf(r, df=df)
>>> z = numpy.sqrt(df / s) * p
>>> u = scipy.stats.t.cdf(z, df=df)
>>> t = scipy.stats.norm.ppf(u)
- Parameters:
cov – covariance matrix, a.k.a. correlation matrix. It must be a Hermitian, positive-definite matrix in any square array-like format. The width of cov determines the number of dimensions of the output.
df – Number of degrees of freedom. Can be either a scalar int for Student T Copula, or a one-dimensional NumPy array or array-like with one point per dimension for IT Copula.
samples (int) –
Number of random samples to generate
Note
When using Sobol, to obtain a uniform distribution one must use \(2^{n} - 1\) samples (for any n > 0).
chunks –
Chunk size for the return array, which has shape (samples, dimensions). It can be anything accepted by Dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape.
Omit or set to None to return a NumPy array.
Warning
When using the Mersenne Twister random generator, the chunk size changes the random sequence. To guarantee repeatability, it must be fixed together with the seed. chunks=None also produces different results from using Dask.
seed (int) –
Random seed.
With rng='Sobol', this is the initial dimension; when generating multiple copulas with different seeds, one should never use seeds that are less than cov.shape[0] + 1 apart from each other.
The maximum seed when using Sobol is:
pyscenarios.max_sobol_dimensions() - cov.shape[0] - 2
rng (str) – Either Mersenne Twister or Sobol
- Returns:
array of shape (samples, dimensions), with all series being normal (0, 1) distributions.
- Return type:
If chunks is not None, dask.array.Array; else numpy.ndarray
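The Sobol seed rules for the two copulas differ by one: seeds must be at least cov.shape[0] apart for gaussian_copula and cov.shape[0] + 1 apart for t_copula, with matching maximum seeds. The sketch below turns those rules into arithmetic. sobol_seeds is a hypothetical helper, not a pyscenarios function, and max_dimensions is a stand-in for whatever pyscenarios.max_sobol_dimensions() returns.

```python
def sobol_seeds(n_copulas, dimensions, max_dimensions, student_t=False):
    """Return evenly spaced Sobol seeds that respect the documented limits.

    Hypothetical helper; max_dimensions stands in for the value returned by
    pyscenarios.max_sobol_dimensions().
    """
    # Minimum spacing: cov.shape[0] for gaussian_copula, cov.shape[0] + 1 for t_copula
    step = dimensions + 1 if student_t else dimensions
    # Maximum seed: max - cov.shape[0] - 1 (Gaussian) or max - cov.shape[0] - 2 (t)
    max_seed = max_dimensions - dimensions - (2 if student_t else 1)
    seeds = [i * step for i in range(n_copulas)]
    if seeds and seeds[-1] > max_seed:
        raise ValueError(f"seed {seeds[-1]} exceeds the maximum of {max_seed}")
    return seeds


# 1000 is an arbitrary stand-in for max_sobol_dimensions(), chosen for illustration
gaussian_seeds = sobol_seeds(3, dimensions=10, max_dimensions=1000)
t_seeds = sobol_seeds(3, dimensions=10, max_dimensions=1000, student_t=True)
print(gaussian_seeds, t_seeds)
```

For a 10-dimensional covariance matrix this yields seeds 0, 10, 20 for the Gaussian copula and 0, 11, 22 for the Student T copula.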
Statistical functions
- pyscenarios.tail_dependence(x, y, q)
Calculate tail dependence between vectors x and y.
- Parameters:
x – 1D array-like or Dask array containing samples from a uniform (0, 1) distribution.
y – other 1D array-like or Dask array to compare against
q – quantile(s) (0 < q < 1). Either a scalar or a ND array-like or Dask array.
- Returns:
Array of the same shape and type as q, containing:
\[\begin{cases} P(y < q \mid x < q) & q < 0.5 \\ P(y \geq q \mid x \geq q) & q \geq 0.5 \end{cases}\]
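The conditional probability above can be estimated from paired samples by counting. The following is a minimal pure-Python sketch for a scalar q. empirical_tail_dependence is a hypothetical helper written for illustration, and it is not claimed to match how pyscenarios.tail_dependence is implemented.

```python
def empirical_tail_dependence(x, y, q):
    """Estimate the tail conditional probability from paired samples.

    Hypothetical pure-Python helper for a scalar q; not the library's implementation.
    """
    if q < 0.5:
        # Lower tail: P(y < q | x < q)
        joint = sum(1 for xi, yi in zip(x, y) if xi < q and yi < q)
        marginal = sum(1 for xi in x if xi < q)
    else:
        # Upper tail: P(y >= q | x >= q)
        joint = sum(1 for xi, yi in zip(x, y) if xi >= q and yi >= q)
        marginal = sum(1 for xi in x if xi >= q)
    return joint / marginal if marginal else float("nan")


# Small made-up sample in (0, 1), for illustration only
x = [0.05, 0.15, 0.25, 0.55, 0.85, 0.95]
y = [0.10, 0.60, 0.20, 0.50, 0.90, 0.30]
lower = empirical_tail_dependence(x, y, 0.3)
upper = empirical_tail_dependence(x, y, 0.8)
print(lower, upper)
```

In this sample three x values fall below 0.3 and two of the paired y values do as well, so the lower-tail estimate is 2/3. Two x values are at or above 0.8 and one paired y value is, so the upper-tail estimate is 0.5.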