API

Sobol

Sobol points generator based on Gray code order

Parameters:

size – number of samples (cannot be greater than \(2^{32}\)) to extract from a single dimension, or tuple (samples, dimensions). To guarantee uniform distribution, the number of samples should always be \(2^{n} - 1\).
d0 (int) – first dimension. This can be used as a functional equivalent of a a random seed. dimensions + d0 can’t be greater than max_sobol_dimensions() - 1.
chunks –
If None, return a numpy array.

If set, return a dask array with the given chunk size. It can be anything accepted by dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape (see result below). e.g. either (16384, 50) or ((16384, 16383), (50, 50, 50)) could be used together with size=(32767, 150).

Note

The algorithm is not efficient if there are multiple chunks on axis 0. However, if you do need them, it is typically better to require them here than re-chunking afterwards, particularly if (most of) the subsequent algorithm is embarassingly parallel.

Returns:

If size is an int, a 1-dimensional array of samples. If size is a tuple, a 2-dimensional array POINTS, where POINTS[i, j] is the ith sample of the jth dimension. Each dimension is a uniform (0, 1) distribution.

Return type:

If chunks is not None, dask.array.Array; else numpy.ndarray

pyscenarios.max_sobol_dimensions() → int: Return number of dimensions available. When invoking sobol(), size[1] + d0 must be smaller than this.

Copulas

Gaussian Copula scenario generator.

Simplified algorithm:

>>> l = numpy.linalg.cholesky(cov)
>>> y = numpy.random.standard_normal(size=(samples, cov.shape[0]))
>>> p = (l @ y.T).T

Parameters:

cov (numpy.ndarray) – covariance matrix, a.k.a. correlation matrix. It must be a Hermitian, positive-definite matrix in any square array-like format. The width of cov determines the number of dimensions of the output.
samples (int) –
Number of random samples to generate

Note

When using Sobol, to obtain a uniform distribution one must use \(2^{n} - 1\) samples (for any n > 0).
chunks –
Chunk size for the return array, which has shape (samples, dimensions). It can be anything accepted by dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape.

Set to None to return a numpy array.

Warning

When using the Mersenne Twister random generator, the chunk size changes the random sequence. To guarantee repeatability, it must be fixed together with the seed. chunks=None also produces different results from using dask.
seed (int) –
Random seed.

With rng='Sobol', this is the initial dimension; when generating multiple copulas with different seeds, one should never use seeds that are less than cov.shape[0] apart from each other.

The maximum seed when using sobol is:
```
pyscenarios.sobol.max_sobol_dimensions() - cov.shape[0] - 1
```
rng (str) – Either Mersenne Twister or Sobol

Returns:

array of shape (samples, dimensions), with all series being normal (0, 1) distributions.

Return type:

If chunks is not None, dask.array.Array; else numpy.ndarray

Student T Copula / IT Copula scenario generator.

Simplified algorithm:

>>> l = numpy.linalg.cholesky(cov)
>>> y = numpy.random.standard_normal(size=(samples, cov.shape[0]))
>>> p = (l @ y.T).T  # Gaussian Copula
>>> r = numpy.random.uniform(size=(samples, 1))
>>> s = scipy.stats.chi2.ppf(r, df=df)
>>> z = numpy.sqrt(df / s) * p
>>> u = scipy.stats.t.cdf(z, df=df)
>>> t = scipy.stats.norm.ppf(u)

Parameters:

cov (numpy.ndarray) – covariance matrix, a.k.a. correlation matrix. It must be a Hermitian, positive-definite matrix in any square array-like format. The width of cov determines the number of dimensions of the output.
df – Number of degrees of freedom. Can be either a scalar int for Student T Copula, or a one-dimensional array-like with one point per dimension for IT Copula.
samples (int) –
Number of random samples to generate

Note

When using Sobol, to obtain a uniform distribution one must use \(2^{n} - 1\) samples (for any n > 0).
chunks –
Chunk size for the return array, which has shape (samples, dimensions). It can be anything accepted by dask (a positive integer, a tuple of two ints, or a tuple of two tuples of ints) for the output shape.

Set to None to return a numpy array.

Warning

When using the Mersenne Twister random generator, the chunk size changes the random sequence. To guarantee repeatability, it must be fixed together with the seed. chunks=None also produces different results from using dask.
seed (int) –
Random seed.

With rng='Sobol', this is the initial dimension; when generating multiple copulas with different seeds, one should never use seeds that are less than cov.shape[0] + 1 apart from each other.

The maximum seed when using sobol is:
```
pyscenarios.sobol.max_sobol_dimensions() - cov.shape[0] - 2
```
rng (str) – Either Mersenne Twister or Sobol

Returns:

array of shape (samples, dimensions), with all series being normal (0, 1) distributions.

Return type:

If chunks is not None, dask.array.Array; else numpy.ndarray

Statistical functions

pyscenarios.tail_dependence(x: Any, y: Any, q: Any) → ndarray | Array

Calculate tail dependence between vectors x and y.

Parameters:

x – 1D array-like or dask array containing samples from a uniform (0, 1) distribution.
y – other array to compare against
q – quantile(s) (0 < q < 1). Either a scalar or a ND array-like or dask array.

Returns:

array of the same shape and type as q, containing:

\[\cases{ P(y < q | x < q) | q < 0.5 \cr P(y \geq q | x \geq q) | q \geq 0.5 }\]