Simulation#

This module defines an abstract Simulation class. Data can be simulated either as a linear combination of basis functions or as multiple realizations of different types of Brownian motion.

Simulation class#

class FDApy.simulation.simulation.Simulation(basis_name: str, random_state: int | None = None)#

Bases: ABC

Class that defines functional data simulation.

Parameters:
basis_name: str

Name of the simulation.

random_state: int, default=None

A seed to initialize the random number generator.

Attributes:
data: DenseFunctionalData or MultivariateFunctionalData

An object that represents the simulated data.

noisy_data: DenseFunctionalData or MultivariateFunctionalData

An object that represents a noisy version of the simulated data.

sparse_data: IrregularFunctionalData or MultivariateFunctionalData

An object that represents a sparse version of the simulated data.

Methods

add_noise([noise_variance])

Add noise to functional data objects.

add_noise_and_sparsify([noise_variance, ...])

Generate a noisy and sparse version of functional data objects.

new(n_obs[, n_clusters, argvals])

Simulate a new set of curves.

sparsify([percentage, epsilon])

Generate a sparse version of functional data objects.

property basis_name: str#

Getter for basis_name.

abstract new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) None#

Simulate a new set of curves.

add_noise(noise_variance: float = 1.0) None#

Add noise to functional data objects.

This function generates an artificial noisy version of a functional data object of class DenseFunctionalData by adding realizations of Gaussian random variables \(\epsilon \sim \mathcal{N}(0, \sigma^2)\) to the observations. The variance \(\sigma^2\) can be supplied by the user. The generated data are given by

\[Y(t) = X(t) + \epsilon.\]
Parameters:
noise_variance: float, default=1.0

The variance \(\sigma^2\) of the Gaussian noise that is added to the data.
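The noise model above can be illustrated with a minimal sketch. It uses only the Python standard library and plain nested lists, so it is not FDApy's implementation; the function name `add_gaussian_noise` and the `seed` argument are hypothetical and chosen for illustration.

```python
import math
import random


def add_gaussian_noise(curves, noise_variance=1.0, seed=None):
    """Return a noisy copy of `curves` (a list of lists of floats)."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)  # gauss() expects a standard deviation
    # One independent draw of epsilon per observation point: Y(t) = X(t) + eps.
    return [[x + rng.gauss(0.0, sigma) for x in curve] for curve in curves]


# Two constant curves observed at 2000 points each.
clean = [[0.0] * 2000, [1.0] * 2000]
noisy = add_gaussian_noise(clean, noise_variance=0.25, seed=42)

# The empirical variance of the residuals should be close to 0.25.
residuals = [y - x for c, n in zip(clean, noisy) for x, y in zip(c, n)]
mean = sum(residuals) / len(residuals)
empirical_variance = sum((r - mean) ** 2 for r in residuals) / len(residuals)
```

Note that the parameter is a variance, so the standard deviation passed to the random generator is its square root.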

sparsify(percentage: float = 0.9, epsilon: float = 0.05) None#

Generate a sparse version of functional data objects.

This function generates an artificially sparsified version of a functional data object of class DenseFunctionalData. The user can supply the percentage of observation points to retain (expressed as a fraction, e.g. 0.9) and the uncertainty around it. Let \(p\) be this percentage and \(\epsilon\) the uncertainty value. The fraction of observation points retained differs from curve to curve and lies between \(p - \epsilon\) and \(p + \epsilon\).

Parameters:
percentage: float, default=0.9

The percentage of observations to be retained.

epsilon: float, default=0.05

The uncertainty around the percentage of observations to be retained.
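One way to read the description above is sketched below in plain Python. It assumes that the retained fraction is drawn uniformly between \(p - \epsilon\) and \(p + \epsilon\) for each curve and that points are then sampled without replacement; both choices are assumptions made for illustration, and the helper `sparsify_curve` is hypothetical rather than part of FDApy.

```python
import random


def sparsify_curve(argvals, values, percentage=0.9, epsilon=0.05, rng=None):
    """Keep a random subset of the observation points of one curve."""
    if rng is None:
        rng = random.Random()
    # Draw the retained fraction for this curve, clipped to [0, 1].
    low = max(0.0, percentage - epsilon)
    high = min(1.0, percentage + epsilon)
    fraction = rng.uniform(low, high)
    n_keep = int(len(argvals) * fraction)
    # Sample indices without replacement, then restore the ordering of the grid.
    kept = sorted(rng.sample(range(len(argvals)), n_keep))
    return [argvals[i] for i in kept], [values[i] for i in kept]


rng = random.Random(0)
argvals = [i / 100 for i in range(101)]
curves = [[t ** 2 for t in argvals] for _ in range(5)]
sparse = [sparsify_curve(argvals, c, 0.5, 0.1, rng) for c in curves]
sizes = [len(t) for t, _ in sparse]  # differs from curve to curve
```

Because each curve keeps its own subset of sampling points, the result no longer lives on a common grid, which is consistent with sparse_data being documented as IrregularFunctionalData.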

add_noise_and_sparsify(noise_variance: float = 1.0, percentage: float = 0.9, epsilon: float = 0.05) None#

Generate a noisy and sparse version of functional data objects.

This function generates an artificially noisy and sparse version of a functional dataset. It first generates the noisy version of the dataset and then sparsifies that noisy version.

Parameters:
noise_variance: float, default=1.0

The variance \(\sigma^2\) of the Gaussian noise that is added to the data.

percentage: float, default=0.9

The percentage of observations to be retained.

epsilon: float, default=0.05

The uncertainty around the percentage of observations to be retained.

Brownian motions#

class FDApy.simulation.brownian.Brownian(name: str, random_state: int | None = None)#

Bases: Simulation

Class that defines Brownian motion simulation.

Parameters:
name: str, {‘standard’, ‘geometric’, ‘fractional’}

Name of the Brownian motion type to simulate.

random_state: int, default=None

A seed to initialize the random number generator.

Attributes:
data: DenseFunctionalData

An object that represents the simulated data.

noisy_data: DenseFunctionalData

An object that represents a noisy version of the simulated data.

sparse_data: IrregularFunctionalData

An object that represents a sparse version of the simulated data.

Methods

add_noise([noise_variance])

Add noise to functional data objects.

add_noise_and_sparsify([noise_variance, ...])

Generate a noisy and sparse version of functional data objects.

new(n_obs[, n_clusters, argvals])

Simulate realizations of a Brownian motion.

sparsify([percentage, epsilon])

Generate a sparse version of functional data objects.

Notes

The sampling points have to be regularly spaced. Otherwise, the covariance of the generated data will not be correct.

The implementation is adapted from [1].

References

new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) None#

Simulate realizations of a Brownian motion.

This function generates n_obs realizations of a Brownian motion on a common grid argvals.

Parameters:
n_obs: int

Number of observations to simulate.

n_clusters: int

Not used.

argvals: Optional[npt.NDArray[np.float64]], shape=(n,)

Values at which the Brownian motions are evaluated. If None, the functions are evaluated on the interval \([0, 1]\) at \(21\) regularly spaced sampling points.

**kwargs:
init_point: float

Start value of the Brownian motion. For geometric Brownian motion, init_point should be strictly positive. The default value is 0 for standard Brownian motion and 1 for geometric Brownian motion.

mu: float, default=0

Interest rate (or percentage drift).

sigma: float, default=1

Diffusion coefficient (or percentage volatility).

hurst: float, default=0.5

Hurst parameter. If hurst = 0.5, the fractional Brownian motion is equivalent to the standard Brownian motion.
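The roles of init_point, mu and sigma can be illustrated with the textbook constructions of standard and geometric Brownian motion. The sketch below is standard library only and is not FDApy's implementation; the function names are hypothetical, and the fractional case is not covered.

```python
import math
import random


def standard_brownian(argvals, init_point=0.0, rng=None):
    """One path of standard Brownian motion on the grid `argvals`."""
    if rng is None:
        rng = random.Random()
    path = [init_point]
    for t_prev, t_next in zip(argvals, argvals[1:]):
        # Independent Gaussian increment with variance equal to the time step.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(t_next - t_prev)))
    return path


def geometric_brownian(argvals, init_point=1.0, mu=0.0, sigma=1.0, rng=None):
    """One path of geometric Brownian motion driven by a standard one."""
    w = standard_brownian(argvals, 0.0, rng)
    t0 = argvals[0]
    # S(t) = S(0) * exp((mu - sigma^2 / 2) * t + sigma * W(t))
    return [
        init_point * math.exp((mu - 0.5 * sigma ** 2) * (t - t0) + sigma * w_t)
        for t, w_t in zip(argvals, w)
    ]


rng = random.Random(42)
argvals = [i / 20 for i in range(21)]  # 21 regularly spaced points on [0, 1]
standard_paths = [standard_brownian(argvals, rng=rng) for _ in range(10)]
geometric_paths = [
    geometric_brownian(argvals, mu=0.1, sigma=0.5, rng=rng) for _ in range(10)
]
```

The exponential form shows why init_point must be strictly positive in the geometric case: the path is the start value multiplied by a positive factor, so a start value of 0 would give a path that is identically zero.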

Karhunen-Loève decomposition#

class FDApy.simulation.karhunen.KarhunenLoeve(n_functions: List[Tuple[int] | int] = 5, basis_name: List[Tuple[str] | str] | None = 'fourier', argvals: List[DenseArgvals] | None = None, basis: Basis | Sequence[Basis] | None = None, random_state: int | None = None, **kwargs_basis: Any)#

Bases: Simulation

Class that defines simulation based on Karhunen-Loève decomposition.

This class is used to simulate functional data \(X_1, \dots, X_N\) based on a truncated Karhunen-Loève decomposition:

\[X_i(t) = \sum_{k = 1}^K c_{i, k}\phi_{k}(t), \quad i = 1, \dots, N,\]

on one- or higher-dimensional domains. The eigenfunctions \(\phi_{k}(t)\) can be generated from different bases of functions or be user-defined. The scores \(c_{i, k}\) are simulated independently from normal distributions with zero mean and variance decreasing in \(k\). For higher-dimensional domains, the eigenfunctions are constructed as tensor products of marginal orthonormal function systems.
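The truncated decomposition can be sketched for a one-dimensional domain as follows. The sketch is standard library only and is not FDApy's implementation: the Fourier basis on \([0, 1]\) and the linearly decreasing eigenvalues are assumptions made for illustration, and the names `fourier_basis` and `simulate_kl` are hypothetical.

```python
import math
import random


def fourier_basis(k, t):
    """k-th function (k = 1, 2, ...) of the Fourier basis on [0, 1]."""
    if k == 1:
        return 1.0
    freq = k // 2
    if k % 2 == 0:
        return math.sqrt(2.0) * math.sin(2.0 * math.pi * freq * t)
    return math.sqrt(2.0) * math.cos(2.0 * math.pi * freq * t)


def simulate_kl(n_obs, n_functions, argvals, seed=None):
    """Simulate curves X_i(t) = sum_k c_ik * phi_k(t)."""
    rng = random.Random(seed)
    # Assumed linearly decreasing eigenvalues: 1, (K - 1) / K, ..., 1 / K.
    eigenvalues = [(n_functions - k) / n_functions for k in range(n_functions)]
    curves = []
    for _ in range(n_obs):
        # Independent centred Gaussian scores with variance lambda_k.
        scores = [rng.gauss(0.0, math.sqrt(lam)) for lam in eigenvalues]
        curves.append([
            sum(c * fourier_basis(k + 1, t) for k, c in enumerate(scores))
            for t in argvals
        ])
    return curves, eigenvalues


argvals = [i / 100 for i in range(101)]
curves, eigenvalues = simulate_kl(n_obs=10, n_functions=5, argvals=argvals, seed=1)
```

Since the basis functions are orthonormal on \([0, 1]\), the variance of each score is exactly the eigenvalue attached to the corresponding eigenfunction.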

Parameters:
n_functions: List[Union[Tuple[int], int]], default=5

Number of functions to use to generate the basis. See Basis and MultivariateBasis for more information.

basis_name: Optional[List[Union[Tuple[str], str]]]

Name of the basis to use. See Basis and MultivariateBasis for more information.

argvals: Optional[List[DenseArgvals]], default=None

The sampling points of the functional data.

basis: Optional[Union[Basis, MultivariateBasis]], default=None

Basis of functions, given as a Basis object. Use this parameter to supply a user-defined basis of functions.

random_state: int, default=None

A seed to initialize the random number generator.

Attributes:
data: Union[DenseFunctionalData, MultivariateFunctionalData]

An object that represents the simulated data.

noisy_data: Union[DenseFunctionalData, MultivariateFunctionalData]

An object that represents a noisy version of the simulated data.

sparse_data: Union[IrregularFunctionalData, MultivariateFunctionalData]

An object that represents a sparse version of the simulated data.

labels: npt.NDArray[np.float64], shape=(n_obs,)

The integer labels for cluster membership of each sample.

basis: Union[Basis, MultivariateBasis]

The eigenfunctions used to simulate the data.

eigenvalues: npt.NDArray[np.float64], shape=(n_functions,)

The eigenvalues used to simulate the data.

Methods

add_noise([noise_variance])

Add noise to functional data objects.

add_noise_and_sparsify([noise_variance, ...])

Generate a noisy and sparse version of functional data objects.

new(n_obs[, n_clusters, argvals])

Simulate realizations from Karhunen-Loève decomposition.

sparsify([percentage, epsilon])

Generate a sparse version of functional data objects.

Notes

In the case of multivariate functional data, \(X_i\) and \(\phi_{k}\) are vectors and, according to the multivariate Karhunen-Loève theorem (see, e.g., [1]), the coefficients do not depend on the component \(p\).

If the basis is user-defined, the object has to be an element of the class Basis and not just DenseFunctionalData or MultivariateFunctionalData.

References

[1]

Happ C. & Greven S. (2018), Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains. Journal of the American Statistical Association, 113, pp. 649–659.

new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) None#

Simulate realizations from Karhunen-Loève decomposition.

This function generates n_obs realizations of a Gaussian process using the Karhunen-Loève decomposition on a common grid argvals.

Parameters:
n_obs: int

Number of observations to simulate.

n_clusters: int, default=1

Number of clusters to generate.

argvals: None

Not used in this context: the argvals of the Basis object are used as the argvals of the simulation. The parameter is kept for compatibility with the Simulation class.

**kwargs:
centers: npt.NDArray[np.float64], shape=(n_features, n_clusters)

The centers of the clusters to generate. n_features corresponds to the number of functions within the basis.

cluster_std: npt.NDArray[np.float64], shape=(n_features, n_clusters)

The standard deviations of the clusters to generate. n_features corresponds to the number of functions within the basis.
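A possible reading of centers and cluster_std is that, for an observation in cluster \(j\), the score on the \(k\)-th basis function is drawn from a normal distribution with mean centers[k][j] and standard deviation cluster_std[k][j]. The sketch below illustrates this reading with the standard library only. It is an assumption, not FDApy's implementation, and the function `simulate_cluster_scores` and its balanced assignment of labels are hypothetical.

```python
import random


def simulate_cluster_scores(n_obs, centers, cluster_std, seed=None):
    """Draw scores c_ik and a cluster label for each observation.

    `centers` and `cluster_std` are nested lists of shape
    (n_features, n_clusters), matching the documented layout.
    """
    rng = random.Random(seed)
    n_features, n_clusters = len(centers), len(centers[0])
    labels, scores = [], []
    for i in range(n_obs):
        label = i % n_clusters  # assumption: balanced cluster sizes
        labels.append(label)
        scores.append([
            rng.gauss(centers[k][label], cluster_std[k][label])
            for k in range(n_features)
        ])
    return scores, labels


# Two clusters separated along the first basis function only.
centers = [[-2.0, 2.0], [0.0, 0.0], [0.0, 0.0]]
cluster_std = [[0.5, 0.5], [0.3, 0.3], [0.1, 0.1]]
scores, labels = simulate_cluster_scores(200, centers, cluster_std, seed=0)
```

With this layout, each column of centers describes one cluster in the space of scores, so clusters that differ only in their first row produce curves that differ only along the first basis function.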