Simulation
This module defines an abstract Simulation class. Data can be simulated either from a linear combination of basis functions or as multiple realizations of various types of Brownian motion.
Simulation class
- class FDApy.simulation.simulation.Simulation(basis_name: str, random_state: int | None = None)
Bases: ABC
Class that defines functional data simulation.
- Parameters:
- basis_name: str
Name of the simulation.
- random_state: int, default=None
A seed to initialize the random number generator.
- Attributes:
- data: DenseFunctionalData or MultivariateFunctionalData
An object that represents the simulated data.
- noisy_data: DenseFunctionalData or MultivariateFunctionalData
An object that represents a noisy version of the simulated data.
- sparse_data: IrregularFunctionalData or MultivariateFunctionalData
An object that represents a sparse version of the simulated data.
Methods
- add_noise([noise_variance]): Add noise to functional data objects.
- add_noise_and_sparsify([noise_variance, ...]): Generate a noisy and sparse version of functional data objects.
- new(n_obs[, n_clusters, argvals]): Simulate a new set of curves.
- sparsify([percentage, epsilon]): Generate a sparse version of functional data objects.
- property basis_name: str
Getter for basis_name.
- abstract new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) -> None
Simulate a new set of curves.
- add_noise(noise_variance: float = 1.0) -> None
Add noise to functional data objects.
This function generates an artificial noisy version of a functional data object of class DenseFunctionalData by adding realizations of Gaussian random variables \(\epsilon \sim \mathcal{N}(0, \sigma^2)\) to the observations. The variance \(\sigma^2\) can be supplied by the user. The generated data are given by
\[Y(t) = X(t) + \epsilon.\]
- Parameters:
- noise_variance: float, default=1.0
The variance \(\sigma^2\) of the Gaussian noise that is added to the data.
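The noise model \(Y(t) = X(t) + \epsilon\) can be illustrated with a minimal standard-library sketch. It is not FDApy's implementation: the helper `add_gaussian_noise` and the plain list-of-lists representation of curves are assumptions made for illustration only.

```python
import math
import random


def add_gaussian_noise(curves, noise_variance=1.0, seed=None):
    """Return noisy copies of the curves: Y(t) = X(t) + eps, eps ~ N(0, noise_variance)."""
    rng = random.Random(seed)
    # random.gauss expects a standard deviation, not a variance.
    sigma = math.sqrt(noise_variance)
    return [[x + rng.gauss(0.0, sigma) for x in curve] for curve in curves]


# Two toy curves observed on the same five sampling points (hypothetical data).
curves = [[0.0, 0.1, 0.2, 0.3, 0.4], [1.0, 0.9, 0.8, 0.7, 0.6]]
noisy = add_gaussian_noise(curves, noise_variance=0.25, seed=42)
```

Each observation receives its own independent draw of \(\epsilon\), so the noisy curves keep the shape of the originals and stay on the same sampling grid.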
- sparsify(percentage: float = 0.9, epsilon: float = 0.05) -> None
Generate a sparse version of functional data objects.
This function generates an artificially sparsified version of a functional data object of class DenseFunctionalData. The percentage of observation points retained, and the uncertainty around it, can be supplied by the user. Let \(p\) be the defined percentage and \(\epsilon\) be the uncertainty value. The number of observations retained differs from curve to curve, and the retained proportion lies between \(p - \epsilon\) and \(p + \epsilon\).
- Parameters:
- percentage: float, default=0.9
The percentage of observations to be retained.
- epsilon: float, default=0.05
The uncertainty around the percentage of observations to be retained.
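The roles of `percentage` and `epsilon` can be sketched as follows. This is a hedged illustration rather than the library's code: the helper `sparsify_curve` is hypothetical, and drawing the retained fraction uniformly on \([p - \epsilon, p + \epsilon]\) is an assumption.

```python
import random


def sparsify_curve(argvals, values, percentage=0.9, epsilon=0.05, rng=None):
    """Keep a random subset of points; the retained fraction lies in [p - eps, p + eps]."""
    rng = rng or random.Random()
    # Assumption: the retained fraction is drawn uniformly, clipped to [0, 1].
    low = max(0.0, percentage - epsilon)
    high = min(1.0, percentage + epsilon)
    fraction = rng.uniform(low, high)
    n_keep = max(1, round(fraction * len(argvals)))
    # Sample indices without replacement and keep them in increasing order.
    kept = sorted(rng.sample(range(len(argvals)), n_keep))
    return [argvals[i] for i in kept], [values[i] for i in kept]


rng = random.Random(0)
argvals = [i / 100 for i in range(101)]  # 101 sampling points on [0, 1]
curves = [[t**2 for t in argvals], [1.0 - t for t in argvals]]
sparse = [sparsify_curve(argvals, curve, rng=rng) for curve in curves]
```

Because every curve draws its own fraction and its own subset of points, the resulting curves no longer share a common grid, which is why the sparse version is stored as irregular functional data.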
- add_noise_and_sparsify(noise_variance: float = 1.0, percentage: float = 0.9, epsilon: float = 0.05) -> None
Generate a noisy and sparse version of functional data objects.
This function generates an artificially noisy and sparse version of a functional dataset. Starting from a functional dataset, it first generates the noisy version and then derives the sparse version from the noisy one.
- Parameters:
- noise_variance: float, default=1.0
The variance \(\sigma^2\) of the Gaussian noise that is added to the data.
- percentage: float, default=0.9
The percentage of observations to be retained.
- epsilon: float, default=0.05
The uncertainty around the percentage of observations to be retained.
Brownian motions
- class FDApy.simulation.brownian.Brownian(name: str, random_state: int | None = None)
Bases: Simulation
Class that defines Brownian motions simulation.
- Parameters:
- name: str, {‘standard’, ‘geometric’, ‘fractional’}
Name of the Brownian motion type to simulate.
- random_state: int, default=None
A seed to initialize the random number generator.
- Attributes:
- data: DenseFunctionalData
An object that represents the simulated data.
- noisy_data: DenseFunctionalData
An object that represents a noisy version of the simulated data.
- sparse_data: IrregularFunctionalData
An object that represents a sparse version of the simulated data.
Methods
- add_noise([noise_variance]): Add noise to functional data objects.
- add_noise_and_sparsify([noise_variance, ...]): Generate a noisy and sparse version of functional data objects.
- new(n_obs[, n_clusters, argvals]): Simulate realizations of a Brownian motion.
- sparsify([percentage, epsilon]): Generate a sparse version of functional data objects.
Notes
The sampling points have to be regularly spaced. Otherwise, the covariance of the generated data will not be correct.
The implementation is adapted from [1].
References
[1]
- new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) -> None
Simulate realizations of a Brownian motion.
This function generates n_obs realizations of a Brownian motion on a common grid argvals.
- Parameters:
- n_obs: int
Number of observations to simulate.
- n_clusters: int
Not used.
- argvals: Optional[npt.NDArray[np.float64]], shape=(n,)
Values at which Brownian motions are evaluated. If None, the functions are evaluated on the interval \([0, 1]\) with \(21\) regularly spaced sampling points.
- **kwargs:
- init_point: float
Start value of the Brownian motion. For geometric Brownian motion, init_point should be strictly positive. Default value is 0 for standard Brownian motion and 1 for geometric Brownian motion.
- mu: float, default=0
Interest rate (or percentage drift).
- sigma: float, default=1
Diffusion coefficient (or percentage volatility).
- hurst: float, default=0.5
Hurst parameter. If hurst = 0.5, the fractional Brownian motion is equivalent to the standard Brownian motion.
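To make the roles of `init_point`, `mu` and `sigma` concrete, here is a minimal sketch of the textbook constructions of standard and geometric Brownian motion on a regular grid. It is an assumption-laden illustration, not the code FDApy runs: the helpers `standard_brownian` and `geometric_brownian` are hypothetical, and the fractional case is left out.

```python
import math
import random


def standard_brownian(argvals, init_point=0.0, rng=None):
    """Simulate one standard Brownian path by cumulating independent Gaussian increments."""
    rng = rng or random.Random()
    path = [init_point]
    for t_prev, t_next in zip(argvals, argvals[1:]):
        # The increment over [t_prev, t_next] has variance equal to the step length.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(t_next - t_prev)))
    return path


def geometric_brownian(argvals, init_point=1.0, mu=0.0, sigma=1.0, rng=None):
    """Simulate one geometric Brownian path: S(t) = S0 * exp((mu - sigma^2 / 2) t + sigma W(t))."""
    w = standard_brownian(argvals, init_point=0.0, rng=rng)
    t0 = argvals[0]
    return [
        init_point * math.exp((mu - 0.5 * sigma**2) * (t - t0) + sigma * w_t)
        for t, w_t in zip(argvals, w)
    ]


rng = random.Random(0)
argvals = [i / 20 for i in range(21)]  # 21 regularly spaced points on [0, 1]
paths = [standard_brownian(argvals, rng=rng) for _ in range(10)]
gbm = geometric_brownian(argvals, init_point=1.0, mu=0.1, sigma=0.5, rng=rng)
```

The geometric path is an exponential of a Gaussian path scaled by `init_point`, which is why a strictly positive start value keeps the whole trajectory positive.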
Karhunen-Loève decomposition
- class FDApy.simulation.karhunen.KarhunenLoeve(n_functions: List[Tuple[int] | int] = 5, basis_name: List[Tuple[str] | str] | None = 'fourier', argvals: List[DenseArgvals] | None = None, basis: Basis | Sequence[Basis] | None = None, random_state: int | None = None, **kwargs_basis: Any)
Bases: Simulation
Class that defines simulation based on Karhunen-Loève decomposition.
This class is used to simulate functional data \(X_1, \dots, X_N\) based on a truncated Karhunen-Loève decomposition:
\[X_i(t) = \sum_{k = 1}^K c_{i, k}\phi_{k}(t), \quad i = 1, \dots, N,\]
on one- or higher-dimensional domains. The eigenfunctions \(\phi_{k}(t)\) can be generated using different basis functions or be user-defined. The scores \(c_{i, k}\) are simulated independently from a normal distribution with zero mean and decreasing variance. For higher-dimensional domains, the eigenfunctions are constructed as tensors of marginal orthonormal function systems.
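The truncated expansion can be sketched on a one-dimensional domain with the standard library alone. Everything specific here is an assumption chosen for illustration and not taken from FDApy: the helpers `fourier_basis` and `simulate_kl`, the orthonormal Fourier system on \([0, 1]\), and the linearly decreasing eigenvalues.

```python
import math
import random


def fourier_basis(n_functions, argvals):
    """Evaluate the first n_functions orthonormal Fourier functions on [0, 1]."""
    basis = []
    for k in range(n_functions):
        if k == 0:
            basis.append([1.0 for _ in argvals])
        elif k % 2 == 1:
            freq = (k + 1) // 2
            basis.append([math.sqrt(2) * math.sin(2 * math.pi * freq * t) for t in argvals])
        else:
            freq = k // 2
            basis.append([math.sqrt(2) * math.cos(2 * math.pi * freq * t) for t in argvals])
    return basis


def simulate_kl(n_obs, n_functions, argvals, seed=None):
    """Simulate X_i(t) = sum_k c_{i,k} phi_k(t) with c_{i,k} ~ N(0, lambda_k)."""
    rng = random.Random(seed)
    basis = fourier_basis(n_functions, argvals)
    # Assumed linearly decreasing eigenvalues: 1, (K - 1) / K, ..., 1 / K.
    eigenvalues = [(n_functions - k) / n_functions for k in range(n_functions)]
    curves = []
    for _ in range(n_obs):
        # Independent zero-mean scores whose variance is the matching eigenvalue.
        scores = [rng.gauss(0.0, math.sqrt(lam)) for lam in eigenvalues]
        curves.append([
            sum(c * phi[j] for c, phi in zip(scores, basis))
            for j in range(len(argvals))
        ])
    return curves, eigenvalues


argvals = [i / 50 for i in range(51)]
curves, eigenvalues = simulate_kl(n_obs=3, n_functions=5, argvals=argvals, seed=0)
```

Since the variance of the scores shrinks with \(k\), the leading eigenfunctions dominate the shape of each simulated curve.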
- Parameters:
- n_functions: List[Union[Tuple[int], int]], default=5
Number of functions to use to generate the basis. See Basis and MultivariateBasis for more information.
- basis_name: Optional[List[Union[Tuple[str], str]]], default='fourier'
Name of the basis to use. See Basis and MultivariateBasis for more information.
- argvals: Optional[List[DenseArgvals]], default=None
The sampling points of the functional data.
- basis: Optional[Union[Basis, MultivariateBasis]], default=None
Basis of functions as a Basis object. Use it to supply a user-defined basis of functions.
- random_state: int, default=None
A seed to initialize the random number generator.
- Attributes:
- data: Union[DenseFunctionalData, MultivariateFunctionalData]
An object that represents the simulated data.
- noisy_data: Union[DenseFunctionalData, MultivariateFunctionalData]
An object that represents a noisy version of the simulated data.
- sparse_data: Union[IrregularFunctionalData, MultivariateFunctionalData]
An object that represents a sparse version of the simulated data.
- labels: npt.NDArray[np.float64], shape=(n_obs,)
The integer labels for cluster membership of each sample.
- basis: Union[Basis, MultivariateBasis]
The eigenfunctions used to simulate the data.
- eigenvalues: npt.NDArray[np.float64], shape=(n_functions,)
The eigenvalues used to simulate the data.
Methods
- add_noise([noise_variance]): Add noise to functional data objects.
- add_noise_and_sparsify([noise_variance, ...]): Generate a noisy and sparse version of functional data objects.
- new(n_obs[, n_clusters, argvals]): Simulate realizations from Karhunen-Loève decomposition.
- sparsify([percentage, epsilon]): Generate a sparse version of functional data objects.
Notes
In the case of multivariate functional data, \(X_i\) and \(\phi_{k}\) are vectors and, according to the multivariate Karhunen-Loève theorem (see, e.g., [1]), the coefficients do not depend on the component \(p\).
If the basis is user-defined, the object has to be an element of the class Basis and not just DenseFunctionalData or MultivariateFunctionalData.
References
[1]Happ C. & Greven S. (2018), Multivariate Functional Principal Component Analysis for Data Observed on Different (Dimensional) Domains. Journal of the American Statistical Association, 113, pp. 649–659.
- new(n_obs: int, n_clusters: int = 1, argvals: ndarray[Any, dtype[float64]] | None = None, **kwargs) -> None
Simulate realizations from Karhunen-Loève decomposition.
This function generates n_obs realizations of a Gaussian process using the Karhunen-Loève decomposition on a common grid argvals.
- Parameters:
- n_obs: int
Number of observations to simulate.
- n_clusters: int, default=1
Number of clusters to generate.
- argvals: None
Not used in this context. The argvals of the Basis object are used as the argvals of the simulation. The parameter is kept for compatibility with the Simulation class.
- **kwargs:
- centers: npt.NDArray[np.float64], shape=(n_features, n_clusters)
The centers of the clusters to generate. The n_features correspond to the number of functions within the basis.
- cluster_std: npt.NDArray[np.float64], shape=(n_features, n_clusters)
The standard deviation of the clusters to generate. The n_features correspond to the number of functions within the basis.
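One way to read the shapes of `centers` and `cluster_std` is that each column describes the score distribution of one cluster. The sketch below follows that reading under stated assumptions: the helper `simulate_cluster_scores` is hypothetical, the round-robin allocation of observations to clusters is an assumption, and nested lists stand in for arrays of shape (n_features, n_clusters).

```python
import random


def simulate_cluster_scores(n_obs, centers, cluster_std, seed=None):
    """Draw scores c_{i,k} ~ N(centers[k][g], cluster_std[k][g]^2) for observation i in cluster g."""
    rng = random.Random(seed)
    n_features, n_clusters = len(centers), len(centers[0])
    # Assumption: observations are allocated to clusters in a balanced, round-robin way.
    labels = [i % n_clusters for i in range(n_obs)]
    scores = [
        [rng.gauss(centers[k][g], cluster_std[k][g]) for k in range(n_features)]
        for g in labels
    ]
    return scores, labels


# Hypothetical setting: 3 basis functions (rows) and 2 clusters (columns).
centers = [[0.0, 5.0], [0.0, -5.0], [0.0, 0.0]]
cluster_std = [[1.0, 1.0], [0.5, 0.5], [0.1, 0.1]]
scores, labels = simulate_cluster_scores(6, centers, cluster_std, seed=0)
```

Multiplying each score vector by the basis functions then yields curves whose cluster structure lives in the scores rather than in the sampling grid.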