IrregularFunctionalData
- class FDApy.representation.IrregularFunctionalData(argvals, values)
Represent irregularly sampled functional data.
- Parameters:
argvals (IrregularArgvals) – The sampling points of the functional data. Each entry of the dictionary represents an input dimension. Then, each dimension is a dictionary where entries are the different observations. So, the observation \(i\) for the dimension \(j\) is a np.ndarray with shape \((m^i_j,)\) for \(0 \leq i \leq n\) and \(0 \leq j \leq p\).
values (IrregularValues) – The values of the functional data. Each entry of the dictionary is an observation of the process. Observation \(i\) is represented by a np.ndarray of shape \((m^i_1, \dots, m^i_p)\), matching its sampling points. It should not contain any missing values.
- Attributes:
argvals_stand (IrregularArgvals) – Standardized sampling points of the functional data.
n_obs (int) – Number of observations of the functional data.
n_dimension (int) – Number of input dimensions of the functional data.
n_points (Dict[int, Tuple[int, …]]) – Number of sampling points.
Examples
For 1-dimensional irregular data:
>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> IrregularFunctionalData(argvals, values)
For 2-dimensional irregular data:
>>> argvals = IrregularArgvals({
...     0: DenseArgvals({
...         'input_dim_0': np.array([1, 2, 3, 4]),
...         'input_dim_1': np.array([5, 6, 7])
...     }),
...     1: DenseArgvals({
...         'input_dim_0': np.array([2, 4]),
...         'input_dim_1': np.array([1, 2, 3])
...     }),
...     2: DenseArgvals({
...         'input_dim_0': np.array([4, 5, 6]),
...         'input_dim_1': np.array([8, 9])
...     })
... })
>>> values = IrregularValues({
...     0: np.array([[1, 2, 3], [4, 1, 2], [3, 4, 1], [2, 3, 4]]),
...     1: np.array([[1, 2, 3], [1, 2, 3]]),
...     2: np.array([[8, 9], [8, 9], [8, 9]])
... })
>>> IrregularFunctionalData(argvals, values)
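The attributes listed above can be read off the nested structure of the sampling points. The sketch below uses plain dictionaries and lists instead of the FDApy classes, and the helper name `describe` is hypothetical. It only illustrates how n_obs, n_dimension and n_points relate to the argvals of the 2-dimensional example, and it is not the library's implementation.

```python
# Plain-Python stand-in for the 2-dimensional example (not the FDApy classes).
argvals = {
    0: {"input_dim_0": [1, 2, 3, 4], "input_dim_1": [5, 6, 7]},
    1: {"input_dim_0": [2, 4], "input_dim_1": [1, 2, 3]},
    2: {"input_dim_0": [4, 5, 6], "input_dim_1": [8, 9]},
}


def describe(argvals):
    """Return (n_obs, n_dimension, n_points) for nested argvals (hypothetical helper)."""
    n_obs = len(argvals)
    # Assumption: every observation shares the same input dimensions.
    n_dimension = len(next(iter(argvals.values())))
    # One tuple of grid sizes per observation, one entry per input dimension.
    n_points = {
        obs: tuple(len(points) for points in dims.values())
        for obs, dims in argvals.items()
    }
    return n_obs, n_dimension, n_points


n_obs, n_dimension, n_points = describe(argvals)
print(n_obs, n_dimension, n_points)
```

Each tuple in n_points matches the shape of the corresponding values array in the example, for instance a 4 by 3 array for observation 0.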
References
Methods
center([mean, method_smoothing]) – Center the data.
concatenate(*fdata) – Concatenate IrregularFunctionalData objects.
covariance([points, method_smoothing, ...]) – Compute an estimate of the covariance function.
inner_product([method_integration, ...]) – Compute the inner product matrix of the data.
mean([points, method_smoothing, approx]) – Compute an estimate of the mean.
noise_variance([order]) – Estimate the variance of the noise.
norm([squared, method_integration, ...]) – Norm of each observation of the data.
normalize(**kwargs) – Normalize the data.
rescale([weights, method_integration, ...]) – Rescale the data.
smooth([points, method, bandwidth, penalty]) – Smooth the data.
standardize([center]) – Standardize the data.
to_basis([points, method, penalty]) – Convert the data to basis format.
to_long([reindex]) – Convert the data to long format.
- center(mean=None, method_smoothing='LP', **kwargs)
Center the data.
- Parameters:
mean (DenseFunctionalData | None) – A precomputed mean as a DenseFunctionalData object.
method_smoothing (str) – The method used for smoothing the mean. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
kwargs – Other keyword arguments are passed to one of the following functions:
IrregularFunctionalData.mean() (if mean=None) and IrregularFunctionalData.smooth().
- Returns:
The centered version of the data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.add_noise_and_sparsify(0.01, 0.95)
>>> kl.sparse_data.center(smooth=True)
Functional data object with 10 observations on a 1-dimensional support.
- static concatenate(*fdata)
Concatenate IrregularFunctionalData objects.
- Parameters:
fdata (IrregularFunctionalData) – Functional data to concatenate.
- Returns:
The concatenated objects.
- Return type:
- covariance(points=None, method_smoothing='LP', center=True, smooth=True, kwargs_center={}, **kwargs)
Compute an estimate of the covariance function.
This function computes an estimate of the covariance surface of an IrregularFunctionalData object. As the curves are not sampled on a common grid, we use the method in [8].
- Parameters:
points (DenseArgvals | None) – The sampling points at which the covariance is estimated. If None, the concatenation of the IrregularArgvals of the IrregularFunctionalData is used.
method_smoothing (str) – The method used for smoothing the mean. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
center (bool) – Should the data be centered before computing the covariance?
smooth (bool) – Should the covariance be smoothed?
kwargs_center (Dict[str, object]) – Keyword arguments to be passed to the function
FunctionalData.center().
kwargs – Other keyword arguments are passed to the following function:
FunctionalData._smooth_covariance().
- Returns:
An estimate of the covariance as a two-dimensional DenseFunctionalData object.
- Return type:
DenseFunctionalData
- Raises:
NotImplementedError – Not implemented for higher-dimensional data.
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.sparsify(percentage=0.5, epsilon=0.05)
>>> kl.sparse_data.covariance()
Functional data object with 1 observations on a 2-dimensional support.
- inner_product(method_integration='trapz', method_smoothing='LP', noise_variance=None, **kwargs)
Compute the inner product matrix of the data.
The inner product matrix is an n_obs by n_obs matrix where each entry is defined as
\[\langle x, y \rangle = \int_{\mathcal{T}} x(t)y(t)dt, \quad t \in \mathcal{T},\]
where \(\mathcal{T}\) is a one- or multi-dimensional domain [1].
- Parameters:
method_integration (str) – The method used to integrate.
method_smoothing (str) – The method used for smoothing the mean.
noise_variance (float | None) – An estimation of the variance of the noise. If None, an estimation is computed using the methodology in [5].
kwargs – Other keyword arguments are passed to the following function:
IrregularFunctionalData.center().
- Returns:
Inner product matrix of the data.
- Return type:
npt.NDArray[np.float64], shape=(n_obs, n_obs)
- Raises:
NotImplementedError – Not implemented for higher-dimensional data.
Examples
For one-dimensional functional data:
>>> kl = KarhunenLoeve(
...     basis_name='bsplines', n_functions=5, random_state=5
... )
>>> kl.new(n_obs=3)
>>> kl.sparsify(percentage=0.8, epsilon=0.05)
>>> kl.sparse_data.inner_product(noise_variance=0)
array([
    [ 0.15749721,  0.01983093, -0.09607059],
    [ 0.01983093,  0.17937531, -0.24773228],
    [-0.09607059, -0.24773228,  0.41648575]
])
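To make the integral in the definition concrete, here is a minimal pure-Python sketch of a trapezoidal inner product. It assumes both curves are already available on the same grid, which is not the case for raw irregular data. The helper names `trapezoid` and `inner_product` are hypothetical, and the sketch does not reproduce the centering or noise correction that the method performs.

```python
def trapezoid(t, y):
    """Integrate samples y over the sorted grid t with the trapezoidal rule."""
    return sum(
        0.5 * (t[k + 1] - t[k]) * (y[k + 1] + y[k]) for k in range(len(t) - 1)
    )


def inner_product(t, x, y):
    """Approximate <x, y> for two curves sampled on the same grid t."""
    return trapezoid(t, [a * b for a, b in zip(x, y)])


# Two curves on an unevenly spaced grid of [0, 1]: x(t) = t and y(t) = 1.
t = [0.0, 0.1, 0.4, 0.7, 1.0]
x = list(t)
y = [1.0] * len(t)

# Matrix of pairwise inner products, symmetric by construction.
curves = (x, y)
gram = [[inner_product(t, u, v) for v in curves] for u in curves]
print(gram)
```

The off-diagonal entry approximates the integral of t over [0, 1], which the trapezoidal rule recovers exactly because the integrand is linear.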
- mean(points=None, method_smoothing='LP', approx=True, **kwargs)
Compute an estimate of the mean.
This function computes an estimate of the mean curve of an IrregularFunctionalData object. The curves are not sampled on a common grid. We implement the methodology from [2].
- Parameters:
points (DenseArgvals | None) – The sampling points at which the mean is estimated. If None, the concatenation of the argvals of the IrregularFunctionalData is used.
method_smoothing (str) – The method used for the smoothing. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
approx (bool) – Approximation of the estimation.
kwargs – Other keyword arguments are passed to the following function:
IrregularFunctionalData.smooth().
- Returns:
An estimate of the mean as a DenseFunctionalData object.
- Return type:
DenseFunctionalData
Examples
For one-dimensional functional data:
>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> fdata = IrregularFunctionalData(argvals, values)
>>> fdata.mean()
Functional data object with 1 observations on a 1-dimensional support.
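As a rough illustration of why the sampling points are pooled, the sketch below averages all values observed at each distinct point of the example data. This is a deliberately naive stand-in: it performs no smoothing, so it is not the local polynomial or P-splines estimator described above, and the helper name `pooled_mean` is hypothetical.

```python
from collections import defaultdict

# Plain-Python copy of the one-dimensional example data.
argvals = {0: [0, 1, 2, 3, 4], 1: [0, 2, 4], 2: [2, 4]}
values = {0: [1, 2, 3, 4, 5], 1: [2, 5, 6], 2: [4, 7]}


def pooled_mean(argvals, values):
    """Average all values observed at each distinct sampling point."""
    pooled = defaultdict(list)
    for obs, points in argvals.items():
        for t, y in zip(points, values[obs]):
            pooled[t].append(y)
    # Sort by sampling point so the result reads as a curve on the pooled grid.
    return {t: sum(ys) / len(ys) for t, ys in sorted(pooled.items())}


mean_curve = pooled_mean(argvals, values)
print(mean_curve)
```

Points observed by a single curve, such as 1 and 3 here, simply inherit that curve's value, which is one reason a smoother is preferable on sparse data.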
- noise_variance(order=2)
Estimate the variance of the noise.
This function estimates the variance of the noise. The noise is estimated for each individual curve using the methodology in [3]. As the curves are assumed to be generated by the same process, the estimation of the variance of the noise is the mean over the set of curves.
- Parameters:
order (int) – Order of the difference sequence. The order has to be between 1 and 10. See [3] for more information.
- Returns:
The estimation of the variance of the noise.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.sparsify(0.5)
>>> kl.sparse_data.noise_variance(order=2)
0.006671248206782777
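The two-step idea described above (estimate the noise variance per curve, then average over curves) can be sketched as follows. The per-curve estimator here is a simple first-order difference estimator, swapped in for illustration. The estimator of [3] with a general order is not reproduced, and the helper names are hypothetical.

```python
def difference_variance(y):
    """First-order difference estimate of the noise variance of one curve."""
    diffs = [b - a for a, b in zip(y, y[1:])]
    # Each difference of two independent noise terms has twice the noise variance.
    return sum(d * d for d in diffs) / (2 * len(diffs))


def noise_variance(curves):
    """Average the per-curve estimates over the set of curves."""
    estimates = [difference_variance(y) for y in curves]
    return sum(estimates) / len(estimates)


curves = [
    [1.0, 1.0, 1.0, 1.0],    # constant curve, no variation between neighbours
    [1.0, -1.0, 1.0, -1.0],  # alternating values
]
sigma2 = noise_variance(curves)
print(sigma2)
```

This sketch ignores the spacing of the sampling points and assumes the underlying curve varies slowly between neighbouring points.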
- norm(squared=False, method_integration='trapz', use_argvals_stand=False)
Norm of each observation of the data.
For each observation in the data, it computes its norm defined in [6] as
\[\| X \| = \left\{\int_{\mathcal{T}} X(t)^2dt\right\}^{\frac12}.\]
- Parameters:
- Returns:
The norm of each observation.
- Return type:
npt.NDArray[np.float64], shape=(n_obs,)
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.sparsify(percentage=0.5, epsilon=0.05)
>>> kl.sparse_data.norm()
array([
    0.53419879, 0.40750272, 0.67092435, 0.26762124, 0.27425138,
    0.37419987, 0.65775515, 0.54579643, 0.25830787, 0.49324345
])
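The norm formula can be approximated on each curve's own sampling points. The following pure-Python sketch applies the trapezoidal rule to the squared values of every observation. It is an illustration under the assumption of one-dimensional, sorted sampling points, and the helper name `norm` merely mirrors the method name without claiming to match its implementation.

```python
import math


def norm(t, x, squared=False):
    """Trapezoidal approximation of the L2 norm of one sampled curve."""
    sq = [v * v for v in x]
    integral = sum(
        0.5 * (t[k + 1] - t[k]) * (sq[k + 1] + sq[k]) for k in range(len(t) - 1)
    )
    return integral if squared else math.sqrt(integral)


# Constant curves observed on different grids, one grid per observation.
argvals = {0: [0.0, 1.0, 2.0, 3.0, 4.0], 1: [0.0, 2.0, 4.0], 2: [2.0, 4.0]}
values = {0: [2.0] * 5, 1: [1.0] * 3, 2: [3.0] * 2}

norms = [norm(argvals[obs], values[obs]) for obs in argvals]
print(norms)
```

Dividing each curve's values by its entry in `norms` would give the normalized curves.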
- normalize(**kwargs)
Normalize the data.
The normalization is performed by dividing each functional datum \(X\) by its norm \(\| X \|\). It results in
\[\widetilde{X} = \frac{X}{\| X \|}.\]
- Parameters:
kwargs – Other keyword arguments are passed to the following function:
IrregularFunctionalData.norm().
- Returns:
The normalized data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.sparsify(percentage=0.5, epsilon=0.05)
>>> kl.sparse_data.normalize()
Functional data object with 10 observations on a 1-dimensional support.
- rescale(weights=0.0, method_integration='trapz', method_smoothing='LP', use_argvals_stand=False, **kwargs)
Rescale the data.
The rescaling is performed by first centering the data and then multiplying with a common weight:
\[\widetilde{X}(t) = w\{X(t) - \mu(t)\}.\]
The weights are defined in [6].
- Parameters:
weights (float) – The weights used to normalize the data. If weights = 0.0, the weights are estimated by integrating the variance function [3].
method_integration (str) – The method used to integrate.
use_argvals_stand (bool) – Use standardized argvals to compute the normalization of the data.
kwargs – Other keyword arguments are passed to the following function:
IrregularFunctionalData.smooth().
method_smoothing (str)
- Returns:
The rescaled data and the weight.
- Return type:
Tuple[IrregularFunctionalData, float]
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.sparsify(percentage=0.5, epsilon=0.05)
>>> kl.sparse_data.rescale()
(
    Functional data object with 10 observations on a 1-dimensional support.,
    DenseValues(0.16802008)
)
- smooth(points=None, method='PS', bandwidth=None, penalty=None, **kwargs)
Smooth the data.
This function smooths each curve individually. Based on [2], it fits a local polynomial smoother to the data. Based on [4], it fits P-splines to the data.
- Parameters:
points (DenseArgvals | None) – Points at which the curves are estimated. The default is None, meaning we use the argvals as estimation points.
method (str) – The method used for the smoothing. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2]. Otherwise, it raises an error.
bandwidth (float | None) – Strictly positive. Controls the size of the associated neighborhood. If bandwidth=None, it is assumed that the curves are twice differentiable and the bandwidth is set to \(n^{-1/5}\) [7] where \(n\) is the number of sampling points per curve. Be careful with the results if the curves are not sampled on \([0, 1]\).
penalty (float | None) – Strictly positive. Penalty used in the P-splines fitting of the data.
kwargs – Other keyword arguments are passed to one of the following functions:
preprocessing.smoothing.PSplines() (method='PS') and preprocessing.smoothing.LocalPolynomial() (method='LP').
- Returns:
Smoothed data.
- Return type:
Examples
For one-dimensional functional data:
>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> fdata = IrregularFunctionalData(argvals, values)
>>> fdata.smooth()
Functional data object with 3 observations on a 1-dimensional support.
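To show how a bandwidth controls the neighborhood used for smoothing a single curve, here is a minimal local constant (degree 0) kernel smoother with the default bandwidth \(n^{-1/5}\) described above. The Gaussian kernel, the degree and the helper names are assumptions made for this sketch. It is not the LocalPolynomial or PSplines implementation of the library.

```python
import math


def default_bandwidth(n_points):
    """Bandwidth n^(-1/5) for a curve with n sampling points."""
    return n_points ** (-1 / 5)


def local_constant_smooth(t, y, points, bandwidth=None):
    """Kernel-weighted average of y around each evaluation point."""
    if bandwidth is None:
        bandwidth = default_bandwidth(len(t))
    smoothed = []
    for p in points:
        # Gaussian weights: samples closer to p contribute more.
        # Assumption: p lies near the sampled range, so the weights do not all vanish.
        weights = [math.exp(-0.5 * ((s - p) / bandwidth) ** 2) for s in t]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, y)) / total)
    return smoothed


# One curve sampled irregularly on [0, 1].
t = [0.0, 0.1, 0.35, 0.6, 1.0]
y = [2.0, 2.0, 2.0, 2.0, 2.0]
estimate = local_constant_smooth(t, y, points=[0.0, 0.5, 1.0])
print(estimate)
```

Because the default bandwidth depends only on the number of points, it is scaled for curves sampled on \([0, 1]\), which matches the caution given for the bandwidth parameter.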
- standardize(center=True, **kwargs)
Standardize the data.
The standardization is performed by first centering the data and then dividing by the standard deviation curve [3]. It results in
\[\widetilde{X}(t) = C(t, t)^{-\frac12}\{X(t) - \mu(t)\}, \quad t \in \mathcal{T}.\]
- Parameters:
center (bool, default=True) – Should the data be centered?
**kwargs –
Other keyword arguments are passed to the following functions:
- Returns:
The standardized data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.sparsify(percentage=0.5, epsilon=0.05)
>>> kl.sparse_data.standardize()
Functional data object with 10 observations on a 1-dimensional support.
- to_basis(points=None, method='PS', penalty=None, **kwargs)
Convert the data to basis format.
This function transforms an IrregularFunctionalData object into a BasisFunctionalData object using the method given by the method argument.
- Parameters:
points (DenseArgvals | None) – The argvals of the basis.
method (str) – The method to get the coefficients.
penalty (float | None) – Strictly positive. Penalty used in the P-splines fitting of the data.
kwargs – Other keyword arguments are passed to the function:
preprocessing.smoothing.PSplines()
- Returns:
The expanded data.
- Return type:
BasisFunctionalData
- to_long(reindex=False)
Convert the data to long format.
This function transforms an IrregularFunctionalData object into a pandas DataFrame. It uses the long format to represent the IrregularFunctionalData object as a dataframe. This is a helper function, as the long format can make some computations easier, e.g., smoothing of the mean and covariance functions.
- Parameters:
reindex (bool) – Should the observations be reindexed?
- Returns:
The data in a long format.
- Return type:
pd.DataFrame
Examples
For one-dimensional functional data:
>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> fdata = IrregularFunctionalData(argvals, values)
>>> fdata.to_long()
   input_dim_0  id  values
0            0   0       1
1            1   0       2
2            2   0       3
3            3   0       4
4            4   0       5
5            0   1       2
6            2   1       5
7            4   1       6
8            2   2       4
9            4   2       7
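The long format shown above stacks one row per sampled point and records which observation it came from. A minimal sketch with plain dictionaries (not the FDApy classes or pandas, and with a hypothetical helper name `to_long_rows`) reproduces the same rows:

```python
# Plain-Python copy of the one-dimensional example data.
argvals = {0: [0, 1, 2, 3, 4], 1: [0, 2, 4], 2: [2, 4]}
values = {0: [1, 2, 3, 4, 5], 1: [2, 5, 6], 2: [4, 7]}


def to_long_rows(argvals, values):
    """Return one (input_dim_0, id, values) row per sampled point."""
    return [
        (t, obs, y)
        for obs, points in argvals.items()
        for t, y in zip(points, values[obs])
    ]


rows = to_long_rows(argvals, values)
for index, row in enumerate(rows):
    print(index, *row)
```

The number of rows equals the total number of sampling points over all observations, here 5 + 3 + 2 = 10.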