Basis
- class FDApy.representation.Basis(name='bsplines', n_functions=5, argvals=None, values=None, is_normalized=False, add_intercept=True, **kwargs)
Define univariate orthonormal basis.
- Parameters:
name (Tuple[str] | str) – Denotes the basis of functions to use. The default is 'bsplines'. If name='given', a user-defined basis is used (defined with the argvals and values parameters). For higher-dimensional data, name is a tuple giving the marginal bases.
n_functions (Tuple[int] | int) – Number of functions in the basis.
argvals (DenseArgvals | None) – The sampling points of the functional data.
values (DenseValues | None) – The values of the functional data. Only used if name='given'.
is_normalized (bool) – Should the basis functions be normalized?
add_intercept (bool) – Should the constant function be included in the basis?
kwargs – Other keyword arguments are passed to the function
representation.basis._simulate_basis().
- Attributes:
argvals_stand (DenseArgvals) – Standardized sampling points of the functional data.
n_obs (int) – Number of observations of the functional data.
n_dimension (int) – Number of input dimension of the functional data.
n_points (Tuple[int, …]) – Number of sampling points.
References
Methods
center([mean, method_smoothing]) – Center the data.
concatenate(*fdata) – Concatenate DenseFunctional objects.
covariance([points, method_smoothing, ...]) – Compute an estimate of the covariance function.
inner_product([method_integration, ...]) – Compute the inner product matrix of the data.
mean([points, method_smoothing]) – Compute an estimate of the mean.
noise_variance([order]) – Estimate the variance of the noise.
norm([squared, method_integration, ...]) – Norm of each observation of the data.
normalize(**kwargs) – Normalize the data.
rescale([weights, method_integration, ...]) – Rescale the data.
smooth([points, method, bandwidth, penalty]) – Smooth the data.
standardize([center]) – Standardize the data.
to_basis([points, method, penalty]) – Convert the data to basis format.
to_long([reindex]) – Convert the data to long format.
- center(mean=None, method_smoothing=None, **kwargs)
Center the data.
The centering is done by estimating the mean from the data and then subtracting it from the data. It results in
\[\widetilde{X}(t) = X(t) - \mu(t).\]
- Parameters:
mean (DenseFunctionalData | None) – A precomputed mean as a DenseFunctionalData object.
method_smoothing (str | None) – The method used to smooth the mean. If None, no smoothing is performed. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
kwargs – Other keyword arguments are passed to one of the following functions:
DenseFunctionalData.mean() (mean=None) and DenseFunctionalData.smooth().
- Returns:
The centered version of the data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.data.center(smooth=True)
Functional data object with 10 observations on a 1-dimensional support.
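The centering formula above can be sketched in plain Python, independently of FDApy. The helper name `center` and the list-of-lists layout (one inner list per observation, all on a common grid) are assumptions made for this illustration only; no smoothing of the mean is performed here.

```python
def center(curves):
    """Subtract the pointwise sample mean from each curve.

    `curves` is a list of observations sampled on a common grid.
    """
    n_obs = len(curves)
    # Pointwise mean: average over observations at each sampling point.
    mean = [sum(column) / n_obs for column in zip(*curves)]
    centered = [[x - m for x, m in zip(curve, mean)] for curve in curves]
    return centered, mean


curves = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
centered, mean = center(curves)
print(mean)
print(centered)
```

After centering, the values at each sampling point sum to zero across observations.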
- static concatenate(*fdata)
Concatenate DenseFunctional objects.
- Parameters:
fdata (DenseFunctionalData) – Functional data to concatenate.
- Returns:
The concatenated object.
- Return type:
- covariance(points=None, method_smoothing=None, center=True, kwargs_center={}, **kwargs)
Compute an estimate of the covariance function.
This function computes an estimate of the covariance surface of a DenseFunctionalData object. As the curves are sampled on a common grid, we consider the sample covariance [7].
- Parameters:
points (DenseArgvals | None) – The sampling points at which the covariance is estimated. If None, the DenseArgvals of the DenseFunctionalData is used. If no smoothing is performed (method_smoothing is None), the DenseArgvals of the DenseFunctionalData is used.
method_smoothing (str | None) – The method used to smooth the covariance. If None, no smoothing is performed. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
center (bool) – Should the data be centered before computing the covariance?
kwargs_center (Dict[str, object]) – Keyword arguments to be passed to the function FunctionalData.center().
kwargs – Other keyword arguments are passed to the following function:
functional_data._smooth_covariance().
- Returns:
An estimate of the covariance as a two-dimensional DenseFunctionalData object.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.add_noise(0.01)
>>> kl.noisy_data.covariance(smooth=True)
Functional data object with 1 observations on a 2-dimensional support.
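As a rough illustration of the sample covariance on a common grid, the sketch below uses plain Python rather than FDApy. The function name `sample_covariance` is hypothetical, and the normalisation by `n_obs - 1` is an assumption; the library may normalise differently and can additionally smooth the surface.

```python
def sample_covariance(curves):
    """Sample covariance surface of curves observed on a common grid."""
    n_obs = len(curves)
    n_points = len(curves[0])
    # Center the data with the pointwise sample mean.
    mean = [sum(column) / n_obs for column in zip(*curves)]
    centered = [[x - m for x, m in zip(curve, mean)] for curve in curves]
    # Assumption: unbiased normalisation by n_obs - 1.
    return [
        [
            sum(c[s] * c[t] for c in centered) / (n_obs - 1)
            for t in range(n_points)
        ]
        for s in range(n_points)
    ]


curves = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
cov = sample_covariance(curves)
print(cov)
```

The result is a symmetric `n_points` by `n_points` matrix whose diagonal holds the pointwise variances.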
- inner_product(method_integration='trapz', method_smoothing=None, noise_variance=None, **kwargs)
Compute the inner product matrix of the data.
The inner product matrix is an n_obs by n_obs matrix where each entry is defined as
\[\langle x, y \rangle = \int_{\mathcal{T}} x(t)y(t)dt, t \in \mathcal{T},\]
where \(\mathcal{T}\) is a one- or multi-dimensional domain [1].
- Parameters:
method_integration (str) – The method used to integrate.
method_smoothing (str | None) – The method used to smooth the mean. If None, no smoothing is performed. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2].
noise_variance (float | None) – An estimation of the variance of the noise. If None, an estimation is computed using the methodology in [5].
kwargs – Other keyword arguments are passed to the following function:
DenseFunctionalData.center().
- Returns:
Inner product matrix of the data.
- Return type:
npt.NDArray[np.float64], shape=(n_obs, n_obs)
Examples
For one-dimensional functional data:
>>> kl = KarhunenLoeve(
...     basis_name='bsplines', n_functions=5, random_state=42
... )
>>> kl.new(n_obs=3)
>>> kl.data.inner_product(noise_variance=0)
array([
    [ 0.16288536,  0.01958865, -0.10017322],
    [ 0.01958865,  0.17701988, -0.2459348 ],
    [-0.10017322, -0.2459348 ,  0.42008035]
])
For two-dimensional functional data:
>>> kl = KarhunenLoeve(
...     basis_name='bsplines', dimension='2D', n_functions=5,
...     random_state=42, argvals=np.linspace(0, 1, 11)
... )
>>> kl.new(n_obs=3)
>>> kl.data.inner_product(noise_variance=0)
array([
    [ 0.01669878,  0.00349892, -0.00817676],
    [ 0.00349892,  0.03208174, -0.03777796],
    [-0.00817676, -0.03777796,  0.05083159]
])
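For one-dimensional data, the integral defining each entry can be approximated with the trapezoidal rule. The sketch below is plain Python, not FDApy code: the helpers `trapz` and `inner_product_matrix` are hypothetical names, and the centering and noise-variance handling that the method's parameters refer to are deliberately left out.

```python
def trapz(values, grid):
    """Trapezoidal rule for values sampled on a one-dimensional grid."""
    return sum(
        (grid[i + 1] - grid[i]) * (values[i] + values[i + 1]) / 2
        for i in range(len(grid) - 1)
    )


def inner_product_matrix(curves, grid):
    """Matrix of pairwise inner products <x_i, x_j> = int x_i(t) x_j(t) dt."""
    return [
        [trapz([a * b for a, b in zip(ci, cj)], grid) for cj in curves]
        for ci in curves
    ]


grid = [0.0, 0.5, 1.0]
curves = [
    [1.0, 1.0, 1.0],  # constant function 1
    [0.0, 0.5, 1.0],  # identity function t
]
gram = inner_product_matrix(curves, grid)
print(gram)
```

On this coarse three-point grid the entry for the identity function with itself is 0.375 rather than the exact 1/3, which shows the discretisation error of the trapezoidal rule.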
- mean(points=None, method_smoothing=None, **kwargs)
Compute an estimate of the mean.
This function computes an estimate of the mean curve of a DenseFunctionalData object. As the curves are sampled on a common grid, we consider the sample mean, as defined in [7]. The sample mean is rate optimal [2]. Optional smoothing is available using local polynomial estimators [8] or P-splines [4].
- Parameters:
points (DenseArgvals | None) – The sampling points at which the mean is estimated. If None, the DenseArgvals of the DenseFunctionalData is used.
method_smoothing (str | None) – The method used for the smoothing. If None, no smoothing is performed. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [8].
kwargs – Other keyword arguments are passed to the following function
DenseFunctionalData.smooth().
- Returns:
An estimate of the mean as a DenseFunctionalData object.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.add_noise(0.01)
>>> kl.noisy_data.mean(smooth=True)
Functional data object with 1 observations on a 1-dimensional support.
- noise_variance(order=2)
Estimate the variance of the noise.
This function estimates the variance of the noise. The noise is estimated for each individual curve using the methodology in [5]. As the curves are assumed to be generated by the same process, the estimation of the variance of the noise is the mean over the set of curves.
- Parameters:
order (int) – Order of the difference sequence. The order has to be between 1 and 10. See [5] for more information.
- Returns:
The estimation of the variance of the noise.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.add_noise(0.05)
>>> kl.noisy_data.noise_variance(order=2)
0.051922438333740877
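To give an idea of how a difference-based estimator works, the sketch below implements the simplest first-order variant in plain Python: half the mean squared successive difference per curve, averaged over curves. This is a swapped-in textbook estimator chosen for illustration, not necessarily the difference sequence from [5] that the method uses, and the function name is hypothetical.

```python
def noise_variance_first_order(curves):
    """Average, over curves, of a first-order difference-based estimate."""
    estimates = []
    for curve in curves:
        n_points = len(curve)
        # For a smooth signal, successive differences are dominated by noise,
        # and the variance of a difference of two noise terms is 2 * sigma^2.
        squared_diffs = sum(
            (curve[i + 1] - curve[i]) ** 2 for i in range(n_points - 1)
        )
        estimates.append(squared_diffs / (2 * (n_points - 1)))
    return sum(estimates) / len(estimates)


curves = [[1.0, 1.0, 1.0, 1.0], [0.0, 2.0, 0.0, 2.0]]
estimate = noise_variance_first_order(curves)
print(estimate)
```

A constant curve contributes an estimate of zero, while a rapidly oscillating curve contributes a large one; higher-order difference sequences reduce the bias caused by the underlying signal.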
- norm(squared=False, method_integration='trapz', use_argvals_stand=False)
Norm of each observation of the data.
For each observation in the data, it computes its norm defined in [6] as
\[\| X \| = \left\{\int_{\mathcal{T}} X(t)^2dt\right\}^{\frac12}.\]
- Parameters:
- Returns:
The norm of each observation.
- Return type:
npt.NDArray[np.float64], shape=(n_obs,)
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.data.norm()
array([
    0.53253351, 0.42212112, 0.6709846 , 0.26672898, 0.27440755,
    0.37906252, 0.65277413, 0.53998411, 0.2872874 , 0.4934973
])
- normalize(**kwargs)
Normalize the data.
The normalization is performed by dividing each functional datum \(X\) by its norm \(\| X \|\). It results in
\[\widetilde{X} = \frac{X}{\| X \|}.\]
- Parameters:
kwargs – Other keyword arguments are passed to the following function:
DenseFunctionalData.norm().
- Returns:
The normalized data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.data.normalize()
Functional data object with 10 observations on a 1-dimensional support.
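The norm and the normalization can be sketched together in plain Python for one-dimensional curves, with the integral approximated by the trapezoidal rule. The helper names `trapz`, `norm`, and `normalize` below are local to this illustration and are not the FDApy methods themselves.

```python
import math


def trapz(values, grid):
    """Trapezoidal rule for values sampled on a one-dimensional grid."""
    return sum(
        (grid[i + 1] - grid[i]) * (values[i] + values[i + 1]) / 2
        for i in range(len(grid) - 1)
    )


def norm(curve, grid):
    """L2 norm of a curve: square root of the integral of X(t)^2."""
    return math.sqrt(trapz([v * v for v in curve], grid))


def normalize(curves, grid):
    """Divide each curve by its own norm."""
    return [[v / norm(curve, grid) for v in curve] for curve in curves]


grid = [0.0, 0.5, 1.0]
curves = [[2.0, 2.0, 2.0], [0.0, 0.5, 1.0]]
norms = [norm(curve, grid) for curve in curves]
normalized = normalize(curves, grid)
print(norms)
print([norm(curve, grid) for curve in normalized])
```

Each normalized curve has unit norm under the same integration rule, up to floating-point error.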
- rescale(weights=0.0, method_integration='trapz', use_argvals_stand=False, **kwargs)
Rescale the data.
The rescaling is performed by first centering the data and then multiplying with a common weight:
\[\widetilde{X}(t) = w\{X(t) - \mu(t)\}.\]
The weights are defined in [6].
- Parameters:
weights (float) – The weights used to normalize the data. If weights = 0.0, the weights are estimated by integrating the variance function [3].
method_integration (str) – The method used to estimate the integral.
use_argvals_stand (bool) – Use standardized argvals to compute the normalization of the data.
- Returns:
The rescaled data and the weight.
- Return type:
Tuple[DenseFunctionalData, float]
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.data.rescale()
Functional data object with 10 observations on a 1-dimensional support.
- smooth(points=None, method='PS', bandwidth=None, penalty=None, **kwargs)
Smooth the data.
This function smooths each curve individually. With method='LP', it fits a local polynomial smoother to the data [2]. With method='PS', it fits P-splines to the data [4].
- Parameters:
points (DenseArgvals | None) – Points at which the curves are estimated. The default is None, meaning we use the argvals as estimation points.
method (str) – The method used for the smoothing. If ‘PS’, the method is P-splines [4]. If ‘LP’, the method is local polynomials [2]. Otherwise, an error is raised.
bandwidth (float | None) – Strictly positive. Controls the size of the associated neighborhood. If bandwidth=None, it is assumed that the curves are twice differentiable and the bandwidth is set to \(n^{-1/5}\) [8], where \(n\) is the number of sampling points per curve. Be careful with the results if the curves are not sampled on \([0, 1]\).
penalty (float | None) – Strictly positive. Penalty used in the P-splines fitting of the data.
kwargs – Other keyword arguments are passed to one of the following functions:
preprocessing.smoothing.PSplines() (method='PS') and preprocessing.smoothing.LocalPolynomial() (method='LP').
- Returns:
Smoothed data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=1)
>>> kl.add_noise(0.05)
>>> kl.noisy_data.smooth()
Functional data object with 1 observations on a 1-dimensional support.
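To illustrate the local polynomial idea and the default bandwidth \(n^{-1/5}\), here is a minimal local linear smoother in plain Python. It is a sketch only: the Epanechnikov kernel, the polynomial degree of 1, and the function name `local_linear_smooth` are assumptions made for this example and do not describe the internals of preprocessing.smoothing.LocalPolynomial().

```python
def local_linear_smooth(grid, values, bandwidth=None):
    """Local linear fit at each grid point with an Epanechnikov kernel."""
    n_points = len(grid)
    if bandwidth is None:
        # Default from the text: n^(-1/5), suited to a grid on [0, 1].
        bandwidth = n_points ** (-1 / 5)
    smoothed = []
    for x0 in grid:
        s0 = s1 = s2 = t0 = t1 = 0.0
        for x, y in zip(grid, values):
            u = (x - x0) / bandwidth
            weight = 0.75 * (1 - u * u) if abs(u) < 1 else 0.0
            d = x - x0
            s0 += weight
            s1 += weight * d
            s2 += weight * d * d
            t0 += weight * y
            t1 += weight * d * y
        # Intercept of the weighted least-squares line centered at x0.
        smoothed.append((s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1))
    return smoothed


grid = [i / 10 for i in range(11)]
line = [2 * x + 1 for x in grid]
smoothed = local_linear_smooth(grid, line)
print(smoothed)
```

A local linear fit reproduces straight lines exactly, which makes a linear input a convenient sanity check; on noisy data, a larger bandwidth gives a smoother but more biased estimate.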
- standardize(center=True, **kwargs)
Standardize the data.
The standardization is performed by first centering the data and then dividing by the standard deviation curve [3]. It results in
\[\widetilde{X}(t) = C(t, t)^{-\frac12}\{X(t) - \mu(t)\}, \quad t \in \mathcal{T}.\]
- Parameters:
center (bool) – Should the data be centered?
kwargs – Other keyword arguments are passed to the following function:
DenseFunctionalData.center().
- Returns:
The standardized data.
- Return type:
Examples
>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.data.standardize()
Functional data object with 10 observations on a 1-dimensional support.
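The standardization formula can be sketched pointwise in plain Python: subtract the mean curve, then divide by the square root of the variance curve \(C(t, t)\). The function name `standardize` is local to this example, and the variance normalisation by `n_obs - 1` is an assumption.

```python
import math


def standardize(curves):
    """Center the curves and divide by the pointwise standard deviation."""
    n_obs = len(curves)
    mean = [sum(column) / n_obs for column in zip(*curves)]
    # Assumption: variance curve C(t, t) normalised by n_obs - 1.
    std = [
        math.sqrt(sum((curve[t] - mean[t]) ** 2 for curve in curves) / (n_obs - 1))
        for t in range(len(mean))
    ]
    return [
        [(x - m) / s for x, m, s in zip(curve, mean, std)]
        for curve in curves
    ]


curves = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
standardized = standardize(curves)
print(standardized)
```

After standardization, the values at each sampling point have zero mean and unit variance across observations. The sketch assumes a strictly positive variance at every sampling point.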
- to_basis(points=None, method='PS', penalty=None, **kwargs)
Convert the data to basis format.
This function transforms a DenseFunctionalData object into a BasisFunctionalData object using method.
- Parameters:
points (DenseArgvals | None) – The argvals of the basis.
method (str) – The method to get the coefficients.
penalty (float | None) – Strictly positive. Penalty used in the P-splines fitting of the data.
kwargs – Other keyword arguments are passed to the function:
preprocessing.smoothing.PSplines()
- Returns:
The expanded data.
- Return type:
- to_long(reindex=False)
Convert the data to long format.
This function transforms a DenseFunctionalData object into a pandas DataFrame. It uses the long format to represent the DenseFunctionalData object as a dataframe. This is a helper function, as a long format can make some computations easier, e.g., smoothing of the mean and covariance functions.
- Parameters:
reindex (bool) – Not used here.
- Returns:
The data in a long format.
- Return type:
pd.DataFrame
Examples
>>> argvals = DenseArgvals({'input_dim_0': np.array([1, 2, 3, 4, 5])})
>>> values = DenseValues(np.array([
...     [1, 2, 3, 4, 5],
...     [6, 7, 8, 9, 10],
...     [11, 12, 13, 14, 15]
... ]))
>>> fdata = DenseFunctionalData(argvals, values)
>>> fdata.to_long()
    input_dim_0  id  values
0             1   0       1
1             2   0       2
2             3   0       3
3             4   0       4
4             5   0       5
5             1   1       6
6             2   1       7
7             3   1       8
8             4   1       9
9             5   1      10
10            1   2      11
11            2   2      12
12            3   2      13
13            4   2      14
14            5   2      15
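The long format shown above has one row per (observation, sampling point) pair. The sketch below builds the same rows in plain Python as a list of dictionaries; the helper name `to_long` and the dictionary representation are choices made for this illustration, whereas the method itself returns a pandas DataFrame.

```python
def to_long(argvals, values):
    """One row per (observation, sampling point) pair."""
    return [
        {"input_dim_0": t, "id": obs_id, "values": v}
        for obs_id, curve in enumerate(values)
        for t, v in zip(argvals, curve)
    ]


argvals = [1, 2, 3, 4, 5]
values = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
]
rows = to_long(argvals, values)
for row in rows[:3]:
    print(row)
```

With 3 observations on 5 sampling points, the result has 15 rows, ordered by observation and then by sampling point.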