MultivariateFunctionalData#

class FDApy.representation.MultivariateFunctionalData(initlist)[source]#

Represent multivariate functional data.

An instance of MultivariateFunctionalData is a list containing objects of the class DenseFunctionalData or IrregularFunctionalData.

Parameters:

initlist (List[Type[FunctionalData]]) – The list containing the elements of the MultivariateFunctionalData.

Attributes:

n_obs (int) – Number of observations of the functional data.
n_functional (int) – Number of components of the multivariate functional data.
n_dimension (List[int]) – Number of input dimensions of the functional data.
n_points (List[Dict[str, int]]) – Number of sampling points.

Examples

>>> argvals = DenseArgvals({'input_dim_0': np.array([1, 2, 3, 4, 5])})
>>> values = DenseValues(np.array([
...     [1, 2, 3, 4, 5],
...     [6, 7, 8, 9, 10],
...     [11, 12, 13, 14, 15]
... ]))
>>> fdata_dense = DenseFunctionalData(argvals, values)

>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> fdata_irregular = IrregularFunctionalData(argvals, values)

>>> MultivariateFunctionalData([fdata_dense, fdata_irregular])

Notes

Note that the elements are not checked to be of the same type: a MultivariateFunctionalData can mix Dense, Irregular and Basis functional data. The number of observations, however, has to be the same for each element of the list.
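The constraint on the number of observations can be illustrated without FDApy. The sketch below is a stand-in, not the library's implementation: each component is a plain NumPy array whose first axis indexes the observations, and check_same_n_obs is a hypothetical helper introduced only for this illustration.

```python
import numpy as np


def check_same_n_obs(components):
    """Return the shared number of observations, or raise ValueError.

    Hypothetical helper: each component is an array of shape (n_obs, ...).
    """
    n_obs = {len(component) for component in components}
    if len(n_obs) != 1:
        raise ValueError(f"Components disagree on n_obs: {sorted(n_obs)}")
    return n_obs.pop()


# Two components of different kinds, both holding 3 observations.
curves = np.zeros((3, 5))      # 3 curves sampled on 5 points
images = np.zeros((3, 4, 4))   # 3 images sampled on a 4 x 4 grid
n_obs = check_same_n_obs([curves, images])
```

The components may differ in input dimension and in sampling grid; only the first axis has to agree.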

References

Methods

`append`(item)	Add an item to self.
`center`([mean, method_smoothing])	Center the data.
`clear`()	Remove all items from the list.
`concatenate`(*fdata)	Concatenate MultivariateFunctionalData objects.
`copy`()
`count`(value)
`covariance`([points, method_smoothing])	Compute an estimate of the covariance.
`extend`(other)	Extend the list of FunctionalData by appending from iterable.
`index`(value, [start, [stop]])	Raises ValueError if the value is not present.
`inner_product`([method_integration, ...])	Compute the inner product matrix of the data.
`insert`(i, item)	Insert an item at a given position i.
`mean`([points, method_smoothing])	Compute an estimate of the mean.
`noise_variance`([order])	Estimate the variance of the noise.
`norm`([squared, method_integration, ...])	Norm of each observation of the data.
`normalize`(**kwargs)	Normalize the data.
`pop`([i])	Remove the item at the given position in the list, and return it.
`remove`(item)	Remove the first item from self where value is item.
`rescale`([weights, method_integration, ...])	Rescale the data.
`reverse`()	Reverse the elements of the list in place.
`smooth`([points, method, bandwidth, penalty])	Smooth the data.
`sort`(*args, **kwds)
`standardize`([center])	Standardize the data.
`to_basis`(**kwargs)	Convert the data to basis format.
`to_grid`()	Convert the data to grid.
`to_long`([reindex])	Convert the data to long format.

append(item)[source]#

Add an item to self.

Parameters: item (Type[FunctionalData]) – Item to add.
Return type: None

center(mean=None, method_smoothing=None, **kwargs)[source]#

Center the data.

Parameters:

mean (MultivariateFunctionalData | None) – A precomputed mean as a MultivariateFunctionalData object.
method_smoothing (str | None) – The method used to smooth the mean. If ‘None’, no smoothing is performed. If ‘PS’, the method is P-splines [3]. If ‘LP’, the method is local polynomials [7].
kwargs – Other keyword arguments are passed to DenseFunctionalData.mean() (when mean=None) and to DenseFunctionalData.smooth().

Returns:

The centered version of the data.

Return type:

MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=10)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.center()
Multivariate functional data object with 2 functions of 10 observations.
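What centering does to the underlying values can be sketched with plain NumPy arrays. This is an illustration under simplifying assumptions, not FDApy's code: every component is densely sampled, stored as an array of shape (n_obs, n_points), and the mean is not smoothed (the method_smoothing=None case). The function center_components is a hypothetical name.

```python
import numpy as np


def center_components(components):
    """Subtract the pointwise sample mean from each component.

    Each component is an array of shape (n_obs, n_points).
    """
    return [component - component.mean(axis=0) for component in components]


rng = np.random.default_rng(42)
components = [rng.normal(size=(10, 5)), rng.normal(loc=3.0, size=(10, 8))]
centered = center_components(components)
```

After centering, the pointwise mean of each component is zero up to floating-point error.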

clear()[source]#

Remove all items from the list.

Return type: None

static concatenate(*fdata)[source]#

Concatenate MultivariateFunctionalData objects.

Parameters:

fdata (MultivariateFunctionalData) – The objects to concatenate.

Returns:

The concatenation of the objects in fdata.

Return type:

MultivariateFunctionalData

Raises:

ValueError – When the fdata do not all have the same number of elements.
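The behaviour described above, including the ValueError, can be sketched as follows. The sketch assumes each multivariate object is represented by a list of NumPy arrays of shape (n_obs, n_points), one per component; concatenate_components is a hypothetical stand-in and does not call FDApy.

```python
import numpy as np


def concatenate_components(*fdata):
    """Stack observations component by component.

    Each argument is a list of arrays of shape (n_obs_i, n_points_p),
    standing in for one multivariate object.
    """
    n_components = {len(data) for data in fdata}
    if len(n_components) != 1:
        raise ValueError("All inputs must have the same number of elements.")
    # zip(*fdata) groups the p-th component of every input together.
    return [np.concatenate(parts, axis=0) for parts in zip(*fdata)]


first = [np.ones((3, 5)), np.ones((3, 8))]
second = [np.zeros((2, 5)), np.zeros((2, 8))]
combined = concatenate_components(first, second)
```

Here the result has two components holding 3 + 2 = 5 observations each.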

copy()#

count(value) → integer#

Return the number of occurrences of value.

covariance(points=None, method_smoothing=None, **kwargs)[source]#

Compute an estimate of the covariance.

This function computes an estimate of the covariance surface of a MultivariateFunctionalData object.

Parameters:

points (List[DenseArgvals] | None) – Points at which the covariance is estimated. The default is None, meaning we use the argvals as estimation points.
method_smoothing (str | None) – The method used to smooth the mean. If None, the mean is not smoothed.
kwargs – Other keyword arguments are passed to the functions DenseFunctionalData.covariance() and IrregularFunctionalData.covariance().

Returns:

An estimate of the covariance as a two-dimensional MultivariateFunctionalData object with the same argvals as self.

Return type:

MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name='bsplines', n_functions=5, random_state=42
... )
>>> kl.new(n_obs=50)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.noisy_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])

>>> points = DenseArgvals({'input_dim_0': np.linspace(0, 1, 11)})
>>> fdata.covariance(points=[points, points])
Multivariate functional data object with 2 functions of 1 observations.
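For intuition about the returned surface, the following sketch computes the usual sample covariance of a single densely sampled component. It is an assumption-laden illustration rather than FDApy's estimator: there is no smoothing, the data sit on a common grid, and empirical_covariance is a hypothetical name.

```python
import numpy as np


def empirical_covariance(values):
    """Sample covariance surface of one component.

    values has shape (n_obs, n_points); the result has shape
    (n_points, n_points) with entry [s, t] estimating Cov(X(s), X(t)).
    """
    centered = values - values.mean(axis=0)
    return centered.T @ centered / (len(values) - 1)


grid = np.linspace(0, 1, 11)
scores = np.array([-1.0, 0.0, 1.0, 2.0])
values = scores[:, None] * grid[None, :]   # X_i(t) = a_i * t
cov = empirical_covariance(values)
```

For these curves the surface equals Var(a) * s * t, which gives a simple sanity check. A multivariate object would yield one such surface per component.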

extend(other)[source]#

Extend the list of FunctionalData by appending from iterable.

Parameters: other (Iterable[Type[FunctionalData]])
Return type: None

index(value[, start[, stop]]) → integer#

Return the first index of value. Raises ValueError if the value is not present.

Supporting start and stop arguments is optional, but recommended.

inner_product(method_integration='trapz', method_smoothing=None, noise_variance=None, **kwargs)[source]#

Compute the inner product matrix of the data.

The inner product matrix is an n_obs by n_obs matrix where each entry is defined as

\[\langle\langle x, y \rangle\rangle = \sum_{p = 1}^P \int_{\mathcal{T}_p} x^{(p)}(t_p)y^{(p)}(t_p)dt_p, \quad t_p \in \mathcal{T}_p,\]

where \(P\) is the number of components and each \(\mathcal{T}_p\) is a one- or multi-dimensional domain [1].

Parameters:

method_integration (str) – The method used for integration.
method_smoothing (str | None) – The method used to smooth the mean. If None, the mean is not smoothed.
noise_variance (ndarray[Any, dtype[float64]] | None) – An estimation of the variance of the noise. If None, an estimation is computed using the methodology in [4].
kwargs – Other keyword arguments are passed to the functions DenseFunctionalData.inner_product() and IrregularFunctionalData.inner_product().

Returns:

Inner product matrix of the data.

Return type:

npt.NDArray[np.float64], shape=(n_obs, n_obs)

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=4)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.inner_product(noise_variance=0)
array([
    [ 0.39261306,  0.06899153, -0.14614219, -0.0836462 ],
    [ 0.06899153,  0.32580074, -0.4890299 ,  0.07577286],
    [-0.14614219, -0.4890299 ,  0.94953678, -0.09322892],
    [-0.0836462 ,  0.07577286, -0.09322892,  0.17157688]
])
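The definition of the inner product matrix can be sketched directly for densely sampled, one-dimensional components. The code below is a minimal illustration, not FDApy's implementation: it uses a hand-written trapezoidal rule, ignores the noise-variance correction and any smoothing, and the names trapezoid and inner_product_matrix are hypothetical.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal rule along the last axis of values."""
    steps = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * steps, axis=-1)


def inner_product_matrix(components, grids):
    """Sum over components of the integrals of x_i(t) * x_j(t)."""
    n_obs = len(components[0])
    gram = np.zeros((n_obs, n_obs))
    for values, grid in zip(components, grids):
        # products[i, j, :] = x_i(t) * x_j(t) evaluated on the grid
        products = values[:, None, :] * values[None, :, :]
        gram += trapezoid(products, grid)
    return gram


grids = [np.linspace(0, 1, 11), np.linspace(0, 2, 21)]
components = [
    np.array([[1.0], [2.0]]) * np.ones(11),   # constants 1 and 2 on [0, 1]
    np.array([[1.0], [3.0]]) * np.ones(21),   # constants 1 and 3 on [0, 2]
]
gram = inner_product_matrix(components, grids)
```

With constant curves the integrals are exact: the entries are 1·1·1 + 1·1·2 = 3, 1·2·1 + 1·3·2 = 8 and 2·2·1 + 3·3·2 = 22.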

insert(i, item)[source]#

Insert an item at a given position i.

Parameters:

i (int) – Position at which the item is inserted.
item (Type[FunctionalData]) – Item to insert.

Return type:

None

mean(points=None, method_smoothing=None, **kwargs)[source]#

Compute an estimate of the mean.

This function computes an estimate of the mean curve of a MultivariateFunctionalData object.

Parameters:

points (List[DenseArgvals] | None) – Points at which the mean is estimated. The default is None, meaning we use the argvals as estimation points.
method_smoothing (str | None) – The method used for the smoothing. If ‘None’, no smoothing is performed. If ‘PS’, the method is P-splines [3]. If ‘LP’, the method is local polynomials [7].
kwargs – Other keyword arguments are passed to the following function: MultivariateFunctionalData.smooth().

Returns:

An estimate of the mean as a MultivariateFunctionalData object.

Return type:

MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name='bsplines', n_functions=5, random_state=42
... )
>>> kl.new(n_obs=50)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.noisy_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])

>>> points = DenseArgvals({'input_dim_0': np.linspace(0, 1, 11)})
>>> fdata.mean(points=[points, points])
Multivariate functional data object with 2 functions of 1 observations.

noise_variance(order=2)[source]#

Estimate the variance of the noise.

This function estimates the variance of the noise. The noise is estimated for each individual curve using the methodology in [4]. As the curves are assumed to be generated by the same process, the estimation of the variance of the noise is the mean over the set of curves.

Parameters: order (int) – Order of the difference sequence. The order has to be between 1 and 10. See [4] for more information.
Returns: The estimation of the variance of the noise. The example below shows one estimate per component.
Return type: float

Examples

>>> kl = KarhunenLoeve(
...     basis_name='bsplines',
...     n_functions=5,
...     random_state=42
... )
>>> kl.new(n_obs=100)
>>> kl.add_noise(0.05)
>>> kl.sparsify(0.5)
>>> fdata = MultivariateFunctionalData([kl.noisy_data, kl.sparse_data])
>>> fdata.noise_variance()
[0.051922438333740877, 0.006671248206782777]
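The idea of a difference-based estimator can be illustrated with the simplest case. The sketch below uses first-order differences, which is a deliberately simpler stand-in and not the difference sequences of [4]; it assumes densely sampled smooth curves with independent noise of constant variance, and noise_variance_first_difference is a hypothetical name.

```python
import numpy as np


def noise_variance_first_difference(values):
    """Average over curves of sum((y[k+1] - y[k])**2) / (2 * (n - 1)).

    values has shape (n_obs, n_points); the curves are assumed smooth
    and the noise independent with constant variance.
    """
    differences = np.diff(values, axis=1)
    per_curve = np.sum(differences**2, axis=1) / (2 * differences.shape[1])
    return float(per_curve.mean())


rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 101)
signal = np.sin(2 * np.pi * grid)
# 100 noisy copies of the same smooth curve, noise variance 0.05.
noisy = signal + rng.normal(scale=np.sqrt(0.05), size=(100, 101))
estimate = noise_variance_first_difference(noisy)
```

Differencing neighbouring points nearly cancels the smooth signal, so the squared differences mostly reflect the noise. As in the description above, the per-curve estimates are averaged.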

norm(squared=False, method_integration='trapz', use_argvals_stand=False)[source]#

Norm of each observation of the data.

For each observation in the data, it computes its norm defined in [2] as

\[\| X \| = \left\{\int_{\mathcal{T}} X(t)^2dt\right\}^{\frac12}.\]

Parameters:

squared (bool) – If True, the function calculates the squared norm, otherwise it returns the norm.
method_integration (str) – The method used for integration.
use_argvals_stand (bool) – Use standardized argvals to compute the normalization of the data.

Returns:

The norm of each observation.

Return type:

npt.NDArray[np.float64], shape=(n_obs,)

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=4)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.norm()
array([1.05384959, 0.84700578, 1.37439764, 0.59235447])
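A minimal sketch of the computation for densely sampled components follows. The formula above is written for a single function; summing the squared norms over the components, in line with the multivariate inner product, is an assumption of this sketch. It does not call FDApy, and trapezoid and multivariate_norm are hypothetical names.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal rule along the last axis of values."""
    steps = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * steps, axis=-1)


def multivariate_norm(components, grids, squared=False):
    """Norm of each observation, summing squared norms over components."""
    squared_norm = sum(
        trapezoid(values**2, grid) for values, grid in zip(components, grids)
    )
    return squared_norm if squared else np.sqrt(squared_norm)


grids = [np.linspace(0, 1, 11), np.linspace(0, 2, 21)]
components = [
    np.array([[1.0], [2.0]]) * np.ones(11),   # constants 1 and 2 on [0, 1]
    np.array([[1.0], [3.0]]) * np.ones(21),   # constants 1 and 3 on [0, 2]
]
norms = multivariate_norm(components, grids)
```

The squared norms here are 1 + 2 = 3 and 4 + 18 = 22. Dividing every component of an observation by its norm gives the normalized data.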

normalize(**kwargs)[source]#

Normalize the data.

The normalization is performed by dividing each functional datum \(X\) by its norm \(\| X \|\). It results in

\[\widetilde{X} = \frac{X}{\| X \|}.\]

Parameters: kwargs – Other keyword arguments are passed to the following function MultivariateFunctionalData.norm().
Returns: The normalized data.
Return type: MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=4)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.normalize()
Multivariate functional data object with 2 functions of 4 observations.

pop(i=-1)[source]#

Remove the item at the given position in the list, and return it.

Parameters: i (int)
Return type: Type[FunctionalData]

remove(item)[source]#

Remove the first item from self where value is item.

Parameters: item (Type[FunctionalData])
Return type: None

rescale(weights=None, method_integration='trapz', method_smoothing='LP', use_argvals_stand=False, **kwargs)[source]#

Rescale the data.

The rescaling is performed by dividing each functional datum by \(w_j = \int_{\mathcal{T}} Var(X(t))dt\).

Parameters:

weights (ndarray[Any, dtype[float64]] | None) – The weights used to rescale the data. If weights = None, the weights are estimated by integrating the variance function [5].
method_integration (str) – The method used for integration.
method_smoothing (str) – The method used for smoothing.
use_argvals_stand (bool) – Use standardized argvals to compute the normalization of the data.
kwargs – Keyword parameters for the smoothing of the observations.

Returns:

The rescaled data and the weights.

Return type:

Tuple[MultivariateFunctionalData, npt.NDArray[np.float64]]

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=4)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.rescale()
(Multivariate functional data object with 2 functions of 4
observations., array([0.20365764, 0.19388443]))
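The weights can be sketched for densely sampled components as follows. This follows the description above literally (division by the integrated pointwise variance) and is not FDApy's code: the variance is not smoothed, a hand-written trapezoidal rule is used, and rescale_components is a hypothetical name.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal rule along the last axis of values."""
    steps = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * steps, axis=-1)


def rescale_components(components, grids):
    """Divide each component by the integral of its pointwise variance."""
    weights = np.array([
        trapezoid(values.var(axis=0, ddof=1), grid)
        for values, grid in zip(components, grids)
    ])
    rescaled = [values / w for values, w in zip(components, weights)]
    return rescaled, weights


grid = np.linspace(0, 1, 11)
scores = np.array([[-1.0], [1.0]])
# Constant curves at -1 and 1, then at -2 and 2, on [0, 1].
components = [scores * np.ones(11), 2 * scores * np.ones(11)]
rescaled, weights = rescale_components(components, [grid, grid])
```

The pointwise variances are 2 and 8, so the weights are 2 and 8 after integration over [0, 1]. As with rescale, the sketch returns both the rescaled data and the weights.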

reverse()[source]#

Reverse the elements of the list in place.

Return type: None

smooth(points=None, method='PS', bandwidth=None, penalty=None, **kwargs)[source]#

Smooth the data.

This function smooths each curve individually. It fits a local smoother to the data (the argument degree controls the degree of the local fits). All the parameters have to be passed as a list of the same length as the MultivariateFunctionalData.

Parameters:

points (DenseArgvals | None) – Points at which the curves are estimated. The default is None, meaning we use the argvals as estimation points.
method (str) – The method used for the smoothing. If ‘PS’, the method is P-splines [3]. If ‘LP’, the method is local polynomials [7]. Otherwise, it raises an error.
bandwidth (float | None) – Strictly positive. Control the size of the associated neighborhood. If bandwidth is None, it is assumed that the curves are twice differentiable and the bandwidth is set to \(n^{-1/5}\) [6] where \(n\) is the number of sampling points per curve. Be careful with the results if the curves are not sampled on \([0, 1]\).
penalty (float | None) – Strictly positive. Penalty used in the P-splines fitting of the data.
kwargs – Other keyword arguments are passed to one of the following functions: DenseFunctionalData.smooth() and IrregularFunctionalData.smooth().

Returns:

Smoothed data.

Return type:

MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name='bsplines', n_functions=5, random_state=42
... )
>>> kl.new(n_obs=50)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.noisy_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])

>>> points = DenseArgvals({'input_dim_0': np.linspace(0, 1, 11)})
>>> fdata.smooth(
...     points=[points, points],
...     kernel_name=['epanechnikov', 'epanechnikov'],
...     bandwidth=[0.05, 0.1],
...     degree=[1, 2]
... )
Multivariate functional data object with 2 functions of 50 observations.
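To make the roles of bandwidth, degree and the kernel concrete, here is a small local polynomial smoother for a single curve. It is a sketch of the general technique under simple assumptions (one-dimensional grid, Epanechnikov kernel, weighted least squares at each evaluation point) and is not FDApy's implementation; epanechnikov and local_polynomial are hypothetical names.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel, zero outside [-1, 1]."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)


def local_polynomial(grid, values, points, bandwidth, degree=1):
    """Local polynomial estimate of one curve at the given points."""
    estimates = np.empty(len(points))
    for k, point in enumerate(points):
        shifted = grid - point
        # Square roots of the kernel weights turn WLS into ordinary LS.
        weights = np.sqrt(epanechnikov(shifted / bandwidth))
        design = np.vander(shifted, degree + 1, increasing=True)
        coefficients, *_ = np.linalg.lstsq(
            design * weights[:, None], values * weights, rcond=None
        )
        estimates[k] = coefficients[0]   # fitted value at `point`
    return estimates


grid = np.linspace(0, 1, 21)
curve = 2 * grid + 1
points = np.linspace(0, 1, 11)
smoothed = local_polynomial(grid, curve, points, bandwidth=0.2)
```

A local linear fit reproduces a straight line exactly, which makes this input a convenient check. For multivariate data, one such fit would be run per component, each with its own bandwidth and degree.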

sort(*args, **kwds)#

standardize(center=True, **kwargs)[source]#

Standardize the data.

The standardization is performed by first centering the data and then dividing by the standard deviation curve [2]. It results in

\[\widetilde{X}(t) = C(t, t)^{-\frac12}\{X(t) - \mu(t)\}, \quad t \in \mathcal{T}.\]

Parameters:

center (bool) – Should the data be centered?
kwargs – Other keyword arguments are passed to the functions MultivariateFunctionalData.center(), DenseFunctionalData.standardize() and IrregularFunctionalData.standardize().

Returns:

The standardized data.

Return type:

MultivariateFunctionalData

Examples

>>> kl = KarhunenLoeve(
...     basis_name=name, n_functions=n_functions, random_state=42
... )
>>> kl.new(n_obs=4)
>>> kl.add_noise_and_sparsify(0.05, 0.5)

>>> fdata_1 = kl.data
>>> fdata_2 = kl.sparse_data
>>> fdata = MultivariateFunctionalData([fdata_1, fdata_2])
>>> fdata.standardize()
Multivariate functional data object with 2 functions of 4 observations.
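The standardization formula can be sketched pointwise for densely sampled components. The code is an illustration only, not FDApy's implementation: it uses the unsmoothed sample mean and sample standard deviation at each sampling point, and standardize_components is a hypothetical name.

```python
import numpy as np


def standardize_components(components, center=True):
    """Pointwise (X(t) - mu(t)) / sqrt(C(t, t)) for each component."""
    standardized = []
    for values in components:
        mean = values.mean(axis=0) if center else 0.0
        std = values.std(axis=0, ddof=1)   # estimate of sqrt(C(t, t))
        standardized.append((values - mean) / std)
    return standardized


rng = np.random.default_rng(42)
components = [rng.normal(2.0, 3.0, size=(20, 5)), rng.normal(size=(20, 8))]
standardized = standardize_components(components)
```

After standardization each sampling point has sample mean 0 and sample standard deviation 1 across observations.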

to_basis(**kwargs)[source]#

Convert the data to basis format.

This function transforms a MultivariateFunctionalData object into a MultivariateFunctionalData that contains BasisFunctionalData.

Parameters: kwargs – Other keyword arguments are passed to the functions representation.functional_data.DenseFunctionalData() and representation.functional_data.IrregularFunctionalData().
Returns: The expanded data.
Return type: MultivariateFunctionalData

to_grid()[source]#

Convert the data to grid.

Returns: The data in grid format.
Return type: MultivariateFunctionalData

to_long(reindex=True)[source]#

Convert the data to long format.

This function transforms a MultivariateFunctionalData object into a list of pandas DataFrames. It uses the long format to represent each element of the MultivariateFunctionalData object as a DataFrame. This is a helper function, as the long format might be easier to work with for some computations.

Parameters: reindex (bool) – Should the observations be reindexed?
Returns: The data in a long format.
Return type: List[pd.DataFrame]

Examples

>>> argvals = DenseArgvals({'input_dim_0': np.array([1, 2, 3, 4, 5])})
>>> values = DenseValues(np.array([
...     [1, 2, 3, 4, 5],
...     [6, 7, 8, 9, 10],
...     [11, 12, 13, 14, 15]
... ]))
>>> fdata_dense = DenseFunctionalData(argvals, values)

>>> argvals = IrregularArgvals({
...     0: DenseArgvals({'input_dim_0': np.array([0, 1, 2, 3, 4])}),
...     1: DenseArgvals({'input_dim_0': np.array([0, 2, 4])}),
...     2: DenseArgvals({'input_dim_0': np.array([2, 4])})
... })
>>> values = IrregularValues({
...     0: np.array([1, 2, 3, 4, 5]),
...     1: np.array([2, 5, 6]),
...     2: np.array([4, 7])
... })
>>> fdata_irregular = IrregularFunctionalData(argvals, values)
>>> fdata = MultivariateFunctionalData([fdata_dense, fdata_irregular])

>>> fdata.to_long()
[  input_dim_0  id  values
             1   0       1
             2   0       2
             3   0       3
             4   0       4
             5   0       5
             1   1       6
             2   1       7
             3   1       8
             4   1       9
             5   1      10
             1   2      11
             2   2      12
             3   2      13
             4   2      14
             5   2      15,
   input_dim_0  id  values
             0   0       1
             1   0       2
             2   0       3
             3   0       4
             4   0       5
             0   1       2
             2   1       5
             4   1       6
             2   2       4
             4   2       7]
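The long format itself is easy to reproduce by hand. The sketch below flattens the irregular component of the example into (input_dim_0, id, values) rows using plain dictionaries of NumPy arrays; it does not use FDApy or pandas, and irregular_to_long is a hypothetical name.

```python
import numpy as np


def irregular_to_long(argvals, values):
    """Flatten {id: grid} and {id: values} into (argval, id, value) rows."""
    rows = []
    for obs_id in sorted(argvals):
        for point, value in zip(argvals[obs_id], values[obs_id]):
            # .item() converts NumPy scalars to plain Python numbers.
            rows.append((point.item(), obs_id, value.item()))
    return rows


argvals = {
    0: np.array([0, 1, 2, 3, 4]),
    1: np.array([0, 2, 4]),
    2: np.array([2, 4]),
}
values = {
    0: np.array([1, 2, 3, 4, 5]),
    1: np.array([2, 5, 6]),
    2: np.array([4, 7]),
}
rows = irregular_to_long(argvals, values)
```

Each observation contributes one row per sampling point, so curves with different numbers of points fit in the same table. The rows could then be passed to pandas.DataFrame with the three column names.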