Code Documentation

class GeneralRegression.GenericRegressor(funcs, regressor=None, ci=0.95, **kwargs)[source]

Uses a linear regression algorithm and a transformer to perform nonlinear regression. Using a set of functions \((f_0,\dots,f_n)\), and a point \(x\), lifts the point \(x\) to \((x, f_0(x),\dots,f_n(x))\) and applies a linear regression. The result will be a nonlinear regression based on \(f_0,\dots,f_n.\)

Parameters:
  • funcs – a function that transforms the data points
  • regressor – the linear regression method which should be scikit-learn compatible; default: BayesianRidge
  • ci – confidence interval; float between 0 and 1.
  • kwargs – keyword arguments to be passed to funcs
fit(X, y)[source]

Calculates an orthonormal basis according to the given function basis and the linear regressor.

Parameters:
  • X – Training data
  • y – Target values
Returns:

self

predict(X)[source]

Predict using the Hilbert regression method

Parameters:X – data points for prediction
Returns:predicted values
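
The lifting step described for GenericRegressor can be illustrated without the library. The following is a minimal sketch, not the class's implementation: lift, solve, fit_lifted and predict_lifted are hypothetical helper names, ordinary least squares via the normal equations stands in for the scikit-learn compatible regressor, and a constant feature 1 stands in for that regressor's intercept.

```python
import math


def lift(x, funcs):
    """Map a scalar x to the features [1, x, f_0(x), ..., f_n(x)].

    The leading 1 is an assumption standing in for the intercept term.
    """
    return [1.0, x] + [f(x) for f in funcs]


def solve(A, b):
    """Solve the square system A w = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def fit_lifted(xs, ys, funcs):
    """Least-squares fit of y against the lifted features (normal equations)."""
    rows = [lift(x, funcs) for x in xs]
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(gram, rhs)


def predict_lifted(w, x, funcs):
    """Evaluate the fitted linear model on the lifted point."""
    return sum(wi * fi for wi, fi in zip(w, lift(x, funcs)))


# Data generated from y = 2 + 3 sin(x); a linear fit on (1, x, sin x) is
# nonlinear in x but linear in the lifted features.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [2.0 + 3.0 * math.sin(x) for x in xs]
w = fit_lifted(xs, ys, [math.sin])
y_new = predict_lifted(w, 1.2, [math.sin])
```

Because the model is linear in the lifted features, any linear regressor can replace the least-squares step without changing the lifting.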

Hilbert Space based regression

exception NpyProximation.Error(*args)[source]

Generic errors that may occur in the course of a run.

class NpyProximation.FunctionSpace(dim=1, measure=None, basis=None)[source]

A class that facilitates a few types of computations over function spaces of type \(L_2(X, \mu)\)

Parameters:
  • dim – the dimension of ‘X’ (default: 1)
  • measure – an object of type Measure representing \(\mu\)
  • basis – a finite basis of functions to construct a subset of \(L_2(X, \mu)\)
form_basis()[source]

Call this method to generate the orthogonal basis corresponding to the given basis. The result will be stored in a property called orth_base, which is a list of functions that are orthogonal to each other with respect to the measure measure over the given range domain.

inner(f, g)[source]

Computes the inner product of the two parameters with respect to the measure measure, i.e., \(\int_Xf\cdot g d\mu\).

Parameters:
  • f – callable
  • g – callable
Returns:

the quantity of \(\int_Xf\cdot g d\mu\)

project(f, g)[source]

Finds the projection of f on g with respect to the inner product induced by the measure measure.

Parameters:
  • f – callable
  • g – callable
Returns:

the quantity of \(\frac{\langle f, g\rangle}{\|g\|_2^2}g\)
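
As an illustration of inner and project, here is a minimal sketch for a discrete measure. It is not the library's code: the free functions inner and project and the dictionary mu are hypothetical, points are plain floats rather than tuples for brevity, and the projection is normalized by \(\langle g, g\rangle\), the usual orthogonal-projection convention.

```python
def inner(f, g, measure):
    """Inner product <f, g> = sum_x w_x f(x) g(x) for a discrete measure.

    `measure` maps each point x to its weight w_x.
    """
    return sum(w * f(x) * g(x) for x, w in measure.items())


def project(f, g, measure):
    """Return the projection of f on g as a new callable."""
    coeff = inner(f, g, measure) / inner(g, g, measure)
    return lambda x: coeff * g(x)


# Uniform weights on five points of [-1, 1] (an arbitrary choice).
mu = {-1.0: 0.2, -0.5: 0.2, 0.0: 0.2, 0.5: 0.2, 1.0: 0.2}


def f(x):
    return x * x + x


def g(x):
    return x


p = project(f, g, mu)


def residual(x):
    # What is left of f after removing its component along g.
    return f(x) - p(x)
```

The residual f - p is orthogonal to g under the same inner product, which is the property the Gram-Schmidt process relies on.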

series(f)[source]

Given a function f, this method finds and returns the coefficients of the series that approximates f as a linear combination of the elements of the orthogonal basis \(B\). In symbols \(\sum_{b\in B}\langle f, b\rangle b\).

Returns:the list of coefficients \(\langle f, b\rangle\) for \(b\in B\)
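
The relationship between form_basis and series can be sketched as a classical Gram-Schmidt process followed by inner products. This is an illustrative sketch under stated assumptions, not the class's implementation: form_orthonormal_basis and series are hypothetical free functions, the measure is a discrete one given as a dictionary of float points, and the basis is normalized so that \(\sum_{b\in B}\langle f, b\rangle b\) reproduces any f in the span.

```python
import math


def inner(f, g, measure):
    """<f, g> = sum_x w_x f(x) g(x) for a discrete measure {x: w_x}."""
    return sum(w * f(x) * g(x) for x, w in measure.items())


def form_orthonormal_basis(basis, measure):
    """Classical Gram-Schmidt: orthonormalize `basis` with respect to `measure`."""
    orth = []
    for f in basis:
        # Components of f along the orthonormal functions found so far.
        proj = [inner(f, b, measure) for b in orth]
        prev = list(orth)

        def residual(x, f=f, proj=proj, prev=prev):
            return f(x) - sum(c * b(x) for c, b in zip(proj, prev))

        norm = math.sqrt(inner(residual, residual, measure))
        orth.append(lambda x, r=residual, n=norm: r(x) / n)
    return orth


def series(f, orth, measure):
    """Coefficients <f, b> for each b in the orthonormal basis."""
    return [inner(f, b, measure) for b in orth]


mu = {-1.0: 0.2, -0.5: 0.2, 0.0: 0.2, 0.5: 0.2, 1.0: 0.2}
monomials = [lambda x: 1.0, lambda x: x, lambda x: x * x]
orth = form_orthonormal_basis(monomials, mu)


def target(x):
    return 3.0 + 2.0 * x


coeffs = series(target, orth, mu)


def approx(x):
    # Rebuild the approximation from its series coefficients.
    return sum(c * b(x) for c, b in zip(coeffs, orth))
```

Since target lies in the span of the monomials, approx agrees with it up to rounding error, and the coefficient on the quadratic direction vanishes.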
class NpyProximation.HilbertRegressor(deg=3, base=None, meas=None, fspace=None, c_limit=0.95)[source]

Regression using Hilbert space techniques, in scikit-learn style.

Parameters:
  • deg – int, default=3 The degree of polynomial regression. Only used if base is None
  • base – list, default = None a list of functions to form an orthogonal function basis
  • meas – NpyProximation.Measure, default = None the measure to form the \(L_2(\mu)\) space. If None, a discrete measure will be constructed based on fit inputs
  • fspace – NpyProximation.FunctionBasis, default = None the function subspace of \(L_2(\mu)\); if None, it will be initiated according to self.meas
fit(X, y)[source]

Calculates an orthonormal basis according to the given function space basis and the discrete measure from the training points.

Parameters:
  • X – Training data
  • y – Target values
Returns:

self

predict(X)[source]

Predict using the Hilbert regression method

Parameters:X – data points for prediction
Returns:predicted values
score(X, y, sample_weight=None)[source]

The default scoring method is the weighted mean square error
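
A weighted mean square error can be sketched as follows. The helper weighted_mse is hypothetical, and dividing by the total weight is an assumption; the normalization used by the library is not stated here.

```python
def weighted_mse(y_true, y_pred, sample_weight=None):
    """Weighted mean of squared errors; uniform weights if none are given."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    total = sum(sample_weight)
    squared = (
        w * (t - p) ** 2 for w, t, p in zip(sample_weight, y_true, y_pred)
    )
    return sum(squared) / total


y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
unweighted = weighted_mse(y_true, y_pred)
weighted = weighted_mse(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0])
```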

Parameters:
  • X
  • y
  • sample_weight
Returns:

class NpyProximation.Measure(density=None, domain=None)[source]

Constructs a measure \(\mu\) based on density and domain.

Parameters:
  • density

    the density over the domain:

    • if None is given, a uniform distribution is assumed
    • if a callable h is given, then \(d\mu=h(x)dx\)
    • if a dictionary is given, then \(\mu=\sum w_x\delta_x\), a discrete measure. The points \(x\) are the keys of the dictionary (tuples) and the weights \(w_x\) are the values.
  • domain – if density is a dictionary, it will be set by its keys. If callable, then domain must be a list of tuples defining the domain’s box. If None is given, it will be set to \([-1, 1]^n\)
integral(f)[source]

Calculates \(\int_{domain} fd\mu\).

Parameters:f – the integrand
Returns:the value of the integral
norm(p, f)[source]

Computes the p-norm of f with respect to the current measure, i.e., \((\int_{domain}|f|^p d\mu)^{1/p}\).

Parameters:
  • p – a positive real number
  • f – the function whose norm is desired.
Returns:

\(\|f\|_{p, \mu}\)
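
For the dictionary form of density, integral and norm reduce to weighted sums. The sketch below assumes a discrete measure \(\mu=\sum w_x\delta_x\) with tuple keys as described above; the free functions integral and norm are hypothetical stand-ins for the methods, not the library's code.

```python
def integral(f, measure):
    """Integral of f against the discrete measure {x: w_x}: sum_x w_x f(x)."""
    return sum(w * f(x) for x, w in measure.items())


def norm(p, f, measure):
    """p-norm of f: (integral of |f|^p) ** (1/p)."""
    return integral(lambda x: abs(f(x)) ** p, measure) ** (1.0 / p)


# Three one-dimensional points, written as tuples, with their weights.
mu = {(0.0,): 0.5, (1.0,): 0.25, (2.0,): 0.25}


def f(x):
    return x[0] - 1.0


total = integral(f, mu)
norm_1 = norm(1, f, mu)
norm_2 = norm(2, f, mu)
```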

class NpyProximation.Regression(points, dim=None)[source]

Given a set of points, i.e., a list P of tuples of equal length, this class computes the best approximation of a function that fits the data, in the following sense:

  • if no extra parameter is provided, meaning that an object is initiated like R = Regression(P), then calling R.fit() returns the linear regression that fits the data.
  • if at initiation the parameter deg=n is set, then R.fit() returns the polynomial regression of degree n.
  • if a basis of functions is provided by means of an OrthSystem object (R.SetOrthSys(orth)), then calling R.fit() returns the best approximation that can be found using the basis functions of the orth object.
Parameters:
  • points – a list of points to be fitted or a callable to be approximated
  • dim – dimension of the domain
fit()[source]

Fits the best curve based on the optionally provided orthogonal basis. If no basis is provided, it fits a polynomial of the degree given at initiation.

Returns:the fit

set_func_spc(sys)[source]

Sets the basis functions of the orthogonal system.

Parameters:sys – an orthsys.OrthSystem object
Returns:None

Note

For technical reasons, the measure needs to be given via the set_measure method. Otherwise, the Lebesgue measure on \([-1, 1]^n\) is assumed.

set_measure(meas)[source]

Sets the default measure for approximation.

Parameters:meas – a measure.Measure object
Returns:None

Time Series Tools

class ModelSelection.TimeSeriesCV(test_ratio=0.2, train_ratio=None, index=0)[source]

This is a very naive cross validator for time series. It simply sorts the given index (default 0) and splits the sorted index into a train and a test index set according to the given ratios.

Parameters:
  • test_ratio – (default .2) float between 0. and 1., the portion of test data
  • train_ratio – (default None-> .8) float between 0. and 1., the portion of train data
  • index – (default 0) the index of the column that corresponds to a time parameter in the data
get_n_splits(X=None, y=None, groups=None)[source]

Returns the number of splitting iterations in the cross-validator

Parameters:
  • X – Always ignored, exists for compatibility.
  • y – Always ignored, exists for compatibility.
  • groups – Always ignored, exists for compatibility.
Returns:

Returns the number of splitting iterations in the cross-validator which is 1 for time series.

split(X, y=None, groups=None)[source]

Generate indices to split data into training and test set.

Parameters:
  • X – array-like of shape (n_samples, n_features) Training data, where n_samples is the number of samples and n_features is the number of features.
  • y – array-like of shape (n_samples,), default=None The target variable for supervised learning problems.
  • groups – array-like of shape (n_samples,), default=None Group labels for the samples used while splitting the dataset into train/test set.
Returns:

  • train – The training set indices for that split.
  • test – The testing set indices for that split.
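
The sort-then-split idea behind TimeSeriesCV can be sketched in a few lines. This is an illustration, not the class's code: time_series_split is a hypothetical function, and placing the test block at the latest time stamps, with the training block immediately before it, is an assumption about how the ratios are applied.

```python
def time_series_split(X, test_ratio=0.2, train_ratio=None, index=0):
    """Return (train, test) row indices after sorting rows by column `index`."""
    if train_ratio is None:
        train_ratio = 1.0 - test_ratio
    # Row indices ordered by the time column.
    order = sorted(range(len(X)), key=lambda i: X[i][index])
    n = len(order)
    n_test = int(round(n * test_ratio))
    n_train = min(int(round(n * train_ratio)), n - n_test)
    # Assumption: the most recent rows form the test set.
    test = order[n - n_test:]
    train = order[n - n_test - n_train:n - n_test]
    return train, test


# Column 0 holds the time stamp; rows are deliberately out of order.
X = [[3, "c"], [1, "a"], [5, "e"], [2, "b"], [4, "d"]]
train, test = time_series_split(X)
```

Keeping every training row earlier in time than every test row avoids leaking future information into the fit, which is why a single ordered split is used rather than shuffled folds.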