Extensions

Winsorizer

class BPt.extensions.Scalers.Winsorizer(quantile_range=(5, 95), copy=True)

This Scaler performs winsorization, i.e., clipping by feature.

Parameters
  • quantile_range (tuple (q_min, q_max), 0.0 < q_min < q_max < 100.0) – The lower and upper quantiles to which values are clipped. Default: (5.0, 95.0).

  • copy (boolean, optional, default is True) – Make a copy of the data.
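To make the clipping concrete, here is a minimal NumPy sketch of per-feature winsorization. It is an illustration only, not BPt's implementation: the helper name `winsorize_by_feature` is hypothetical, and it assumes the quantile bounds are computed per column with `np.percentile` on the same array that is clipped.

```python
import numpy as np


def winsorize_by_feature(X, quantile_range=(5, 95)):
    """Hypothetical helper: clip each column of X to its own quantile bounds."""
    X = np.asarray(X, dtype=float)
    q_min, q_max = quantile_range
    # One lower and one upper bound per feature (column)
    lower = np.percentile(X, q_min, axis=0)
    upper = np.percentile(X, q_max, axis=0)
    # np.clip broadcasts the per-column bounds across the rows
    return np.clip(X, lower, upper)


X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0],
              [4.0, 400.0],
              [1000.0, -5000.0]])

# A wide range is used here only so the effect is visible on five rows
X_w = winsorize_by_feature(X, quantile_range=(25, 75))
```

The extreme values 1000.0 and -5000.0 are pulled in to the per-column bounds, while values already inside the bounds are unchanged.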

FeatureSelector

class BPt.extensions.Feat_Selectors.FeatureSelector(mask='sets as random features')

Custom BPt feature selector for integrating feature selection into a hyper-parameter search.

Parameters

mask ({'sets as random features', 'sets as hyperparameters'}) –

  • ‘sets as random features’: Use random features.

  • ‘sets as hyperparameters’: Each feature is set as a hyperparameter, such that the parameter search can tune whether each feature is included or not.
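The idea behind ‘sets as hyperparameters’ can be pictured as one binary switch per feature. The sketch below is only an illustration of that idea in plain NumPy; `apply_feature_mask` and `candidate_mask` are hypothetical names and do not reflect how BPt wires the mask into its parameter search.

```python
import numpy as np


def apply_feature_mask(X, mask):
    """Hypothetical helper: keep only the columns whose mask entry is truthy."""
    mask = np.asarray(mask, dtype=bool)
    return np.asarray(X)[:, mask]


# 3 subjects, 4 features
X = np.arange(12).reshape(3, 4)

# One candidate proposed by a search: keep features 0 and 2, drop 1 and 3
candidate_mask = [1, 0, 1, 0]
X_sel = apply_feature_mask(X, candidate_mask)
```

A parameter search would propose many such masks and score the downstream model on each reduced `X_sel`.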

SurfLabels

class BPt.extensions.Loaders.SurfLabels(labels, background_label=0, mask=None, strategy='mean', vectorize=True)

This class functions similarly to NiftiLabelsMasker from nilearn, but is instead for surfaces (though it could work on a cifti image too).

Parameters
  • labels (str or array-like) – This should represent an array, of the same size as the data dimension, as a mask with unique integer values for each ROI. You can also pass a str location from which to load this array. The saved file must be loadable by numpy.load; if it is not a numpy array, loading will be attempted with nilearn.surface.load_surf_data(), which requires nilearn to be installed.

  • background_label (int, array-like of int, optional) –

    This parameter determines which label, if any, in the corresponding passed labels, should be treated as ‘background’, so that no ROI is calculated for that value or values. You may pass either a single integer value or an array-like of integer values.

    If no background label is desired, just pass a label which doesn’t exist in any of the data, e.g., -100.

    default = 0
    

  • mask (None, str or array-like, optional) –

    This parameter allows you to optionally pass a mask of values for which ROI values should not be calculated. It can be passed as a str or array-like of values (just like labels), and should be a boolean array (or 1’s and 0’s), where a value of 1 means that value will be ignored (set to the background label), and a value of 0 means that value will be kept. This array should have the same length as the passed labels.

    default = None
    

  • strategy (specific str, custom_func, optional) –

    This parameter dictates the function to be applied to each ROI of the data individually, e.g., mean to calculate the mean by ROI.

    If a str is passed, it must correspond to one of the below preset options:

    • ‘mean’

      Calculate the mean with np.mean

    • ‘sum’

      Calculate the sum with np.sum

    • ‘min’ or ‘minimum’

      Calculate the min value with np.min

    • ‘max’ or ‘maximum’

      Calculate the max value with np.max

    • ‘std’ or ‘standard_deviation’

      Calculate the standard deviation with np.std

    • ‘var’ or ‘variance’

      Calculate the variance with np.var

    If a custom function is passed, it must accept two arguments, custom_func(X_i, axis=data_dim). X_i is a subject’s data array restricted to where labels == some class i, and can be either a 1D or 2D array. The axis argument specifies which axis is the data dimension: if calculating for a time-series of shape [n_timepoints, data_dim], then data_dim = 1; if calculating for stacked contrasts of shape [data_dim, n_contrasts], then data_dim = 0; and for a 1D array, data_dim is also 0.

    default = 'mean'
    

  • vectorize (bool, optional) –

    Whether the returned array should be flattened to 1D. E.g., if this is the last step in a set of loader steps this should be True; if it comes before a different step, it may make sense to set it to False.

    default = True
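
As a rough picture of what the labels, background_label and strategy parameters describe together, the following NumPy sketch reduces a 1D data array by label. It is an assumption-laden illustration, not BPt's code: `roi_reduce` and `median_func` are hypothetical names, and only the 1D case (data_dim = 0) is handled.

```python
import numpy as np


def roi_reduce(data, labels, background_label=0, func=np.mean, axis=0):
    """Hypothetical helper: apply func to the data of every non-background label."""
    data = np.asarray(data)
    labels = np.asarray(labels)
    # Accept a single int or an array-like of ints as background
    background = np.atleast_1d(background_label)
    roi_ids = [i for i in np.unique(labels) if i not in background]
    # func is called in the documented form custom_func(X_i, axis=data_dim)
    return np.array([func(data[labels == i], axis=axis) for i in roi_ids])


def median_func(X_i, axis):
    """Example of a custom strategy following the two-argument convention."""
    return np.median(X_i, axis=axis)


data = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 90.0])
labels = np.array([0, 1, 1, 2, 2, 2])  # label 0 is treated as background

roi_means = roi_reduce(data, labels, func=np.mean)
roi_medians = roi_reduce(data, labels, func=median_func)
```

The first data point carries the background label and contributes to neither ROI, so two values are returned per call, one for label 1 and one for label 2.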
    

SurfMaps

class BPt.extensions.Loaders.SurfMaps(maps, strategy='auto', mask=None, vectorize=True)

Similar to NiftiMapsMasker from nilearn, but for surfaces and designed to work with the BPt Loader.

This object calculates the signal for each of the passed maps as extracted from the input during fit, and returns for each map a value.

Parameters

maps (str or array-like) –

This parameter represents the maps to apply to each surface, where the shape of the passed maps should be (# of vertices, # of maps), or in other words, the size of the data array in the first dimension and the number of maps (i.e., the number of outputted ROIs from fit) as the second dimension.

You may pass maps as either an array-like, or the str file location of a numpy or other valid surface file format array from which to load.

strategy ({‘auto’, ‘ls’, ‘average’}, optional) –

The strategy with which the maps are used to extract signal. If ‘ls’ is selected, which stands for least squares, the least-squares solution will be used for each region.

Alternatively, if ‘average’ is passed, then the weighted average value for each map will be computed.

By default ‘auto’ will be selected, which will use ‘average’ if the passed maps contain only positive weights, and ‘ls’ in the case that there are any negative values in the passed maps.

Otherwise, you can set a specific strategy. In deciding which method to use, consider an example. Suppose the fit data and maps are

data = np.array([1, 1, 5, 5])
maps = np.array([[0, 0],
                 [0, 0],
                 [1, -1],
                 [1, -1]])

In this case, the ‘ls’ method would yield region signals [2.5, -2.5], whereas the weighted ‘average’ method would yield [5, 5], notably ignoring the negative weights. This highlights an important limitation of the weighted average method: it does not handle negative values well.
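
These two results can be reproduced with plain NumPy. The sketch below assumes that ‘ls’ corresponds to the minimum-norm least-squares solution from `np.linalg.lstsq` and that ‘average’ corresponds to `np.average` with each map column used as weights; the helper names `ls_signal` and `average_signal` are hypothetical and not part of BPt.

```python
import numpy as np

data = np.array([1, 1, 5, 5], dtype=float)
maps = np.array([[0, 0],
                 [0, 0],
                 [1, -1],
                 [1, -1]], dtype=float)


def ls_signal(data, maps):
    """Least-squares region signals: solve maps @ signals ~= data."""
    return np.linalg.lstsq(maps, data, rcond=None)[0]


def average_signal(data, maps):
    """Weighted average of the data, one value per map column."""
    return np.array([np.average(data, weights=maps[:, i])
                     for i in range(maps.shape[1])])


ls_sol = ls_signal(data, maps)            # approximately [2.5, -2.5]
average_sol = average_signal(data, maps)  # approximately [5.0, 5.0]
```

For the second map, the negative weights cancel in the ratio sum(w * x) / sum(w), which is why the weighted average reports 5 rather than a negative signal.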

On the other hand, consider changing the maps weights to

data = np.array([1, 1, 5, 5])
maps = np.array([[0, 1],
                 [0, 2],
                 [1, 0],
                 [1, 0]])

ls_sol = [5. , 0.6]
average_sol = [5, 1]

In this case, we can see that the weighted average gives a perhaps more intuitive summary of the regions. In general, it depends on what signal you are trying to summarize, and how you are trying to summarize it.
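
The ‘auto’ rule described earlier can be sketched as a simple check for negative weights, shown here on the two example map sets. This is a hedged illustration: `pick_strategy` is a hypothetical helper written from the description above, not BPt's own function.

```python
import numpy as np


def pick_strategy(maps):
    """Hypothetical helper: 'ls' if any weight is negative, else 'average'."""
    return 'ls' if (np.asarray(maps) < 0).any() else 'average'


maps_signed = np.array([[0, 0],
                        [0, 0],
                        [1, -1],
                        [1, -1]])
maps_positive = np.array([[0, 1],
                          [0, 2],
                          [1, 0],
                          [1, 0]])

chosen_signed = pick_strategy(maps_signed)      # 'ls'
chosen_positive = pick_strategy(maps_positive)  # 'average'

# Reproduce the second example's values under the same NumPy assumptions
data = np.array([1, 1, 5, 5], dtype=float)
ls_sol = np.linalg.lstsq(maps_positive.astype(float), data, rcond=None)[0]
average_sol = np.array([np.average(data, weights=maps_positive[:, i])
                        for i in range(maps_positive.shape[1])])
```

Under these assumptions the second map set is routed to ‘average’, which gives [5, 1], while the least-squares solution for the same maps is [5, 0.6].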

mask (None, str or array-like, optional) –

This parameter allows you to optionally pass a mask of values for which ROI values should not be calculated. It can be passed as a str or array-like of values (just like maps), and should be a boolean array (or 1’s and 0’s), where a value of 1 means that value will be ignored (set to 0), and a value of 0 means that value will be kept. This array should have the same length as the passed maps. Specifically, where the shape of maps is (size, n_maps), the shape of mask should be (size).

default = None

vectorize (bool, optional) –

Whether the returned array should be flattened to 1D. E.g., if this is the last step in a set of loader steps this should be True. Also note that if the surface data it is applied to is 1D, then the output will be 1D regardless of this parameter.

default = True