Welcome to niseq’s documentation!

Below is the API documentation for the niseq Python package. Check out our GitHub repository and example notebooks for more information.

Cluster-level Tests

niseq.cluster_test.sequential_cluster_test_1samp(X, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

A sequential one-sample cluster test.

A sequential generalization of a one-sample cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]).

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using default statistic). If threshold is a dict (with keys 'start' and 'step') then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).

  • n_permutations (int, default: 1024) – Number of permutations.

  • stat_fun (callable | None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), uses mne.stats.ttest_1samp_no_p.

  • adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see: mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

  • max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).

  • exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.

  • t_power (float) – Power to raise the statistical values (usually t-values) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, t_power=1 will weight each location by its statistical score.

  • out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.

  • check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Statistic observed for all variables.

    clusters (list) – List type defined by out_type above.

    cluster_pv (array) – P-value for each cluster.

    H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
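
For example, here is a minimal usage sketch (the data are simulated; the argument names and the four return values follow the documentation above):

    import numpy as np
    from niseq.cluster_test import sequential_cluster_test_1samp

    # Simulated one-sample data: 60 observations, each a (40 times x 32 channels) map.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 40, 32))

    # Plan interim analyses at n = 20 and 40, with a final analysis at n_max = 60.
    looks, ps, adj_alphas, spending = sequential_cluster_test_1samp(
        X, look_times=[20, 40, 60], n_max=60, alpha=0.05, seed=0)

    # Reject the null at the first look whose minimum p-value falls below that
    # look's adjusted alpha.
    for n, p, a in zip([20, 40, 60], ps, adj_alphas):
        print(f"n={n}: min p = {p:.4f}, adjusted alpha = {a:.4f}")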

niseq.cluster_test.sequential_cluster_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

A sequential cluster test for correlations.

A sequential generalization of a cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]) for testing a relationship between X and a continuous variable y. Uses Pearson correlation by default (or its z-transform if using TFCE), but the test statistic can be modified.

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).

  • y (array, shape (n_observations,)) – Value of dependent variable associated with each observation in X.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the correlation is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the correlation is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the correlation is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using default statistic). If threshold is a dict (with keys 'start' and 'step') then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).

  • n_permutations (int, default: 1024) – Number of permutations.

  • stat_fun (callable | None, default: None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), computes the Pearson correlation.

  • adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see: mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

  • max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).

  • exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.

  • t_power (float) – Power to raise the statistical values (usually F-values) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, t_power=1 will weight each location by its statistical score.

  • out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.

  • check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Statistic observed for all variables.

    clusters (list) – List type defined by out_type above.

    cluster_pv (array) – P-value for each cluster.

    H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
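
As a sketch with simulated data, the call below also requests TFCE via the dict-valued threshold parameter described above:

    import numpy as np
    from niseq.cluster_test import sequential_cluster_test_corr

    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 40, 32))   # (observations, times, channels)
    y = rng.normal(size=50)             # one continuous value per observation

    # A dict threshold (keys 'start' and 'step') switches on TFCE.
    looks, ps, adj_alphas, spending = sequential_cluster_test_corr(
        X, y, look_times=[25, 50], n_max=50,
        threshold=dict(start=0, step=0.2), seed=1)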

niseq.cluster_test.sequential_cluster_test_indep(X, labels, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

A sequential independent-sample cluster test.

A sequential generalization of an independent-sample cluster-based permutation test (as described by [4]) or of TFCE (as described by [6]).

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations, p[, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected. Note that the last dimension of X should correspond to the dimension represented in the adjacency parameter (e.g., spectral data should be provided as (observations, frequencies, channels/vertices)).

  • labels (array, shape (n_observations,)) – Condition label associated with each observation in X.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the statistic is thresholded above threshold. If tail is -1, the statistic is thresholded below threshold. If tail is 0, the statistic is thresholded on both sides of the distribution.

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • threshold (float | dict | None, default: None) – The so-called “cluster forming threshold” in the form of a test statistic (note: this is not an alpha level / “p-value”). If numeric, vertices with data values more extreme than threshold will be used to form clusters. If None, threshold will be chosen automatically to correspond to a p-value of 0.05 for the given number of observations (only valid when using default statistic). If threshold is a dict (with keys 'start' and 'step') then threshold-free cluster enhancement (TFCE) will be used (see TFCE example and [6]).

  • n_permutations (int, default: 1024) – Number of permutations.

  • stat_fun (callable | None) – Function called to calculate the test statistic. Must accept a 1D array as input and return a 1D array. If None (the default), uses mne.stats.f_oneway.

  • adjacency (scipy.sparse.spmatrix | None | False) – Defines adjacency between locations in the data, where “locations” can be spatial vertices, frequency bins, time points, etc. For spatial vertices, see: mne.channels.find_ch_adjacency(). If False, assumes no adjacency (each location is treated as independent and unconnected). If None, a regular lattice adjacency is assumed, connecting each location to its neighbor(s) along the last dimension of X (or the last two dimensions if X is 2D). If adjacency is a matrix, it is assumed to be symmetric (only the upper triangular half is used) and must be square with dimension equal to X.shape[-1] (for 2D data) or X.shape[-1] * X.shape[-2] (for 3D data) or (optionally) X.shape[-1] * X.shape[-2] * X.shape[-3] (for 4D data). The function mne.stats.combine_adjacency may be useful for 4D data.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

  • max_step (int) – Maximum distance between samples along the second axis of X to be considered adjacent (typically the second axis is the “time” dimension). Only used when adjacency has shape (n_vertices, n_vertices), that is, when adjacency is only specified for sensors (e.g., via mne.channels.find_ch_adjacency()), and not via sensors and further dimensions such as time points (e.g., via an additional call of mne.stats.combine_adjacency()).

  • exclude (bool array or None) – Mask to apply to the data to exclude certain points from clustering (e.g., medial wall vertices). Should be the same shape as X. If None, no points are excluded.

  • t_power (float) – Power to raise the statistical values (usually F-values) by before summing (sign will be retained). Note that t_power=0 will give a count of locations in each cluster, t_power=1 will weight each location by its statistical score.

  • out_type ('mask' | 'indices') – Output format of clusters within a list. If 'mask', returns a list of boolean arrays, each with the same shape as the input data (or slices if the shape is 1D and adjacency is None), with True values indicating locations that are part of a cluster. If 'indices', returns a list of tuple of ndarray, where each ndarray contains the indices of locations that together form the given cluster along the given dimension. Note that for large datasets, 'indices' may use far less memory than 'mask'. Default is 'indices'.

  • check_disjoint (bool) – Whether to check if the connectivity matrix can be separated into disjoint sets before clustering. This may lead to faster clustering, especially if the second dimension of X (usually the “time” dimension) is large.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Statistic observed for all variables.

    clusters (list) – List type defined by out_type above.

    cluster_pv (array) – P-value for each cluster.

    H0 (array, shape (n_permutations,)) – Max cluster-level stats observed under permutation.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
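
A minimal sketch with simulated two-group data (adjacency=False treats each location as independent, per the parameter description above):

    import numpy as np
    from niseq.cluster_test import sequential_cluster_test_indep

    rng = np.random.default_rng(2)
    X = rng.normal(size=(80, 40, 32))
    labels = rng.integers(0, 2, size=80)  # condition label per observation

    looks, ps, adj_alphas, spending = sequential_cluster_test_indep(
        X, labels, look_times=[40, 80], n_max=80, adjacency=False, seed=2)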

Max-type Tests

niseq.max_test.sequential_permutation_t_test_1samp(X, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

One-sample sequential permutation test with max-type correction.

This is a sequential generalization of mne.stats.permutations.permutation_t_test.

Uses max-type correction for multiple comparisons [4].

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • n_permutations (int, default: 1024) – Number of permutations.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.

    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).

    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.
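
For example, a sketch with simulated data that pairs the test with a Pocock spending function; the same calling pattern applies to the _corr and _indep variants documented below (passing y or labels, respectively):

    import numpy as np
    from niseq.max_test import sequential_permutation_t_test_1samp
    from niseq.spending_functions import PocockSpendingFunction

    rng = np.random.default_rng(3)
    X = rng.normal(0.3, 1.0, size=(60, 64))  # (observations, variables)

    looks, ps, adj_alphas, spending = sequential_permutation_t_test_1samp(
        X, look_times=[20, 40, 60], n_max=60, alpha=0.05,
        spending_func=PocockSpendingFunction(0.05, 60), seed=3)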

niseq.max_test.sequential_permutation_test_corr(X, y, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

A sequential permutation test for correlations with a max-type correction.

Tests for a relationship between X and a continuous variable y. Uses Pearson correlation by default, but the test statistic can be modified.

Uses max-type correction for multiple comparisons [4].

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.

  • y (array, shape (n_observations,)) – Value of dependent variable associated with each observation in X.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the correlation is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the correlation is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the correlation is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • n_permutations (int, default: 1024) – Number of permutations.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.

    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).

    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.

niseq.max_test.sequential_permutation_test_indep(X, labels, look_times, n_max, alpha=0.05, tail=0, spending_func=None, verbose=True, **kwargs)

Independent-sample sequential permutation test with max-type correction.

By default, this is a sequential generalization of an independent-sample max-t procedure when there are two groups, or of a max-F procedure when there are more.

Uses max-type correction for multiple comparisons [4].

Distributes Type I error over multiple, sequential analyses of the data (at interim sample sizes specified in look_times, never to exceed n_max) using a permutation-based adaptation of the alpha-spending procedure introduced by Lan and DeMets [1]. This allows data collection to be terminated before n_max is reached if there is enough evidence to reject the null hypothesis at an interim analysis, without inflating the false positive rate. This provides a principled way to determine sample size and can result in substantial efficiency gains over a fixed-sample design (i.e. it can achieve the same statistical power with a smaller expected sample size) [2, 3].

Parameters:
  • X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.

  • labels (array, shape (n_observations,)) – Condition label associated with each observation in X.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(n_max) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • n_permutations (int, default: 1024) – Number of permutations.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

Returns:

  • looks (dict) – Dictionary containing results of each look at the data, indexed by the values provided in look_times. Each entry of the dictionary is a tuple that contains:

    obs (array, shape (p[, q][, r])) – Test statistic observed for all variables.

    p_values (array, shape (p[, q][, r])) – P-values for all the tests (a.k.a. variables).

    H0 (array, shape (n_permutations,)) – Max test statistics obtained by permutations.

  • ps (array, shape (n_looks,)) – The lowest p-value obtained at each look specified in look_times. These can be compared to adj_alphas to determine on which looks, if any, one can reject the null hypothesis.

  • adj_alphas (array, shape (n_looks,)) – The adjusted significance thresholds for each look, chosen to control the false positive rate across multiple, sequential analyses. All p-values should be compared to the adjusted alpha for the look at which they were computed.

  • spending (array, shape (n_looks,)) – The value of the alpha spending function at each look.

Notes

The number and timing of looks at the data need not be planned in advance (other than n_max), but it is important to include all looks that have already occurred in look_times each time you analyze the data, to ensure that valid adjusted significance thresholds are computed. In your final analysis, look_times should contain the ordered sample sizes at all looks at the data that occurred during the study.

When reporting results, you should minimally include the sample sizes at each look, the minimum p-values at each look, the adjusted significance thresholds for each look (to which the p-values are compared), and the value of the alpha-spending function at each look. See [3] for further recommendations.

Alpha Spending Functions

class niseq.spending_functions.LinearSpendingFunction(alpha, max_n)

The linear spending function. This is the simplest possible spending function, which distributes Type I error rate allowance evenly over time.

__call__(n)
Parameters:

n (int) – An interim sample size

Returns:

spending – The value of the spending function at n

Return type:

float

__init__(alpha, max_n)
Parameters:
  • alpha (float) – Desired false-positive rate after all sequential tests.

  • max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.
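
For example, since the linear function satisfies f(0) = 0 and f(max_n) = alpha and spends evenly over time, an interim look halfway to max_n should spend half of the alpha budget (a minimal sketch):

    from niseq.spending_functions import LinearSpendingFunction

    f = LinearSpendingFunction(alpha=0.05, max_n=100)
    print(f(50))   # expected: 0.025 (half the budget at half the sample)
    print(f(100))  # expected: 0.05 (the full alpha at max_n)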

class niseq.spending_functions.OBrienFlemingSpendingFunction(alpha, max_n)

The O’Brien-Fleming spending function. A common choice for clinical trials or other confirmatory research, this spending function is conservative for early analyses, saving more power for later in the study.

__call__(n)
Parameters:

n (int) – An interim sample size

Returns:

spending – The value of the spending function at n

Return type:

float

__init__(alpha, max_n)
Parameters:
  • alpha (float) – Desired false-positive rate after all sequential tests.

  • max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.

class niseq.spending_functions.PiecewiseSpendingFunction(old_spending_func, break_n, new_max_n)

A piecewise spending function for use when adjusting the maximum sample size mid-study. The old spending function old_spending_func is used up until break_n, the interim sample size at which you decided to change the maximum sample size. After that, a linear function is used that connects (break_n, old_spending_func(break_n)) to (new_max_n, alpha).

This is useful if, for instance, (1) you accidentally collect more data than your original max_n, requiring you to adjust your spending function, (2) a conditional power analysis encourages you to change your sample size to achieve a desired Type II error rate, or (3) you can no longer collect your original max_n for practical reasons.

If max_n is adjusted multiple times, you can create piecewise spending functions recursively.

__call__(n)
Parameters:

n (int) – An interim sample size

Returns:

spending – The value of the spending function at n

Return type:

float

__init__(old_spending_func, break_n, new_max_n)
Parameters:
  • old_spending_func (instance of SpendingFunction) – An initialized instance of a SpendingFunction subclass with old_spending_func.max_n greater than break_n.

  • break_n (int) – The interim sample size at which the researcher decides to adjust their maximum sample size.

  • new_max_n (int) – The new maximum sample size.
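
A sketch of adjusting the maximum sample size mid-study (the numbers are illustrative):

    from niseq.spending_functions import LinearSpendingFunction, PiecewiseSpendingFunction

    # Originally planned for max_n = 100; at an interim n = 60 the target is
    # raised to 150, so the spending function must be adjusted.
    old = LinearSpendingFunction(alpha=0.05, max_n=100)
    f = PiecewiseSpendingFunction(old, break_n=60, new_max_n=150)

    # Below break_n the old function applies; above it, a linear segment runs
    # from (break_n, old(break_n)) up to (new_max_n, alpha).
    print(f(40), f(60), f(150))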

class niseq.spending_functions.PocockSpendingFunction(alpha, max_n)

The Pocock spending function. A very common spending function that spends your alpha budget somewhat liberally (compared to, e.g., the O’Brien-Fleming function) at the beginning of the study; consequently, you have more power early on, in exchange for a sharper penalty as you approach the maximum sample size.

__call__(n)
Parameters:

n (int) – An interim sample size

Returns:

spending – The value of the spending function at n

Return type:

float

__init__(alpha, max_n)
Parameters:
  • alpha (float) – Desired false-positive rate after all sequential tests.

  • max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.

class niseq.spending_functions.SpendingFunction(alpha, max_n)

An abstract base class for spending functions.

abstract __call__(n)
Parameters:

n (int) – An interim sample size

Returns:

spending – The value of the spending function at n

Return type:

float

__init__(alpha, max_n)
Parameters:
  • alpha (float) – Desired false-positive rate after all sequential tests.

  • max_n (int) – The sample size at which data collection will be terminated, regardless of whether the null hypothesis has been rejected.

Power Analysis

niseq.power.bootstrap.bootstrap_predictive_power_1samp(X, test_func, look_times, n_max, alpha=0.05, conditional=False, n_simulations=1024, seed=None, n_jobs=None, **test_func_kwargs)

Predictive power analysis via Bayesian bootstrap

Computes the predictive power non-parametrically using the Bayesian bootstrap. Optionally, you can condition on the current data to get conditional power, which is useful for adaptive designs. Only valid for one-sample (or paired-sample) tests.

Statistics computed from resamples using the Bayesian bootstrap, as opposed to the frequentist bootstrap, can be interpreted as draws from the posterior distribution with an uninformative prior [1]. Thus, results here can be conveniently interpreted as Bayesian predictive power. As recommended by [2] (not in English) and helpfully restated by [3] (in English), resampling weights are drawn from Dirichlet(alpha = 4) for a better approximation.

This functionality is experimental. It is the best catch-all way to do a power analysis for permutation tests I can think of, and similar resampling approaches to estimating power have been used in the literature (e.g. by [4]); however, it should be noted that the neuroimaging literature has not converged upon a standardized approach to performing power analyses. The Bayesian bootstrap approach used here incorporates uncertainty about the effect size into the power estimate, which is handy since uncertainty about the true effect size is considerable following a small pilot study, or even a typical psychology/neuroimaging sample size, as pointed out by [5].

Parameters:
  • X (array, shape (n_observations, p[, q][, r])) – The data from which to resample. X should contain the observations for one group or paired differences. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. See documentation for user-input test_func for more details.

  • test_func (function) – The one-sample sequential test you want to run a power analysis for. Must accept look_times, n_max, alpha, and verbose arguments, and must return a tuple of results whose middle two elements are the p-values for each look and the adjusted alphas, respectively. This could be any user-facing function from niseq that ends in _1samp.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed n_max.

  • n_max (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • conditional (bool, default: False) – If True, performs a conditional power analysis; that is, computes the probability of a design rejecting the null hypothesis given that the data in X has already been collected and is included in the analysis, as in an adaptive design. If False (default), performs a prospective power analysis (e.g. if you’re using pilot data or data from another study to inform sample size planning for a study that hasn’t begun data collection).

  • n_simulations (int, default: 1024) – Number of bootstrap resamples/simulations to perform.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • **test_func_kwargs – You may input any arguments you’d like to be passed to test_func.

Returns:

res – A results dictionary with keys:

'uncorr_instantaneous_power' (list of float) – The power of a fixed-sample statistical test performed at each look.

'rejection_probability' (list of float) – The probability that a sequential test rejects the null hypothesis (for the first time) at each look time.

'cumulative_power' (list of float) – The power of a sequential test to reject the null hypothesis by each look time. res['cumulative_power'][-1] is the power of the full sequential procedure.

'uncorr_cumulative_power' (list of float) – Cumulative power if the rejection threshold at each look were not corrected using alpha-spending (as it should be).

'n_expected' (float) – The expected sample size for the sequential procedure.

'n_simulations' (int) – The number of bootstrap resamples used.

'n_orig_data' (int) – The sample size of the original data X, i.e. X.shape[0].

'conditional' (bool) – Whether the power analysis was conditional (True) or prospective (False).

'test_func' (str) – Name of the sequential test function used.

'test_func_kwargs' (dict) – A record of the arguments passed to the test function, including look_times and n_max.

Return type:

dict
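
A usage sketch (the pilot data are simulated; any user-facing niseq function ending in _1samp can serve as test_func, per the parameter description above):

    import numpy as np
    from niseq.power.bootstrap import bootstrap_predictive_power_1samp
    from niseq.max_test import sequential_permutation_t_test_1samp

    # Pilot data: 20 paired differences across 64 variables.
    rng = np.random.default_rng(4)
    pilot = rng.normal(0.2, 1.0, size=(20, 64))

    res = bootstrap_predictive_power_1samp(
        pilot, sequential_permutation_t_test_1samp,
        look_times=[30, 60, 90], n_max=90, n_simulations=200, seed=4)

    print(res['cumulative_power'][-1])  # power of the full sequential procedure
    print(res['n_expected'])            # expected sample size with early stopping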

Backend Functions

niseq._permutation.find_thresholds(H0, look_times, max_n, alpha=0.05, tail=0, spending_func=None)

Given a permutation null distribution for a corresponding sequence of look times and an alpha spending function, computes the adjusted significance thresholds required to control the false positive rate across all looks.

This isn’t meant to be accessed directly by users, but it can be used together with generate_permutation_dist to create new sequential tests if you’re confident you know what you’re doing.

Parameters:
  • H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across look times.

  • look_times (list of int) – Sample sizes at which the statistical test is applied to the data, in order. Must not exceed max_n.

  • max_n (int) – Sample size at which data collection is completed, regardless of whether the null hypothesis has been rejected.

  • alpha (float, default: 0.05) – Desired false positive rate after all looks at the data (i.e. at n_max)

  • tail (-1 or 0 or 1, default: 0) – If tail is 1, the alternative hypothesis is that the mean of the data is greater than 0 (upper tailed test). If tail is 0, the alternative hypothesis is that the mean of the data is different than 0 (two tailed test). If tail is -1, the alternative hypothesis is that the mean of the data is less than 0 (lower tailed test).

  • spending_func (instance of SpendingFunction, default: LinearSpendingFunction) – An initialized instance of one of SpendingFunction’s subclasses. This defines a monotonically increasing function such that f(0) = 0 and f(max_n) = alpha, determining how Type I error is distributed over sequential analyses. See [2, 3] for details, and see the niseq.spending_functions module for the provided spending functions.

Returns:

  • spending (array of shape (n_looks,)) – The value of the alpha spending function at each sample size in look_times.

  • adj_alphas (array of shape (n_looks,)) – The adjusted significance threshold against which to compare p-values at each sample size in look_times.

niseq._permutation.generate_permutation_dist(X, labels, look_times, n_permutations=1024, seed=None, n_jobs=None, statistic=_get_cluster_stats_samples, verbose=True, **statistic_kwargs)

This function computes the test statistic and its permutation distribution at each look time. It isn’t meant to be accessed directly in ordinary use, though it can be used in combination with find_thresholds to construct new sequential tests if you’re confident you know what you’re doing. You’ll want to read the source code carefully to make sure your statistic function is compatible.

Parameters:
  • X (array, shape (n_observations[, p][, q][, r])) – The data to be analyzed. The first dimension of the array is the number of observations; remaining dimensions comprise the size of a single observation. Observations must appear in the order in which they were collected.

  • labels (array of shape (n_observations,) | None) – Either condition labels for each observation in X, a continuous dependent variable to correlate with X, or None. If None, a one-sample (sign-flip) permutation scheme will be used; otherwise, an independent-sample (label-shuffle) permutation scheme is used.

  • n_permutations (int, default: 1024) – Number of permutations.

  • seed (None | int | instance of RandomState) – A seed for the NumPy random number generator (RNG). If None (default), the seed will be obtained from the operating system (see RandomState for details), meaning it will most likely produce different output every time this function or method is run. To achieve reproducible results, pass a value here to explicitly initialize the RNG with a defined state.

  • n_jobs (int | None) – The number of jobs to run in parallel. If -1, it is set to the number of CPU cores. Requires the joblib package. None (default) is a marker for ‘unset’ that will be interpreted as n_jobs=1 (sequential execution) unless the call is performed under a joblib.parallel_backend() context manager that sets another value for n_jobs.

  • verbose (bool | str | int | None) – Control verbosity of the logging output. If None, use the default verbosity level. See the logging documentation and mne.verbose() for details. Should only be passed as a keyword argument.

  • statistic (callable, default: _get_cluster_stats_samples) – The test statistic to compute on the data, e.g. a cluster statistic or a max-t statistic. The last value statistic returns must be the omnibus test statistic (e.g. the max-t or the cluster size), though additional values may be returned and will be passed through in the obs dictionary.

  • **statistic_kwargs – You may pass arbitrary arguments to the statistic function.

Returns:

  • obs (dict) – The output of statistic indexed by look time in look_times.

  • H0 (array of shape (n_permutations, n_looks)) – The joint permutation null distribution of the test statistic across look times.
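
For instance, the two backend functions can be chained as below (a sketch with simulated one-sample data, assuming the default cluster statistic works without extra statistic_kwargs):

    import numpy as np
    from niseq._permutation import find_thresholds, generate_permutation_dist

    rng = np.random.default_rng(5)
    X = rng.normal(size=(60, 40, 32))
    look_times = [20, 40, 60]

    # labels=None selects the one-sample (sign-flip) permutation scheme.
    obs, H0 = generate_permutation_dist(
        X, labels=None, look_times=look_times, n_permutations=256, seed=5)

    # Convert the joint null distribution into look-specific adjusted alphas.
    spending, adj_alphas = find_thresholds(H0, look_times, 60, alpha=0.05)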