BayesT2Samples

class robusta.groupwise.BayesT2Samples(null_interval=None, prior_scale: str = 'medium', sample_from_posterior: bool = False, iterations: int = 10000, mu: float = 0, **kwargs)

Bases: robusta.groupwise.models.T2Samples

Run a Bayesian two-samples t-test, either dependent or independent.

Parameters
  • x (keys in data or NumPy array of values, optional) – x and y can be used to specify the two samples directly. If str, both must be keys to columns in the dataframe (data argument). If array-like, they must contain only objects that can be coerced to numeric. If not specified, they are inferred from the following arguments: formula, then between or within (in this order).

  • y (keys in data or NumPy array of values, optional) – x and y can be used to specify the two samples directly. If str, both must be keys to columns in the dataframe (data argument). If array-like, they must contain only objects that can be coerced to numeric. If not specified, they are inferred from the following arguments: formula, then between or within (in this order).

  • paired (bool) – Whether the test is dependent/paired-samples (True) or independent-samples (False). If not specified, robusta will try to infer it from other input arguments: formula, independent, between and within (in this order). Default is True.

  • tail (str, optional) – Direction of the tested alternative hypothesis. Possible values are ‘x!=y’ (two-sided test; aliased by ‘two.sided’), ‘x<y’ (lower tail; aliased by ‘less’) and ‘x>y’ (upper tail; aliased by ‘greater’). Whitespace characters in the input are ignored. Default value is ‘x != y’.

  • ci (int) – Width of the confidence interval around the sample mean difference. A number between 0 and 100. Default value is 95.

  • independent (str, optional) – The name of the column identifying the independent variable in the data. The column can be either numeric or object, but must contain at most two unique values. Acts as an alias for within in paired tests and for between in unpaired tests.

  • data (pd.DataFrame) – A dataframe containing the subject, dependent and independent variables as columns.

  • formula (str, optional) – An R-style formula describing the statistical model, in the form (dependent ~ between + within | subject). If used, the parsed formula overrides the following arguments: dependent, between, within and subject.

  • dependent (key in data, optional) – The name of the column identifying the dependent variable (i.e., response variable) in the data. The column data type should be numeric or a string that can be coerced to numeric. Overridden by formula if specified. Required if formula is not specified.

  • between (key(s) in data (str or array-like), optional) – The name of the column identifying the independent variable (i.e., predictor variable) in the data. Identifies variables that are manipulated between different subject units (i.e., exogenous variable). Overridden by formula if specified. Not required if formula is not specified, provided that within is specified.

  • within (key(s) in data (str or array-like), optional) – The name of the column identifying the independent variable (i.e., predictor variable) in the data. Identifies variables that are manipulated within different subject units (i.e., endogenous variable). Overridden by formula if specified. Not required if formula is not specified, provided that between is specified.

  • subject (str or key in data, optional) – The name of the column identifying the sampling unit in the data (i.e., subject). Overridden by formula if specified. Required if formula is not specified.

  • agg_func (str (name of pandas aggregation function) or callable, optional) – Specifies how to aggregate observations within each sampling unit.

  • fit (bool, optional) – Whether to run the statistical test upon object creation. Default is True.

  • prior_scale (float or str, optional) – Controls the scale (width) of the prior distribution. A value of 1.0 yields a standard Cauchy prior. The strings ‘medium’, ‘wide’ or ‘ultrawide’ can be passed instead of a float (corresponding to \(\frac{\sqrt{2}}{2}\), \(1\) and \(\sqrt{2}\), respectively). The class signature lists ‘medium’ as the default. TODO - limit str input values.

  • sample_from_posterior (bool, optional) – If True, returns samples from the posterior; if False, returns the Bayes factor. Default is False.

  • iterations (int, optional) – Number of samples used to estimate the Bayes factor or the posterior. Default is 10000 (as given in the class signature).

  • mu (float, optional) – The hypothesized mean of the differences between the samples. Default is 0.

  • kwargs (mapping, optional) – Keyword arguments passed down to robusta.groupwise.models.T2Samples.
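
The handling of the tail and prior scale arguments described above can be illustrated with a minimal sketch. It is not robusta's implementation: the helper names resolve_prior_scale and normalize_tail, and the two lookup tables, are hypothetical and only restate the aliases and named scales listed in the parameter descriptions.

```python
import math

# Named prior scales, as listed in the prior scale parameter description.
NAMED_PRIOR_SCALES = {
    "medium": math.sqrt(2) / 2,
    "wide": 1.0,
    "ultrawide": math.sqrt(2),
}

# Aliases for the tail argument, as listed in the tail parameter description.
TAIL_ALIASES = {
    "two.sided": "x!=y",
    "less": "x<y",
    "greater": "x>y",
}
CANONICAL_TAILS = {"x!=y", "x<y", "x>y"}


def resolve_prior_scale(prior_scale):
    """Hypothetical helper: map a named or numeric prior scale to a float."""
    if isinstance(prior_scale, str):
        if prior_scale not in NAMED_PRIOR_SCALES:
            raise ValueError(f"Unknown prior scale name: {prior_scale!r}")
        return NAMED_PRIOR_SCALES[prior_scale]
    scale = float(prior_scale)
    if scale <= 0:
        raise ValueError("The prior scale must be positive")
    return scale


def normalize_tail(tail):
    """Hypothetical helper: drop whitespace and map aliases to 'x!=y', 'x<y' or 'x>y'."""
    compact = "".join(tail.split())  # whitespace in the input is ignored
    canonical = TAIL_ALIASES.get(compact, compact)
    if canonical not in CANONICAL_TAILS:
        raise ValueError(f"Unknown tail specification: {tail!r}")
    return canonical


if __name__ == "__main__":
    print(resolve_prior_scale("medium"))
    print(normalize_tail("x != y"))
    print(normalize_tail("greater"))
```

Keeping both mappings as plain dictionaries makes the accepted string values explicit, which is one way the TODO about limiting str input values could be addressed.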

Notes

R function: ttestBF (https://www.rdocumentation.org/packages/BayesFactor/versions/0.9.12-4.2/topics/ttestBF) from the BayesFactor package [1].

References

[1] Morey, R. D., Rouder, J. N., Jamil, T., & Morey, M. R. D. (2015). Package ‘bayesfactor’.