Math with uncertain and censored measurements (pwkit.msmt)

pwkit.msmt - Working with uncertain measurements.

Classes:

Uval - An empirical uncertain value represented by numerical samples.
LimitError - Raised on illegal operations on upper/lower limits.
Lval - Container for either precise values or upper/lower limits.
Textual - A measurement recorded in textual form.

Generic unary functions on measurements:

absolute - abs(x)
arccos - As named.
arcsin - As named.
arctan - As named.
cos - As named.
errinfo - Get (limtype, repval, plus_1_sigma, minus_1_sigma)
expm1 - exp(x) - 1
exp - As named.
fmtinfo - Get (typetag, text, is_imprecise) for textual round-tripping.
isfinite - True if the value is well-defined and finite.
liminfo - Get (limtype, repval)
limtype - -1 if the datum is an upper limit; 1 if lower; 0 otherwise.
log10 - As named.
log1p - log(1+x)
log2 - As named.
log - As named.
negative - -x
reciprocal - 1/x
repval - Get a “representative” value of x (in case it is uncertain).
sin - As named.
sqrt - As named.
square - x**2
tan - As named.
unwrap - Get a version of x on which algebra can be performed.

Generic binary mathematical-ish functions:

add - x + y
divide - x / y, never with floor-integer division
floor_divide - x // y
multiply - x * y
power - x ** y
subtract - x - y
true_divide - x / y, never with floor-integer division
typealign - Return (x*, y*) cast to same algebra-friendly type: float, Uval, or Lval.

Miscellaneous functions:

is_measurement - Check whether an object is numerical
find_gamma_params - Compute reasonable Γ distribution parameters given mode/stddev.
pk_scoreatpercentile - Simplified version of scipy.stats.scoreatpercentile.
sample_double_norm - Sample from a quasi-normal distribution with asymmetric variances.
sample_gamma - Sample from a Γ distribution with α/β parametrization.

Variables:

lval_unary_math - Dict of unary math functions operating on Lvals.
parsers - Dict of type tag to parsing functions.
scalar_unary_math - Dict of unary math functions operating on scalars.
textual_unary_math - Dict of unary math functions operating on Textuals.
UQUANT_UNCERT - Scale of uncertainty assumed in cases where it’s unquantified.
uval_default_repval_method - Default method for computing Uval representative values.
uval_dtype - The Numpy dtype of Uval data (often ignored!)
uval_nsamples - Number of samples used when constructing Uvals
uval_unary_math - Dict of unary math functions operating on Uvals.

exception pwkit.msmt.LimitError
class pwkit.msmt.Lval(kind, value)

A container for either precise values or upper/lower limits. Constructed as Lval(kind, value), where kind is "exact", "uncertain", "toinf", "tozero", "pastzero", or "undef". Most easily constructed via Textual.parse(). Can also be constructed with Lval.from_other().

Supported operations are unicode() str() repr() -(neg) abs() + - * / ** += -= *= /= **=.

class pwkit.msmt.Textual(tkind, dkind, data)

A measurement recorded in textual form.

Textual.from_exact(text, tkind='none') - text is passed to float()
Textual.parse(text, tkind='none') - text as described below.

Transformation kinds are ‘none’, ‘log10’, or ‘positive’. Expressions for values take the form ‘1.234’, ‘<2’, ‘>3’, ‘~7’, ‘6to8’, ‘7pm0.1’, or ‘12p1m0.3’.
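To make the expression forms concrete, here is a minimal standalone sketch that sorts each form into a category and reports the limit type it implies. It does not use pwkit and is not its parser: the `classify` helper, the category labels, and the reading of each form (for example, that `6to8` denotes a range and `~7` an approximate value) are assumptions made for illustration.

```python
import re

# Hypothetical helper, not part of pwkit: the labels and the meaning
# assigned to each expression form are illustrative assumptions.
NUM = r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"

# Ordered (pattern, label, limtype) rules.
RULES = [
    (re.compile(rf"<\s*({NUM})"), "upper limit", -1),
    (re.compile(rf">\s*({NUM})"), "lower limit", 1),
    (re.compile(rf"~\s*({NUM})"), "approximate", 0),
    (re.compile(rf"({NUM})to({NUM})"), "range", 0),
    (re.compile(rf"({NUM})pm({NUM})"), "symmetric error", 0),
    (re.compile(rf"({NUM})p({NUM})m({NUM})"), "asymmetric error", 0),
    (re.compile(rf"({NUM})"), "exact", 0),
]


def classify(text):
    """Return (label, limtype, numbers) for one expression string."""
    for pattern, label, lim in RULES:
        match = pattern.fullmatch(text.strip())
        if match:
            return label, lim, tuple(float(g) for g in match.groups())
    raise ValueError(f"unrecognized expression: {text!r}")


for example in ["1.234", "<2", ">3", "~7", "6to8", "7pm0.1", "12p1m0.3"]:
    print(example, "->", classify(example))
```

The limit-type values follow the convention stated for limtype(): -1 for upper limits, 1 for lower limits, 0 otherwise.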

Methods:

unparse() - Return parsed text (but not tkind!)
unwrap() - Express as float/Uval/Lval as appropriate.
repval(limitsok=False) - Get single scalar “representative” value.
limtype() - -1 if upper limit; +1 if lower; 0 otherwise.

Supported operations: unicode() str() repr() [latexification] -(neg) abs() + - * / **

limtype()

Return -1 if this value is an upper limit, 1 if it is a lower limit, 0 otherwise.

repval(limitsok=False)

Get a best-effort representative value as a float. This can be DANGEROUS because it discards limit information, which is rarely wise.

pwkit.msmt.UQUANT_UNCERT = 0.2

Some values are known to be uncertain, but their uncertainties have not been quantified. This is lame but it happens. In this case, assume a 20% uncertainty.

We could infer uncertainties from the number of written digits: i.e., assuming “1.2” is uncertain by 0.05 or so, while “1.2000” is uncertain by 0.00005 or so. But there are many cases in astronomy where people just list values that are 20% uncertain and give them to multiple decimal places. I’d rather be conservative with these values than overly optimistic.

Code to do the appropriate parsing is in the Python uncertainties package, in its __init__.py:parse_error_in_parentheses().
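The two options discussed above can be compared numerically. The following is a small sketch, not pwkit code: `digit_based_uncert` and `fractional_uncert` are hypothetical helper names, and "half a unit in the last written digit" is one assumed reading of "uncertain by 0.05 or so".

```python
from decimal import Decimal

UQUANT_UNCERT = 0.2  # the documented 20% default


def digit_based_uncert(text):
    """Half a unit in the last written digit (illustrative only)."""
    exponent = Decimal(text).as_tuple().exponent
    return 0.5 * 10.0 ** exponent


def fractional_uncert(text, frac=UQUANT_UNCERT):
    """A fixed fraction of the value's magnitude."""
    return frac * abs(float(text))


for text in ["1.2", "1.2000"]:
    print(text, digit_based_uncert(text), fractional_uncert(text))
```

For both strings the fractional rule gives the same, much larger uncertainty, which is the conservative behavior the text argues for.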

class pwkit.msmt.Uval(data)

An empirical uncertain value, represented by samples.

Constructors are:

  • Uval.from_other()

  • Uval.from_fixed()

  • Uval.from_norm()

  • Uval.from_unif()

  • Uval.from_double_norm()

  • Uval.from_gamma()

  • Uval.from_pcount()

Key methods documented below are repvals() and text_pieces().

Supported operations are: unicode() str() repr() [latexification]  + -(sub) * // / % ** += -= *= //= %= /= **= -(neg) ~ abs()

static from_pcount(nevents)

We assume a Poisson process. nevents is the number of events in some interval. The distribution of values is the distribution of the Poisson rate parameter given this observed number of events, where the “rate” is in units of events per interval of the same duration. The max-likelihood value is nevents, but the mean value is nevents + 1. The gamma distribution is obtained by assuming an improper, uniform prior for the rate between 0 and infinity.
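The claim about the mode and mean can be checked numerically without any sampling. The sketch below is independent of pwkit: it evaluates the Poisson likelihood on a grid of trial rates, treats it as the posterior under a flat prior, and reads off the mode and mean. The function name, grid spacing, and upper cutoff are assumptions chosen for small counts.

```python
import math


def poisson_rate_posterior(nevents, step=0.01, span=60.0):
    """Grid-evaluate the rate posterior under a flat prior.

    `span` is an assumed upper cutoff that is adequate for small counts.
    """
    rates = [i * step for i in range(int(span / step) + 1)]
    # Poisson likelihood of observing `nevents` for each trial rate.
    like = [r ** nevents * math.exp(-r) / math.factorial(nevents) for r in rates]
    norm = sum(like)
    return rates, [v / norm for v in like]


rates, post = poisson_rate_posterior(5)
mode = rates[post.index(max(post))]             # most probable rate
mean = sum(r * p for r, p in zip(rates, post))  # posterior mean rate
print(mode, mean)
```

With nevents = 5 the mode lands at about 5 and the mean at about 6, matching nevents and nevents + 1.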

repvals(method)

Compute representative statistical values for this Uval. method may be either ‘pct’ or ‘gauss’.

Returns (best, plus_one_sigma, minus_one_sigma), where best is the “best” value in some sense, and the others correspond to values at the ~84 and 16 percentile limits, respectively. Because of the sampled nature of the Uval system, there is no single method to compute these numbers.

The “pct” method returns the 50th, 84.134th, and 15.866th percentile values, in that order.

The “gauss” method computes the mean μ and standard deviation σ of the samples and returns [μ, μ+σ, μ-σ].
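A rough standalone sketch of the two methods, operating on a plain list of samples, is shown here. It is not the Uval implementation: `repvals_sketch` and `percentile` are hypothetical names, and the linear-interpolation percentile is an assumed choice. Values are returned in the (best, plus_one_sigma, minus_one_sigma) order described above.

```python
import random
import statistics


def percentile(sorted_vals, pct):
    """Linearly interpolated percentile of pre-sorted data."""
    pos = (len(sorted_vals) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def repvals_sketch(samples, method):
    """Return (best, plus_one_sigma, minus_one_sigma) from raw samples."""
    if method == "pct":
        s = sorted(samples)
        return tuple(percentile(s, p) for p in (50.0, 84.134, 15.866))
    if method == "gauss":
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        return (mu, mu + sigma, mu - sigma)
    raise ValueError(f"unknown method: {method!r}")


rng = random.Random(0)
samples = [rng.gauss(10.0, 2.0) for _ in range(20000)]
print(repvals_sketch(samples, "pct"))
print(repvals_sketch(samples, "gauss"))
```

For normally distributed samples the two methods should agree closely; they diverge for skewed distributions, which is why no single method is preferred.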

text_pieces(method, uplaces=2, use_exponent=True)

Return (main, dhigh, dlow, sharedexponent), all as strings. The delta terms do not have sign indicators. Any item except the first may be None.

method is passed to Uval.repvals() to compute representative statistical limits.

pwkit.msmt.errinfo(msmt)

Return (limtype, repval, errval1, errval2). Like liminfo(), but also provides error bar information for values that have it.

pwkit.msmt.find_gamma_params(mode, std)

Given a modal value and a standard deviation, compute corresponding parameters for the gamma distribution.

Intended to be used to replace normal distributions when the value must be positive and the uncertainty is comparable to the best value. Conversion equations determined from the relations given in the sample_gamma() docs.
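One way to carry out this conversion is to solve the relations mode = (alpha - 1) / beta and variance = alpha / beta**2 for alpha and beta. The sketch below does that algebra directly; it is a derivation under those two relations, not necessarily the exact code pwkit uses, and the function name is hypothetical.

```python
import math


def gamma_params_from_mode_std(mode, std):
    """Solve mode = (alpha - 1) / beta and std**2 = alpha / beta**2."""
    if mode < 0 or std <= 0:
        raise ValueError("need mode >= 0 and std > 0")
    var = std ** 2
    # Substituting alpha = 1 + mode * beta into the variance relation gives
    # var * beta**2 - mode * beta - 1 = 0; take the positive root.
    beta = (mode + math.sqrt(mode ** 2 + 4.0 * var)) / (2.0 * var)
    alpha = 1.0 + mode * beta
    return alpha, beta


alpha, beta = gamma_params_from_mode_std(3.0, 2.0)
print(alpha, beta)
```

Substituting the result back into the two relations recovers the requested mode and standard deviation, which is a quick sanity check for any choice of inputs.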

pwkit.msmt.fmtinfo(value)

Returns (typetag, text, is_imprecise). Unlike other functions that operate on measurements, this also operates on bools, ints, and strings.

pwkit.msmt.liminfo(msmt)

Return (limtype, repval). limtype is -1 for upper limits, 1 for lower limits, and 0 otherwise; repval is a best-effort representative scalar value for this measurement.

pwkit.msmt.limtype(msmt)

Return -1 if this value is some kind of upper limit, 1 if this value is some kind of lower limit, 0 otherwise.

pwkit.msmt.repval(msmt, limitsok=False)

Get a best-effort representative value as a float. This is DANGEROUS because it discards limit information, which is rarely wise. liminfo() or unwrap() are recommended instead.

pwkit.msmt.sample_double_norm(mean, std_upper, std_lower, size)

Note that this function requires Scipy.

pwkit.msmt.sample_gamma(alpha, beta, size)

This is mostly about recording the conversion between Numpy/Scipy conventions and Wikipedia conventions. Some equations:

mean = alpha / beta
variance = alpha / beta**2
mode = (alpha - 1) / beta [if alpha > 1; otherwise undefined]
skewness = 2 / sqrt(alpha)
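Since the mean is alpha / beta, beta acts as a rate here, whereas numpy.random.gamma and the standard library's random.gammavariate both take a scale. A minimal sketch of the conversion, using only the standard library, might look like the following; the function name is hypothetical and this is not the pwkit implementation.

```python
import random
import statistics


def sample_gamma_sketch(alpha, beta, size, rng=random):
    """Draw `size` samples with shape alpha and rate beta."""
    scale = 1.0 / beta  # random.gammavariate takes a scale, not a rate
    return [rng.gammavariate(alpha, scale) for _ in range(size)]


rng = random.Random(42)
draws = sample_gamma_sketch(4.0, 2.0, 50000, rng)
# Compare against mean = alpha / beta and variance = alpha / beta**2.
print(statistics.fmean(draws), statistics.pvariance(draws))
```

With alpha = 4 and beta = 2 the sample mean and variance should come out close to 2 and 1, in line with the equations above.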

pwkit.msmt.unwrap(msmt)

Convert the value into the most basic representation that we can do math on: float if possible, then Uval, then Lval.

pwkit.msmt.uval_dtype

alias of float64