histogrammar.primitives.sparselybin.SparselyBin

class histogrammar.primitives.sparselybin.SparselyBin(binWidth, quantity, value=<Count 0.0>, nanflow=<Count 0.0>, origin=0.0)[source]

Bases: histogrammar.defs.Factory, histogrammar.defs.Container

Split a quantity into equally spaced bins, creating them whenever their entries would be non-zero. Exactly one sub-aggregator is filled per datum.

Use this when you have a distribution of known scale (bin width) but unknown domain (lowest and highest bin index).

Unlike fixed-domain binning, this aggregator has the potential to use unlimited memory. A large number of distinct outliers can generate many unwanted bins.

Like fixed-domain binning, the bins are indexed by integers, though they are 64-bit and may be negative. Bin indexes below -(2**63 - 1) are put in the -(2**63 - 1) bin, and indexes above (2**63 - 1) are put in the (2**63 - 1) bin.
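
As a rough illustration of bins being created only when something lands in them, the following is a minimal pure-Python sketch, not the library's implementation. The helper name fill_sparse and the use of a plain dict of counts are assumptions made for this example; NaN handling and index saturation are left out.

```python
import math
from collections import defaultdict

def fill_sparse(values, bin_width, origin=0.0):
    """Return {bin index: count}, creating a bin only when a value lands in it."""
    bins = defaultdict(float)
    for x in values:
        # Bin 0 starts at `origin`; indexes may be negative.
        index = int(math.floor((x - origin) / bin_width))
        bins[index] += 1.0
    return dict(bins)

data = [0.2, 0.7, 1.1, 1.3, -0.4, 1000.5]
bins = fill_sparse(data, bin_width=0.5)
print(sorted(bins.items()))
```

The single outlier at 1000.5 adds exactly one entry to the map rather than roughly two thousand empty bins, but each further distinct outlier adds another entry, which is the memory risk described above.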

__add__(other)

Add two containers of the same type. The originals are unaffected.
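
A minimal sketch of what adding two sparse containers amounts to, assuming both share the same bin width and origin and that each bin holds a plain count. The function add_sparse is hypothetical and only mirrors the idea of merging by bin index.

```python
def add_sparse(left, right):
    """Combine two {bin index: count} maps into a new map; inputs are not modified."""
    merged = dict(left)  # shallow copy so `left` stays untouched
    for index, count in right.items():
        merged[index] = merged.get(index, 0.0) + count
    return merged

first = {0: 2.0, 3: 1.0}
second = {3: 4.0, -2: 1.0}
total = add_sparse(first, second)
print(total)
```

Bins present in only one input are carried over unchanged, and bins present in both are summed.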

__init__(binWidth, quantity, value=<Count 0.0>, nanflow=<Count 0.0>, origin=0.0)

Create a SparselyBin that is capable of being filled and added.

Parameters:
  • binWidth (float) – the width of a bin; must be strictly greater than zero.
  • quantity (function returning float) – computes the quantity of interest from the data.
  • value (Container) – generates sub-aggregators to put in each bin.
  • nanflow (Container) – a sub-aggregator to use for data whose quantity is NaN.
  • origin (float) – the left edge of the bin whose index is 0.

Other Parameters:
  • entries (float) – the number of entries, initially 0.0.
  • bins (dict from int to Container) – the map, probably a hashmap, to fill with values when their entries become non-zero.

at(index)

Extract the container at a given index, if it exists.

bin(x)

Find the bin index associated with numerical value x.

Returns -2**63 if x is NaN and the bin index if it lies between -(2**63 - 1) and (2**63 - 1); otherwise the result saturates at the nearest endpoint.
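
The rule above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the library's code: the function name bin_index, the constant names, and the use of floor((x - origin) / binWidth) as the index formula are all choices made for this example.

```python
import math

MIN_INDEX = -(2**63 - 1)   # lowest regular bin index
MAX_INDEX = 2**63 - 1      # highest regular bin index
NAN_INDEX = -2**63         # reserved value for NaN input

def bin_index(x, bin_width, origin=0.0):
    """Map x to an integer bin index, saturating at the 64-bit limits."""
    if math.isnan(x):
        return NAN_INDEX
    scaled = (x - origin) / bin_width
    if scaled <= MIN_INDEX:      # also catches -inf
        return MIN_INDEX
    if scaled >= MAX_INDEX:      # also catches +inf
        return MAX_INDEX
    return int(math.floor(scaled))

print(bin_index(2.5, 1.0), bin_index(-0.1, 1.0), bin_index(1e300, 1.0))
```

Checking the scaled value against the limits before converting to an integer keeps infinities from reaching math.floor, which would otherwise raise an error.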

children

List of sub-aggregators, to make it possible to walk the tree.

copy()

Copy this container, making a clone with no reference to the original.

static ed(binWidth, entries, contentType, bins, nanflow, origin)

Create a SparselyBin that is only capable of being added.

Parameters:
  • binWidth (float) – the width of a bin.
  • entries (float) – the number of entries.
  • contentType (str) – the value’s sub-aggregator type (must be provided to determine type for the case when bins is empty).
  • bins (dict from int to Container) – the non-empty bin indexes and their values.
  • nanflow (Container) – the filled nanflow bin.
  • origin (float) – the left edge of the bin whose index is zero.

factory

Reference to the container’s factory for runtime reflection.

fill(datum, weight=1.0)

Increment the aggregator by providing one datum to the fill rule with a given weight.

Usually all containers in a collection of histograms take the same input data by passing it recursively through the tree. Quantities to plot are specified by the individual container’s lambda functions.

The container is changed in-place.
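
To make the fill rule concrete, here is a small stand-in class, assuming plain float counts in place of sub-aggregators. The class name SparseCounts and its attributes are hypothetical, and the choice to add NaN entries to the total is an assumption of this sketch.

```python
import math

class SparseCounts:
    """Toy stand-in for a sparsely binned counter (not the library class)."""

    def __init__(self, bin_width, quantity, origin=0.0):
        self.bin_width = bin_width
        self.quantity = quantity   # function extracting a float from a datum
        self.origin = origin
        self.entries = 0.0
        self.bins = {}             # bin index -> summed weight
        self.nanflow = 0.0         # summed weight of NaN quantities

    def fill(self, datum, weight=1.0):
        value = self.quantity(datum)
        if math.isnan(value):
            self.nanflow += weight
        else:
            index = int(math.floor((value - self.origin) / self.bin_width))
            self.bins[index] = self.bins.get(index, 0.0) + weight
        self.entries += weight     # the object is modified in place

hist = SparseCounts(10.0, lambda row: row["pt"])
hist.fill({"pt": 12.0})
hist.fill({"pt": 17.5}, weight=2.0)
hist.fill({"pt": float("nan")})
print(hist.bins, hist.nanflow, hist.entries)
```

The datum here is a whole record, and the lambda picks out the quantity to bin, which is what lets several aggregators share the same input data.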

fromJson(json)

User’s entry point for reconstructing a container from JSON text.

static fromJsonFragment(json, nameFromParent)

high

The high edge of the last non-empty bin or None if no values have been accumulated.

histogram()

Return a plain histogram by converting all sub-aggregator values into Counts.

indexes

Get a sequence of filled indexes.

static ing(binWidth, quantity, value=<Count 0.0>, nanflow=<Count 0.0>, origin=0.0)

Synonym for __init__.

low

The low edge of the first non-empty bin or None if no values have been accumulated.

maxBin

The last non-empty bin or None if no values have been accumulated.

minBin

The first non-empty bin or None if no values have been accumulated.
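
The four properties high, low, maxBin and minBin can be derived from the set of filled indexes. The sketch below assumes a plain dict of bins and takes the origin to be the left edge of bin 0, as stated in the constructor parameters; the function name summarize is hypothetical.

```python
def summarize(bins, bin_width, origin=0.0):
    """Compute minBin, maxBin, low and high from a {bin index: value} map."""
    if not bins:
        # Nothing accumulated yet: every property is None.
        return {"minBin": None, "maxBin": None, "low": None, "high": None}
    min_bin = min(bins)
    max_bin = max(bins)
    return {
        "minBin": min_bin,
        "maxBin": max_bin,
        "low": origin + min_bin * bin_width,          # left edge of first bin
        "high": origin + (max_bin + 1) * bin_width,   # right edge of last bin
    }

summary = summarize({-2: 1.0, 0: 3.0, 4: 2.0}, bin_width=0.5, origin=1.0)
print(summary)
```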

name

Name of the concrete Factory as a string; used to label the container type in JSON.

nan(x)

Return true iff x is in the nanflow region (equal to NaN).

num

The number of bins between the first non-empty one (inclusive) and the last non-empty one (exclusive).

numFilled

The number of non-empty bins.

plot(httpServer=None, **parameters)

Generate a VEGA visualization and serve it via HTTP.

range(index)

Get the low and high edge of a bin (given by index number).
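
Since origin is the left edge of bin 0 and every bin has width binWidth, the edges of any bin follow by simple arithmetic. A minimal sketch, with bin_range as a hypothetical free function standing in for the method:

```python
def bin_range(index, bin_width, origin=0.0):
    """Return (low edge, high edge) of the bin with the given index."""
    low = origin + index * bin_width
    high = origin + (index + 1) * bin_width
    return (low, high)

print(bin_range(3, 0.5))               # bin 3 with width 0.5
print(bin_range(-1, 2.0, origin=1.0))  # negative index, shifted origin
```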

register(factory)

Add a new Factory to the registry, introducing a new container type on the fly. General users usually wouldn’t do this, but they could. This method is used internally to define the standard container types.

specialize()

Explicitly invoke histogrammar.specialized.addImplicitMethods on this object, usually right after construction (in each of the methods that construct: __init__, ed, ing, fromJsonFragment, etc).

Objects used as default parameter arguments are created too early for this to be possible, since they are created before the histogrammar.specialized module can be defined. These objects wouldn’t satisfy any of the checks in addImplicitMethods anyway.

toImmutable()

Return a copy of this container as though it were created by the ed function or from JSON (the “immutable form” in languages that support it, not Python).

toJson()

Convert this container to dicts and lists representing JSON (dropping its fill method).

Note that the dicts and lists can be turned into a string with json.dumps.
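
A short sketch of the json.dumps step, using a made-up dict whose layout is purely illustrative and is not claimed to match the structure toJson actually produces. It shows one thing to watch for when a map is keyed by integer bin indexes: JSON object keys are always strings.

```python
import json

# Hypothetical layout, for illustration only.
document = {"binWidth": 0.5, "origin": 0.0, "bins": {-1: 1.0, 2: 2.0}}

text = json.dumps(document, sort_keys=True)
restored = json.loads(text)

# Integer keys come back as strings, so convert them when needed.
indexes_back = {int(key): value for key, value in restored["bins"].items()}
print(text)
print(indexes_back)
```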

toJsonFragment(suppressName)

Used internally to convert the container to JSON without its "type" header.

zero()

Create an empty container with the same parameters as this one. The original is unaffected.