This page explains how to install Histogrammar in different ways. Use only the instructions relevant to your situation.

Get a specific release or the latest from GitHub

Starting in version 0.8, each language implementation of Histogrammar has a separate GitHub repository. Installation instructions (and project status) are given in each repository’s README. Reference documentation is generated from the source code of each release, so it is not available for the unreleased latest version.

Read this for an explanation of version numbers and cross-compatibility.

| Specification  | Scala                     | Python                    | C++                       | Julia                     | R | Javascript |
|----------------|---------------------------|---------------------------|---------------------------|---------------------------|---|------------|
| 1.1-prerelease | repo, clone, zip (README) | repo, clone, zip (README) | repo, clone, zip (README) | repo, clone, zip (README) |   |            |
| 1.0            | 1.0.4 (README, reference) | 1.0.9 (README, reference) | 1.0.0 (README)            | 1.0.0 (README)            |   |            |
| 0.8            | 0.8.0 (README, reference) | 0.8.0 (README, reference) | 0.8.0 (README)            | 0.8.0 (README)            |   |            |
| 0.7            | 0.7.1 (reference)         | 0.7.1 (reference)         | 0.7.1                     |                           |   |            |
| 0.6            | 0.6 (reference)           | 0.6                       | 0.6                       |                           |   |            |

Install from a public repository

Java/Scala or Apache Spark

Histogrammar is available on Maven Central, a publicly accessible Java/Scala repository with dependency management.

Apache Spark

To use Histogrammar in the Spark shell, you don’t have to download anything. Just start Spark with

spark-shell --packages "org.diana-hep:histogrammar_2.11:1.0.4"

and call

import org.dianahep.histogrammar._

at the Spark prompt. For plotting with Bokeh, include org.diana-hep:histogrammar-bokeh_2.11:1.0.4, and for interaction with Spark SQL, include org.diana-hep:histogrammar-sparksql_2.11:1.0.4.

Use _2.11 for compatibility with Spark 2.x (Scala 2.11) and _2.10 for compatibility with Spark 1.x (Scala 2.10).

Note: due to a dependency bug, Bokeh is incompatible with Spark 2.x (Scala 2.11).
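
Once the import succeeds, a quick sanity check is to build and fill a small histogram directly at the prompt. The following is only a sketch: the values, and the RDD name rdd in the commented lines, are hypothetical and not part of the installation itself.

import org.dianahep.histogrammar._

// build a one-dimensional histogram: 100 bins from -5.0 to 5.0 (hypothetical numbers)
val h = Bin(100, -5.0, 5.0, {x: Double => x})

// fill with a few hypothetical values and check the entry count
List(-1.2, 0.5, 3.7).foreach(x => h.fill(x))
println(h.entries)   // 3.0 if the fills succeeded

// for a hypothetical RDD[Double] named rdd, the same aggregator could be filled
// distributively with Spark's aggregate, combining partial results with "+":
// val filled = rdd.aggregate(Bin(100, -5.0, 5.0, {x: Double => x}))(
//   { (hist, x) => hist.fill(x); hist },
//   { (h1, h2) => h1 + h2 })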

Java/Scala with Maven

To compile Histogrammar into a project with the Maven build tool, add

<dependency>
  <groupId>org.diana-hep</groupId>
  <artifactId>histogrammar_2.11</artifactId>
  <version>1.0.4</version>
</dependency>

to your <dependencies> section. Use _2.11 for compatibility with Scala 2.11 and _2.10 for compatibility with Scala 2.10.

Scala with sbt

To use Histogrammar in sbt console or to compile it into a project with the sbt build tool, add

libraryDependencies += "org.diana-hep" %% "histogrammar" % "1.0.4"

to your build.sbt file. The double-percent gets the appropriate version of Histogrammar for your version of Scala.

Quick start

In fact, the easiest way to start an interactive Scala session with Histogrammar is simply to create the following build.sbt:

scalaVersion := "2.11.8"
libraryDependencies += "org.diana-hep" %% "histogrammar" % "1.0.4"

and run sbt console. You don’t need to install Scala or anything other than sbt.

Python

Histogrammar is available on PyPI, a publicly accessible Python repository with dependency management.

If you have superuser (root) access

To install the latest version of Histogrammar, use

sudo easy_install histogrammar

or

sudo pip install histogrammar

depending on whether you have pip installed (pip is recommended). Some systems with both Python 2 and 3 use easy_install3 and pip3 to distinguish the Python 3 version.

On freshly minted Ubuntu machines, you can install pip with

sudo apt-get install python-setuptools
sudo easy_install pip

If you do not have superuser access

To install Histogrammar for your user account only, use

pip install --user histogrammar

which installs it in ~/.local (Python knows where to find it).
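
Either way, a quick way to confirm that the installation worked is to fill a small histogram in an ordinary Python session. This is only a sketch; the values are hypothetical.

# confirm the installation by building and filling a small histogram
import histogrammar as hg

h = hg.Bin(100, -5.0, 5.0, lambda x: x)   # 100 bins from -5.0 to 5.0
for value in (-1.2, 0.5, 3.7):            # hypothetical values
    h.fill(value)
print(h.entries)                          # 3.0 if the fills succeeded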

For use in PySpark

PySpark uses both Histogrammar-Python (as an interface) and Histogrammar-Scala (for faster calculations). To use it, install Histogrammar-Python as described above and launch PySpark with a request for Histogrammar-Scala:

pyspark --packages "org.diana-hep:histogrammar_2.11:1.0.4"

Use _2.11 for compatibility with Spark 2.x (Scala 2.11) and _2.10 for compatibility with Spark 1.x (Scala 2.10).

In PySpark, you should be able to call

from histogrammar import *
import histogrammar.sparksql
histogrammar.sparksql.addMethods(df)

where df is a DataFrame that you would like to enable with Histogrammar. You can now call

h = df.Bin(100, -5.0, 5.0, df["plotme"] + df["andme"])

to get a histogram h of the Column expression df["plotme"] + df["andme"]. All of the processing is performed on the JVM with Spark’s DataFrame optimizations.
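
The filled result behaves like any other Histogrammar container on the Python side. As a sketch (assuming the call above returned a filled container), it can be inspected or serialized to JSON for storage and later plotting:

import json

print(h.entries)               # total number of rows aggregated
print(json.dumps(h.toJson()))  # JSON form of the aggregated histogram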

C++

When available, Spack instructions will be found here.

Julia

When available, Julia package instructions will be found here.

R

When available, CRAN instructions will be found here.

Javascript

When available, npm instructions will be found here.