Basic Usage: datastore application

The datastore is a tool that automates and helps to scale some of the most frequent tasks accomplished with the eemeter, including data loading and storage, meter running, and result storage and warehousing. It puts a REST API in front of the eemeter and uses a postgres backend.

This tutorial is also available as a jupyter notebook.

Note:

This tutorial assumes you have a working datastore instance. If you do not, please follow the datastore development setup instructions or contact Open EE about setting up a dedicated production deployment.

Note:

For both small and large datasets, the ETL toolkit exists to ease and speed up the process of loading your data.

This tutorial does not cover ETL toolkit usage. For more information on the ETL toolkit, see its API documentation.

Setup

In [1]:
# library imports
import pandas as pd
import requests
import pytz

If you followed the datastore development setup instructions, you will already have run the command to create a superuser and access credentials.

python manage.py dev_seed

If you haven’t already done so, do so now. The dev_seed command creates a demo admin user and a sample project.

  • username: demo,
  • password: demo-password,
  • API access token: tokstr.

Ensure that your development server is running locally on port 8000 before continuing.

python manage.py runserver

Each request will include an Authorization header:

Authorization: Bearer tokstr
In [2]:
base_url = "http://0.0.0.0:8000"
token = "tokstr"
headers = {"Authorization": "Bearer {}".format(token)}

Using the API to get loaded data

We can use the API to inspect the data that is loaded into the datastore. (The API can also be used for loading data, but that is not covered here. See the ETL tutorial for more information on loading data.)

Note:

We will use the requests python package for making requests, but you could just as easily use a tool like cURL or Postman.

If you have the eemeter package installed, you will also have the requests package installed, but if not, you can install it with:

$ pip install requests

A request using the requests library looks like this:

import requests
url = "https://example.com"
data = {
    "first_name": "John",
    "last_name": "Doe"
}
requests.post(url + "/api/users/", json=data)

which is equivalent to:

POST /api/users/ HTTP/1.1
Host: example.com
{
    "first_name": "John",
    "last_name": "Doe"
}

Since the dev_seed command creates a sample project, requesting the projects endpoint will return a response showing that project. Every project has a unique "project_id", which can be set to whatever identifier is most appropriate (note: it is not used as the primary key; that is the 'id' field).

In [3]:
url = base_url + "/api/v1/projects/"
projects = requests.get(url, headers=headers).json()
In [4]:
projects
Out[4]:
[{'baseline_period_end': '2012-01-01T00:00:00Z',
  'baseline_period_start': None,
  'id': 1,
  'project_id': 'DEV_SEED_PROJECT',
  'project_owner_id': 1,
  'reporting_period_end': None,
  'reporting_period_start': '2012-02-01T00:00:00Z',
  'zipcode': '91104'}]

Energy trace data is associated with this project by foreign keys through a many-to-many table. This means that projects can have 0 to n associated traces, and that traces can have 0 to n associated projects.

Like projects and the project_id field, traces are identified by a unique ‘trace_id’ field, which can also be set to whatever is most appropriate.

There are two API endpoints used to fetch trace data:

  1. /api/v1/traces/: This stores trace ids, units, and interpretations.
  2. /api/v1/trace_records/: This stores the time-series records associated with each trace.

These records are stored by start timestamp, with the implicit assumption that the start timestamp of the next record marks the end of the current record. The value of the last record is ignored; its start timestamp serves as the final end timestamp (and its value is usually set to null).

In [5]:
url = base_url + "/api/v1/traces/?projects={}".format(projects[0]['id'])
traces = requests.get(url, headers=headers).json()
In [6]:
traces
Out[6]:
[{'id': 1,
  'interpretation': 'NATURAL_GAS_CONSUMPTION_SUPPLIED',
  'trace_id': 'DEV_SEED_TRACE_NATURAL_GAS_MONTHLY',
  'unit': 'THERM'},
 {'id': 2,
  'interpretation': 'NATURAL_GAS_CONSUMPTION_SUPPLIED',
  'trace_id': 'DEV_SEED_TRACE_NATURAL_GAS_DAILY',
  'unit': 'THERM'},
 {'id': 3,
  'interpretation': 'ELECTRICITY_CONSUMPTION_SUPPLIED',
  'trace_id': 'DEV_SEED_TRACE_ELECTRICITY_15MIN',
  'unit': 'KWH'},
 {'id': 4,
  'interpretation': 'ELECTRICITY_CONSUMPTION_SUPPLIED',
  'trace_id': 'DEV_SEED_TRACE_ELECTRICITY_HOURLY',
  'unit': 'KWH'},
 {'id': 5,
  'interpretation': 'ELECTRICITY_ON_SITE_GENERATION_UNCONSUMED',
  'trace_id': 'DEV_SEED_TRACE_SOLAR_HOURLY',
  'unit': 'KWH'},
 {'id': 6,
  'interpretation': 'ELECTRICITY_ON_SITE_GENERATION_UNCONSUMED',
  'trace_id': 'DEV_SEED_TRACE_SOLAR_30MIN',
  'unit': 'KWH'}]

We can also query for trace records by trace primary key.

In [7]:
url = base_url + "/api/v1/trace_records/?trace={}".format(traces[0]['id'])
trace_records = requests.get(url, headers=headers).json()
In [8]:
trace_records[:3]  # first 3 records
Out[8]:
[{'estimated': False,
  'id': 1,
  'start': '2010-01-01T00:00:00Z',
  'trace_id': 1,
  'value': None},
 {'estimated': False,
  'id': 2,
  'start': '2010-02-01T00:00:00Z',
  'trace_id': 1,
  'value': 1.0},
 {'estimated': False,
  'id': 3,
  'start': '2010-03-01T00:00:00Z',
  'trace_id': 1,
  'value': 1.0}]
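
Because trace records follow the start-timestamp convention described above, they are easy to load into a pandas DataFrame for inspection (pandas was imported at the top of this notebook). A minimal sketch using the record fields shown in the response:

# Build a DataFrame indexed by the record start timestamps.
records = pd.DataFrame(trace_records)
records['start'] = pd.to_datetime(records['start'])
records = records.set_index('start').sort_index()

# Per the convention above, each record ends where the next one starts,
# so the final record only supplies the last end timestamp.
print(records[['value', 'estimated']].head())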

Running meters

Running a meter means pulling trace data, matching it with relevant project data, and evaluating its energy efficiency performance. This is the central task performed by the datastore; if the specifics are unfamiliar, there is background information worth reviewing in the Methods Overview section of the guides.

To run a meter, make a request to create a “meter run”. This request will start a job that runs a meter and saves its results. The result of a meter run is called a “meter result”.

In [9]:
from collections import OrderedDict
import json

Scheduling a single meter run

The primary component of this request is a trace primary key.

The project data associated with the trace will be automatically pulled in to be associated with the trace.

In [10]:
created_meter_run = requests.post(
    base_url + "/api/v1/meter_runs/",
    json={
        "trace": traces[0]['id']  # single trace primary key
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)  # retains order of keys
In [11]:
print(json.dumps(created_meter_run, indent=2))
{
  "id": 1,
  "trace": 1,
  "project": 1,
  "meter_result": 1,
  "meter_input": null,
  "status": "PENDING",
  "failure_message": null,
  "traceback": null,
  "model_class": null,
  "model_kwargs": null,
  "formatter_class": null,
  "formatter_kwargs": null,
  "added": "2016-11-18T02:16:36.078334Z",
  "updated": "2016-11-18T02:16:36.078375Z"
}

This is a summary of the task to run the meter on the indicated project.

The response shows us the complete specification of the meter run behavior, which is as follows:

  1. project: the project primary key (determined implicitly from the trace).
  2. trace: the trace primary key (given in the API request).
  3. status: the task status code (in this case "PENDING"); the possible values are:
  • "PENDING": the task is scheduled but not yet running or completed.
  • "RUNNING": the task is currently running.
  • "SUCCESS": successful completion.
  • "FAILED": failed due to some sort of error.
  4. meter_result: the primary key of the meter result.
  5. meter_input: has not yet been created (this is the complete serialized input to the meter, as required by the eemeter).
  6. model_class and model_kwargs: the model class and arguments used in meter fitting.
  • If these are left blank, default values will be used.
  7. formatter_class and formatter_kwargs: the formatter class and arguments used to format data for the meter.
  • If these are left blank, default values will be used.

If you wish, you can also specify many of these properties explicitly; we will do so in a following section.

Let’s make another call to inspect the state of this meter run.

In [12]:
meter_run = requests.get(
    base_url + "/api/v1/meter_runs/{}/".format(created_meter_run['id']),
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [13]:
print(json.dumps(meter_run, indent=2))
{
  "id": 1,
  "trace": 1,
  "project": 1,
  "meter_result": 1,
  "meter_input": "https://storage.googleapis.com/my-storage-bucket/datastore/meter_run_inputs/5fa24b58-444b-4c72-a8c9-bb0327b23118.json",
  "status": "SUCCESS",
  "failure_message": null,
  "traceback": null,
  "model_class": null,
  "model_kwargs": null,
  "formatter_class": null,
  "formatter_kwargs": null,
  "added": "2016-11-18T02:16:36.078334Z",
  "updated": "2016-11-18T02:17:44.356211Z"
}
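
Because meter runs execute asynchronously, the status may still be "PENDING" or "RUNNING" the first time you check. A small polling sketch, reusing the meter run endpoint and the status codes listed above (the sleep interval is arbitrary):

import time

def wait_for_meter_run(meter_run_id, interval=5):
    # Poll the meter run until it reaches a terminal status.
    url = base_url + "/api/v1/meter_runs/{}/".format(meter_run_id)
    while True:
        meter_run = requests.get(url, headers=headers).json()
        if meter_run['status'] in ('SUCCESS', 'FAILED'):
            return meter_run
        time.sleep(interval)

meter_run = wait_for_meter_run(created_meter_run['id'])
print(meter_run['status'])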

The associated meter result is now also available. In addition to the fields shown on the meter run, it carries:

  1. meter_output: the serialized output of the meter run.
  2. eemeter_version and datastore_version: the software versions of the eemeter library and the datastore application.
In [14]:
meter_result = requests.get(
    base_url + "/api/v1/meter_results/{}/".format(created_meter_run['meter_result']),
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [15]:
print(json.dumps(meter_result, indent=2))
{
  "id": 1,
  "trace": 1,
  "project": 1,
  "meter_run": 1,
  "meter_output": "https://storage.googleapis.com/my-storage-bucket/datastore/meter_run_outputs/e1896b44-0b89-49ac-93e1-8eb6e44987bd.json",
  "status": "SUCCESS",
  "eemeter_version": "0.4.12",
  "datastore_version": "0.2.3",
  "model_class": "BillingElasticNetCVModel",
  "model_kwargs": {
    "heating_base_temp": 65,
    "cooling_base_temp": 65
  },
  "formatter_class": "ModelDataBillingFormatter",
  "formatter_kwargs": {},
  "added": "2016-11-18T02:17:44.203325Z",
  "updated": "2016-11-18T02:17:44.223200Z"
}
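
The meter_input and meter_output fields are URLs pointing to serialized JSON documents. Whether they are directly readable depends on how storage is configured in your deployment; if they are, a quick way to peek at the output is the following sketch (the structure of the output document itself is not covered in this tutorial):

output_url = meter_result['meter_output']
response = requests.get(output_url)
if response.ok:
    meter_output = response.json()
    print(type(meter_output))
else:
    print("Could not fetch meter output ({})".format(response.status_code))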

Customizing meter runs

Meter runs can also be customized by specifying various attributes explicitly, such as custom arguments for the model class.

In [16]:
custom_meter_run = requests.post(
    base_url + "/api/v1/meter_runs/",
    json={
        "trace": 2,
        "project": 1,
        "model_kwargs": {
            "heating_base_temp": 64,  # different temperature
            "cooling_base_temp": 64,
        },
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [17]:
print(json.dumps(custom_meter_run, indent=2))
{
  "id": 2,
  "trace": 2,
  "project": 1,
  "meter_result": 2,
  "meter_input": null,
  "status": "PENDING",
  "failure_message": null,
  "traceback": null,
  "model_class": null,
  "model_kwargs": {
    "heating_base_temp": 64,
    "cooling_base_temp": 64
  },
  "formatter_class": null,
  "formatter_kwargs": null,
  "added": "2016-11-18T02:17:44.681341Z",
  "updated": "2016-11-18T02:17:44.681374Z"
}

Or, if you leave out the project and trace attributes, you can specify the exact serialized input. This means that if serialized meter inputs are available, you need not explicitly load traces and projects through ETL.

Please download a preformatted input file for this step.

In [18]:
with open('meter_input_example.json', 'r') as f:
    meter_input = f.read()  # loaded as a serialized string
    meter_input_meter_run = requests.post(
        base_url + "/api/v1/meter_runs/",
        json={
            "meter_input": meter_input,
        },
        headers=headers
    ).json(object_pairs_hook=OrderedDict)
In [19]:
print(json.dumps(meter_input_meter_run, indent=2))
{
  "id": 3,
  "trace": null,
  "project": null,
  "meter_result": 3,
  "meter_input": "https://storage.googleapis.com/my-storage-bucket/datastore/meter_run_inputs/bf0629db-0c81-4ded-8dcc-adbd0ddbf3f3.json",
  "status": "PENDING",
  "failure_message": null,
  "traceback": null,
  "model_class": null,
  "model_kwargs": null,
  "formatter_class": null,
  "formatter_kwargs": null,
  "added": "2016-11-18T02:19:23.155268Z",
  "updated": "2016-11-18T02:19:23.155857Z"
}
In [20]:
meter_run = requests.get(
    base_url + "/api/v1/meter_runs/{}/".format(meter_input_meter_run['id']),
    headers=headers
).json(object_pairs_hook=OrderedDict)
print(json.dumps(meter_run, indent=2))
{
  "id": 3,
  "trace": null,
  "project": null,
  "meter_result": 3,
  "meter_input": "https://storage.googleapis.com/my-storage-bucket/datastore/meter_run_inputs/bf0629db-0c81-4ded-8dcc-adbd0ddbf3f3.json",
  "status": "SUCCESS",
  "failure_message": null,
  "traceback": null,
  "model_class": null,
  "model_kwargs": null,
  "formatter_class": null,
  "formatter_kwargs": null,
  "added": "2016-11-18T02:19:23.155268Z",
  "updated": "2016-11-18T02:22:13.679220Z"
}
In [21]:
meter_result = requests.get(
    base_url + "/api/v1/meter_results/{}/".format(meter_input_meter_run['meter_result']),
    headers=headers
).json(object_pairs_hook=OrderedDict)
print(json.dumps(meter_result, indent=2))
{
  "id": 3,
  "trace": null,
  "project": null,
  "meter_run": 3,
  "meter_output": "https://storage.googleapis.com/my-storage-bucket/datastore/meter_run_outputs/7ab2bd31-a723-4c75-afa6-424d560ab284.json",
  "status": "SUCCESS",
  "eemeter_version": "0.4.12",
  "datastore_version": "0.2.3",
  "model_class": "SeasonalElasticNetCVModel",
  "model_kwargs": {
    "heating_base_temp": 65,
    "cooling_base_temp": 65
  },
  "formatter_class": "ModelDataFormatter",
  "formatter_kwargs": {
    "freq_str": "D"
  },
  "added": "2016-11-18T02:22:13.566866Z",
  "updated": "2016-11-18T02:22:13.583138Z"
}

Meters can also be triggered in bulk; the next section covers this.

Bulk-triggering meter runs

Often it is more convenient to trigger many meter runs at once than to do it trace-by-trace. This can be done either through the API or through a datastore management command.

Through the API

The following sends a list of “targets” to the datastore for triggering. Here, we’re triggering a set of meter runs for one project, which will trigger meter runs for all associated traces.

Warning:

The following may take a few minutes to complete. If you have enabled celery workers, the request will return more quickly and the computation will continue in the background; in that case, wait until that computation has completed before continuing.

For more information on background worker setup, see datastore setup instructions.

To follow progress, watch the datastore logs or use the meter_progress command. In a development environment, these are printed in the python manage.py runserver output.

In [22]:
bulk_created_meter_runs = requests.post(
    base_url + "/api/v1/meter_runs/bulk/",  # note: different url!
    json={
        "targets": [  # a list of targets can be provided
            {
                "project": projects[0]['id']
            },
        ]
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [23]:
print(json.dumps(bulk_created_meter_runs, indent=2))
[
  [
    {
      "id": 4,
      "trace": 3,
      "project": 1,
      "meter_result": 4,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:22:14.088620Z",
      "updated": "2016-11-18T02:22:14.088658Z"
    },
    {
      "id": 5,
      "trace": 4,
      "project": 1,
      "meter_result": 5,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:26:15.683349Z",
      "updated": "2016-11-18T02:26:15.683442Z"
    },
    {
      "id": 6,
      "trace": 5,
      "project": 1,
      "meter_result": 6,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:28:00.757629Z",
      "updated": "2016-11-18T02:28:00.757666Z"
    },
    {
      "id": 7,
      "trace": 6,
      "project": 1,
      "meter_result": 7,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:29:09.066736Z",
      "updated": "2016-11-18T02:29:09.066777Z"
    },
    {
      "id": 8,
      "trace": 1,
      "project": 1,
      "meter_result": 8,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:32:05.062196Z",
      "updated": "2016-11-18T02:32:05.062238Z"
    },
    {
      "id": 9,
      "trace": 2,
      "project": 1,
      "meter_result": 9,
      "meter_input": null,
      "status": "PENDING",
      "failure_message": null,
      "traceback": null,
      "model_class": null,
      "model_kwargs": null,
      "formatter_class": null,
      "formatter_kwargs": null,
      "added": "2016-11-18T02:32:30.471808Z",
      "updated": "2016-11-18T02:32:30.471864Z"
    }
  ]
]

Note that results are returned grouped by target (as a list).
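
To work with the created meter runs regardless of which target produced them, the grouped response can be flattened:

# Flatten the per-target lists into a single list of meter runs.
all_meter_runs = [
    meter_run
    for target_results in bulk_created_meter_runs
    for meter_run in target_results
]
print([m['id'] for m in all_meter_runs])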

If a model class, formatter class, or kwargs are supplied in the bulk request, they will be applied to all of the meter runs.
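
For example, to run every trace in the project against 64-degree balance points, the bulk request might look like the sketch below. Note that the placement of these fields in the bulk payload is an assumption here (they are placed at the top level so that they apply to every generated meter run); consult the API documentation for your datastore version.

bulk_with_kwargs = requests.post(
    base_url + "/api/v1/meter_runs/bulk/",
    json={
        "targets": [{"project": projects[0]['id']}],
        "model_kwargs": {  # assumed top-level placement, applied to all runs
            "heating_base_temp": 64,
            "cooling_base_temp": 64,
        },
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)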

Through a management command

The other way to bulk-trigger meter runs is through a management command.

python manage.py run_meters --all-traces

You can monitor the progress of these commands with:

python manage.py meter_progress --all-meters --poll-until-complete

Meter result warehouse tables

For easy access to summarized meter result data, it may be helpful to use the meter result “mart”, which is part of the data warehouse that can be created in the postgres database.

Data warehouse tables make it easier to query into results by summarizing the most relevant information.

To create warehouse tables, use the following management command:

$ python manage.py meterresultmart recreate

This is equivalent to running

$ python manage.py meterresultmart destroy
$ python manage.py meterresultmart create

Running the create command without first running destroy will result in duplicate rows.

Using the warehouse_meterresultmart table

The easiest way to access the results of the warehouse is to connect an analytics service which can read from the database directly.

If that is not available to you, you can also query directly with postgres. Assuming you have a database set up called “datastore” (yours may be named differently, depending on how you set it up), you can connect as follows:

$ psql datastore
psql (9.4.1)
Type "help" for help.

datastore=# SELECT
  trace_id
  , differential_lower_bound as savings_lower_bound
  , differential_value as savings
  , differential_upper_bound as savings_upper_bound
FROM
  warehouse_meterresultmart
WHERE
  project_id='DEV_SEED_PROJECT'
AND
  derivative_interpretation='gross_predicted'
ORDER BY
  project_id
  , trace_id
  , derivative_interpretation;

              trace_id              | savings_lower_bound |      savings      | savings_upper_bound
------------------------------------+---------------------+-------------------+---------------------
 DEV_SEED_TRACE_ELECTRICITY_15MIN   |    2.21781163934072 |  4.93938974304001 |     7.6609678467393
 DEV_SEED_TRACE_ELECTRICITY_HOURLY  |      10.47300113325 |  12.9734969290002 |    15.4739927247505
 DEV_SEED_TRACE_NATURAL_GAS_DAILY   |   -12.4594891213768 | -6.03261538803008 |   0.394258345316612
 DEV_SEED_TRACE_NATURAL_GAS_MONTHLY |   -774.348987437802 | -580.576019960851 |     -386.8030524839
 DEV_SEED_TRACE_SOLAR_30MIN         |   0.848394785466816 |  3.81981394853938 |    6.79123311161194
 DEV_SEED_TRACE_SOLAR_HOURLY        |                     |                   |
(6 rows)
datastore=#
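
If you would rather pull the mart into the notebook than work in psql, the same query can be run with pandas. This is a sketch only; it assumes the sqlalchemy and psycopg2 packages are installed and that the database is named "datastore" and reachable with your local credentials:

import sqlalchemy

# The connection string is an assumption -- adjust host, user, and database name to your setup.
engine = sqlalchemy.create_engine("postgresql://localhost/datastore")

savings = pd.read_sql("""
    SELECT trace_id,
           differential_lower_bound AS savings_lower_bound,
           differential_value AS savings,
           differential_upper_bound AS savings_upper_bound
    FROM warehouse_meterresultmart
    WHERE project_id = 'DEV_SEED_PROJECT'
      AND derivative_interpretation = 'gross_predicted'
    ORDER BY trace_id;
""", engine)
print(savings)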

Aggregations and groups

Traces can be aggregated by putting them into groups and triggering aggregation runs.

Groups must be named, and are defined by combinations of filters over project_id, trace_id, or arbitrary project metadata.

Filters are created with the following attributes, as either a "filter" or a "filter_boolean", which is a combination of two filters.

Filter types:

"filter":

  • "target", can be:
    • “project_id”
    • “trace_id”
    • “project_metadata|NAME_OF_ATTRIBUTE”
  • "comparison", can be:
    • “>”, “>=”, “<”, “<=”, “==”, “!=”
    • “in”, “not in”
  • "value", can be:
    • int, float, str (for comparisons “>”, “>=”, “<”, “<=”, “==”, “!=”)
    • list of values (for comparisons “in”, “not in”)

"filter_boolean":

  • "boolean", can be:
    • “and”, “or”
  • "filter_a", can be:
    • filter, filter_boolean
  • "filter_b", can be:
    • filter, filter_boolean

Example filter specification creation:

In [24]:
filter_specification = {
    "filter": {
        "target": "project_id",
        "comparison": "==",
        "value": projects[0]["project_id"],
    }
}
In [25]:
trace_group = requests.post(
    base_url + "/api/v1/trace_groups/",  # note: different url!
    json={
        "name": "project_group",
        "filter_specification": filter_specification,
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [26]:
print(json.dumps(trace_group, indent=2))
{
  "id": 3,
  "name": "project_group",
  "filter_specification": {
    "filter": {
      "comparison": "==",
      "target": "project_id",
      "value": "DEV_SEED_PROJECT"
    }
  }
}
In [27]:
aggregation_run = requests.post(
    base_url + "/api/v1/aggregation_runs/",
    json={
        "group": trace_group['id'],
        "trace_interpretation": "ELECTRICITY_CONSUMPTION_SUPPLIED",
        "derivative_interpretation": "annualized_weather_normal",
    },
    headers=headers
).json(object_pairs_hook=OrderedDict)
In [28]:
print(json.dumps(aggregation_run, indent=2))
{
  "id": 7,
  "group": 3,
  "aggregation_result": 1,
  "aggregation_input": null,
  "status": "PENDING",
  "traceback": null,
  "failure_message": null,
  "trace_interpretation": "ELECTRICITY_CONSUMPTION_SUPPLIED",
  "derivative_interpretation": "annualized_weather_normal",
  "aggregation_interpretation": "SUM",
  "added": "2016-11-18T03:04:00.425945Z",
  "updated": "2016-11-18T03:04:00.426601Z"
}
In [29]:
aggregation_run = requests.get(
    base_url + "/api/v1/aggregation_runs/{}/".format(aggregation_run["id"]),
    headers=headers
).json(object_pairs_hook=OrderedDict)
print(json.dumps(aggregation_run, indent=2))
{
  "id": 7,
  "group": 3,
  "aggregation_result": 1,
  "aggregation_input": "https://storage.googleapis.com/my-storage-bucket/datastore/aggregation_inputs/2b986708-1076-4a16-a2dd-31a78f93d817.json",
  "status": "SUCCESS",
  "traceback": null,
  "failure_message": null,
  "trace_interpretation": "ELECTRICITY_CONSUMPTION_SUPPLIED",
  "derivative_interpretation": "annualized_weather_normal",
  "aggregation_interpretation": "SUM",
  "added": "2016-11-18T03:04:00.425945Z",
  "updated": "2016-11-18T03:04:03.806500Z"
}
In [30]:
aggregation_result = requests.get(
    base_url + "/api/v1/aggregation_results/{}/".format(aggregation_run["aggregation_result"]),
    headers=headers
).json(object_pairs_hook=OrderedDict)
print(json.dumps(aggregation_result, indent=2))
{
  "id": 1,
  "aggregation_run": 7,
  "trace_interpretation": "ELECTRICITY_CONSUMPTION_SUPPLIED",
  "derivative_interpretation": "annualized_weather_normal",
  "aggregation_interpretation": "SUM",
  "aggregation_output": "https://storage.googleapis.com/my-storage-bucket/datastore/aggregation_outputs/4295fd00-4ecd-494e-9eeb-884ef620ce14.json",
  "derivatives": [
    7,
    9
  ],
  "unit": "KWH",
  "baseline_value": 4863.15574521486,
  "baseline_lower": 1.5110421742195,
  "baseline_upper": 1.5110421742195,
  "baseline_n": 730.0,
  "reporting_value": 4860.88336194519,
  "reporting_lower": 0.545198012058528,
  "reporting_upper": 0.545198012058528,
  "reporting_n": 730.0,
  "differential_direction": "BASELINE_MINUS_REPORTING",
  "differential_value": 2.27238326967017,
  "differential_lower": 1.60639015330105,
  "differential_upper": 1.60639015330105,
  "differential_n": 1460.0,
  "eemeter_version": "0.4.12",
  "datastore_version": "0.2.3",
  "added": "2016-11-18T03:04:03.701863Z",
  "updated": "2016-11-18T03:04:03.701911Z"
}
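
The differential fields give aggregate savings in the trace unit (here KWH), with the direction indicated by differential_direction. As a quick illustration, relative savings for this group can be computed directly from the response:

# BASELINE_MINUS_REPORTING: a positive differential means the reporting
# period used less energy than the baseline period.
baseline = aggregation_result['baseline_value']
differential = aggregation_result['differential_value']
percent_savings = 100.0 * differential / baseline
print("{:.2f}% savings relative to baseline".format(percent_savings))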

Additional filter examples:

All traces:

None  # leave blank

Traces with project cost less than or equal to 10000:

{
    "filter": {
        "target": "project_metadata|project_cost",
        "comparison": "<=",
        "value": 10000,
    }
}

Traces with project_id in particular set:

{
    "filter": {
        "target": "project_id",
        "comparison": "in",
        "value": [
            "PROJECT_101",
            "PROJECT_102"
        ]
    }
}

Traces with project_id in particular set or with project cost greater than or equal to 5000:

{
    "filter_boolean": {
        "boolean": "or",
        "filter_a": {
            "filter": {
                "target": "project_id",
                "comparison": "in",
                "value": [
                    "PROJECT_101",
                    "PROJECT_102"
                ]
            }
        },
        "filter_b": {
            "filter": {
                "target": "project_metadata|project_cost",
                "comparison": ">=",
                "value": 5000,
            }
        }
    }
}

Deeply nested filter:

{
    "filter_boolean": {
        "boolean": "and",
        "filter_a": {
            "filter_boolean": {
                "boolean": "and",
                "filter_a": {
                    "filter": {
                        "target": "project_metadata|contractor",
                        "comparison": "==",
                        "value": "AAA CONTRACTING",
                    }
                },
                "filter_b": {
                    "filter": {
                        "target": "project_metadata|project_type",
                        "comparison": "!=",
                        "value": "SOLAR"
                    }
                },
            }
        },
        "filter_b": {
            "filter": {
                "target": "project_metadata|project_cost",
                "comparison": ">=",
                "value": 5000,
            }
        }
    }
}
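
Deeply nested specifications can be tedious to write by hand. A small helper that "and"s together an arbitrary list of filters into this nested structure might look like the following (this helper is not part of the datastore, just a convenience sketch):

def and_filters(*filters):
    # Combine filter specifications into nested filter_booleans joined by "and".
    spec = filters[0]
    for next_filter in filters[1:]:
        spec = {
            "filter_boolean": {
                "boolean": "and",
                "filter_a": spec,
                "filter_b": next_filter,
            }
        }
    return spec

spec = and_filters(
    {"filter": {"target": "project_metadata|contractor",
                "comparison": "==", "value": "AAA CONTRACTING"}},
    {"filter": {"target": "project_metadata|project_type",
                "comparison": "!=", "value": "SOLAR"}},
    {"filter": {"target": "project_metadata|project_cost",
                "comparison": ">=", "value": 5000}},
)
print(json.dumps(spec, indent=2))

This produces a specification equivalent to the deeply nested example shown above.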

Group statistics warehouse tables

For easy access to summarized aggregated data, it may be helpful to use the group statistics mart, which is part of the data warehouse that can be created in the postgres database.

This supplements the meter result mart by providing summarized group statistics.

Just as with the meter result mart, the group statistics mart can also be created with a management command:

$ python manage.py groupstatisticsmart recreate