Billing Model

opendsm.eemeter.models.billing

A module housing billing model classes and functions.

Copyright 2014-2025 OpenDSM contributors

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

BillingModel(settings=None, verbose=False)

A class to fit a model to the input meter data.

BillingModel is a wrapper for the DailyModel class using billing presets.

Attributes:

Name Type Description
settings dict

A dictionary of settings.

seasonal_options list

A list of seasonal options (su: Summer, sh: Shoulder, wi: Winter). Elements in the list are seasons separated by '_' that represent a model split. For example, a list of ['su_sh', 'wi'] represents two splits: summer/shoulder and winter.

day_options list

A list of day options.

combo_dictionary dict

A dictionary of combinations.

df_meter DataFrame

A dataframe of meter data.

error dict

A dictionary of error metrics.

combinations list

A list of combinations.

components list

A list of components.

fit_components list

A list of fit components.

wRMSE_base float

The weighted root mean squared error (wRMSE) for the model with no splits.

best_combination list

The best combination of splits.

model Pipeline

The final fitted model.

id str

The index of the meter data.

Source code in opendsm/eemeter/models/billing/model.py
def __init__(self, settings=None, verbose: bool = False,):
    super().__init__(model="legacy", settings=settings, verbose=verbose)

seasonal_options = [['su_sh_wi'], ['su', 'sh_wi'], ['su_sh', 'wi'], ['su_wi', 'sh'], ['su', 'sh', 'wi']] instance-attribute
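The split notation can be unpacked with ordinary string handling. The following is a minimal sketch, not library code: `expand_split` and `SEASON_NAMES` are hypothetical names introduced here, and the long season names are borrowed from `combo_dictionary`.

```python
# Season codes as documented for seasonal_options; long names mirror combo_dictionary.
SEASON_NAMES = {"su": "summer", "sh": "shoulder", "wi": "winter"}


def expand_split(option):
    """Turn one seasonal option, e.g. ['su_sh', 'wi'], into lists of season names."""
    return [[SEASON_NAMES[code] for code in group.split("_")] for group in option]


seasonal_options = [
    ["su_sh_wi"],
    ["su", "sh_wi"],
    ["su_sh", "wi"],
    ["su_wi", "sh"],
    ["su", "sh", "wi"],
]

# Each option is one candidate way of partitioning the year into model splits.
for option in seasonal_options:
    print(option, "->", expand_split(option))
```

For example, `['su_sh', 'wi']` expands to `[['summer', 'shoulder'], ['winter']]`: one split covering summer and shoulder, and a second covering winter.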

day_options = [['wd', 'we']] instance-attribute

combo_dictionary = {'su': 'summer', 'sh': 'shoulder', 'wi': 'winter', 'fw': [n + 1 for n in n_week], 'wd': [n + 1 for n in n_week if day_dict[n + 1] == 'weekday'], 'we': [n + 1 for n in n_week if day_dict[n + 1] == 'weekend']} instance-attribute
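The expression above refers to `n_week` and `day_dict`, which are not defined on this page. As a rough illustration only, the sketch below assumes `n_week` enumerates seven day slots and `day_dict` maps day numbers 1 to 7 to `"weekday"` or `"weekend"` with Monday as day 1; both are stand-ins and may differ from the library's actual values.

```python
# Assumption: seven day slots (0..6), and Monday = 1 ... Sunday = 7.
# Neither n_week nor day_dict is defined in this page, so both are stand-ins.
n_week = range(7)
day_dict = {day: ("weekday" if day <= 5 else "weekend") for day in range(1, 8)}

combo_dictionary = {
    "su": "summer",
    "sh": "shoulder",
    "wi": "winter",
    "fw": [n + 1 for n in n_week],
    "wd": [n + 1 for n in n_week if day_dict[n + 1] == "weekday"],
    "we": [n + 1 for n in n_week if day_dict[n + 1] == "weekend"],
}

print(combo_dictionary["fw"])  # every day number
print(combo_dictionary["wd"])  # day numbers flagged as weekdays
print(combo_dictionary["we"])  # day numbers flagged as weekends
```

Under these assumptions the season codes map to names while the day codes map to lists of 1-based day numbers.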

verbose = verbose instance-attribute

error = {'wRMSE': np.nan, 'RMSE': np.nan, 'MAE': np.nan, 'CVRMSE': np.nan, 'PNRMSE': np.nan} instance-attribute

to_json()

Returns a JSON string of model parameters.

Returns:

Type Description
str

Model parameters.

Source code in opendsm/eemeter/models/daily/model.py
def to_json(self) -> str:
    """Returns a JSON string of model parameters.

    Returns:
        Model parameters.
    """
    return json.dumps(self.to_dict())

from_dict(data) classmethod

Create an instance of the class from a dictionary (such as one produced from the to_dict method).

Parameters:

Name Type Description Default
data dict

The dictionary containing the model data.

required

Returns:

Type Description
DailyModel

An instance of the class.

Source code in opendsm/eemeter/models/daily/model.py
@classmethod
def from_dict(cls, data) -> DailyModel:
    """Create an instance of the class from a dictionary (such as one produced from the to_dict method).

    Args:
        data (dict): The dictionary containing the model data.

    Returns:
        An instance of the class.

    """
    settings = data.get("settings")
    daily_model = cls(settings=settings)
    info = data.get("info")
    daily_model.params = DailyModelParameters(
        submodels=data.get("submodels"),
        info=info,
        settings=settings,
    )

    def deserialize_warnings(warnings):
        if not warnings:
            return []
        warn_list = []
        for warning in warnings:
            warn_list.append(
                EEMeterWarning(
                    qualified_name=warning.get("qualified_name"),
                    description=warning.get("description"),
                    data=warning.get("data"),
                )
            )
        return warn_list

    daily_model.disqualification = deserialize_warnings(
        info.get("disqualification")
    )
    daily_model.warnings = deserialize_warnings(info.get("warnings"))
    daily_model.baseline_timezone = info.get("baseline_timezone")
    daily_model.is_fitted = True
    return daily_model
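The warning-restoration step can be tried in isolation. This is a minimal sketch in which `WarningRecord` is a hypothetical stand-in for `EEMeterWarning` that carries the same three fields; it is not the library class.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class WarningRecord:
    """Hypothetical stand-in for EEMeterWarning with the same three fields."""

    qualified_name: Optional[str]
    description: Optional[str]
    data: Any


def deserialize_warnings(warnings):
    """Rebuild warning objects from plain dicts; None or [] gives an empty list."""
    if not warnings:
        return []
    return [
        WarningRecord(
            qualified_name=w.get("qualified_name"),
            description=w.get("description"),
            data=w.get("data"),
        )
        for w in warnings
    ]


info = {
    "warnings": [
        {"qualified_name": "example.warning", "description": "demo", "data": {"n": 1}}
    ],
    # "disqualification" is deliberately absent to show the empty-list fallback.
}

restored_warnings = deserialize_warnings(info.get("warnings"))
restored_dq = deserialize_warnings(info.get("disqualification"))
print(restored_warnings, restored_dq)
```

A missing or empty list restores to `[]`. The quoted source calls `info.get(...)` unconditionally, so the input dictionary is expected to carry an `"info"` entry.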

from_json(str_data) classmethod

Create an instance of the class from a JSON string.

Parameters:

Name Type Description Default
str_data str

The JSON string representing the object.

required

Returns:

Type Description
DailyModel

An instance of the class.

Source code in opendsm/eemeter/models/daily/model.py
@classmethod
def from_json(cls, str_data: str) -> DailyModel:
    """Create an instance of the class from a JSON string.

    Args:
        str_data: The JSON string representing the object.

    Returns:
        An instance of the class.

    """
    return cls.from_dict(json.loads(str_data))
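The four serialization methods form a round trip: `to_json` wraps `to_dict`, and `from_json` wraps `from_dict`. The sketch below shows that pattern with `TinyModel`, a hypothetical class that is not part of the library and stores only a settings dictionary.

```python
import json


class TinyModel:
    """Hypothetical stand-in, not DailyModel: it only mirrors the four-method pattern."""

    def __init__(self, settings=None):
        self.settings = settings or {}
        self.is_fitted = False

    def to_dict(self):
        return {"settings": self.settings}

    def to_json(self):
        # Serialization is a thin wrapper around to_dict.
        return json.dumps(self.to_dict())

    @classmethod
    def from_dict(cls, data):
        model = cls(settings=data.get("settings"))
        model.is_fitted = True  # a restored model is treated as already fitted
        return model

    @classmethod
    def from_json(cls, str_data):
        # Deserialization is a thin wrapper around from_dict.
        return cls.from_dict(json.loads(str_data))


original = TinyModel(settings={"alpha": 0.5})
restored = TinyModel.from_json(original.to_json())
print(restored.to_dict() == original.to_dict())
```

Keeping the JSON methods as wrappers means only the dictionary form has to be kept in sync with the model's state.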

from_2_0_dict(data) classmethod

Create an instance of the class from a legacy (2.0) model dictionary.

Parameters:

Name Type Description Default
data dict

A dictionary containing the necessary data (legacy 2.0) to create a DailyModel instance.

required

Returns:

Type Description
DailyModel

An instance of the class.

Source code in opendsm/eemeter/models/daily/model.py
@classmethod
def from_2_0_dict(cls, data) -> DailyModel:
    """Create an instance of the class from a legacy (2.0) model dictionary.

    Args:
        data (dict): A dictionary containing the necessary data (legacy 2.0) to create a DailyModel instance.

    Returns:
        An instance of the class.

    """
    daily_model = cls(model="legacy")
    daily_model.params = DailyModelParameters.from_2_0_params(data)
    daily_model.warnings = []
    daily_model.disqualification = []
    daily_model.baseline_timezone = "UTC"
    daily_model.is_fitted = True
    return daily_model

from_2_0_json(str_data) classmethod

Create an instance of the class from a legacy (2.0) JSON string.

Parameters:

Name Type Description Default
str_data str

The JSON string.

required

Returns:

Type Description
DailyModel

An instance of the class.

Source code in opendsm/eemeter/models/daily/model.py
@classmethod
def from_2_0_json(cls, str_data: str) -> DailyModel:
    """Create an instance of the class from a legacy (2.0) JSON string.

    Args:
        str_data: The JSON string.

    Returns:
        An instance of the class.

    """
    return cls.from_2_0_dict(json.loads(str_data))

fit(baseline_data, ignore_disqualification=False)

Source code in opendsm/eemeter/models/billing/model.py
def fit(
    self, 
    baseline_data: BillingBaselineData, 
    ignore_disqualification: bool = False
) -> BillingModel:
    return super().fit(baseline_data, ignore_disqualification=ignore_disqualification)

predict(reporting_data, aggregation=None, ignore_disqualification=False)

Predicts the energy consumption using the fitted model.

Parameters:

Name Type Description Default
reporting_data BillingBaselineData | BillingReportingData

The data used for prediction.

required
aggregation str | None

The aggregation level for the prediction. One of [None, 'none', 'monthly', 'bimonthly'].

None
ignore_disqualification bool

Whether to ignore model disqualification. Defaults to False.

False

Returns:

Type Description
DataFrame

Dataframe with input data along with predicted energy consumption.

Raises:

Type Description
RuntimeError

If the model is not fitted.

DisqualifiedModelError

If the model is disqualified and ignore_disqualification is False.

TypeError

If the reporting data is not of type BillingBaselineData or BillingReportingData.

ValueError

If the aggregation is not one of [None, 'none', 'monthly', 'bimonthly'].
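The accepted aggregation values can be summarized as a small lookup. This is a sketch rather than the library's implementation: `resolve_aggregation` is a hypothetical helper, and the aliases `"MS"` and `"2MS"` are the resample frequencies used in the `predict` source.

```python
# Offset aliases taken from the predict source: "MS" and "2MS".
_AGGREGATION_ALIASES = {"monthly": "MS", "bimonthly": "2MS"}


def resolve_aggregation(aggregation):
    """Return the resample alias for an aggregation value, or None for no aggregation."""
    # As in the source, only the "none" spelling is matched case-insensitively.
    if aggregation is None or aggregation.lower() == "none":
        return None
    try:
        return _AGGREGATION_ALIASES[aggregation]
    except KeyError:
        raise ValueError(
            "aggregation must be one of [None, 'none', 'monthly', 'bimonthly']"
        ) from None


for value in (None, "none", "monthly", "bimonthly"):
    print(repr(value), "->", repr(resolve_aggregation(value)))
```

Any other string, including a capitalized `"Monthly"`, raises `ValueError` in this sketch.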

Source code in opendsm/eemeter/models/billing/model.py
def predict(
    self,
    reporting_data: BillingBaselineData | BillingReportingData,
    aggregation: str | None = None,
    ignore_disqualification: bool = False,
) -> pd.DataFrame:
    """Predicts the energy consumption using the fitted model.

    Args:
        reporting_data: The data used for prediction.
        aggregation: The aggregation level for the prediction. One of [None, 'none', 'monthly', 'bimonthly'].
        ignore_disqualification: Whether to ignore model disqualification. Defaults to False.

    Returns:
        Dataframe with input data along with predicted energy consumption.

    Raises:
        RuntimeError: If the model is not fitted.
        DisqualifiedModelError: If the model is disqualified and ignore_disqualification is False.
        TypeError: If the reporting data is not of type BillingBaselineData or BillingReportingData.
        ValueError: If the aggregation is not one of [None, 'none', 'monthly', 'bimonthly'].
    """
    if not self.is_fitted:
        raise RuntimeError("Model must be fit before predictions can be made.")

    if self.disqualification and not ignore_disqualification:
        raise DisqualifiedModelError(
            "Attempting to predict using disqualified model without setting ignore_disqualification=True"
        )

    if not isinstance(reporting_data, (BillingBaselineData, BillingReportingData)):
        raise TypeError(
            "reporting_data must be a BillingBaselineData or BillingReportingData object"
        )

    df = getattr(reporting_data, self._data_df_name)
    df_res = self._predict(df)

    if aggregation is None:
        agg = None
    elif aggregation.lower() == "none":
        agg = None
    elif aggregation == "monthly":
        agg = "MS"
    elif aggregation == "bimonthly":
        agg = "2MS"
    else:
        raise ValueError(
            "aggregation must be one of [None, 'monthly', 'bimonthly']"
        )

    if agg is not None:
        sum_quad = lambda x: np.sqrt(np.sum(np.square(x)))

        season = df_res["season"].resample(agg).first()
        temperature = df_res["temperature"].resample(agg).mean()
        observed = df_res["observed"].resample(agg).sum()
        predicted = df_res["predicted"].resample(agg).sum()
        predicted_unc = df_res["predicted_unc"].resample(agg).apply(sum_quad)
        heating_load = df_res["heating_load"].resample(agg).sum()
        cooling_load = df_res["cooling_load"].resample(agg).sum()
        model_split = df_res["model_split"].resample(agg).first()
        model_type = df_res["model_type"].resample(agg).first()

        df_res = pd.concat(
            [
                season,
                temperature,
                observed,
                predicted,
                predicted_unc,
                heating_load,
                cooling_load,
                model_split,
                model_type,
            ],
            axis=1,
        )

    return df_res
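When aggregating, `predicted` is summed but `predicted_unc` is combined as the square root of the sum of squares, the usual rule when per-row errors are treated as independent. The sketch below reproduces that arithmetic with the standard library only; the rows are made-up numbers, and grouping by month start is a simplified stand-in for `resample("MS")`.

```python
import math
from collections import defaultdict
from datetime import date


def sum_quad(values):
    """Combine uncertainties in quadrature: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in values))


# Made-up daily rows: (day, predicted, predicted_unc)
rows = [
    (date(2024, 1, 30), 10.0, 3.0),
    (date(2024, 1, 31), 12.0, 4.0),
    (date(2024, 2, 1), 8.0, 6.0),
    (date(2024, 2, 2), 9.0, 8.0),
]

by_month = defaultdict(lambda: {"predicted": [], "unc": []})
for day, predicted, unc in rows:
    key = day.replace(day=1)  # month start, analogous to the "MS" alias
    by_month[key]["predicted"].append(predicted)
    by_month[key]["unc"].append(unc)

# Predictions add linearly; uncertainties add in quadrature.
monthly = {
    month: (sum(vals["predicted"]), sum_quad(vals["unc"]))
    for month, vals in by_month.items()
}
print(monthly)
```

For January the predictions total 22.0 while the uncertainties 3.0 and 4.0 combine to 5.0 rather than 7.0, so the aggregated uncertainty grows more slowly than a plain sum would.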

plot(data, aggregation=None)

Plot a model fit with baseline or reporting data. Requires matplotlib to use.

Parameters:

Name Type Description Default
data

The baseline or reporting data object to plot.

required
aggregation str | None

The aggregation level for the prediction. One of [None, 'none', 'monthly', 'bimonthly'].

None

Source code in opendsm/eemeter/models/billing/model.py
def plot(
    self,
    data,
    aggregation: str | None = None,
):
    """Plot a model fit with baseline or reporting data. Requires matplotlib to use.

    Args:
        data: The baseline or reporting data object to plot.
        aggregation: The aggregation level for the prediction. One of [None, 'none', 'monthly', 'bimonthly'].
    """
    try:
        from opendsm.eemeter.models.billing.plot import plot
    except ImportError:  # pragma: no cover
        raise ImportError("matplotlib is required for plotting.")

    # TODO: pass more kwargs to plotting function

    plot(self, self.predict(data, aggregation=aggregation))

to_dict()

Returns a dictionary of model parameters.

Returns:

Type Description
dict

Model parameters.

Source code in opendsm/eemeter/models/billing/model.py
def to_dict(self) -> dict:
    """Returns a dictionary of model parameters.

    Returns:
        Model parameters.
    """
    model_dict = super().to_dict()
    model_dict["settings"]["developer_mode"] = True

    return model_dict
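The override follows a common pattern: take the parent's dictionary and adjust one entry before returning it. The sketch below illustrates this with two hypothetical classes, `BaseModel` and `BillingLikeModel`. The deep copy in the parent is an assumption made here so that the export does not alter the model's own settings; whether the library's parent `to_dict` copies is not shown on this page.

```python
import copy


class BaseModel:
    """Hypothetical parent standing in for DailyModel."""

    def __init__(self, settings=None):
        self.settings = dict(settings or {})

    def to_dict(self):
        # Assumption: return a deep copy so callers cannot mutate the model's settings.
        return {"settings": copy.deepcopy(self.settings)}


class BillingLikeModel(BaseModel):
    """Hypothetical subclass mirroring the to_dict override shown above."""

    def to_dict(self):
        model_dict = super().to_dict()
        model_dict["settings"]["developer_mode"] = True
        return model_dict


model = BillingLikeModel(settings={"developer_mode": False})
exported = model.to_dict()
print(exported["settings"]["developer_mode"], model.settings["developer_mode"])
```

In this sketch the exported dictionary always reports `developer_mode` as `True`, whatever the instance was constructed with.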

BillingBaselineData(df, is_electricity_data)

Data class to represent Billing Baseline Data.

Only baseline data should go into the dataframe input; do not include blackout data. The class checks the input for sufficiency against OpenEEMeter specifications and populates disqualifications and warnings accordingly.

Billing data should have an extra month's data appended at the end to denote the end of the period. (Do not append NaN; any other value works.)

Parameters:

Name Type Description Default
df DataFrame

A dataframe with a timezone-aware datetime index or datetime column. It also requires two more columns: 'observed' for meter data and 'temperature' for temperature data. Temperature values must be in Fahrenheit; convert other units before passing them in.

required
is_electricity_data bool

Flag to ascertain if this is electricity data or not. Electricity data values of 0 are set to NaN.

required
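Because the temperature column is expected in Fahrenheit, Celsius readings need converting first. A minimal sketch of that conversion follows; `celsius_to_fahrenheit` is a hypothetical helper name and the sample readings are invented.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert a Celsius reading to Fahrenheit: F = C * 9/5 + 32."""
    return temp_c * 9.0 / 5.0 + 32.0


# Invented Celsius readings, converted before building the input dataframe.
temperatures_c = [-10.0, 0.0, 20.0, 37.0]
temperatures_f = [celsius_to_fahrenheit(t) for t in temperatures_c]
print(temperatures_f)
```

The same arithmetic applies element-wise to a pandas column, for example `df["temperature"] * 9 / 5 + 32`.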

Attributes:

Name Type Description
df DataFrame

Immutable dataframe that contains the meter and temperature values for the baseline data period.

disqualification list[EEMeterWarning]

A list of serious issues with the data that can degrade the quality of the model. If you want to go ahead with building the model while ignoring them, set the ignore_disqualification = True flag in the model. By default disqualifications are not ignored.

warnings list[EEMeterWarning]

A list of issues with the data, none of which will severely reduce the quality of the model built.

Source code in opendsm/eemeter/models/daily/data.py
def __init__(self, df: pd.DataFrame, is_electricity_data: bool):
    self._df = None
    self.warnings = []
    self.disqualification = []
    self.is_electricity_data = is_electricity_data
    self.tz = None

    # TODO re-examine dq/warning pattern. keep consistent between
    # either implicitly setting as side effects, or returning and assigning outside
    self._df, temp_coverage = self._set_data(df)

    sufficiency_df = self._df.merge(
        temp_coverage, left_index=True, right_index=True, how="left"
    )
    disqualification, warnings = self._check_data_sufficiency(sufficiency_df)

    self.disqualification += disqualification
    self.warnings += warnings
    self.log_warnings()

warnings = [] instance-attribute

disqualification = [] instance-attribute

is_electricity_data = is_electricity_data instance-attribute

tz = None instance-attribute

df: pd.DataFrame | None property

Get the corrected input data stored in the class. The actual dataframe is immutable, this returns a copy.

billing_df: pd.DataFrame | None property

Get the corrected input data stored in the class. The actual dataframe is immutable, this returns a copy.

from_series(meter_data, temperature_data, is_electricity_data) classmethod

Create an instance of the Data class from meter data and temperature data.

Public method that can handle two separate series (meter and temperature) and join them to create a single dataframe. The temperature column should have values in Fahrenheit.

Parameters:

Name Type Description Default
meter_data Series | DataFrame

The meter data.

required
temperature_data Series | DataFrame

The temperature data.

required
is_electricity_data bool

A flag indicating whether the data represents electricity data. This is required as electricity data with 0 values are converted to NaNs.

required

Returns:

Type Description

An instance of the Data class with the dataframe populated with the corrected data, along with warnings and disqualifications based on the input.

Source code in opendsm/eemeter/models/daily/data.py
@classmethod
def from_series(
    cls,
    meter_data: pd.Series | pd.DataFrame,
    temperature_data: pd.Series | pd.DataFrame,
    is_electricity_data: bool,
):
    """Create an instance of the Data class from meter data and temperature data.

    Public method that can handle two separate series (meter and temperature) and join them to create a single dataframe. The temperature column should have values in Fahrenheit.

    Args:
        meter_data: The meter data.
        temperature_data: The temperature data.
        is_electricity_data: A flag indicating whether the data represents electricity data. This is required as electricity data with 0 values are converted to NaNs.

    Returns:
        An instance of the Data class with the dataframe populated with the corrected data, along with warnings and disqualifications based on the input.
    """
    if isinstance(meter_data, pd.Series):
        meter_data = meter_data.to_frame()
    if isinstance(temperature_data, pd.Series):
        temperature_data = temperature_data.to_frame()
    meter_data = meter_data.rename(columns={meter_data.columns[0]: "observed"})
    temperature_data = temperature_data.rename(
        columns={temperature_data.columns[0]: "temperature"}
    )
    temperature_data.index = temperature_data.index.tz_convert(
        meter_data.index.tzinfo
    )

    if temperature_data.empty:
        raise ValueError("Temperature data cannot be empty.")
    if meter_data.empty:
        # reporting from_series always passes a full index of nan
        raise ValueError("Meter data cannot be empty.")

    is_billing_data = False
    if not meter_data.empty:
        is_billing_data = compute_minimum_granularity(
            meter_data.index, "billing"
        ).startswith("billing")

    # first, trim the data to exclude NaNs on the outer edges of the data
    last_meter_index = meter_data.last_valid_index()
    if is_billing_data:
        # preserve final NaN for billing data only
        last = meter_data.last_valid_index()
        if last and last != meter_data.index[-1]:
            # TODO include warning here for non-NaN final billing row since it will be discarded
            last_meter_index = meter_data.index[meter_data.index.get_loc(last) + 1]
    meter_data = meter_data.loc[meter_data.first_valid_index() : last_meter_index]
    temperature_data = temperature_data.loc[
        temperature_data.first_valid_index() : temperature_data.last_valid_index()
    ]

    # TODO consider a refactor of the period offset calculation/slicing.
    # it seems like a fairly dense block of code for something conceptually simple.
    # at the very least, try to clarify variable names a bit

    period_diff_first = pd.Timedelta(0)
    period_diff_last = pd.Timedelta(0)
    # calculate difference in period length for first and last rows in meter/temp
    # first/last will generally be the same offset for daily/hourly, but billing can be quite variable
    # could consider using to_offset(index.inferred_freq) if available,
    # but the intent here is just to provide a lenient first trim.
    # checking for consistent frequency is done later during __init__
    if len(meter_data.index) > 1 and len(temperature_data.index) > 1:
        period_meter_first = meter_data.index[1] - meter_data.index[0]
        period_temp_first = temperature_data.index[1] - temperature_data.index[0]
        period_diff_first = period_meter_first - period_temp_first

        period_meter_last = meter_data.index[-1] - meter_data.index[-2]
        period_temp_last = temperature_data.index[-1] - temperature_data.index[-2]
        period_diff_last = period_meter_last - period_temp_last

    # if diff is positive, meter period is longer (lower frequency)
    zero_offset = pd.Timedelta(0)
    meter_period_first_longer = period_diff_first > zero_offset
    meter_period_last_longer = period_diff_last > zero_offset

    # large period needs a buffer for the min index, and no buffer for the max index
    # short period needs a buffer for the max index, and no buffer for the min index
    meter_offset_first = (
        period_diff_first if meter_period_first_longer else zero_offset
    )
    meter_offset_last = (
        -period_diff_last if not meter_period_last_longer else zero_offset
    )
    temp_offset_first = (
        -period_diff_first if not meter_period_first_longer else zero_offset
    )
    temp_offset_last = period_diff_last if meter_period_last_longer else zero_offset

    # if the shorter period ends on an exact index of the longer, we accept it.
    # the data should be DQ'd later due to insufficiency for the period

    # constrain meter index to temperature index
    temp_index_min = temperature_data.index.min() - meter_offset_first
    temp_index_max = temperature_data.index.max() + meter_offset_last
    meter_data = meter_data[temp_index_min:temp_index_max]
    if meter_data.empty:
        raise ValueError("Meter and temperature data are fully misaligned.")

    # if billing detected, subtract one day from final index since dataframe input assumes final row is part of period
    if is_billing_data:
        new_index = meter_data.index[:-1].union(
            [(meter_data.index[-1] - pd.Timedelta(days=1))]
        )
        if len(new_index) == len(meter_data.index):
            meter_data.index = new_index
        else:
            # handles the case of a 1 day off-cycle read at end of series
            meter_data = meter_data[:-1]

    # constrain temperature index to meter index
    meter_index_min = meter_data.index.min() - temp_offset_first
    meter_index_max = meter_data.index.max() + temp_offset_last
    if is_billing_data and len(meter_data) > 1:
        # last billing period is offset by one index
        meter_index_max = meter_data.index[-2] + temp_offset_last
    temperature_data = temperature_data[meter_index_min:meter_index_max]

    if is_billing_data:
        # TODO consider adding misaligned data warning here if final row was not already NaN
        meter_data.iloc[-1] = np.nan

    df = pd.concat([meter_data, temperature_data], axis=1)
    return cls(df, is_electricity_data)
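The period-offset logic is easier to follow with concrete numbers. The sketch below re-implements only the first-row half of that calculation using `datetime` objects instead of pandas; `first_row_offsets` is a hypothetical helper and the two indices are invented (monthly billing reads against daily temperatures).

```python
from datetime import datetime, timedelta, timezone

ZERO = timedelta(0)


def first_row_offsets(meter_index, temp_index):
    """Return (meter_offset_first, temp_offset_first), following the quoted source."""
    period_diff_first = ZERO
    if len(meter_index) > 1 and len(temp_index) > 1:
        period_meter_first = meter_index[1] - meter_index[0]
        period_temp_first = temp_index[1] - temp_index[0]
        period_diff_first = period_meter_first - period_temp_first
    # A positive difference means the meter period is longer (lower frequency).
    meter_longer = period_diff_first > ZERO
    meter_offset_first = period_diff_first if meter_longer else ZERO
    temp_offset_first = -period_diff_first if not meter_longer else ZERO
    return meter_offset_first, temp_offset_first


utc = timezone.utc
# Invented indices: two monthly reads, and 40 daily temperatures starting Jan 10.
meter_index = [datetime(2024, 1, 1, tzinfo=utc), datetime(2024, 2, 1, tzinfo=utc)]
temp_index = [datetime(2024, 1, 10, tzinfo=utc) + timedelta(days=i) for i in range(40)]

meter_offset, temp_offset = first_row_offsets(meter_index, temp_index)
# Lower bound used when constraining the meter index to the temperature index.
earliest_meter_kept = min(temp_index) - meter_offset
print(meter_offset, temp_offset, earliest_meter_kept)
```

Here the meter period (31 days) exceeds the temperature period (1 day) by 30 days, so the lower bound moves back 30 days to 2023-12-11 and the January 1 read survives the trim even though temperatures only begin on January 10.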

log_warnings()

Logs the warnings and disqualifications associated with the data.

View the disqualifications and warnings associated with the current data input provided.

Returns:

Type Description
None

None

Source code in opendsm/eemeter/models/daily/data.py
def log_warnings(self) -> None:
    """Logs the warnings and disqualifications associated with the data.

    View the disqualifications and warnings associated with the current data input provided.

    Returns:
        None
    """
    for warning in self.warnings + self.disqualification:
        warning.warn()

BillingReportingData(df, is_electricity_data)

Data class to represent Billing Reporting Data.

Only reporting data should go into the dataframe input; do not include blackout data. The class checks the input for sufficiency against OpenEEMeter specifications and populates disqualifications and warnings accordingly.

Meter data input is optional for the reporting class.

Parameters:

Name Type Description Default
df DataFrame

A dataframe with a timezone-aware datetime index or datetime column. It requires a 'temperature' column with values in Fahrenheit; convert other units before passing them in. An 'observed' column for meter data is optional and is filled with NaN when absent.

required
is_electricity_data bool

Flag to ascertain if this is electricity data or not. Electricity data values of 0 are set to NaN.

required

Attributes:

Name Type Description
df DataFrame

Immutable dataframe that contains the meter and temperature values for the reporting data period.

disqualification list[EEMeterWarning]

A list of serious issues with the data that can degrade the quality of the model. If you want to go ahead with building the model while ignoring them, set the ignore_disqualification = True flag in the model. By default disqualifications are not ignored.

warnings list[EEMeterWarning]

A list of issues with the data, none of which will severely reduce the quality of the model built.

Source code in opendsm/eemeter/models/billing/data.py
def __init__(self, df: pd.DataFrame, is_electricity_data: bool):
    df = df.copy()
    if "observed" not in df.columns:
        df["observed"] = np.nan

    super().__init__(df, is_electricity_data)

warnings = [] instance-attribute

disqualification = [] instance-attribute

is_electricity_data = is_electricity_data instance-attribute

tz = None instance-attribute

df: pd.DataFrame | None property

Get the corrected input data stored in the class. The actual dataframe is immutable, this returns a copy.

billing_df: pd.DataFrame | None property

Get the corrected input data stored in the class. The actual dataframe is immutable, this returns a copy.

log_warnings()

Logs the warnings and disqualifications associated with the data.

View the disqualifications and warnings associated with the current data input provided.

Returns:

Type Description
None

None

Source code in opendsm/eemeter/models/daily/data.py
def log_warnings(self) -> None:
    """Logs the warnings and disqualifications associated with the data.

    View the disqualifications and warnings associated with the current data input provided.

    Returns:
        None
    """
    for warning in self.warnings + self.disqualification:
        warning.warn()

from_series(meter_data, temperature_data, is_electricity_data, tzinfo=None) classmethod

Create a BillingReportingData instance from meter data and temperature data.

Parameters:

Name Type Description Default
meter_data Series | DataFrame | None

The meter data to be used for the BillingReportingData instance.

required
temperature_data Series | DataFrame

The temperature data to be used for the BillingReportingData instance.

required
is_electricity_data bool

Flag indicating whether the meter data represents electricity data.

required
tzinfo tzinfo | None

Timezone information to be used for the meter data.

None

Returns:

Type Description

An instance of the Data class.

Source code in opendsm/eemeter/models/billing/data.py
@classmethod
def from_series(
    cls,
    meter_data: pd.Series | pd.DataFrame | None,
    temperature_data: pd.Series | pd.DataFrame,
    is_electricity_data: bool,
    tzinfo: datetime.tzinfo | None = None,
):
    """Create a BillingReportingData instance from meter data and temperature data.

    Args:
        meter_data: The meter data to be used for the BillingReportingData instance.
        temperature_data: The temperature data to be used for the BillingReportingData instance.
        is_electricity_data: Flag indicating whether the meter data represents electricity data.
        tzinfo: Timezone information to be used for the meter data.

    Returns:
        An instance of the Data class.
    """
    if tzinfo and meter_data is not None:
        raise ValueError(
            "When passing meter data to BillingReportingData, convert its DatetimeIndex to local timezone first; `tzinfo` param should only be used in the absence of reporting meter data."
        )
    if is_electricity_data is None and meter_data is not None:
        raise ValueError(
            "Must specify is_electricity_data when passing meter data."
        )
    if meter_data is None:
        meter_data = pd.DataFrame(
            {"observed": np.nan}, index=temperature_data.index
        )
        if tzinfo:
            meter_data = meter_data.tz_convert(tzinfo)
    if meter_data.empty:
        raise ValueError(
            "Pass meter_data=None to explicitly create a temperature-only reporting data instance."
        )
    return super().from_series(meter_data, temperature_data, is_electricity_data)
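The argument rules above can be exercised without pandas. The sketch below mirrors the `tzinfo` and empty-data checks using plain dictionaries keyed by timestamp; `prepare_reporting_inputs` is a hypothetical helper, the error messages are shortened, and the `is_electricity_data` check is omitted for brevity.

```python
from datetime import datetime, timedelta, timezone


def prepare_reporting_inputs(meter_data, temperature_index, tzinfo=None):
    """Mirror the argument checks of from_series using plain Python containers.

    meter_data is a dict of timestamp -> value, or None for temperature-only data.
    """
    if tzinfo and meter_data is not None:
        raise ValueError("tzinfo should only be used when meter_data is None.")
    if meter_data is None:
        # Placeholder readings: one NaN per temperature timestamp.
        meter_data = {ts: float("nan") for ts in temperature_index}
        if tzinfo:
            meter_data = {ts.astimezone(tzinfo): v for ts, v in meter_data.items()}
    if not meter_data:
        raise ValueError("Pass meter_data=None for temperature-only reporting data.")
    return meter_data


utc = timezone.utc
local = timezone(timedelta(hours=-8))  # invented fixed offset for the example
temperature_index = [datetime(2024, 1, 1, tzinfo=utc) + timedelta(days=i) for i in range(3)]

placeholder = prepare_reporting_inputs(None, temperature_index, tzinfo=local)
print(placeholder)
```

Passing `None` yields NaN placeholders on the temperature timestamps, converted to the requested timezone, whereas an empty container is rejected so that temperature-only usage stays explicit.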