pyobsmod.Dataset
- class pyobsmod.Dataset(obs: ndarray | None = None, mod: ndarray | None = None, df: DataFrame | None = None, time: ndarray | Index | DatetimeIndex | None = None)
Dataset object.
Dataset is a class built on pandas DataFrames for quickly plotting and analyzing observed versus modelled data.
It inherits functionality from the BaseDataset class.
- Parameters:
obs (numpy.ndarray | None) – Observation data. Preferably a 1-dimensional np.ndarray, but lists and tuples are also accepted.
mod (numpy.ndarray | None) – Modelled data. Preferably a 1-dimensional np.ndarray, but lists and tuples are also accepted.
df (pandas.DataFrame | None) – It is also possible to pass a DataFrame directly; it must contain at least the columns obs and mod and associates string values. Note also that if df is None, neither obs nor mod may be None.
time (numpy.ndarray | pandas.Index | pandas.DatetimeIndex | None) – A 1-dimensional array with the time steps of the observation and model data. It is used automatically for the ticks in certain plots, such as Dataset.time_series_plot.
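The input rules described above can be summarized in a small check. This is only an illustrative sketch: `validate_inputs` is a hypothetical helper that is not part of pyobsmod, it takes the DataFrame's column names rather than a DataFrame so that it runs without pandas, and the precedence of df over obs/mod and the equal-length check are assumptions. The library's actual checks and error types may differ.

```python
def validate_inputs(obs=None, mod=None, df_columns=None):
    """Return which input path applies under the documented rules.

    Hypothetical helper for illustration; not part of pyobsmod.
    """
    if df_columns is not None:
        # A DataFrame must contain at least the columns "obs" and "mod".
        missing = {"obs", "mod"} - set(df_columns)
        if missing:
            raise ValueError(f"df is missing required columns: {sorted(missing)}")
        return "df"
    # Without a DataFrame, both obs and mod are required.
    if obs is None or mod is None:
        raise ValueError("obs and mod must both be given when df is None")
    # Assumption: obs and mod describe the same time steps, so lengths match.
    if len(obs) != len(mod):
        raise ValueError("obs and mod must have the same length")
    return "arrays"


print(validate_inputs(obs=[0.1, 0.2], mod=[0.15, 0.25]))
print(validate_inputs(df_columns=["obs", "mod", "station"]))
```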
Examples
Create a dataset from some data.
    import numpy as np
    import pyobsmod as pom

    obs = np.sin(np.arange(100))
    mod = obs + np.random.normal(size=100)
    ds = pom.Dataset(obs, mod)
    print(ds)
    pyobsmod.Dataset(
             obs       mod
    0   0.000000 -1.068269
    1   0.841471  1.518350
    2   0.909297  0.386428
    3   0.141120 -0.093398
    4  -0.756802 -0.878878
    ..       ...       ...
    95  0.683262 -0.840737
    96  0.983588 -0.575326
    97  0.379608 -0.525374
    98 -0.573382 -0.351681
    99 -0.999207 -1.411664

    [100 rows x 2 columns]
    )

- __init__(obs: ndarray | None = None, mod: ndarray | None = None, df: DataFrame | None = None, time: ndarray | Index | DatetimeIndex | None = None) -> None
Initialize the Dataset object.
- Parameters:
obs (numpy.ndarray | None) – Observation data. Preferably a 1-dimensional np.ndarray, but lists and tuples are also accepted.
mod (numpy.ndarray | None) – Modelled data. Preferably a 1-dimensional np.ndarray, but lists and tuples are also accepted.
df (pandas.DataFrame | None) – It is also possible to pass a DataFrame directly; it must contain at least the columns obs and mod and associates string values. Note also that if df is None, neither obs nor mod may be None.
time (numpy.ndarray | pandas.Index | pandas.DatetimeIndex | None) – A 1-dimensional array with the time steps of the observation and model data. It is used automatically for the ticks in certain plots, such as Dataset.time_series_plot.
Methods
__init__([obs, mod, df, time]) – Initialize the Dataset object.
bias() – Bias.
compute_stats(which_stats[, names]) – Compute a list of statistics parameters.
Compute the bias, rmse, nrmse, and r2.
lr() – Perform linear regression (y=ax+b).
nrmse([norm]) – Normalized root mean squared error.
r([method]) – Correlation coefficient.
r2() – Coefficient of determination.
rmse() – Root mean squared error.
save(path) – Save this class as a pickle file.
scatter_plot([which_stats, names, fmt, ax, ...]) – Scatter plot of observed data against modelled data.
scatter_plot_joint([which_stats, names, ...]) – Seaborn scatter plot of observed data against modelled data.
scatter_plot_sns([which_stats, names, fmt, ...]) – Seaborn scatter plot of observed data against modelled data.
stats_plot(which_stats[, names, decimals, ...]) – Textbox plot summarizing specified statistics.
time_series_plot([which_stats, names, fmt, ...]) – Time series plot of observed data against modelled data.
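The statistics named in the method list are standard error metrics. The sketch below shows their usual textbook definitions in plain Python so the quantities are easy to compare. It is not pyobsmod's implementation: the sign convention for the bias (mod minus obs), the range normalization in nrmse, the definition of r2 as 1 - SS_res/SS_tot, the choice of obs as x in the regression, and all function names are assumptions made for illustration.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def bias(obs, mod):
    # Mean difference; the sign convention (mod - obs) is an assumption.
    return _mean([m - o for o, m in zip(obs, mod)])


def rmse(obs, mod):
    # Root mean squared error.
    return math.sqrt(_mean([(m - o) ** 2 for o, m in zip(obs, mod)]))


def nrmse(obs, mod):
    # RMSE divided by the range of the observations (one common choice).
    return rmse(obs, mod) / (max(obs) - min(obs))


def pearson_r(obs, mod):
    # Pearson correlation coefficient.
    mean_o, mean_m = _mean(obs), _mean(mod)
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_m = sum((m - mean_m) ** 2 for m in mod)
    return cov / math.sqrt(var_o * var_m)


def r2(obs, mod):
    # Coefficient of determination, taken here as 1 - SS_res / SS_tot.
    mean_o = _mean(obs)
    ss_res = sum((o - m) ** 2 for o, m in zip(obs, mod))
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def linear_regression(obs, mod):
    # Least-squares fit mod = a * obs + b; returns (a, b).
    mean_o, mean_m = _mean(obs), _mean(mod)
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    a = cov / var_o
    return a, mean_m - a * mean_o


obs = [1.0, 2.0, 3.0, 4.0]
mod = [1.5, 2.5, 3.5, 4.5]  # model overestimates by a constant 0.5
print(bias(obs, mod), rmse(obs, mod), nrmse(obs, mod))
print(pearson_r(obs, mod), r2(obs, mod), linear_regression(obs, mod))
```

With a constant offset like this, the bias and RMSE coincide and the correlation is perfect, while r2 under this definition still drops below 1 because the offset contributes to the residual sum of squares.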
Attributes
data – Alias for self.values.
df – Retrieves the DataFrame stored in the object.
values – Get the dataset as a numpy array.