mlos_viz.base module

Base functions for visualizing, explaining, and gaining insights from results.

mlos_viz.base.augment_results_df_with_config_trial_group_stats(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, requested_result_cols: Iterable[str] | None = None) → DataFrame

Add a number of useful statistical measure columns to the results dataframe.

In particular, for each requested numeric result column, we add the following columns:

  • “.p50”: the median of each config trial group results

  • “.p75”: the p75 of each config trial group results

  • “.p90”: the p90 of each config trial group results

  • “.p95”: the p95 of each config trial group results

  • “.p99”: the p99 of each config trial group results

  • “.mean”: the mean of each config trial group results

  • “.stddev”: the standard deviation of each config trial group results

  • “.var”: the variance of each config trial group results

  • “.var_zscore”: the z-score of this group's variance (i.e., how far the variance lies from the mean of all group variances, measured in standard deviations of those variances). This can be useful for filtering out outliers: restricting to abs < 2 removes configs whose variance is more than two standard deviations from the mean variance across all config trial groups.

Additionally, we add a “tunable_config_trial_group_size” column that indicates the number of trials using a particular config.
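The columns listed above can be illustrated with plain pandas. This is a minimal sketch and not the mlos_viz implementation: the column names “group_id” and “result.score” and the sample values are assumptions made for illustration, and pandas' default sample statistics (ddof=1) are used, which may differ from what the library does.

```python
import pandas as pd

# Hypothetical results: two config trial groups with three trials each.
results_df = pd.DataFrame({
    "group_id": [1, 1, 1, 2, 2, 2],  # assumed grouping column name
    "result.score": [10.0, 12.0, 14.0, 20.0, 20.0, 26.0],  # assumed result column
})
col = "result.score"

# One row of statistics per config trial group.
grouped = results_df.groupby("group_id")[col]
stats = grouped.agg(p50="median", mean="mean", stddev="std", var="var")
stats["p90"] = grouped.quantile(0.90)

# z-score of each group's variance relative to all group variances.
stats["var_zscore"] = (stats["var"] - stats["var"].mean()) / stats["var"].std()

# Attach the per-group statistics to every trial row as "<col>.<stat>" columns.
augmented = results_df.join(stats.add_prefix(col + "."), on="group_id")
augmented["tunable_config_trial_group_size"] = (
    augmented.groupby("group_id")[col].transform("count")
)

# Drop configs whose variance is two or more standard deviations from the mean.
filtered_df = augmented[augmented[col + ".var_zscore"].abs() < 2]
```

With only two groups the z-scores are symmetric around zero, so the filter keeps every row here; the filter becomes meaningful once many config trial groups are present.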

Parameters:
exp_data: Optional[ExperimentData]

The ExperimentData (e.g., obtained from the storage layer) to operate on.

results_df: Optional[pandas.DataFrame]

The results dataframe to augment, by default None to use the results_df property.

requested_result_cols: Optional[Iterable[str]]

Which results columns to augment, by default None to use all results columns that look numeric.

Returns:
pandas.DataFrame

The augmented results dataframe.

mlos_viz.base.ignore_plotter_warnings() → None

Suppress some annoying warnings from third-party data visualization packages by adding them to the warnings filter.
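Adding entries to the warnings filter can be sketched with the standard library as follows. The helper name, the warning category, and the message pattern are illustrative assumptions; the warnings that ignore_plotter_warnings actually filters are not listed in this page.

```python
import warnings


def ignore_noisy_plot_warnings() -> None:
    """Hypothetical helper: hide one known-noisy warning from plotting code."""
    # The category and message pattern below are assumptions for illustration.
    warnings.filterwarnings(
        "ignore",
        category=FutureWarning,
        message=r".*use_inf_as_na.*",
    )


# Usage: matching warnings are dropped, everything else still surfaces.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ignore_noisy_plot_warnings()
    warnings.warn("use_inf_as_na option is deprecated", FutureWarning)
    warnings.warn("something else worth seeing", UserWarning)
```

Filtering on a narrow category and message pattern, rather than ignoring all warnings, keeps unrelated diagnostics visible.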

mlos_viz.base.limit_top_n_configs(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None, top_n_configs: int = 10, method: Literal['mean', 'p50', 'p75', 'p90', 'p95', 'p99'] = 'mean') → Tuple[DataFrame, List[int], Dict[str, bool]]

Utility function to process the results and determine the best-performing configs, including potential repeats to help assess variability.

Parameters:
exp_data: Optional[ExperimentData]

The ExperimentData (e.g., obtained from the storage layer) to operate on.

results_df: Optional[pandas.DataFrame]

The results dataframe to augment, by default None to use the results_df property.

objectives: Optional[Dict[str, Literal[“min”, “max”]]]

Which result column(s) to use for sorting the configs, and in which direction (“min” or “max”). By default None to automatically select the experiment objectives.

top_n_configs: int, optional

How many configs to return, including the default, by default 10.

method: Literal[“mean”, “p50”, “p75”, “p90”, “p95”, “p99”], optional

Which statistical method to use when sorting the config groups before determining the cutoff, by default “mean”.

Returns:
(top_n_config_results_df, top_n_config_ids, orderby_cols): Tuple[pandas.DataFrame, List[int], Dict[str, bool]]

The filtered results dataframe, the config ids, and the columns used to order the configs.
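The selection described above can be sketched in plain pandas. This is an illustration under stated assumptions, not the library's implementation: the column names, the known default config id, and the reading of the returned Dict[str, bool] as a per-column ascending flag are all assumptions.

```python
import pandas as pd

# Hypothetical results: four configs with two trials each.
results_df = pd.DataFrame({
    "config_id": [1, 1, 2, 2, 3, 3, 4, 4],  # assumed config id column
    "result.score": [9.0, 11.0, 5.0, 7.0, 3.0, 4.0, 8.0, 8.0],
})
default_config_id = 1  # assumption: the default config id is known up front
objectives = {"result.score": "min"}
top_n_configs = 2

# Assumed meaning of the bool: True sorts ascending (i.e., "min" objectives).
orderby_cols = {col: direction == "min" for col, direction in objectives.items()}

# Rank config groups by the chosen statistic (here, the mean).
group_means = results_df.groupby("config_id")[list(orderby_cols)].mean()
ranked = group_means.sort_values(
    by=list(orderby_cols), ascending=list(orderby_cols.values())
)

# Reserve one slot for the default config, then fill with the best others.
best_others = [int(cid) for cid in ranked.index if cid != default_config_id]
top_n_config_ids = [default_config_id] + best_others[: top_n_configs - 1]

# Keep every trial (including repeats) for the selected configs.
top_n_config_results_df = results_df[
    results_df["config_id"].isin(top_n_config_ids)
]
```

Keeping all repeat trials for the selected configs, instead of one aggregated row per config, is what allows variability to be assessed afterwards.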

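mlos_viz.base.plot_optimizer_trends(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None) → None

Plots the optimizer trends for the Experiment.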

Parameters:
exp_data: ExperimentData

The ExperimentData (e.g., obtained from the storage layer) to plot.

results_df: Optional[pandas.DataFrame]

Optional results_df to plot. If not provided, defaults to the exp_data.results_df property.

objectives: Optional[Dict[str, Literal[“min”, “max”]]]

Optional objectives to plot. If not provided, defaults to the exp_data.objectives property.

mlos_viz.base.plot_top_n_configs(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None, with_scatter_plot: bool = False, **kwargs: Any) → None

Plots the top-N configs along with the default config for the given ExperimentData.

Intended to be used from a Jupyter notebook.

Parameters:
exp_data: ExperimentData

The experiment data to plot.

results_df: Optional[pandas.DataFrame]

Optional results_df to plot. If not provided, defaults to the exp_data.results_df property.

objectives: Optional[Dict[str, Literal[“min”, “max”]]]

Optional objectives to plot. If not provided, defaults to the exp_data.objectives property.

with_scatter_plot: bool

Whether to also add a scatter plot to the output figure.

kwargs: dict

Remaining keyword arguments are passed along to the limit_top_n_configs function.