mlos_viz.base module

Base functions for visualizing, explaining, and gaining insights from results.

mlos_viz.base.augment_results_df_with_config_trial_group_stats(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, requested_result_cols: Iterable[str] | None = None) → DataFrame

Add a number of useful statistical measure columns to the results dataframe.

In particular, for each requested numeric result column, we add the following columns:

“.p50”: the median of each config trial group results
“.p75”: the p75 of each config trial group results
“.p90”: the p90 of each config trial group results
“.p95”: the p95 of each config trial group results
“.p99”: the p99 of each config trial group results
“.mean”: the mean of each config trial group results
“.stddev”: the standard deviation of each config trial group results
“.var”: the variance of each config trial group results
“.var_zscore”: the z-score of this group's variance (i.e., how far it lies from the mean of all group variances, measured in standard deviations of those variances). This can be useful for filtering out outliers (e.g., restricting to abs < 2 removes configs whose variance is more than two standard deviations from the mean variance across all config trial groups).

Additionally, we add a “tunable_config_trial_group_size” column that indicates the number of trials using a particular config.
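The column scheme above can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name add_group_stats, the column names config_group, score, and group_size, and the use of the sample (ddof=1) standard deviation and variance are all assumptions made for illustration.

```python
import pandas as pd


def add_group_stats(df: pd.DataFrame, group_col: str, result_col: str) -> pd.DataFrame:
    """Return a copy of df with per-group statistics broadcast onto every row.

    Hypothetical helper: it mirrors the suffix scheme described above but is
    not the mlos_viz implementation.
    """
    out = df.copy()
    grouped = df.groupby(group_col)[result_col]

    # Percentile columns; p50 is the median of each group.
    for pct in (50, 75, 90, 95, 99):
        out[f"{result_col}.p{pct}"] = out[group_col].map(grouped.quantile(pct / 100))

    out[f"{result_col}.mean"] = grouped.transform("mean")
    out[f"{result_col}.stddev"] = grouped.transform("std")  # sample stddev (ddof=1)
    out[f"{result_col}.var"] = grouped.transform("var")     # sample variance (ddof=1)

    # z-score of each group's variance relative to all group variances.
    group_var = grouped.var()
    var_zscore = (group_var - group_var.mean()) / group_var.std()
    out[f"{result_col}.var_zscore"] = out[group_col].map(var_zscore)

    # Number of trials that used each config group.
    out["group_size"] = grouped.transform("count")
    return out


# Made-up data: three config groups with three trials each.
demo = pd.DataFrame({
    "config_group": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "score": [1.0, 2.0, 3.0, 10.0, 10.0, 10.0, 0.0, 4.0, 8.0],
})
augmented = add_group_stats(demo, "config_group", "score")

# Keep only groups whose variance is within two standard deviations of the mean variance.
stable = augmented[augmented["score.var_zscore"].abs() < 2]
```

Broadcasting the group statistics back onto each row keeps the frame's shape unchanged, so the filter on the z-score column can be applied directly to the trial-level data.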

Parameters:

exp_data: ExperimentData: The ExperimentData (e.g., obtained from the storage layer) to operate on.
results_df: Optional[pandas.DataFrame]: The results dataframe to augment, by default None to use the exp_data.results_df property.
requested_result_cols: Optional[Iterable[str]]: Which results columns to augment, by default None to use all results columns that look numeric.

Returns:

pandas.DataFrame: The augmented results dataframe.

mlos_viz.base.ignore_plotter_warnings() → None

Suppress some annoying warnings from third-party data visualization packages by adding them to the warnings filter.
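As a rough illustration of what adding entries to the warnings filter means, the sketch below uses only the standard-library warnings module. The function name ignore_noisy_warnings and the message pattern are hypothetical; the warnings that the real function filters are not listed here.

```python
import warnings


def ignore_noisy_warnings() -> None:
    """Hypothetical example: ignore one specific FutureWarning by message pattern."""
    warnings.filterwarnings(
        "ignore",
        category=FutureWarning,
        message=r".*hypothetical_option.*",  # made-up pattern, for illustration only
    )


# Demonstrate the effect inside a scoped filter state so global filters are untouched.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ignore_noisy_warnings()
    warnings.warn("hypothetical_option is deprecated", FutureWarning)  # suppressed
    warnings.warn("something else", UserWarning)                       # still recorded
```

Matching on both category and message keeps the filter narrow, so unrelated warnings from the same packages remain visible.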

mlos_viz.base.limit_top_n_configs(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None, top_n_configs: int = 10, method: Literal['mean', 'p50', 'p75', 'p90', 'p95', 'p99'] = 'mean') → Tuple[DataFrame, List[int], Dict[str, bool]]

Utility function to process the results and determine the best performing configs including potential repeats to help assess variability.

Parameters:

exp_data: Optional[ExperimentData]: The ExperimentData (e.g., obtained from the storage layer) to operate on.
results_df: Optional[pandas.DataFrame]: The results dataframe to process, by default None to use the exp_data.results_df property.
objectives: Optional[Dict[str, Literal[“min”, “max”]]]: Which result column(s) to use for sorting the configs, and in which direction (“min” or “max”). By default None to automatically select the experiment objectives.
top_n_configs: int, optional: How many configs to return, including the default, by default 10.
method: Literal[“mean”, “p50”, “p75”, “p90”, “p95”, “p99”]: Which statistical method to use when sorting the config groups before determining the cutoff, by default “mean”.

Returns:

(top_n_config_results_df, top_n_config_ids, orderby_cols)
Tuple[pandas.DataFrame, List[int], Dict[str, bool]]: The filtered results dataframe, the config ids, and the columns used to order the configs.
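The general idea of ranking config groups by a summary statistic and keeping the default for comparison can be sketched in plain pandas. This is an assumption-laden illustration rather than the function's actual logic: the helper top_n_groups, its parameters, the column names, and the choice to let the default displace the last-ranked pick are all hypothetical.

```python
import pandas as pd


def top_n_groups(df, group_col, result_col, direction="min", top_n=10, default_group=None):
    """Hypothetical helper: keep rows from the top_n groups ranked by mean result."""
    per_group_mean = df.groupby(group_col)[result_col].mean()
    # Ascending order puts the best group first when minimizing.
    ranked = per_group_mean.sort_values(ascending=(direction == "min"))
    keep = ranked.index[:top_n].tolist()
    if default_group is not None and default_group not in keep:
        # Make room for the default so the total still equals top_n.
        keep = keep[: top_n - 1] + [default_group]
    # All repeated trials of the kept groups are retained to show variability.
    filtered = df[df[group_col].isin(keep)]
    return filtered, keep


# Made-up data: group 1 plays the role of the default config.
trials = pd.DataFrame({
    "config_group": [1, 1, 2, 2, 3, 3, 4, 4],
    "score": [9.0, 11.0, 4.0, 6.0, 6.0, 8.0, 0.5, 1.5],
})
best_df, best_ids = top_n_groups(
    trials, "config_group", "score", direction="min", top_n=2, default_group=1
)
```

Returning every trial row for the selected groups, rather than one aggregated row per group, is what allows repeated runs of the same config to be compared afterwards.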

mlos_viz.base.plot_optimizer_trends(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None) → None

Plots the optimizer trends for the Experiment.

Parameters:

exp_data: ExperimentData: The ExperimentData (e.g., obtained from the storage layer) to plot.
results_df: Optional[pandas.DataFrame]: Optional results_df to plot. If not provided, defaults to the exp_data.results_df property.
objectives: Optional[Dict[str, Literal[“min”, “max”]]]: Optional objectives to plot. If not provided, defaults to the exp_data.objectives property.

mlos_viz.base.plot_top_n_configs(exp_data: ExperimentData | None = None, *, results_df: DataFrame | None = None, objectives: Dict[str, Literal['min', 'max']] | None = None, with_scatter_plot: bool = False, **kwargs: Any) → None

Plots the top-N configs along with the default config for the given ExperimentData.

Intended to be used from a Jupyter notebook.

Parameters:

exp_data: ExperimentData: The experiment data to plot.
results_df: Optional[pandas.DataFrame]: Optional results_df to plot. If not provided, defaults to the exp_data.results_df property.
objectives: Optional[Dict[str, Literal[“min”, “max”]]]: Optional objectives to plot. If not provided, defaults to the exp_data.objectives property.
with_scatter_plot: bool: Whether to also add a scatter plot to the output figure.
kwargs: dict: Remaining keyword arguments are passed along to the limit_top_n_configs function.