
Config Selector

smac.main.config_selector #

ConfigSelector #

ConfigSelector(
    scenario: Scenario,
    *,
    retrain_after: int = 8,
    retries: int = 16,
    min_trials: int = 1
)

The config selector handles the surrogate model and the acquisition function. Based on these two components, the next configuration is selected.

Parameters #

retrain_after : int, defaults to 8
    How many configurations should be returned before the surrogate model is retrained.
retries : int, defaults to 16
    How often to retry receiving a new configuration before giving up.
min_trials : int, defaults to 1
    How many samples are required to train the surrogate model. If budgets are involved, the highest budget is checked first. For example, if min_trials is three but we find only two trials in the runhistory for the highest budget, we use the trials of a lower budget instead.
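
As a rough sketch of how these parameters fit together, the following constructs a ConfigSelector by hand. Normally a SMAC facade creates the selector internally; the one-dimensional search space and the n_trials value below are purely illustrative assumptions.

from ConfigSpace import ConfigurationSpace

from smac import Scenario
from smac.main.config_selector import ConfigSelector

# Toy search space with a single float hyperparameter (illustrative only)
configspace = ConfigurationSpace({"x": (0.0, 1.0)})
scenario = Scenario(configspace, n_trials=50)

selector = ConfigSelector(
    scenario,
    retrain_after=8,  # retrain the surrogate after every 8 yielded configs
    retries=16,       # give up after 16 duplicate suggestions in a row
    min_trials=1,     # samples required before the surrogate is trained
)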

Source code in smac/main/config_selector.py
def __init__(
    self,
    scenario: Scenario,
    *,
    retrain_after: int = 8,
    retries: int = 16,
    min_trials: int = 1,
) -> None:
    # Those are the configs sampled from the passed initial design
    self._initial_design_configs: list[Configuration] = []

    # Set classes globally
    self._scenario = scenario
    self._runhistory: RunHistory | None = None
    self._runhistory_encoder: AbstractRunHistoryEncoder | None = None
    self._model: AbstractModel | None = None
    self._acquisition_maximizer: AbstractAcquisitionMaximizer | None = None
    self._acquisition_function: AbstractAcquisitionFunction | None = None
    self._random_design: AbstractRandomDesign | None = None
    self._callbacks: list[Callback] = []

    # And other variables
    self._retrain_after = retrain_after
    self._previous_entries = -1
    self._predict_x_best = True
    self._min_trials = min_trials
    self._considered_budgets: list[float | int | None] = [None]

    # How often to retry receiving a new configuration
    # (counter increases if the received config was already returned before)
    self._retries = retries

    # Processed configurations should be stored here; this is important to not return the same configuration twice
    self._processed_configs: list[Configuration] = []

meta property #

meta: dict[str, Any]

Returns the meta-data of the created object.

__iter__ #

__iter__() -> Iterator[Configuration]

This method returns the next configuration to evaluate. If the runhistory is not empty, it ignores already processed configurations, i.e., the configurations from the runhistory. After yielding the initial design configurations, the method trains the surrogate model, maximizes the acquisition function, and yields n configurations. After those n configurations, the surrogate model is trained again, and so on. The program stops if retries was reached within an iteration. A configuration is ignored if it was already used before.

Note #

When SMAC continues a run, processed configurations from the runhistory are ignored. For example, if the initial design configurations have already been processed, they are ignored here. After the run is continued, however, the surrogate model is trained based on the runhistory in all cases.

Returns #

next_config : Iterator[Configuration]
    The next configuration to evaluate.
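
Since the selector is an iterator, a caller drains it like any generator. The following is a hypothetical usage sketch, assuming a facade has already attached the runhistory, surrogate model, acquisition function, maximizer, and random design to the selector (the asserts at the top of __iter__ fail otherwise).

# Hypothetical usage: `selector` is a fully wired ConfigSelector instance.
config_iterator = iter(selector)
for i in range(5):
    config = next(config_iterator)
    print(f"Trial {i}: {dict(config)}")
    # ... evaluate the target function with `config` and record the
    #     result in the runhistory so the surrogate can be retrained ...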

Source code in smac/main/config_selector.py
def __iter__(self) -> Iterator[Configuration]:
    """This method returns the next configuration to evaluate. It ignores already processed configurations, i.e.,
    the configurations from the runhistory, if the runhistory is not empty.
    The method (after yielding the initial design configurations) trains the surrogate model, maximizes the
    acquisition function and yields ``n`` configurations. After the ``n`` configurations, the surrogate model is
    trained again, etc. The program stops if ``retries`` was reached within an iteration. A configuration
    is ignored if it was already used before.

    Note
    ----
    When SMAC continues a run, processed configurations from the runhistory are ignored. For example, if the
    initial design configurations have already been processed, they are ignored here. After the run is
    continued, however, the surrogate model is trained based on the runhistory in all cases.

    Returns
    -------
    next_config : Iterator[Configuration]
        The next configuration to evaluate.
    """
    assert self._runhistory is not None
    assert self._runhistory_encoder is not None
    assert self._model is not None
    assert self._acquisition_maximizer is not None
    assert self._acquisition_function is not None
    assert self._random_design is not None

    self._processed_configs = self._runhistory.get_configs()

    # We add more retries because there could be a case in which the processed configs are sampled again
    self._retries += len(self._processed_configs)

    logger.debug("Search for the next configuration...")
    self._call_callbacks_on_start()

    # First: We return the initial configurations
    for config in self._initial_design_configs:
        if config not in self._processed_configs:
            self._processed_configs.append(config)
            self._call_callbacks_on_end(config)
            yield config
            self._call_callbacks_on_start()

    # We want to generate configurations endlessly
    while True:
        # Cost value of incumbent configuration (required for acquisition function).
        # If not given, it will be inferred from runhistory or predicted.
        # If not given and runhistory is empty, it will raise a ValueError.
        incumbent_value: float | None = None

        # Every time we re-train the surrogate model, we also update our multi-objective algorithm
        if (mo := self._runhistory_encoder.multi_objective_algorithm) is not None:
            mo.update_on_iteration_start()

        X, Y, X_configurations = self._collect_data()
        previous_configs = self._runhistory.get_configs()

        if X.shape[0] == 0:
            # Only return a single point to avoid an overly high number of random search iterations.
            # We got rid of random search here and replaced it with a simple configuration sampling from
            # the configspace.
            logger.debug("No data available to train the model. Sample a random configuration.")

            config = self._scenario.configspace.sample_configuration()
            self._call_callbacks_on_end(config)
            yield config
            self._call_callbacks_on_start()

            # Important to continue here because we still don't have data available
            continue

        # Check if X/Y differs from the last run, otherwise use cached results
        if self._previous_entries != Y.shape[0]:
            self._model.train(X, Y)

            x_best_array: np.ndarray | None = None
            if incumbent_value is not None:
                best_observation = incumbent_value
            else:
                if self._runhistory.empty():
                    raise ValueError("Runhistory is empty and the cost value of the incumbent is unknown.")

                x_best_array, best_observation = self._get_x_best(X_configurations)

            self._acquisition_function.update(
                model=self._model,
                eta=best_observation,
                incumbent_array=x_best_array,
                num_data=len(self._get_evaluated_configs()),
                X=X_configurations,
            )

        # We want to cache how many entries we used because if we have the same number of entries
        # we don't need to train the next time
        self._previous_entries = Y.shape[0]

        # Now we maximize the acquisition function
        challengers = self._acquisition_maximizer.maximize(
            previous_configs,
            random_design=self._random_design,
        )

        counter = 0
        failed_counter = 0
        for config in challengers:
            if config not in self._processed_configs:
                counter += 1
                self._processed_configs.append(config)
                self._call_callbacks_on_end(config)
                yield config
                retrain = counter == self._retrain_after
                self._call_callbacks_on_start()

                # We break to enforce a new iteration of the while loop (i.e. we retrain the surrogate model)
                if retrain:
                    logger.debug(
                        f"Yielded {counter} configurations. Start new iteration and retrain surrogate model."
                    )
                    break
            else:
                failed_counter += 1

                # We exit the loop if we have tried to add the same configuration too often
                if failed_counter == self._retries:
                logger.warning(f"Could not return a new configuration after {self._retries} retries.")
                    return