Random Facade

smac.facade.random_facade #

RandomFacade #

RandomFacade(
    scenario: Scenario,
    target_function: Callable | str | AbstractRunner,
    *,
    model: AbstractModel | None = None,
    acquisition_function: AbstractAcquisitionFunction
    | None = None,
    acquisition_maximizer: AbstractAcquisitionMaximizer
    | None = None,
    initial_design: AbstractInitialDesign | None = None,
    random_design: AbstractRandomDesign | None = None,
    intensifier: AbstractIntensifier | None = None,
    multi_objective_algorithm: AbstractMultiObjectiveAlgorithm
    | None = None,
    runhistory_encoder: AbstractRunHistoryEncoder
    | None = None,
    config_selector: ConfigSelector | None = None,
    logging_level: int
    | Path
    | Literal[False]
    | None = None,
    callbacks: list[Callback] = None,
    overwrite: bool = False,
    dask_client: Client | None = None
)

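Bases: AbstractFacade

Facade to use Random Online Aggressive Racing (ROAR).

Aggressive Racing: When we have a new configuration θ, we want to compare it to the current best configuration, the incumbent θ*. ROAR uses the "racing" approach: unpromising configurations are run only a few times, while promising ones are run many times. Once we are confident enough that θ is better than θ*, we update the incumbent θ* ⟵ θ. "Aggressive" means rejecting low-performing configurations very early, often after a single run. Together, this is called aggressive racing.

ROAR Loop: The main ROAR loop looks as follows:

  1. Select a configuration θ uniformly at random.
  2. Compare θ to the incumbent θ* online (one θ at a time).
  3. Reject or accept θ with aggressive racing.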
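
The three steps above can be sketched in plain Python. This is a toy illustration and not SMAC's implementation: `toy_cost`, `roar`, the one-dimensional search space, and the rejection rule (stop as soon as the challenger's running mean is worse than the incumbent's on the same seeds) are all assumptions made for the example.

```python
import random
from statistics import mean


def toy_cost(x: float, seed: int) -> float:
    """Hypothetical target: a quadratic with its minimum at 0.3, plus a per-seed offset."""
    offset = random.Random(seed).gauss(0.0, 0.01)
    return (x - 0.3) ** 2 + offset


def roar(n_candidates: int = 200, max_runs: int = 5, seed: int = 0) -> float:
    rng = random.Random(seed)
    incumbent = rng.uniform(0.0, 1.0)
    incumbent_costs = [toy_cost(incumbent, s) for s in range(max_runs)]

    for _ in range(n_candidates):
        # Step 1: select a configuration uniformly at random.
        challenger = rng.uniform(0.0, 1.0)
        challenger_costs = []
        # Step 2: compare against the incumbent, one run (seed) at a time.
        for s in range(max_runs):
            challenger_costs.append(toy_cost(challenger, s))
            # Step 3: reject aggressively, possibly after a single run.
            if mean(challenger_costs) > mean(incumbent_costs[: s + 1]):
                break
        else:
            # The challenger survived every run, so it becomes the incumbent.
            incumbent, incumbent_costs = challenger, challenger_costs
    return incumbent


best = roar()
```

Most challengers are discarded after one evaluation, so the budget is spent mainly on configurations that keep up with the incumbent.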

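Setup: Uses a random model and random search for the optimization of the acquisition function.

Note #

The surrogate model and the acquisition function are not used during the optimization and are therefore replaced by dummies.

Source code in smac/facade/abstract_facade.py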
def __init__(
    self,
    scenario: Scenario,
    target_function: Callable | str | AbstractRunner,
    *,
    model: AbstractModel | None = None,
    acquisition_function: AbstractAcquisitionFunction | None = None,
    acquisition_maximizer: AbstractAcquisitionMaximizer | None = None,
    initial_design: AbstractInitialDesign | None = None,
    random_design: AbstractRandomDesign | None = None,
    intensifier: AbstractIntensifier | None = None,
    multi_objective_algorithm: AbstractMultiObjectiveAlgorithm | None = None,
    runhistory_encoder: AbstractRunHistoryEncoder | None = None,
    config_selector: ConfigSelector | None = None,
    logging_level: int | Path | Literal[False] | None = None,
    callbacks: list[Callback] = None,
    overwrite: bool = False,
    dask_client: Client | None = None,
):
    setup_logging(logging_level)

    if callbacks is None:
        callbacks = []

    if model is None:
        model = self.get_model(scenario)

    if acquisition_function is None:
        acquisition_function = self.get_acquisition_function(scenario)

    if acquisition_maximizer is None:
        acquisition_maximizer = self.get_acquisition_maximizer(scenario)

    if initial_design is None:
        initial_design = self.get_initial_design(scenario)

    if random_design is None:
        random_design = self.get_random_design(scenario)

    if intensifier is None:
        intensifier = self.get_intensifier(scenario)

    if multi_objective_algorithm is None and scenario.count_objectives() > 1:
        multi_objective_algorithm = self.get_multi_objective_algorithm(scenario=scenario)

    if runhistory_encoder is None:
        runhistory_encoder = self.get_runhistory_encoder(scenario)

    if config_selector is None:
        config_selector = self.get_config_selector(scenario)

    # Initialize empty stats and runhistory object
    runhistory = RunHistory(multi_objective_algorithm=multi_objective_algorithm)

    # Set the seed for configuration space
    scenario.configspace.seed(scenario.seed)

    # Set variables globally
    self._scenario = scenario
    self._model = model
    self._acquisition_function = acquisition_function
    self._acquisition_maximizer = acquisition_maximizer
    self._initial_design = initial_design
    self._random_design = random_design
    self._intensifier = intensifier
    self._multi_objective_algorithm = multi_objective_algorithm
    self._runhistory = runhistory
    self._runhistory_encoder = runhistory_encoder
    self._config_selector = config_selector
    self._callbacks = callbacks
    self._overwrite = overwrite

    # Prepare the algorithm executor
    runner: AbstractRunner
    if isinstance(target_function, AbstractRunner):
        runner = target_function
    elif isinstance(target_function, str):
        runner = TargetFunctionScriptRunner(
            scenario=scenario,
            target_function=target_function,
            required_arguments=self._get_signature_arguments(),
        )
    else:
        runner = TargetFunctionRunner(
            scenario=scenario,
            target_function=target_function,
            required_arguments=self._get_signature_arguments(),
        )

    # In case of multiple jobs, we need to wrap the runner again using DaskParallelRunner
    if (n_workers := scenario.n_workers) > 1 or dask_client is not None:
        if dask_client is not None and n_workers > 1:
            logger.warning(
                "Provided `dask_client`. Ignore `scenario.n_workers`, directly set `n_workers` in `dask_client`."
            )
        else:
            available_workers = joblib.cpu_count()
            if n_workers > available_workers:
                logger.info(f"Workers are reduced to {n_workers}.")
                n_workers = available_workers

        # We use a dask runner for parallelization
        runner = DaskParallelRunner(single_worker=runner, dask_client=dask_client)

    # Set the runner to access it globally
    self._runner = runner

    # Adding dependencies of the components
    self._update_dependencies()

    # We have to update our meta data (basically arguments of the components)
    self._scenario._set_meta(self.meta)

    # We have to validate if the object compositions are correct and actually make sense
    self._validate()

    # Finally we configure our optimizer
    self._optimizer = self._get_optimizer()
    assert self._optimizer

    # Register callbacks here
    for callback in callbacks:
        self._optimizer.register_callback(callback)

    # Additionally, we register the runhistory callback from the intensifier to efficiently update our incumbent
    # every time new information is available
    self._optimizer.register_callback(self._intensifier.get_callback(), index=0)
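
Two patterns in the constructor above may be easier to see in isolation: every component defaults to `None` and is then filled in by the matching `get_*` static method, and the runner is chosen by the type of `target_function`. The sketch below is a hypothetical stand-in (`ToyFacade` is not part of SMAC, and plain strings replace the real component objects).

```python
from __future__ import annotations

from typing import Callable


class ToyFacade:
    """Hypothetical stand-in for a facade; strings replace real component objects."""

    def __init__(
        self,
        target_function: Callable | str,
        *,
        intensifier: str | None = None,
        callbacks: list[str] | None = None,
    ) -> None:
        # None means "not provided": build a fresh default instead of sharing a mutable one.
        if callbacks is None:
            callbacks = []
        if intensifier is None:
            intensifier = self.get_intensifier()
        self.intensifier = intensifier
        self.callbacks = callbacks

        # Pick a runner based on the type of the target function.
        if isinstance(target_function, str):
            self.runner = "script runner"
        else:
            self.runner = "function runner"

    @staticmethod
    def get_intensifier(*, max_config_calls: int = 3) -> str:
        return f"Intensifier(max_config_calls={max_config_calls})"


# Use the default intensifier, or build one with different arguments and pass it in.
default = ToyFacade(lambda config, seed: 0.0)
custom = ToyFacade("./run.sh", intensifier=ToyFacade.get_intensifier(max_config_calls=10))
```

Because the defaults come from static methods, a single component can be customised by calling the same `get_*` method with other arguments and handing the result to the constructor.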

intensifier property #

intensifier: AbstractIntensifier

The intensifier configured for this facade. It is asked for the next trial in ask and updated with trial results in tell.

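meta property #

meta: dict[str, Any]

Generates a hash based on all components of the facade. This is used for the run name or to determine whether a run should be continued.

optimizer property #

optimizer: SMBO

The optimizer which is responsible for the BO loop. Keeps track of useful information like status.

runhistory property #

runhistory: RunHistory

The runhistory, which is filled with all trials during the optimization process.

scenario property #

scenario: Scenario

The scenario object, which holds all environment information.

ask #

ask() -> TrialInfo

Asks the intensifier for the next trial.

Source code in smac/facade/abstract_facade.py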
def ask(self) -> TrialInfo:
    """Asks the intensifier for the next trial."""
    return self._optimizer.ask()

get_acquisition_function staticmethod #

get_acquisition_function(
    scenario: Scenario,
) -> AbstractAcquisitionFunction

The random facade does not use an acquisition function. Therefore, we simply return a dummy function.

Source code in smac/facade/random_facade.py
@staticmethod
def get_acquisition_function(scenario: Scenario) -> AbstractAcquisitionFunction:
    """The random facade is not using an acquisition function. Therefore, we simply return a dummy function."""

    class DummyAcquisitionFunction(AbstractAcquisitionFunction):
        def _compute(self, X: np.ndarray) -> np.ndarray:
            return X

    return DummyAcquisitionFunction()

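get_acquisition_maximizer staticmethod #

get_acquisition_maximizer(
    scenario: Scenario,
) -> RandomSearch

We return RandomSearch as the maximizer. It samples configurations randomly from the configuration space and therefore uses neither the acquisition function nor the model.

Source code in smac/facade/random_facade.py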
@staticmethod
def get_acquisition_maximizer(scenario: Scenario) -> RandomSearch:
    """We return ``RandomSearch`` as maximizer which samples configurations randomly from the configuration
    space and therefore neither uses the acquisition function nor the model.
    """
    return RandomSearch(
        scenario.configspace,
        seed=scenario.seed,
    )
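
As a rough illustration of what sampling "randomly from the configuration space" with a fixed seed implies, here is a minimal sketch. The function `random_search`, the dictionary-based search space, and the hyperparameter names are hypothetical and unrelated to SMAC's `RandomSearch` class.

```python
import random


def random_search(space: dict, n: int, seed: int) -> list:
    """Draw n configurations uniformly at random; no model or acquisition value is consulted."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        for _ in range(n)
    ]


# Hypothetical hyperparameters with (lower, upper) bounds.
space = {"learning_rate": (1e-4, 1e-1), "momentum": (0.0, 0.99)}

# The same seed yields the same sequence of candidates.
first = random_search(space, n=3, seed=42)
second = random_search(space, n=3, seed=42)
```

Seeding the sampler is what makes a purely random run reproducible.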

get_config_selector staticmethod #

get_config_selector(
    scenario: Scenario,
    *,
    retrain_after: int = 8,
    retries: int = 16
) -> ConfigSelector

Returns the default configuration selector.

Source code in smac/facade/abstract_facade.py
@staticmethod
def get_config_selector(
    scenario: Scenario,
    *,
    retrain_after: int = 8,
    retries: int = 16,
) -> ConfigSelector:
    """Returns the default configuration selector."""
    return ConfigSelector(scenario, retrain_after=retrain_after, retries=retries)

get_initial_design staticmethod #

get_initial_design(
    scenario: Scenario,
    *,
    additional_configs: list[Configuration] = None
) -> DefaultInitialDesign

Returns an initial design, which returns the default configuration.

Parameters #

additional_configs: list[Configuration], defaults to []
    Adds additional configurations to the initial design.

Source code in smac/facade/random_facade.py
@staticmethod
def get_initial_design(
    scenario: Scenario,
    *,
    additional_configs: list[Configuration] = None,
) -> DefaultInitialDesign:
    """Returns an initial design, which returns the default configuration.

    Parameters
    ----------
    additional_configs: list[Configuration], defaults to []
        Adds additional configurations to the initial design.
    """
    if additional_configs is None:
        additional_configs = []
    return DefaultInitialDesign(
        scenario=scenario,
        additional_configs=additional_configs,
    )

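get_intensifier staticmethod #

get_intensifier(
    scenario: Scenario,
    *,
    max_config_calls: int = 3,
    max_incumbents: int = 10
) -> Intensifier

Returns Intensifier as the intensifier.

Note #

Please use the HyperbandFacade if you want to incorporate budgets.

Warning #

If you are in an algorithm configuration setting, consider increasing max_config_calls.

Parameters #

max_config_calls : int, defaults to 3
    Maximum number of configuration evaluations, that is, the maximum number of instance-seed keys on which a single configuration is evaluated.
max_incumbents : int, defaults to 10
    How many incumbents to keep track of in the multi-objective case.

Source code in smac/facade/random_facade.py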
@staticmethod
def get_intensifier(
    scenario: Scenario,
    *,
    max_config_calls: int = 3,
    max_incumbents: int = 10,
) -> Intensifier:
    """Returns ``Intensifier`` as intensifier.

    Note
    ----
    Please use the ``HyperbandFacade`` if you want to incorporate budgets.

    Warning
    -------
    If you are in an algorithm configuration setting, consider increasing ``max_config_calls``.

    Parameters
    ----------
    max_config_calls : int, defaults to 3
        Maximum number of configuration evaluations. Basically, how many instance-seed keys should be max evaluated
        for a configuration.
    max_incumbents : int, defaults to 10
        How many incumbents to keep track of in the case of multi-objective.
    """
    return Intensifier(
        scenario=scenario,
        max_config_calls=max_config_calls,
        max_incumbents=max_incumbents,
    )
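
To make the effect of `max_config_calls` concrete, the sketch below caps the number of (instance, seed) keys a single configuration may be evaluated on. It is only an assumption-laden illustration: `instance_seed_keys` is a hypothetical helper, and the fixed ordering from `itertools.product` does not reflect how SMAC selects or orders its keys.

```python
from itertools import product


def instance_seed_keys(instances, seeds, max_config_calls=3):
    """Return at most max_config_calls (instance, seed) pairs for one configuration."""
    return list(product(instances, seeds))[:max_config_calls]


# Two instances and two seeds give four possible keys, but only three are used.
keys = instance_seed_keys(["inst_a", "inst_b"], [0, 1])

# With ten instances and the default cap, a configuration sees only three of them.
many_instances = [f"inst_{i}" for i in range(10)]
covered = {instance for instance, _ in instance_seed_keys(many_instances, [0])}
```

This is why the warning above suggests a larger `max_config_calls` for algorithm configuration, where a configuration should be judged on many instances.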

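get_model staticmethod #

get_model(scenario: Scenario) -> RandomModel

The model is used in the acquisition function. Since we do not use an acquisition function, we return a dummy model (returning random values in this case).

Source code in smac/facade/random_facade.py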
@staticmethod
def get_model(scenario: Scenario) -> RandomModel:
    """The model is used in the acquisition function. Since we do not use an acquisition function, we return a
    dummy model (returning random values in this case).
    """
    return RandomModel(
        configspace=scenario.configspace,
        instance_features=scenario.instance_features,
        seed=scenario.seed,
    )

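get_multi_objective_algorithm staticmethod #

get_multi_objective_algorithm(
    scenario: Scenario,
    *,
    objective_weights: list[float] | None = None
) -> MeanAggregationStrategy

Returns the mean aggregation strategy for the multi-objective algorithm.

Parameters #

scenario : Scenario
objective_weights : list[float] | None, defaults to None
    Weights for averaging the objectives in a weighted manner. Must be of the same length as the number of objectives.

Source code in smac/facade/random_facade.py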
@staticmethod
def get_multi_objective_algorithm(  # type: ignore
    scenario: Scenario,
    *,
    objective_weights: list[float] | None = None,
) -> MeanAggregationStrategy:
    """Returns the mean aggregation strategy for the multi-objective algorithm.

    Parameters
    ----------
    scenario : Scenario
    objective_weights : list[float] | None, defaults to None
        Weights for averaging the objectives in a weighted manner. Must be of the same length as the number of
        objectives.
    """
    return MeanAggregationStrategy(
        scenario=scenario,
        objective_weights=objective_weights,
    )
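
The idea of a weighted mean over objectives can be shown in a few lines. This is a minimal sketch under stated assumptions: `mean_aggregate` is a hypothetical function, it ignores any cost normalization the library may apply beforehand, and it is not the code of `MeanAggregationStrategy`.

```python
from __future__ import annotations


def mean_aggregate(costs: list[float], weights: list[float] | None = None) -> float:
    """Collapse one cost per objective into a single scalar via a weighted mean."""
    if weights is None:
        # No weights given: every objective counts equally.
        weights = [1.0] * len(costs)
    if len(weights) != len(costs):
        raise ValueError("objective_weights must have one entry per objective.")
    return sum(w * c for w, c in zip(weights, costs)) / sum(weights)


# Two objectives, e.g. error and runtime (hypothetical values).
unweighted = mean_aggregate([1.0, 3.0])
weighted = mean_aggregate([1.0, 3.0], weights=[3.0, 1.0])
```

Raising the weight of an objective pulls the aggregated cost toward that objective's value.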

get_random_design staticmethod #

get_random_design(
    scenario: Scenario,
) -> AbstractRandomDesign

Just like the acquisition function, we do not use a random design. Therefore, we return a dummy design.

Source code in smac/facade/random_facade.py
@staticmethod
def get_random_design(scenario: Scenario) -> AbstractRandomDesign:
    """Just like the acquisition function, we do not use a random design. Therefore, we return a dummy design."""

    class DummyRandomDesign(AbstractRandomDesign):
        def check(self, iteration: int) -> bool:
            return True

    return DummyRandomDesign()
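
The dummy design's `check` always returns `True`. Assuming that a `True` result means "draw the next configuration at random" for that iteration, the sketch below contrasts it with a design that only interleaves random picks. `config_sources`, `always_random`, and `every_third` are hypothetical names used for illustration.

```python
def config_sources(check, n_iterations):
    """Label each iteration by where its configuration would come from."""
    return ["random" if check(i) else "acquisition" for i in range(n_iterations)]


def always_random(iteration: int) -> bool:
    # Mirrors the dummy design: every iteration is a random pick.
    return True


def every_third(iteration: int) -> bool:
    # A contrasting, made-up design: only every third pick is random.
    return iteration % 3 == 0


roar_like = config_sources(always_random, 4)
interleaved = config_sources(every_third, 4)
```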

get_runhistory_encoder staticmethod #

get_runhistory_encoder(
    scenario: Scenario,
) -> RunHistoryEncoder

Returns the default runhistory encoder.

Source code in smac/facade/random_facade.py
@staticmethod
def get_runhistory_encoder(scenario: Scenario) -> RunHistoryEncoder:
    """Returns the default runhistory encoder."""
    return RunHistoryEncoder(scenario)

optimize #

optimize(
    *, data_to_scatter: dict[str, Any] | None = None
) -> Configuration | list[Configuration]

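Optimizes the configuration of the algorithm.

Parameters #

data_to_scatter: dict[str, Any] | None
    This argument only takes effect with the dask runner. When a user scatters data from their local process to the distributed network, the data is distributed in a round-robin fashion, grouped by number of cores. Roughly speaking, the data is kept in memory, so it does not have to be (de-)serialized every time the target function is executed with a big dataset. This is useful, for example, when a big dataset is shared across all target function calls.

Returns #

incumbent : Configuration
    Best found configuration.

Source code in smac/facade/abstract_facade.py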
def optimize(self, *, data_to_scatter: dict[str, Any] | None = None) -> Configuration | list[Configuration]:
    """
    Optimizes the configuration of the algorithm.

    Parameters
    ----------
    data_to_scatter: dict[str, Any] | None
        We first note that this argument is only valid for the dask runner!
        When a user scatters data from their local process to the distributed network,
        this data is distributed in a round-robin fashion grouping by number of cores.
        Roughly speaking, we can keep this data in memory and then we do not have to (de-)serialize the data
        every time we would like to execute a target function with a big dataset.
        For example, when your target function has a big dataset shared across all the target function,
        this argument is very useful.

    Returns
    -------
    incumbent : Configuration
        Best found configuration.
    """
    incumbents = None
    if isinstance(data_to_scatter, dict) and len(data_to_scatter) == 0:
        raise ValueError("data_to_scatter must be None or dict with some elements, but got an empty dict.")

    try:
        incumbents = self._optimizer.optimize(data_to_scatter=data_to_scatter)
    finally:
        self._optimizer.save()

    return incumbents
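
Two details of the method above are easy to miss: an empty `data_to_scatter` dict is rejected before anything runs, and the `finally` clause saves the optimizer state even when the optimization raises. The sketch below reproduces that control flow with a hypothetical `ToyOptimizer`; it is a stand-in and not SMAC's `SMBO` class.

```python
class ToyOptimizer:
    """Hypothetical optimizer that can be told to fail."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.saved = False

    def optimize(self, data_to_scatter=None):
        if self.fail:
            raise RuntimeError("target function crashed")
        return "incumbent"

    def save(self) -> None:
        self.saved = True


def optimize(optimizer, data_to_scatter=None):
    # Reject an empty dict up front; None or a non-empty dict is accepted.
    if isinstance(data_to_scatter, dict) and len(data_to_scatter) == 0:
        raise ValueError("data_to_scatter must be None or a non-empty dict.")
    try:
        return optimizer.optimize(data_to_scatter=data_to_scatter)
    finally:
        # Runs on success and on failure, so progress is persisted either way.
        optimizer.save()


crashing = ToyOptimizer(fail=True)
try:
    optimize(crashing)
except RuntimeError:
    pass  # the state was still saved

unstarted = ToyOptimizer()
try:
    optimize(unstarted, data_to_scatter={})
except ValueError:
    pass  # rejected before the try block, so nothing was saved
```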

tell #

tell(
    info: TrialInfo, value: TrialValue, save: bool = True
) -> None

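Adds the result of a trial to the runhistory and updates the intensifier.

Parameters #

info: TrialInfo
    Describes the trial from which to process the results.
value: TrialValue
    Contains relevant information regarding the execution of a trial.
save : bool, defaults to True
    Whether the runhistory should be saved.

Source code in smac/facade/abstract_facade.py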
def tell(self, info: TrialInfo, value: TrialValue, save: bool = True) -> None:
    """Adds the result of a trial to the runhistory and updates the intensifier.

    Parameters
    ----------
    info: TrialInfo
        Describes the trial from which to process the results.
    value: TrialValue
        Contains relevant information regarding the execution of a trial.
    save : bool, defaults to True
        Whether the runhistory should be saved.
    """
    return self._optimizer.tell(info, value, save=save)
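
Together, `ask` and `tell` let the caller drive the loop instead of calling `optimize`. The following sketch shows the shape of such a loop with toy classes; `ToyTrialInfo` and `ToyAskTell` are hypothetical stand-ins, not the SMAC classes, and the random proposal strategy is an assumption chosen to match the random facade.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTrialInfo:
    """Hypothetical trial description: which configuration to run with which seed."""
    config: float
    seed: int


class ToyAskTell:
    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self.history = {}

    def ask(self) -> ToyTrialInfo:
        # Propose the next trial (here: a uniformly random configuration).
        return ToyTrialInfo(config=self._rng.uniform(-5.0, 5.0), seed=0)

    def tell(self, info: ToyTrialInfo, cost: float) -> None:
        # Record the result so later decisions can use it.
        self.history[info] = cost

    @property
    def incumbent(self) -> float:
        return min(self.history, key=self.history.get).config


smbo = ToyAskTell()
for _ in range(50):
    info = smbo.ask()
    cost = info.config**2  # the caller evaluates the trial itself
    smbo.tell(info, cost)
```

The caller owns the evaluation step, which is convenient when trials run on external infrastructure or need custom scheduling.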

validate #

validate(
    config: Configuration, *, seed: int | None = None
) -> float | list[float]

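Validates a configuration on seeds different from the ones used in the optimization process and on the highest budget (if budget type is real-valued).

Parameters #

config : Configuration
    Configuration to validate.
instances : list[str] | None, defaults to None
    Which instances to validate. If None, all instances specified in the scenario are used. In case that the budget type is real-valued, this argument is ignored.
seed : int | None, defaults to None
    If None, the seed from the scenario is used.

Returns #

cost : float | list[float]
    The averaged cost of the configuration. In case of multi-fidelity, the cost of each objective is averaged.

Source code in smac/facade/abstract_facade.py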
def validate(
    self,
    config: Configuration,
    *,
    seed: int | None = None,
) -> float | list[float]:
    """Validates a configuration on seeds different from the ones used in the optimization process and on the
    highest budget (if budget type is real-valued).

    Parameters
    ----------
    config : Configuration
        Configuration to validate
    instances : list[str] | None, defaults to None
        Which instances to validate. If None, all instances specified in the scenario are used.
        In case that the budget type is real-valued, this argument is ignored.
    seed : int | None, defaults to None
        If None, the seed from the scenario is used.

    Returns
    -------
    cost : float | list[float]
        The averaged cost of the configuration. In case of multi-fidelity, the cost of each objective is
        averaged.
    """
    return self._optimizer.validate(config, seed=seed)
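
The point of validating on unseen seeds is to get a cost estimate that is not biased toward the seeds the incumbent was selected on. A minimal sketch of that idea follows, assuming a simple rejection scheme for drawing fresh seeds; `toy_validate`, `toy_target`, and the number of validation seeds are hypothetical and do not describe how SMAC generates its validation trials.

```python
import random
from statistics import mean


def toy_validate(target, config, used_seeds, n_seeds=5, seed=0):
    """Average the target's cost over seeds that were not used during optimization."""
    rng = random.Random(seed)
    fresh = []
    while len(fresh) < n_seeds:
        candidate = rng.randrange(2**31)
        # Skip seeds already seen during optimization or already drawn here.
        if candidate not in used_seeds and candidate not in fresh:
            fresh.append(candidate)
    return mean(target(config, s) for s in fresh), fresh


def toy_target(config: float, seed: int) -> float:
    # Hypothetical noisy target function.
    return config**2 + random.Random(seed).gauss(0.0, 0.1)


cost, validation_seeds = toy_validate(toy_target, 0.5, used_seeds={0, 1, 2})
```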