Purpose in HierarchyCraft

All hcraft environments are sandbox environments and have no precise purpose by default. A purpose can be added to any HierarchyCraft environment by setting up one or more tasks.

Tasks can be one of:

Get the given item: hcraft.GetItemTask
Go to the given zone: hcraft.GoToZoneTask
Place the given item in the given zone (or any zone if none given): hcraft.PlaceItemTask

Single task purpose

When a single task is passed to a HierarchyCraft environment, the environment automatically builds a purpose from it. The environment then terminates once the task is completed.

Let's take the MineHcraft environment as an example. (This works with any other HierarchyCraft environment.)

from hcraft.examples import MineHcraftEnv
from hcraft.purpose import GetItemTask
from hcraft.examples.minecraft.items import DIAMOND

get_diamond = GetItemTask(DIAMOND, reward=10)
env = MineHcraftEnv(purpose=get_diamond)

Reward shaping

Achievement tasks only reward the player when completed, and such long-term feedback is known to make learning challenging. To ease learning, a HierarchyCraft Purpose can generate subtasks that give intermediate feedback; this process is known as reward shaping. See RewardShaping for more details.

For example, let's add the "required" reward shaping to the get_diamond task:

from hcraft.examples import MineHcraftEnv
from hcraft.purpose import Purpose, GetItemTask
from hcraft.examples.minecraft.items import DIAMOND

get_diamond = GetItemTask(DIAMOND, reward=10)
purpose = Purpose(shaping_value=2)
purpose.add_task(get_diamond, reward_shaping="required")

env = MineHcraftEnv(purpose=purpose)

Then getting the IRON_INGOT item for the first time will give a reward of 2.0 to the player, because IRON_INGOT is used to craft the IRON_PICKAXE that is itself used to get a DIAMOND.
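The idea behind the "required" shaping can be sketched without the library: walk the requirements of the goal item recursively and attach one subtask worth shaping_value to each item found. This is only an illustration under stated assumptions. The REQUIRES mapping and the required_items helper are hypothetical, and the toy chain contains only the three items named above, whereas the real MineHcraft requirements graph is larger.

```python
from typing import Dict, List, Set

# Hypothetical, simplified requirements: item -> items directly needed to get it.
# Only the chain mentioned in the text is modelled here.
REQUIRES: Dict[str, List[str]] = {
    "DIAMOND": ["IRON_PICKAXE"],
    "IRON_PICKAXE": ["IRON_INGOT"],
    "IRON_INGOT": [],
}


def required_items(goal: str, requires: Dict[str, List[str]]) -> Set[str]:
    """Return every item recursively required to obtain `goal` (goal excluded)."""
    found: Set[str] = set()
    stack = list(requires.get(goal, []))
    while stack:
        item = stack.pop()
        if item not in found:
            found.add(item)
            stack.extend(requires.get(item, []))
    return found


shaping_value = 2.0
# One shaping subtask per required item, each worth shaping_value.
shaping_rewards = {item: shaping_value for item in required_items("DIAMOND", REQUIRES)}
print(sorted(shaping_rewards.items()))
```

With this toy graph, both IRON_PICKAXE and IRON_INGOT receive a subtask, which matches the behaviour described above for IRON_INGOT.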

Multi-tasks and terminal groups

In a sandbox environment, why limit ourselves to a single task? In HierarchyCraft, a purpose can be composed of multiple tasks. But when does the purpose terminate? When any task is done? When all tasks are done?

To solve this, we need to introduce terminal groups. Terminal groups are represented with strings.

The purpose will terminate if ANY of the terminal groups has ALL of its tasks done.

When adding a task to a purpose, one can choose one or multiple terminal groups like so:

from hcraft.examples import MineHcraftEnv
from hcraft.purpose import Purpose, GetItemTask, GoToZoneTask
from hcraft.examples.minecraft.items import DIAMOND, GOLD_INGOT, EGG
from hcraft.examples.minecraft.zones import END

get_diamond = GetItemTask(DIAMOND, reward=10)
get_gold = GetItemTask(GOLD_INGOT, reward=5)
get_egg = GetItemTask(EGG, reward=100)
go_to_end = GoToZoneTask(END, reward=20)

purpose = Purpose()
purpose.add_task(get_diamond, reward_shaping="required", terminal_groups="get rich!")
purpose.add_task(get_gold, terminal_groups=["golden end", "get rich!"])
purpose.add_task(go_to_end, reward_shaping="inputs", terminal_groups="golden end")
purpose.add_task(get_egg, terminal_groups=None)

env = MineHcraftEnv(purpose=purpose)

Here the environment will terminate if the player gets both the diamond and gold_ingot items ("get rich!" group), or if the player gets a gold_ingot and reaches the end zone ("golden end" group). The get_egg task is optional: it can never terminate the purpose, but it still rewards the player when completed.

Just like this last task, reward shaping subtasks are always optional.
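The termination rule for the multi-task example can be checked by hand with a small, library-free sketch. Tasks are reduced to plain strings and the finished_groups helper is hypothetical; only the group membership is taken from the add_task calls above.

```python
from typing import Dict, List, Set

# Group membership copied from the example; get_egg belongs to no group.
TERMINAL_GROUPS: Dict[str, Set[str]] = {
    "get rich!": {"get_diamond", "get_gold"},
    "golden end": {"get_gold", "go_to_end"},
}


def finished_groups(done: Set[str]) -> List[str]:
    """Names of the terminal groups whose tasks are all in `done`."""
    return sorted(name for name, tasks in TERMINAL_GROUPS.items() if tasks <= done)


def purpose_terminated(done: Set[str]) -> bool:
    """The purpose terminates as soon as at least one group is finished."""
    return bool(finished_groups(done))


print(finished_groups({"get_egg", "get_diamond"}))  # []
print(finished_groups({"get_gold", "go_to_end"}))   # ['golden end']
print(purpose_terminated({"get_egg"}))              # False
```

Completing get_egg alone never ends the episode, since it is not a member of any terminal group.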

View Source

from dataclasses import dataclass, field
from enum import Enum
from typing import TYPE_CHECKING, Dict, List, Optional, Set, Union

import networkx as nx
import numpy as np

from hcraft.requirements import RequirementNode, req_node_name
from hcraft.task import GetItemTask, GoToZoneTask, PlaceItemTask, Task
from hcraft.elements import Item, Zone


if TYPE_CHECKING:
    from hcraft.env import HcraftEnv, HcraftState
    from hcraft.world import World


class RewardShaping(Enum):
    """Enumeration of all reward shapings possible."""

    NONE = "none"
    """No reward shaping"""
    ALL_ACHIVEMENTS = "all"
    """All items and zones will be associated with an achievement subtask."""
    REQUIREMENTS_ACHIVEMENTS = "required"
    """All (recursively) required items and zones for the given task
    will be associated with an achievement subtask."""
    INPUTS_ACHIVEMENT = "inputs"
    """Items and zones consumed by any transformation solving the task
    will be associated with an achievement subtask."""


@dataclass
class TerminalGroup:
    """Terminal groups are groups of tasks that can terminate the purpose.

    The purpose will terminate if ANY of the terminal groups has ALL of its tasks done.
    """

    name: str
    tasks: List[Task] = field(default_factory=list)

    @property
    def terminated(self) -> bool:
        """True if all tasks of the terminal group are terminated."""
        return all(task.terminated for task in self.tasks)

    def __eq__(self, other) -> bool:
        if isinstance(other, str):
            return self.name == other
        if isinstance(other, TerminalGroup):
            return self.name == other.name
        return False

    def __hash__(self) -> int:
        return self.name.__hash__()


class Purpose:
    """A purpose for a HierarchyCraft player based on a list of tasks."""

    def __init__(
        self,
        tasks: Optional[Union[Task, List[Task]]] = None,
        timestep_reward: float = 0.0,
        default_reward_shaping: RewardShaping = RewardShaping.NONE,
        shaping_value: float = 1.0,
    ) -> None:
        """
        Args:
            tasks: Tasks to add to the Purpose.
                Defaults to None.
            timestep_reward: Reward for each timestep.
                Defaults to 0.0.
            default_reward_shaping: Default reward shaping for tasks.
                Defaults to RewardShaping.NONE.
            shaping_value: Reward value used in reward shaping if any.
                Defaults to 1.0.
        """
        self.tasks: List[Task] = []
        self.timestep_reward = timestep_reward
        self.shaping_value = shaping_value
        self.default_reward_shaping = default_reward_shaping
        self.built = False

        self.reward_shaping: Dict[Task, RewardShaping] = {}
        self.terminal_groups: List[TerminalGroup] = []

        if isinstance(tasks, Task):
            tasks = [tasks]
        elif tasks is None:
            tasks = []
        for task in tasks:
            self.add_task(task, reward_shaping=default_reward_shaping)

        self._best_terminal_group = None

    def add_task(
        self,
        task: Task,
        reward_shaping: Optional[RewardShaping] = None,
        terminal_groups: Optional[Union[str, List[str]]] = "default",
    ):
        """Add a new task to the purpose.

        Args:
            task: Task to be added to the purpose.
            reward_shaping: Reward shaping for this task.
                Defaults to purpose's default reward shaping.
            terminal_groups: Purpose terminates when ALL the tasks of ANY terminal group terminate.
                If terminal_groups is "" or None, the task is optional and
                cannot terminate the purpose at all.
                By default, tasks are added to the "default" group and hence
                ALL tasks have to be done to terminate the purpose.
        """
        if reward_shaping is None:
            reward_shaping = self.default_reward_shaping
        reward_shaping = RewardShaping(reward_shaping)
        if terminal_groups:
            if isinstance(terminal_groups, str):
                terminal_groups = [terminal_groups]
            for terminal_group in terminal_groups:
                existing_group = self._terminal_group_from_name(terminal_group)
                if not existing_group:
                    existing_group = TerminalGroup(terminal_group)
                    self.terminal_groups.append(existing_group)
                existing_group.tasks.append(task)

        self.reward_shaping[task] = reward_shaping
        self.tasks.append(task)

    def build(self, env: "HcraftEnv"):
        """
        Builds the purpose of the player relative to the given environment.

        Args:
            env: The HierarchyCraft environment to build upon.
        """
        if self.built:
            return

        if not self.tasks:
            return
        # Add reward shaping subtasks
        for task in self.tasks:
            subtasks = self._add_reward_shaping_subtasks(
                task, env, self.reward_shaping[task]
            )
            for subtask in subtasks:
                self.add_task(subtask, RewardShaping.NONE, terminal_groups=None)

        # Build all tasks
        for task in self.tasks:
            task.build(env.world)

        self.built = True

    def reward(self, state: "HcraftState") -> float:
        """
        Returns the purpose reward for the given state based on tasks.
        """
        reward = self.timestep_reward
        if not self.tasks:
            return reward
        for task in self.tasks:
            reward += task.reward(state)
        return reward

    def is_terminal(self, state: "HcraftState") -> bool:
        """
        Returns True if the given state is terminal for the whole purpose.
        """
        if not self.tasks:
            return False
        for task in self.tasks:
            task.is_terminal(state)
        for terminal_group in self.terminal_groups:
            if terminal_group.terminated:
                return True
        return False

    def reset(self) -> None:
        """Reset the purpose."""
        for task in self.tasks:
            task.reset()

    @property
    def optional_tasks(self) -> List[Task]:
        """List of tasks that are in no terminal group and hence optional."""
        terminal_tasks = []
        for group in self.terminal_groups:
            terminal_tasks += group.tasks
        return [task for task in self.tasks if task not in terminal_tasks]

    @property
    def terminated(self) -> bool:
        """True if any of the terminal groups are terminated."""
        return any(
            all(task.terminated for task in terminal_group.tasks)
            for terminal_group in self.terminal_groups
        )

    @property
    def best_terminal_group(self) -> TerminalGroup:
        """Best rewarding terminal group."""
        if self._best_terminal_group is not None:
            return self._best_terminal_group

        best_terminal_group, best_terminal_value = None, -np.inf
        for terminal_group in self.terminal_groups:
            terminal_value = sum(task._reward for task in terminal_group.tasks)
            if terminal_value > best_terminal_value:
                best_terminal_value = terminal_value
                best_terminal_group = terminal_group

        self._best_terminal_group = best_terminal_group
        return best_terminal_group

    def _terminal_group_from_name(self, name: str) -> Optional[TerminalGroup]:
        if name not in self.terminal_groups:
            return None
        group_id = self.terminal_groups.index(name)
        return self.terminal_groups[group_id]

    def _add_reward_shaping_subtasks(
        self, task: Task, env: "HcraftEnv", reward_shaping: RewardShaping
    ) -> List[Task]:
        if reward_shaping == RewardShaping.NONE:
            return []
        if reward_shaping == RewardShaping.ALL_ACHIVEMENTS:
            return _all_subtasks(env.world, self.shaping_value)
        if reward_shaping == RewardShaping.INPUTS_ACHIVEMENT:
            return _inputs_subtasks(task, env.world, self.shaping_value)
        if reward_shaping == RewardShaping.REQUIREMENTS_ACHIVEMENTS:
            return _required_subtasks(task, env, self.shaping_value)
        raise NotImplementedError

    def __str__(self) -> str:
        terminal_groups_str = []
        for terminal_group in self.terminal_groups:
            tasks_str_joined = self._tasks_str(terminal_group.tasks)
            group_str = f"{terminal_group.name}:[{tasks_str_joined}]"
            terminal_groups_str.append(group_str)
        optional_tasks_str = self._tasks_str(self.optional_tasks)
        if optional_tasks_str:
            group_str = f"optional:[{optional_tasks_str}]"
            terminal_groups_str.append(group_str)
        joined_groups_str = ", ".join(terminal_groups_str)
        return f"Purpose({joined_groups_str})"

    def _tasks_str(self, tasks: List[Task]) -> str:
        tasks_str = []
        for task in tasks:
            shaping = self.reward_shaping[task]
            shaping_str = f"#{shaping.value}" if shaping != RewardShaping.NONE else ""
            tasks_str.append(f"{task}{shaping_str}")
        return ",".join(tasks_str)


def platinium_purpose(
    items: List[Item],
    zones: List[Zone],
    zones_items: List[Item],
    success_reward: float = 10.0,
    timestep_reward: float = -0.1,
):
    purpose = Purpose(timestep_reward=timestep_reward)
    for item in items:
        purpose.add_task(GetItemTask(item, reward=success_reward))
    for zone in zones:
        purpose.add_task(GoToZoneTask(zone, reward=success_reward))
    for item in zones_items:
        purpose.add_task(PlaceItemTask(item, reward=success_reward))
    return purpose


def _all_subtasks(world: "World", shaping_reward: float) -> List[Task]:
    return _build_reward_shaping_subtasks(
        world.items, world.zones, world.zones_items, shaping_reward
    )


def _required_subtasks(
    task: Task, env: "HcraftEnv", shaping_reward: float
) -> List[Task]:
    relevant_items = set()
    relevant_zones = set()
    relevant_zone_items = set()

    if isinstance(task, GetItemTask):
        goal_item = task.item_stack.item
        goal_requirement_nodes = [req_node_name(goal_item, RequirementNode.ITEM)]
    elif isinstance(task, PlaceItemTask):
        goal_item = task.item_stack.item
        goal_requirement_nodes = [req_node_name(goal_item, RequirementNode.ZONE_ITEM)]
        goal_zones = task.zone
        if goal_zones is not None:
            relevant_zones.add(goal_zones)
            goal_requirement_nodes.append(
                req_node_name(goal_zones, RequirementNode.ZONE)
            )
    elif isinstance(task, GoToZoneTask):
        goal_requirement_nodes = [req_node_name(task.zone, RequirementNode.ZONE)]
    else:
        raise NotImplementedError(
            f"Unsupported reward shaping {RewardShaping.REQUIREMENTS_ACHIVEMENTS} "
            f"for given task type: {type(task)} of {task}"
        )

    # Every ancestor of a goal node in the requirements graph is a prerequisite.
    requirements_acydigraph = env.world.requirements.acydigraph
    for requirement_node in goal_requirement_nodes:
        for ancestor in nx.ancestors(requirements_acydigraph, requirement_node):
            if ancestor == "START#":
                continue
            ancestor_node = requirements_acydigraph.nodes[ancestor]
            item_or_zone: Union["Item", "Zone"] = ancestor_node["obj"]
            ancestor_type = RequirementNode(ancestor_node["type"])
            if ancestor_type is RequirementNode.ITEM:
                relevant_items.add(item_or_zone)
            if ancestor_type is RequirementNode.ZONE:
                relevant_zones.add(item_or_zone)
            if ancestor_type is RequirementNode.ZONE_ITEM:
                relevant_zone_items.add(item_or_zone)
    return _build_reward_shaping_subtasks(
        relevant_items,
        relevant_zones,
        relevant_zone_items,
        shaping_reward,
    )


def _inputs_subtasks(task: Task, world: "World", shaping_reward: float) -> List[Task]:
    relevant_items = set()
    relevant_zones = set()
    relevant_zone_items = set()

    goal_zone = None
    goal_item = None
    goal_zone_item = None
    if isinstance(task, GetItemTask):
        goal_item = task.item_stack.item
    elif isinstance(task, GoToZoneTask):
        goal_zone = task.zone
    elif isinstance(task, PlaceItemTask):
        goal_zone_item = task.item_stack.item
        if task.zone:
            goal_zone = task.zone
            relevant_zones.add(task.zone)
    else:
        raise NotImplementedError(
            f"Unsupported reward shaping {RewardShaping.INPUTS_ACHIVEMENT} "
            f"for given task type: {type(task)} of {task}"
        )
    transfo_giving_item = [
        transfo
        for transfo in world.transformations
        if goal_item in transfo.production("player")
        and goal_item not in transfo.min_required("player")
    ]
    transfo_placing_zone_item = [
        transfo
        for transfo in world.transformations
        if goal_zone_item in transfo.produced_zones_items
        and goal_zone_item not in transfo.min_required_zones_items
    ]
    transfo_going_to_goal_zone = [
        transfo
        for transfo in world.transformations
        if transfo.destination is not None and transfo.destination == goal_zone
    ]
    relevant_transformations = (
        transfo_giving_item + transfo_placing_zone_item + transfo_going_to_goal_zone
    )

    # Collect everything consumed by the transformations that solve the task.
    for transfo in relevant_transformations:
        relevant_items |= transfo.consumption("player")
        relevant_zone_items |= transfo.consumption("current_zone")
        relevant_zone_items |= transfo.consumption("destination")
        relevant_zone_items |= transfo.consumption("zones")
        if transfo.zone:
            relevant_zones.add(transfo.zone)

    return _build_reward_shaping_subtasks(
        relevant_items,
        relevant_zones,
        relevant_zone_items,
        shaping_reward,
    )


def _build_reward_shaping_subtasks(
    items: Optional[Union[List[Item], Set[Item]]] = None,
    zones: Optional[Union[List[Zone], Set[Zone]]] = None,
    zone_items: Optional[Union[List[Item], Set[Item]]] = None,
    shaping_reward: float = 1.0,
) -> List[Task]:
    subtasks = []
    if items:
        subtasks += [GetItemTask(item, reward=shaping_reward) for item in items]
    if zones:
        subtasks += [GoToZoneTask(zone, reward=shaping_reward) for zone in zones]
    if zone_items:
        subtasks += [PlaceItemTask(item, reward=shaping_reward) for item in zone_items]
    return subtasks

API Documentation

class RewardShaping(enum.Enum):

Enumeration of all reward shapings possible.

NONE = <RewardShaping.NONE: 'none'>

No reward shaping

ALL_ACHIVEMENTS = <RewardShaping.ALL_ACHIVEMENTS: 'all'>

All items and zones will be associated with an achievement subtask.

REQUIREMENTS_ACHIVEMENTS = <RewardShaping.REQUIREMENTS_ACHIVEMENTS: 'required'>

All (recursively) required items and zones for the given task will be associated with an achievement subtask.

INPUTS_ACHIVEMENT = <RewardShaping.INPUTS_ACHIVEMENT: 'inputs'>

Items and zones consumed by any transformation solving the task will be associated with an achievement subtask.

Inherited Members

enum.Enum: name; value
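
Since add_task passes its reward_shaping argument through RewardShaping(...), the string values listed above can be used in place of the enum members. The sketch below uses a local copy of the enumeration so that it runs without hcraft installed; it illustrates standard enum.Enum lookup by value rather than anything specific to the library.

```python
from enum import Enum


class RewardShaping(Enum):
    """Local copy of the enumeration documented above, for illustration only."""

    NONE = "none"
    ALL_ACHIVEMENTS = "all"
    REQUIREMENTS_ACHIVEMENTS = "required"
    INPUTS_ACHIVEMENT = "inputs"


# Lookup by value: the string and the member are interchangeable.
shaping = RewardShaping("required")
print(shaping is RewardShaping.REQUIREMENTS_ACHIVEMENTS)  # True

# Passing a member back through the constructor returns the same member.
print(RewardShaping(shaping) is shaping)  # True

# An unknown value raises ValueError instead of silently disabling shaping.
try:
    RewardShaping("requirements")
except ValueError as error:
    print(f"rejected: {error}")
```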

@dataclass

class TerminalGroup:

Terminal groups are groups of tasks that can terminate the purpose.

The purpose will terminate if ANY of the terminal groups has ALL of its tasks done.

TerminalGroup(name: str, tasks: List[hcraft.task.Task] = <factory>)

name: str

tasks: List[hcraft.task.Task]

terminated: bool

True if all tasks of the terminal group are terminated.
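
Because __eq__ accepts plain strings, a group can be looked up in a list by its name alone, which is what Purpose._terminal_group_from_name relies on. Here is a minimal sketch with a trimmed local copy of the dataclass; FakeTask is a hypothetical stand-in that only carries the terminated flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FakeTask:
    """Hypothetical stand-in exposing only what TerminalGroup reads."""

    name: str
    terminated: bool = False


@dataclass
class TerminalGroup:
    """Trimmed local copy of the dataclass documented above."""

    name: str
    tasks: List[FakeTask] = field(default_factory=list)

    @property
    def terminated(self) -> bool:
        return all(task.terminated for task in self.tasks)

    def __eq__(self, other) -> bool:
        if isinstance(other, str):
            return self.name == other
        if isinstance(other, TerminalGroup):
            return self.name == other.name
        return False

    def __hash__(self) -> int:
        return hash(self.name)


groups = [TerminalGroup("default"), TerminalGroup("golden end")]

# Membership and index lookups work with the bare group name.
print("golden end" in groups)      # True
print(groups.index("golden end"))  # 1

# A group is terminated only when every one of its tasks is.
gold, end = FakeTask("get_gold"), FakeTask("go_to_end")
groups[1].tasks += [gold, end]
gold.terminated = True
print(groups[1].terminated)  # False
end.terminated = True
print(groups[1].terminated)  # True
```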

def platinium_purpose(items: List[hcraft.Item], zones: List[hcraft.Zone], zones_items: List[hcraft.Item], success_reward: float = 10.0, timestep_reward: float = -0.1):
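
platinium_purpose adds one task per item, zone, and zone item to the "default" terminal group, so the purpose only terminates once all of them are done, and every timestep costs timestep_reward. The following is a rough sketch of the resulting return, assuming each task pays success_reward exactly once on completion. The episode_return helper is hypothetical and not part of hcraft.

```python
def episode_return(
    n_steps: int,
    n_tasks_completed: int,
    success_reward: float = 10.0,
    timestep_reward: float = -0.1,
) -> float:
    """Return of an episode, assuming one-off task rewards and a fixed step cost."""
    return n_steps * timestep_reward + n_tasks_completed * success_reward


# 50 steps with three tasks completed, using the default reward values.
print(episode_return(n_steps=50, n_tasks_completed=3))
```

The negative default for timestep_reward means that, under these assumptions, finishing the same tasks in fewer steps always yields a higher return.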
