HierarchyCraft - Environment builder for hierarchical reasoning research

Licence: GPLv3

HierarchyCraft

HierarchyCraft (hcraft for short) is a Python library designed to create arbitrary hierarchical environments that are compatible with both the OpenAI Gym Reinforcement Learning Framework and AIPlan4EU Unified Planning Framework. This library enables users to easily create complex hierarchical structures that can be used to test and develop various reinforcement learning or planning algorithms.

In environments built with HierarchyCraft the agent (player) has an inventory and can navigate into abstract zones that themselves have inventories.

The action space of HierarchyCraft environments consists of sub-tasks, referred to as Transformations, as opposed to detailed movements and controls. Each Transformation has specific requirements to be valid (e.g. having enough of an item, being in the right place), and these requirements may necessitate the execution of other Transformations first, inherently creating a hierarchical structure in HierarchyCraft environments.

This concept is visually represented by the Requirements graph depicting the hierarchical relationships within each HierarchyCraft environment. The Requirements graph is directly constructed from the list of Transformations composing the environment.

More details about the requirements graph can be found in the documentation at hcraft.requirements, and examples of requirements graphs for some HierarchyCraft environments can be found in hcraft.examples.
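To make the idea concrete, here is a minimal sketch of how requirement edges could be derived from a list of transformations. It does not use the hcraft API: the dictionary layout, the item names and the helper functions `requirement_edges` and `all_requirements` are all hypothetical and only illustrate the principle.

```python
# Toy model (NOT the hcraft API): each transformation consumes and produces items.
# Item and transformation names are made up for illustration.
TRANSFORMATIONS = {
    "collect_wood": {"consumes": set(), "produces": {"wood"}},
    "craft_plank": {"consumes": {"wood"}, "produces": {"plank"}},
    "craft_table": {"consumes": {"plank"}, "produces": {"table"}},
}


def requirement_edges(transformations):
    """Return the set of (required_item, obtained_item) edges."""
    edges = set()
    for spec in transformations.values():
        for required in spec["consumes"]:
            for obtained in spec["produces"]:
                edges.add((required, obtained))
    return edges


def all_requirements(item, edges):
    """Collect every item needed, directly or indirectly, to obtain `item`."""
    needed = set()
    frontier = [item]
    while frontier:
        current = frontier.pop()
        for required, obtained in edges:
            if obtained == current and required not in needed:
                needed.add(required)
                frontier.append(required)
    return needed


edges = requirement_edges(TRANSFORMATIONS)
print(sorted(edges))                             # [('plank', 'table'), ('wood', 'plank')]
print(sorted(all_requirements("table", edges)))  # ['plank', 'wood']
```

In this toy example, obtaining a table requires a plank, which itself requires wood: the hierarchy emerges from the transformations alone.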

No feature extraction for fast research even with low compute

HierarchyCraft returns vectorized state information, which plainly and directly describes the player's inventory, current position, and the inventory of the current zone. Compared to benchmarks that return grids, pixel arrays, text or sound, we directly return a low-dimensional latent representation that doesn't need to be learned. This saves compute time and allows researchers to focus only on the hierarchical reasoning part.

See hcraft.state for more details.
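As a rough illustration, the observation can be thought of as the concatenation of three small vectors. The sketch below uses plain Python lists instead of the numpy arrays used by hcraft, and the helper names `one_hot` and `build_observation` as well as the example values are assumptions made for this example only.

```python
def one_hot(index, size):
    """One-hot encode a zone index over `size` zones."""
    return [1 if slot == index else 0 for slot in range(size)]


def build_observation(player_inventory, zone_index, zones_inventories):
    """Concatenate player inventory, one-hot position and current zone inventory."""
    position = one_hot(zone_index, len(zones_inventories))
    current_zone_inventory = zones_inventories[zone_index]
    return player_inventory + position + current_zone_inventory


player_inventory = [2, 0, 1]          # amounts of 3 hypothetical items
zones_inventories = [[0, 4], [1, 0]]  # 2 zones, 2 hypothetical zone items each
observation = build_observation(player_inventory, 1, zones_inventories)
print(observation)  # [2, 0, 1, 0, 1, 1, 0]
```

Only the inventory of the zone the player is currently in appears in the observation; the inventories of other zones remain part of the full state.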

Create your own tailored HierarchyCraft environments

You can use HierarchyCraft to create various custom hierarchical environments from a list of customized Transformations.

See hcraft.env for a complete tutorial on creating custom environments.

Installation

Using pip

Without optional dependencies:

pip install hcraft

All hcraft environments share a common graphical user interface, available with the gui requirements:

pip install hcraft[gui]

Gym environments can be obtained with the gym requirements:

pip install hcraft[gym]

Planning problems can be obtained through the upf interface with the planning requirements:

pip install hcraft[planning]

Some complex graphs can be rendered as interactive HTML visualisations with the htmlvis requirements:

pip install hcraft[htmlvis]

Quickstart

Play yourself!

Anyone familiar with Minecraft will find MineHcraft easy.

Install the graphical user interface optional dependencies:

pip install hcraft[gui]

Using the command line interface

You can try playing directly with the GUI available for any HierarchyCraft environment, for example:

hcraft minecraft

For more examples:

hcraft --help

Using the programmatic interface:

from hcraft import get_human_action
from hcraft.examples import MineHcraftEnv

env = MineHcraftEnv()
# or env: MineHcraftEnv = gym.make("MineHcraft-NoReward-v1")
n_episodes = 2
for _ in range(n_episodes):
    env.reset()
    done = False
    total_reward = 0
    while not done:
        env.render()
        action = get_human_action(env)
        print(f"Human pressed: {env.world.transformations[action]}")

        _observation, reward, terminated, truncated, _info = env.step(action)
        done = terminated or truncated
        total_reward += reward

    print(f"SCORE: {total_reward}")

As a Gym RL environment

Using the programmatic interface, any HierarchyCraft environment can easily be interfaced with classic reinforcement learning agents.

import numpy as np
from hcraft.examples import MineHcraftEnv

def random_legal_agent(observation, action_is_legal):
    action = np.random.choice(np.nonzero(action_is_legal)[0])
    return int(action)

env = MineHcraftEnv(max_step=10)
done = False
observation, _info = env.reset()
while not done:
    action_is_legal = env.action_masks()
    action = random_legal_agent(observation, action_is_legal)
    _observation, _reward, terminated, truncated, _info = env.step(action)
    done = terminated or truncated

# Other examples of HierarchyCraft environments
from hcraft.examples import TowerHcraftEnv, RecursiveHcraftEnv, RandomHcraftEnv

tower_env = TowerHcraftEnv(height=3, width=2)
# or tower_env = gym.make("TowerHcraft-v1", height=3, width=2)
recursive_env = RecursiveHcraftEnv(n_items=6)
# or recursive_env = gym.make("RecursiveHcraft-v1", n_items=6)
random_env = RandomHcraftEnv(n_items_per_n_inputs={0:2, 1:5, 2:10}, seed=42)
# or random_env = gym.make("RandomHcraft-v1", n_items_per_n_inputs={0:2, 1:5, 2:10}, seed=42)

See hcraft.env for a more complete description.
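The random agent above only ever picks among actions flagged as legal by the action mask. The following sketch shows the same selection logic with the standard library only, so it runs without hcraft or numpy; the function name `random_legal_action` and the example mask are hypothetical.

```python
import random


def random_legal_action(action_is_legal, rng):
    """Pick uniformly among the indexes whose mask entry is truthy."""
    legal_actions = [index for index, legal in enumerate(action_is_legal) if legal]
    if not legal_actions:
        raise ValueError("No legal action available")
    return rng.choice(legal_actions)


rng = random.Random(0)             # seeded for reproducibility
mask = [False, True, False, True]  # hypothetical mask over 4 actions
action = random_legal_action(mask, rng)
print(action in (1, 3))  # True
```

Masking illegal actions this way avoids wasting environment steps on transformations whose requirements are not yet met.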

As a UPF problem for planning

HierarchyCraft environments can be converted to a planning problem in one line thanks to the Unified Planning Framework (UPF):

# Example env
env = TowerHcraftEnv(height=3, width=2)

# Make it into a unified planning problem
planning_problem = env.planning_problem()
print(planning_problem.upf_problem)

Then they can be solved with any compatible planner for UPF:

# Solve the planning problem and show the plan
planning_problem.solve()
print(planning_problem.plan)

The planning_problem can also give the actions to perform in the environment, triggering replanning if necessary:

done = False
_observation, _info = env.reset()
while not done:
    # Automatically replan at the end of each plan until env termination

    # Observations are not used when blindly following a current plan
    # But the state is required in order to replan if there is no plan left
    action = planning_problem.action_from_plan(env.state)
    if action is None:
        # The plan exists but is empty: nothing is left to do, so terminate
        done = True
        continue
    _observation, _reward, terminated, truncated, _info = env.step(action)
    done = terminated or truncated

if terminated:
    print("Success ! The plan worked in the actual environment !")
else:
    print("Failed ... Something went wrong with the plan or the episode was truncated.")

See hcraft.planning for a more complete description.
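The follow-then-replan pattern used in the loop above can be sketched independently of hcraft and UPF. In this toy version, the class `PlanFollower`, the function `count_up_planner` and the integer state are all hypothetical stand-ins; only the control flow (follow the plan, replan when it runs out, stop when the new plan is empty) mirrors the loop above.

```python
class PlanFollower:
    """Follow a plan step by step, replanning whenever the plan runs out."""

    def __init__(self, planner):
        self.planner = planner
        self.plan = []
        self.n_replans = 0

    def action_from_plan(self, state):
        if not self.plan:
            # No plan left: ask the planner for a new one from the current state
            self.plan = list(self.planner(state))
            self.n_replans += 1
        if not self.plan:
            return None  # The new plan is empty: nothing left to do
        return self.plan.pop(0)


def count_up_planner(state, goal=3, horizon=2):
    """Toy planner: plan at most `horizon` increments toward `goal`."""
    return ["increment"] * min(horizon, max(goal - state, 0))


follower = PlanFollower(count_up_planner)
state = 0
while True:
    action = follower.action_from_plan(state)
    if action is None:
        break
    state += 1  # Applying "increment" in the toy environment

print(state, follower.n_replans)  # 3 3
```

Here the planner is called three times: twice to produce non-empty plans and once more to confirm that nothing is left to do.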

More about HierarchyCraft

Online documentation

Learn more in the DOCUMENTATION

Contributing

Want to contribute to HierarchyCraft? See our contribution guidelines and join us!

Custom purposes for agents in HierarchyCraft environments

HierarchyCraft allows users to specify custom purposes (one or multiple tasks) for agents in their environments. This feature provides a high degree of flexibility and allows users to design environments that are tailored to specific applications or scenarios. This makes it possible to study multi-task or lifelong learning settings.

See hcraft.purpose for more details.

Solving behavior for all tasks of most HierarchyCraft environments

HierarchyCraft also includes solving behaviors that can be used to generate actions from observations that will complete most tasks in any HierarchyCraft environment, including user-designed ones. Solving behaviors are handcrafted and may fail in some edge cases where items are required in specific zones. This feature makes it easy for users to obtain a strong baseline in their custom environments.

See hcraft.solving_behaviors for more details.

Visualizing the underlying hierarchy of the environment (requirements graph)

HierarchyCraft gives the ability to visualize the hierarchy of the environment as a requirements graph. This graph provides a potentially complex but complete representation of what is required to obtain each item or to reach each zone, allowing users to easily understand the structure of the environment and identify its key items.

For example, here is the graph of the 'MiniHCraftUnlock' environment where the goal is to open a door using a key: Unlock requirements graph

And here is the much more complex graph of the 'MineHcraft' environment shown previously: Minehcraft requirements graph

See hcraft.requirements for more details.

"""
.. include:: ../../README.md

## Custom purposes for agents in HierarchyCraft environments

HierarchyCraft allows users to specify custom purposes (one or multiple tasks) for agents in their environments.
This feature provides a high degree of flexibility and allows users to design environments that
are tailored to specific applications or scenarios.
This makes it possible to study multi-task or lifelong learning settings.

See [`hcraft.purpose`](https://irll.github.io/HierarchyCraft/hcraft/purpose.html) for more details.

## Solving behavior for all tasks of most HierarchyCraft environments

HierarchyCraft also includes solving behaviors that can be used to generate actions
from observations that will complete most tasks in any HierarchyCraft environment, including user-designed ones.
Solving behaviors are handcrafted and may fail in some edge cases where items are required in specific zones.
This feature makes it easy for users to obtain a strong baseline in their custom environments.

See [`hcraft.solving_behaviors`](https://irll.github.io/HierarchyCraft/hcraft/solving_behaviors.html) for more details.

## Visualizing the underlying hierarchy of the environment (requirements graph)

HierarchyCraft gives the ability to visualize the hierarchy of the environment as a requirements graph.
This graph provides a potentially complex but complete representation of what is required
to obtain each item or to reach each zone, allowing users to easily understand the structure
of the environment and identify its key items.

For example, here is the graph of the 'MiniHCraftUnlock' environment where the goal is to open a door using a key:
![Unlock requirements graph](../../docs/images/requirements_graphs/MiniHCraftUnlock.png)


And here is the much more complex graph of the 'MineHcraft' environment shown previously:
![Minehcraft requirements graph](../../docs/images/requirements_graphs/MineHcraft.png)

See [`hcraft.requirements`](https://irll.github.io/HierarchyCraft/hcraft/requirements.html) for more details.

"""

import hcraft.state as state
import hcraft.solving_behaviors as solving_behaviors
import hcraft.purpose as purpose
import hcraft.transformation as transformation
import hcraft.requirements as requirements
import hcraft.env as env
import hcraft.examples as examples
import hcraft.world as world
import hcraft.planning as planning

from hcraft.elements import Item, Stack, Zone
from hcraft.transformation import Transformation
from hcraft.env import HcraftEnv, HcraftState
from hcraft.purpose import Purpose
from hcraft.render.human import get_human_action, render_env_with_human
from hcraft.task import GetItemTask, GoToZoneTask, PlaceItemTask


__all__ = [
    "HcraftState",
    "Transformation",
    "Item",
    "Stack",
    "Zone",
    "HcraftEnv",
    "get_human_action",
    "render_env_with_human",
    "Purpose",
    "GetItemTask",
    "GoToZoneTask",
    "PlaceItemTask",
    "state",
    "transformation",
    "purpose",
    "solving_behaviors",
    "requirements",
    "world",
    "env",
    "planning",
    "examples",
]

API Documentation

class HcraftState:
class HcraftState:
    """State manager of HierarchyCraft environments.

    The state of every HierarchyCraft environment is composed of three parts:
    * The player's inventory: `state.player_inventory`
    * The one-hot encoded player's position: `state.position`
    * All zones inventories: `state.zones_inventories`

    The mapping of items, zones, and zones items to their respective indexes is done through
    the given World. (See `hcraft.world`)

    ![hcraft state](../../docs/images/hcraft_state.png)

    """

    def __init__(self, world: "World") -> None:
        """
        Args:
            world: World to build the state for.
        """
        self.player_inventory = np.array([], dtype=np.int32)
        self.position = np.array([], dtype=np.int32)
        self.zones_inventories = np.array([], dtype=np.int32)

        self.discovered_items = np.array([], dtype=np.ubyte)
        self.discovered_zones = np.array([], dtype=np.ubyte)
        self.discovered_zones_items = np.array([], dtype=np.ubyte)
        self.discovered_transformations = np.array([], dtype=np.ubyte)

        self.world = world
        self.reset()

    @property
    def current_zone_inventory(self) -> np.ndarray:
        """Inventory of the zone where the player is."""
        if self.position.shape[0] == 0:
            return np.array([])  # No Zone
        return self.zones_inventories[self._current_zone_slot, :][0]

    @property
    def observation(self) -> np.ndarray:
        """The player's observation is a subset of the state.

        Only the inventory of the current zone is shown.

        ![hcraft state](../../docs/images/hcraft_observation.png)

        """
        return np.concatenate(
            (
                self.player_inventory,
                self.position,
                self.current_zone_inventory,
            )
        )

    def amount_of(self, item: "Item", owner: Optional["Zone"] = "player") -> int:
        """Current amount of the given item owned by owner.

        Args:
            item: Item to get the amount of.
            owner: Owner of the inventory to check. Defaults to player.

        Returns:
            int: Amount of the item in the owner's inventory.
        """

        if owner in self.world.zones:
            zone_index = self.world.zones.index(owner)
            zone_item_index = self.world.zones_items.index(item)
            return int(self.zones_inventories[zone_index, zone_item_index])

        item_index = self.world.items.index(item)
        return int(self.player_inventory[item_index])

    def has_discovered(self, zone: "Zone") -> bool:
        """Whether the given zone was discovered.

        Args:
            zone (Zone): Zone to check.

        Returns:
            bool: True if the zone was discovered.
        """
        zone_index = self.world.zones.index(zone)
        return bool(self.discovered_zones[zone_index])

    @property
    def current_zone(self) -> Optional["Zone"]:
        """Current position of the player."""
        if self.world.n_zones == 0:
            return None
        return self.world.zones[self._current_zone_slot[0]]

    @property
    def _current_zone_slot(self) -> int:
        return self.position.nonzero()[0]

    @property
    def player_inventory_dict(self) -> Dict["Item", int]:
        """Current inventory of the player."""
        return self._inv_as_dict(self.player_inventory, self.world.items)

    @property
    def zones_inventories_dict(self) -> Dict["Zone", Dict["Item", int]]:
        """Current inventories of the current zone and each zone containing items."""
        zones_invs = {}
        for zone_slot, zone_inv in enumerate(self.zones_inventories):
            zone = self.world.zones[zone_slot]
            zone_inv = self._inv_as_dict(zone_inv, self.world.zones_items)
            if zone_slot == self._current_zone_slot or zone_inv:
                zones_invs[zone] = zone_inv
        return zones_invs

    def apply(self, action: int) -> bool:
        """Apply the given action to update the state.

        Args:
            action (int): Index of the transformation to apply.

        Returns:
            bool: True if the transformation was applied successfully. False otherwise.
        """
        choosen_transformation = self.world.transformations[action]
        if not choosen_transformation.is_valid(self):
            return False
        choosen_transformation.apply(
            self.player_inventory,
            self.position,
            self.zones_inventories,
        )
        self._update_discoveries(action)
        return True

    def reset(self) -> None:
        """Reset the state to its initial value."""
        self.player_inventory = np.zeros(self.world.n_items, dtype=np.int32)
        for stack in self.world.start_items:
            item_slot = self.world.items.index(stack.item)
            self.player_inventory[item_slot] = stack.quantity

        self.position = np.zeros(self.world.n_zones, dtype=np.int32)
        start_slot = 0  # Start in first Zone by default
        if self.world.start_zone is not None:
            start_slot = self.world.slot_from_zone(self.world.start_zone)
        if self.position.shape[0] > 0:
            self.position[start_slot] = 1

        self.zones_inventories = np.zeros(
            (self.world.n_zones, self.world.n_zones_items), dtype=np.int32
        )
        for zone, zone_stacks in self.world.start_zones_items.items():
            zone_slot = self.world.slot_from_zone(zone)
            for stack in zone_stacks:
                item_slot = self.world.zones_items.index(stack.item)
                self.zones_inventories[zone_slot, item_slot] = stack.quantity

        self.discovered_items = np.zeros(self.world.n_items, dtype=np.ubyte)
        self.discovered_zones_items = np.zeros(self.world.n_zones_items, dtype=np.ubyte)
        self.discovered_zones = np.zeros(self.world.n_zones, dtype=np.ubyte)
        self.discovered_transformations = np.zeros(
            len(self.world.transformations), dtype=np.ubyte
        )
        self._update_discoveries()

    def _update_discoveries(self, action: Optional[int] = None) -> None:
        self.discovered_items = np.bitwise_or(
            self.discovered_items, self.player_inventory > 0
        )
        self.discovered_zones_items = np.bitwise_or(
            self.discovered_zones_items, self.current_zone_inventory > 0
        )
        self.discovered_zones = np.bitwise_or(self.discovered_zones, self.position > 0)
        if action is not None:
            self.discovered_transformations[action] = 1

    @staticmethod
    def _inv_as_dict(inventory_array: np.ndarray, obj_registry: list):
        return {
            obj_registry[index]: value
            for index, value in enumerate(inventory_array)
            if value > 0
        }

    def as_dict(self) -> dict:
        state_dict = {
            "pos": self.current_zone,
            InventoryOwner.PLAYER.value: self.player_inventory_dict,
        }
        state_dict.update(self.zones_inventories_dict)
        return state_dict

State manager of HierarchyCraft environments.

The state of every HierarchyCraft environment is composed of three parts:

  • The player's inventory: state.player_inventory
  • The one-hot encoded player's position: state.position
  • All zones inventories: state.zones_inventories

The mapping of items, zones, and zones items to their respective indexes is done through the given World. (See hcraft.world)

hcraft state

HcraftState(world: hcraft.world.World)
Arguments:
  • world: World to build the state for.
player_inventory
position
zones_inventories
discovered_items
discovered_zones
discovered_zones_items
discovered_transformations
world
current_zone_inventory: numpy.ndarray

Inventory of the zone where the player is.

observation: numpy.ndarray

The player's observation is a subset of the state.

Only the inventory of the current zone is shown.

hcraft state

def amount_of(self, item: Item, owner: Optional[Zone] = 'player') -> int:

Current amount of the given item owned by owner.

Arguments:
  • item: Item to get the amount of.
  • owner: Owner of the inventory to check. Defaults to player.
Returns:

int: Amount of the item in the owner's inventory.

def has_discovered(self, zone: Zone) -> bool:

Whether the given zone was discovered.

Arguments:
  • zone (Zone): Zone to check.
Returns:

bool: True if the zone was discovered.

current_zone: Optional[Zone]

Current position of the player.

player_inventory_dict: Dict[Item, int]

Current inventory of the player.

zones_inventories_dict: Dict[Zone, Dict[Item, int]]

Current inventories of the current zone and each zone containing items.
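The filtering rule described here (always keep the current zone, keep other zones only if they contain something) can be illustrated with a small standalone sketch. It mirrors the logic of the property on plain Python lists; the zone and item names, and the free functions `inv_as_dict` and `zones_inventories_dict`, are assumptions for this example rather than the hcraft API.

```python
def inv_as_dict(inventory, registry):
    """Sparse view of an inventory: only strictly positive amounts are kept."""
    return {
        registry[index]: amount
        for index, amount in enumerate(inventory)
        if amount > 0
    }


def zones_inventories_dict(zones_inventories, zones, zones_items, current_slot):
    """Keep the current zone (even if empty) and every non-empty zone."""
    result = {}
    for slot, inventory in enumerate(zones_inventories):
        as_dict = inv_as_dict(inventory, zones_items)
        if slot == current_slot or as_dict:
            result[zones[slot]] = as_dict
    return result


zones = ["forest", "cave", "desert"]     # hypothetical zones
zones_items = ["tree", "ore"]            # hypothetical zone items
inventories = [[0, 0], [0, 3], [0, 0]]   # one row per zone
summary = zones_inventories_dict(inventories, zones, zones_items, current_slot=0)
print(summary)  # {'forest': {}, 'cave': {'ore': 3}}
```

The empty "desert" zone is dropped, while the empty "forest" zone is kept because the player is standing in it.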

def apply(self, action: int) -> bool:

Apply the given action to update the state.

Arguments:
  • action (int): Index of the transformation to apply.
Returns:

bool: True if the transformation was applied successfully. False otherwise.
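The check-then-apply contract of this method can be sketched on a single inventory. This is a simplified stand-in, not the hcraft implementation: the functions `is_valid` and `apply_transformation` and the dictionaries mapping item slots to amounts are hypothetical, and zones and positions are left out.

```python
def is_valid(inventory, removed):
    """A transformation is valid only if every removed amount is available."""
    return all(inventory[slot] >= amount for slot, amount in removed.items())


def apply_transformation(inventory, removed, added):
    """Update `inventory` in place; return False and change nothing if invalid."""
    if not is_valid(inventory, removed):
        return False
    for slot, amount in removed.items():
        inventory[slot] -= amount
    for slot, amount in added.items():
        inventory[slot] += amount
    return True


inventory = [1, 0]  # hypothetical amounts for item slots 0 and 1
first = apply_transformation(inventory, removed={0: 1}, added={1: 4})
second = apply_transformation(inventory, removed={0: 1}, added={1: 4})
print(first, second, inventory)  # True False [0, 4]
```

The second call fails because slot 0 is now empty, and the inventory is left untouched, which is the behavior the boolean return value signals.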

def reset(self) -> None:

Reset the state to its initial value.

def as_dict(self) -> dict:
class Transformation:
242class Transformation:
243    """The building blocks of every HierarchyCraft environment.
244
245    A list of transformations is what defines each HierarchyCraft environement.
246    Transformation becomes the available actions and all available transitions of the environment.
247
248    Each transformation defines changes of:
249
250    * the player inventory
251    * the player position to a given destination
252    * the current zone inventory
253    * the destination zone inventory (if a destination is specified).
254    * all specific zones inventories
255
256    Each inventory change is a list of removed (-) and added (+) Stack.
257
258    If specified, they may be restricted to only a subset of valid zones,
259    all zones are valid by default.
260
261    A Transformation can only be applied if valid in the given state.
262    A transformation is only valid if the player in a valid zone
263    and all relevant inventories have enough items to be removed *before* adding new items.
264
265    The picture bellow illustrates the impact of
266    an example transformation on a given `hcraft.HcraftState`:
267    <img
268        src="https://raw.githubusercontent.com/IRLL/HierarchyCraft/master/docs/images/hcraft_transformation.png"
269    width="90%"/>
270
271    In this example, when applied, the transformation will:
272
273    * <span style="color:red">(-)</span>
274        Remove 1 item "0", then <span style="color:red">(+)</span>
275        Add 4 item "3" in the <span style="color:red">player inventory</span>.
276    * Update the <span style="color:gray">player position</span>
277        from the <span style="color:green">current zone</span> "1".
278        to the <span style="color:orange">destination zone</span> "3".
279    * <span style="color:green">(-)</span>
280        Remove 2 zone item "0" and 1 zone item "1", then <span style="color:green">(+)</span>
281        Add 1 item "1" in the <span style="color:green">current zone</span> inventory.
282    * <span style="color:orange">(-)</span>
283        Remove 1 zone item "2", then <span style="color:orange">(+)</span>
284        Add 1 item "0" in the <span style="color:orange">destination zone</span> inventory.
285    * <span style="color:blue">(-)</span>
286        Remove 1 zone item "0" in the zone "1" inventory
287        and 2 zone item "2" in the zone "2" inventory,
288        then <span style="color:blue">(+)</span>
289        Add 1 zone item "1" in the zone "0" inventory
290        and 1 zone item "2" in the zone "1" inventory.
291
292    """
293
294    def __init__(
295        self,
296        name: Optional[str] = None,
297        destination: Optional[Zone] = None,
298        inventory_changes: Optional[List[InventoryChange]] = None,
299        zone: Optional[Zone] = None,
300    ) -> None:
301        """The building blocks of every HierarchyCraft environment.
302
303        Args:
304            name: Name given to the Transformation. If None use repr instead.
305                Defaults to None.
306            destination: Destination zone.
307                Defaults to None.
308            inventory_changes: List of inventory changes done by this transformation.
309                Defaults to None.
310            zone: Zone to which Transformation is restricted. Unrestricted if None.
311                Defaults to None.
312        """
313        self.destination = destination
314        self._destination = None
315
316        self.zone = zone
317        self._zone = None
318
319        self._changes_list = inventory_changes
320        self.inventory_changes = _format_inventory_changes(inventory_changes)
321        self._inventory_operations: Optional[
322            Dict[InventoryOwner, InventoryOperations]
323        ] = None
324
325        self.name = name if name is not None else self.__repr__()
326
    def apply(
        self,
        player_inventory: np.ndarray,
        position: np.ndarray,
        zones_inventories: np.ndarray,
    ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
        """Apply the transformation in place on the given state."""

        for owner, operations in self._inventory_operations.items():
            operation_arr = operations[InventoryOperation.APPLY]
            if operation_arr is not None:
                _update_inventory(
                    owner,
                    player_inventory,
                    position,
                    zones_inventories,
                    self._destination,
                    operation_arr,
                )
        if self._destination is not None:
            position[...] = self._destination

    def is_valid(self, state: "HcraftState") -> bool:
        """Is the transformation valid in the given state?"""
        if not self._is_valid_position(state.position):
            return False
        if not self._is_valid_player_inventory(state.player_inventory):
            return False
        if not self._is_valid_zones_inventory(state.zones_inventories, state.position):
            return False
        return True

    def build(self, world: "World") -> None:
        """Build the transformation array operations on the given world."""
        self._build_destination_op(world)
        self._build_inventory_ops(world)
        self._build_zones_op(world)

    def get_changes(
        self, owner: InventoryOwner, operation: InventoryOperation, default: Any = None
    ) -> Optional[Union[List[Stack], Dict[Zone, List[Stack]]]]:
        """Get individual changes for a given owner and a given operation.

        Args:
            owner: Owner of the inventory changes to get.
            operation: Operation on the inventory to get.

        Returns:
            Changes of the inventory of the given owner with the given operation.
        """
        owner = InventoryOwner(owner)
        operation = InventoryOperation(operation)
        operations = self.inventory_changes.get(owner, {})
        return operations.get(operation, default)

    def production(self, owner: InventoryOwner) -> Set["Item"]:
        """Set of produced items for the given owner by this transformation."""
        return self._relevant_items_changed(owner, InventoryOperation.ADD)

    def consumption(self, owner: InventoryOwner) -> Set["Item"]:
        """Set of consumed items for the given owner by this transformation."""
        return self._relevant_items_changed(owner, InventoryOperation.REMOVE)

    def min_required(self, owner: InventoryOwner) -> Set["Item"]:
        """Set of items for which a minimum is required by this transformation
        for the given owner."""
        return self._relevant_items_changed(owner, InventoryOperation.MIN)

    def max_required(self, owner: InventoryOwner) -> Set["Item"]:
        """Set of items for which a maximum is required by this transformation
        for the given owner."""
        return self._relevant_items_changed(owner, InventoryOperation.MAX)

    @property
    def produced_zones_items(self) -> Set["Item"]:
        """Set of produced zones items by this transformation."""
        return (
            self.production(CURRENT_ZONE)
            | self.production(DESTINATION)
            | self.production(InventoryOwner.ZONES)
        )

    @property
    def consumed_zones_items(self) -> Set["Item"]:
        """Set of consumed zones items by this transformation."""
        return (
            self.consumption(CURRENT_ZONE)
            | self.consumption(DESTINATION)
            | self.consumption(InventoryOwner.ZONES)
        )

    @property
    def min_required_zones_items(self) -> Set["Item"]:
        """Set of zone items for which a minimum is required by this transformation."""
        return (
            self.min_required(CURRENT_ZONE)
            | self.min_required(DESTINATION)
            | self.min_required(InventoryOwner.ZONES)
        )

    @property
    def max_required_zones_items(self) -> Set["Item"]:
        """Set of zone items for which a maximum is required by this transformation."""
        return (
            self.max_required(CURRENT_ZONE)
            | self.max_required(DESTINATION)
            | self.max_required(InventoryOwner.ZONES)
        )

    def _relevant_items_changed(
        self, owner: InventoryOwner, operation: InventoryOperation
    ):
        added_stacks = self.get_changes(owner, operation)
        items = set()

        if added_stacks:
            if owner is not InventoryOwner.ZONES:
                return _items_from_stack_list(added_stacks)

            for _zone, stacks in added_stacks.items():
                items |= _items_from_stack_list(stacks)

        return items

    def _is_valid_position(self, position: np.ndarray):
        if self._zone is not None and not np.any(np.multiply(self._zone, position)):
            return False
        if self._destination is not None and np.all(self._destination == position):
            return False
        return True

    def _is_valid_inventory(
        self,
        inventory: np.ndarray,
        added: Optional[np.ndarray],
        removed: Optional[np.ndarray],
        max_items: Optional[np.ndarray],
        min_items: Optional[np.ndarray],
    ):
        added = 0 if added is None else added
        removed = 0 if removed is None else removed
        if max_items is not None and np.any(inventory > max_items):
            return False
        if min_items is not None and np.any(inventory < min_items):
            return False
        return True

    def _is_valid_player_inventory(self, player_inventory: np.ndarray):
        items_changes = self._inventory_operations.get(InventoryOwner.PLAYER, {})
        added = items_changes.get(InventoryOperation.ADD, 0)
        removed = items_changes.get(InventoryOperation.REMOVE)
        max_items = items_changes.get(InventoryOperation.MAX)
        min_items = items_changes.get(InventoryOperation.MIN)
        return self._is_valid_inventory(
            player_inventory, added, removed, max_items, min_items
        )

    def _is_valid_zones_inventory(
        self, zones_inventories: np.ndarray, position: np.ndarray
    ):
        if zones_inventories.size == 0:
            return True

        # Specific zones operations
        zones_changes = self._inventory_operations.get(InventoryOwner.ZONES, {})
        zeros = np.zeros_like(zones_inventories)
        added = zones_changes.get(InventoryOperation.ADD, zeros.copy())
        removed = zones_changes.get(InventoryOperation.REMOVE, zeros.copy())
        infs = np.inf * np.ones_like(zones_inventories)
        max_items = zones_changes.get(InventoryOperation.MAX, infs.copy())
        min_items = zones_changes.get(InventoryOperation.MIN, zeros.copy())

        # Current zone
        current_changes = self._inventory_operations.get(InventoryOwner.CURRENT, {})
        current_slot = position.nonzero()[0]
        added[current_slot] += current_changes.get(InventoryOperation.ADD, 0)
        removed[current_slot] += current_changes.get(InventoryOperation.REMOVE, 0)
        max_items[current_slot] = np.minimum(
            max_items[current_slot],
            current_changes.get(InventoryOperation.MAX, np.inf),
        )
        min_items[current_slot] = np.maximum(
            min_items[current_slot],
            current_changes.get(InventoryOperation.MIN, -np.inf),
        )

        # Destination
        if self._destination is not None:
            dest_changes = self._inventory_operations.get(
                InventoryOwner.DESTINATION, {}
            )
            dest_slot = self._destination.nonzero()[0]
            added[dest_slot] += dest_changes.get(InventoryOperation.ADD, 0)
            removed[dest_slot] += dest_changes.get(InventoryOperation.REMOVE, 0)
            max_items[dest_slot] = np.minimum(
                max_items[dest_slot],
                dest_changes.get(InventoryOperation.MAX, np.inf),
            )
            min_items[dest_slot] = np.maximum(
                min_items[dest_slot],
                dest_changes.get(InventoryOperation.MIN, -np.inf),
            )

        return self._is_valid_inventory(
            zones_inventories, added, removed, max_items, min_items
        )

    def _build_destination_op(self, world: "World") -> None:
        if self.destination is None:
            return
        self._destination = np.zeros(world.n_zones, dtype=np.int32)
        self._destination[world.slot_from_zone(self.destination)] = 1

    def _build_zones_op(self, world: "World") -> None:
        if self.zone is None:
            return
        self._zone = np.zeros(world.n_zones, dtype=np.int32)
        self._zone[world.slot_from_zone(self.zone)] = 1

    def _build_inventory_ops(self, world: "World"):
        self._inventory_operations = {}
        for owner, operations in self.inventory_changes.items():
            self._build_inventory_operation(owner, operations, world)
        self._build_apply_operations()

    def _build_inventory_operation(
        self, owner: InventoryOwner, operations: InventoryChanges, world: "World"
    ):
        owner = InventoryOwner(owner)
        if owner is InventoryOwner.PLAYER:
            world_items_list = world.items
        else:
            world_items_list = world.zones_items

        for operation, stacks in operations.items():
            operation = InventoryOperation(operation)
            default_value = 0
            if operation is InventoryOperation.MAX:
                default_value = np.inf
            if owner is InventoryOwner.ZONES:
                operation_arr = self._build_zones_items_op(
                    stacks, world.zones, world.zones_items, default_value
                )
            else:
                operation_arr = self._build_operation_array(
                    stacks, world_items_list, default_value
                )
            if owner not in self._inventory_operations:
                self._inventory_operations[owner] = {}
            self._inventory_operations[owner][operation] = operation_arr

    def _build_apply_operations(self):
        for owner, operations in self._inventory_operations.items():
            apply_op = InventoryOperation.APPLY
            apply_arr = _build_apply_operation_array(operations)
            self._inventory_operations[owner][apply_op] = apply_arr

    def _build_operation_array(
        self,
        stacks: List[Stack],
        world_items_list: List["Item"],
        default_value: int = 0,
    ) -> np.ndarray:
        operation = default_value * np.ones(len(world_items_list), dtype=np.int32)
        for stack in stacks:
            item_slot = world_items_list.index(stack.item)
            operation[item_slot] = stack.quantity
        return operation

    def _build_zones_items_op(
        self,
        stacks_per_zone: Dict[Zone, List["Stack"]],
        zones: List[Zone],
        zones_items: List["Item"],
        default_value: float = 0.0,
    ) -> np.ndarray:
        operation = default_value * np.ones(
            (len(zones), len(zones_items)), dtype=np.int32
        )
        for zone, stacks in stacks_per_zone.items():
            zone_slot = zones.index(zone)
            for stack in stacks:
                item_slot = zones_items.index(stack.item)
                operation[zone_slot, item_slot] = stack.quantity
        return operation

    def __str__(self) -> str:
        return self.name

    def __repr__(self) -> str:
        return f"{self._preconditions_repr()}⟹{self._effects_repr()}"

    def _preconditions_repr(self) -> str:
        preconditions_text = ""

        owners_brackets = {
            PLAYER: ".",
            CURRENT_ZONE: "Zone(.)",
            DESTINATION: "Dest(.)",
        }

        for owner in InventoryOwner:
            if owner is InventoryOwner.ZONES:
                continue
            owner_texts = []
            owner_texts += _stacks_precontions_str(
                self.get_changes(owner, InventoryOperation.MIN),
                symbol="≥",
            )
            owner_texts += _stacks_precontions_str(
                self.get_changes(owner, InventoryOperation.MAX),
                symbol="≤",
            )
            stacks_text = ",".join(owner_texts)
            if not owner_texts:
                continue
            if preconditions_text:
                preconditions_text += " "
            preconditions_text += owners_brackets[owner].replace(".", stacks_text)

        zones_specific_ops: Dict[Zone, Dict[InventoryOperation, List[Stack]]] = {}
        for op, zones_stacks in self.inventory_changes.get(
            InventoryOwner.ZONES, {}
        ).items():
            for zone, stacks in zones_stacks.items():
                if zone not in zones_specific_ops:
                    zones_specific_ops[zone] = {}
                if op not in zones_specific_ops[zone]:
                    zones_specific_ops[zone][op] = []
                zones_specific_ops[zone][op] += stacks

        for zone, operations in zones_specific_ops.items():
            owner_texts = []
            owner_texts += _stacks_precontions_str(
                operations.get(InventoryOperation.MIN, []),
                symbol="≥",
            )
            owner_texts += _stacks_precontions_str(
                operations.get(InventoryOperation.MAX, []),
                symbol="≤",
            )
            stacks_text = ",".join(owner_texts)
            if not owner_texts:
                continue
            if preconditions_text:
                preconditions_text += " "
            preconditions_text += f"{zone.name}({stacks_text})"

        if self.zone is not None:
            if preconditions_text:
                preconditions_text += " "
            preconditions_text += f"| at {self.zone.name}"

        if preconditions_text:
            preconditions_text += " "

        return preconditions_text

    def _effects_repr(self) -> str:
        effects_text = ""
        owners_brackets = {
            PLAYER: ".",
            CURRENT_ZONE: "Zone(.)",
            DESTINATION: "Dest(.)",
        }

        for owner in InventoryOwner:
            if owner is InventoryOwner.ZONES:
                continue
            owner_texts = []
            owner_texts += _stacks_effects_str(
                self.get_changes(owner, InventoryOperation.REMOVE),
                stack_prefix="-",
            )
            owner_texts += _stacks_effects_str(
                self.get_changes(owner, InventoryOperation.ADD),
                stack_prefix="+",
            )
            stacks_text = ",".join(owner_texts)
            if not owner_texts:
                continue
            effects_text += " "
            effects_text += owners_brackets[owner].replace(".", stacks_text)

        zones_specific_ops: Dict[Zone, Dict[InventoryOperation, List[Stack]]] = {}
        for op, zones_stacks in self.inventory_changes.get(
            InventoryOwner.ZONES, {}
        ).items():
            for zone, stacks in zones_stacks.items():
                if zone not in zones_specific_ops:
                    zones_specific_ops[zone] = {}
                if op not in zones_specific_ops[zone]:
                    zones_specific_ops[zone][op] = []
                zones_specific_ops[zone][op] += stacks

        for zone, operations in zones_specific_ops.items():
            owner_texts = []
            owner_texts += _stacks_effects_str(
                operations.get(InventoryOperation.REMOVE, []),
                stack_prefix="-",
            )
            owner_texts += _stacks_effects_str(
                operations.get(InventoryOperation.ADD, []),
                stack_prefix="+",
            )
            stacks_text = ",".join(owner_texts)
            if not owner_texts:
                continue
            effects_text += " "
            effects_text += f"{zone.name}({stacks_text})"

        if self.destination is not None:
            effects_text += " "
            effects_text += f"| at {self.destination.name}"

        return effects_text

The building blocks of every HierarchyCraft environment.

A list of transformations is what defines each HierarchyCraft environment. The transformations become the available actions and determine every possible transition of the environment.

Each transformation defines changes to:

  • the player inventory
  • the player position, which moves to a given destination
  • the current zone inventory
  • the destination zone inventory (if a destination is specified)
  • the inventories of specific zones

Each inventory change is a list of removed (-) and added (+) stacks (Stack).

If specified, a transformation may be restricted to a subset of zones; by default, all zones are valid.

A Transformation can only be applied if it is valid in the given state. A transformation is valid only if the player is in a valid zone and all relevant inventories hold enough items to be removed before new items are added.

The picture below illustrates the impact of an example transformation on a given HcraftState:

In this example, when applied, the transformation will:

  • (-) Remove 1 item "0", then (+) add 4 items "3" in the player inventory.
  • Update the player position from the current zone "1" to the destination zone "3".
  • (-) Remove 2 zone items "0" and 1 zone item "1", then (+) add 1 zone item "1" in the current zone inventory.
  • (-) Remove 1 zone item "2", then (+) add 1 zone item "0" in the destination zone inventory.
  • (-) Remove 1 zone item "0" from the zone "1" inventory and 2 zone items "2" from the zone "2" inventory, then (+) add 1 zone item "1" to the zone "0" inventory and 1 zone item "2" to the zone "1" inventory.
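
The effects listed above can be traced by hand with a minimal sketch that uses plain dictionaries instead of hcraft's array-based state. The starting quantities and the `apply_example` helper are assumptions made for illustration only; they are not part of the hcraft API.

```python
from collections import Counter

# Hypothetical starting state, chosen so that every removal is possible.
player = Counter({"0": 1})
position = "1"  # name of the current zone
zones = {
    "0": Counter(),
    "1": Counter({"0": 3, "1": 1}),
    "2": Counter({"2": 2}),
    "3": Counter({"2": 1}),
}


def apply_example(player, position, zones):
    """Apply the example transformation and return the new position."""
    destination = "3"
    # Player inventory: remove 1 item "0", add 4 items "3".
    player["0"] -= 1
    player["3"] += 4
    # Current zone inventory: remove 2 "0" and 1 "1", add 1 "1".
    zones[position]["0"] -= 2
    zones[position]["1"] -= 1
    zones[position]["1"] += 1
    # Destination zone inventory: remove 1 "2", add 1 "0".
    zones[destination]["2"] -= 1
    zones[destination]["0"] += 1
    # Specific zones inventories.
    zones["1"]["0"] -= 1
    zones["2"]["2"] -= 2
    zones["0"]["1"] += 1
    zones["1"]["2"] += 1
    # The player ends up in the destination zone.
    return destination


position = apply_example(player, position, zones)
print(position, dict(player))
```

Note that zone "1" is both the current zone and one of the specific zones here, so it receives both sets of changes.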
Transformation( name: Optional[str] = None, destination: Optional[Zone] = None, inventory_changes: Optional[List[Union[hcraft.transformation.Use, hcraft.transformation.Yield]]] = None, zone: Optional[Zone] = None)

The building blocks of every HierarchyCraft environment.

Arguments:
  • name: Name given to the Transformation. If None use repr instead. Defaults to None.
  • destination: Destination zone. Defaults to None.
  • inventory_changes: List of inventory changes done by this transformation. Defaults to None.
  • zone: Zone to which Transformation is restricted. Unrestricted if None. Defaults to None.
destination
zone
inventory_changes
name
def apply( self, player_inventory: numpy.ndarray, position: numpy.ndarray, zones_inventories: numpy.ndarray) -> Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]:

Apply the transformation in place on the given state.
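
Because `apply` works in place, callers see the result through the arrays they passed in. A minimal sketch of that pattern, assuming plain Python lists instead of NumPy arrays (slice assignment `position[:] = ...` plays the role of `position[...] = ...`); `apply_in_place` is a hypothetical helper, not hcraft code:

```python
def apply_in_place(player_inventory, position, apply_op, destination=None):
    """Add `apply_op` to the inventory and move to `destination`, in place."""
    for slot, delta in enumerate(apply_op):
        player_inventory[slot] += delta
    if destination is not None:
        # Overwrite the contents while keeping the same list object.
        position[:] = destination


inventory = [1, 0, 0, 0]
position = [0, 1, 0, 0]  # one-hot vector: the player is in zone 1
alias = position         # a second reference to the same list

apply_in_place(
    inventory, position, apply_op=[-1, 0, 0, 4], destination=[0, 0, 0, 1]
)
print(inventory, alias)
```

Every reference to the position vector observes the move, which is why no return value is needed.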

def is_valid(self, state: HcraftState) -> bool:

Is the transformation valid in the given state?
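
The checks behind `is_valid` can be sketched without NumPy. The two helpers below mirror the position and inventory conditions visible in the class source (zone mask, destination different from the current position, minimum and maximum bounds). They are an illustrative re-implementation on plain lists, not the library's code.

```python
def is_valid_position(position, zone=None, destination=None):
    """Check one-hot `position` against an optional zone mask and destination."""
    if zone is not None and not any(z * p for z, p in zip(zone, position)):
        return False  # the player is outside the required zone
    if destination is not None and list(destination) == list(position):
        return False  # the player is already at the destination
    return True


def is_valid_inventory(inventory, min_items=None, max_items=None):
    """Check every slot against optional minimum and maximum quantities."""
    if max_items is not None and any(q > m for q, m in zip(inventory, max_items)):
        return False
    if min_items is not None and any(q < m for q, m in zip(inventory, min_items)):
        return False
    return True


position = [0, 1, 0]  # the player is in zone 1
in_zone = is_valid_position(position, zone=[0, 1, 0], destination=[0, 0, 1])
already_there = is_valid_position(position, destination=[0, 1, 0])
enough = is_valid_inventory([2, 0], min_items=[1, 0])
too_few = is_valid_inventory([0, 0], min_items=[1, 0])
print(in_zone, already_there, enough, too_few)
```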

def build(self, world: hcraft.world.World) -> None:

Build the transformation array operations on the given world.
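
Building maps item-level descriptions onto fixed-size arrays indexed by the world's item and zone lists. The sketch below reproduces that idea with plain lists, following `_build_operation_array` and `_build_destination_op` from the class source. The `Item` and `Stack` classes are local stand-ins, and the item and zone names are made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:  # local stand-in for hcraft's Item
    name: str


@dataclass(frozen=True)
class Stack:  # local stand-in for hcraft's Stack
    item: Item
    quantity: int = 1


def build_operation_array(stacks, world_items, default_value=0):
    """One slot per world item, filled with the quantity of matching stacks."""
    operation = [default_value] * len(world_items)
    for stack in stacks:
        operation[world_items.index(stack.item)] = stack.quantity
    return operation


def one_hot(slot, size):
    """Vector of zeros with a single 1 at `slot` (used for zones)."""
    return [1 if i == slot else 0 for i in range(size)]


wood, stone, iron = Item("wood"), Item("stone"), Item("iron")
world_items = [wood, stone, iron]
zones = ["forest", "cave", "village"]

removed = build_operation_array([Stack(wood, 2), Stack(iron)], world_items)
# For a maximum, slots without a constraint default to infinity.
max_items = build_operation_array([Stack(stone, 5)], world_items, math.inf)
destination = one_hot(zones.index("cave"), len(zones))
print(removed, max_items, destination)
```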

def get_changes( self, owner: hcraft.transformation.InventoryOwner, operation: hcraft.transformation.InventoryOperation, default: Any = None) -> Union[List[Stack], Dict[Zone, List[Stack]], NoneType]:

Get individual changes for a given owner and a given operation.

Arguments:
  • owner: Owner of the inventory changes to get.
  • operation: Operation on the inventory to get.
Returns:

Changes of the inventory of the given owner with the given operation.
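
The lookup is a two-level dictionary access with enum coercion, so an owner or operation can be given either as an enum member or as its value. A small sketch of that behaviour, assuming stand-in enums whose member values are invented for the example (the real values of `InventoryOwner` and `InventoryOperation` are not shown on this page):

```python
from enum import Enum


class Owner(Enum):  # stand-in for InventoryOwner; values are assumptions
    PLAYER = "player"
    CURRENT = "current_zone"


class Operation(Enum):  # stand-in for InventoryOperation; values are assumptions
    ADD = "add"
    REMOVE = "remove"


inventory_changes = {Owner.PLAYER: {Operation.ADD: ["wood"]}}


def get_changes(owner, operation, default=None):
    """Return the changes for (owner, operation), or `default` if absent."""
    owner = Owner(owner)              # accepts a member or its value
    operation = Operation(operation)  # same coercion for the operation
    return inventory_changes.get(owner, {}).get(operation, default)


by_value = get_changes("player", "add")
missing_owner = get_changes(Owner.CURRENT, Operation.ADD)
missing_operation = get_changes(Owner.PLAYER, Operation.REMOVE, default=[])
print(by_value, missing_owner, missing_operation)
```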

def production( self, owner: hcraft.transformation.InventoryOwner) -> Set[Item]:

Set of produced items for the given owner by this transformation.

def consumption( self, owner: hcraft.transformation.InventoryOwner) -> Set[Item]:

Set of consumed items for the given owner by this transformation.

def min_required( self, owner: hcraft.transformation.InventoryOwner) -> Set[Item]:

Set of items for which a minimum is required by this transformation for the given owner.

def max_required( self, owner: hcraft.transformation.InventoryOwner) -> Set[Item]:

Set of items for which a maximum is required by this transformation for the given owner.

produced_zones_items: Set[Item]

Set of produced zones items by this transformation.
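
Zone-level sets such as `produced_zones_items` merge three sources: the current zone, the destination, and specific zones. For specific zones the changes are grouped per zone, so they are flattened before the union. A small sketch of that logic, assuming stacks are `(item, quantity)` tuples and that the helper `_items_from_stack_list` (not shown on this page) simply collects each stack's item:

```python
def items_from_stacks(stacks):
    """Collect the item of every (item, quantity) stack."""
    return {item for item, _quantity in stacks}


def relevant_items(changes, per_zone=False):
    """Items touched by `changes`; per-zone changes map zone -> stacks."""
    if not changes:
        return set()
    if not per_zone:
        return items_from_stacks(changes)
    items = set()
    for stacks in changes.values():
        items |= items_from_stacks(stacks)
    return items


current_zone_added = [("water", 1)]
destination_added = None  # nothing is produced at the destination
zones_added = {"forest": [("tree", 2)], "cave": [("ore", 1), ("water", 1)]}

produced_zones_items = (
    relevant_items(current_zone_added)
    | relevant_items(destination_added)
    | relevant_items(zones_added, per_zone=True)
)
print(sorted(produced_zones_items))
```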

consumed_zones_items: Set[Item]

Set of consumed zones items by this transformation.

min_required_zones_items: Set[Item]

Set of zone items for which a minimum is required by this transformation.

max_required_zones_items: Set[Item]

Set of zone items for which a maximum is required by this transformation.

@dataclass(frozen=True)
class Item:
@dataclass(frozen=True)
class Item:
    """Represent an item for any hcraft environment."""

    name: str

Represent an item for any hcraft environment.

Item(name: str)
name: str
@dataclass(frozen=True)
class Stack:
@dataclass(frozen=True)
class Stack:
    """Represent a stack of an item for any hcraft environment."""

    item: Item
    quantity: int = 1

    def __str__(self) -> str:
        quantity_str = f"[{self.quantity}]" if self.quantity > 1 else ""
        return f"{quantity_str}{self.item.name}"

Represent a stack of an item for any hcraft environment.

Stack(item: Item, quantity: int = 1)
item: Item
quantity: int = 1
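
A short usage sketch of the string form of a stack. The two dataclasses are re-declared locally from the source above so the snippet runs without hcraft installed; the item name "wood" is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str


@dataclass(frozen=True)
class Stack:
    item: Item
    quantity: int = 1

    def __str__(self) -> str:
        # The quantity is only shown when there is more than one item.
        quantity_str = f"[{self.quantity}]" if self.quantity > 1 else ""
        return f"{quantity_str}{self.item.name}"


wood = Item("wood")
single = str(Stack(wood))
triple = str(Stack(wood, 3))
print(single, triple)  # prints: wood [3]wood
```

Since both dataclasses are frozen, their instances are hashable and compare by value, so they can be used as dictionary keys or set members.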
@dataclass(frozen=True)
class Zone:
@dataclass(frozen=True)
class Zone:
    """Represent a zone for any hcraft environment."""

    name: str

Represent a zone for any hcraft environment.

Zone(name: str)
name: str
class HcraftEnv(typing.Generic[~ObsType, ~ActType]):
class HcraftEnv(Env):
    """Environment to simulate inventory management."""

    def __init__(
        self,
        world: "World",
        purpose: Optional[Union[Purpose, List["Task"], "Task"]] = None,
        invalid_reward: float = -1.0,
        render_window: Optional[HcraftWindow] = None,
        name: str = "HierarchyCraft",
        max_step: Optional[int] = None,
    ) -> None:
        """
        Args:
            world: World defining the environment.
            purpose: Purpose of the player, defining rewards and termination.
                Defaults to None, hence a sandbox environment.
            invalid_reward: Reward given to the agent for invalid actions.
                Defaults to -1.0.
            render_window: Window used to render the environment with pygame.
            name: Name of the environment. Defaults to 'HierarchyCraft'.
            max_step: (Optional[int], optional): Maximum number of steps before episode truncation.
                If None, never truncates the episode. Defaults to None.
        """
        self.world = world
        self.invalid_reward = invalid_reward
        self.max_step = max_step
        self.name = name
        self._all_behaviors = None

        self.render_window = render_window
        self.render_mode = "rgb_array"

        self.state = HcraftState(self.world)
        self.current_step = 0
        self.current_score = 0
        self.cumulated_score = 0
        self.episodes = 0
        self.task_successes: Optional[SuccessCounter] = None
        self.terminal_successes: Optional[SuccessCounter] = None

        if purpose is None:
            purpose = Purpose(None)
        if not isinstance(purpose, Purpose):
            purpose = Purpose(tasks=purpose)
        self.purpose = purpose
        self.metadata = {}

    @property
    def truncated(self) -> bool:
        """Whether the time limit has been exceeded."""
        if self.max_step is None:
            return False
        return self.current_step >= self.max_step

    @property
    def observation_space(self) -> Union[BoxSpace, TupleSpace]:
        """Observation space for the Agent."""
        obs_space = BoxSpace(
            low=np.array(
                [0 for _ in range(self.world.n_items)]
                + [0 for _ in range(self.world.n_zones)]
                + [0 for _ in range(self.world.n_zones_items)]
            ),
            high=np.array(
                [np.inf for _ in range(self.world.n_items)]
                + [1 for _ in range(self.world.n_zones)]
                + [np.inf for _ in range(self.world.n_zones_items)]
            ),
        )

        return obs_space

    @property
    def action_space(self) -> DiscreteSpace:
        """Action space for the Agent.

        Actions are expected to often be invalid.
        """
        return DiscreteSpace(len(self.world.transformations))

    def action_masks(self) -> np.ndarray:
        """Return boolean mask of valid actions."""
        return np.array([t.is_valid(self.state) for t in self.world.transformations])

397    def step(
398        self, action: Union[int, str, np.ndarray]
399    ) -> Tuple[np.ndarray, float, bool, bool, dict]:
400        """Perform one step in the environment given the index of a wanted transformation.
401
402        If the selected transformation can be performed, the state is updated and
403        a reward is given depending of the environment tasks.
404        Else the state is left unchanged and the `invalid_reward` is given to the player.
405
406        """
407
408        if isinstance(action, np.ndarray):
409            if not action.size == 1:
410                raise TypeError(
411                    "Actions should be integers corresponding the a transformation index"
412                    f", got array with multiple elements:\n{action}."
413                )
414            action = action.flatten()[0]
415        try:
416            action = int(action)
417        except (TypeError, ValueError) as e:
418            raise TypeError(
419                "Actions should be integers corresponding to a transformation index."
420            ) from e
421
422        self.current_step += 1
423
424        self.task_successes.step_reset()
425        self.terminal_successes.step_reset()
426
427        success = self.state.apply(action)
428        if success:
429            reward = self.purpose.reward(self.state)
430        else:
431            reward = self.invalid_reward
432
433        terminated = self.purpose.is_terminal(self.state)
434
435        self.task_successes.update(self.episodes)
436        self.terminal_successes.update(self.episodes)
437
438        self.current_score += reward
439        self.cumulated_score += reward
440        return (
441            self.state.observation,
442            reward,
443            terminated,
444            self.truncated,
445            self.infos(),
446        )
447
448    def render(self, mode: Optional[str] = None, **_kwargs) -> Union[str, np.ndarray]:
449        """Render the observation of the agent in a format depending on `render_mode`."""
450        if mode is not None:
451            self.render_mode = mode
452
453        if self.render_mode in ("human", "rgb_array"):  # for human interaction
454            return self._render_rgb_array()
455        if self.render_mode == "console":  # for console print
456            raise NotImplementedError
457        raise NotImplementedError
458
459    def reset(
460        self,
461        *,
462        seed: Optional[int] = None,
463        options: Optional[dict] = None,
464    ) -> Tuple[np.ndarray, dict]:
465        """Resets the state of the environment.
466
467        Returns:
468            (np.ndarray, dict): The first observation and the infos dict.
469        """
470
471        if not self.purpose.built:
472            self.purpose.build(self)
473            self.task_successes = SuccessCounter(self.purpose.tasks)
474            self.terminal_successes = SuccessCounter(self.purpose.terminal_groups)
475
476        self.current_step = 0
477        self.current_score = 0
478        self.episodes += 1
479
480        self.task_successes.new_episode(self.episodes)
481        self.terminal_successes.new_episode(self.episodes)
482
483        self.state.reset()
484        self.purpose.reset()
485        return self.state.observation, self.infos()
486
487    def close(self):
488        """Closes the environment."""
489        if self.render_window is not None:
490            self.render_window.close()
491
492    @property
493    def all_behaviors(self) -> Dict[str, "Behavior"]:
494        """All solving behaviors using hebg."""
495        if self._all_behaviors is None:
496            self._all_behaviors = build_all_solving_behaviors(self)
497        return self._all_behaviors
498
499    def solving_behavior(self, task: "Task") -> "Behavior":
500        """Get the solving behavior for a given task.
501
502        Args:
503            task: Task to solve.
504
505        Returns:
506            Behavior: Behavior solving the task.
507
508        Example:
509            ```python
510            solving_behavior = env.solving_behavior(task)
511
512            done = False
513            observation, _info = env.reset()
514            while not done:
515                action = solving_behavior(observation)
516                observation, _reward, terminated, truncated, _info = env.step(action)
517                done = terminated or truncated
518
519            assert terminated  # Env is successfully terminated
520            assert task.terminated  # Task is successfully terminated
521            ```
522        """
523        return self.all_behaviors[task_to_behavior_name(task)]
524
525    def planning_problem(self, **kwargs) -> HcraftPlanningProblem:
526        """Build this hcraft environment planning problem.
527
528        Returns:
529            Problem: Unified planning problem corresponding to that environment.
530
531        Example:
532            Write as PDDL files:
533            ```python
534            from unified_planning.io import PDDLWriter
535            problem = env.planning_problem()
536            writer = PDDLWriter(problem.upf_problem)
537            writer.write_domain("domain.pddl")
538            writer.write_problem("problem.pddl")
539            ```
540
541            Using a plan to solve a HierarchyCraft gym environment:
542            ```python
543            hcraft_problem = env.planning_problem()
544
545            done = False
546
547            _observation, _info = env.reset()
548            while not done:
549                # Observations are not used when blindly following a plan
550                # But the state is required in order to replan if there is no plan left
551                action = hcraft_problem.action_from_plan(env.state)
552                _observation, _reward, terminated, truncated, _info = env.step(action)
553                done = terminated or truncated
554            assert env.purpose.terminated  # Purpose is achieved
555            ```
556        """
557        return HcraftPlanningProblem(self.state, self.name, self.purpose, **kwargs)
558
559    def infos(self) -> dict:
560        infos = {
561            "action_is_legal": self.action_masks(),
562            "score": self.current_score,
563            "score_average": self.cumulated_score / self.episodes,
564        }
565        infos.update(self._tasks_infos())
566        return infos
567
568    def _tasks_infos(self):
569        infos = {}
570        infos.update(self.task_successes.done_infos)
571        infos.update(self.task_successes.rates_infos)
572        infos.update(self.terminal_successes.done_infos)
573        infos.update(self.terminal_successes.rates_infos)
574        return infos
575
576    def _render_rgb_array(self) -> np.ndarray:
577        """Render an image of the game.
578
579        Create the rendering window if not existing yet.
580        """
581        if self.render_window is None:
582            self.render_window = HcraftWindow()
583        if not self.render_window.built:
584            self.render_window.build(self)
585        fps = self.metadata.get("video.frames_per_second")
586        self.render_window.update_rendering(fps=fps)
587        return surface_to_rgb_array(self.render_window.screen)

Environment to simulate inventory management.

HcraftEnv( world: hcraft.world.World, purpose: Union[Purpose, List[hcraft.task.Task], hcraft.task.Task, NoneType] = None, invalid_reward: float = -1.0, render_window: Optional[hcraft.render.render.HcraftWindow] = None, name: str = 'HierarchyCraft', max_step: Optional[int] = None)
    def __init__(
        self,
        world: "World",
        purpose: Optional[Union[Purpose, List["Task"], "Task"]] = None,
        invalid_reward: float = -1.0,
        render_window: Optional[HcraftWindow] = None,
        name: str = "HierarchyCraft",
        max_step: Optional[int] = None,
    ) -> None:
        """
        Args:
            world: World defining the environment.
            purpose: Purpose of the player, defining rewards and termination.
                Defaults to None, hence a sandbox environment.
            invalid_reward: Reward given to the agent for invalid actions.
                Defaults to -1.0.
            render_window: Window used to render the environment with pygame.
            name: Name of the environment. Defaults to 'HierarchyCraft'.
            max_step: Maximum number of steps before episode truncation.
                If None, never truncates the episode. Defaults to None.
        """
        self.world = world
        self.invalid_reward = invalid_reward
        self.max_step = max_step
        self.name = name
        self._all_behaviors = None

        self.render_window = render_window
        self.render_mode = "rgb_array"

        self.state = HcraftState(self.world)
        self.current_step = 0
        self.current_score = 0
        self.cumulated_score = 0
        self.episodes = 0
        self.task_successes: Optional[SuccessCounter] = None
        self.terminal_successes: Optional[SuccessCounter] = None

        # A missing purpose gives a sandbox; bare tasks are wrapped in a Purpose.
        if purpose is None:
            purpose = Purpose(None)
        if not isinstance(purpose, Purpose):
            purpose = Purpose(tasks=purpose)
        self.purpose = purpose
        self.metadata = {}
Arguments:
  • world: World defining the environment.
  • purpose: Purpose of the player, defining rewards and termination. Defaults to None, hence a sandbox environment.
  • invalid_reward: Reward given to the agent for invalid actions. Defaults to -1.0.
  • render_window: Window used to render the environment with pygame.
  • name: Name of the environment. Defaults to 'HierarchyCraft'.
  • max_step: Maximum number of steps before episode truncation. If None, never truncates the episode. Defaults to None.
world
invalid_reward
max_step
name
render_window
render_mode = None
state
current_step
current_score
cumulated_score
episodes
task_successes: Optional[hcraft.metrics.SuccessCounter]
terminal_successes: Optional[hcraft.metrics.SuccessCounter]
purpose
metadata = {'render_modes': []}
truncated: bool
    @property
    def truncated(self) -> bool:
        """Whether the time limit has been exceeded."""
        if self.max_step is None:
            return False
        return self.current_step >= self.max_step

Whether the time limit has been exceeded.

observation_space: Union[gymnasium.spaces.box.Box, gymnasium.spaces.tuple.Tuple]
    @property
    def observation_space(self) -> Union[BoxSpace, TupleSpace]:
        """Observation space for the Agent."""
        obs_space = BoxSpace(
            low=np.array(
                [0 for _ in range(self.world.n_items)]
                + [0 for _ in range(self.world.n_zones)]
                + [0 for _ in range(self.world.n_zones_items)]
            ),
            high=np.array(
                [np.inf for _ in range(self.world.n_items)]
                + [1 for _ in range(self.world.n_zones)]
                + [np.inf for _ in range(self.world.n_zones_items)]
            ),
        )

        return obs_space

Observation space for the Agent.
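
The bounds in the source above imply a flat vector made of three consecutive segments: player inventory, zones, and current zone inventory. The following is a minimal sketch of that layout using plain lists instead of NumPy. The helper name `observation_bounds` and the sizes are hypothetical, and reading the zone segment as a one-hot position is an assumption based on its [0, 1] bounds.

```python
import math


def observation_bounds(n_items, n_zones, n_zones_items):
    """Return (low, high) lists mirroring the three segments of the Box space."""
    low = [0.0] * (n_items + n_zones + n_zones_items)
    high = (
        [math.inf] * n_items  # player inventory: unbounded quantities
        + [1.0] * n_zones  # zones: bounded by 1 (assumed one-hot position)
        + [math.inf] * n_zones_items  # current zone inventory: unbounded quantities
    )
    return low, high


# Hypothetical sizes, chosen only for illustration.
low, high = observation_bounds(n_items=3, n_zones=2, n_zones_items=1)
print(len(low), high)
```

The observation length is therefore `n_items + n_zones + n_zones_items`, which is what a policy network's input size would need to match.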

action_space: gymnasium.spaces.discrete.Discrete
    @property
    def action_space(self) -> DiscreteSpace:
        """Action space for the Agent.

        Actions are expected to often be invalid.
        """
        return DiscreteSpace(len(self.world.transformations))

Action space for the Agent.

Actions are expected to often be invalid.

def action_masks(self) -> numpy.ndarray:
    def action_masks(self) -> np.ndarray:
        """Return boolean mask of valid actions."""
        return np.array([t.is_valid(self.state) for t in self.world.transformations])

Return boolean mask of valid actions.
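
Since actions are expected to often be invalid, a random baseline may want to sample only among valid ones. Below is a minimal sketch of that idea, assuming a mask shaped like the output of `action_masks()`. The helper `sample_valid_action` is hypothetical and not part of hcraft, and the mask is a hand-written stand-in for a real environment.

```python
import random


def sample_valid_action(mask, rng):
    """Pick a random index whose mask entry is truthy, or None if none is valid."""
    valid = [index for index, is_valid in enumerate(mask) if is_valid]
    if not valid:
        return None
    return rng.choice(valid)


# Stand-in for `env.action_masks()`; a real mask would be a NumPy boolean array.
mask = [False, True, False, True]
action = sample_valid_action(mask, random.Random(0))
print(action)
```

Iterating over a NumPy boolean array works the same way, so the helper should accept the real mask unchanged.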

def step( self, action: Union[int, str, numpy.ndarray]) -> Tuple[numpy.ndarray, float, bool, bool, dict]:
    def step(
        self, action: Union[int, str, np.ndarray]
    ) -> Tuple[np.ndarray, float, bool, bool, dict]:
        """Perform one step in the environment given the index of a wanted transformation.

        If the selected transformation can be performed, the state is updated and
        a reward is given depending on the environment tasks.
        Else the state is left unchanged and the `invalid_reward` is given to the player.

        """

        # Accept single-element arrays by unwrapping them to a scalar.
        if isinstance(action, np.ndarray):
            if not action.size == 1:
                raise TypeError(
                    "Actions should be integers corresponding to a transformation index"
                    f", got array with multiple elements:\n{action}."
                )
            action = action.flatten()[0]
        try:
            action = int(action)
        except (TypeError, ValueError) as e:
            raise TypeError(
                "Actions should be integers corresponding to a transformation index."
            ) from e

        self.current_step += 1

        self.task_successes.step_reset()
        self.terminal_successes.step_reset()

        # Invalid actions leave the state unchanged and yield `invalid_reward`.
        success = self.state.apply(action)
        if success:
            reward = self.purpose.reward(self.state)
        else:
            reward = self.invalid_reward

        terminated = self.purpose.is_terminal(self.state)

        self.task_successes.update(self.episodes)
        self.terminal_successes.update(self.episodes)

        self.current_score += reward
        self.cumulated_score += reward
        return (
            self.state.observation,
            reward,
            terminated,
            self.truncated,
            self.infos(),
        )

Perform one step in the environment given the index of a wanted transformation.

If the selected transformation can be performed, the state is updated and a reward is given depending on the environment tasks. Otherwise the state is left unchanged and the invalid_reward is given to the player.
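
The valid/invalid branching described above can be illustrated with a toy model. This is only a sketch and not hcraft's implementation: the dict-based transformation format, the `toy_step` helper and the fixed `success_reward` (standing in for `purpose.reward`) are all hypothetical, and real transformations do not necessarily consume what they require.

```python
def toy_step(inventory, transformation, invalid_reward=-1.0, success_reward=1.0):
    """Apply `transformation` if its requirements hold, else leave the inventory unchanged."""
    required = transformation["requires"]
    if any(inventory.get(item, 0) < quantity for item, quantity in required.items()):
        return inventory, invalid_reward  # invalid action: state untouched

    updated = dict(inventory)
    for item, quantity in required.items():
        updated[item] = updated.get(item, 0) - quantity  # toy rule: inputs are consumed
    for item, quantity in transformation["produces"].items():
        updated[item] = updated.get(item, 0) + quantity
    return updated, success_reward


craft_plank = {"requires": {"wood": 1}, "produces": {"plank": 4}}

failed_state, failed_reward = toy_step({"wood": 0}, craft_plank)
ok_state, ok_reward = toy_step({"wood": 2}, craft_plank)
print(failed_state, failed_reward)
print(ok_state, ok_reward)
```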

def render(self, mode: Optional[str] = None, **_kwargs) -> Union[str, numpy.ndarray]:
    def render(self, mode: Optional[str] = None, **_kwargs) -> Union[str, np.ndarray]:
        """Render the observation of the agent in a format depending on `render_mode`."""
        if mode is not None:
            self.render_mode = mode

        if self.render_mode in ("human", "rgb_array"):  # for human interaction
            return self._render_rgb_array()
        if self.render_mode == "console":  # for console print
            raise NotImplementedError
        raise NotImplementedError

Render the observation of the agent in a format depending on render_mode.

def reset( self, *, seed: Optional[int] = None, options: Optional[dict] = None) -> Tuple[numpy.ndarray, dict]:
    def reset(
        self,
        *,
        seed: Optional[int] = None,
        options: Optional[dict] = None,
    ) -> Tuple[np.ndarray, dict]:
        """Resets the state of the environment.

        Returns:
            (np.ndarray, dict): The first observation and the infos dict.
        """

        # The purpose is built lazily on the first reset.
        if not self.purpose.built:
            self.purpose.build(self)
            self.task_successes = SuccessCounter(self.purpose.tasks)
            self.terminal_successes = SuccessCounter(self.purpose.terminal_groups)

        self.current_step = 0
        self.current_score = 0
        self.episodes += 1

        self.task_successes.new_episode(self.episodes)
        self.terminal_successes.new_episode(self.episodes)

        self.state.reset()
        self.purpose.reset()
        return self.state.observation, self.infos()

Resets the state of the environment.

Returns:

(np.ndarray, dict): The first observation and the infos dict.

def close(self):
    def close(self):
        """Closes the environment."""
        if self.render_window is not None:
            self.render_window.close()

Closes the environment.

all_behaviors: Dict[str, hebg.behavior.Behavior]
    @property
    def all_behaviors(self) -> Dict[str, "Behavior"]:
        """All solving behaviors using hebg."""
        if self._all_behaviors is None:
            self._all_behaviors = build_all_solving_behaviors(self)
        return self._all_behaviors

All solving behaviors using hebg.

def solving_behavior(self, task: hcraft.task.Task) -> hebg.behavior.Behavior:
    def solving_behavior(self, task: "Task") -> "Behavior":
        """Get the solving behavior for a given task.

        Args:
            task: Task to solve.

        Returns:
            Behavior: Behavior solving the task.

        Example:
            ```python
            solving_behavior = env.solving_behavior(task)

            done = False
            observation, _info = env.reset()
            while not done:
                action = solving_behavior(observation)
                observation, _reward, terminated, truncated, _info = env.step(action)
                done = terminated or truncated

            assert terminated  # Env is successfully terminated
            assert task.terminated  # Task is successfully terminated
            ```
        """
        return self.all_behaviors[task_to_behavior_name(task)]

Get the solving behavior for a given task.

Arguments:
  • task: Task to solve.
Returns:

Behavior: Behavior solving the task.

Example:
solving_behavior = env.solving_behavior(task)

done = False
observation, _info = env.reset()
while not done:
    action = solving_behavior(observation)
    observation, _reward, terminated, truncated, _info = env.step(action)
    done = terminated or truncated

assert terminated  # Env is successfully terminated
assert task.terminated  # Task is successfully terminated
def planning_problem(self, **kwargs) -> hcraft.planning.HcraftPlanningProblem:
    def planning_problem(self, **kwargs) -> HcraftPlanningProblem:
        """Build this hcraft environment planning problem.

        Returns:
            Problem: Unified planning problem corresponding to that environment.

        Example:
            Write as PDDL files:
            ```python
            from unified_planning.io import PDDLWriter
            problem = env.planning_problem()
            writer = PDDLWriter(problem.upf_problem)
            writer.write_domain("domain.pddl")
            writer.write_problem("problem.pddl")
            ```

            Using a plan to solve a HierarchyCraft gym environment:
            ```python
            hcraft_problem = env.planning_problem()

            done = False

            _observation, _info = env.reset()
            while not done:
                # Observations are not used when blindly following a plan
                # But the state is required in order to replan if there is no plan left
                action = hcraft_problem.action_from_plan(env.state)
                _observation, _reward, terminated, truncated, _info = env.step(action)
                done = terminated or truncated
            assert env.purpose.terminated  # Purpose is achieved
            ```
        """
        return HcraftPlanningProblem(self.state, self.name, self.purpose, **kwargs)

Build this hcraft environment planning problem.

Returns:

Problem: Unified planning problem corresponding to that environment.

Example:

Write as PDDL files:

from unified_planning.io import PDDLWriter
problem = env.planning_problem()
writer = PDDLWriter(problem.upf_problem)
writer.write_domain("domain.pddl")
writer.write_problem("problem.pddl")

Using a plan to solve a HierarchyCraft gym environment:

hcraft_problem = env.planning_problem()

done = False

_observation, _info = env.reset()
while not done:
    # Observations are not used when blindly following a plan
    # But the state is required in order to replan if there is no plan left
    action = hcraft_problem.action_from_plan(env.state)
    _observation, _reward, terminated, truncated, _info = env.step(action)
    done = terminated or truncated
assert env.purpose.terminated  # Purpose is achieved
def infos(self) -> dict:
    def infos(self) -> dict:
        infos = {
            "action_is_legal": self.action_masks(),
            "score": self.current_score,
            "score_average": self.cumulated_score / self.episodes,
        }
        infos.update(self._tasks_infos())
        return infos
Inherited Members
gymnasium.core.Env
spec
unwrapped
np_random_seed
np_random
has_wrapper_attr
get_wrapper_attr
set_wrapper_attr
def get_human_action( env: HcraftEnv, additional_events: Optional[List[pygame.event.Event]] = None, can_be_none: bool = False, fps: Optional[float] = None):
def get_human_action(
    env: "HcraftEnv",
    additional_events: Optional[List["Event"]] = None,
    can_be_none: bool = False,
    fps: Optional[float] = None,
):
    """Update the environment rendering and gather potential action given by the UI.

    Args:
        env: The running HierarchyCraft environment.
        additional_events (Optional): Additional simulated pygame events.
        can_be_none: If False, this function will loop on rendering until an action is found.
            If True, will return None if no action was found after one rendering update.
        fps (Optional): Frames per second passed on to the rendering update.

    Returns:
        The action found using the UI.

    """
    action_chosen = False
    while not action_chosen:
        # Each rendering update returns the chosen action, or None if there is none yet.
        action = env.render_window.update_rendering(additional_events, fps)
        action_chosen = action is not None or can_be_none
    return action

Update the environment rendering and gather potential action given by the UI.

Arguments:
  • env: The running HierarchyCraft environment.
  • additional_events (Optional): Additional simulated pygame events.
  • can_be_none: If False, this function will loop on rendering until an action is found. If True, will return None if no action was found after one rendering update.
  • fps (Optional): Frames per second passed on to the rendering update.
Returns:

The action found using the UI.
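
The effect of `can_be_none` is easier to see when the loop is isolated from pygame. Here is a minimal sketch under that assumption: `poll_action` and `make_fake_ui` are hypothetical helpers, with the fake UI standing in for `env.render_window.update_rendering`.

```python
def poll_action(poll_once, can_be_none=False):
    """Call `poll_once` until it yields an action, or only once if `can_be_none`."""
    action_chosen = False
    while not action_chosen:
        action = poll_once()
        action_chosen = action is not None or can_be_none
    return action


def make_fake_ui(events):
    """Build a polling function that replays a fixed list of UI results."""
    iterator = iter(events)
    return lambda: next(iterator)


# Blocking mode: keeps polling through the two empty updates.
blocking_action = poll_action(make_fake_ui([None, None, 3]))
# Non-blocking mode: returns after a single update, even without an action.
non_blocking_action = poll_action(make_fake_ui([None, 3]), can_be_none=True)
print(blocking_action, non_blocking_action)
```

The non-blocking mode suits callers that run their own loop and only want to check whether the user clicked something during the last frame.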

def render_env_with_human(env: HcraftEnv, n_episodes: int = 1):
def render_env_with_human(env: "HcraftEnv", n_episodes: int = 1):
    """Render the given environment with human interactions.

    Args:
        env (HcraftEnv): The HierarchyCraft environment to run.
        n_episodes (int, optional): Number of episodes to run. Defaults to 1.
    """
    print("Purpose: ", env.purpose)

    for _ in range(n_episodes):
        env.reset()
        done = False
        total_reward = 0
        while not done:
            env.render()
            action = get_human_action(env)
            print(f"Human did: {env.world.transformations[action]}")

            _observation, reward, terminated, truncated, _info = env.step(action)
            done = terminated or truncated
            total_reward += reward

        print("SCORE: ", total_reward)

Render the given environment with human interactions.

Arguments:
  • env (HcraftEnv): The HierarchyCraft environment to run.
  • n_episodes (int, optional): Number of episodes to run. Defaults to 1.
class Purpose:
156class Purpose:
157    """A purpose for a HierarchyCraft player based on a list of tasks."""
158
159    def __init__(
160        self,
161        tasks: Optional[Union[Task, List[Task]]] = None,
162        timestep_reward: float = 0.0,
163        default_reward_shaping: RewardShaping = RewardShaping.NONE,
164        shaping_value: float = 1.0,
165    ) -> None:
166        """
167        Args:
168            tasks: Tasks to add to the Purpose.
169                Defaults to None.
170            timestep_reward: Reward for each timestep.
171                Defaults to 0.0.
172            default_reward_shaping: Default reward shaping for tasks.
173                Defaults to RewardShaping.NONE.
174            shaping_value: Reward value used in reward shaping if any.
175                Defaults to 1.0.
176        """
177        self.tasks: List[Task] = []
178        self.timestep_reward = timestep_reward
179        self.shaping_value = shaping_value
180        self.default_reward_shaping = default_reward_shaping
181        self.built = False
182
183        self.reward_shaping: Dict[Task, RewardShaping] = {}
184        self.terminal_groups: List[TerminalGroup] = []
185
186        if isinstance(tasks, Task):
187            tasks = [tasks]
188        elif tasks is None:
189            tasks = []
190        for task in tasks:
191            self.add_task(task, reward_shaping=default_reward_shaping)
192
193        self._best_terminal_group = None
194
195    def add_task(
196        self,
197        task: Task,
198        reward_shaping: Optional[RewardShaping] = None,
199        terminal_groups: Optional[Union[str, List[str]]] = "default",
200    ):
201        """Add a new task to the purpose.
202
203        Args:
204            task: Task to be added to the purpose.
205            reward_shaping: Reward shaping for this task.
206                Defaults to purpose's default reward shaping.
207            terminal_groups: Purpose terminates when ALL the tasks of ANY terminal group terminate.
208                If terminal groups is "" or None, the task is optional and
209                cannot terminate the purpose.
210                By default, tasks are added in the "default" group and hence
211                ALL tasks have to be done to terminate the purpose.
212        """
213        if reward_shaping is None:
214            reward_shaping = self.default_reward_shaping
215        reward_shaping = RewardShaping(reward_shaping)
216        if terminal_groups:
217            if isinstance(terminal_groups, str):
218                terminal_groups = [terminal_groups]
219            for terminal_group in terminal_groups:
220                existing_group = self._terminal_group_from_name(terminal_group)
221                if not existing_group:
222                    existing_group = TerminalGroup(terminal_group)
223                    self.terminal_groups.append(existing_group)
224                existing_group.tasks.append(task)
225
226        self.reward_shaping[task] = reward_shaping
227        self.tasks.append(task)
228
229    def build(self, env: "HcraftEnv"):
230        """
231        Builds the purpose of the player relative to the given environment.
232
233        Args:
234            env: The HierarchyCraft environment to build upon.
235        """
236        if self.built:
237            return
238
239        if not self.tasks:
240            return
241        # Add reward shaping subtasks
242        for task in self.tasks:
243            subtasks = self._add_reward_shaping_subtasks(
244                task, env, self.reward_shaping[task]
245            )
246            for subtask in subtasks:
247                self.add_task(subtask, RewardShaping.NONE, terminal_groups=None)
248
249        # Build all tasks
250        for task in self.tasks:
251            task.build(env.world)
252
253        self.built = True
254
255    def reward(self, state: "HcraftState") -> float:
256        """
257        Returns the purpose reward for the given state based on tasks.
258        """
259        reward = self.timestep_reward
260        if not self.tasks:
261            return reward
262        for task in self.tasks:
263            reward += task.reward(state)
264        return reward
265
266    def is_terminal(self, state: "HcraftState") -> bool:
267        """
268        Returns True if the given state is terminal for the whole purpose.
269        """
270        if not self.tasks:
271            return False
272        for task in self.tasks:
273            task.is_terminal(state)
274        for terminal_group in self.terminal_groups:
275            if terminal_group.terminated:
276                return True
277        return False
278
279    def reset(self) -> None:
280        """Reset the purpose."""
281        for task in self.tasks:
282            task.reset()
283
284    @property
285    def optional_tasks(self) -> List[Task]:
286        """List of tasks in no terminal group, hence being optional."""
287        terminal_tasks = []
288        for group in self.terminal_groups:
289            terminal_tasks += group.tasks
290        return [task for task in self.tasks if task not in terminal_tasks]
291
292    @property
293    def terminated(self) -> bool:
294        """True if any of the terminal groups are terminated."""
295        return any(
296            all(task.terminated for task in terminal_group.tasks)
297            for terminal_group in self.terminal_groups
298        )
299
300    @property
301    def best_terminal_group(self) -> TerminalGroup:
302        """Best rewarding terminal group."""
303        if self._best_terminal_group is not None:
304            return self._best_terminal_group
305
306        best_terminal_group, best_terminal_value = None, -np.inf
307        for terminal_group in self.terminal_groups:
308            terminal_value = sum(task._reward for task in terminal_group.tasks)
309            if terminal_value > best_terminal_value:
310                best_terminal_value = terminal_value
311                best_terminal_group = terminal_group
312
313        self._best_terminal_group = best_terminal_group
314        return best_terminal_group
315
316    def _terminal_group_from_name(self, name: str) -> Optional[TerminalGroup]:
317        if name not in self.terminal_groups:
318            return None
319        group_id = self.terminal_groups.index(name)
320        return self.terminal_groups[group_id]
321
322    def _add_reward_shaping_subtasks(
323        self, task: Task, env: "HcraftEnv", reward_shaping: RewardShaping
324    ) -> List[Task]:
325        if reward_shaping == RewardShaping.NONE:
326            return []
327        if reward_shaping == RewardShaping.ALL_ACHIVEMENTS:
328            return _all_subtasks(env.world, self.shaping_value)
329        if reward_shaping == RewardShaping.INPUTS_ACHIVEMENT:
330            return _inputs_subtasks(task, env.world, self.shaping_value)
331        if reward_shaping == RewardShaping.REQUIREMENTS_ACHIVEMENTS:
332            return _required_subtasks(task, env, self.shaping_value)
333        raise NotImplementedError
334
335    def __str__(self) -> str:
336        terminal_groups_str = []
337        for terminal_group in self.terminal_groups:
338            tasks_str_joined = self._tasks_str(terminal_group.tasks)
339            group_str = f"{terminal_group.name}:[{tasks_str_joined}]"
340            terminal_groups_str.append(group_str)
341        optional_tasks_str = self._tasks_str(self.optional_tasks)
342        if optional_tasks_str:
343            group_str = f"optional:[{optional_tasks_str}]"
344            terminal_groups_str.append(group_str)
345        joined_groups_str = ", ".join(terminal_groups_str)
346        return f"Purpose({joined_groups_str})"
347
348    def _tasks_str(self, tasks: List[Task]) -> str:
349        tasks_str = []
350        for task in tasks:
351            shaping = self.reward_shaping[task]
352            shaping_str = f"#{shaping.value}" if shaping != RewardShaping.NONE else ""
353            tasks_str.append(f"{task}{shaping_str}")
354        return ",".join(tasks_str)

A purpose for a HierarchyCraft player based on a list of tasks.
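
The termination rule documented in `add_task` and implemented by the `terminated` property (ALL tasks of ANY terminal group) can be sketched without hcraft. The snippet below is a simplified illustration: `purpose_terminated`, the group names and the task names are hypothetical, and tasks are plain strings rather than `Task` objects.

```python
def purpose_terminated(terminal_groups, done_tasks):
    """True when every task of at least one terminal group is done."""
    return any(
        all(task in done_tasks for task in tasks)
        for tasks in terminal_groups.values()
    )


groups = {
    "default": ["get_wood", "get_stone"],
    "shortcut": ["get_diamond"],
}

partial = purpose_terminated(groups, {"get_wood"})  # "default" is still incomplete
via_shortcut = purpose_terminated(groups, {"get_diamond"})  # "shortcut" is complete
via_default = purpose_terminated(groups, {"get_wood", "get_stone"})
print(partial, via_shortcut, via_default)
```

A task placed in no group at all never appears in `terminal_groups`, which is why optional tasks can give rewards without ever ending the episode.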

Purpose( tasks: Union[hcraft.task.Task, List[hcraft.task.Task], NoneType] = None, timestep_reward: float = 0.0, default_reward_shaping: hcraft.purpose.RewardShaping = <RewardShaping.NONE: 'none'>, shaping_value: float = 1.0)
    def __init__(
        self,
        tasks: Optional[Union[Task, List[Task]]] = None,
        timestep_reward: float = 0.0,
        default_reward_shaping: RewardShaping = RewardShaping.NONE,
        shaping_value: float = 1.0,
    ) -> None:
        """
        Args:
            tasks: Tasks to add to the Purpose.
                Defaults to None.
            timestep_reward: Reward for each timestep.
                Defaults to 0.0.
            default_reward_shaping: Default reward shaping for tasks.
                Defaults to RewardShaping.NONE.
            shaping_value: Reward value used in reward shaping if any.
                Defaults to 1.0.
        """
        self.tasks: List[Task] = []
        self.timestep_reward = timestep_reward
        self.shaping_value = shaping_value
        self.default_reward_shaping = default_reward_shaping
        self.built = False

        self.reward_shaping: Dict[Task, RewardShaping] = {}
        self.terminal_groups: List[TerminalGroup] = []

        if isinstance(tasks, Task):
            tasks = [tasks]
        elif tasks is None:
            tasks = []
        for task in tasks:
            self.add_task(task, reward_shaping=default_reward_shaping)

        self._best_terminal_group = None
Arguments:
  • tasks: Tasks to add to the Purpose. Defaults to None.
  • timestep_reward: Reward for each timestep. Defaults to 0.0.
  • default_reward_shaping: Default reward shaping for tasks. Defaults to RewardShaping.NONE.
  • shaping_value: Reward value used in reward shaping if any. Defaults to 1.0.
tasks: List[hcraft.task.Task]
timestep_reward
shaping_value
default_reward_shaping
built
reward_shaping: Dict[hcraft.task.Task, hcraft.purpose.RewardShaping]
terminal_groups: List[hcraft.purpose.TerminalGroup]
def add_task(self, task: hcraft.task.Task, reward_shaping: Optional[hcraft.purpose.RewardShaping] = None, terminal_groups: Union[str, List[str], NoneType] = 'default'):
    def add_task(
        self,
        task: Task,
        reward_shaping: Optional[RewardShaping] = None,
        terminal_groups: Optional[Union[str, List[str]]] = "default",
    ):
        """Add a new task to the purpose.

        Args:
            task: Task to be added to the purpose.
            reward_shaping: Reward shaping for this task.
                Defaults to purpose's default reward shaping.
            terminal_groups: The purpose terminates when ALL the tasks of ANY
                terminal group have terminated.
                If terminal_groups is "" or None, the task is optional and
                cannot terminate the purpose.
                By default, tasks are added to the "default" group, so
                ALL tasks must be done to terminate the purpose.
        """
        if reward_shaping is None:
            reward_shaping = self.default_reward_shaping
        reward_shaping = RewardShaping(reward_shaping)
        if terminal_groups:
            if isinstance(terminal_groups, str):
                terminal_groups = [terminal_groups]
            for terminal_group in terminal_groups:
                existing_group = self._terminal_group_from_name(terminal_group)
                if not existing_group:
                    existing_group = TerminalGroup(terminal_group)
                    self.terminal_groups.append(existing_group)
                existing_group.tasks.append(task)

        self.reward_shaping[task] = reward_shaping
        self.tasks.append(task)

Add a new task to the purpose.

Arguments:
  • task: Task to be added to the purpose.
  • reward_shaping: Reward shaping for this task. Defaults to purpose's default reward shaping.
  • terminal_groups: The purpose terminates when ALL the tasks of ANY terminal group have terminated. If terminal_groups is "" or None, the task is optional and cannot terminate the purpose. By default, tasks are added to the "default" group, so ALL tasks must be done to terminate the purpose.
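
The ALL-of-ANY rule for terminal groups can be illustrated without HierarchyCraft itself. The following is a minimal sketch, not the library's implementation: tasks are represented by plain strings, and the helper `purpose_is_done` and the group names are hypothetical.

```python
from typing import Dict, List, Set


def purpose_is_done(groups: Dict[str, List[str]], done: Set[str]) -> bool:
    """Return True when ALL tasks of ANY terminal group are done.

    `groups` maps a terminal group name to its task names (hypothetical
    stand-ins for hcraft tasks); `done` holds the finished task names.
    """
    return any(all(task in done for task in tasks) for tasks in groups.values())


groups = {
    "default": ["get_wood", "get_stone"],  # both are needed to end via this group
    "shortcut": ["get_diamond"],  # this single task also ends the purpose
}

print(purpose_is_done(groups, {"get_wood"}))  # False
print(purpose_is_done(groups, {"get_wood", "get_stone"}))  # True
print(purpose_is_done(groups, {"get_diamond"}))  # True
print(purpose_is_done(groups, {"explore"}))  # False: optional tasks never terminate
```

A task that belongs to no group, such as "explore" here, can still be rewarded but never ends the purpose.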
def build(self, env: HcraftEnv):
    def build(self, env: "HcraftEnv"):
        """
        Builds the purpose of the player relative to the given environment.

        Args:
            env: The HierarchyCraft environment to build upon.
        """
        if self.built:
            return

        if not self.tasks:
            return
        # Add reward shaping subtasks
        for task in self.tasks:
            subtasks = self._add_reward_shaping_subtasks(
                task, env, self.reward_shaping[task]
            )
            for subtask in subtasks:
                self.add_task(subtask, RewardShaping.NONE, terminal_groups=None)

        # Build all tasks
        for task in self.tasks:
            task.build(env.world)

        self.built = True

Builds the purpose of the player relative to the given environment.

Arguments:
  • env: The HierarchyCraft environment to build upon.
def reward(self, state: HcraftState) -> float:
    def reward(self, state: "HcraftState") -> float:
        """
        Returns the purpose reward for the given state based on tasks.
        """
        reward = self.timestep_reward
        if not self.tasks:
            return reward
        for task in self.tasks:
            reward += task.reward(state)
        return reward

Returns the purpose reward for the given state based on tasks.
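
As a rough illustration of how the purpose reward combines its parts, the sketch below adds a per-timestep reward to the sum of the task rewards. It is a stand-alone approximation: the function `purpose_reward` is hypothetical, and plain floats stand in for the values returned by each task's `reward(state)`.

```python
from typing import Iterable


def purpose_reward(timestep_reward: float, task_rewards: Iterable[float]) -> float:
    """Start from the timestep reward, then add every task's reward."""
    reward = timestep_reward
    for task_reward in task_rewards:
        reward += task_reward
    return reward


# A step penalty of -0.25 with one task rewarded (1.0) and one not (0.0).
print(purpose_reward(-0.25, [1.0, 0.0]))  # 0.75
# Without tasks, only the timestep reward remains.
print(purpose_reward(-0.25, []))  # -0.25
```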

def is_terminal(self, state: HcraftState) -> bool:
    def is_terminal(self, state: "HcraftState") -> bool:
        """
        Returns True if the given state is terminal for the whole purpose.
        """
        if not self.tasks:
            return False
        for task in self.tasks:
            task.is_terminal(state)
        for terminal_group in self.terminal_groups:
            if terminal_group.terminated:
                return True
        return False

Returns True if the given state is terminal for the whole purpose.

def reset(self) -> None:
    def reset(self) -> None:
        """Reset the purpose."""
        for task in self.tasks:
            task.reset()

Reset the purpose.

optional_tasks: List[hcraft.task.Task]
    @property
    def optional_tasks(self) -> List[Task]:
        """List of tasks that belong to no terminal group and are therefore optional."""
        terminal_tasks = []
        for group in self.terminal_groups:
            terminal_tasks += group.tasks
        return [task for task in self.tasks if task not in terminal_tasks]

List of tasks that belong to no terminal group and are therefore optional.
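
The filtering behind this property can be sketched in a few lines. This is an illustration only, assuming plain strings in place of hcraft tasks; the function name `optional_tasks` mirrors the property but is a free-standing, hypothetical helper.

```python
from typing import Dict, List


def optional_tasks(tasks: List[str], groups: Dict[str, List[str]]) -> List[str]:
    """Keep the tasks that appear in no terminal group, preserving order."""
    terminal_tasks = [task for group in groups.values() for task in group]
    return [task for task in tasks if task not in terminal_tasks]


tasks = ["get_wood", "get_stone", "explore"]
groups = {"default": ["get_wood", "get_stone"]}
print(optional_tasks(tasks, groups))  # ['explore']
```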

terminated: bool
    @property
    def terminated(self) -> bool:
        """True if any of the terminal groups are terminated."""
        return any(
            all(task.terminated for task in terminal_group.tasks)
            for terminal_group in self.terminal_groups
        )

True if any of the terminal groups are terminated.

best_terminal_group: hcraft.purpose.TerminalGroup
    @property
    def best_terminal_group(self) -> TerminalGroup:
        """Best rewarding terminal group."""
        if self._best_terminal_group is not None:
            return self._best_terminal_group

        best_terminal_group, best_terminal_value = None, -np.inf
        for terminal_group in self.terminal_groups:
            terminal_value = sum(task._reward for task in terminal_group.tasks)
            if terminal_value > best_terminal_value:
                best_terminal_value = terminal_value
                best_terminal_group = terminal_group

        self._best_terminal_group = best_terminal_group
        return best_terminal_group

Best rewarding terminal group.
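
The selection amounts to picking the group whose task rewards sum to the largest value. The sketch below reproduces that idea under simplifying assumptions: groups are a plain dictionary of reward lists, and `best_group` is a hypothetical helper rather than part of hcraft.

```python
from typing import Dict, List, Optional


def best_group(group_rewards: Dict[str, List[float]]) -> Optional[str]:
    """Return the name of the group with the highest total task reward."""
    best_name, best_value = None, float("-inf")
    for name, rewards in group_rewards.items():
        value = sum(rewards)
        if value > best_value:  # strict comparison: the first group wins ties
            best_name, best_value = name, value
    return best_name


print(best_group({"safe": [1.0, 1.0], "risky": [5.0], "tie": [5.0]}))  # risky
print(best_group({}))  # None
```

With no terminal group at all, the result is None, which matches the initial value used in the loop above.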

class GetItemTask(hcraft.task.AchievementTask):
class GetItemTask(AchievementTask):
    """Task of getting a given quantity of an item."""

    def __init__(self, item_stack: Union[Item, Stack], reward: float = 1.0):
        self.item_stack = _stack_item(item_stack)
        super().__init__(name=self.get_name(self.item_stack), reward=reward)

    def build(self, world: "World") -> None:
        super().build(world)
        item_slot = world.items.index(self.item_stack.item)
        self._terminate_player_items[item_slot] = self.item_stack.quantity

    def _is_terminal(self, state: "HcraftState") -> bool:
        return np.all(state.player_inventory >= self._terminate_player_items)

    @staticmethod
    def get_name(stack: Stack):
        """Name of the task for a given Stack"""
        quantity_str = _quantity_str(stack.quantity)
        return f"Get{quantity_str}{stack.item.name}"

Task of getting a given quantity of an item.

GetItemTask(item_stack: Union[Item, Stack], reward: float = 1.0)
    def __init__(self, item_stack: Union[Item, Stack], reward: float = 1.0):
        self.item_stack = _stack_item(item_stack)
        super().__init__(name=self.get_name(self.item_stack), reward=reward)
item_stack
def build(self, world: hcraft.world.World) -> None:
    def build(self, world: "World") -> None:
        super().build(world)
        item_slot = world.items.index(self.item_stack.item)
        self._terminate_player_items[item_slot] = self.item_stack.quantity

Build the task operation arrays based on the given world.
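
To make the threshold mechanism concrete, here is a pure-Python sketch of what building and checking a GetItemTask amounts to. It assumes that the threshold array starts at zero for every item (the parent `build` is not shown here), and it uses lists in place of NumPy arrays; `build_thresholds` and `is_terminal` are hypothetical names.

```python
from typing import List


def build_thresholds(items: List[str], item: str, quantity: int) -> List[int]:
    """One required quantity per item slot; zero everywhere but the target item."""
    thresholds = [0] * len(items)  # assumption: thresholds start at zero
    thresholds[items.index(item)] = quantity
    return thresholds


def is_terminal(player_inventory: List[int], thresholds: List[int]) -> bool:
    """True when every slot holds at least the required quantity."""
    return all(have >= need for have, need in zip(player_inventory, thresholds))


items = ["wood", "stone", "iron"]
thresholds = build_thresholds(items, "stone", 3)
print(thresholds)  # [0, 3, 0]
print(is_terminal([5, 2, 0], thresholds))  # False: only 2 stone
print(is_terminal([0, 3, 1], thresholds))  # True: 3 stone is enough
```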

@staticmethod
def get_name(stack: Stack):
    @staticmethod
    def get_name(stack: Stack):
        """Name of the task for a given Stack"""
        quantity_str = _quantity_str(stack.quantity)
        return f"Get{quantity_str}{stack.item.name}"

Name of the task for a given Stack

Inherited Members
hcraft.task.AchievementTask
reward
hcraft.task.Task
name
terminated
is_terminal
reset
class GoToZoneTask(hcraft.task.AchievementTask):
class GoToZoneTask(AchievementTask):
    """Task to go to a given zone."""

    def __init__(self, zone: Zone, reward: float = 1.0) -> None:
        super().__init__(name=self.get_name(zone), reward=reward)
        self.zone = zone

    def build(self, world: "World"):
        super().build(world)
        zone_slot = world.zones.index(self.zone)
        self._terminate_position[zone_slot] = 1

    def _is_terminal(self, state: "HcraftState") -> bool:
        return np.all(state.position == self._terminate_position)

    @staticmethod
    def get_name(zone: Zone):
        """Name of the task for a given Zone"""
        return f"Go to {zone.name}"

Task to go to a given zone.

GoToZoneTask(zone: Zone, reward: float = 1.0)
    def __init__(self, zone: Zone, reward: float = 1.0) -> None:
        super().__init__(name=self.get_name(zone), reward=reward)
        self.zone = zone
zone
def build(self, world: hcraft.world.World):
    def build(self, world: "World"):
        super().build(world)
        zone_slot = world.zones.index(self.zone)
        self._terminate_position[zone_slot] = 1

Build the task operation arrays based on the given world.
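
The position check can be pictured as comparing two one-hot vectors. The sketch below is an approximation under two assumptions: the target vector starts at zero before the zone slot is set, and the player position is itself one-hot over the zones. The helpers `target_position` and `is_terminal` are hypothetical.

```python
from typing import List


def target_position(zones: List[str], zone: str) -> List[int]:
    """One-hot vector marking the zone the player must reach."""
    target = [0] * len(zones)  # assumption: the target starts at zero
    target[zones.index(zone)] = 1
    return target


def is_terminal(position: List[int], target: List[int]) -> bool:
    """True when the player's position matches the target in every slot."""
    return all(here == wanted for here, wanted in zip(position, target))


zones = ["forest", "cave", "village"]
target = target_position(zones, "cave")
print(target)  # [0, 1, 0]
print(is_terminal([1, 0, 0], target))  # False: player is in the forest
print(is_terminal([0, 1, 0], target))  # True: player is in the cave
```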

@staticmethod
def get_name(zone: Zone):
    @staticmethod
    def get_name(zone: Zone):
        """Name of the task for a given Zone"""
        return f"Go to {zone.name}"

Name of the task for a given Zone

Inherited Members
hcraft.task.AchievementTask
reward
hcraft.task.Task
name
terminated
is_terminal
reset
class PlaceItemTask(hcraft.task.AchievementTask):
class PlaceItemTask(AchievementTask):
    """Task to place a quantity of item in a given zone.

    If no zone is given, consider placing the item anywhere.

    """

    def __init__(
        self,
        item_stack: Union[Item, Stack],
        zone: Optional[Union[Zone, List[Zone]]] = None,
        reward: float = 1.0,
    ):
        item_stack = _stack_item(item_stack)
        self.item_stack = item_stack
        self.zone = zone
        super().__init__(name=self.get_name(item_stack, zone), reward=reward)

    def build(self, world: "World"):
        super().build(world)
        if self.zone is None:
            zones_slots = np.arange(self._terminate_zones_items.shape[0])
        else:
            zones_slots = np.array([world.slot_from_zone(self.zone)])
        zone_item_slot = world.zones_items.index(self.item_stack.item)
        self._terminate_zones_items[zones_slots, zone_item_slot] = (
            self.item_stack.quantity
        )

    def _is_terminal(self, state: "HcraftState") -> bool:
        if self.zone is None:
            return np.any(
                np.all(state.zones_inventories >= self._terminate_zones_items, axis=1)
            )
        return np.all(state.zones_inventories >= self._terminate_zones_items)

    @staticmethod
    def get_name(stack: Stack, zone: Optional[Zone]):
        """Name of the task for a given Stack and list of Zone"""
        quantity_str = _quantity_str(stack.quantity)
        zones_str = _zones_str(zone)
        return f"Place{quantity_str}{stack.item.name}{zones_str}"

Task to place a quantity of item in a given zone.

If no zone is given, consider placing the item anywhere.

PlaceItemTask(item_stack: Union[Item, Stack], zone: Union[Zone, List[Zone], NoneType] = None, reward: float = 1.0)
    def __init__(
        self,
        item_stack: Union[Item, Stack],
        zone: Optional[Union[Zone, List[Zone]]] = None,
        reward: float = 1.0,
    ):
        item_stack = _stack_item(item_stack)
        self.item_stack = item_stack
        self.zone = zone
        super().__init__(name=self.get_name(item_stack, zone), reward=reward)
item_stack
zone
def build(self, world: hcraft.world.World):
    def build(self, world: "World"):
        super().build(world)
        if self.zone is None:
            zones_slots = np.arange(self._terminate_zones_items.shape[0])
        else:
            zones_slots = np.array([world.slot_from_zone(self.zone)])
        zone_item_slot = world.zones_items.index(self.item_stack.item)
        self._terminate_zones_items[zones_slots, zone_item_slot] = (
            self.item_stack.quantity
        )

Build the task operation arrays based on the given world.
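
The difference between "a given zone" and "anywhere" may be easier to see on a small example. The following pure-Python sketch mirrors the idea with nested lists instead of NumPy arrays. It assumes the threshold matrix (zones by items) starts at zero; `build_thresholds` and `is_terminal` are hypothetical helpers, not hcraft functions.

```python
from typing import List, Optional

Matrix = List[List[int]]


def build_thresholds(
    n_zones: int, n_items: int, item_slot: int, quantity: int, zone_slot: Optional[int]
) -> Matrix:
    """Required quantities per zone (rows) and item (columns)."""
    thresholds = [[0] * n_items for _ in range(n_zones)]  # assumption: zeros
    rows = range(n_zones) if zone_slot is None else [zone_slot]
    for row in rows:
        thresholds[row][item_slot] = quantity
    return thresholds


def is_terminal(zones_inventories: Matrix, thresholds: Matrix, any_zone: bool) -> bool:
    """With any_zone, one satisfied zone is enough; otherwise all rows must hold."""
    rows_ok = [
        all(have >= need for have, need in zip(inventory, required))
        for inventory, required in zip(zones_inventories, thresholds)
    ]
    return any(rows_ok) if any_zone else all(rows_ok)


# Two zones, two items; two units of item 1 currently lie in zone 1.
inventories = [[0, 0], [0, 2]]

anywhere = build_thresholds(2, 2, item_slot=1, quantity=2, zone_slot=None)
print(is_terminal(inventories, anywhere, any_zone=True))  # True

in_zone_0 = build_thresholds(2, 2, item_slot=1, quantity=2, zone_slot=0)
print(is_terminal(inventories, in_zone_0, any_zone=False))  # False

in_zone_1 = build_thresholds(2, 2, item_slot=1, quantity=2, zone_slot=1)
print(is_terminal(inventories, in_zone_1, any_zone=False))  # True
```

When a zone is specified, the rows of all other zones keep a zero threshold, so checking every row is equivalent to checking only the target zone.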

@staticmethod
def get_name(stack: Stack, zone: Optional[Zone]):
    @staticmethod
    def get_name(stack: Stack, zone: Optional[Zone]):
        """Name of the task for a given Stack and list of Zone"""
        quantity_str = _quantity_str(stack.quantity)
        zones_str = _zones_str(zone)
        return f"Place{quantity_str}{stack.item.name}{zones_str}"

Name of the task for a given Stack and list of Zone

Inherited Members
hcraft.task.AchievementTask
reward
hcraft.task.Task
name
terminated
is_terminal
reset