Emerging Artificial Societies Through Learning
Journal of Artificial Societies and Social Simulation
vol. 9, no. 2
Received: 02-Sep-2005 Accepted: 10-Dec-2005 Published: 31-Mar-2006
Table 1: Properties encoded in the genome of the agent
Name | Type | Value | Additional comments |
Metabolism | real | [0,1] | Determines how much food is converted into energy. |
OnsetTimeAdulthood | int | [15,25] | On reaching adulthood, the agent becomes fertile. |
OnsetTimeElderdom | int | [55,65] | On reaching elderdom, the agent loses fertility. |
Socialness | real | [0,1] | The degree to which an agent wants to interact with other agents. |
FollowBehaviour | real | [0,1] | The degree to which an agent wants to follow its parents. |
MaxVisionDistance | real | [0,1] | How far an agent can see. |
InitialSpeed | real | [0,1] | Initial distance an agent can walk per time step. |
InitialStrength | real | [0,1] | Initial weight an agent can lift. |
MaxShoutDistance | real | [0,1] | The maximal reach of an agent's shout. |
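To make the genome layout in Table 1 concrete, the following is a minimal sketch of how such a genome could be represented and randomly initialised. The class name `Genome`, the snake_case field names, the helper `random_genome` and the use of uniform sampling are all assumptions made for illustration; the source specifies only the property names, types and value ranges. The fertility check follows the comments in the table (fertile from adulthood until elderdom).

```python
import random
from dataclasses import dataclass

# Integer-valued genes and their inclusive ranges, copied from Table 1.
INT_RANGES = {
    "onset_time_adulthood": (15, 25),
    "onset_time_elderdom": (55, 65),
}

# Real-valued genes; Table 1 gives the range [0, 1] for each of them.
REAL_GENES = (
    "metabolism", "socialness", "follow_behaviour", "max_vision_distance",
    "initial_speed", "initial_strength", "max_shout_distance",
)


@dataclass
class Genome:
    """Hypothetical container for the properties listed in Table 1."""
    metabolism: float
    onset_time_adulthood: int
    onset_time_elderdom: int
    socialness: float
    follow_behaviour: float
    max_vision_distance: float
    initial_speed: float
    initial_strength: float
    max_shout_distance: float

    def is_fertile(self, age: int) -> bool:
        # Fertile from the onset of adulthood until the onset of elderdom.
        return self.onset_time_adulthood <= age < self.onset_time_elderdom


def random_genome(rng: random.Random) -> Genome:
    """Draw every gene uniformly from its range (uniformity is an assumption)."""
    values = {name: rng.uniform(0.0, 1.0) for name in REAL_GENES}
    for name, (low, high) in INT_RANGES.items():
        values[name] = rng.randint(low, high)  # randint is inclusive at both ends
    return Genome(**values)


if __name__ == "__main__":
    genome = random_genome(random.Random(42))
    print(genome)
    print("fertile at age 30:", genome.is_fertile(30))
```

Because the adulthood range ends at 25 and the elderdom range starts at 55, any genome drawn this way is fertile at age 30, whatever the seed.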
Figure 1. DQTs are decision trees with the following features: (1) nodes can denote rules, indices of states, and actions; (2) decisions can be stochastic (denoted by the dice); (3) nodes are assumed to have values (denoted by the grey levels of the nodes). Special cases of DQTs include, among others, decision trees, the action-state value tables of reinforcement learning, and Bayesian decision networks. DQTs have large compression power in state-action mapping and suit evolutionary algorithms. DQT partitions of the state-action space can be overlapping.
Figure 2. (A) Exploitation using a DQT, (B) exploration and (C) "greedification" of a DQT. The value of each node is shown by the depth of the grey shading.
Table 2: An example scheme for playing a language game between a speaker and a hearer. The game may take up to 5 time steps t. See the text for details.
t | Speaker | Hearer |
n | Perceive context; Categorisation/DG; Target selection; Produce utterance; Update memory1; Send message | |
n + 1 | | Receive message; Perceive context; Categorisation/DG; Interpret utterance; Update memory1; Respond |
n + 2 | Evaluate effect | |
n + 3 | | Evaluate effect |
n + 4 | Update memory2 | Update memory2 |
[Equation (1): image not preserved in the source]
When an association is used, its co-occurrence frequency $u_{ij}$ is incremented. This update is referred to in Table 2 as 'update memory1'. The association scores $\sigma_{ij}$ are calculated based on the agents' evaluation of the success of the game. If the game is a success, the association is reinforced according to Eq. (2); otherwise it is inhibited using Eq. (3).
$$\sigma_{ij} = \eta \sigma_{ij} + 1 - \eta \tag{2}$$
$$\sigma_{ij} = \eta \sigma_{ij} \tag{3}$$
where $\eta$ is a learning parameter (typically $\eta = 0.9$).
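The two update rules can be illustrated with a short sketch. The class `Association` and its method names are hypothetical and not taken from the NewTies implementation, and the initial score of 0.0 is an assumption; only the two update formulas, the usage counter and the typical value of the learning parameter come from the text.

```python
from dataclasses import dataclass

ETA = 0.9  # typical value of the learning parameter given in the text


@dataclass
class Association:
    """Hypothetical record for one word-meaning association."""
    score: float = 0.0  # association score sigma_ij (initial value is an assumption)
    uses: int = 0       # co-occurrence frequency u_ij

    def record_use(self) -> None:
        # Incremented whenever the association is used in a game.
        self.uses += 1

    def record_outcome(self, success: bool, eta: float = ETA) -> None:
        if success:
            # Eq. (2): reinforcement moves the score towards 1.
            self.score = eta * self.score + 1 - eta
        else:
            # Eq. (3): inhibition decays the score towards 0.
            self.score = eta * self.score


if __name__ == "__main__":
    assoc = Association(score=0.5)
    assoc.record_use()
    assoc.record_outcome(success=True)   # score becomes roughly 0.55
    assoc.record_outcome(success=False)  # score becomes roughly 0.495
    print(assoc)
```

With a value of 0.9 for the learning parameter, each game shifts the score by at most 0.1, so the score behaves like a slowly moving average of recent successes and stays within [0, 1] when it starts there.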
$$\mathrm{strL}(\alpha_{ij}) = \sigma_{ij} + (1 - \sigma_{ij}) P_{ij} \tag{4}$$
This equation has the neat property that if the association score is low (i.e. the agent is not confident about the association's effectiveness), the influence of the a posteriori probability is large, and when the association score is high, that influence is small. (For a case study using this combined model, see Vogt & Divina 2005.)
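A small numerical sketch can make this property visible. It assumes that Eq. (4) has the form strength = score + (1 - score) × P, which is the reading consistent with the behaviour described above. The function name `association_strength` is hypothetical, and the a posteriori probability is simply passed in as a number because its definition is not reproduced here.

```python
def association_strength(score: float, posterior: float) -> float:
    """Combine the association score with the a posteriori probability.

    Assumed form of Eq. (4): strength = score + (1 - score) * posterior.
    Both arguments are expected to lie in [0, 1].
    """
    return score + (1.0 - score) * posterior


if __name__ == "__main__":
    # Raising the probability from 0.2 to 0.8 changes the strength by
    # (1 - score) * 0.6: a large shift at a low score, a small one at a high score.
    for score in (0.1, 0.9):
        weak = association_strength(score, 0.2)
        strong = association_strength(score, 0.8)
        print(f"score={score}: strength goes from {weak:.2f} to {strong:.2f}")
```

Under this form, a score of 0 makes the strength equal to the probability, while a score of 1 fixes the strength at 1 regardless of the probability.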
2 This motivation does not have to be 'hard-coded' or built in; it too can be evolved, since agents that have no 'motivation' to survive are also likely to fail to reproduce. Hence there will be evolutionary pressure in favour of agents that behave in ways that make them appear to be motivated for survival.
3 We call it the 'NewTies' agent to distinguish this agent design from others that may be introduced into the environment. As noted later, the intention is to make the environment open so that other agent designs may be trialled.
