The Second Social Learning Strategies Tournament FAQs
Posted 29 Aug 2011:
Does a strategy know “at birth” which extension world(s) it is in? I’m assuming it must “know” about cumulative, because the REFINE move is enabled. It would know about spatial as soon as it is switched between demes (but would it know prior to such a switch?). Finally, would an OBSERVE move know whether it is watching a model it has picked, and is therefore in the model-bias extension, or whether the observed models have been chosen at random?
You’ve prompted some thought on our part here. Our original intention was that things would become obvious in the course of the agent’s life, but actually that’s not the case when it comes to model bias – an agent would never, in the original structure, ‘know’ whether the result of an OBSERVE move came from random selection or its own choice. This seemed pretty unrealistic to us. Also, in the spatial case, agents in demes 2 and 3 would ‘know’ immediately that they were in a spatial simulation, whereas those in deme 1 would not unless and until they migrated themselves. On reflection, this seemed to us to introduce an artificial information imbalance. Therefore we’ve decided to make agents aware of whether the model bias and spatial extensions are in effect, in the same way as we currently do for REFINE, by giving the ‘move’ function true/false variables for each extension – in the example and strategy template these are called ‘canChooseModel’, ‘canPlayRefine’ and ‘multipleDemes’. This is now described in section 3.2 of the rules. Thanks for your input.
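To make the mechanism above concrete, here is a minimal sketch of a strategy skeleton consuming those extension flags. The flag names (‘canChooseModel’, ‘canPlayRefine’, ‘multipleDemes’) are taken from the answer above; the move-code constants and the rest of the signature are illustrative assumptions, not the tournament’s actual API.

```python
# Hypothetical move codes -- the real template defines its own.
INNOVATE, OBSERVE, EXPLOIT, REFINE = range(4)

def move(roundsAlive, repertoire, canChooseModel, canPlayRefine, multipleDemes):
    """Illustrative strategy: branch on which extensions are in effect.
    repertoire is assumed to map act -> last-known payoff."""
    if not repertoire:
        # Nothing learned yet: must acquire a first act.
        return INNOVATE
    if canPlayRefine and max(repertoire.values()) < 10:
        # Cumulative extension is on: try improving a low-payoff act.
        return REFINE
    # Otherwise exploit the best-known act.
    return EXPLOIT
```

For example, a newborn agent with an empty repertoire returns INNOVATE regardless of which extensions are switched on.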
The errors associated with observation seem different from the last tournament (the new 1.2.4 & 1.4.3). When an OBSERVE move fails, does this mean no act is added to the repertoire (i.e., “individual receives no new behaviour or knowledge of its payoff”)? Last time a failure resulted in a random act being added with errors associated with the observed payoff.
That is correct – the two tournaments are different, and when an OBSERVE move fails, nothing is learned. We received many comments suggesting we had selected an odd way to model social learning errors in the first tournament, one that actually gave an advantage to social learning by causing it to result, effectively, in innovation once in a while.
With observe_who, no information is given on what strategy a model has chosen to EXPLOIT that round? Only when a model(s) is/are actually specified will the OBSERVE move get to see what they do. Also, can one get the observe_who info and then decide to do something else (i.e., if there are no good models)? Or does one get that info only on the condition of committing to OBSERVE?
That’s right, only the information specified in section 1.2.5 is available, and it is only available once the decision has been made to play OBSERVE. The rationale is that the gathering of that information is part of the act of observing itself, so takes time and therefore must be paid for as part of the time cost of playing OBSERVE.
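The commitment rule described above can be sketched as a simple round flow. The function and field names here are illustrative assumptions, not the tournament API; the point is only the ordering: model information arrives after, and only after, the agent commits to OBSERVE.

```python
def play_round(decide_move, get_observe_who_info):
    """Hypothetical round flow illustrating the observe_who timing.
    decide_move and get_observe_who_info are stand-in callables."""
    move = decide_move()             # commit before seeing any model info
    if move != "OBSERVE":
        return move, None            # no model info for other moves
    models = get_observe_who_info()  # paid for by the OBSERVE time cost
    # Even if the models look poor, the agent cannot switch moves now.
    return move, models
```

The design point this illustrates: there is no free peek at the model pool, so the information cost is bundled into the OBSERVE move itself.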
When the basic payoff of an act changes (due to pc), are all refinements of that act lost, such that r resets to 0? Or do the refinement levels carry over to the new act?
The refinement levels remain the same, they are just added to the new basic payoff. Section 1.2.7: “The resultant payoff available to that individual for that act is equal to the basic payoff defined by the environment (which can change; 1.1.2), plus an increment which is a function of the refinement level and is unaffected by basic payoff changes (see 1.4.5 for details of the function).”
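As a worked sketch of the rule in section 1.2.7: the resultant payoff is the (changeable) basic payoff plus an increment that depends only on the refinement level. The increment function below is a placeholder assumption – the real function is defined in section 1.4.5 – but note the increment is rounded to the nearest integer, as stated later in this FAQ.

```python
def refinement_increment(r):
    # Placeholder only: the actual function is given in section 1.4.5.
    # Rounded to the nearest integer, per the rules.
    return round(1.5 * r)

def resultant_payoff(basic_payoff, r):
    # The increment is unaffected by changes to the basic payoff.
    return basic_payoff + refinement_increment(r)

# If the environment changes an act's basic payoff from 8 to 3,
# a refinement level of r = 5 carries over unchanged:
before = resultant_payoff(8, 5)  # 8 + round(7.5) = 16
after = resultant_payoff(3, 5)   # 3 + round(7.5) = 11
```

In other words, the basic payoff change costs the agent exactly the difference in basic payoffs; the refinement bonus is preserved in full.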
When observing a refined act, the observer can get both the basic payoff and the refinement level. Does the observer also get the observed act’s associated r-value, or is r = 0 because the observer has not itself refined the act?
We have added to Section 1.2.3 to clarify this: “In the cumulative case, the act is observed at the same level of refinement as demonstrated by the model, meaning the observer can acquire that level of refinement *and its associated payoff increment, for that act*.”
In all extensions melee, would all refinements of acts be lost if the individual is switched between demes? (For instance, if act #4 has r = 5 in deme one and the individual is switched to deme two, would act #4 still be in the repertoire but with r = 0?)
We didn’t specify this properly, so we have expanded section 1.3.3 to clarify this: “In the spatial case, a number of individuals, nmigrate, are selected at random from each deme, and then each reassigned to a randomly selected deme. **Their repertoire is unaffected by migration – the agent will still know the same acts, at the same refinement levels in the cumulative case. They will not know what the payoffs for those acts are in the new environment**.”
So what you’re saying is that if an individual switches demes, it is as if all the basic payoffs of the acts in its repertoire were simultaneously switched to something else, as would happen for individual acts within a deme via pc? But all the acts would still be refined to their existing levels.
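On that reading, the migration rule from section 1.3.3 can be sketched as follows. The data layout (a dict mapping acts to a known-payoff/refinement pair) is an assumption for illustration only.

```python
def migrate(repertoire):
    """Hypothetical sketch of migration in the spatial case.
    repertoire: {act: (known_payoff, refinement_level)}.
    Acts and refinement levels persist, but the stored payoffs are
    stale -- the new deme assigns its own basic payoffs, much like a
    simultaneous pc-style change hitting every act at once."""
    return {act: (None, r) for act, (_payoff, r) in repertoire.items()}

# Act #4 is still known at r = 5 after migration,
# but its payoff in the new deme must be relearned:
migrated = migrate({4: (17, 5), 9: (2, 0)})
```

So nothing is deleted from the repertoire; only the agent’s payoff knowledge becomes unreliable.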
In the spatial case, is pc the same in all 3 demes within a single, given simulation? Or can each deme have its own value? (I.e., what you learn about pc in one deme tells you nothing about pc in the 2nd or 3rd demes.)
The pc parameter value will be the same across all three demes. The idea of varying it across demes is certainly interesting, but a complexity too far for our present purposes.
If I understand refinement correctly, it adds a bonus payoff to the basic value. However, the refinement function returns non-integer values, while the basic values are all integers (the same as in the first tournament?). Therefore, if you observe a non-integer value you know you have seen a refined act. But the payoff may also have been seen with an error in the estimate. Are the errors assigned to payoff estimates always integer values, or can they also be non-integers? Also, are errors associated only with the basic payoff, such that the refinement addition would be observed without error?
Apologies for the confusion. The payoff increment (bonus payoff) is actually always rounded to the nearest integer. This is now stated plainly in the rules. Regarding errors, when an exploiting agent is observed (successfully), the observing agent receives only two pieces of information – the act performed by the exploiter, and the payoff received by the exploiter. It does not know whether it has observed a refined act or not, nor the level of refinement, so it cannot know which portion of the payoff is attributable to the basic payoff or any refinement increment. The error is applied to the observed payoff.
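Putting the answer above into a sketch: a successful observation yields only the act and an error-laden total payoff, which the observer cannot decompose into basic payoff versus refinement increment. The integer noise model below is an assumption for illustration; the actual error distribution is defined in the rules (section 1.4.3).

```python
import random

def observe(act, basic_payoff, refinement_increment, rng):
    """Hypothetical sketch of a successful observation.
    Returns only (act, observed payoff): the refinement level is
    never revealed, and the error applies to the total payoff."""
    total = basic_payoff + refinement_increment  # what the exploiter earned
    error = rng.choice([-1, 0, 1])               # placeholder noise model
    observed_payoff = max(0, total + error)      # error on the whole payoff
    return act, observed_payoff

rng = random.Random(1)
act, payoff = observe(4, 10, 8, rng)  # observer sees one number, not 10 + 8
```

Because only the total is reported, an observer cannot tell a high basic payoff from a heavily refined act with a modest basic payoff.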