A stimulus set used in experiments by Schaper, Kuhlmann and Bayen (2019a; 2019b). The item pool consists of 96 German words, each naming an object that is typically found either in a bathroom or in a kitchen.

schaper2019

Format

A data frame with 96 rows and 7 variables

item

The name of an object (in German)

room

The room in which the item is typically found; can be 'kitchen' or 'bathroom'

rating_consistent

Rating of how expected it would be to find the item in its typical room

rating_inconsistent

Rating of how expected it would be to find the item in the atypical room

syllables

The number of syllables in the object name

frequency

A value indicating the relative frequency of the object name in the German language (lower values indicate higher frequency)

list

The set (list) to which the item was assigned in the experiments by Schaper et al.

Source

Courteously provided by Marie Luisa Schaper and Ute Bayen.

References

Schaper, M. L., Kuhlmann, B. G., & Bayen, U. J. (2019a). Metacognitive expectancy effects in source monitoring: Beliefs, in-the-moment experiences, or both? Journal of Memory and Language, 107, 95–110. https://doi.org/10.1016/j.jml.2019.03.009

Schaper, M. L., Kuhlmann, B. G., & Bayen, U. J. (2019b). Metamemory expectancy illusion and schema-consistent guessing in source monitoring. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45, 470. https://doi.org/10.1037/xlm0000602

Examples


library(anticlust) # provides schaper2019 and anticlustering()

head(schaper2019)
#>                 item     room rating_consistent rating_inconsistent syllables
#> 1 Feuchtigkeitsmaske bathroom              4.10                1.04         5
#> 2        Damenbinden bathroom              4.22                1.12         4
#> 3          Haarspray bathroom              4.32                1.13         2
#> 4             Tampon bathroom              4.35                1.22         2
#> 5          Badewanne bathroom              4.55                1.02         4
#> 6     Ohrenstaebchen bathroom              4.63                1.26         4
#>   frequency list
#> 1        21    1
#> 2        19    1
#> 3        17    1
#> 4        17    1
#> 5        13    1
#> 6        21    1
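
# Use the four numeric variables as features:
# rating_consistent, rating_inconsistent, syllables, frequency
features <- schaper2019[, 3:6]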

# Optimize the variance criterion
# (tends to maximize similarity in feature means)
anticlusters <- anticlustering(
  features,
  K = 3,
  objective = "variance",
  categories = schaper2019$room,
  method = "exchange"
)

# Means are quite similar across sets:
by(features, anticlusters, function(x) round(colMeans(x), 2))
#> anticlusters: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.44               18.31 
#> ------------------------------------------------------------ 
#> anticlusters: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
#> ------------------------------------------------------------ 
#> anticlusters: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
# Check differences in standard deviations:
by(features, anticlusters, function(x) round(apply(x, 2, sd), 2))
#> anticlusters: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.27                0.06                1.05                2.42 
#> ------------------------------------------------------------ 
#> anticlusters: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.23                0.07                0.84                2.01 
#> ------------------------------------------------------------ 
#> anticlusters: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.24                0.06                0.91                2.75 
# Room is balanced between the three sets:
table(Room = schaper2019$room, Set = anticlusters)
#>           Set
#> Room        1  2  3
#>   bathroom 16 16 16
#>   kitchen  16 16 16

# Maximize the diversity criterion
ac_dist <- anticlustering(
  features,
  K = 3,
  objective = "diversity",
  categories = schaper2019$room,
  method = "exchange"
)
# With the diversity criterion, means tend to be less similar,
# but standard deviations tend to be more similar:
by(features, ac_dist, function(x) round(colMeans(x), 2))
#> ac_dist: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
#> ------------------------------------------------------------ 
#> ac_dist: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.50                1.11                3.44               18.31 
#> ------------------------------------------------------------ 
#> ac_dist: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
by(features, ac_dist, function(x) round(apply(x, 2, sd), 2))
#> ac_dist: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.24                0.07                0.84                2.39 
#> ------------------------------------------------------------ 
#> ac_dist: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.26                0.06                1.05                2.42 
#> ------------------------------------------------------------ 
#> ac_dist: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.26                0.07                0.91                2.43
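
The two criteria used in the examples can also be computed by hand, which may help when comparing partitions. The following is a minimal base R sketch, assuming the usual definitions: the variance criterion is the k-means objective (the sum of squared Euclidean distances between each item and the centroid of its set), and the diversity criterion is the sum of pairwise Euclidean distances between items within the same set. The helper names `variance_crit()` and `diversity_crit()` and the toy data are hypothetical and are not part of the package.

```r
# Hypothetical helpers for illustration; not part of any package.

# Sum of squared Euclidean distances between items and their set centroid
variance_crit <- function(x, groups) {
  x <- as.matrix(x)
  sum(sapply(split(seq_len(nrow(x)), groups), function(idx) {
    centered <- scale(x[idx, , drop = FALSE], center = TRUE, scale = FALSE)
    sum(centered^2)
  }))
}

# Sum of pairwise Euclidean distances between items within the same set
diversity_crit <- function(x, groups) {
  x <- as.matrix(x)
  sum(sapply(split(seq_len(nrow(x)), groups), function(idx) {
    sum(dist(x[idx, , drop = FALSE]))
  }))
}

# Toy data (made up): four items, two features
toy <- data.frame(a = c(1, 2, 3, 4), b = c(1, 1, 2, 2))

mixed  <- c(1, 2, 2, 1) # each set spans the full range of values
sorted <- c(1, 1, 2, 2) # each set contains only neighbouring items

# Higher values mean more heterogeneous sets, which is what
# anticlustering aims for under both criteria.
variance_crit(toy, mixed)
variance_crit(toy, sorted)
diversity_crit(toy, mixed)
diversity_crit(toy, sorted)
```

Applied to the example above, the same helpers could be called as `variance_crit(features, anticlusters)` and `diversity_crit(features, ac_dist)` to compare the two solutions on either criterion.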