Ratings for 96 words — schaper2019 • anticlust

A stimulus set used in experiments by Schaper, Kuhlmann, and Bayen (2019a, 2019b). The item pool consists of 96 German words, each naming an object that is typically found either in a bathroom or in a kitchen.

schaper2019

Format

A data frame with 96 rows and 7 variables

item: The name of an object (in German)
room: The room in which the item is typically found; can be 'kitchen' or 'bathroom'
rating_consistent: Rating of how expected it would be to find the item in its typical room
rating_inconsistent: Rating of how expected it would be to find the item in the atypical room
syllables: The number of syllables in the object name
frequency: A value indicating the relative frequency of the object name in the German language (lower values indicate higher frequency)
list: The set to which the item was assigned in the experiments by Schaper et al.
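
The column layout described above can be checked directly in R. The following is a minimal sketch, assuming the anticlust package is installed and that it makes schaper2019 available once attached. It uses only base R functions on the data set, and no output is shown because the counts per list are not stated on this page.

```r
library(anticlust)  # assumed to provide the schaper2019 data set

# Dimensions and column types; the description above states 96 rows, 7 columns
dim(schaper2019)
str(schaper2019)

# Number of items per room
table(schaper2019$room)

# Cross-tabulate room against the list used in the original experiments
table(Room = schaper2019$room, List = schaper2019$list)
```

The cross-tabulation is a quick way to see how the original lists were composed before creating new sets with anticlustering().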

Source

Courteously provided by Marie Luisa Schaper and Ute Bayen.

References

Schaper, M. L., Kuhlmann, B. G., & Bayen, U. J. (2019a). Metacognitive expectancy effects in source monitoring: Beliefs, in-the-moment experiences, or both? Journal of Memory and Language, 107, 95–110. https://doi.org/10.1016/j.jml.2019.03.009

Schaper, M. L., Kuhlmann, B. G., & Bayen, U. J. (2019b). Metamemory expectancy illusion and schema-consistent guessing in source monitoring. Journal of Experimental Psychology: Learning, Memory, and Cognition, 45, 470. https://doi.org/10.1037/xlm0000602

Examples


library(anticlust)

head(schaper2019)
#>                 item     room rating_consistent rating_inconsistent syllables
#> 1 Feuchtigkeitsmaske bathroom              4.10                1.04         5
#> 2        Damenbinden bathroom              4.22                1.12         4
#> 3          Haarspray bathroom              4.32                1.13         2
#> 4             Tampon bathroom              4.35                1.22         2
#> 5          Badewanne bathroom              4.55                1.02         4
#> 6     Ohrenstaebchen bathroom              4.63                1.26         4
#>   frequency list
#> 1        21    1
#> 2        19    1
#> 3        17    1
#> 4        17    1
#> 5        13    1
#> 6        21    1
features <- schaper2019[, 3:6]

# Optimize the variance criterion
# (tends to maximize similarity in feature means)
anticlusters <- anticlustering(
  features,
  K = 3,
  objective = "variance",
  categories = schaper2019$room,
  method = "exchange"
)

# Means are quite similar across sets:
by(features, anticlusters, function(x) round(colMeans(x), 2))
#> anticlusters: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.44               18.31 
#> ------------------------------------------------------------ 
#> anticlusters: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
#> ------------------------------------------------------------ 
#> anticlusters: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
# Check differences in standard deviations:
by(features, anticlusters, function(x) round(apply(x, 2, sd), 2))
#> anticlusters: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.27                0.06                1.05                2.42 
#> ------------------------------------------------------------ 
#> anticlusters: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.23                0.07                0.84                2.01 
#> ------------------------------------------------------------ 
#> anticlusters: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.24                0.06                0.91                2.75 
# Room is balanced between the three sets:
table(Room = schaper2019$room, Set = anticlusters)
#>           Set
#> Room        1  2  3
#>   bathroom 16 16 16
#>   kitchen  16 16 16

# Maximize the diversity criterion
ac_dist <- anticlustering(
  features,
  K = 3,
  objective = "diversity",
  categories = schaper2019$room,
  method = "exchange"
)
# With the diversity criterion, means tend to be less similar,
# but standard deviations tend to be more similar:
by(features, ac_dist, function(x) round(colMeans(x), 2))
#> ac_dist: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
#> ------------------------------------------------------------ 
#> ac_dist: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.50                1.11                3.44               18.31 
#> ------------------------------------------------------------ 
#> ac_dist: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                4.49                1.10                3.41               18.31 
by(features, ac_dist, function(x) round(apply(x, 2, sd), 2))
#> ac_dist: 1
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.24                0.07                0.84                2.39 
#> ------------------------------------------------------------ 
#> ac_dist: 2
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.26                0.06                1.05                2.42 
#> ------------------------------------------------------------ 
#> ac_dist: 3
#>   rating_consistent rating_inconsistent           syllables           frequency 
#>                0.26                0.07                0.91                2.43
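
The two solutions above are compared through group means and standard deviations. As a complement, each partition can be scored on both criteria. The sketch below is self-contained and makes two assumptions: that the installed anticlust version exports variance_objective() and diversity_objective() taking the data and a cluster vector, and that the exchange method may start from a random assignment, so the values can differ between runs. The names ac_var, ac_div, and scores are illustrative only, and no output is shown.

```r
library(anticlust)

features <- schaper2019[, 3:6]

# Recreate both partitions with the same settings as in the examples above
ac_var <- anticlustering(
  features,
  K = 3,
  objective = "variance",
  categories = schaper2019$room,
  method = "exchange"
)
ac_div <- anticlustering(
  features,
  K = 3,
  objective = "diversity",
  categories = schaper2019$room,
  method = "exchange"
)

# Rows: criterion used for scoring; columns: criterion used for optimizing
scores <- rbind(
  variance = c(
    variance_objective(features, ac_var),
    variance_objective(features, ac_div)
  ),
  diversity = c(
    diversity_objective(features, ac_var),
    diversity_objective(features, ac_div)
  )
)
colnames(scores) <- c("optimized_variance", "optimized_diversity")
scores
```

Because anticlustering() maximizes the chosen criterion, each row is expected to be largest in the column that was optimized for it, although a heuristic such as the exchange method does not guarantee this.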