A standardised set of images for judgements of proportion

In the present work, we present normative data for a set of 39 original clipart-style images that can be used as material in studies involving judgements of proportion. The original images are drawings that depict different day-to-day scenarios (e.g., lighted windows in a building; books on a shelf) and each has seven variants of different proportions (from 20% to 80%) belonging to different categories (discrete vs continuous; social vs non-social; natural vs artificial; stimuli physical dimensions; number of referents). Normative data for these images are presented in an interactive database (available at https://judgment-images-and-norms.shinyapps.io/estimates_interactive/), corresponding to the means of proportion estimates (in percentage form), the perceived ease of making such estimates, the perceived level of familiarity and liking for each image, and the relationships between these variables. In the paper, we analyse the data at an individual level, addressing how the latter judgements are related to the proportion estimates, how those estimates are related to objective proportions, and how these relationships are moderated by image category. The analyses presented in this paper aim to aid readers in selecting images that enable them to better address specific influences on proportional estimates or to control for those influences in their studies.


Introduction
Various types of judgements and decisions involve quantification processes based on the comparison of items varying in quantity in one or more dimensions (e.g., Mussweiler & Epstude, 2009). Visual stimuli (consisting of quantifiable elements) have been widely used in research on magnitude processing, particularly proportion estimates, a fundamental process in the field of quantitative cognition (Beran & Parrish, 2016).
Proportional reasoning is studied using tasks that are based on the process of understanding a relationship (between quantities) as a simple quantity and being able to perform mental operations with it (Lamon, 2005). Researchers typically manipulate one of two dimensions: the number of elements of multiple types (i.e., discrete quantity), or the size of the parts/segments making up a single element, usually geometric shapes (i.e., continuous quantity). Examples of such tasks are those that employ visually delimited displays containing two types of elements varying in relative frequency, with participants being asked to estimate the proportion of one element type in comparison with the other type or to the whole set of elements. A variety of simple geometric elements have been used in these visual array tasks, such as coloured dots (Meert et al., 2012; Philip, 1947; Stevens & Galanter, 1957; Varey et al., 1990), squares (Shuford, 1961), rectangles (Spinillo & Bryant, 1999), spheres (Ginsburg & Rapoport, 1967; Jeong et al., 2007), or vertical and horizontal lines (Shuford, 1961). Other tasks use continuous stimuli, whose quantities are processed analogously. Examples include scaling tasks, which require matching a portion of a target stimulus, which could be a subsegment of a length-line (i.e., number-line estimation task) or an area-shape (e.g., divided bars and pie charts), to its symbolic or non-symbolic proportional equivalent (e.g., Cohen & Sarnecka, 2014; Huttenlocher et al., 1999; Möhring et al., 2018), and relative fullness tasks, which require the participant to estimate the percentage of liquid in a container, like water or coloured liquids in straight glass cylinders (e.g., Hollands & Dyre, 2000; Pearson, 1964; Piaget, 1967; Raghubir & Krishna, 1999; Yang & Raghubir, 2005).
Studies in the domain of proportional reasoning are typically carried out using materials built for the study in question. The specific physical features of those stimuli likely motivate differences in the psychophysical functions that establish the relationship between objective and subjective proportions (see claim by Hollands & Dyre, 2000; Spence, 1990; Zhang & Maloney, 2012). For instance, Boyer et al. (2008) show that children's performance is worse when stimuli map discrete rather than continuous quantities. This stimulus variability is welcome, since it allows us to address relevant commonalities in the definition of a psychophysical function associated with proportion estimates (e.g., Hollands & Dyre, 2000). Nevertheless, the differences between studies have not been systematically addressed, and biases found in the description of processes may be amplified by the lack of variability of stimuli within studies themselves. In most cases, experimental stimuli consist only of simple geometric elements (e.g., arrays or parts of circles, squares, spheres, cylinders). More diverse stimuli, comprising different types of features, may allow researchers not only to understand the general mechanism underlying proportion estimates, but also to control features that can interfere with them and to identify different biases that can be promoted by such features.
Although the preferential use of colour and geometric shapes as material allows for greater experimental control (see Smets et al., 2015), these types of stimuli may also lack generalisation to our social world. Stimuli with social meaning, on the other hand, have the advantage of generalisation. Studies that examine the development of children's ability to reason about proportions frequently employ experimental paradigms relying on stimuli with social meaning, such as bricks (e.g., Spinillo & Bryant, 1999), marbles (Jeong et al., 2007), drawings of pizza (e.g., Singer-Freeman & Goswami, 2001), pie slices (e.g., Spinillo & Bryant, 1991), chocolate pieces (e.g., Singer-Freeman & Goswami, 2001), or volumes of water and juice (e.g., Fujimura, 2001; Schwartz & Moore, 1998). In adults, the use of stimuli with social meaning has been restricted to consumer behaviour, such as beverages (wine, beer, and lemonade) in glasses or bottles (e.g., Attwood et al., 2012; Troy et al., 2018), food on plates (e.g., Kosīte et al., 2019; Wansink & van Ittersum, 2013), or supermarket products in packages (e.g., Chandon & Ordabayeva, 2009; Li et al., 2018). The use of stimuli with social meaning in these domains relates to the social nature of our cognition and the fact that most quantity estimations occur with this type of stimuli. Quantitative cognition, and proportion estimation research in particular, may be strengthened considerably by taking into account that proportion estimates are usually undertaken in a social environment. Specifically, having social environments depicted in stimuli, in which substantial control of the relevant features of the stimulus is still possible, would be an advantage to this field (see suggestion by Barth et al., 2003, that research on human numerical competence must consider the influence of stimulus properties).
In line with this reading of the field, we found a need for a set of visual stimuli that could be made available to the research community, along with normative data associated with different dimensions, which could be easily used in different proportional reasoning studies, allowing for testing of new hypotheses as well as control of undesirable influences in a context that favours comparisons between studies. These normative data are most relevant when stimuli have a social meaning. Specifically, stimuli with social meaning are the most likely to differ in features able to modulate the cues that can be used to infer quantity (Fornaciai et al., 2019), thus needing stronger experimental control.
In this paper, we present normative data for a set of clipart-style social images that can be used as materials in studies involving judgements of proportion, offering the possibility to control for a set of dimensions. We specifically consider dimensions that were previously shown to influence subjective probability estimates, such as stimuli familiarity (e.g., Fox & Levav, 2000), liking (e.g., Bohner et al., 1988; Zajonc, 1968), and response ease (e.g., Suantak et al., 1996). We believe that this set of images will be useful for researchers, allowing them to study different dimensions of proportional reasoning with social stimuli, without having to compromise experimental control. The normative data presented in this paper can also be used in research focusing on the role of these features in the estimation process, clarifying, for instance, how perceptual fluency and liking influence proportion estimates (e.g., Doi & Shinohara, 2016; Schwarz, 2004).

Development of norms for quantifiable images
We developed a set of 39 clipart-style images defining different social settings. These images correspond to proportions represented in different day-to-day scenarios (e.g., lighted windows in a building; books on a shelf), which vary systematically in different dimensions and proportion levels.
All the images define different social settings that may be used in a part-whole type of task, where researchers ask participants to estimate the proportion of one type of element (e.g., lighted windows, green books on a shelf) relative to the total set of elements of various types (e.g., green and red books), or the part of an entity (e.g., volume of water in a bottle, or clouds covering the sky) relative to its total (e.g., the total volume of the bottle, or the whole sky).
Each of the 39 images has seven different proportion level variants, where the proportions represented vary from 20% to 80%. The proportional judgements elicited are in percentage format, ranging from 0 to 100% (e.g., percentage of lighted windows in a building, green books on a shelf, water in a bottle, or clouds in the sky). Percentages are more consistently and quickly perceived by participants (see Goldstone, 1993), and provide uniform data for all images.
Descriptive details regarding the data collected for each image are made accessible to readers via an online interactive database (available at https://judgment-images-and-norms.shinyapps.io/estimates_interactive/). Proportional normative data are presented in the interactive database for each of the seven variants of each image in the form of mean percentage estimates, as well as mean estimate bias associated with each image (i.e., the absolute difference between the estimates and the real proportions depicted in the images). In addition, we provide readers with data regarding the mean of the ease that participants felt while making the estimates, for each of the seven proportion-level variants of each image. The 39 images are further characterised by their perceived levels of familiarity and liking, corresponding to the means of all participants' evaluations for the 50% proportion level. Relationships between these judgements at the level of each image are also provided in the interactive database.
It should be further noted that the images were classified according to their different characteristics in order to consider the influence of different stimulus properties on proportion estimates (see Barth et al., 2003). Specifically, this set of images varies in five categories, allowing for different manipulations within the same set of stimuli, besides varying in the proportion to be estimated. The images vary in the quantity type represented (either continuous or discrete) and in terms of other stimulus characteristics: the presence of social elements (either present or absent in the image); the nature of the stimulus (either natural or artificial); the physical dimensions to be quantified (colour, shape, area, or number of elements); and the number of subsets of elements, either dichotomic (two types of elements) or multiple (more than two types).
In the results section of this paper, we provide readers with information at the individual level of analysis, clarifying how individuals' subjective proportion estimates relate to the objective proportions and to the other judgements associated with the images, and also how these relationships are likely to be qualified by the different image features described above. This should allow researchers to select materials knowing in advance that specific features of the stimuli are likely or unlikely to influence individuals' proportion estimates. The images may be used either for further testing those influences or for controlling for those influences when they are considered undesirable.

Participants
One hundred and eighty English-speaking participants were recruited via the online recruiting platform Prolific (www.prolific.co) [7 January 2020] to take part in this study. Decisions regarding sample size aimed at ensuring sufficient power to perform the analyses at the participant level. This decision was supported by G*Power (Faul et al., 2009), aiming at a .80 power level in detecting a relationship between two variables, and its moderation by a category, with medium effect size (f = 0.25) and an alpha level of .05. In order to be eligible, Prolific users needed to be native English speakers, aged 18 to 40 years, with an approval rate of over 90%. One set of participant data was not used, as the participant responded to the study twice, thus leaving a total of 179 participants (age: M = 26.84, SD = 6.46; 53.63% female). This study was approved by ISPA's internal research ethics committee. All participants provided informed consent to participate and were informed that they could abandon the study at any time.

Stimuli
Original visual stimuli for judgements of proportions were created in clipart style by the authors using Adobe Illustrator CS6 (Adobe Systems Incorporated: San Jose, CA).
The content of the images was developed using a focus group. The group's goal was to put together different ideas for scenarios in which an image may differ in part-whole proportions represented in area or volume, or in specific discrete entities. Images to be selected should provide high variability in scenarios and in different features (see below). An important aspect to be taken into account was that the stimuli should not appear bizarre in one or more of the part-whole proportions. Thirteen scenarios resulted from this process (see Fig. 1 for examples). These scenarios were chosen following different criteria. The first criterion was that scenarios would have to allow for stimuli to vary in terms of either discrete quantity characteristics, based on multiple arrays of entities (i.e., proportion of elements; for instance, percentage of green marbles in a glass container) or continuous quantity characteristics, based on area or volume (i.e., fill proportion; for instance, percentage of dark fur on a pet). Specific criteria were established for the images representing the discrete part-whole proportion of elements. First, all images had to include a large number of total elements to prevent participants from resorting to counting rather than estimating percentages (participants tend to almost immediately quantify up to six items; Kaufman et al., 1949). Furthermore, all the images designed had to depict non-symbolic quantities, without containing symbolic numerical elements, so that this dimension would not interfere with the judgements made. Discrete images have either two types of elements (e.g., percentage of lighted windows against dark windows) or four types of elements (e.g., percentage of yellow marbles against blue, green, and red marbles; see how this may affect quantity judgements in Pelham et al., 1994).
For the images representing continuous part-whole proportions, we ensured that the number of levels of the dimension was either unitary (the target and its complement; e.g., percentage of water in the bottle) or multiple (the complementary area is a compound of multiple elements; e.g., percentage of clouds in the sky).
Three different versions of each of the chosen scenarios were designed. More specifically, each version corresponded to a different exemplar of the setting, differing in general colours and stimuli disposition or type (e.g., for the percentage of fruit in a bowl scenario, three different pieces of fruit were used for each version).
For each of the 39 images, seven levels of part-whole proportion of fill or elements (depending on the stimuli characteristics) were created. The percentages depicted in the images varied from 20% to 80% in intervals of 10%. For the discrete stimuli images, the total number of elements was kept constant, with elements altered to result in the intended percentage of the stimuli. For the continuous stimuli images, we ran a script in Adobe Illustrator which allowed us to calculate the area of a selected vector object. The total area of fill was obtained through this method and the stimuli were altered to correspond to the intended percentage. In sum, a total of 273 original images of equal dimensions (189 × 189 cm) were created.
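The stimulus counts described above can be checked with a short sketch. The scenario and version indices, the helper `target_count`, and the total of 50 elements are hypothetical and used only for illustration; the actual element totals are not reported in this section.

```python
from itertools import product

N_SCENARIOS = 13                   # scenarios produced by the focus group
N_VERSIONS = 3                     # exemplars drawn per scenario
LEVELS = list(range(20, 81, 10))   # objective proportions: 20%, 30%, ..., 80%


def enumerate_stimuli():
    """Return one (scenario, version, level) tuple per stimulus."""
    scenarios = range(1, N_SCENARIOS + 1)
    versions = range(1, N_VERSIONS + 1)
    return list(product(scenarios, versions, LEVELS))


def target_count(total_elements, level_percent):
    """Hypothetical helper: number of target elements needed in a discrete
    image so that the targets make up `level_percent` of a fixed total."""
    return round(total_elements * level_percent / 100)


stimuli = enumerate_stimuli()
print(len(LEVELS), "levels per image;", len(stimuli), "stimuli in total")
print(target_count(50, 30), "target elements out of an assumed total of 50")
```

With 13 scenarios, three versions, and seven levels, the enumeration yields the 39 images and 273 stimuli reported in the text.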
All images were drawn in the same clipart style: elements were derived from simple vector objects and shapes and manipulated using the Adobe Illustrator pen and selection tool, to produce simple images with little detail. Images are coloured, and mainly use tones of red, green, blue, and/or yellow, provided by the Adobe Illustrator colour guide. For some images, these are combined with neutral colours, such as elements of grey (e.g., marble container), brown (e.g., land area; skin), black (e.g., night sky), and white. Images generally contain a white background, except for those images which require an alternative background for emphasising the stimuli (e.g., cloud-covered sky, night sky for lighted windows). The colours and minimalist design were chosen so that the images resemble those found in today's illustration and design industries, to which we are exposed every day.

Design and procedure
Data were collected through Qualtrics (Qualtrics: Provo, UT) and participants accessed the survey through Prolific (www.prolific.co) [7 January 2020]. Participants were randomly and equally assigned to one of three groups of images to evaluate. Each group contained all 13 scenarios, but only one version of each with the respective seven proportion levels. Thus, each participant saw 91 images. The order of image presentation was randomised and differed for each participant in each group. Firstly, participants either consented to participate voluntarily and entered the study or opted to abandon the study and were immediately redirected to the end of the survey.
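The assignment and randomisation scheme can be sketched as follows. This is only an illustration under stated assumptions: the study used Qualtrics for randomisation, and the function names, seeds, and the mapping of each group to a single version index are hypothetical.

```python
import random


def assign_groups(participant_ids, n_groups=3, seed=0):
    """Shuffle participants, then deal them round-robin into equal groups."""
    rng = random.Random(seed)
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: (i % n_groups) + 1 for i, pid in enumerate(ids)}


def presentation_order(version, rng, n_scenarios=13, levels=range(20, 81, 10)):
    """Build the trials for one group (one version of every scenario at
    every proportion level) and shuffle them for a single participant."""
    trials = [(scenario, version, level)
              for scenario in range(1, n_scenarios + 1)
              for level in levels]
    rng.shuffle(trials)
    return trials


groups = assign_groups(range(180))
first_participant_trials = presentation_order(groups[0], random.Random(1))
print(len(first_participant_trials), "trials for the first participant")
```

Each participant receives 13 × 7 = 91 trials, and the 180 recruited participants split into three groups of 60.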
For the first task, participants were asked to estimate the percentages (0 to 100%) of several visual items as spontaneously as possible. The images were displayed, one at a time, in the centre of the screen below the question "What is the percentage of (stimuli)?" (e.g., "What is the percentage of yellow marbles?"). Underneath the image, participants were presented with a text box with the instruction "Estimate the percentage", where they were asked to provide a number from 0 to 100. On the same screen, they were also asked to rate how easy it was to give the estimation, using a seven-point scale from 1 = very hard to 7 = very easy. This was repeated for all 91 images of the group in random order.
Following this task, participants rated the familiarity and liking of each of the 13 images that they had seen (only the 50% proportion levels of each image were shown), using two seven-point scales, 1 = unfamiliar to 7 = very familiar and 1 = dislike to 7 = like a lot, respectively. Participants were explicitly told that the images that they were rating for familiarity had all been presented in the previous task, in order to control for the influence of previous exposure on familiarity ratings¹. Once again, the images were displayed one at a time, in the centre of the screen, in random order. On the same screen, both rating scales appeared below, underneath the question "how familiar is this item?" for the familiarity rating, and "how much do you like this item?" for the likeability rating.
Final demographic questions were then asked, and participants were thanked. All participants received monetary compensation for their participation. The study lasted approximately 15 min.

Scores in the online interactive database
The 39 images (i.e., three versions of the 13 scenarios) belong to different, but not all orthogonal, categories. Therefore, each image was classified in the interactive database with regard to (a) the type of quantity: either discrete or continuous; (b) presence of social elements: either present, if the image depicted people (e.g., images portraying a receding hairline in a person) or absent, if the image depicted only objects (e.g., images portraying a container with marbles); (c) nature of the stimulus: either natural (e.g., dark fur) or artificial (e.g., green book); (d) the stimuli dimensions that varied in proportion: for discrete stimuli, these varied in amount (e.g., lighted windows), colour (e.g., yellow marbles), or shape (e.g., star-shaped beads), and for continuous stimuli, these were coded as varying in a unitary area (e.g., baldness, as the images portrayed a receding hairline, and so the bald spot increased as a whole) or multiple areas (e.g., forested area, as bare land area would decrease as different patches of trees appeared spread out over the image); (e) the number of referents (i.e., total types of elements) present for discrete stimuli images: either two (e.g., percentage of lighted windows against dark windows) or four (e.g., percentage of yellow marbles against blue, green, and red marbles).
Each of the 273 stimuli (39 images × 7 proportions) was scored for estimated proportion (as a percentage), and the associated estimation ease. Normative scores were calculated for each image by directly averaging individuals' proportion estimates (0-100) and ratings of estimation ease (1-7).
Additionally, the 273 stimuli were scored in terms of the bias associated with the individuals' proportion estimates (i.e., estimate bias). This score corresponded to the mean, across participants, of the absolute difference between the objective proportion presented in the image and the proportion estimated by each participant.
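A minimal sketch of these two scores for a single stimulus is given below. The function names and the four example responses are made up for illustration and are not data from the study.

```python
from statistics import mean


def mean_estimate(estimates):
    """Normative estimate: the plain average of participants' estimates."""
    return mean(estimates)


def estimate_bias(estimates, objective):
    """Estimate bias: mean absolute difference between each participant's
    estimate and the objective proportion shown in the image."""
    return mean(abs(estimate - objective) for estimate in estimates)


# Made-up responses of four participants to a stimulus depicting 30%.
responses = [25.0, 35.0, 40.0, 30.0]
print(mean_estimate(responses), estimate_bias(responses, 30.0))
```

Because the differences are taken in absolute value, over- and underestimates do not cancel out: a stimulus can have a mean estimate close to the objective proportion and still show a sizeable estimate bias.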
Each of the 39 images presented in this normative study was also scored in terms of two image features (perceived familiarity and liking). These scores were derived by averaging the ratings of familiarity (1-7) or liking (1-7), respectively, provided by individuals for the 50% level of proportion of each image version.
All image information and scores were compiled into an interactive database run as a web application (app). This was accomplished using the R Shiny package. The web app is currently launched and hosted by the Shiny apps platform and can be accessed at https://judgment-images-and-norms.shinyapps.io/estimates_interactive/. Each row of the data table concerns one image, where the image name (i.e., "Image"), image scenarios (i.e., "Group"), amount of proportion fill (i.e., "Proportion"), short description (i.e., "Description"), and stimulus within the image used for the estimation judgement in our experiment (i.e., "Judgement Stimulus") are referred to for image identification. The following columns refer to the image classifications, as described above (i.e., "Discrete|Continuous", "Nature of Stimulus", "Dimension of Stimulus", "Total Referents", and "Social Elements"). Finally, all normative data for mean scores (i.e., "M"), standard deviations (i.e., "SD"), and low and high confidence intervals (i.e., "low CI", "high CI") of familiarity (i.e., "Familiarity"), liking (i.e., "Liking"), estimate (i.e., "Estimate"), estimate bias (i.e., "Bias"), and ease (i.e., "Ease") are presented. The last columns represent the relationship found between the different judgements and estimates (in the form of Pearson correlation coefficients and respective p-values) at each individual image level (e.g., Pearson correlation coefficient of familiarity and liking is represented as "FamiliarityRLiking" and the respective p-value as "FamiliarityRLiking_p").
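For readers who want to reproduce a coefficient such as "FamiliarityRLiking" from raw ratings, the following sketch computes a Pearson correlation for one image. The function name and the ratings are hypothetical, and the p-value (which requires the t distribution) is deliberately left out.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Made-up familiarity and liking ratings (1-7) for a single image.
familiarity = [6, 7, 5, 6, 4]
liking = [4, 5, 3, 5, 2]
print(round(pearson_r(familiarity, liking), 3))
```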
The interactive database allows for the following: exact word search in all fields or inside specific columns; numerical or alphabetical ordering of values in each column; restriction of particular score fields between specific numeric intervals using a slider feature; unlimited combination of specific searches and numeric restrictions to identify the best image to be used. All images are available for download in jpeg format in a folder at https://osf.io/4cs6k/?view_only=04e5cedf790f4f88ab04c5e40f70b3ab.
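The same kind of selection can be applied offline to an exported copy of the table. The sketch below assumes the rows are available as dictionaries; the function `select_images`, the column names "Bias_M" and "Ease_M", the image names, and all values are hypothetical, and only "Discrete|Continuous" is a column name taken from the description above.

```python
def select_images(rows, exact=None, ranges=None):
    """Keep rows that match every exact value and fall inside every
    inclusive numeric range, mimicking the search and slider filters."""
    exact = exact or {}
    ranges = ranges or {}
    selected = []
    for row in rows:
        if any(row.get(col) != value for col, value in exact.items()):
            continue
        if any(not (low <= row[col] <= high)
               for col, (low, high) in ranges.items()):
            continue
        selected.append(row)
    return selected


# Made-up rows; real values should be read from the interactive database.
rows = [
    {"Image": "A1_50", "Discrete|Continuous": "Discrete", "Bias_M": 4.1, "Ease_M": 5.0},
    {"Image": "B2_50", "Discrete|Continuous": "Continuous", "Bias_M": 9.3, "Ease_M": 4.2},
    {"Image": "C3_50", "Discrete|Continuous": "Discrete", "Bias_M": 12.7, "Ease_M": 3.1},
]
low_bias_discrete = select_images(
    rows,
    exact={"Discrete|Continuous": "Discrete"},
    ranges={"Bias_M": (0.0, 10.0)},
)
print([row["Image"] for row in low_bias_discrete])
```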

Additional analyses at the participant level
Besides providing normative scores for each image, we also analysed the data in order to understand how specific features and categories of an image interfere with individuals' estimates, namely, the relationship between objective and subjective proportions (a proxy of a linear psychophysical function). We defined this function with the slope indexing sensitivity (i.e., the ability to discriminate different levels of actual proportions) and the intercept indexing the general response bias (i.e., the tendency to provide high or low estimates).
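The two indices can be illustrated with an ordinary least-squares line fitted to one set of estimates. This is a simplified sketch and not the mixed-model analysis reported in this paper; the function name, the example estimates, and the choice to centre the objective proportion at 50% (so that the intercept is the predicted estimate at 50%) are assumptions made for illustration.

```python
def fit_line(objective, estimates, center=50.0):
    """Least-squares fit of estimates on (objective - center).

    Returns (slope, intercept): the slope indexes sensitivity and the
    intercept is the predicted estimate when the objective equals `center`.
    """
    n = len(objective)
    x = [value - center for value in objective]
    mean_x = sum(x) / n
    mean_y = sum(estimates) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, estimates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


levels = [20, 30, 40, 50, 60, 70, 80]
# Made-up estimates compressed towards the middle of the scale.
compressed = [35, 40, 45, 50, 55, 60, 65]
slope, intercept = fit_line(levels, compressed)
print(slope, intercept)
```

A slope of 1 with an intercept of 50 would indicate estimates that track the objective proportions exactly; a slope below 1, as in the made-up data, indicates reduced sensitivity.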
First, we provide descriptive statistics for different measures (i.e., estimate, estimate bias, ease, familiarity, and liking). Second, we characterise the relationship between these different measures using a mixed-model approach, with specific images as random factors. Third, we address the relationship between objective and subjective proportions, assessing levels of sensitivity and response bias using a linear mixed-effects analysis, with specific images as random factors. Finally, we focus on how image categories moderate all the previously studied relationships. Our general approach was to estimate different mixed models where categories are used as moderators of the intercepts and of the slopes of the main relationship.

Descriptive statistics
Normative data for three different aggregated measures (estimate, estimate bias, ease) of the 272 stimuli and for the aggregated ratings of familiarity and liking of the 39 images are summarised in Table 1. We further summarise these measures by gender, providing means in Table 1 and signalling when they achieved significance². This allows researchers to take gender differences into account when this could be relevant to their studies.
These summaries reveal that all images were evaluated as generally neutral, M = 4.04, with liking ratings ranging from 2.18 to 5.25, and perceived as reflecting very familiar events, M = 5.95, with familiarity ratings ranging from 5.03 to 6.40. In general, participants reported the estimation task as not being difficult, M = 4.24. However, there are clear differences between the perceived difficulty in estimation of the different images, considering that the values range from 2.43 to 5.39.
Estimates ranged from 13.88 to 92.34, and the mean estimates were correctly around 50%, M = 51.54, reflecting the objective range of percentages presented. Congruently, the estimate bias was relatively low, at 11.54, ranging from 2.36 to 28.78.

How do the judgements and estimates relate to one another?
In order to inform researchers as to how individual estimates and judgements relate to one another and also how the different specific images define those relationships, we ran a set of mixed-model analyses (GAMLj package with support of the jamovi platform, version 1.2, The jamovi project, 2020), with the specific images as a random factor. We test for the random intercepts to control for differences between how images were evaluated. We test for random first-order interaction factors (a two-way interaction) to determine whether images vary in terms of the relationships approached in each analysis. We use the likelihood ratio test (LRT) to inform readers about possible differences in the focused relationship between different specific images³.
The two-way interaction component indicates that images tend to vary in how strongly participants' familiarity and liking ratings are related; SD = 0.09; LRT: chi-square (2) = 5.45, p = .060. This suggests that it is possible for researchers to choose materials for which these two factors are more strongly or weakly related (individual image data for each relationship are reported in the interactive database).

Liking and ease
There is a relationship between how much an image is liked and the ease with which the estimation is made; slope ± SE = 0.17 ± 0.02; t(37.3) = 8.57, p < .001. This relationship is not moderated by the random factor, suggesting that the relationship tends to be the same across all images.

Familiarity and ease
Perceived familiarity and ease are positively related; slope ± SE = 0.04 ± 0.02; t(37.3) = 1.89, p = .033. This relationship is also not moderated by the random factor.

Estimate and liking
Estimates (i.e., subjective proportions) are related to liking; slope ± SE = 0.53 ± 0.24; t(40.4) = 2.21, p = .035, with higher estimates being associated with better-liked images (and vice versa). The effect is moderated by the image random factor; it is less strong for some images than for others (see correlations in the interactive database), since the random factor moderates the slopes of those relationships, SD = 1.07; LRT: chi-square (2) = 13, p = .002.

Estimate and ease
The estimates relate positively to the ease felt; slope ± SE = 3.80 ± 0.28; t(39.2) = 13.6, p < .001, with higher estimates being felt as easier to make. The interaction with the random image factor shows that this relationship is stronger for some images than others (see correlations in the interactive database), SD = 1.52; LRT: chi-square (2) = 85.3, p < .001.

Estimate bias and judgements
None of the judgements (i.e., perceived familiarity, liking, and ease) is related to the estimate bias (all ps > .12).

Relationship between objective and subjective proportions
Each image was presented in seven different levels of proportions to be estimated. Here, we examine the relationship between these objective proportions and subjective proportions (i.e., estimates) within a mixed-model analysis, with the specific images as a random factor. We test for a random intercept effect to determine whether some images lead to higher estimates than others. We test for a two-way interaction with the random factor to understand whether some images are more prone to general response bias than others.
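As a reading aid, a model of this kind can be written in the standard random-intercept, random-slope form. The notation below is an assumption introduced for illustration and is not taken from the analysis output; in particular, whether the objective proportion was centred is not stated in this section.

```latex
% y_{ij}: estimate given in observation i for image j
% x_{ij}: objective proportion shown in that observation
% u_{0j}: random intercept of image j
% u_{1j}: random slope of image j (the "two-way interaction" with the random factor)
\begin{equation*}
  y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j})\, x_{ij} + \varepsilon_{ij},
  \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{equation*}
```

Under this formulation, the fixed slope corresponds to overall sensitivity, the fixed intercept to the general response bias, and the standard deviations of the two random terms describe how much images differ from one another on each.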
Results show that participants' estimates covary with the objective proportion presented in the image; slope ± SE = 0.97 ± 0.03; t(39) = 35.20, p < .001, with the intercept of this relationship being near the 50% value 4 (51.49; SE = 1.21; t(40.4) = 2.21, p = .035). The intercept of the random factor is significant; SD = 7.63; ICC = 0.226; LRT: chi-square (1) = 3878, p < .001, suggesting that some images promote higher estimates than others. We also found a two-way interaction with the random factor, suggesting that the image factor interferes with the slope of the relationship (SD = 0.168; LRT: chi-square (2) = 822, p < .001). This suggests that some images promote better estimates across all levels than others (see mean differences in estimate bias). Thus, researchers should choose their images carefully, because images differ in how susceptible they are to general response bias and thus in the sensitivity to objective proportions that they allow. For instance, image PD1_50 is the image on the interactive database with the lowest mean value of estimate bias, with the mean values of estimate bias for each of the seven proportion variants of this scenario and image version ranging from 2.36 (PD1_50) to 8.49 (PD1_20). On the other hand, image PC3_50 has a mean estimate bias of 8.5, with the mean values of estimate bias for each of the seven proportion variants of this scenario and image version ranging from 6.43 (PC3_80) to 12.09 (PC3_60). Moreover, readers can look at the correlations between estimates and estimate bias (i.e., "EstimateRBias"), along with the respective p-values (EstimateRBias_p), which inform the reader about the slopes that operationalise how the degree of bias depends on the magnitude of the estimate. More specifically, these are either non-significant (i.e., bias is constant), positive (i.e., higher estimates are associated with higher bias), or negative (i.e., higher estimates are associated with lower bias). As an example, it can be seen in the interactive database that by selecting the image scenario "PD" and version "1" (PD1) for a future study, we would be selecting an image for which those who provide higher estimates also present less bias, which suggests that the bias promoted by the image is an underestimation. On the contrary, by selecting the image scenario "BA" and version "3" (BA3), we would be selecting an image for which the bias increases for higher estimates, which suggests that the bias promoted by the image is an overestimation.
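The link between the sign of the estimate-bias correlation and the direction of the bias can be illustrated with a minimal sketch. It assumes that estimate bias is the absolute deviation of an estimate from the objective proportion; this definition, the function names, and the toy estimates are assumptions made for illustration and are not taken from the interactive database.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def estimate_bias(estimates, objective):
    """ASSUMPTION: bias is the absolute deviation from the objective proportion."""
    return [abs(e - objective) for e in estimates]


objective = 50  # objective proportion shown in a hypothetical image (%)

# Hypothetical estimates: one image is mostly underestimated, the other overestimated.
underestimated = [30, 35, 40, 45, 48]
overestimated = [52, 55, 60, 65, 70]

r_under = pearson_r(underestimated, estimate_bias(underestimated, objective))
r_over = pearson_r(overestimated, estimate_bias(overestimated, objective))

# Negative r: higher estimates are closer to the truth, so the image is underestimated.
# Positive r: higher estimates are further from the truth, so the image is overestimated.
print(r_under < 0, r_over > 0)
```

Under this assumption, a negative correlation arises when most estimates fall below the objective proportion, and a positive one when most fall above it, which is the reading applied to PD1 and BA3 above.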

Differences promoted by image categories
In this section, we analyse differences between image categories in terms of judgements and proportion estimates, and also how they modulate the previously studied relationships, adding them independently, as moderators, to the previously defined mixed models 5. For this last analysis we also test for random second-order interaction effects (three-way interaction), to determine whether the category moderation effect was the same or different for all the images.
The categories analysed correspond to those described above: type of quantity (discrete versus continuous); social elements (presence versus absence); nature of the stimuli (artificial versus natural); number of referents (dichotomic versus more than two levels); and stimuli dimensions (for discrete stimuli, varying in amount, colour, or shape; for continuous stimuli, varying in a unitary area or in multiple areas).
Below, we summarise our analyses, firstly regarding (a) how each category influences judgements and estimates and (b) how each category moderates the relationships between judgements, and secondly, regarding (c) how each category influences the relationship between objective and subjective proportions (see Fig. 2).
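As a reading aid, the moderation models summarised in this section can be written schematically as follows. This is only a sketch under notational assumptions that are not taken from the original model specification: the symbols, the centring of the objective proportion at 50%, and the treatment of image as the sole random factor are all assumed here. $y_{ij}$ denotes estimate $i$ given for image $j$, $P_{ij}$ the objective proportion, and $C_j$ a dichotomous category code.

```latex
\begin{align*}
y_{ij} &= \beta_0 + \beta_1 (P_{ij} - 50) + \beta_2 C_j + \beta_3 (P_{ij} - 50)\, C_j \\
       &\quad + u_{0j} + u_{1j} (P_{ij} - 50) + \varepsilon_{ij}, \\
u_{0j} &\sim \mathcal{N}\bigl(0, \sigma^2_{u_0}\bigr), \qquad
u_{1j} \sim \mathcal{N}\bigl(0, \sigma^2_{u_1}(C_j)\bigr), \qquad
\varepsilon_{ij} \sim \mathcal{N}\bigl(0, \sigma^2_{\varepsilon}\bigr).
\end{align*}
```

In this notation, $\beta_3$ corresponds to the moderation of the objective-subjective slope by the category, and letting the random-slope variance depend on $C_j$ is one way to read the separate SDs reported per category level in the three-way interactions with the random factor.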

Type of quantity
Discrete and continuous images do not differ in terms of familiarity but differ in terms of liking (continuous images are liked best: effect ± SE = −0.48 ± 0.18; t(36.9) = −2.71, p = .010). Differences in estimation ease are also significant; effect ± SE = −0.230 ± 0.13; t(270) = −3.90, p < .001, suggesting that estimations are easier for continuous images. Discrete and continuous images also differ in the mean proportion estimates that they promote (estimates are lower for continuous images; effect ± SE = 6.50 ± 2.55; t(270) = 2.55, p = .011).
For all of the analyses presented above, the null effect of the image random factor suggests that these differences are similar for any pair of images drawn from the two categories.
The discrete-continuous factor also interferes with the relationship between objective and subjective proportions; slope ± SE = −0.09 ± 0.01; t(16124) = 8.12, p < .001. Simple effects analysis shows that both slopes are close to 1, but that the slope is lower for the continuous material (slope ± SE = 0.93 ± 0.007) than for the discrete material (slope ± SE = 1.02 ± 0.008). This suggests that an increase in the proportion levels reduces the sensitivity to objective proportions to a greater extent for the continuous images than for the discrete ones. This interaction is further moderated by the image random factor, suggesting that this pattern of differences may be dependent on the specific images selected; SD discrete = 0.119, SD continuous = 0.200; LRT: chi-square (5) = 826, p < .001 (see mean correlations in the interactive database).
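To make the simple-slopes reading concrete, the following toy sketch builds noise-free "estimates" from the two simple slopes reported above and recovers them by ordinary least squares. The seven proportion levels in 10-point steps, the centring at 50%, the intercept of 50, and the helper name `ols_fit` are assumptions made for illustration; the sketch does not reproduce the mixed model itself.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# ASSUMPTION: seven objective proportions from 20% to 80%, centred at 50%.
levels = [20, 30, 40, 50, 60, 70, 80]
centred = [p - 50 for p in levels]

# Toy data generated from the reported simple slopes (no noise, intercept fixed at 50).
discrete = [50 + 1.02 * c for c in centred]
continuous = [50 + 0.93 * c for c in centred]

slope_discrete, _ = ols_fit(centred, discrete)
slope_continuous, _ = ols_fit(centred, continuous)

# The interaction term is the difference between the two simple slopes.
interaction = slope_continuous - slope_discrete

print(round(slope_discrete, 2), round(slope_continuous, 2), round(interaction, 2))
```

In this toy setting the difference between the two simple slopes is −0.09, the same magnitude as the interaction slope reported above. A slope below 1 means that estimates change less than the objective proportion does, that is, lower sensitivity.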

Presence of social elements
Images with no social elements and images with social elements do not differ in terms of familiarity and ease. However, they differ in liking (effect ± SE = −0.478 ± 0.01; t(39) = −2.22, p = .032), with individuals liking images with no social elements more than they liked images with social elements. A two-way interaction with the random factor, SD = 1.22; LRT: chi-square (2) = 7.97, p = .019, suggests that the effect is not the same for all images, possibly because there is greater variability in the means of liking for social images than non-social images (see correlations in the interactive database). Again, this variability allows researchers to select images that are equivalent with regard to liking in this category. There are no differences between images with no social elements and images with social elements regarding the proportion estimates made or estimate bias.
Regarding the previously analysed relationships, this category moderates the relationship found between ease and estimate bias; slope ± SE = −0.01 ± 0.01; t(16002) = −2.49, p < .013. Simple effects analysis shows that the slope is positive (slope ± SE = 0.01 ± 0.01) for images with no social elements and the relationship is null (slope = 0, n.s.) for images with social elements. The interaction is further moderated by the image random factor (SD no social elements = 0.02, SD social elements = 0.01; LRT: chi-square (5) = 55.9, p < .001), suggesting that this pattern of differences may be dependent on the specific images selected (see correlations in the interactive database).
This category further interferes with the relationship between objective and subjective proportions, slope ± SE = −0.208 ± 0.01; t(16123.1) = −15.97, p < .001. Simple effects analysis shows that the slope is smaller (slope ± SE = 0.811 ± 0.01) for images with social elements than for images with no social elements (slope ± SE = 1.020 ± 0.006). More specifically, sensitivity to real proportions is lower for images with social elements than for images with no social elements. The interaction is moderated by the image random factor (SD no presence of social elements = 0.12, SD presence of social elements = 0.20; LRT: chi-square (5) = 30.60, p < .001), calling attention to the fact that some images promote more bias than others (see means of estimate bias in the interactive database).

Fig. 2 Relationship between objective and subjective proportions for each category. The slope of the relationship between subjective and objective proportions reflects the amount of bias promoted by each category under analysis. For most of the categories, biases are not major, but may go in different directions. Higher variability is obtained with different stimuli dimensions, with different shapes reducing accuracy

Nature of the stimuli
This category did not promote differences in liking, familiarity, or ease. We found differences in estimates associated with this category; effect ± SE = −7.08 ± 2.31; t(22) = −3.06, p = .006. More specifically, participants overestimate natural stimuli and underestimate artificial stimuli. There was no interaction with the random effect, which indicates that this occurs equally for all images presented.
Regarding the previously analysed relationships, this category moderates the relationship between ease and estimate bias; slope ± SE = −0.01 ± 0.01; t(16058) = −4.45, p < .001. Simple effects analysis shows that the slope is positive (slope ± SE = 0.01 ± 0.01) for artificial stimuli and negative (slope ± SE = −0.01 ± 0.01) for natural stimuli. However, the three-way interaction with the random factor shows that this depends on the images considered, with the effect being stronger for some images than others (SD artificial = 0.02, SD natural = 0.01; LRT: chi-square (5) = 49.2, p < .001; see correlations in the interactive database).
Natural and artificial stimuli also moderate the relationship between subjective and objective proportions; slope ± SE = −0.139 ± 0.01; t(16124.9) = −11.54, p < .001. Simple effects analysis shows that the slope is steeper for artificial stimuli (slope ± SE = 1.01 ± 0.029) than for natural stimuli (slope ± SE = 0.86 ± 0.044). This suggests that artificial stimuli promote higher sensitivity to objective proportions than natural stimuli. The three-way interaction with the random factor is not significant, suggesting that this effect generalises across all images.

Number of referents
The number of referents affected neither ratings of familiarity nor ratings of liking. However, this category impacted ratings of ease (effect ± SE = −0.291 ± 0.10; t(124) = −2.84, p = .005), suggesting that dichotomic images made the task easier in comparison with images with more referents. The random factor two-way interaction is reliable; SD = 0.896; LRT: chi-square (2) = 18.6, p < .001, and revealed that images with more referents can be perceived as either easy to estimate or difficult to estimate (see the interactive database), whereas dichotomic images are consistently reported to be easy to estimate (M = 4.26; SD = 0.28). This category did not promote differences in subjective estimates.
Finally, this category moderates the relationship between objective and subjective proportions; slope ± SE = 6.31 ± 2.87; t(11.5) = 2.20, p = .049. Simple effects analysis shows that the slope is higher (slope ± SE = 1.05 ± 0.051) for dichotomic stimuli than for stimuli with more referents (slope ± SE = 0.98 ± 0.051). This indicates that individuals are better at estimating proportions in dichotomic stimuli. This effect is stronger for some images than others; SD dichotomic = 0.04, SD more referents = 0.43; LRT: chi-square (5) = 22.6, p < .001, suggesting that differences promoted by the number of referents are more likely found with some images than others (see correlations in the interactive database).

Stimuli dimensions
Stimuli dimensions promoted differences in familiarity and liking judgements. Specifically, these differences were found to be associated with the shape dimension (see Table 2 and interactive database). Shape was perceived to be less familiar (F(4, 38.9) = 3.25, p = .021) and liked less (F(4, 38.9) = 12.5, p < .001) than other stimuli dimensions. Image as a random factor did not interfere with these effects (see Tables 3, 4, 5 and 6).
This category moderates the relationship between ease and estimate, F(4, 7.23) = 18.8, p < .01. Specifically, this relationship is stronger for the colour dimension and weaker for the amount dimension (although still significant, p < .001). Furthermore, this category moderates the relationship between ease and estimate bias (F(4, 1462) = 20.3, p < .010). Specifically, this relationship is stronger for the colour dimension and weaker for the multiple areas dimension (although still significant, p < .001). Image as a random factor did not interfere significantly with these effects.
Finally, this category moderates the relationship between objective and subjective proportions, F(4, 16124.7) = 150.00, p < .001. Simple effects analysis clarifies that only the amount (slope ± SE = 1.055 ± 0.01) and area (slope ± SE = 0.971 ± 0.01) slopes are close to 1. Slopes are lower than 1 for multiple areas (slope ± SE = 0.870 ± 0.01) and shape (slope ± SE = 0.637 ± 0.02), and higher than 1 for colour (slope ± SE = 1.157 ± 0.01). This suggests that researchers may choose the dimension to be estimated knowing which dimensions increase or decrease estimation accuracy. The three-way interaction with the random factor was not significant, suggesting that image as a random factor did not interfere significantly with these effects.
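As a simple aid for comparing dimensions, the reported simple slopes can be ranked by their distance from 1, the value that would indicate perfect sensitivity to objective proportions. The sketch below uses only the slope values reported above; the ranking criterion (absolute deviation from 1) and the variable names are assumptions made for illustration, not an analysis performed in this paper.

```python
# Simple slopes reported in the text for each stimuli dimension.
slopes = {
    "amount": 1.055,
    "area": 0.971,
    "multiple areas": 0.870,
    "shape": 0.637,
    "colour": 1.157,
}

# ASSUMPTION: distance from a slope of 1 is used as a rough index of lost sensitivity.
deviation = {name: round(abs(value - 1.0), 3) for name, value in slopes.items()}

# Dimensions ordered from closest to 1 to furthest from 1.
ranked = sorted(deviation, key=deviation.get)

for name in ranked:
    direction = "above 1" if slopes[name] > 1 else "below 1"
    print(f"{name}: slope {slopes[name]:.3f} ({direction}, deviation {deviation[name]:.3f})")
```

On this criterion, area and amount lie closest to 1 and shape lies furthest, in line with the simple effects described above.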

Discussion
In this paper, we have presented normative data associated with a set of visual stimuli that depict social images and allow for judgements of proportion within a part-whole task. These stimuli were developed as an addition to traditional stimuli (mostly compositions of simple geometric elements) used in a broad range of research topics in proportional reasoning studies. The new material adds social relevance to those stimuli (or real-world correspondence), while at the same time providing the capacity to control for other dimensions that are here shown to be relevant for increasing or reducing the bias of estimates. It should be noted, however, that no comparisons were made in the present work between these stimuli and the typical stimulus sets used in prior work, as our goal was first and foremost to provide normative data for these newly developed visual stimuli depicting social images. Future research should nevertheless take this aspect into consideration, as comparisons between new and traditionally employed stimuli could prove highly valuable in the advancement of the field.
All images under analysis in this paper are readily downloadable, and all information related to them is searchable within an online interactive database. We provide researchers with these materials to facilitate the advancement of research on this topic, since developing these kinds of materials can be time-consuming, even requiring specific skills.
Data available in the online interactive file enable a proper selection of materials for future studies. In selecting materials, researchers should take into account the analysis performed at the individual level in this paper. The category of the image may itself impact either the proportion estimates or their relationship with the objective proportions. Importantly, when the images, added to the analysis as a random factor, are shown to significantly interact with the impact of a category, it means that there is sufficient variability in the image set and that researchers may be able to find images that either strengthen or control the effect at stake. For instance, when data show that discrete and continuous stimuli differ in terms of the sensitivity to real proportions, but that image as a random factor qualifies this difference, researchers can find an image setting for which sensitivity to real proportions is equivalent for continuous and discrete images. For that, authors can rely on the analysis conducted at an individual level in this paper, and on the correlation analysis at the image level provided in our interactive database.
The analyses conducted at an individual level are further informative about how individuals perceive an image (judgements of liking and familiarity) and the ease felt when providing a response, with the relationships between these variables differing depending on the category of the image. A specific category of images may also moderate the relationship between objective and subjective proportions, suggesting that individuals' sensitivity to the real proportions depends on the image category.
Researchers should consider the fact that the images differ in how they are perceived, and that this perception is related to their estimates and/or estimate bias. Images differ in how much they are liked, and the level of liking is associated with the estimate bias. More specifically, images that are less liked are underestimated in their proportions, and images that are more liked are overestimated. Furthermore, images differ in terms of the ease felt by participants when providing estimates. This is relevant because there is a positive relationship between the values estimated and the perceived ease. This suggests that higher values are more easily estimated or that the estimation of higher values makes participants feel that the process is easier, or even that if individuals experience processing ease, they estimate higher values. Therefore, it seems that the way individuals perceive a stimulus, associated with how much they like the stimulus and how familiar they feel the stimulus to be, as well as the ease with which they made their estimates, is directly related to the magnitude of estimates. Social cognitive research suggests that affect is a relevant modulator of subjective probability estimates (Bohner et al., 1988; Fox & Levav, 2000; Suantak et al., 1996). Since processing of familiar and perceptually fluent stimuli is charged with positive affect (e.g., Garcia-Marques et al., 2010, 2016), it is not surprising that our data directly demonstrate that these features interfere with the proportion estimation process. Considering that the goal of this paper is to provide stimuli for which relevant properties are previously known, when studying proportion reasoning, we do not draw conclusions about the presence of the effects, and neither explore nor offer possible explanations for them. However, we believe that this should be undertaken in future research.
Importantly, this work showed that the relationship established between the subjective and objective proportion estimates was qualified by different types of materials, suggesting that individuals are less sensitive to real proportions when (a) these proportions change continuously rather than in a discrete way; (b) images have social elements (versus not having social elements); (c) images represent natural (versus artificial) scenery; (d) the stimuli represented have multiple referents (versus being dichotomic); and (e) images depict shapes, colour, and multiple areas (versus other elements). This diversity of biases occurring with perceptual stimuli is likely to inform theories that aim to explain proportion estimation processes. The bias detected with our stimuli may have two different sources: either perception differs starkly from explicit expressions, or the numerosity perceptual context modulates perception itself. In the first case, bias occurs in the process of transforming perceptions into responses (see Shepard, 1984), and in the second, bias occurs in the perception of the quantity itself (e.g., Ebbinghaus illusions). An insight provided by the fact that there are differences in the levels of distortions promoted by the different categories is that perceptual mechanisms play a relevant role in promoting these biases. The data suggest that the characteristics of the images themselves create biases, and that these are not promoted exclusively by the process which transforms what is visually perceived into a cognitive response. This perceptual factor should thus be added to the list of factors that are already known to have an impact on the magnitude of systematic estimation errors, such as individual differences (e.g., overconfidence: see Moore & Healy, 2008) and contextual features (e.g., anchoring; see Furnham & Boo, 2011). While it is likely that the same type of bias occurs through different pathways, parsimony would challenge researchers to test the relationship between these different factors, for example by testing whether individual differences or contextual factors are not themselves interfering with how we perceive reality.
Although the material provided makes it easy for researchers to test their hypotheses, additional control procedures are still needed in specific studies. For example, although our study manipulates and measures many features of the images, there are other features, such as perceived visual complexity, that are also relevant in understanding proportion estimation processes (e.g., Madan et al., 2018). Thus, authors may need to assess the degree of perceived complexity of the images that they select for their studies. In addition, it should be kept in mind that, in this normative study, the different images comprising different versions of the same scenario were assessed by different participants (i.e., a between-subjects comparison). Therefore, our data do not allow us to determine whether the scenarios are truly perceived similarly or, in other words, to report on the consistency levels of the evaluations given to the different images of the same scenario (i.e., their reliability scores).

Extending the use of this set of images to other magnitude estimation tasks
This set of images was addressed here in a part-whole task that requires the estimation of the proportion (in percentage) of a type of element relative to the total of the elements, or of a part of an entity relative to its entirety. However, the use of these stimuli can be extended to other proportion judgement tasks, such as scaling tasks, in which the participant is asked to match a target stimulus (composed of two elements with a given proportional relation) to its proportional equivalent (e.g., Barth et al., 2003; Mix et al., 1999; Meert et al., 2012; Sophian, 2000). Tasks of matching non-symbolic fractions (e.g., Boyer et al., 2008; Jacob & Nieder, 2009; Jeong et al., 2007; Meert et al., 2012) or spatial scaling tasks (e.g., Huttenlocher et al., 1999; Möhring et al., 2018) are some examples.
We further believe that the use of these images can also be extended to other magnitude tasks of the part-part ratio comparisons type, in which the quantity of an element is evaluated as a fraction of another (or its multiple), or of the ordinal comparisons type, which involve the comparison of the proportion of a stimulus relative to the proportion of others (B is smaller than/equal to/larger than A).

Fig. 1 Scenarios depicted in the images. Examples of one of the versions of each of the 13 scenarios depicted in the images along with the stimulus within the image used for the estimation judgement

Table 1
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking. * signals that gender differences are significant (p < .05)

Table 2
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking for the category type of quantity

Table 3
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking for the category presence of social elements

Table 4
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking for the category nature of the stimuli

Table 5
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking, for the category number of referents

Table 6
Descriptive statistics for the variables estimate, estimate bias, ease, familiarity, and liking for the category stimuli dimensions