Phenotypic architecture of sociality and its associated genetic polymorphisms in zebrafish

Abstract Sociality relies on motivational and cognitive components that may have evolved independently, or may have been linked by phenotypic correlations driven by a shared selective pressure for increased social competence. Furthermore, these components may be domain‐specific or general‐domain across social and non‐social contexts. Here, we used zebrafish to test whether the motivational and cognitive components of social behavior are phenotypically linked and whether they are domain‐specific or general‐domain. The behavioral phenotyping of zebrafish in social and equivalent non‐social tests shows that the motivational (preference) and cognitive (memory) components of sociality: (1) are independent from each other, hence not supporting the occurrence of a sociality syndrome; and (2) are phenotypically linked to non‐social traits, forming two general behavioral modules, suggesting that sociality traits have been co‐opted from general‐domain motivational and cognitive traits. Moreover, the study of the association between single nucleotide polymorphisms (SNPs) and each behavioral module further supports this view, since several SNPs from a list of candidate “social” genes are statistically associated with the motivational, but not with the cognitive, behavioral module. Together, these results support the occurrence of general‐domain motivational and cognitive behavioral modules in zebrafish, which have been co‐opted for the social domain.

varying ecological conditions, depending on the relative weight of these costs and benefits. Selection for increased sociality (i.e., higher social tendency) should also increase the selective pressure for the evolution of social abilities that optimize this trade-off, namely by increasing the social competence of animals, hence decreasing their costs of group living. Consequently, social living is predicted to drive the evolution of cognitive abilities that enhance social competence (aka social brain hypothesis, 2,3 ), and increased social preference is expected to be correlated with enhanced socially-related cognitive abilities. The social brain hypothesis has been extensively studied using comparative studies of phylogenetically related species with different degrees of sociality with conflicting results. 2,4 However, this hypothesis can also be tested within species. At this level, three predictions can be generated: (1) there should be phenotypic correlations between measures of social preference and measures of social cognition; if these phenotypic correlations result from a common genetic or physiological mechanism for social preference and social cognition that evolved in response to selection for sociality, then: (2) they should be maintained across different environments; and (3) they should share, at least partially, their genetic basis.
To test for phenotypic correlations between the motivational drive to form social groups and the cognitive abilities that enhance fitness in a social environment, two elementary components of social behavior can be considered: (1) a measure of approach response towards conspecifics (social tendency) that leads to the formation of social groups; and (2) the cognitive ability to recognize different conspecifics (social recognition) that allows individuals to selectively adjust the expression of their behavior to the different individuals they encounter. It should be mentioned that social recognition can range from being coarse or categorical, as in the ability to recognize categories of conspecifics (e.g., male/female, familiar/stranger), to being fine grained, as in the ability to recognize specific individuals within the group (e.g., pair mate). As mentioned above, given the role of these two behaviors for sociality, one can predict them to be selected together (i.e., co-evolve) during social evolution, leading to a phenotypic correlation between them. However, these traits may have evolved from similar traits that initially evolved in a non-social domain and were co-opted for the social domain when selection for group living increased. For example, social recognition may reflect a general-domain cognitive ability that evolved to allow animals to discriminate different entities in the environment, social or not (e.g., edible vs. non-edible food), rather than a domain-specific trait selected by sociality. 5,6 In this case a phenotypic correlation would be expected between social recognition and non-social (e.g., object) recognition. Similarly, social tendency may reflect a general-domain response to threat perception in the environment, since cohesiveness in animal aggregations is known to increase with perceived danger (i.e., defensive aggregation; e.g., rats: Reference 7; zebrafish: Reference 8). In this case a phenotypic correlation would be expected between social tendency and behavioral measures of anxiety/stress. Thus, the phenotypic architecture of sociality can be characterized by the pattern of phenotypic correlations among these behavioral traits.
The evolution of correlated traits can be explained by two alternative hypotheses, which are not necessarily mutually exclusive: (1) the constraint hypothesis, that postulates the occurrence of shared proximal mechanisms such as a pleiotropic effect of a gene, or a hormone with multiple target tissues; or (2) the adaptive hypothesis, that proposes that positive correlations between traits only occur in environments that favor them, such that selection can break apart maladaptive combinations of traits. 9 These two hypotheses generate different predictions that can be tested by comparing the patterns of correlated characters across different populations of the same species.
The constraint hypothesis predicts traits to be correlated across populations irrespective of ecological conditions, whereas the adaptive hypothesis predicts correlations between traits to vary between populations depending on local conditions. Thus, these two scenarios also have different evolutionary consequences, with the correlated traits acting as an evolutionary constraint in the first case, and the correlation being itself an adaptation in the latter. Although this rationale has been used to study the evolution of behavioral syndromes (aka personality), 9 to the best of our knowledge, it has not yet been applied to analyze the evolution of correlated social behavior traits.
Finally, it can also be tested whether the genetic architecture of correlated traits is shared or not. Given the complexity of social behavior traits, they are expected to be under the influence of multiple genes, each with small effects. In fact, several genes involved in neurotransmission (e.g., dopamine, serotonin [10][11][12]), neuromodulation (e.g., oxytocin [13][14][15]) and synaptic plasticity mechanisms (e.g., neuroligins/neurexins [16][17][18][19]) have been reported to influence social behavior in multiple ecological domains across a wide range of vertebrate taxa. Moreover, these "social" genes are expressed in brain regions that together form an evolutionarily conserved social decision-making network in vertebrates. 20,21 Therefore, the question is to what extent these candidate genes show specific or shared patterns of association with the motivational and cognitive components of sociality discussed above.
Sufficient variation in both social tendency and social recognition occurs across species and between individuals of the same species to allow the abovementioned hypotheses to be tested. The tendency to associate with conspecifics varies considerably among species, ranging from weakly social species, in which social interactions only occur at specific times (e.g., breeding), to highly social species, in which individuals stay in close proximity to, and interact with, others throughout their lives. Similarly, variation in social recognition ability also occurs across species, from basic levels of recognition (e.g., conspecific vs. heterospecific), to increasingly more elaborate ones with a high degree of specificity (e.g., kin vs. non-kin; particular individuals). 22 Moreover, variation in both social tendency and social recognition also occurs within species, both intra-individually (e.g., with age and life-history stage) and inter-individually.
In this study we aim to characterize the phenotypic architecture of sociality in zebrafish (Danio rerio) by characterizing social tendency, social recognition and object recognition across multiple laboratory zebrafish populations that have evolved separately in captivity for multiple generations, and by characterizing the genetic polymorphisms of candidate "social" genes associated with these behavioral traits. In zebrafish, isogenic lines are not viable due to inbreeding depression. 23 Hence, laboratory zebrafish populations differ from those of other model organisms in that they are recurrently outcrossed to maintain diversity. 24 As a result, laboratory zebrafish populations contain significant but varying levels of genetic diversity. 25,26 In parallel, zebrafish lines (e.g., AB, TU and WIK) have already been shown to vary in many behaviors, some of them interlinked, including locomotor activity, anxiety traits, stress reactivity, learning abilities and shoaling. [27][28][29][30][31][32][33][34][35][36][37][38] The parallel variation in genetic diversity 25,26 and in several behavioral phenotypes provides the rationale that constitutive genetic variation may contribute to the observed behavioral variability.
Here we specifically aim to test: (1) if there is an association between social tendency and social recognition; (2) if social and nonsocial cognitive abilities (i.e., social vs. object recognition) are independent from each other, or if they co-vary supporting a general domain factor; (3) if there is an association between social tendency and anxiety trait; (4) if the phenotypic correlations found are fixed or vary across lines (populations), in order to test the constraint versus adaptive hypothesis; (5) to what extent the genetic architecture of each of these behavioral traits is shared or not, which would provide evidence for genetic pleiotropic effects underlying a putative sociality syndrome. For the latter, we have assessed the association between known single nucleotide polymorphisms (SNPs) in zebrafish for a set of candidate "social" genes (see Section 2 for details) and each behavioral trait.

Artemia salina) and processed dry food (GEMMA Micro).

| Experimental setup and procedures
The behavior of each experimental fish was assessed in four different tests: (1) a shoal preference test to measure social tendency; two one-trial recognition tests using either objects (2)

Behavior during tests was recorded using black and white mini surveillance cameras (Henelec 300B) suspended above the experimental tank. Videos were analyzed using a commercial video-tracking software (EthoVision XT, Version 11.5, Noldus Information Technology) and behavioral measures were extracted from each test. Regions of interest (ROI) marked were kept at an average body length distance from the target location (gray regions in Figure 1A,B). Social tendency during the shoal preference test was quantified by the proportion of time in ROIs spent near the shoal (Figure 1C); social and non-social discrimination during the conspecific and object recognition tests was measured by the proportion of time in ROIs spent near the preferred stimulus (familiar or novel; Figure 1D,E, respectively), while the overall time spent in ROIs near both stimuli was used as a measure of exploration. Anxiety in the open field test is typically exhibited by thigmotaxis (i.e., the propensity to avoid exposed areas), which was measured as the proportion of time spent within the ROI near the periphery following first entry (to control for any initial freezing in the center), while the average distance (in cm) from the wall was used to quantify the edge or wall orienting tendency associated with fear-induced thigmotaxis. 39

We built a list of candidate genes to test their association with the behavior traits, based on evidence from the literature for their involvement in the regulation of social behavior. This gene list included genes for: neurotransmitter systems (e.g., dopamine, serotonin), neuromodulators (e.g., oxytocin, AVT and NPY), neuroplasticity (e.g., bdnf, neurexins and neuroligins), and genes linked to autism (e.g., shank3a).
A total of 139 SNPs in the genes of interest were successfully sequenced (see Supplementary material and Table S1 for details), but we had to remove 7 for lack of variation between the 164 tested zebrafish.
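The filtering step described above (dropping SNPs that show no variation among the genotyped fish) can be sketched as follows. This is only an illustration under stated assumptions: the function name `drop_invariant_snps`, the SNP identifiers and the genotype codes are hypothetical, and the study's actual genotype encoding is not given in the text.

```python
def drop_invariant_snps(genotypes):
    """Keep only SNPs with at least two distinct called genotypes.

    genotypes: dict mapping a SNP id to a list of calls, one per fish;
    None marks a missing call and is ignored when checking for variation.
    """
    kept = {}
    for snp_id, calls in genotypes.items():
        observed = {call for call in calls if call is not None}
        if len(observed) > 1:
            kept[snp_id] = calls
    return kept


# Toy data: hypothetical SNP ids and genotype codes, not taken from the study.
toy_genotypes = {
    "snp_a": ["AA", "AG", "GG", None],
    "snp_b": ["CC", "CC", "CC", "CC"],  # invariant across fish, so dropped
    "snp_c": ["TT", None, "TC", "TT"],
}
variable_snps = drop_invariant_snps(toy_genotypes)
print(sorted(variable_snps))

# Panel size used downstream in the text: 139 sequenced minus 7 invariant.
n_snps_tested = 139 - 7
print(n_snps_tested)
```

The last two lines simply restate the arithmetic behind the 132 SNPs referred to in the statistical analysis.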

| Statistical analysis
One-sample t-tests (μ ≠ 0.5 vs. >0.5) were used to test if the scores of social tendency, object discrimination and social discrimination were significantly different from chance levels for each sex and for each line.

FIGURE 1 Regions of interest (ROI) were set within 1 standard body-length from target locations or stimuli. (C) Social tendency was measured by interaction preferences towards a shoal. Social (D) and non-social (E) discrimination tests were comprised of two phases: an acquisition phase, in which the focal fish was exposed to two unfamiliar items (two fish or two objects, respectively) followed (as indicated by arrow in D and E) by a probe-test phase, in which the focal fish had to discriminate between one of the previously seen items (fish or object) and a novel one; recognition in both the social (D) and non-social (E) context was measured by the ability to discriminate between a familiar and a novel stimulus. Males (full circles) and females (open circles) of all lines (5D, AB, LEO, TL, TU, Wik) exhibited above chance (dashed line) preference for shoal over an empty tank (social tendency, F) and discrimination between a novel and familiar stimulus in both a social (conspecific; G) and non-social (object; H) context (bars indicate 95% CI). Behavioral measures exhibited different degrees of correlation (r), illustrated in the cladogram as degrees of association (I), based on which factor analysis revealed three principal components (PC): PC1 aggregates social tendency and social and object exploration, corresponding to a motivational component of sociality; PC2 aggregates thigmotaxis (i.e., proportion of time in periphery) and edge-orienting (distance to wall) measured in the open field test, corresponding to an anxiety component; PC3 aggregates object and social discrimination, corresponding to a general-domain cognitive component.

Next, we extracted behavioral modules that aggregate correlated behav-

To test if the behavioral modules are differently related with each other in each zebrafish line, we used the quadratic assignment procedure (QAP) correlation test with 5000 permutations, 41 to assess the association between any two correlation matrices between different zebrafish lines on UCINET 6. 42 Given that the null hypothesis of the QAP test is that there is no association between matrices, a significant p-value indicates that the correlation matrices are similar.
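The logic of a QAP correlation test can be sketched in a few lines. This is a minimal standard-library illustration, not the UCINET implementation used in the study: the function names, the toy matrices and the one-tailed p-value convention (with the common +1 correction) are assumptions made here. The key idea is that rows and columns of one matrix are permuted together, so the matrix structure is preserved under the null hypothesis.

```python
import random
from math import sqrt


def offdiag(matrix):
    """Flatten a square matrix, skipping the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def qap_test(a, b, n_perm=5000, seed=0):
    """Return (observed r, one-tailed permutation p-value) for matrices a and b."""
    rng = random.Random(seed)
    n = len(a)
    flat_a = offdiag(a)
    observed = pearson(flat_a, offdiag(b))
    idx = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        # Apply the same permutation to the rows and the columns of b.
        permuted = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(flat_a, offdiag(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy correlation matrices for two hypothetical lines (values are invented).
line_1 = [[1.0, 0.6, 0.1, 0.2],
          [0.6, 1.0, 0.3, 0.0],
          [0.1, 0.3, 1.0, 0.5],
          [0.2, 0.0, 0.5, 1.0]]
line_2 = [[1.0, 0.5, 0.2, 0.1],
          [0.5, 1.0, 0.2, 0.1],
          [0.2, 0.2, 1.0, 0.4],
          [0.1, 0.1, 0.4, 1.0]]
r_obs, p_value = qap_test(line_1, line_2, n_perm=5000)
print(r_obs, p_value)
```

With very small matrices the number of distinct permutations is limited, which bounds how small the p-value can get; this matters when interpreting QAP results on matrices with few rows.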
To check whether the genetic distances between subjects are structured by line or represent a uniform population, we computed a genetic distance (i.e., Jaccard distance) matrix between all subjects (using their genetic data from the list of 132 SNPs), on which we performed a hierarchical clustering with complete linkage.
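As an illustration of this clustering step, the sketch below computes Jaccard distances between fish and merges clusters by complete linkage (the distance between two clusters is the largest pairwise distance between their members). It is a naive standard-library version under stated assumptions: each fish is represented as a set of "SNP:allele" strings, which is one possible encoding and not necessarily the one used in the study, and the fish labels and alleles are invented.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets; defined as 0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def complete_linkage(items, k):
    """Agglomerate the items (dict: name -> set) until k clusters remain."""
    clusters = [[name] for name in items]

    def cluster_distance(c1, c2):
        # Complete linkage: the farthest pair decides the cluster distance.
        return max(jaccard_distance(items[x], items[y]) for x in c1 for y in c2)

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data: hypothetical fish labels and alleles, not taken from the study.
fish = {
    "AB_1": {"s1:A", "s2:G", "s3:T"},
    "AB_2": {"s1:A", "s2:G", "s3:C"},
    "TU_1": {"s1:C", "s2:T", "s3:T"},
    "TU_2": {"s1:C", "s2:T", "s3:C"},
}
groups = complete_linkage(fish, k=2)
print(groups)
```

In this toy case fish from the same line share more alleles and therefore end up in the same cluster, which is the pattern the text describes as genetic variation structured by line.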
To assess the associations between genetic polymorphisms and behavior, we tested each of the 132 SNPs independently against each behavioral phenotype (the 7 behaviors and 3 PC scores). We did not include three zebrafish subjects in this analysis because their sample call rate was below 5%, meaning they lack genetic information for most SNPs. For the behaviors that followed a linear distribution (general inspection, general recognition, anxiety and edge-orienting) we used linear models (LM). For the behaviors that were proportions (social tendency, social discrimination, social exploration, object discrimination, object exploration and thigmotaxis), we used generalized linear models (GLM) implemented as beta regressions. In all models, the behavior was the response variable, SNP was the explanatory variable and line was a covariate. Because we ran 132 independent tests (one per SNP) for each behavioral phenotype, we corrected the p-values with the false discovery rate (FDR) adjustment method.
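The FDR correction can be illustrated with the Benjamini-Hochberg step-up procedure. Note the assumption: the text only says "FDR adjustment", so Benjamini-Hochberg is used here as the most common choice, not as a statement of what the authors ran. The function name and the p-values are invented for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values standing in for the per-SNP tests of one behavior (invented).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(adj_p) if p < 0.05]
print(adj_p, significant)
```

In the study's setting the input list would hold 132 p-values per behavioral phenotype, one for each SNP model.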

| Ethics
All experimental procedures were reviewed by the institutional internal Ethics Committee at the Gulbenkian Institute of Science and approved by the National Veterinary Authority (Direção Geral de Alimentação e Veterinária, Portugal; permit number 0421/000/000/2017).

| Phenotypic architecture of sociality in zebrafish
Scores of social tendency (i.e., preference for shoal over empty tank), as well as object and conspecific discrimination scores (i.e., preference between a novel and a familiar stimulus), were all significantly different from chance for individuals of both sexes and for all lines tested (one-sample t-test: μ ≠ 0.5, p < 0.001; see Table S2; Figure 1F-H), indicating that social affiliation and social and object recognition abilities are present in males and females across zebrafish lines.
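For readers unfamiliar with the test, the one-sample t statistic against a chance level of 0.5 can be computed as sketched below. The preference scores are invented and the function name is hypothetical; the sketch stops at the statistic and degrees of freedom, since obtaining the p-value requires the t distribution, which is not in the Python standard library.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, mu0=0.5):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(scores) - mu0) / standard_error, n - 1


# Invented proportion-of-time scores for one sex of one line.
preference_scores = [0.62, 0.71, 0.58, 0.66, 0.75, 0.69]
t_stat, dof = one_sample_t(preference_scores)
print(t_stat, dof)
```

A positive t statistic here means the mean score lies above chance, i.e., the fish spent more than half of their ROI time near the stimulus of interest.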
The PCA used to assess the phenotypic architecture of sociality, based on the correlation matrix between measures extracted from the four separate tests of social and associated behaviors (sampling adequacy: KMO > 0.5; sphericity: Bartlett's χ²(21) = 253.76, p < 0.001; determinacy of multicollinearity: ρ = 0.754), identified three principal components (PC) with eigenvalues ≥1 (Figure 1I and Table 1). PC1 shows a strong loading of social tendency measured in the social preference test and of social and object exploration measured in the social and object discrimination tests, respectively, suggesting the occur-  Figure S1).
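Bartlett's test of sphericity, reported above as a precondition for the PCA, checks whether a correlation matrix differs from the identity matrix. A minimal sketch of the statistic follows, using the textbook formula χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, where p is the number of variables and n the number of observations. The helper names and the toy matrix are assumptions, and the sketch does not attempt to reproduce the reported value; it only shows that seven behavioral measures give the 21 degrees of freedom quoted in the text.

```python
from math import log


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for correlation matrix corr."""
    p = len(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * log(determinant(corr))
    return chi2, p * (p - 1) // 2


# Toy 3 x 3 correlation matrix with all pairwise r = 0.5 (invented values).
toy_corr = [[1.0, 0.5, 0.5],
            [0.5, 1.0, 0.5],
            [0.5, 0.5, 1.0]]
chi2_toy, df_toy = bartlett_sphericity(toy_corr, n_obs=50)

# Degrees of freedom for seven behavioral measures, as in the reported test.
identity_7 = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(7)]
chi2_id, df_7 = bartlett_sphericity(identity_7, n_obs=50)
print(chi2_toy, df_toy, df_7)
```

An identity correlation matrix (no correlations at all) gives a statistic of zero, which is the case in which PCA would not be informative.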

| Genetic polymorphisms associated with behavioral modules
To assess if the different behavioral modules identified above were linked by a shared genetic architecture, we investigated the association between a set of genetic polymorphisms (SNPs) in a list of candidate "social" genes and each of the measured behaviors and PC behavioral modules. Given that we phenotyped individuals from six different wild type lines, we checked for structured genetic variation by computing the genetic distance between the phenotyped individuals for the SNPs under study. We found that genetic variation for the SNPs of interest is highly structured, with individuals from the same wild type lines clustering together (Figure 3A). Therefore, we used line as a covariate in the model that assessed the association between each SNP and each of the behavioral modules.
Out of the 132 SNPs that showed variation in our sampled individuals, 53 (which mapped to 28 genes) were significantly associated with General Inspection, none with General Recognition and 8 (which mapped to 6 genes) with Anxiety (Table 2). Regarding the 3 behaviors that loaded onto the General Inspection behavioral module, 6 SNPs (mapping to 6 genes) were associated with social tendency, 11 (mapping to 10 genes) with social exploration, and 3 (mapping to 3 genes) with object exploration. Of these 20 SNPs associated with the behaviors that load onto General Inspection, only one (mapping to the serotonin receptor gene 5HTR-2cl2) is not also associated with General Inspection (Figure 3B; Table 2). Moreover, of the 29 SNPs associated with General Inspection, 16 are also associated with at least one of the behaviors that constitute this behavioral module (Figure 3B; Table 2). However, there is little overlap between the SNPs associated with these different behaviors: only one SNP affects both social tendency and social exploration (matching the gene 5HTR-1aa), and only one other SNP affects both social exploration and object exploration (matching the gene 5HTR-2cl1) (Figure 3B; Table 2).
The SNPs associated with the General Inspection behavioral module are widely distributed across the zebrafish genome, being absent only from chromosomes 11, 12, 19, 21 and 23 (Figure 3C). However, SNPs associated with behaviors that load onto the General Inspection module can be found on some of these chromosomes: SNPs associated with social exploration on chromosomes 11, 19 and 21, and SNPs associated with social tendency and with object exploration on chromosome 21 (Figure 3C).

| DISCUSSION
In this study we have characterized the phenotypic architecture of sociality in zebrafish. We have behaviorally phenotyped males and females of six different wild type laboratory lines in four behavioral tests (social tendency, social and object discrimination and open-field) and showed that social tendency (i.e., preference to associate with conspecifics) and the ability to discriminate between conspecifics (social recognition) are present in both sexes of all lines tested. A factor analysis identified three main behavioral modules: (1) general inspection, which includes social tendency measured in the social preference test and social and object exploration, measured in the social and

TABLE 1 Loadings extracted by the varimax rotation of principal components from the correlation matrix of behaviors across tests, for zebrafish of all lines

Thus, both studies suggest a common proximate mechanism indicative of a general-domain cognitive trait (i.e., social and asocial memory). 43 Finally, it is worth noting that even though sociality has been proposed to be promoted by predator pressure as a defensive mechanism, 44 anxiety forms an independent behavioral module from those where social traits are included.
Even with the motivational and the cognitive components of sociality being part of two different behavioral modules, a shared selective pressure on both for the enhancement of social competence could result in a physiological linkage between the two behavioral modules; for example, due to the evolution of a common neuromodulator that phenotypically integrates the independent neural mechanisms underlying general inspection and general recognition. In fact, even though social affiliation and social memory have been shown to rely on separate neural circuitry, some neuromodulators, such as oxytocin, have been shown to regulate both mechanisms, 45,46 opening the possibility for the evolution of physiological constraints that phenotypically link these two domains. We tested the constraint hypothesis, which predicts traits to be correlated across populations irrespective of ecological conditions, 9,47

On the other hand, the genetic polymorphisms associated with object exploration include fewer "social genes" (only 3), which are restricted to the serotonergic and dopaminergic neurotransmitter

TABLE 2 Lists of genes with SNPs associated with the behavioral modules General Inspection (and its contributing behaviors) and anxiety

However, the General Inspection module, where social tendency is included, has associated SNPs on chromosomes 18 and 24. Hence, this mismatch between the QTL results and our results presented here can be due either to a false detection of these QTLs by the genetic algorithm mapping method, given the lack of support from the interval mapping method in the previous study, which led the authors not to claim these QTLs themselves 51; or to an indirect association through the link between social tendency and the general inspection module. Either way, our results show that the SNPs associated with both the general inspection module and the behaviors that constitute this module are widespread across the genome, supporting a many-gene (each with small effects) genetic architecture for these traits.