Although knowledge of their fossil record continues to improve, multituberculates nonetheless remain one of the more poorly understood mammalian clades, which can be attributed to a record comprised of isolated teeth and fragmentary jaws. Fortunately, the p4 of multituberculates is the most common form of remains for this group and is a principal source of diagnostic characters in systematic studies, the p4 of cimolodontan multituberculates is both common and a source of diagnostic characters in systematic studies. The results of a recent morphometric study on the neoplagiaulacid Mesodma suggest that p4 size may be more useful than shape in diagnosing the various species referred to this genus. We tested this hypothesis by applying two different morphometric methods (2D geometric morphometrics and linear measurements) to two samples: (1) one including the p4s of four known species of Mesodma (M. ambigua, M. thompsoni, M. formosa, and M. pygmaea), and (2) a sample of unidentified p4s of Mesodma from the Bug Creek Anthills locality of northeastern Montana. Our results indicate that while form explains most of the morphological variation in p4s of the various species of Mesodma, linear-measurement data support differences in p4 morphology that are not recovered by form data alone. Depending on the methods used, we found evidence for the presence of one or more species of Mesodma in the Bug Creek Anthills fauna. Although shape and size both contribute to morphological variation in the p4 of Mesodma, our results suggest that the diagnostic power of each varies with the type of methodology employed.

Multituberculates were rodent-like mammals that existed for some 130 million years and survived the mass extinction event that decimated the non-avian dinosaurs (Cretaceous–Paleogene boundary; 66 million years ago). Despite this lengthy record, multituberculates remain one of the more poorly understood mammalian groups, a situation resulting in part from their fossil record consisting largely of isolated teeth and jaws, which makes confident identification difficult. Fortunately, the blade-like lower fourth premolar (p4) of many multituberculates is frequently preserved, and has significant diagnostic power, allowing researchers to distinguish multituberculates, sometimes to species level. Various methods have been used in examining the p4, ranging from qualitative assessments and basic measurements to more sophisticated statistical methods that quantify form (called morphometrics). A recent study comparing the effectiveness of qualitative and quantitative methods for distinguishing the p4s of species of the Late Cretaceous–early Paleocene multituberculate genus Mesodma concluded that size is the most important contributor to morphological variation among the included species. To test this hypothesis, we applied the study methods to a novel dataset that included additional species of Mesodma, and to a second dataset consisting of unidentified p4s of Mesodma. Our results suggest that rather than size being the most important variable in distinguishing species of Mesodma, that shape and size are more informative when analyzed together. Our results confirm previous hypotheses that shape and size are intricately linked, and that biological significance is sometimes difficult to maintain when attempting to isolate each variable. The use of quantitative methods such as those proposed in the original study, with appropriate caution, were found to be useful in distinguishing among the p4s of the various species of Mesodma and have potential for use in studies on other multituberculates more broadly.

With a temporal record spanning approximately 130 million years, multituberculates are one of the longest-lived mammalian groups (Robinson et al., 1964; Clemens and Kielan-Jaworowska, 1979; Wilson et al., 2012; Weaver et al., 2021). Multituberculate fossils have been discovered on nearly every continent and at many localities, and their remains often outnumber those of contemporaneous mammals in many locality collections (Van Valen and Sloan, 1966; Kielan-Jaworowska et al., 2004; Weil and Krause, 2008; Wilson, 2013). Despite this extensive fossil record, the interrelationships of multituberculates are poorly understood. This issue is exemplified in North American multituberculates, where lack of well-preserved, articulated specimens has led to a necessary reliance on isolated dental remains (Bown and Kraus, 1979; Kielan-Jaworowska et al., 2004; Yuan et al., 2013; Weaver et al., 2021), which is compounded by parallel evolution in the multituberculate dentition (Gingerich, 1977). Although progress has been made in resolving the evolutionary relationships of some multituberculates (e.g., Wible et al., 2019), the interrelationships of most multituberculates remain obscure. Despite these difficulties, the blade-like lower fourth premolar (p4) of most multituberculates has demonstrated diagnostic value (Jepsen, 1940; Clemens, 1963; Krause, 1977; Novacek and Clemens, 1977; Clemens and Kielan-Jaworowska, 1979; Sloan, 1981; Archibald, 1982; Sloan et al., 1987).

Although the utility of p4 anatomy in multituberculate systematics has long been recognized, the results of recent research suggest that for some multituberculate taxa, p4 size is more important than overall shape in species discrimination (Smith and Wilson, 2017). This finding is particularly important when considering samples in which closely related species co-occur. Smith and Wilson (2017) focused their analysis on the neoplagiaulacid Mesodma (Jepsen, 1940), whose constituent species are known from prior to and after the Cretaceous–Paleogene boundary in northeastern Montana and elsewhere in the North American Western Interior (Smith and Wilson, 2017). Ratios of standard measurements of length and height traditionally have been used as proxies for p4 shape (including symmetry and relative height; Novacek and Clemens 1977; Archibald, 1982; Weaver et al., 2021), and these have formed the basis for diagnoses of the species of Mesodma, as they have for the majority of other closely related multituberculates. Smith and Wilson (2017) compared these traditional proxies and qualitative assessments of p4 shape with the results of two-dimensional geometric morphometric analyses (2D GMM). Their results suggest that identification accuracy was significantly improved among the species of Mesodma included in their study (M. formosa Marsh, 1889; M. thompsoni Clemens, 1963; M. garfieldensis Archibald, 1982; M. hensleighi Lillegraven, 1969) when p4 shape and a proxy for size (log centroid size) were considered together as compared to shape alone. These results prompted the authors to suggest that size may be the most useful parameter for distinguishing p4s (and, by extension, species) of not only those species of Mesodma included in their study, but all known species of the genus. If confirmed, the results of Smith and Wilson's (2017) study could have significant implications for diagnosing multituberculate taxa beyond Mesodma.

In this study, we explored the utility of size, shape, and size and shape together (referred to in this study as “form”) in distinguishing the p4s of various species of Mesodma by analyzing a dataset of specimens referable to four species (M. formosa, M. thompsoni, M. pygmaea Sloan, Fassett, and Rigby, 1987; and M. ambigua Jepsen, 1940). Size, shape, and form were assessed using traditional linear measurements and more recently derived methods involving 2D geometric morphometrics (Clemens, 1963; Wilson, 2013; Smith and Wilson, 2017). We then applied these methods in combination with linear discriminant methods in an exploration of the differences in size, shape, and form in a large sample of unidentified p4s of Mesodma from the Bug Creek Anthills (BCA) locality of northeastern Montana. The BCA is a temporally mixed deposit preserving specimens of mammals and other vertebrates that lived during the latest part of the Cretaceous and the earliest part of the Paleocene (see Sloan and Van Valen, 1965; Archibald, 1982; Lofgren, 1995; Wilson, 2013). Species of Mesodma have been identified in this fauna, but diversity estimates have wavered from a single indeterminate species to two or more (Sloan and Van Valen, 1965; Novacek and Clemens, 1977; Archibald, 1982; Lofgren, 1995). Given the results of Smith and Wilson's (2017) study, additional insight into species discrimination in the BCA multituberculate fauna could be gained through application of their methods.

Sampling

Mesodma has included up to ten species (M. ambigua Jepsen, 1940; M. formosa Marsh, 1889; M. thompsoni Clemens, 1963; M. hensleighi Lillegraven, 1969; M. pygmaea Sloan, Faassett, and Rigby, 1987; M. senecta Fox, 1971; M. primaeva Lambe, 1902; M. garfieldensis Archibald, 1982; M. minor Eaton, 2002; M. archibaldi Eaton, 2002), with the latter five having been removed recently (Smith and Wilson, 2017; Weaver et al., 2021). In exploring what parameter might best distinguish the p4 of various species of Mesodma, we first analyzed specimens of the two species included in Smith and Wilson's (2017) study (M. formosa; N = 24 and M. thompsoni; N = 26) and those of two additional species (M. ambigua; N = 5 and M. pygmaea; N = 18) from Paleocene localities in Wyoming and Alberta. The total sample analyzed in this part of the study is referred to as the known dataset (KDS, N = 73).

The second part of this study focused on a sample of unidentified p4s from the BCA locality of northeastern Montana (see Sloan and Van Valen, 1965; Archibald, 1982; Lofgren, 1995; Wilson, 2013). We applied a modified version of Smith and Wilson's (2017) methods to a sample of 183 p4s of Mesodma from the BCA locality, accessioned in the collections of the Royal Ontario Museum in Toronto, Ontario, applying a similar landmark scheme, as well as linear discriminant methods. The linear discriminant study used known species from localities outside of the BCA to train a linear discriminant function (LDF), which was then applied to the unknown data set. The sample analyzed in this part of the study is referred to as the BCA dataset (BCADS, N = 183).

Data collection: imaging, 2D geometric morphometrics, and linear measurements

The p4s analyzed in this study were imaged in labial view; left p4s were digitally inverted to match the right p4s in orientation, which assisted in standardization. We used image data for specimens of M. formosa and M. thompsoni that were originally included in Smith and Wilson's (2017) study. Specimens of M. ambigua were imaged by M. Fox (Yale Peabody Museum of Natural History) with the aid of an Olympus SZX12 microscope with a Sony A6000 Digital Camera. Specimens of M. pygmaea (primarily from the Trainspotting locality of southwestern Alberta; Scott et al., 2018) and all BCADS specimens were imaged by one of the authors (AJA) with a Dino-Lite USB microscope (AM4115ZT build; locality information and specimens included in Supplemental Material 1). Only p4s that were complete, relatively unworn, or otherwise undamaged were included in the analyses.

The landmarking scheme employed in this study describes the serrate crest of the p4 and considers this structure homologous among the p4s of the various species of Mesodma studied here (Wilson, 2013; Smith and Wilson, 2017). Although individual landmarks positioned along the serrate crest may or may not be homologous, the serrations together function as a single unit and can be accordingly considered homologous (Wilson, 2013). Aspects of the exodaenodont lobe were excluded from the landmark scheme following the rationale of Smith and Wilson (2017).

To capture the relevant morphology of the p4, digital landmarks were placed at the anteriormost and posteriormost projections of the serrate crest (TL1 and TL2, respectively; Fig. 1). TL1 was positioned at the point of maximum curvature at the anterior margin of the crown. TL2 was placed at the posteriormost part of the crown–root junction at the posterior root. After TL1 and TL2 were located, a curve was traced in TPSDig (version 2.31; Rohlf, 2013a) along the apical portion of the crown to join the two landmarks. Eighteen semilandmarks were then resampled from the curve with 19 helper points (HP) placed between semilandmarks to assist in later standardization. We resampled all landmarks to be equally spaced using TPSUtil (version 1.78; Rohlf, 2013b). Landmark configurations were then prepared for 2D GMM analyses.

Landmark configurations were translated, scaled, and rotated to a centroid size of zero using Procrustes superimposition in the Geomorph package (version 3.3.2; Adams and Otárola-Castillo, 2013) to control for scale-based differences in morphology (Rohlf and Slice, 1990). During superimposition, semilandmarks were slid to minimize the Procrustes distance. Helper points were used to preserve the shape of the p4 through Procrustes superimposition but were subsequently removed (Wilson, 2013; Smith and Wilson, 2017). Final configurations consisted of 20 landmarks.

An allometric regression was considered as a size correction to our shape data, but ultimately was not included in our study for two reasons. Although the effect was significant (R2 < 0.10, p < 0.0), log centroid size had low explanatory power. Second, it is not known if these species share a common allometric component, casting doubt on whether a single- or multiple-regression approach should be undertaken (Klingenberg, 1996, 2016; Mitteroecker et al., 2004). Given these facts, Procrustes coordinates were considered the best mode of representing p4 shape overall.

Standard linear measurements were collected for specimens in both the KDS and BCADS. Three standard linear measurements have traditionally been employed in describing the shape of multituberculate p4s: crown length (L), crown height (H), and the length from the anterior margin of the crown to the intersection with the crown apogee (L1; Clemens, 1963; Krause, 1977; Novacek and Clemens, 1977; Sloan, 1981; Archibald, 1982; Sloan et al., 1987). L refers to the length of the crown, as measured along a baseline from the anteriormost point on the crown to the point where the posterolabial shelf intersects the posterior margin. H refers to height, the distance from the baseline to crown apogee. L1 is the distance between the anteriormost point on the crown and the point where the lines used to measure L and H intersect (Fig. 1). Measurements were taken in Fiji version 1.53c and recorded for further analysis (Schindelin et al., 2012). Linear measurements for all specimens are compiled in Supplemental Material 1.

Statistical analyses

Shape, form, and linear measurement data were collected for all specimens in the KDS and BCADS. Relevant shape, size, and form data were sourced from the landmark configurations previously described, whereas linear measurements were taken directly from the specimens for both the KDS and BCADS. Because the taxonomic composition of the BCADS was unknown, species-specific centroid size averages could not be assigned to specimens. Shape data reflect only the morphological variation in the p4, with the 20 landmarks being the only variables analyzed. CS was used as an estimate of size for each specimen and was extracted from the Procrustes superimposition output. Form data combined the shape and size values into one data matrix, and the subsequent analyses were conducted with CS and the 20 landmarks considered together. Several statistical analyses were performed on the KDS and BCADS data sets—principal components analysis (PCA), canonical variate analysis (CVA), linear discriminant analysis (LDA), and multivariate analysis of variance (MANOVA).

Principal component analyses were conducted on shape, form, and linear measurements to visualize morphological variation. With linear measurements only providing three possible axes of variation of comparable units, we calculated the PC loadings to determine how each linear measurement contributed to the variation along each axis.

Following the PCAs for the KDS, we conducted a CVA, MANOVA, and subsequent pairwise comparisons, if appropriate, on shape, form, and linear measurement data. Canonical variate analysis was used to visualize the data along axes that the showed the most variation in p4 morphology in the context of species groupings. The implicit linear discriminant function from the CVA was used to evaluate identification accuracy for size, shape, and form based on a priori specimen identifications (LDA). A MANOVA was then performed to determine if species occupied significantly different areas of morphospace. Subsequent pairwise comparisons were conducted to determine what, if any, species differed significantly from one another in p4 shape, form, and linear measurements. We also conducted a MANOVA and a subsequent post-hoc comparison on CS to evaluate the relative contribution of size variation to form differences between species.

LDA was performed on shape, form, and linear measurements of the BCADS, using specimens of M. formosa and M. thompsoni as a training set to calculate a linear discriminant function. This function was then used to explore the differences in taxonomic affinity of the BCA specimens. For the purposes of this study, we assumed, based on previous research, that the sample of unidentified p4s from BCA included those of M. formosa, M. thompsoni, or both. We ran an additional LDA on the BCADS with M. formosa, M. thompsoni, and M. ambigua as the training set (as discussed later in the text, we do not place as much weight in the results of this second LDA). We will also discuss why we do not consider M. hensleighi (a species of Mesodma found in several Lancian and earliest Paleocene mammalian faunas in northeastern Montana and elsewhere) as a candidate for inclusion in the LDA.

All data and R code used to complete these analyses are available on data dryad (see data availability). All statistical analyses were conducted in RStudio (R version 4.0.2; R Core Team, 2022).

Known data set

The results of the PCA for shape data show no differentiation among species, with the p4s of each species overlapping to a large degree (Fig. 2.1). Principal component 1 (PC1) explains 43.55% of the variation in morphology, representing the height, length, and position of the crown apogee. Negative values reflect crowns that are higher, with a posteriorly positioned apogee, whereas positive values reflect crowns that are lower. Principal component 2 (PC2) explains 25.6% of the variation: positive values reflect crowns that support a shallower anterior margin, with the apogee positioned nearer the midpoint of the serrate crest, resulting in a more nearly symmetrical lateral profile to the crown; in contrast, negative values reflect crowns that support a steeper anterior margin, and with the apogee positioned more anteriorly, resulting in a more asymmetrical lateral profile to the crown (Fig. 2.1). Principal component 3 (PC3) explains 9.35% of the variation, representing the degree of curvature of the serrate crest: p4s with a serrate crest forming a shallow arc scored positively, whereas those with a strongly convex serrate crest scored negatively (Supplemental Material 2). The results of the PCA for form data indicate that the p4 of M. pygmaea is isolated from those of the remaining species, which themselves overlap along PC1 (Fig. 2.2). PC1, representing CS, explains 98.93% of all variation; in contrast, the remaining PCs are all explained by shape and show little to no separation of the various species (Fig. 2.2, Supplemental Material 2).

The results of the PCA for linear measurements show a different signal compared to those of the previous two PCAs. The p4s of M. ambigua and M. pygmaea, although showing moderate overlap, are more isolated from those of the other two species when compared to the results of the PCA for shape data, whereas the p4s of M. formosa and M. thompsoni remain fully overlapped (Fig. 2.3; Supplemental Material 2). PC1 explains 94.45% of all variation, with PC2 (4.66%) and PC3 (0.36%) explaining a relatively low amount of variation (Fig. 2.3, Supplemental Material 2). PC1 has a strong negative association with L (−0.87), but weak negative associations with H and L1 (−0.32 and −0.37, respectively). PC2 has a moderate negative association with H (−0.70), but relatively weak associations with L and L1 (0.48 and −0.52, respectively). PC3 has a moderate but opposite association between H and L1 (0.77 and −0.63) but has a negligible association with L (0.10; Supplemental Material 1).

The CVA plots show distinct differences between the shape, form, and linear measurement results (Fig. 3). The shape CVA plot shows the p4s of all species overlapping in some aspect of their distribution (Fig. 3.1), and like the results of the PCA for form, those of the CVA for form show the p4s of M. pygmaea isolated from those of the remaining species along CV1, with those of all the other species overlapping (Fig. 3.2). Lastly, the results of the linear measurement CVA show a similar pattern to those of the linear measurement PCA, with the p4 of M. ambigua and M. pygmaea overlapping and somewhat isolated from that of M. formosa and M. thompsoni, which fully overlap and are indistinguishable from one another along this axis of variation (Fig. 3.3).

In addition to qualitative differences in morphospace occupancy, the results of the LDA suggest that there are differences in the ability of shape, form, and linear measurements to refer the p4s to the correct species (Table 1). Shape data were 69.9% successful at identifying a given p4 to the correct species (51 of a possible 73 specimens), with p4 of M. thompsoni having the highest rate of identification success (83.3%), followed by increasingly poorer results for the remaining three species (Table 1). Form data performed marginally better than shape data, with a 71.2% success rate; the p4s of M. pygmaea and M. formosa had the highest rate of identification success, with those of M. thompsoni and M. ambigua having considerably lower rates of success (Table 1). The results of LDA for linear measurement data approximated those for form data, with a 72.6% rate of identification success. The p4s of M. pygmaea, M. ambigua, and M. formosa were all identified at a relatively high rate of success (88.9%, 80.0%, 76.9%, respectively), whereas the p4 of M. thompsoni was correctly referred half of the time (Table 1). Full CVA tables can be found in Supplemental Material 1.

The MANOVA tests found significant differences among species in all four analyses (shape: F3,69 = 3.02, p < 0.05; size: F3,69 = 15.9 × 101, p < 0.05; form: F3,69 = 21.7 × 101, p < 0.05; linear measurements: F3,69 = 19.6, p < 0.05). The results for shape data show a significant difference in morphology among M. formosa, M. thompsoni, and M. pygmaea (p < 0.05), while finding no significant pair-wise differences for M. ambigua (p > 0.05; Table 2). The results for size and form data distinguish M. pygmaea from all other species (p < 0.05; Table 2). No other pair-wise species comparisons were significant for size and form data (p > 0.05; Table 2). Of particular interest is the magnitude of difference between the results for size and form data, with those for size being much greater than those for form (4.00, 4.67, 4.52 vs. 5.69 × 10−1, 6.38 × 10−1, 6.21 × 10−1 for size and form respectively). Linear measurements show all species comparisons as significantly different (p < 0.05), with the exception of M. formosa versus M. thompsoni (p > 0.05; Table 2).

Bug Creek Anthills data set

The results of the PCA for shape data show no distinct clustering (Fig. 4.1). PC1 explains 56.8% of the variation in morphology, representing the height and the length of the p4. Negative scores indicate short but high crowns, whereas positive scores indicate longer, lower crowns. PC2 explains 20.09% of the variation, representing the position of the crown apogee: negative values indicate a relatively posteriorly positioned apogee (and hence a more nearly symmetrical lateral profile), whereas positive values indicate a more anteriorly positioned apogee (and hence a more asymmetrical lateral profile; Fig. 4.1). PC3 explains 7.29% of the variation, representing the flatness of the serrate crest: p4s with strongly convex serrate crests score negatively whereas those with lower, flatter serrate crests score positively (Supplemental Material 2). Like those for shape, the results for form data show no distinct clustering (Fig. 4.2). PC1, representing CS, explains 92.54% of all variation. In contrast, PC2 (4.33%), PC3 (1.51%), and the remaining variation are all explained by shape (Fig. 4.2, Supplemental Material 2).

The results of the linear measurement PCA are congruent with those of the previous two PCAs, with no distinct clustering noted (Fig. 4.3). PC1, which is mostly associated with L, explains 78.22% of all variation. In contrast, PC2 (12.74%) and PC3 (9.04%) explain a relatively low amount of variation (Fig. 4.3, Supplemental Material 2). PC loadings show that PC1 has a strong negative association with L (−0.91), but weak negative associations with H and L1 (−0.28 and −0.29, respectively). PC2 has a strong negative association with H (−0.86), but relatively weak associations with L and L1 (0.38 and −0.34, respectively). PC3 has a strong negative association with L1 (−0.89), but relatively weak positive associations with L and L1 (0.15 and 0.42, respectively; Supplemental Material 1).

The results of the linear discriminant analysis reveal differences in the number of specimens referred to species of Mesodma in the BCADS based on shape, form, and linear measurement data (Table 3). Shape data for M. thompsoni and M. formosa identify 99 specimens (54.1%) as referable to M. formosa, with the remaining 84 specimens (45.9%) as referable to M. thompsoni. Form data identify 147 specimens (80.3%) as referable to M. formosa, with the remaining 36 specimens (19.7%) as referable to M. thompsoni. Linear measurement data identify 37 specimens (20.2%) as referable to M. formosa, with the remaining 146 specimens (79.8%) as referable to M. thompsoni. In summary, shape data support a nearly even split of specimens referred to either M. formosa or M. thompsoni, form data support a greater number of specimens referable to M. formosa, and linear measurement data support a greater number of specimens referable to M. thompsoni (Table 3).

In the LDA where we include M. ambigua data in our training set, a similar pattern of species prediction occurs as in the LDA without M. ambigua (Table 4). Shape, form, and linear measurement data identify almost all unknown specimens as referable to either M. formosa or M. thompsoni with very few specimens being referred to M. ambigua (Table 4). Shape data identify 108 specimens (59.0%) as referable to M. formosa, 71 specimens (38.8%) as referable to M. thompsoni, and the remaining 4 specimens (2.20%) as referable to M. ambigua. Form data identify 126 specimens (68.8%) as referable to M. formosa, 51 specimens (27.9%) as referable to M. thompsoni, and the remaining 6 specimens (3.30%) as referable to M. ambigua. Linear measurement data identify 28 specimens (15.3%) as referable to M. formosa, 155 specimens (84.7%) as referable to M. thompsoni, while no specimens are referable to M. ambigua. These conclusions match those from the LDA informed solely by M. formosa and M. thompsoni morphometric data—both shape and form data favor M. formosa identification over linear measurements (Table 4).

In contrast to traditional methods of assessing subtle differences in both shape and size of the diagnostic p4 (Clemens, 1963; Lillegraven, 1969; Novacek and Clemens, 1977; Archibald, 1982), the results of Smith and Wilson's (2017) study suggest that p4 size may be the most appropriate method for distinguishing species of Mesodma. The results of our study lend some support to this hypothesis, with size-dominant data (form) able to explain most of the variation in p4 morphology among all the included species of Mesodma generally, and in distinguishing M. pygmaea from the remaining species of Mesodma more particularly. However, the results of the analyses of linear measurement data suggest significant differences exist among the p4s that are not captured by form data. If linear measurements are considered reliable proxies for both shape and size, the discrepancy between the results for these and 2D GMM form data is unexpected and warrants further investigation. Here, we discuss the implications of our results in the context of size, shape, and form in distinguishing p4s of the various species of Mesodma, the value of linear measurements in multivariate analyses of multituberculate p4s, the BCA Mesodma problem, and the future of species discrimination in Mesodma.

Evidence against a monovariable approach to species discrimination in Mesodma.—The results of our analyses of shape, form, and linear measurement data for the KDS imply differences in the ability of each of these to distinguish among p4s of M. formosa, M. thompsoni, M. pygmaea, and M. ambigua. Separation among species clusters is generally poor, although the results of the form PCA suggest that the p4 of M. pygmaea can be distinguished from those of the remaining species (Fig. 2.1, 2.2); given that M. pygmaea is the smallest known species of Mesodma (Sloan et al., 1987), PC1 can be inferred to represent CS, in concert with our size-exclusive MANOVA results. Results of the linear measurement PCA show intermediate resolution, with more obvious separation of the p4s of both M. ambigua and M. pygmaea from those of M. formosa and M. thompsoni (Fig. 2.3).

Form appears to be the most effective means for distinguishing the p4 of M. pygmaea from those of other species along the primary axes of variation and from the results of the statistical analyses (Fig. 3.2). If shape was the more reliable variable, then distinct clusters should have been discernable in the results of the analyses for shape data; on the contrary, distinct clustering did not occur in the shape results for either the PCA or CVA (Figs. 2, 3).

Although the results of our analyses for shape data were not significant, those of the quantitative analyses, which considered the entirety of the morphospaces generated by form and linear measurement data, provide evidence against the use of a single variable to distinguish the p4s of species of Mesodma. The results of the pairwise species comparisons within the KDS highlight the different contributions of form data and linear measurement data to understanding the morphological diversity in the p4 of Mesodma. Considered in isolation, shape data were found to be ineffective at distinguishing the various p4s, supporting Smith and Wilson's (2017) conclusions (F3,69 = 1.06, p > 0.05; Table 1). Form data distinguished the p4 of M. pygmaea from those of the remaining species but were unable to distinguish among the other included species. Linear measurement data were the most successful at distinguishing among the p4s, with all but one pairwise comparison yielding significant results. These results suggest that form and linear measurement data are both able to capture aspects of shape and size, but in different ways when evaluated in multivariate space.

Discrepancies between traditional and novel methods of species discrimination in Mesodma.—Our results indicate that linear measurements were able to capture variation in length, height, and the position of the crown apogee in multivariate space; when reviewing PCA plots, these aspects of crown morphology change most apparently along the primary axes of variation (see warp grid deformation; Fig. 2.1; Appendices 1 and 2). Interestingly, although the analyses of shape data failed to find significant differences, those for linear measurements were able to discriminate among p4s of the various species of Mesodma, with shape and size both contributing. Although our results are congruent with those of Smith and Wilson (2017) in that p4 size—when considered with shape—plays an important role in distinguishing among species of Mesodma, they diverge in the relative importance of that role considering the linear measurement analysis results.

If p4 size is the most reliable variable for differentiating species of Mesodma, there should be strong associations for all three linear measurements along the primary axes of variation (i.e., if the overall size of p4 increases, it would be expected that the three linear measurements scale isometrically). Our results indicate a strong loading only for the L measurement on PC1, suggesting that larger p4s are not simply isometrically larger versions of smaller p4s (Supplemental Material 2). The same holds true for the remaining PCs, and there is limited evidence that any one primary axis of variation is explained solely by size. Although these results point to significant complexities in the interaction of shape and size in the linear measurements, they nonetheless suggest that linear measurements capture aspects of p4 morphology that are not apparent from form data alone (i.e., if the same interactions of shape and size were present in form data, we would expect nearly identical results to those from linear measurements, but this is not the case).

Even though form data should in principle capture aspects of p4 shape and size, similar to linear measurements, only one of the included species (M. pygmaea) was clearly separable, almost certainly owing to its much smaller size (Table 2). These discrepancies indicate complex interactions exist between CS, shape, and the size of the p4, and may highlight the limitations of the analytical techniques used in this study. For example, previous research on the ecomorphology of adaptive radiation in extant carnivorans has suggested that rapid diversification in dental morphology can easily mask important morphological characters (Slater and Friscia, 2019).

A possible explanation for the difference in the ability of linear measurement and GMM methods to capture the same degree of morphological differences could lie in the amount of data that are captured by each method. GMM captures all possible shape variation delimited by the landmark scheme, whereas linear measurements reduces shape variation to a morphospace that is known to differentiate multituberculate species (Jepsen, 1940; Clemens and Kielan-Jaworowska, 1979). It is possible that the shape data collected is much too “noisy” to be informative in differentiating species compared to the entrusted linear measurement morphospace.

In our opinion, the nature of these interactions between methods and variables described above needs further exploration before any generalizations can be made about which variable may be the most reliable in distinguishing among p4s of Mesodma. While a more detailed analysis of these interactions is beyond the scope of this study, the results of our analyses of the KDS are largely reflected in those of the BCADS as well.

The Bug Creek Anthills Mesodma problem continues to be a problem: how many species?—The results of most prior studies of the BCA mammalian fauna have suggested the presence of either one (indeterminate; Novacek and Clemens, 1977) or two (M. formosa and M. thompsoni; Sloan and Van Valen, 1965) species of Mesodma. Lofgren (1995), although not excluding the possibility of both species being present at the BCA and other localities documenting typical BCA mammalian faunas, nonetheless opted to consider all specimens referable to Mesodma in this depositional context as indeterminate. A third possibility is that some of the BCADS p4s are referable to M. hensleighi or M. ambigua, species that are known from similarly aged deposits (Fox, 1971; Eberle and Lillegraven, 1998; Wilson, 2013). Mesodma hensleighi is among the smallest species of Mesodma so far discovered, with p4 lengths averaging 2.51–3.01 mm (Lillegraven, 1969); 50.3% (92 of 183) of the BCADS specimens fall within this range. The possibility of p4s of M. hensleighi being present in the BCADS is important to consider, and further emphasizes the uncertainties inherent in untangling multituberculate species identification at the BCA locality. Mesodma ambigua, while potentially present in the BCA mammalian fauna, is considered less likely given the results of our study, particularly based on the results of the linear discriminant analysis. Nonetheless, given these realities there is a low—but not zero—probability that specimens of M. hensleighi and M. ambigua are present in the BCADS, and this was considered in the following discussion.

The results of the BCADS lend some support to the hypothesis proposed by Novacek and Clemens (1977): those for the PCAs for shape, form, and linear measurements all resulted in a mass of data points, with no distinguishable clustering (Fig. 4). In contrast, the results of LDA were more conclusive, but mutually exclusive: where the results of the LDA for shape data saw a relatively even split of p4s referred either to M. formosa or M. thompsoni, those for form and linear measurements show starkly different results, with form data resulting in the majority of specimens being referred to M. formosa, and linear measurements data resulting in the majority of specimens being referred to M. thompsoni. We emphasize that the LDA was trained with M. formosa and M. thompsoni p4s and is limited to comparing to these data. Further work is needed in building a robust LDA data set to test these methods before regarding the discrepancy between the results of the PCA and LDA as well supported.

Based on the results of their study, Smith and Wilson (2017) suggested that the p4 of M. formosa increased in size (and hence body size, as proxy) across the K–Pg boundary. If this hypothesis is true, M. formosa and M. thompsoni from earliest Paleocene deposits are of near-equivalent body size, and their p4s are likely indistinguishable from one another. Contrarily, if Smith and Wilson's (2017) hypothesis is incorrect, M. formosa and M. thompsoni should be distinguishable on the basis of p4 size, regardless of locality or age. Given that the BCA fauna conceivably includes the remains of organisms that existed both prior to and immediately after the K–Pg event, it would be expected by chance alone that the BCADS would include p4s from individuals of both M. formosa and M. thompsoni that lived during both time intervals. The results of the PCAs for any of the included data (shape, form, linear measurements) offer little in the way of clarity, with a single, large cluster resulting for each lending support to Novacek and Clemens’ (1977) single-species hypothesis. In contrast, the results of the LDAs point to the presence of both M. formosa and M. thompsoni in the BCADS, but the proportion of p4s referred to each differs depending on which variable is used. If Smith and Wilson's (2017) hypothesis is true, some or all of the p4s referred to M. thompsoni may be referable to individuals of M. formosa that lived during the earliest Paleocene.

Given that the results of the PCAs provide little to no resolution, whereas those of the LDAs are more conclusive in their suggestion of two (or more, given the possibilities surrounding M. hensleighi) species in the BCADS, we are inclined to support the latter hypothesis, but recognize that the varying results point to significant differences in the resolving power of each of the variables used, and given the complex nature of the BCA deposit, further study is needed.

Species discrimination in Mesodma: An ongoing area of study.—Results of the morphometric analyses shown in our results provide evidence that aspects of both shape and size can contribute to distinguishing among p4s of species of Mesodma, but in different ways. From our results, we have identified a few items to contextualize future work with regards to species discrimination in Mesodma.

Although shape and size both contribute to morphological variation in the p4 of Mesodma when evaluated concurrently (i.e., multivariate linear measurements) or consecutively (i.e., 2D GMM form data), the results vary in which species are most readily distinguished and the magnitude of the differences among species. Using multivariate linear measurements, the differences among all included species (excluding M. formosa and M. thompsoni) were significant (Table 2). In contrast, 2D GMM form data found only M. pygmaea as significantly different from the other species (Table 2). The discrepancies between these results highlight the likelihood that shape and size are not captured in the same way with these methods, and caution is warranted when comparing results. Our results provide some support for Smith and Wilson's (2017) conclusions but also highlight that shape, size, form, and linear measurements interact in nuanced ways. Until a better understanding of these interactions is reached, we conclude that neither p4 shape nor size alone should be considered a more reliable variable.

The results of this also study illuminate the complexity of the BCA Mesodma problem. Considering the method-specific differences in the results, one possibility is that the BCADS includes not only specimens of M. thompsoni (from both the Late Cretaceous and earliest Paleocene), but also specimens of M. formosa from individuals that existed during the earliest Paleocene. In this scenario, size is of no assistance in distinguishing these species—a situation that naturally leads to questions of whether two or more species of Mesodma are represented in the earliest Paleocene in the study area (a possibility considered by Smith and Wilson, 2017). There are stark differences in LDA performance among the methods despite all three methods recovering a similar amorphous cluster of specimens in the PCA (Fig. 4). The use of LDA techniques therefore appear to hold some potential for addressing the difficulties in species recognition in the BCA faunas, but, like that for distinguishing among species of Mesodma more generally, a deeper understanding of the interrelationships of shape and size is warranted prior to any firm conclusions being drawn. Future work should develop more robust LDA training sets with a more diverse species list to further evaluate the indeterminate, two, and more-than-two species hypotheses for Mesodma species diversity at the BCA locality.

Given the well-known difficulties in distinguishing among species of Mesodma, the possibility that p4 size may be the most effective means for doing so could potentially assist in resolving what has been a longstanding taxonomic problem. However, our results confirm the existence of differences in p4 morphology among species of Mesodma that fail to be recovered in analyses of size-inclusive shape data (i.e., form), results that are largely in agreement with previous research (e.g., Novacek and Clemens, 1977; Archibald, 1982; Weaver and Wilson, 2021), and further suggesting that a reliance on size alone may not be beneficial for resolving this issue. Shape and size have an intricate relationship generally, and the two are rarely able to be isolated and manipulated while still retaining biological significance (Collyer et al., 2020). Our results highlight this reality. A better understanding of the relationship between shape and size will be a necessary precursor for future studies of species discrimination in Mesodma, other multituberculates, and other extinct mammalian groups identified by individual teeth.

We thank K. Seymour and D. Evans for access to specimens in the ROM collections, M. Caldwell for access to specimens in the UALVP collections, and B. Strilisky for access to specimens in the RTMP collections. We would like to thank D. Brinkman and M. Fox (Yale Peabody Museum of Natural History) for imaging several of the specimens used in this study during the COVID-19 pandemic. S. Smith assisted with elucidating some aspects of the methodology used in Smith and Wilson (2017). We would also like to thank D. Maucieri for lending code that was beneficial to the completion of this manuscript. We would like to thank D. Polly, L. Weaver, and anonymous reviewers whose comments greatly improved this manuscript.

The authors of this manuscript declare they have no conflicts of interest in submitting this manuscript.

Data and all supplemental materials are available from the Data Dryad Digital Repository: https://doi.org/10.5061/dryad.fxpnvx0zd

This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.