Abstract
Psychological science can inform climate science graph design, resulting in more meaningful and useful graphs for communication, especially with non-scientists. In this study, we redesigned graphs from the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) and compared participant attention and perceptions between original and novel designs using pre-/post-surveys, eye-tracking, graph usability and ranking activities, and interviews. Participants were selected for lower climate change content knowledge and risk perception from a sample of undergraduate students in the southeastern U.S. Here, we demonstrate our robust graph redesign process and the associated impacts on participants’ perceptions of graph usability, graph and scientist credibility, and risk associated with climate change. These findings indicate that interacting with climate change graphs may impact perceptions that are relevant to individuals’ motivation to take action to address climate change across political audiences, and possibly even more so among self-identified Conservatives. Additionally, participants who viewed graphs designed to align with research-informed best practices had greater increases in perceptions of climate scientist credibility and climate change risk, though these contrasts were not statistically significant (p > 0.05). Participants rated redesigned graphs as more trustworthy, which is critical to successful climate change communication, and our qualitative results provide a possible explanation and initial points of exploration for future research.
INTRODUCTION
Reducing harm from climate change requires public education about the facts and risks associated with current and future global change, and data visualizations and graphs can be effective means for communicating climate change to general audiences (IPCC, 2014; van der Linden et al., 2017). There is a growing body of cognitive and psychological science that can inform the design of climate change graphs that are both more effective for communication efforts and more accessible to broad audiences (Harold et al., 2016). Specifically, Harold and colleagues explored the existing literature to create and describe 11 guidelines for improving climate science graphs based on strategies for directing visual attention, reducing complexity, supporting inference-making, and integrating text with graphics. The authors then used these guidelines to make a revised version of an important Intergovernmental Panel on Climate Change (IPCC) figure, AR5 WG1 SPM.5, which was presented in their paper and inspired the current study. In this study, we tested original IPCC figures against new designs to measure the impact of each on viewers’ judgments of credibility, usability, and risk, as described in greater detail below.
Several recent studies have assessed the impacts of IPCC graph design on audience performance and perceptions, including the target audience of IPCC reports: policymakers and decision-makers (Fischer et al., 2020; Kause et al., 2020; McMahon et al., 2015). Designing and conducting assessments with a target audience in mind is especially important because climate perceptions and intake of climate information can vary widely across individuals. For example, there is a well-known discrepancy in climate change beliefs between U.S. political parties, such that 92% of Liberal Democrats think global warming is caused mostly by human activities, whereas only 30% of Conservative Republicans think the same (Leiserowitz et al., 2020). Hornsey et al. (2016) conducted a meta-analysis and found many factors related to climate perceptions, with the most significant being participants’ trust in scientists, perceived scientific consensus, ecological perspectives, experience of local weather change, and political affiliation. Notably, it is often assumed that scientific knowledge is a major determinant of climate change beliefs, but many researchers have debated whether it increases individuals’ perception of risk from climate change (e.g., Aksit et al., 2017; Kahan et al., 2012; Shi et al., 2016). Communication efforts using scientific and factual approaches may be more effective for younger audiences, perhaps due to a greater malleability of belief systems (Aksit et al., 2017; Stevenson et al., 2014). Recent work suggests that climate science communication can improve climate acceptance, policy attitudes, and willingness to make personal sacrifices for climate action, including long-term impacts on knowledge and acceptance of anthropogenic climate change (Ranney and Clark, 2016).
Perceptions of Climate Scientists
Because the public cannot make atmospheric measurements individually, they must rely on external sources of information about anthropogenic climate change, which has led to a wealth of research focused on perceptions of source credibility and scientific consensus surrounding climate change (Druckman and McGrath, 2019; Siegrist and Cvetkovich, 2000). Source credibility has been described in several ways in previous literature, but definitions often incorporate considerations for a consistent set of constructs, including (1) trustworthiness or honesty, (2) expertise or competence, and less often (3) intent or goodwill (Dong et al., 2018; Druckman and McGrath, 2019; Fiske and Dupree, 2014; Kahan et al., 2011; McCroskey and Teven, 1999). Several authors have studied climate change communication efforts focused on conveying the 97% scientific consensus around anthropogenic climate change (Bolsen and Druckman, 2018; Cook et al., 2016; van der Linden et al., 2015). Communication of scientific consensus has been shown to reduce political polarization of beliefs by up to 50% (van der Linden et al., 2017). However, in other studies, political identification and relevant knowledge may have instead moderated the effects of consensus messaging (Bolsen and Druckman, 2018). McCright et al. (2013) found that perceived scientific agreement was a stronger predictor of global warming beliefs (both acceptance and risk perception, in this case) than socio-demographic, political, or environmental identities. Determinations of climate scientist credibility are likely complex and affected by a range of judgments, such that climate skeptics may still have high regard for scientific authority and defer to scientists on matters of environmental policy (Sarathchandra and Haltinner, 2020).
Incorporating multiple factors relevant to climate perceptions in research can help elucidate the mechanisms by which individuals’ cultural backgrounds influence their climate beliefs. For example, the theory of cultural cognition of risk, described by Kahan et al. (2011), suggests that perceptions of risk are motivated by cultural values, and previous research specifically concerning climate change suggests that this relationship may be moderated by source credibility and perceptions of scientific consensus. Thus, the authors posit that uncertainty and denial of anthropogenic climate change persist in the public “not because members of the public are unwilling to defer to experts but because culturally diverse persons tend to form opposing perceptions of what experts believe” (Kahan et al., 2011, p. 166). In this view, participants’ worldviews and politics influence their perceptions of source credibility, which in turn influence their perceptions of what credible sources believe (i.e., their estimate of consensus among climate scientists that anthropogenic climate change is real and dangerous).
Recent research has expanded on this idea, notably Druckman and McGrath’s (2019) review of research on motivated reasoning, which highlights the importance of source credibility judgments to information intake and assimilation. They suggest that many members of the public may strive to form accurate opinions about climate change but may disagree on what information is credible, potentially along political divides. Thus, an individual’s acceptance or rejection of any piece of information, based on the source, leads to shifts in their personally held opinions. Akin et al. (2020) tested this idea directly using their Leveraging–Involving–Visualizing–Analogizing (LIVA) model, which showed that intentional climate communication efforts can reduce the identity-protective interpretation tendencies of both Liberal and Conservative audiences. The authors argue that presenting data and assigning a task (i.e., answering questions about the data) primes participants for accuracy rather than cultural cognition, which improves their acceptance of the information. Furthermore, even small details of how a piece of information is presented, such as framings of scientific uncertainty, have been found to impact trust in scientists, which then impacts whether a participant accepts the new information (Howe et al., 2019). Ultimately, public climate change acceptance and education, including consideration of the above factors, are important because they have been demonstrated to lead to climate change risk awareness and pro-environmental beliefs and actions (Hornsey et al., 2016; McCright et al., 2013; O’Connor et al., 1999; Ranney and Clark, 2016; van der Linden et al., 2015).
Climate Communication with Graphs
Climate communication often includes graphs of climate data, which has motivated extensive research on how to design such graphs for greater effectiveness and accessibility, as discussed above (Harold et al., 2016; Fischer et al., 2020). Limited evidence also shows that graph design may impact perceptions of credibility (McMahon et al., 2016) and risk, though this topic has been explored in greater depth in health communication than in climate change communication (Ancker et al., 2006; Okan et al., 2018). Ultimately, climate change communication efforts must include consideration of each audience’s cultural perspectives and perceptions of science and scientists, and more research is needed to explain the possible interactions of these factors.
Graph use and comprehension can be assessed using eye-tracking technology and the construct of usability (Atkins and McNeal, 2018; Goldberg and Wichansky, 2003; Renshaw et al., 2003). Eye trackers measure a user’s visual attention to space over time, either on a computer screen with a screen-based tracker or in three-dimensional space with eye-tracking glasses, using infrared emissions and reflections from the cornea and retina. Eye movement consists primarily of fixations, momentary pauses on a small (~5°) area during which information is viewed and processed, and of fast saccades between fixations (Duchowski, 2007). Thus, eye tracking measures attention by recording the location and duration of each fixation.
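As a concrete illustration, the sketch below shows one simple way such attention data can be summarized: total fixation duration (dwell time) within a rectangular area of interest (AOI). This is a minimal sketch for illustration only, not the study’s actual processing pipeline; the fixation coordinates, durations, and AOI bounds are hypothetical.

```python
# Minimal illustrative sketch (not the study's pipeline): total fixation
# duration ("dwell time") inside a rectangular area of interest (AOI),
# one common way to quantify visual attention from eye-tracking data.
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float            # horizontal screen position, pixels (hypothetical)
    y: float            # vertical screen position, pixels (hypothetical)
    duration_ms: float  # length of the fixation in milliseconds

def dwell_time_ms(fixations, aoi):
    """Sum the durations of fixations falling inside aoi = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = aoi
    return sum(f.duration_ms for f in fixations
               if x0 <= f.x <= x1 and y0 <= f.y <= y1)

# Hypothetical example: attention to the lower panel of a two-panel graph
fixations = [Fixation(400, 620, 230), Fixation(410, 660, 310), Fixation(900, 150, 180)]
lower_panel = (0, 500, 1920, 1080)  # assumed pixel bounds, for illustration
print(dwell_time_ms(fixations, lower_panel))  # 540
```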
Usability is a construct originating in computer science that describes how well a defined audience can use a tool to achieve a goal. It is therefore task-focused and measured using an individual’s effectiveness (accuracy) and efficiency (speed) in completing the task, along with user satisfaction with the tool during use. Eye tracking can be used to help measure these metrics, as well as to provide data about user attention during the task, potentially illuminating why and how the product is more or less usable (Bojko, 2013; Maudlin et al., 2020).
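To make the three components concrete, the hedged sketch below computes them for one hypothetical participant on one graph’s three data extraction questions; the trial records and the 1–5 satisfaction scale are invented for the example and do not reflect the study’s instruments or data.

```python
# Illustrative sketch only: the three usability components for one
# hypothetical participant on one graph's data extraction questions.
trials = [
    {"correct": True,  "time_s": 21.4},
    {"correct": True,  "time_s": 34.9},
    {"correct": False, "time_s": 58.2},
]
satisfaction = 4  # hypothetical self-reported ease-of-use rating (1-5 scale)

effectiveness = sum(t["correct"] for t in trials) / len(trials)  # accuracy: ~0.67
efficiency = sum(t["time_s"] for t in trials) / len(trials)      # mean time: ~38.2 s
print(effectiveness, efficiency, satisfaction)
```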
Objectives
In this paper, eye-tracking technology, interviews, surveys, and ranking activities were used to compare the usability and participant perceptions of original and redesigned IPCC graphs. We focused on measuring risk perception of climate change, which is related both to climate change knowledge (relevant to information uptake) and intention to take climate action (Stevenson et al., 2014; van der Linden et al., 2015). Additionally, we chose to examine perceptions of source credibility and consensus, because both are important to perceptions of climate change information and beliefs (Kahan et al., 2011). We were also curious about whether simpler or more attractive graphs may be considered less scientific or less trustworthy by participants (McMahon et al., 2016). The audience for this study consisted of undergraduate students recruited from large introductory classes. This was partly an audience of convenience, since our study required significant time and repeated engagement with participants; however, these findings could also apply to communication efforts with other undergraduates and public audiences, for which IPCC figures are often used.
This study was guided by several research questions: how does graph design influence (RQ1) participant attention to and usability of the graphs, (RQ2) participant perceptions of the graphs, and (RQ3) participant perceptions of climate change more broadly? We also investigated (RQ4) whether the activity had differential effects on participants with different self-reported political affiliations. Lastly, qualitative data were analyzed to learn (RQ5) what features of the graphs and experiences with them were relevant to participants’ judgments of credibility.
MATERIALS AND METHODS
Graph Redesign
First, a rubric was created to assess poor and successful applications of each guideline presented by Harold et al. (2016), and the rubric was reviewed by the primary author of that publication (Table 1). The rubric was then applied to four graphs from the IPCC’s AR5 Summaries for Policymakers (SPMs) with the rationale that the SPMs are the reports intended for non-expert audiences (i.e., policymakers rather than climate researchers; IPCC, 2013, 2014). Based on the rubric, an analysis of potential weaknesses of each graph was conducted, and improvements to each were made to create early redesigns (Table 2). Early graph redesigns and their accompanying rubrics were reviewed by several experts, including two authors of Harold et al. (2016).
Based on the expert feedback, additional changes were made to each graph, and two of the four were selected for use in this study (Redesign 1). Specifically, we used the graphs that had the fewest data series, were deemed simpler in design by the authors, and could best exemplify the redesign principles. These were graphs SYR SPM.3 (IPCC, 2014; Fig. 1) and WG1 SPM.1(a) (IPCC, 2013; Fig. 2). Because this study was inspired by Harold et al. (2016), we also used the original and redesigned versions of AR5 WG1 SPM.5 presented by those authors but with the second data set (total anthropogenic emissions at three timepoints) removed from each (Fig. 3). Thus, each of the three graphs used in this study contained the same amount of data in the original and redesigned forms, and design changes between the versions were relatively modest to ensure that any effects on use or perceptions were due only to those changes inspired by the Harold et al. (2016) guidelines.
The first guideline recommends iterative redesign of graphs, including user testing with an identified audience. Thus, after this study was completed with 60 participants, data from those participants were analyzed to inform a second redesign of two of the graphs (Redesign 2). No substantial improvements were apparent for the redesign of AR5 WG1 SPM.5 presented by Harold et al. (2016); therefore, a second redesign was not created or tested. Because the second redesign was based on user testing, experts were not consulted; instead, the redesign was informed solely by the collected data. The experiment was then completed with an additional 24 participants, all of whom used the Redesign 2 graphs.
The changes made for the redesigns of SYR SPM.3 (Fig. 1) are described in Tables 1 and 2, and design changes to WG1 SPM.5 (Fig. 3) are described in detail by Harold et al. (2016). For the first redesign of WG1 SPM.1(a) (Fig. 2), we focused on clarifying the relationships on the y-axis by emphasizing the zero (average) value and adding colors and text to emphasize the meaning of the values. We also made the existing labels and titles larger for improved accessibility and salience. Most participants were very comfortable interpreting the graph, so the changes made for Redesign 2 were primarily intended to reduce visual clutter and direct more attention to the lower half of the graph, which is described in greater detail below.
Recruitment and Participants
User testing is most meaningful when applied to a specific audience with shared perspectives and background knowledge who may therefore face similar usability challenges or benefit from similar modifications. Thus, the first step of this study was to build a defined audience (Table 3). To accomplish this task, an online survey assessing climate change perceptions and demographic information was emailed to introductory undergraduate classes at a large public university in the southeastern U.S. (Table 4). The results of this survey also served as a point of comparison for before and after the in-person study, and thus this survey is called the “pre-survey” going forward. Participants’ perception of risk posed by climate change was our primary outcome. We measured this using an index of six items regularly employed by the Yale Program on Climate Change Communication as well as an item on perception of climate scientist consensus around anthropogenic climate change used by those authors (Leiserowitz et al., 2017; see Methods section of the Supplemental Material1). We also used instruments related to source credibility (McCroskey and Teven, 1999; 18 seven-point bipolar items adapted to address perceptions of climate scientists), climate change knowledge (Libarkin et al., 2018; 21 multiple-choice items), graphing experience (Atkins and McNeal, 2018; five eight-point items), and demographic questions, largely those used by Libarkin et al. (2018). Survey respondents provided informed consent and were entered for a chance to win one of several $50 online gift cards. The survey received 704 complete and valid responses, and 631 of those participants agreed to be contacted for the in-person follow-up study described in this manuscript.
From those completed surveys, we invited 239 participants with scores below the group median in both climate change knowledge and risk perception to participate in the in-person study, resulting in 100 study sign-ups and 87 participants. The data from three participants were discarded due to poor eye-tracking capture (<70% combined capture). Of the in-person participants whose data were retained (N = 84), 43 identified as women, 41 as men, and none as other genders; 74 were white, 6 Black or African American, 2 Native American and white, 1 Black or African American and white, and 1 Latine and white. Fourteen identified as politically very Conservative, 39 as Conservative, 23 as middle-of-the-road, 7 as Liberal, and 1 as very Liberal. We selected those students with lower climate change knowledge and risk perception both because of the educational interests of the researchers and to improve our ability to measure any increases in risk perception, i.e., to avoid ceiling effects from participants with risk perception that was already high. Ideally, we could have included more participants to improve statistical power for the study but were limited by researcher (graduate student) time constraints. Additionally, the total sample size of 84 participants was deemed sufficient, if not ambitious, for a study including both qualitative analysis and eye tracking.
Experimental Methods
In the lab, participants were first given a verbal explanation of the study and an informative letter, approved by the Auburn University Institutional Review Board, to read and sign if they chose to do so. Then, each participant completed an activity on a computer with a Tobii eye tracker using either the three original IPCC graphs (n = 30), three Redesign 1 graphs (n = 30), or two Redesign 2 graphs (n = 24). Participants were not aware of additional graph design variants until the end of the eye-tracking experiment. The participants first viewed each graph alone to become familiar with it and then answered three questions at their own pace, each requiring them to extract specific values, years, etc., from the graph. Thus, these are referred to as data extraction questions (Table S2, see footnote 1). These questions were used as the task for the usability assessment (RQ1), for which efficiency and effectiveness were measured, and provided a basis for the participants’ ratings of the graphs’ ease of use (satisfaction). For each graph, the questions were presented in the same order for all participants; however, the order in which the graphs were shown was varied to minimize ordering effects.
After the eye-tracking activity was completed, participants took the post-survey, which included some of the instruments presented in the pre-survey to measure pre/post changes (Table 4). Next, each participant was introduced to an additional design variant of each graph to compare to the design they had seen during the eye-tracking activity. However, they did not complete the eye-tracking activity again, and instead compared printed-out paper copies while being interviewed, similar to a picture-sorting task (Morgan et al., 2002). For each pair of graphs (original and new design), participants were asked which one was easier to use and which looked more trustworthy. After comparing each graph side-by-side with the additional design variant, participants were asked to rank each of the graphs on the basis of trustworthiness (“in order from most trustworthy-looking to least”), ease of use (satisfaction; “in order from easiest to use to hardest”), and worry (risk perception; “in order of how worried they make you about climate change”). The participants’ rankings of the graphs were recorded, quantified, and statistically analyzed.
Analysis and Statistics
Research question (RQ) 1 was addressed via the eye-tracking activity; attention and use findings from eye tracking were used primarily in a qualitative manner to inform the second redesign of the graphs. The quantitative efficiency and effectiveness results of the activity were used to compare the usability of the graphs (Table 5). The results of the ranking activity were analyzed with non-parametric tests because the data were ordinal, dependent, and had unequal groups. These analyses provided evidence for the satisfaction component of usability (ease of use; RQ1) and all aspects of RQ2. RQ3 and RQ4 were addressed by comparing the pre-survey and post-survey data, including interactions with the political affiliations that participants provided at the end of the pre-survey (demographics section). To address RQ5, participants’ descriptions and explanations from side-by-side and whole-group comparisons were audio-recorded and transcribed for qualitative analysis using the Dedoose software program (Saldaña, 2013). Additional details about our analyses can be found in the Methods section of the Supplemental Material.
In addition to means comparison tests, we also report Cohen’s d effect sizes throughout the study. Cohen’s d measures the mean difference between groups standardized to the groups’ standard deviation, which provides more information than binary statistical significance outcomes alone (Cohen, 1988). Effect-size reporting is especially important to this study because our statistical power was limited by our sample size, whereas effect sizes are largely unaffected by sample size. Commonly used thresholds aid interpretation of d: small, medium, and large effects correspond to values of 0.2, 0.5, and 0.8, respectively (Cohen, 1988).
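For reference, the standard two-group form of Cohen’s d divides the difference in group means by the pooled standard deviation; repeated-measures variants differ mainly in the choice of standardizer (e.g., the pre-test or change-score standard deviation):

$$
d = \frac{\bar{x}_1 - \bar{x}_2}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}.
$$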
RESULTS
RQ1: Measuring Attention and Improving Usability of the Graphs
From the eye-tracking data, heatmap figures showing total attention to different features were produced for each data extraction question posed to participants with each graph design variant (e.g., Fig. 4). Heatmaps and interview data were qualitatively analyzed, and attention to key areas of interest was quantitatively analyzed to inform the second redesigns (Results section of the Supplemental Material). For example, participants using the first redesign of graph 2 (Fig. 4B) more often answered one of the questions incorrectly, possibly because they paid more attention to the upper panel of the graph, where the answer to this question was much more difficult to discern. Thus, in designing the next version of the graph (Fig. 4C), we decreased the distance between the two panels and increased the prominence of the graph labels, changes we expected to encourage use of the bottom panel and improve participant performance. After these changes were implemented, attention to the top panel decreased and accuracy increased.
There were few statistically significant contrasts in effectiveness or efficiency, the first two measures of usability, which were measured as participant accuracy and time spent on three data extraction questions per graph (Table S3, see footnote 1). Specifically, participants were less effective and efficient when using Redesign 1 of graph 1 than the original: F(2, 81) = 3.30, p = 0.04, Tukey HSD p = 0.03, and F(2, 81) = 3.83, p = 0.03, Tukey HSD p = 0.03, respectively. However, participants’ efficiency and effectiveness improved with Redesign 2, and there were no statistically significant differences between the original and Redesign 2. All other contrasts of the usability metrics between designs of the same graph were not statistically significant (p > 0.05), and each of the effect sizes describing the impact of graph redesign on effectiveness and efficiency was small or very small, which indicates that the changes made to the graphs did not drastically influence viewers’ ability to extract relevant data or the time it took to do so.
RQ2: Effects on Perceptions of the Graphs
In contrast, participant satisfaction, the third measure of usability, varied across graph designs. The redesigns of Graphs 2 and 3 were rated as significantly easier to use than the original versions, with large and medium effect sizes, respectively (Table 6). The ranking activity was also used to assess participants’ relative assessments of trustworthiness and climate change worry associated with each graph, which were meant to serve as a parallel to perceptions of climate change and climate scientists included in the surveys. Participants rated the redesigned graphs as significantly more trustworthy than all three original designs of the IPCC graphs, and they rated the redesigns of Graph 2 as significantly more worrisome than the original (all p < 0.01; Table 6).
RQ3: Effects on Perceptions of Climate Change and Climate Scientists
Overall, participants had significant increases from pre- to post-test in estimates of climate scientist consensus around climate change (t = 10.47, p < 0.01, d = 0.37), perception of risk associated with climate change (t = 15.28, p < 0.01, d = 0.32), and perception of credibility of climate scientists (t = 49.95, p < 0.01, d = 0.68). These changes are likely a result of participation in the graph activity on the computer, which concluded just before the post-survey, but we cannot confirm this direct effect because we did not include a control condition. Participants who interacted with the redesigned graphs had greater pre-/post-survey changes; however, between-group contrasts were not statistically significant (mixed analysis of variance [ANOVA], p > 0.05; Fig. 5). Specifically, the mixed ANOVA test statistics for the combined influence of participation in the study (within subjects) and experimental group (between subjects) were F = 0.31, p = 0.72 for the consensus estimate; F = 1.62, p = 0.20 for risk perception; and F = 1.00, p = 0.37 for perceptions of climate scientist credibility (additional statistics in Fig. 5).
RQ4: Interaction Effects on Perceptions from Ideology
Because previous work has suggested that ideology (political, worldview, and ecological) is one of the strongest predictors of climate change perceptions and beliefs, we were interested in whether participants reacted to the activity differently in relation to self-reported political affiliation, measured via pre-/post-survey changes (Fig. 6). Because the statistical power of this experiment was very low for five groups, we collapsed the political affiliation options into a binary variable, where very Liberal, Liberal, and middle-of-the-road participants were in one group (n = 31), and their Conservative and very Conservative peers were in another group (n = 53). A two-by-two mixed ANOVA revealed a significant interaction between change in the estimate of climate scientist consensus and participants’ identified political conservatism, F(1, 81) = 6.23, p = 0.02: the middle-of-the-road and Liberal students (the smaller group in our sample) did not significantly increase their estimates of consensus after the activity, but the Conservative and very Conservative students did, such that the two groups’ final perceptions of consensus converged. The other variables of interest (climate change risk perception and climate scientist credibility) did not show evidence of interactions with ideology.
RQ5: Participants’ Judgments of Credibility
Though participants’ performance in the graph activity was addressed quantitatively through usability, the qualitative data collection centered on participants’ subjective perceptions of their understanding of the graphs as they compared graphs and designs. Reflections on understanding were usually inseparable from participants’ comments about the ease of use of the graphs, a component of usability, so these concepts were all included in the code “understanding.” Overall, qualitative analyses indicated that credibility was most associated with participants’ understanding of the graphs, both because understanding a graph increased their capacity to judge it and because more understandable graphs were attributed to more credible creators (Table 7). Certain factors were common and critical to the former, including the amount of information in the graph that aided understanding (color-coded axes, contextual text, etc.). The latter, perceptions of the graphs’ creators, were usually tied to aesthetic factors such as the use of color, fonts, and organization in the graphs. Participants shared varying interpretations of how those aesthetic differences might relate to scientific, academic, and “official” sources of information (Table 7). Participants sometimes related aesthetics explicitly to factors related to source credibility, especially expertise/competence, and often tied these judgments to their perception of significant time and effort put into graph creation.
DISCUSSION
In this study, eye-tracking, interview, survey, and ranking methods were used to compare the usability and participant perceptions of original and redesigned IPCC graphs. We found that minor aesthetic changes to the graphs, grounded in cognitive science research and user testing, impacted participant perceptions of the graphs, climate science, and scientists. Participants perceived all three of the redesigned graphs to be more trustworthy than the originals, thought two of the redesigns were easier to use, and reported that one of the redesigns made them more worried about climate change than the original. We found some evidence that participants using the redesigned graphs had greater increases in climate change risk perception and perceptions of climate scientists’ credibility from pre- to post-survey, but the ANOVA test results were not statistically significant (p > 0.05), so additional investigation would be valuable.
The qualitative data in this study lend insight into possible causes or pathways for design changes to influence perceptions of the trustworthiness of the graphs and their assumed authors. It seems this connection is partly due to the close and important relationships between ease of use and credibility, but participants also emphasized aesthetic factors and resultant judgments about the creators of the graphs. It must be noted that these connections were made in the ranking and interview portions of the study, and thus participants were reflecting on their subjective perceptions of understanding, which can diverge from objective understanding (Fischer et al., 2020). In this study, we did not find significant improvements in the usability of the redesigned graphs, which would have indicated greater objective understanding. This was surprising because the framework we used to redesign the graphs was intended to improve comprehension and, therefore, performance on the data extraction tasks. Instead, the participants performed worse using redesigns of one of the graphs and statistically equivalently on all designs of the others. Some possible explanations include our use of simple graph designs, our choices of comprehension questions, and the redesign choices we made; perhaps the design changes were not substantial enough to produce performance differences between graphs (i.e., no changes in the amount of data presented, no change in data representation in one of the graphs, keeping most of the same labels and titles). Additional study is needed to understand the interactions between specific design aspects and performance on related tasks.
That said, it could be argued that subjective understanding and individuals’ associated judgments of credibility, consensus, and risk are more important to climate communication efforts than individuals’ ability to extract data and values from a graph. The strength of the connection between graph design and perceptions of graph trustworthiness is a novel and important finding, especially considering other new research highlighting the role of trust in scientists in climate perceptions (Dong et al., 2018; Druckman and McGrath, 2019; Hornsey et al., 2016; Howe et al., 2019; Sarathchandra and Haltinner, 2020).
Our findings suggest that the graph designs did not have an impact on participants’ change in estimation of scientific consensus (non-significant ANOVA and less distinct contrasts in effect sizes; Fig. 5). However, we did find a significant interaction between participants’ change in consensus estimate and political affiliation, such that politically Conservative participants “caught up” to moderate and Liberal participants’ perceptions of scientific consensus (Fig. 6). On the other measures, Conservative students’ pre-/post-survey changes were equal to those of their peers. This effect is notable because this study was conducted in a state where climate change acceptance is significantly lower than in the U.S. at large (Howe et al., 2015) and because it contrasts with previous research suggesting that Conservatives may be less influenced by scientific information (Kahan et al., 2012, with the caveat that those authors use worldview measures but mention the connections between worldview and politics; Luo and Zhao, 2019). However, other sources have shown positive connections between climate change knowledge and climate change acceptance and risk perception regardless of political ideology (Hornsey et al., 2016; Ranney and Clark, 2016; van der Linden et al., 2017). Furthermore, this dynamic may be especially relevant for younger audiences such as ours, which have previously shown greater connections between climate knowledge and risk perception (Aksit et al., 2017; McNeal et al., 2014; Rooney-Varga et al., 2018; Stevenson et al., 2014).
Additionally, this study shows that judgments of credibility are not determined solely by how information fits into participants’ ideological or political beliefs. In other words, politically Conservative students were equally impacted by the graphs even though the information was not explicitly tied to any culturally significant sources and even though climate change as a topic is typically associated with Liberal politics. This implies that influences on judgments of credible sources of information may be shared across cultural and ideological backgrounds. This is supported by a recent review suggesting that consumers of climate change information may be just as likely to be motivated to reach accurate conclusions as identity-protective ones, especially when the information is framed to encourage accurate inferences, and that they rely on judgments of credibility to do so (Druckman and McGrath, 2019). Other authors have tested methods for intentionally priming accuracy motivations, such as the LIVA research of Akin et al. (2020), whose presentation of graphical data with accompanying accuracy questions was very similar to that of this study. Thus, it is possible that our study design encouraged participants to reach accurate conclusions, rather than identity-protective conclusions, which would explain the equal and sometimes greater pre-/post-survey changes of political Conservatives. More generally, this further demonstrates that perceptions of source credibility are critical to climate change communication, and the design of graphs can be impactful and important for such communication efforts across audiences.
We also measured participants’ awareness of the 97% consensus among climate scientists surrounding anthropogenic climate change (RQ3a; Cook et al., 2016). We included these measurements because perceptions of scientific consensus are known to be closely tied to acceptance of climate change and risk perception (Bolsen and Druckman, 2018; McCright et al., 2013; van der Linden et al., 2014, 2015) and are possibly impacted by perceptions of climate scientist credibility (Kahan et al., 2011). The impact of climate scientist credibility perceptions is likely relevant in this instance, since none of the stimuli in the study directly addressed climate scientist consensus in any way, but clarification of this relationship would require additional research. The significant interaction between political affiliation and perception of scientific consensus in this study (discussed above) is notable, as it relates to previous findings that awareness of the 97% consensus may reduce political polarization around climate change (van der Linden et al., 2017), especially among individuals with low content knowledge (Bolsen and Druckman, 2018).
The audience used in this study is not the intended audience of the IPCC Summaries for Policymakers, but IPCC figures are often used for a wide variety of communicative and educational efforts. The IPCC also uses a great range of graph styles and levels of complexity, but our study only used very simple figures, so it is unclear whether our results would apply to more complex or uncommon IPCC graphs. Thus, we suggest that future research use equally rigorous methods and comparisons but test greater aesthetic contrasts, changes in language and legends, and more complex and creative visualizations. Additionally, research studies that use IPCC figures with their intended audience, decision makers, will illuminate more strategies and caveats for improving climate change communication with graphs, building on previous work (Fischer et al., 2020; Kause et al., 2020; McMahon et al., 2015). However, because IPCC graphs and others are frequently used in all kinds of climate communication efforts, future research with a wide variety of audiences will also be critical to improving climate literacy and reducing harm from climate change. Our qualitative findings showed myriad connections between graph characteristics and user perceptions that could be used to spark this exploration.
Limitations
Our eye-tracking and interview methods, while limiting the sample size of this study, allowed for a deeper, more thorough understanding of how this user group interacted with the graphs to inform the redesign process. Thirty participants per group is a rough “rule of thumb” used in mixed methods and eye-tracking research, and as few as 20 participants per group may be satisfactory for qualitative designs; thus, our sample size is quite large for qualitative analysis (Creswell and Plano Clark, 2018; Morgan et al., 2002). Though we approached the 30-per-group threshold, our statistical power was weak: we had 0.8 power to detect only effects of R² ≥ 0.11 (d ≥ 0.69) among three groups (between experimental conditions) or R² ≥ 0.09 (d ≥ 0.62) between two groups for interaction effects, as addressed by RQ4.
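To relate the thresholds above, note that for a fixed-effects ANOVA Cohen’s f is tied to R², and for two equal-sized groups d is approximately twice f:

$$
f = \sqrt{\frac{R^2}{1 - R^2}}, \qquad d \approx 2f,
$$

so R² = 0.11 corresponds to f ≈ 0.35 and d ≈ 0.70, and R² = 0.09 corresponds to f ≈ 0.31 and d ≈ 0.63, matching the detectable-effect values reported above to within rounding.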
Additionally, because the focus of this study was on testing the effect of research- and user feedback-based redesigns, we focused on internal validity (minimizing random variability to strengthen inference-making) at the cost of external validity (generalizability). Namely, the changes made to the graphs between redesigns were minor, especially for the line graph (graph 2), because we wanted to ensure that any changes were explicitly and solely based on the research and user feedback. However, this may be unrealistic, and thus less informative for situations in which dramatic changes are made to the aesthetics or language used in graph redesigns. The usability results of graph 1 show the importance of the user-testing aspect of this study: our design changes based on previous research alone seem to have lowered the usability of the first redesign, whereas the changes made to create the second redesign, based on user feedback, improved the usability such that it was equivalent to the original.
Similarly, we sampled from a very homogeneous population (undergraduates in large introductory classes, who were also given definitions of key vocabulary before the study began) to minimize variation in performance based on background knowledge, hoping this would maximize variation due to the experimental (graph) conditions. This, too, is unrealistic for real-world communication efforts, where audiences are typically much more heterogeneous in background knowledge and perspectives. Additionally, because this was a multi-phase research study about a controversial topic, there were many opportunities for students less comfortable with climate change to drop out, which could have resulted in self-selection and resulting sample bias. Furthermore, because we did not include a control group, any participant changes between the pre- and post-surveys cannot be definitively attributed to their interactions with the climate graphs and instead could be related to such sample biases or to other events that occurred between the survey administrations.
We must also acknowledge that our participants’ intensive 30–50 minute interactions with these graphs are not representative of everyday exposure to climate change information, which also limits the generalizability of this work. Despite this, there are currently no data to suggest that participants’ judgments of usability or credibility would differ during a briefer encounter; in fact, there was a detectable effect of exposure time for only one judgment of one of the graphs (Results section of the Supplemental Material). Additionally, this experiment was conducted only with undergraduate college students, who are not representative of the public at large, and in a university laboratory setting, which some participants said affected their judgments of credibility, according to their interview responses.
Suggestions for Climate Change Graph Design and Testing
The most important insights from our redesign experience and especially our qualitative analysis (interviews) are provided below. While similar to suggestions from previous literature (e.g., Harold et al., 2016; Hegarty, 2011; Kause et al., 2020), these also include consideration of user perceptions of credibility.
Among the various alterations made to the graphs, color-coding axes was the most user-praised change. Contextual scaffolding such as “warmer than average” text was usually highly praised, but this addition risks adding visual clutter.
Use of color was very important to participants both for understanding and judgments of credibility. Minimal and meaningful use of color (i.e., color-coding variables to show values or relationships) was perceived as highly credible.
Graph formats that were familiar to the audience were most useful. Participants were confused by even minor changes to common graph formats, e.g., non-bracketed error bars. If a less familiar format is used, include a key, such as the one on the redesign presented by Harold et al. (2016), which was perceived as helpful and more credible.
Pilot testing should be done early and thoroughly. Comprehension errors that appeared among the first several participants typically persisted, especially because the sampled population was relatively homogeneous.
Design should be done iteratively based on input from the target population of the communication tool. Different audiences may have very different content knowledge and graph experience, as well as very different associations with culturally bound perceptions, such as risk and credibility. In this study, changes based on the general literature had quite different impacts on perceptions and performance than improvements that also included user feedback.
CONCLUSIONS
We demonstrated a robust process for iteratively redesigning climate change graphs to improve accessibility for non-scientists using cognitive science literature, as synthesized by Harold et al. (2016), including user testing with an identified audience. We showed the potential for improvements to user awareness of climate change and perceptions of climate scientists and graphs (RQ2) from even minor changes in graph designs (RQ5) and across participant worldviews (RQ4). There were no statistically significant contrasts in performance or changes in perceptions between experimental conditions (ANOVA p > 0.1; RQ1, RQ3). However, participants who used the second redesign of the graphs (based on both cognitive science and user feedback) had the greatest increases in perceptions of climate scientist credibility and climate change risk perception, and participants rated redesigned graphs as being significantly more trustworthy (all three graphs), easier to use (two graphs), and more worrying (one of the graphs). Because participants had improved perceptions of the redesigned graphs without negative impacts on usability, we recommend applying both cognitive science and user feedback to improve graphs for climate change communication.
These findings and the rigorous cognitive and psychological science that underlie them are particularly significant as the IPCC recently published its Sixth Assessment Report. AR6 will be used by, and could be adapted for, a wide range of decision-makers and educators across the globe for years to come, and lessons learned from additional research and user-testing could inform future reports. The IPCC has recognized the importance of thoughtful, intentional communications, and has begun taking steps to make the reports and graphics more accessible to end-users, such as hiring graphics and communications officers at each Technical Support Unit and supporting communications guidance to authors (Corner et al., 2018; IPCC, 2016). As similar research and resources continue to emerge, these findings may inform the design of more accessible graphics to advance climate change awareness among the public and policymakers, ultimately reducing harm from climate change.
ACKNOWLEDGMENTS
We thank Jordan Harold and Tim Shipley for reviewing the graphs during the redesign process, and Jordan Harold for additional guidance and feedback during the design stage of the study. We also thank Robin Matthews and Melissa Gomis with the IPCC Working Group I Technical Support Unit for providing the vector files of the IPCC graphics, as well as Elijah T. Johnson and Haven Cashwell for their assistance with qualitative coding for reliability analyses. This study was approved by the Auburn University Institutional Review Board (protocol #18–085 EP 1803). This work was supported by the authors’ National Science Foundation Graduate Research Fellowship under grant DGE-1414475.