In searching for ways to improve undergraduate success in introductory geoscience courses, the importance of experiential learning in engaging students has become clear—and in geoscience, that is encapsulated best by field trips. However, as general education class sizes increase, so do the cost, liability, and difficulty of running a field trip. A solution for economically and conveniently bringing kinesthetic field experiences to a broader audience lies in the integration of technology through mobile-device games, apps, and augmented reality (AR) field trips. We report here an examination of learning gains at five colleges after intervention with augmented reality field trips to Grand Canyon. The AR field trips cover three topics taught in introductory geoscience courses: geologic time, geologic structures, and hydrologic processes.

Results involving nearly 1000 students show that overall gains are similar to control groups, with completion of the AR field trips being a predictor of student learning success in some cases. Prior interest in the geosciences, students’ base-level understanding of the material, and whether or not the student is a science, technology, engineering, and mathematics (STEM) major are strong predictors of improvement in geoscience learning. Gender and ethnicity had no statistical impact on the results, suggesting the AR field trip modules have broad reach across student demographics. Because these modules have been shown elsewhere to increase student interest in learning the geosciences, we advocate their adoption, leading to increases in student learning.

Traditional undergraduate geoscience education often fails to inspire and engage students (Krockover et al., 2002; McConnell et al., 2003), contributing to trends of declining interest, low persistence, and lack of diversity among U.S. students in science, technology, engineering, and math (STEM) disciplines (e.g., Seymour, 2001; Ashby, 2006; Fairweather, 2010). A more effective alternative to lecture-focused, traditional approaches is experience-based learning (Andresen et al., 1996; Mazur, 2009; Deslauriers et al., 2011). Kolb (1984) described experiential learning as the gaining of knowledge through transformation via experience. In the geosciences, experiential learning and problem solving are best delivered through field experiences, which include making observations and orienting oneself spatially within landscapes (Orion and Hofstein, 1994; Tal, 2001; Fuller, 2006; Bowen and Roth-Wolff, 2007; Simmons et al., 2008; Kastens et al., 2009; Mogk and Goodwin, 2012). Field excursions provide students with opportunities to hone observational and critical-thinking skills by distinguishing features amid visual complexity (Kastens et al., 2009; Mogk and Goodwin, 2012). For geoscience, field trips provide a primary map-reading (orienteering) and kinesthetic experience, analogous to the role of laboratory experiments in physics and chemistry. Despite their value, field experiences are often prohibitive for high-enrollment introductory classes due to the expense, liability, and time constraints in the modern university setting (McGreen and Sánchez, 2005; Friess et al., 2016). Some students struggle to find geologic features being pointed out in a natural setting and may not ask for help in doing so. Further, the best sites may be too remote, unrealistic for an online class, or inaccessible to persons with disabilities who have limited access to rough terrain.

A solution for bringing experiential, kinesthetic field trips to a broader audience lies in ongoing advances in mobile communication and augmented reality (AR) technologies. AR involves the real-world environment with elements supplemented, or “augmented,” by computer-generated input. For the purposes of this AR field trip study, experiential learning is considered not under the strict definition of Kolb (1984), but within the context of the gaming “flow experience.” Kiili (2005) described an experiential gaming model as one allowing learning through a cyclic process that provides direct experience in the game. The experiential gaming model includes idea generation, experiences, and challenges that generate new ideas (Kiili, 2005). Although mobile AR technologies can represent a communication gap between incoming freshmen and educators (National Higher Education ICT Initiative, 2007; Perlmutter, 2011; Dahlstrom and Bichsel, 2014), studies have shown that simulations, games, and virtual field trips (VFTs) actually increase students’ motivation (McGreen and Sánchez, 2005; Bell et al., 2009; Honey and Hilton, 2011; Johnson and Johnston, 2013; Bursztyn et al., 2017). There have been increasing reports of VFTs being used in a variety of college courses, including biology, medicine, engineering, geography, and geology (e.g., Spicer and Stratford, 2001; Liarokapis et al., 2004; Stumpf et al., 2008; Jacobson et al., 2009; Yuen et al., 2011; Lee, 2012; Pringle, 2013; Friess et al., 2016). Results from those studies indicate that students enjoy using the VFTs, and researchers see gains in interest in the material through the interactivity and immersive experience as compared to traditional learning (Spicer and Stratford, 2001; Stumpf et al., 2008; Jacobson et al., 2009; Pringle, 2013; Friess et al., 2016). AR usage in medical, engineering, and mathematics education has resulted in similar findings (e.g., Liarokapis et al., 2004; Yuen et al., 2011; Lee, 2012).

Recent research has also investigated the use of mobile devices, such as smartphones, to further contextualize learning (Roschelle, 2003; McGreen and Sánchez, 2005; Clough et al., 2008). Smartphone and tablet computer technologies are increasingly ubiquitous among college students (Dahlstrom and Bichsel, 2014; Anderson, 2015), and they have global positioning system (GPS) technology built in for spatial orientation and navigation. This was most recently demonstrated by a hugely popular app that takes advantage of spatial orientation and navigation: Pokémon Go. Because of the established motivational aspects and convenience, educators will be tasked with teaching geoscience concepts in a spatially oriented context with the technology already in students’ hands. However, the question remains: Do virtual or AR field experiences actually improve geoscience learning?

This research aims to determine the impact of AR field trips on student learning of geoscience concepts in introductory classes in a variety of postsecondary institutions and environments. The Grand Canyon AR field trip modules were developed, tested, and made freely accessible to students and faculty of diverse backgrounds and physical abilities. They were designed to be easily incorporated into higher education programs and curricula at institutions globally. Specific research questions for this study were:

  • (1) How does learning with mobile AR field trips impact student geoscience concept scores?

  • (2) Which AR field trip, student, and institutional factors best predict student geoscience concept scores?

The AR field trip modules developed and tested in this study utilize relative GPS locations for spatial orientation and reference, meaning that the starting location is set as the origin for the rest of the game. The location-based app development platform GeoBob, made by the Interactive Design for Instructional Applications and Simulations (IDIAS; Shelton et al., 2012) laboratory at Utah State University, was used for the original AR field trip prototyping (Bursztyn et al., 2015). The modules require the students to physically navigate a Grand Canyon landscape that is scaled down to a 100-m-long playing field. The geographic location of the player does not matter; however, since GPS is integrated into the application, the module must be played outside. The design takes advantage of the benefits of games that provide immersion-in-context, rewards for correctness, and immediate feedback in response to student interaction (Fig. 1). In this case, the student interaction includes correct/incorrect responses and tapping/swiping observations on the touchscreen of a smartphone or tablet (apps are described in more detail in Bursztyn et al., 2015; Bursztyn et al., 2017). These interactive features of the AR field trips mirror the experiential gaming model described by Kiili (2005), with the experience component being most beneficial when the game provides clear goals and appropriate feedback. Studies have shown that these gaming features contribute to increased student engagement through greater self-confidence and self-efficacy (Mayo, 2009; Bursztyn et al., 2017).
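The idea of relative GPS positioning can be illustrated with a minimal sketch. It is not the GeoBob implementation, which is not described here; the function names, the equirectangular approximation, and the straight-line measure of progress along the field are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; adequate over a ~100 m field


def local_offset_m(origin, point):
    """Return the (east, north) offset in meters of `point` from `origin`.

    Both arguments are (latitude, longitude) pairs in decimal degrees.
    An equirectangular approximation is used, which is an assumption that
    holds well only over short distances such as a 100 m playing field.
    """
    lat0, lon0 = (math.radians(v) for v in origin)
    lat1, lon1 = (math.radians(v) for v in point)
    east = (lon1 - lon0) * math.cos(lat0) * EARTH_RADIUS_M
    north = (lat1 - lat0) * EARTH_RADIUS_M
    return east, north


def field_fraction(origin, point, field_length_m=100.0):
    """Return how far along the field the player is, as a fraction in [0, 1].

    Progress is measured as straight-line distance from the starting point
    (a hypothetical choice), clamped to the length of the field.
    """
    east, north = local_offset_m(origin, point)
    return min(math.hypot(east, north) / field_length_m, 1.0)


# The first GPS fix becomes the origin, so the absolute location is irrelevant.
start = (41.7400, -111.8100)        # hypothetical starting fix
current = (41.74045, -111.8100)     # hypothetical later fix, slightly north
progress = field_fraction(start, current)
```

Because every later fix is expressed relative to the first one, the same module can be played on any open field, which matches the statement that the geographic location of the player does not matter.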

The three AR field trip modules (geologic time, geologic structures, and hydrologic processes) cover key curriculum concepts that can easily be addressed with iconic Grand Canyon features. The themes of geologic time, geologic structures, and hydrologic processes were selected for their universality in introductory geoscience courses (Bursztyn et al., 2015). A workshop with collaborating geology and education faculty was held to determine specific content within the themes to be included in the AR field trips. This collaboration resulted in the inclusion of the following constructs: (1) for geologic time: stratigraphic principles, unconformities, relative dating, numeric dating, and human versus geologic time; (2) for geologic structures: stress and strain, folds, faults, strike and dip, and plate tectonics (as related to faults); and (3) for surface-water processes: the hydrologic cycle, fluvial hydrology, sediment transport, groundwater, and human influence on surface water.

All three modules use the same base map of Grand Canyon for navigation between stops, and each module begins at the traditional rafting trip launch of Lees Ferry and has 10 different field trip stops that represent outstanding real-world examples of curriculum content. The field trip stops appear in sequence after a multiple-choice question and interactive touchscreen task have been answered correctly (Bursztyn et al., 2015). Points are allocated to each question based on the number of attempts, and incorrect answers trigger explanations of the answer selected, so that the student can immediately know why the answer they chose was not correct (Fig. 1). Students must physically navigate to each new location (Fig. 2) and complete interactive touchscreen activities either requiring them to identify and tap on a geologic feature or swipe the screen to draw a line (along a fold axis or on a graph) or indicate the direction of movement of a fault’s hanging wall (Fig. 1), for example. Each module takes ∼20 min to play through, a length of time aimed to capture the typical student’s attention span (Middendorf and Kalish, 1996; Milner-Bolotin et al., 2007). The apps are free to download on iTunes and Google Play, titled GCX: Geologic Time, GCX: Geologic Structures, and GCX: Hydrologic Processes.
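The attempt-based scoring and immediate feedback described above might look something like the following sketch. The point values, function names, and example explanation text are hypothetical; the actual scoring rule used in the apps is not given here.

```python
def points_for_question(attempts, max_points=3):
    """Hypothetical rule: full points on the first try, one fewer point
    for each additional attempt, never dropping below zero."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return max(max_points - (attempts - 1), 0)


def answer_feedback(choice, correct_choice, explanations):
    """Return (is_correct, message) for a multiple-choice response.

    `explanations` maps each incorrect choice to a short explanation of why
    that particular answer is wrong, so feedback is specific to the answer
    the student selected.
    """
    if choice == correct_choice:
        return True, "Correct. The next field trip stop is now unlocked."
    return False, explanations.get(choice, "Not quite. Try again.")


# Illustrative question content, invented for this example only.
explanations = {
    "A": "Superposition places the oldest layer at the bottom, not the top.",
    "C": "An unconformity is a gap in the record, not a rock layer.",
}
is_correct, message = answer_feedback("A", "B", explanations)
```

Tying the explanation to the selected answer, rather than showing one generic hint, is what lets the student see immediately why that specific choice was wrong.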


The AR field trips were tested in introductory physical geology and earth science classes at collaborating institutions; however, not all participants completed all three AR field trips. The diverse student population included STEM majors, nonmajors fulfilling their general education science requirement, community college students, large public university students, and private liberal arts college students. We classified these groups of students as coming from institutions that were either teaching-focused (TF), teaching-research split (TR), or research-focused (RF; see Supplemental Table 1 [see footnote 1]). Typical enrollment in these introductory geoscience courses was dependent on the institution and ranged from 20 to 300 students. Control groups came from the participating institutions with the largest student enrollments, which included the research-focus (RF) and teaching-research split (TR) schools. Control groups completed the same pre- and posttests; however, participants in control groups experienced traditional curriculum instead of the AR field trips. All of the classes that participated in this study were traditional lecture-based courses with accompanying laboratories. Part or all of the traditional laboratories were replaced with AR field trips for the experimental group.

Assessment Instruments

Evaluation instruments encompassing interest in the geosciences and understanding of introductory-level geoscience concepts, as well as a demographics survey, were used at the beginning of the semester. The same content-specific assessments were then administered within two weeks after each intervention was complete. The Geoscience Interest Survey (GeoIS) was adapted from the Motivated Strategies for Learning Questionnaire (MSLQ) and was the primary instrument in a related study (Bursztyn et al., 2017). Assessment results were analyzed and compared across demographic groups, in the context of pre-intervention interest, and by institution.

The assessment content questions were assembled from the Digital Library of Earth Science Education (DLESE) and the Science Education Resource Center (SERC) at Carleton College, including the Geoscience Concept Inventory (GCI) and ConcepTests. These assessment resources have been used in other geoscience education research studies (Libarkin and Anderson, 2005; McConnell et al., 2003, 2006; Petcovic and Ruhf, 2008). DLESE is a comprehensive online source for geoscience education that contains a collection of pedagogically sound, technologically robust, and scientifically accurate resources, including multiple-choice assessment questions, about the Earth system. ConcepTests and the GCI are conceptual multiple-choice questions that focus on a single concept, are clearly worded, are intermediate in difficulty, and have response sets that fit into the same category (e.g., principles of relative dating) but remain distinct from each other. Our collaborating instructors vetted the questions from these sources and agreed upon the final 10 selected to assess each module theme based on those questions covering fundamental curriculum concepts.

Reliability of the content posttests was assessed using Cronbach’s alpha (scale 0–1). For geologic structures (0.49), hydrologic processes (0.47), and geologic time (0.43), alpha scores were all quite low (DeVellis, 2016), suggesting that the content areas were not being measured in a stable way. This result may be due to the low number of items and the fact that Cronbach’s alpha is a lower bound estimate of reliability (Cronbach, 1951), as well as the particularly challenging context for reliability. Since Cronbach’s alpha is based on interitem correlations, the broad spectrum of content covered makes reliable and succinct measures difficult. However, assessing that range remains important given the nature of introductory geology courses.
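For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from a table of item scores as sketched below. The function name and the toy data are illustrative assumptions and are not the study data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Compute Cronbach's alpha for a list of student rows.

    Each row holds one student's score on every item (for example 1 for a
    correct answer and 0 for an incorrect one). The formula is
        alpha = k / (k - 1) * (1 - sum(item variances) / variance(totals)),
    where k is the number of items.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*scores))
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: four students, three items (invented for illustration).
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(toy_scores)
```

Alpha rises when items covary strongly relative to their individual variances, which is why a short test spanning many loosely related topics tends to produce low values.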

To examine validity, an exploratory factor analysis of each measure was considered (Stevens, 2002). In the three cases, all 10 variables (pre- and posttest questions) were loaded onto a single factor with similarly ranging factor loadings. Geologic time had factor loadings ranging from 0.21 to 0.58. For geologic structures, factor loadings ranged from 0.22 to 0.49, and for hydrologic processes, loadings ranged from 0.22 to 0.61. Given the number of observations, range of factor loadings, and single factor solutions, the pre- and posttest questions appear to have been measuring the same construct (Stevens, 1999). Coupled with item review from content experts, the overall validity is considered strong. In the case of hydrologic processes, which had an especially diverse set of topics, the factor analysis suggested dropping four items. Thus, the final scales ranged from 1 to 10 for geologic structures and geologic time and 1–6 for hydrologic processes.

Variables and Statistical Methods

Relevant variables were determined by generating a correlation matrix, and these consisted of: pre- and postintervention scores for each of the three content areas; whether or not students completed the AR field trip associated with a content area; designation of the institution as teaching-focus (TF), teaching-research (TR), or research-focus (RF); student focus (geology majors, STEM majors, non-STEM majors); student demographics (ethnicity, gender); and students’ level of pre-intervention interest in geosciences (see Supplemental Table 2 [footnote 1]).

To address the first research question examining the impact of the AR field trips on students’ geoscience content test scores, three analyses of covariance (ANCOVA) were run. As recommended by Campbell and Stanley (1963) for research when students are not randomly assigned, control was considered for preexisting differences by using the individual content pretest scores as a covariate for all groups, including those who completed and who did not complete a particular AR field trip. For the second research question, multiple regression was used to examine factors that predicted students’ geoscience content test scores. Multiple regression analysis tests the impact of two or more predictor variables on a single outcome variable. This method allows for examination of the joint effect of all the predictor variables on a single outcome while parsing the influence of each individual predictor.
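As a rough illustration of the modeling approach, a two-group ANCOVA with one covariate can be written as a linear regression of posttest score on pretest score plus a group indicator, and multiple regression extends the same fit to more predictors. The sketch below fits such a model by ordinary least squares using only the standard library. The data are invented, the function names are hypothetical, and no significance tests are included; it is not the analysis code used in the study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, outcomes):
    """Fit outcome = b0 + b1*x1 + ... by least squares (normal equations).

    `rows` holds the predictor values for each student, without the
    intercept column. Returns [b0, b1, ...].
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, outcomes)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Invented data: (pretest score, completed AR field trip: 1 = yes, 0 = no).
predictors = [(2, 0), (4, 0), (6, 0), (3, 1), (5, 1), (7, 1)]
posttest = [2.0, 3.0, 4.0, 4.5, 5.5, 6.5]
intercept, pretest_slope, completion_effect = fit_ols(predictors, posttest)
```

In this framing, the coefficient on the completion indicator is the difference in posttest score between groups after adjusting for pretest score, which is the quantity the ANCOVA tests.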

GCX: Geologic Time was completed by students at all participating institutions (n = 540; Table 1). Students at four out of five institutions, and representing all institutional variables (TF, TR, and RF), completed GCX: Geologic Structures (n = 315; Table 1). Students at three out of five institutions, representing only two of the three institutional variables (TF and RF), completed GCX: Hydrologic Processes (n = 219; Table 1). Mean gains for all students (experimental and control) were generally minor, from a score of 3.9 to 4.9 out of 10 for geologic time, 3.7 to 5.4 out of 10 for geologic structures, and 2.6 to 3.0 out of 6 for hydrologic processes (Table 1). ANCOVAs revealed no statistically significant differences between AR field trip participants and control groups on posttest scores after accounting for pretest differences. Effect size comparisons were equally modest, with η² values ranging from 0.05 to 0.15 (Table 1). These results indicate that experimental and control group differences account for 5% to 15% of the variability on geoscience content test scores. All students improved their knowledge over the time of the study at a statistically significant level.
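The interpretation of η² as a proportion of variability can be made concrete with a short sketch. It computes the classical, unadjusted η² for a simple comparison of groups; the values reported in Table 1 come from ANCOVA and would additionally account for the pretest covariate, so this is an assumption-laden simplification with invented scores.

```python
def eta_squared(groups):
    """Return eta squared = SS_between / SS_total for a list of score groups.

    The result is the fraction of total score variability that is
    associated with group membership (for example treatment vs. control).
    """
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Invented posttest scores for a control group and a treatment group.
control = [4, 5, 6]
treatment = [5, 6, 7]
effect_size = eta_squared([control, treatment])
```

A value of 0.05, for instance, would mean that 5% of the spread in scores lines up with group membership and the remaining 95% does not.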

For the prediction of posttest scores, most of the predictors were selected based on their statistically significant correlations with posttest scores and lack of correlation with other predictors (Supplemental Tables 1 and 2 [see footnote 1]). The variables “gender” and “ethnicity” were kept in the models because they are theoretically relevant, though not statistically significant. For geologic time, each one point increase on the Geo-IS pre-intervention survey was associated with a 0.06 increase on the posttest (Table 2). Being a STEM major or completing GCX: Geologic Time was associated with a 0.56 or 0.51 point increase on the posttest, respectively. For site classification, “research-focus (RF) institution” was used as the reference group with a coefficient of 0, so this is not shown on Table 2. From that reference group, the site classifications of “teaching-research (TR)” combined and “teaching-focus (TF)” were measured by how far removed from the reference group their results fell.
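The reference-group coding and the reading of the coefficients can be sketched as follows. The helper names are hypothetical. The three coefficients are the geologic time values quoted above; the intercept and the remaining coefficients from Table 2 are not reproduced, so the function returns only a difference in predicted posttest score with all other predictors held fixed.

```python
SITE_LEVELS = ["RF", "TR", "TF"]


def dummy_code(value, levels=SITE_LEVELS, reference="RF"):
    """Return 0/1 indicators for every level except the reference level.

    The reference level is represented by all zeros, which is why its
    coefficient is implicitly 0 and does not appear in the results table.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels if level != reference]


# Geologic time coefficients as quoted in the text (Table 2).
GEOLOGIC_TIME_COEFFICIENTS = {
    "interest_per_point": 0.06,
    "stem_major": 0.56,
    "completed_module": 0.51,
}


def predicted_shift(interest_points, stem_major, completed_module):
    """Difference in predicted posttest score relative to a student with
    `interest_points` fewer interest-survey points who is not a STEM major
    and did not complete the module, other predictors being equal."""
    c = GEOLOGIC_TIME_COEFFICIENTS
    return (
        c["interest_per_point"] * interest_points
        + c["stem_major"] * int(stem_major)
        + c["completed_module"] * int(completed_module)
    )


tr_indicators = dummy_code("TR")               # [TR indicator, TF indicator]
shift = predicted_shift(0, stem_major=True, completed_module=True)
```

Each site coefficient in the table is therefore read as an offset from the research-focus group rather than as an absolute score.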

All three models were predictive of posttest scores, and the R² measures of effect size (Table 2) suggest that anywhere from 87% to 90% of variability on the outcome can be accounted for with the set of predictor variables. A factor most important to this study is that completion of the AR field trip modules in geologic time and hydrologic processes was associated with a predicted gain of 0.51 and 0.52 from pre- to posttest scores, respectively. This predicted increase in score is on par with gains associated with being a STEM major for the geologic time AR field trip. Completing the AR field trip module had no statistically significant impact for geologic structures. In all three regression analyses, interest and content pretest scores were both consistent and positive predictors of student learning. STEM majors were predicted to have higher posttest scores for geologic time and geologic structures but not hydrologic processes. For all three AR field trip modules, results indicate that being at a teaching-focused institution resulted in less improvement in understanding of material relative to research-focused institutions (Table 2). Along the same lines, the results show that there was a statistically significant lower performance among students at the split-focus teaching-research institution for geologic structures (Table 2); however, this latter, module-specific, result could be due to a curriculum difference at that particular site.

Although gains in student performance were present and consistent for all groups for the content assessments, none of the mean scores was over 55%, which is far from a “good”—or passing—grade. The level of difficulty of these tests provides key context for this result. The relatively difficult assessment tests result in exposure of subtle gains, whereas an easy assessment would likely mask these changes, making discerning minor improvement extremely difficult.

Most importantly, completion of the AR field trip modules for two of the three themes was a stronger positive predictor for gains in content comprehension than being a STEM major. The variable “STEM major” was an expected predictor of increased posttest scores because that group of students has a declared interest and presumed ability in the sciences. Student motivation to learn the material, as measured by their score on the Geoscience Interest Survey, was also expected to be a valuable predictor, because motivation is ranked as the most important driver for student learning by many postsecondary geoscience educators (Gilbert et al., 2012). While the results of this study show that student interest is a statistically significant predictor of student learning, it is not a major driver of increased posttest scores. Gender and ethnicity were considered as predictor variables because of the nation-wide and decades-long concern over low numbers of minorities in STEM fields (Ashby, 2006; National Research Council, 2011; Chang et al., 2014). Gender and ethnicity had no significant impact on gains from pre- to posttest scores, and these variables were also not significant predictors of student interest and motivation to learn the geosciences (Gilbert et al., 2012; Bursztyn et al., 2017).

The variability of student success for each AR field trip theme across site classifications may be mapping the disparities in curriculum between institutions as well as the student demographics at particular institution types. Throughout this study, it became clear that there was a marked disparity within introductory geoscience curricula in the amount of each content area that is taught, dependent on instructor preference and local geology, as well as variations in technical terminology usage. We were able to accommodate collaborating instructors’ different terminology preferences by adding definitions within the content assessment text and AR field trip text and audio (e.g., defining “stage” as “water level,” “confluence” as “join,” and “catchment” as “watershed”). However, one of our teaching-research split institutions did not consistently include faults and folds in their introductory geology course, and this is seen as a consequence in Table 2 in the form of substantially worse results for the TR institutions in geologic structures as compared to the other subjects. The teaching-focused institutions in this study were community colleges with open admission policies, giving those institutions access to a broader array of students than the other collaborating schools.

It is clear that all student participants in this study had similar improvement regardless of completing one or more AR field trips or receiving traditional lecture and laboratory instruction. The flip side of this result is that the AR field trip modules did not decrease or detract from student learning, which is consistent with findings from recent research comparing traditional curriculum with other VFT-like activities (Stumpf et al., 2008; Stokes et al., 2012; Pringle, 2013; Friess et al., 2016). Elsewhere, VFTs and AR field trips have been shown to improve student interest in learning the geosciences (Spicer and Stratford, 2001; Pringle, 2013; Bursztyn et al., 2017), so these improvements in student engagement can be gained with mobile-device activities without negatively impacting student learning. In the future, it will be important to determine what improvements are necessary for mobile games, AR field trips, and VFTs to more significantly improve student learning as well as student interest.

Assessment of constructs as broad as geologic time, geologic structures, and hydrologic processes is challenging. However, the single factor solutions for each exploratory factor analysis and expert collaborator reviews offer an encouraging view of validity. The low reliability scores for the geoscience concept assessments suggest some measurement error, and these low reliability scores may partially explain the lack of differences between the control and treatment group students. The lack of significant differences in posttest geoscience concept scores between control and treatment groups from this study is consistent with other related studies discussed in the introduction of this paper. Students took the pre- and posttests individually online and were encouraged to “just go with their gut responses” to discourage any collaboration or research. The low scoring on the assessments suggests that this “just go with your gut” instruction may have had the adverse effect of encouraging rushing and a lack of effort. Future work might focus on games and/or AR field trips that are more narrowly focused on specific geoscience concepts with correspondingly focused assessments. However, it is equally important to develop virtual and AR field trips that meaningfully represent the broad content covered in introductory geology courses. Finally, in light of the challenging weather often experienced during field testing (heat [114 °F], downpours, and snow), it will be important to investigate the efficacy of a desktop (Web-based) nonmobile version of these AR field trips.

The future of science education will inevitably involve mobile technology, but are game-like mobile apps that simulate field trip experiences effective for education? The results of this study show that three Grand Canyon AR field trip modules resulted in overall similar gains in learning as compared to other teaching methods, with subtle improvements associated with the completion of two of them. In the case of the geologic time and hydrologic processes AR field trips, the statistics (Table 2) suggest that completing these modules did in fact contribute to an increased gain from pre- to posttest scores.

The major factors that most clearly correspond to student learning, as might be expected, are whether or not students are STEM majors, whether they have self-identified as being interested in geoscience, and whether they possessed an initial understanding of the material. By contrast, the lack of impact on posttest scores across ethnicities and genders suggests that these modules have the advantage of being accessible across a broad swath of student demographics. The AR field trip modules proved to increase student interest in learning the geosciences (Bursztyn et al., 2017), so that increased interest combined with student learning and broad accessibility should lead to further increases in student performance as the quality of mobile-device educational games continues to improve.

This research was made possible by funding from the National Science Foundation TUES award DUE-1245948. We would like to thank our collaborating geoscience faculty focus group: Michelle Fleck, Richard Goode, Eric Hiatt, and Laura Triplett. We would also like to thank all faculty and their students at our participating test institutions who provided data for this research. Finally, we would like to acknowledge the reviewers of this manuscript for their keen insight and valuable comments that significantly improved the presentation of our findings.

¹Supplemental Tables. All predictor variables used in statistical analyses and correlation table of all variables. Please see the full-text article to view the Supplemental Tables.