Volumetric imaging techniques, such as high-resolution X-ray computed tomography, allow three-dimensional measurements of discrete objects inside solid samples. This paper introduces a new computer program called BLOB3D that is designed to allow efficient measurement of up to thousands of features in a single sample, such as porphyroblasts, sediment grains, clasts, and voids. BLOB3D implements an original suite of software methods, divided into three modules, which respectively enable the tasks of segmenting those regions in the data volume that correspond to the material of interest; separating touching or impinging objects; and extracting measurements from the interpreted volume. Program functions are demonstrated and verified with a set of phantoms used to test determinations of particle size distributions and particle-particle contact orientations.
Petrographic inspection of rocks has largely been restricted to two-dimensional observation, whether on the outer surface of an outcrop or hand sample or from analysis of thin sections. While two-dimensional measurements can sometimes be used to obtain three-dimensional information using stereological methods, model-based assumptions are often required, and the data are primarily generalizations, describing rock-wide properties rather than particular features. Acquisition of direct three-dimensional information has typically been labor intensive and somewhat expensive, whether by serial sectioning (Daniel and Spear, 1999; Johnson, 1993) or imaging technologies such as high-resolution X-ray computed tomography (HRXCT) (Carlson and Denison, 1992; Denison et al., 1997; Ketcham and Carlson, 2001), synchrotron-based computed microtomography (Ikeda et al., 2004), magnetic-resonance imaging (Gingras et al., 2002; Herrmann et al., 2002), and confocal microscopy (Bozhilov et al., 2003). Furthermore, once such data have been acquired, analyzing them to obtain desired measurements has often required surmounting an entirely new set of challenges and obstacles: in many cases investigators need to exert significant effort to invent dedicated approaches and tools that are often applicable only within narrow circumstances of data type, computer configuration, and geological problem.
The barriers to routine acquisition and utilization of three-dimensional data describing geological specimens are gradually being overcome as three-dimensional imaging technologies increase in capability and diminish in cost. Many of these methods are nondestructive, leaving samples intact for posterity or for subsequent analysis. Advances in computer processing power and memory allow even desktop machines to handle substantial data volumes, and there are many sources of software dedicated to three-dimensional visualization. However, software tools that allow extraction of quantitative information of geological interest are still required.
This paper and its companion1 respectively describe and utilize new software intended to fill some of this gap, permitting efficient and accurate identification and measurement of up to thousands of discrete objects in a volumetric digital data set. The software, BLOB3D, implements a three-dimensional version of the image-processing tool of blob analysis, in which contiguous sets of pixels meeting certain conditions are traced and described. As with standard blob analysis, the primary difficulty comes when a single region contains multiple contacting or impinging objects that must be separated to allow each to be considered and measured individually. Although some automated procedures have been developed for relatively straightforward cases such as vesicles in basalt (Proussevitch and Sahagian, 2001), in general the variety of materials, textures, and forms in geological samples complicates development of a strictly algorithmic approach. Thus, BLOB3D is designed to give the program operator primary control of how the data are interpreted and measurements made, so that geological experience and insight can guide the analytical process.
The BLOB3D software was developed with the primary motivation of analyzing HRXCT imagery of porphyroblastic metamorphic rocks, as described in the companion to this paper. However, it was also written to be applicable generally to any three-dimensional data set containing identifiable discrete features; other applications to date have included HRXCT data of vesicles in meteoritic basalts (Benedix et al., 2003; McCoy et al., 2002), gold grains in ores (Kyle and Ketcham, 2003), and aggregate clasts in asphalt concretes (Ketcham and Shashidhar, 2001).
HRXCT is the industrial adaptation of medical CAT scanning. By using X-ray sources with higher energies and smaller focal spots, smaller or finer-resolution detectors, and longer acquisition times, industrial scanners achieve one to three orders of magnitude better resolution than medical devices while imaging denser objects (Ketcham and Carlson, 2001). The gray levels in HRXCT images correspond to linear attenuation coefficient, which is strongly correlated with density and is also a function of atomic number and X-ray energy. Individual computed tomography (CT) images are termed slices, as they correspond to what would be observed if a slice were extracted from the object being scanned, analogous to a slice from a loaf of bread or a petrographic thin section. Because each slice image represents a finite thickness of material, the pixels in HRXCT images are referred to as voxels, or volume elements. By acquiring a contiguous series of slices, data describing an entire volume can be obtained.
BLOB3D DATA PROCESSING
Previous textural studies using HRXCT (Carlson and Denison, 1992; Denison and Carlson, 1997; Hirsch, 2000) included the creation of software tools to process scan imagery using a two-stage procedure to measure the size and location of garnet porphyroblasts. Each slice image was first interpreted individually by superposing a circle on each porphyroblast to represent its location and cross- sectional area. After all of the slices were processed in two dimensions, the three- dimensional porphyroblast volumes were reconstructed by stacking circles from successive slices and approximating them as spheres (Denison et al., 1997; Hirsch, 2000). Data processing with these tools was very labor intensive, typically requiring weeks per sample. In addition, data quality and flexibility were limited by assuming an exclusively spherical crystal geometry and only offering a single planar view of the data to inform interpretation. Publicly available freeware image processing programs, such as NIH Image and ImageJ, have also been effectively used to interpret two-dimensional data and provide limited three-dimensional information (e.g., Higgins, 2000; Koeberl et al., 2002).
BLOB3D allows this processing to be done in three dimensions, increasing both efficiency and accuracy. The program begins by assembling a contiguous series of slice images, typically a set of TIFF files, into a three-dimensional matrix representing the entire data volume (this is also known as a data brick or voxel brick). Data dimensions are defined in terms of the pixel and slice spacing. These need not be equal, as the program will appropriately account for noncubic data elements.
The data-processing task is divided into three steps, each of which is implemented in its own program module. The first, Segment, is used to define which voxels in the data volume correspond to each material of interest. The second, Separate, takes all three-dimensionally connected sets of segmented voxels of a particular material and allows them to be cut apart into individual objects. The third, Extract, performs a variety of measurements on the individual objects that have been separated, including size, shape, orientation, and contact relationships. Each module is described in detail below, and an example sequence of operations for a set of porphyro blasts is shown in Figure 1 .
The Segment module performs the image-processing task of segmentation, determining which voxels in the data set correspond to a particular material of interest (Figs. 1A and 1B). Three classes of algorithms are available, grouped by the form of their input and output data. The first class is grayscale-to-grayscale algorithms that may be used to clarify the image data through noise reduction or edge enhancement. The second class, grayscale-to- binary algorithms, transform the processed data into binary (two-component) data, defining the material of interest versus everything else. A third set of binary-to-binary algorithms can be used for clean-up tasks such as removing small isolated groups of voxels. All Segment filters are implemented as three- dimensional operators. For example, a Gaussian3D filter (a grayscale-to-grayscale algorithm) uses a Gaussian n × n × m kernel to smooth the volume data, where n is the kernel dimension in the slice plane and m is the between-slice dimension, which may be the same as n or different if the within-slice pixel spacing is not equal to the spacing between slices. The filters are implemented using object- oriented software design, allowing new modules to be written based on a template and added to the program as needs arise.
The grayscale-to-binary step in Segment exerts the greatest influence on the overall results. For polymineralic rocks, specialized algorithms were created to account for the finite resolution of HRXCT data, which has three consequences. First, the partial-volume effect causes blurring of phase boundaries: because voxels are discrete, they may contain more than one phase, resulting in a grayscale value that is a weighted average between the ideal end-member values for each phase. Second, the averaging brought about by the nonzero size of the X-ray focal spot and/or detectors causes material in one voxel to affect surrounding ones. As a result, the grayscale transition between two adjacent objects of differing X-ray attenuation, rather than being sharp, characteristically spans several voxel widths (typically 2–4 in this study). Third, because of the limited signal-to-noise ratio of HRXCT data and the frequent variability of geological materials due to impurities, zoning, inclusions, and microporosity, the grayscale for a given mineral will tend to have a range of values which may overlap with the ranges for other phases with similar attenuation.
In standard industrial CT practice, the proper grayscale value for defining a boundary between two phases is the average of their two mean end-member grayscales (ASTM [American Society for Testing and Materials], 1992). In a polymineralic rock, simple thresholding based on this value is not practical when other phases have overlapping ranges of grayscales (Fig. 2 ). Three grayscale-to-binary filters were created to achieve accurate segmentation by utilizing connectivity information. The Seeded Threshold filter first selects all voxels that lie within a specified “seed” grayscale range, followed by all voxels that lie in a second, expanded grayscale range and are connected to seed voxels via a path of any length. The Expanding Threshold filter is similar, except that it limits the voxels selected in the second stage to those within a finite distance of first-stage voxels. The Expanding Seeded Threshold combines these by first executing a seeded threshold and then allowing a final expansion step. These algorithms allow selection of phases based on their end-member grayscales, and then allow limited expansion into their blurred boundaries.
The Separate module proceeds by tra versing the data volume and presenting the program operator with each set of three- dimensionally connected voxels (blob) encountered (Figs. 1C and 1D; Fig. 3 ). The operator then decides whether one or more objects are represented, based on descriptive statistics, the shape of the blob, and reference to the original image data.
Several tools are available for subdividing a voxel set into its component objects. The simplest, but most labor intensive, is manual definition of one or more cutting planes using mouse clicks on the blob surface. Three other tools work by removing certain voxels to achieve separate objects and then replacing the removed voxels while maintaining the separation. The Erode/Dilate function is suited to cases where the cross-sectional area of a contact is less than the maximum cross-sectional area of the objects (for example, two slightly impinging spheres). Successive erosion operations, removing shells of voxels from the outer surface, can cause the contact to neck and eventually break, separating the objects (Proussevitch and Sahagian, 2001). The eroded material is then added back to each object using dilation operations. The Grayscale Seed + Dilate function is similar, but replaces the erosion step with a search for local grayscale maxima (or minima) and uses each isolated grouping as a seed from which to commence dilation. This method is intended for situations where objects are touching or in close proximity but not impinging, and thus tend to have a slight grayscale barrier between them. Finally, the Grayscale Sea Level Fall function implements a version of the watershed algorithm. A threshold value (“sea level”) is initially set to encompass all of the data, and is slowly lowered to first uncover clusters of similar voxels (“islands”) that grow as the threshold continues to fall. It is intended for clusters of small grains whose peak gray levels are variably affected by blurring.
Manual versus Automated Separation
Whereas the three-dimensional processing described by Proussevitch and Sahagian (2001) for interpreting HRXCT data of vesicles in basalt is largely automated, every decision in the Separate module is made by the program operator. This design choice was made because of the difficulty of defining a set of prearranged operations and criteria that fully account for the range of textures encountered in natural geological samples with the reliability of a human petrologist. Features such as crystal habit, alteration and resorption, fractures, inclusions, and zoning affect the appearance of porphyroblasts in HRXCT data and their three-dimensional shapes, detracting from the ideal forms that purely algorithmic procedures may be based upon. Because of its reliance on a human operator, Separate is the most time-consuming BLOB3D step, typically requiring several seconds per individual object separated, and thus several hours when thousands of objects are involved. However, the speed of data processing is an order-of- magnitude improvement over the methods described by Denison et al. (1997).
Further time savings are anticipated as improvements in automation are made. As currently envisioned, we anticipate adding the capability to make certain easy decisions by default, based on passing a customizable (and perhaps learnable) set of criteria: for example, always accepting nearly perfect spheres, while leaving harder decisions to the user. Some automated separation processing may also be attempted. Ultimately, the danger to be avoided is having the software make poor decisions because the imagery includes features not well characterized or well tested by the measurements that choices are based on, and for the resulting poor data to be undetected and interpreted.
Example of Separate Operation
An example of the Separate module applied to a phantom created for verification of particle size distribution (grading) is shown in Figure 3. The phantom consists of a set of glass spheres of various sizes, bound with asphalt. Figure 3A shows a typical slice of the scan data. Because the network of contacting spheres spans the entire sample, the set of three-dimensionally connected voxels that comprises them likewise spans the entire data set, which would overtax computer resources. To solve this problem, the volume-searching algorithm automatically limits the size of the growing blob region; those faces that are cut off are denoted with red lines (Fig. 3B). A good separation into individual objects is obtained from two erosion steps followed by dilation (Fig. 3C). In Figure 3D, the operator has postponed the processing of all voxels in subblobs intersected by the cutoff planes, leaving only spheres that are complete or truncated by the edge of the data set. The operator would then accept, reject, or further process these blobs, after which the traversal algorithm would find the next set of unprocessed three-dimensionally connected voxels. Voxels that are postponed are returned to the pool available for searching, so objects that were truncated will eventually appear complete and suitable for processing.
Despite efforts to create an accurate segmentation during the first stage of processing, sometimes unwanted voxels may be included. When such voxels are recognized during processing with Separate they can be rejected by the operator, returning them to the pool of undefined voxels that may be classified as a different phase during subsequent processing. Thus, the Separate step also constitutes a final segmentation filter.
A volume may be processed for up to seven components by successive iterations of running the Segment and Separate modules. Because the edges of high-attenuation phases will have grayscale values that overlap with less attenuating materials, processing optimally proceeds from greatest to least departure from mean matrix grayscale (for example, from brightest to dimmest phase).
Once one or more phases are segmented and separated, the Extract module is used to extract data of interest for each object. A variety of measurements is available, including center of mass, volume, surface area, aspect ratio, and orientation. The first three are calculated simply from the set of voxels comprising an object, while shape and orientation are estimated by least-squares fitting of an ellipsoid to the object outer surface. The Extract module also measures contacts between particles, including contact normal orientations and surface areas.
Two more specialized quantities necessary for textural analysis applied to study of crystal nucleation and growth mechanisms are the center of nucleation and the extended volume (i.e., the volume in absence of impingement by neighboring porphyroblasts). Image processing described by Denison et al. (1997) and Hirsch (2000) accounted for impingement by allowing the operator to adjust the size and location of the circle superposed over each porphyroblast in each slice to compensate for missing material. A weakness of this method is that detection and delineation of impingement is not always obvious on a two-dimensional section. To account for impingement in BLOB3D, the Extract module includes two three-dimensional algorithms that are based on primitive fitting: finding idealized shapes (spheres or ellipsoids) that best match the porphyroblast surface based on modified least-squares criteria. The Impinged Surface algorithm excludes all surface points that are in contact with another porphyroblast of the same type, or are at the edge of the data volume. The Incomplete Surface algorithm is intended for cases where other factors impact a porphyroblast surface, such as uneven resorption and alteration, and it works by restricting the deleterious effects of a percentage of the worst-fitting points. The maximum percentage of restricted points inside and outside the primitive boundary, and the maximum sum-of-squares values that these points can achieve, are parameters selected by the operator. An example of a garnet described using a sphere with the Impinged Surface algorithm is shown in Figures 1E and 1F.
Of the primitive-fitting methods, the Impinged Surface algorithm applied to spheres tends to give the best-constrained results. The transition from sphere to ellipsoid primitive corresponds to a jump from four to nine degrees of freedom (from center position and radius to center position, three radii, and three angular displacements to align the major axes), greatly increasing the range of possible solutions. The Incomplete Surface algorithm uses no a priori information to select the parts of a surface to exclude. These factors, individually and particularly in concert, can occasionally result in nonintuitive primitive fits. In practice, an operator inspects the results of several primitive fits, adjusting settable parameters as appropriate, and then allows the entire data volume to be processed with a single set of parameters.
Contacts can be measured for any touching pair of objects. The areas of surfaces are measured by summing the areas of the voxel faces that make up the boundary between them. Contact orientation is calculated in two ways: by taking the vector sum of the normal to each contacting voxel face, weighted by face surface area; and by fitting a plane to the set of points on the midpoints of each contacting voxel face. These two approaches are compared below using the orientation of the plane normal to the line connecting the center of mass of each object, which is also calculated.
The BLOB3D software is written in the IDL (Interactive Data Language) programming language (Research Systems, Inc., Boulder, Colorado). Programs developed in IDL have the potential to run on any platform (Windows, Macintosh, Linux, Unix), although minor incompatibilities may need to be resolved; the current version is maintained on Windows and Linux. Most current desktop computers are capable of running it, with the principal requirement being sufficient computer memory to store multiple copies of the volumetric data set; typically 1–2 Gb are required. Persons interested in obtaining or using the software should contact the author.
A series of tests were done to verify that the program produces reliable results and to provide examples of various practical matters of program operation. The first test verified that the program could reproduce the particle-size characteristics of a sample, including accommodating large-aspect-ratio data voxels and utilizing primitive fitting to correct for particle cut-off. Another pair of tests verified the calculations of contact orientation.
Particle Size and Gradation
The phantom for the first test consists of a mixture of glass spheres of various sizes and known grading bound with asphalt (Ketcham and Shashidhar, 2001). The manufacturer-specified sphere sizes and the proportions used to create the phantom are given in 01 . A section of the phantom was scanned at the HRXCT facility at the University of Texas at Austin (UTCT), using the high-resolution subsystem described by Ketcham and Carlson (2001). Slices were acquired at 0.4 mm intervals with a 1024 × 1024 reconstruction of a 140 mm field of view, resulting in an inter-pixel spacing of 0.137 mm. The data were thus much lower resolution in the z dimension than in x-y (e.g., Figures 3B–3D), in part to test the program's ability to deal with inequant voxels. An additional 1.18-mm-size fraction was also included in the phantom, but the slice spacing was insufficient to allow reliable detection and measurement of it, so it was excluded from the analysis.
A 512 × 512 subvolume of the complete CT data set was processed by first segmenting and separating the 5-mm-size class, which had a slightly higher attenuation coefficient than the other classes. The two larger size classes were processed as a single set and later differentiated by size; the final component to be processed was the ∼2.5-mm-size class. In all, 8356 spheres were measured. Of these, 6673 were entirely enclosed in the subvolume, resulting in the size and grading listed in columns 3 through 5 of 01. Measured sizes are slightly smaller than the manufacturer-specified sizes. Caliper measurements made afterwards indicated that the spheres are often slightly irregular and slightly smaller than the specifications. Part of the discrepancy may also be due to conservative segmentation, as parameters were chosen to minimize the number of unresolvable 1.18 mm spheres that were included. Although a large statistical sample was taken, the confined volume leads to a bias in the estimate of grading, as larger spheres are more likely to be truncated by the edges of the subvolume, and are thus underrepresented in the grading based on complete spheres (01, Column 5).
Spheres that were truncated by one or more faces of the subvolume were processed using the impinged-sphere primitive fitting method described previously. A total of 1321 additional spheres was successfully fitted; the remaining 362 were omitted from further consideration, as they comprised a volumetrically insignificant fraction (<0.4% of the subvolume). Results for each population are shown in columns 6 and 7 of 01, and the effectiveness of the primitive fitting procedure can be judged using Figure 4 , which shows the relationship between measured volume (i.e., number of voxels subtended) and primitive volume. Enclosed spheres fall on the 1:1 line, while truncated spheres fall to the left. Among the large sphere classes, the size distribution remains unchanged until the truncated volume falls below ∼100 mm3, or 10%–20% of the full volume. Below this point the spread in values increases, with a slight bias toward higher values, although in some cases apparently reasonable fits are obtained for spheres as much as 99% truncated. These trends are also observed among the smaller size classes, with the onset of increased variation occurring once a sphere is more than 50% truncated.
The classification of truncated spheres by primitive fitting allows their volumes to be included in the grading analysis (01, column 8). The two small size classes match the original grading to within 1.3%. The larger two size classes do not match the grading, probably because of the limited sample size and consequent small number of such spheres analyzed. However, their combined grading matches the original grading of these two phases to within 0.9%.
Particle-Particle Contact Orientations
Particle-particle contact orientations are not a widely used type of information for geological investigations, perhaps in part because they have not been easy to obtain in three dimensions. However, such data have a wide range of possible uses, including: evaluating the mechanical load-bearing properties of matrix-supported conglomerates (e.g., Ketcham and Shashidhar, 2001); estimating the gravitational vector direction in differentiating metal-sulfide assemblages in extraterrestrial impact melts (Benedix et al., 2003); and testing whether porphyroblasts may behave as rigid particles in a deforming schistose matrix undergoing asymmetric shear (Ketcham, 2005).
The phantoms used for these tests were made by gluing relatively uniformly sized (∼9.25 mm radius) glass spheres into simple cubic and hexagonal close packing arrangements (Fig. 5 ). In cubic packing, each sphere is in contact with up to six others in the orthogonal directions. In hexagonal packing each sphere may be in contact with up to twelve others, six along the horizontal plane 60° apart, and three each 54.7° above and below the plane, each 120° apart. Because the spheres were not perfectly uniform in their size and shape, geometric configurations were not perfect, as shown in the scan images, and contacts range from genuinely touching to being separated by some thickness of glue. However, the spheres are close enough to uniform that the center-to-center vector of a sphere to its neighbor provides a reasonable test of the results provided by contact measurement methods. The phantoms were scanned on the UTCT high-resolution subsystem, with 1.0-mm-thick slices spaced at 0.8 mm, with each 512 × 512 image encompassing a field of view of 134.5 mm for hexagonal packing and 154 mm for cubic packing. Once again, the thick slices lead to a voxel aspect ratio of ∼3:1.
Processing results for the cubic packing phantom (Figs. 6A–6C ) show the expected configuration of contact orientations, with all populations 90° apart. The spread evident in center-to-center vectors (Fig. 6A) is a consequence of imperfections in the construction of the phantom. The two contact measurement algorithms (the vector average of normals to contacting voxel faces and the best-fit plane to the center points of these faces) give nearly indistinguishable results (Figs. 6B and 6C). In comparison with the center-to-center data, the contact-normal measurements have a relative lack of dispersion off of the horizontal plane, and away from the vertical direction. Also, the clusters on the horizontal planes are shifted 2– 3 degrees toward the orthogonal axes. These reflect the biasing effects of the orthogonal faces of the data voxels, which can affect separations during BLOB3D processing in much the same way as a cleavage plane. This effect is exacerbated by the voxel anisotropy, which makes stepping from one horizontal plane to another less likely.
The hexagonal packing phantom (Figs. 6D– 6F) shows similar effects. Again, the results are in general accordance with expectation, with the spread in center-to-center vectors attributable to phantom imperfections. Among the contact normal measurements, the populations on the horizontal plane once again show less dispersion off of the plane than within it, and populations near orthogonal planes have slight biases toward those planes. The three horizontal clusters should be spaced at 60°, whereas the measured cluster spacings are at 55°, 60°, and 65°. The two algorithms are similarly affected; the median error (inferred as net angular divergence from the center-to-center vector) among nonhorizontal orientations is 3.5° for the best-fit plane algorithm and 3.2° for the mean face normal determination. The nonhorizontal clusters are also biased somewhat toward the vertical direction, although the best-fit plane algorithm has less divergence (4.2° versus 7.1° median error). The small-surface-area outliers in Figures 6E–6F reflect cases where spheres are connected only by a small and irregular interface of glue.
The new tools introduced here greatly improve the quality and ease of acquisition of three-dimensional measurements from volumetric data sets. While making heavy use of operator decision-making, BLOB3D's processing is fast enough (typically <1–10 seconds per object) that it can measure thousands of objects in a matter of hours, providing excellent statistical sampling. Furthermore, it makes many three-dimensional measurements possible that heretofore may have been considered impossible or impractical. Although optimized specifically for HRXCT imagery, BLOB3D can potentially be used with any type of volumetric data.
See “Improved methods for quantitative analysis of three-dimensional porphyroblastic textures,” by R.A. Ketcham et al., Geosphere, v. 1, p. 42–59, doi: 10.1130/GES00002.1.
This research was supported by National Science Foundation grants EAR-9902682 and EAR-0113480, and by National Research Council Transportation Research Board IDEA Program contract number NCHRP-64. Programming contributions at early stages were made by D. Hirsch, and recent enhancements were made by W.P. Watson. Program testing and utilization for data acquisition were performed by S. Anderson, T. Eiting, and C. Meth. Illustrations were made with help from C. Meth. The manuscript was improved thanks to helpful reviews by M. Rivers and M. Higgins.