Sequential data can be segmented by four techniques: (1) numeric differentiation, (2) split moving window, (3) maximum level variance, and (4) piecewise regression. The methods are compared with one another for performance on different types of data strings. In geology, data strings may be electric logs (or other borehole logs), seismic traces, x-ray curves, or traverses that measure some property against distance. Segmenting traces into like parts of uniform character facilitates comparison and correlation.
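As an illustration of the second technique, the following is a minimal sketch of one simple reading of a split moving window, not the procedure evaluated in the paper. It assumes that the window is split into two equal halves, that the halves are compared by the absolute difference of their means, and that a boundary is placed wherever that difference reaches a local maximum above a chosen threshold. The names `split_moving_window`, `pick_boundaries`, `half_width`, and `threshold`, and the sample trace, are hypothetical.

```python
from statistics import mean


def split_moving_window(values, half_width):
    """Score every admissible split position in the sequence.

    For a split placed just before index i, the score is the absolute
    difference between the mean of the half_width values after the split
    and the mean of the half_width values before it (an assumed statistic).
    """
    scores = []
    for i in range(half_width, len(values) - half_width + 1):
        left = values[i - half_width:i]
        right = values[i:i + half_width]
        scores.append((i, abs(mean(right) - mean(left))))
    return scores


def pick_boundaries(scores, threshold):
    """Keep split positions whose score is a local maximum above threshold."""
    boundaries = []
    for k, (i, s) in enumerate(scores):
        prev_s = scores[k - 1][1] if k > 0 else float("-inf")
        next_s = scores[k + 1][1] if k + 1 < len(scores) else float("-inf")
        if s > threshold and s >= prev_s and s > next_s:
            boundaries.append(i)
    return boundaries


# Hypothetical trace with three level segments (illustrative values only).
trace = [1.0, 1.1, 0.9, 1.0, 5.0, 5.1, 4.9, 5.0, 2.0, 2.1, 1.9, 2.0]
scores = split_moving_window(trace, half_width=2)
boundaries = pick_boundaries(scores, threshold=1.0)
print(boundaries)  # segment starts at these indices: [4, 8]
```

A wider window smooths noise but blurs thin units, so `half_width` would in practice be chosen with the thinnest segment of interest in mind.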
Classification and categorization, particularly the numerical classification of samples into categories, are common aspects of geological investigations. However, variables that are irrelevant to the classification or investigation are commonly included. Relevant variables are often difficult to isolate, and objective methods for identifying them are lacking.
This paper describes a semiobjective scheme for identifying important variables; the scheme can be reduced to basic statistical computations, with some repetition of the numerical classification method. The steps are as follows: (1) classify or cluster the data and determine the best classification; (2) select a set of potentially important parameters by using the methods described in the paper; (3) reclassify the samples, using only the parameters selected in step 2; and (4) if the results of step 3 agree with those of step 1, stop; otherwise, return to step 2 and modify the list of selected parameters.
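The four steps can be sketched as a loop. This is a minimal illustration under stated assumptions, not the paper's method: the ranking statistic (between-group variance of the group means divided by mean within-group variance), the rule for modifying the list (add the next-ranked variable), the stand-in classifier `toy_classify`, and the sample values are all hypothetical.

```python
from statistics import mean, pvariance


def partition(labels):
    """Express a labelling as a set of groups, so label names do not matter."""
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, set()).add(idx)
    return {frozenset(g) for g in groups.values()}


def rank_variables(samples, labels):
    """Assumed ranking: between-group variance over mean within-group variance."""
    scores = []
    for j in range(len(samples[0])):
        by_group = {}
        for row, lab in zip(samples, labels):
            by_group.setdefault(lab, []).append(row[j])
        between = pvariance([mean(v) for v in by_group.values()])
        within = mean([pvariance(v) for v in by_group.values()])
        scores.append((between / (within + 1e-9), j))
    return [j for _, j in sorted(scores, reverse=True)]


def select_variables(samples, classify):
    all_cols = list(range(len(samples[0])))
    reference = classify(samples, all_cols)              # step 1
    ranking = rank_variables(samples, reference)
    for k in range(1, len(ranking) + 1):
        chosen = sorted(ranking[:k])                     # step 2 (assumed rule)
        trial = classify(samples, chosen)                # step 3
        if partition(trial) == partition(reference):     # step 4
            return chosen, reference
    return all_cols, reference


def toy_classify(samples, cols):
    """Stand-in classifier: two groups seeded by sample 0 and its farthest sample."""
    def dist(a, b):
        return sum((a[c] - b[c]) ** 2 for c in cols)
    seed_a = samples[0]
    seed_b = max(samples, key=lambda s: dist(s, seed_a))
    return [0 if dist(s, seed_a) <= dist(s, seed_b) else 1 for s in samples]


# Hypothetical data: only the first variable separates the two groups.
samples = [
    [0.0, 5.0, 1.0], [0.2, 5.1, 1.2], [0.1, 4.9, 0.9],
    [9.0, 5.0, 1.1], [9.2, 5.2, 1.0], [9.1, 4.8, 1.1],
]
chosen, reference = select_variables(samples, toy_classify)
print(chosen)  # indices of the retained variables: [0]
```

Comparing partitions rather than raw labels means that agreement in step 4 does not depend on how the classifier happens to number its groups.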
The procedures are illustrated in detail with an analysis of foraminiferal data collected from the Gulf of Mexico. A classification of 38 samples into five categories, based on 252 species, is shown to be obtainable with only 14 species. Reducing the number of species to be considered may save time in both data gathering and interpretation.