Examples for picking model areas
In general, the usefulness of modelling any area depends on landscape variation: with too little variation, modelling is neither needed nor appropriate; with too much, 16 landscape types may not be enough. The usefulness of a model therefore does not necessarily depend on area size, although in many settings larger areas can reasonably be expected to contain more landscape variation. The scalability of model areas also depends on other factors, namely input data properties and computing time. The landscape model is intended to provide landscape domaining for general mapping and to support geochemical outlier definition by mapped landscape type, so the number of samples in an area should also be considered. For statistical outlier definition, we recommend a minimum of 50 samples in every landscape type.
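The 50-sample recommendation can be checked before outliers are interpreted. The following is a minimal Python sketch, assuming each sample has already been assigned a landscape type label. The function name `undersampled_types` and the input layout are hypothetical and not part of the modelling workflow.

```python
from collections import Counter

MIN_SAMPLES = 50  # recommended minimum per landscape type (from the text above)


def undersampled_types(sample_types, all_types=None, min_samples=MIN_SAMPLES):
    """Return {landscape_type: sample_count} for types below the minimum.

    sample_types: iterable with one landscape type label per sample
                  (hypothetical input layout).
    all_types:    optional iterable of every landscape type in the model, so
                  that types containing no samples at all are also reported.
    """
    counts = Counter(sample_types)
    types = set(counts) if all_types is None else set(all_types) | set(counts)
    return {t: counts.get(t, 0) for t in sorted(types) if counts.get(t, 0) < min_samples}


# Example: 120 samples in type 1, 45 in type 2, none in type 3
labels = [1] * 120 + [2] * 45
print(undersampled_types(labels, all_types=[1, 2, 3]))  # {2: 45, 3: 0}
```

Landscape types reported by such a check are candidates for review as part of a larger population rather than for outlier definition on their own.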
Example - Impact of landscape variation on model area
Homogeneous survey areas, e.g., areas of consistent sheetwash without variation, will not produce a useful model. In such instances, geochemistry should be reviewed as a single population.
BACKGROUND
The dimensional reduction built into the model workflow summarises similarities in the input data so that similar pixels are grouped together. These groups are then clustered based on their major differences from other groups. A prominent ridge in an otherwise flat landscape will likely be highlighted whatever number of landscape types is used. A minor difference in soil properties in a large, strongly varied landscape may not be highlighted.
Example 1 – Landscape variation
A useful model
A useful model will cover an area with a minimum of four different landscape types and contain sufficient samples (see Impact of sampling density in the next section). In the example below, multiple different features are part of the model area: active and ephemeral channels, sheetwash plains, elevation changes, strong radiometric features and even the surface expression of a fault line. Landscape variation is apparent in the 2D model, which shows congruent features and little pixelation. The clusterbox (the pixels of the map plotted in 3D) shows distinct clusters (clouds of points). This model usefully captures the regolith context of the area and can be used to divide geochemical datasets into sub-populations for outlier definition.
Example 2 – Landscape variation
A less useful model – little landscape variation in a small area
A small area with little landscape variation results in a pixelated (noisy), largely incongruent model in 2D (only a few features are defined, such as channels in the north and outcrops in dark purple), with a fuzzy cloud of pixels in 3D and few distinct point clusters.
Extending the model area will, in most cases, include further landscape features/variation and produce a more useful model. However, the location of samples should be considered (see Impact of sampling density in the next section), as there may be little value in interpreting outliers by sub-population if most samples were collected in the same landscape setting.
Examples - Impact of sampling density on model area extent
The purpose of the models is to put your soil surveys into context. In general, we recommend a 10 % buffer around a soil sample survey to ensure that landscape clusters are representative of the sample locations. For small areas, we recommend a larger buffer. The extent of your model area will affect statistical outlier definition due to the inclusion or exclusion of surrounding landscape features. For robust outlier definition, we recommend a minimum of 50 samples per landscape cluster.
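One way to derive a buffered model extent from sample coordinates is sketched below in Python. It assumes that the buffer is applied as a fraction of the bounding-box width and height on every side, which is only one possible reading of the recommendation. The function name `buffered_extent` and the coordinates are hypothetical.

```python
def buffered_extent(xs, ys, buffer_fraction=0.10):
    """Return (xmin, ymin, xmax, ymax) of the samples, expanded by a buffer.

    Assumption: the buffer is a fraction of the bounding-box width and
    height, added on every side of the sample extent.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    dx = (xmax - xmin) * buffer_fraction
    dy = (ymax - ymin) * buffer_fraction
    return (xmin - dx, ymin - dy, xmax + dx, ymax + dy)


# Hypothetical projected coordinates (metres) for three samples
eastings = [500000.0, 502000.0, 504000.0]
northings = [6500000.0, 6501000.0, 6503000.0]
print(buffered_extent(eastings, northings))        # default 10 % buffer
print(buffered_extent(eastings, northings, 0.25))  # larger buffer for a small area
```

Making the buffer fraction a parameter keeps the larger buffer recommended for small areas a one-argument change.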
Example 1 – Appropriate sampling density
The two examples below show an appropriate number and spread of samples over varied landscape types and similar area sizes. In both examples, a buffer of 20-25 % was chosen. The left-hand example contains ≈2500 samples, the right-hand example ≈800. Both areas will produce useful landscape models. Larger sample numbers will provide more robust identification of outliers.
Example 2 – Appropriate sample density to compare multiple survey results in a regional area
The next example also shows an appropriate number (≈2000) of samples for the size of the survey area. The landscape variation is also appropriate with multiple prominent features (various channels, outcrops, sheetwash and residual settings). It is likely that a large number of landscape types will be needed to effectively capture the major landscape features. This is an appropriate model extent if the goal is to treat (what looks like) different soil surveys collectively as a single site for comparison.
Example 3 – Inappropriate sample density for outlier definition
The last example is unlikely to produce robust or useful outliers for geochemical elements by landscape type. The area is large, and a landscape map produced from such a model will have value as a broad-scale landscape map. However, all ≈200 samples of the soil survey (in the SW corner of the model area) are located in what will likely be mapped as a single landscape type. As a result, the geochemical outliers by landscape type will have little value (they will be the same as outliers found within the whole data set, i.e., one population).
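The equivalence described above can be illustrated with a short Python sketch. Tukey fences are used here purely as a stand-in outlier definition, because the text does not prescribe a method. The function names and the concentration values are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Indices of values outside the Tukey fences (Q1 - k*IQR, Q3 + k*IQR).

    Assumption: Tukey fences stand in for whatever outlier definition is
    actually used; the source text does not prescribe a method.
    """
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return {i for i, v in enumerate(values) if v < low or v > high}


def outliers_by_type(values, types, k=1.5):
    """Indices of samples that are outliers within their own landscape type."""
    groups = defaultdict(list)
    for i, t in enumerate(types):
        groups[t].append(i)
    flagged = set()
    for idx in groups.values():
        if len(idx) < 2:  # quantiles need at least two values
            continue
        local = tukey_outliers([values[i] for i in idx], k)
        flagged |= {idx[j] for j in local}
    return flagged


# Hypothetical concentrations; every sample falls in landscape type 7
conc = [10, 12, 11, 13, 12, 11, 10, 95, 12, 11]
same_type = [7] * len(conc)
print(outliers_by_type(conc, same_type) == tukey_outliers(conc))  # True
```

When samples are spread over several landscape types with different background levels, the two functions can return different sets, which is where outlier definition by landscape type adds value.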
Example - Impact of anthropogenic disturbance on model areas
Anthropogenic activities may disturb the natural expression of regolith properties at the surface. For example, in an area with pine plantations that is otherwise undisturbed by human activity, the model may assign separate landscape types to these pine plantations. You may choose to mask such areas, or leave them unmasked and ignore the resulting landscape type(s)/feature(s). This can become challenging when a large survey area contains, for example, an airport, a tailings dam and a pine plantation and only 16 landscape types are available. Your model may assign three landscape types to the different anthropogenic features, reducing the number of landscape types available for your regolith materials. In addition, a disturbed area may have been altered in a way that mimics other landscape types in your survey area, in which case the model will combine it with those areas.
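To judge how many landscape types are taken up by disturbed ground, the share of each type that falls inside known disturbances can be tallied. The Python sketch below assumes the model output is available as a grid of landscape type labels and that disturbed areas are available as a boolean grid of the same shape. Both inputs, the function name `disturbed_share` and the 0.8 threshold are hypothetical.

```python
from collections import Counter


def disturbed_share(type_grid, disturbed_grid):
    """Fraction of each landscape type's pixels that lie in disturbed areas.

    type_grid:      2D list of landscape type labels, one per pixel.
    disturbed_grid: 2D list of booleans of the same shape, True where a pixel
                    lies in a known disturbance (hypothetical input).
    """
    total, disturbed = Counter(), Counter()
    for type_row, mask_row in zip(type_grid, disturbed_grid):
        for t, is_disturbed in zip(type_row, mask_row):
            total[t] += 1
            if is_disturbed:
                disturbed[t] += 1
    return {t: disturbed[t] / total[t] for t in sorted(total)}


types = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]
plantation = [
    [False, False, False, False],
    [False, True, True, False],
    [False, True, True, False],
]
shares = disturbed_share(types, plantation)
# Types made up almost entirely of disturbed pixels (threshold is arbitrary)
print([t for t, share in shares.items() if share > 0.8])  # [3]
```

A type with a share close to 1 is effectively consumed by the disturbance, whereas an intermediate share may indicate a disturbed area that mimics a natural landscape type.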
Example 1 – Isolated anthropogenic disturbances
In the example below, either model, with or without masks, could be used, since no samples were collected in the affected areas and 16 landscape types were sufficient to capture the variety of landscape types adequately.
In the maps below, whilst the colours of different landscape types have changed, the main landscape features within the model area have been mapped regardless of the presence or absence of masks.