FAQ: Non-parametric


What's this all about - I thought SADIE was by definition a non-parametric method?

Yes, it is. There is no statistical model underlying SADIE, so the method is always non-parametric. But the term as used here applies to the set of data, not the method.



Why do we need a non-parametric version of SADIE?

When a set of counts is very skew, with a variance much greater than its mean, relatively few of the counts may exceed the mean. For example, in the set of counts {0,0,1,2,4,9,16,63,904}, the mean is 111, so eight of the nine counts are less than the mean and only one, 904, is greater than it. The sample variance, 88832, greatly exceeds the mean. In the SADIE system, for this set of data, only one sample unit, the one with count 904, is a 'donor' unit that could possibly be assessed as being within a (red) patch. The eight other units are all 'receiver' units and are therefore potential (blue) gap units. This clearly limits the ability of the method to discriminate spatial pattern, if it exists.
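The figures quoted for this example can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and the donor/receiver split simply follows the description above (counts above the mean versus counts below it).

```python
from statistics import mean, variance

counts = [0, 0, 1, 2, 4, 9, 16, 63, 904]

m = mean(counts)         # arithmetic mean of the counts
s2 = variance(counts)    # sample variance (n - 1 denominator)

# Following the text: units above the mean are 'donors', units below are 'receivers'
donors = [c for c in counts if c > m]
receivers = [c for c in counts if c < m]

print("mean:", m)
print("variance (rounded):", round(s2))
print("donor units:", len(donors), "receiver units:", len(receivers))
```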




What is the idea behind the non-parametric approach?

To yield greater discrimination, consider replacing the actual counts with their non-parametric ranks and then, for technical reasons, multiplying each rank by 2. The above set would then be transformed to {3,3,6,8,10,12,14,16,18}. Here the two zeros tie for ranks 1 and 2, so each receives the average rank 1.5, which doubles to 3. This set is the non-parametric equivalent of the original. The mean is now 10, and there are four units with a count smaller than this and four units with a count larger. The new set of counts may be used as input to SADIE, exactly as the original counts would have been. With the new set there are thus four potential patch units and four potential gap units, and the ability of the new data to discriminate spatial pattern is therefore enhanced. Note that the arrangement of the original set, defined by the coordinates of its sample units, is retained; only the counts are different.
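The transformation described above can be sketched in Python as follows. The function name `doubled_ranks` is hypothetical, and the use of average ranks for tied counts is an assumption inferred from the example (both zeros become 3); this is not the code used by the SADIE programs themselves.

```python
from bisect import bisect_left, bisect_right
from statistics import mean

def doubled_ranks(counts):
    """Return twice the average rank of each count, in the input order.

    For a given count, its tied block occupies 1-based ranks
    bisect_left + 1 .. bisect_right in the sorted list, so twice the
    average rank is bisect_left + bisect_right + 1.
    """
    ordered = sorted(counts)
    return [bisect_left(ordered, c) + bisect_right(ordered, c) + 1
            for c in counts]

counts = [0, 0, 1, 2, 4, 9, 16, 63, 904]
transformed = doubled_ranks(counts)

print(transformed)
print("new mean:", mean(transformed))
```

Doubling keeps every transformed value a whole number even when ties produce half-ranks such as 1.5.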

Hence, if the original data had coordinates:

x y count
1.0 1.0 0
2.0 2.0 0
1.0 2.0 1
2.0 1.0 2
2.0 2.0 4
. . .
. . .
5.0 6.0 904

the new, transformed, equivalent non-parametric data would have the same coordinates:

x y count
1.0 1.0 3
2.0 2.0 3
1.0 2.0 6
2.0 1.0 8
2.0 2.0 10
. . .
. . .
5.0 6.0 18

In summary, the non-parametric approach addresses the problem that, for very skew data, there may be relatively few counts greater than the mean, which makes it inherently difficult for the method to detect clustering in the form of patchiness. It does this by 'centering' the data about the median. The median of the (old) parametric data becomes the mean of the (new) non-parametric equivalent data, so, by definition, there are as many values greater than the new mean as there are less than it. However, crucially, in transforming the data it retains the concept that there is information in the arrangement of the counts relative to one another.
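The centring can be checked with a short calculation, sketched here on the assumption that tied counts share their average rank. The symbols n (the number of sample units) and r_i (the rank of unit i) are introduced only for this illustration.

```latex
\[
\frac{1}{n}\sum_{i=1}^{n} 2r_i
  \;=\; \frac{2}{n}\cdot\frac{n(n+1)}{2}
  \;=\; n+1,
\qquad
2\cdot\frac{n+1}{2} \;=\; n+1 .
\]
```

The first identity gives the mean of the doubled ranks. The second gives the doubled rank of the unit at the median position, (n+1)/2, when n is odd and that unit is not tied with others. For the nine-count example, n = 9 and both equal 10.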




When should I use the non-parametric version of SADIE?

Use it if you believe that the order or rank of the counts relative to one another is as important as, or more important than, their actual magnitude. It is especially useful with data that are highly skew and for which the variance far exceeds the mean.




How can I run a non-parametric analysis?

If you are using SADIEShell (the preferred option), then look at the second button from the left on the toolbar, which shows a red frequency distribution on a black and white graph with two axes.
Depressing this button will give a non-parametric analysis.
Alternatively, look under the drop-down 'Tools' menu at the first option.
You can toggle between the usual parametric and the non-parametric versions.
A cream coloured pane immediately underneath the toolbar displays the chosen option.
Note that if you choose the non-parametric option there is no need for you to do anything to your data prior to input; the program generates the transformation to the non-parametric version of your data automatically.

Alternatively, if you are not running under SADIEShell, there is a non-parametric equivalent, rbrelv13np.exe, of the standard program rbrelv13.exe, that can be downloaded from the downloads page.

Note that for the non-parametric version, the output in rbno6.DAT is slightly different and explains how the data have been transformed.




How do the results differ when I run a non-parametric analysis?

This is best explained by an example. Look at the frequency distribution representing data for the aphid Sitobion avenae, from Winder et al. (2001) Ecology Letters. The distribution shows that there were 256 sample units, of which 12 had a count of 6, 12 a count of 7 and 12 a count of 8. These 36 are shown in white. They were less than the mean, 8.33, in the original data, but they are greater than or equal to the mean of the new set of data (indicated by the median, 6, of the original distribution).
[Figure: frequency distribution for non-parametric analysis]
In other words, although the majority of original 'donor' units (shown in red) remain 'donor' units that are potential patches, and although most of the original 'receiver' units (shown in blue) remain 'receiver' units that are potential gaps, a good many units (those shown in white that were originally '7' or '8') have changed from being 'receiver' units (potential gap units) to 'donor' units (potential patch units). These are qualitative changes. They occur in addition to the purely quantitative numerical changes in the data brought about by the transformation.
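The same kind of qualitative change can be seen in miniature with the nine-count set used earlier in this FAQ (not the aphid data, which are not reproduced here). The sketch below is illustrative only: the helper names are hypothetical, average ranks for ties are an assumption, and the label 'neutral' for a unit exactly equal to the mean is used here purely for convenience.

```python
from bisect import bisect_left, bisect_right
from statistics import mean

def doubled_ranks(counts):
    # Twice the average rank of each count (ties share their average rank)
    ordered = sorted(counts)
    return [bisect_left(ordered, c) + bisect_right(ordered, c) + 1
            for c in counts]

def role(value, centre):
    # Classify a unit relative to the mean of its own data set
    if value > centre:
        return "donor"
    if value < centre:
        return "receiver"
    return "neutral"  # illustrative label for a unit equal to the mean

counts = [0, 0, 1, 2, 4, 9, 16, 63, 904]
ranks2 = doubled_ranks(counts)

before = [role(c, mean(counts)) for c in counts]    # parametric roles
after = [role(r, mean(ranks2)) for r in ranks2]     # non-parametric roles

# Original counts of the units that move from 'receiver' to 'donor'
switched = [c for c, b, a in zip(counts, before, after)
            if b == "receiver" and a == "donor"]
print(switched)
```

In this small set the units with counts 9, 16 and 63 are 'receiver' units in the original data but 'donor' units after the transformation.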

It is important to realise that, for a non-parametric analysis, the data change, and therefore the results must also change. This is shown by the comparison between the red-blue plot for the parametric analysis (moderate gaps, few patches):
[Figure: avenae parametric analysis]

and that for the non-parametric analysis (moderate gaps, moderate patches):
[Figure: avenae non-parametric analysis]





The figure below attempts to show how these two differ on one graph. Here, the open circles indicate those sample units that have changed from 'receiver' units in the parametric analysis, mostly to 'donor' units in the non-parametric analysis. The bolder, hatched clusters with the solid boundary lines are from the usual parametric analysis; the stippled clusters with the dashed boundary lines are from the non-parametric analysis. It is largely because of the transfer of units from 'blue' to 'red' that there are fewer stippled gaps and more stippled patches in the non-parametric analysis.
[Figure: avenae para and non-para comparison]



Who produced the graphics for the examples?

My post-doc Colin Alexander, the inventor of the green-plum Alexander plot of the surface of lagged spatial associations, published in the Winder et al. (2001) paper in Ecology Letters.


