Levich A.P., Teriokhin A.T.

Dept. of Biology, Subfaculty of Zoology of Vertebrates and General Ecology, Moscow State University, Moscow, 119899


Key words: Ecological data processing; Ecosystem state diagnostics; Ecological standardization.


A method is proposed for determining the region of normal functioning of an ecosystem in the space of its abiotic factors. The lower and upper boundaries of this region, called ecologically tolerable levels (ETL), are calculated for the different factors acting on the ecosystem. Crossing these levels corresponds to a transition of the ecosystem from a normal state to an abnormal one. The calculation of ecologically tolerable levels is accompanied by the calculation of criteria of their exactness, completeness, statistical reliability and essentiality. The factors are ranked according to their contribution to the degree of ecological degradation. If the method fails to explain the observed ecological disturbances, this suggests that the monitoring program used to obtain the data is incomplete. Significant factors related to ecological disturbance in the Lower Don river are identified. For these factors the ETL are calculated and the statistical reliability of the resulting norms is estimated.

1. Conception of the method

The proposed method for normalization of external actions upon natural ecosystems is based on the biotic conception of the environment (Karr, 1991; Cairns and Niederlehner, 1993; Levich, 1994). According to this conception, the estimation of the ecosystem state on the "norm-pathology" scale must be based on biotic rather than abiotic (for example, chemical) characteristics. In turn, the existence of a criterion of ecosystem state makes it possible to pose the question of the levels of factors that separate normal and abnormal states of the ecosystem.

All calculations are based on actually measured characteristics of natural objects at different moments in time. These characteristics can be divided into two types: abiotic factors and biotic estimates of ecological state. For example, for a water ecosystem we may consider as factors such quantitatively measured characteristics as concentrations of chemical compounds, characteristics of oxidation, transparency and color, as well as climatic and hydrological characteristics, and so on.

The set of values of all considered factors for a given location and a given time will be called an "observation". The set of all observations for a given time interval and a given set of natural objects constitutes the data table, which is the only basis for our calculations. Each observation may also be interpreted as a point in the multidimensional space that has the factors as its coordinate axes. An important feature of the data actually available in practice, which will essentially guide our choice of calculation procedures, is that real data tables lack a significant number of the values that would be present in a fully filled table.

It is also supposed that for each observation we have values of some biotic characteristics which permit us to qualify the ecosystem state for this observation as normal (state of well-being) or abnormal (state of disturbance, pathology). Correspondingly, each observation in the factor space can be designated as normal or abnormal (Fig. 1).

The main purpose of the presented method is to delimit in the factor space the region (in the form of a multidimensional parallelepiped) corresponding to normal functioning of the ecosystem, i.e. to determine the boundaries of this region for each factor. These boundaries are called ecologically tolerable levels (ETL) of the factors, and when an ecosystem moves beyond them this is interpreted as its transition from a normal state to an abnormal one.

In reality, the boundaries of the region of normal functioning of the ecosystem may be rather fuzzy. Fig. 2 presents, in the "factor - state" coordinate system, an idealized diagram of the distribution of observations for a one-dimensional factor space. The region of normal functioning is simple: for all factor values less than the ETL the ecosystem state is normal, and otherwise it is abnormal. An analogous diagram for a real situation is given in Fig. 3.

There are several causes of departure from the idealized case. The first is statistical variation of the data and measurement errors. As a result, "pluses" can appear to the right of the ETL (in quadrant 2 of the diagram) and "minuses" can appear to the left (in quadrant 3). The second cause is the action of additional factors upon the ecosystem. These factors can cause abnormality even when the considered factor has a value not exceeding the ETL. As a result, "minuses" can appear to the left of the ETL (in quadrant 3). However, this does not add "pluses" to the right of the ETL (in quadrant 2). This last feature, the absence of points in quadrant 2 (if we do not take observational errors into account), is very important for the practical search for ETL. The corresponding algorithms should be based on transforming the diagrams to a form in which quadrant 2 is as empty as possible. To estimate the degree of "emptiness", a criterion called the "criterion of exactness" is introduced. It is defined as the ratio of the number of observations with an abnormal ecological state estimate and a factor value exceeding the ETL (quadrant 4) to the number of all observations whose factor value exceeds the ETL (quadrants 2 and 4 together). For a correctly calculated ETL the exactness must be near 100%.

The presence of observations having an abnormal ecological estimate but a "normal" (i.e. not exceeding the ETL) factor value means that (besides erroneous measurements and statistical variation) the factor under consideration does not completely describe the causes of departures from normality. Quantitatively, this situation can be described by the so-called "criterion of completeness", which is defined as the ratio of the number of ecologically abnormal observations with factor values exceeding the ETL (quadrant 4) to the number of all ecologically abnormal observations (quadrants 3 and 4 together). This criterion also reflects the importance of the factor: if the completeness for a factor is high, then a large number of cases of abnormality can be explained by it.

The criteria of exactness and completeness for evaluating the assertion "if the ETL is exceeded then the ecosystem state is abnormal" are borrowed from the conception of so-called "determination analysis" (Chesnokov, 1982). More common statistical methods, such as discriminant analysis or multiple regression analysis, usually cannot be used in our context because of the large number of missing data.


Let us introduce some notation. Let X be an environmental factor and Y be an estimate of the ecological state. With respect to the factor, we classify an observation as normal if X is not greater than some value E (which ultimately is to be determined), called the ecologically tolerable level (ETL), and as abnormal if X is greater than E. With respect to the state estimate, we define the observation as ecologically normal if Y is not greater than a given value F and as abnormal otherwise. Let us also denote:

a is the number of observations for which Y ≤ F and X ≤ E;

b is the number of observations for which Y ≤ F and X > E;

c is the number of observations for which Y > F and X ≤ E;

d is the number of observations for which Y > F and X > E.
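
The four counts can be tallied directly from paired factor and state values. The following is a minimal Python sketch, not the authors' implementation; the function name `contingency_counts` is hypothetical, and the comparison on the normal side is read as "not greater than", as in the definitions above.

```python
def contingency_counts(x, y, E, F):
    """Return (a, b, c, d) for one factor and a one-sided ETL.

    x : factor values, y : state estimates (same length),
    E : candidate ecologically tolerable level, F : norm/pathology boundary.
    """
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        factor_exceeds = xi > E   # factor value is beyond the ETL
        state_abnormal = yi > F   # state estimate is worse than the boundary
        if not state_abnormal and not factor_exceeds:
            a += 1
        elif not state_abnormal and factor_exceeds:
            b += 1
        elif state_abnormal and not factor_exceeds:
            c += 1
        else:
            d += 1
    return a, b, c, d
```

In the quadrant language of Section 1, `b` counts the observations in quadrant 2, `c` those in quadrant 3 and `d` those in quadrant 4.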

For simplicity, the examples we consider assume exactly this situation, i.e. we classify as abnormal the observations whose factor values lie to the right of the ETL. However, the situation can be more complicated. For example, for a factor such as the concentration of oxygen dissolved in water, the unfavorable values are the lower ones. In general, the boundary of the region of ecologically tolerable values of a factor is two-sided and is defined by two boundary values E1 and E2, called the lower and upper ETL. Correspondingly, the quantities a, b, c, and d are defined by the conditions:

a is the number of observations for which Y ≤ F and E1 ≤ X ≤ E2;

b is the number of observations for which Y ≤ F and X < E1 or X > E2;

c is the number of observations for which Y > F and E1 ≤ X ≤ E2;

d is the number of observations for which Y > F and X < E1 or X > E2.

Using this notation we can define the exactness of the assertion "if X<E1 or X>E2 then Y>F" by the formula

$$T = \frac{d}{b + d} \cdot 100\%$$

and the completeness of this assertion by

$$P = \frac{d}{c + d} \cdot 100\%$$

Determining E (or E1 and E2 in the case of two-sided boundaries) is our main purpose in searching for the dependence of Y upon X for a given F. For a given lower bound on the exactness T, we define as optimal the value of E for which the completeness P is maximal.
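
As an illustration, the one-factor search can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' program: the names `exactness_completeness` and `find_etl` are hypothetical, exactness and completeness are computed as d/(b+d) and d/(c+d) following their verbal definitions in Section 1 (as fractions, not percentages), only observed factor values are tried as candidates for E, and ties in completeness are resolved in favour of the smallest E.

```python
def exactness_completeness(x, y, E, F):
    """Exactness T = d/(b+d) and completeness P = d/(c+d) for a one-sided ETL."""
    b = sum(1 for xi, yi in zip(x, y) if xi > E and yi <= F)
    c = sum(1 for xi, yi in zip(x, y) if xi <= E and yi > F)
    d = sum(1 for xi, yi in zip(x, y) if xi > E and yi > F)
    T = d / (b + d) if (b + d) else None   # undefined when nothing exceeds E
    P = d / (c + d) if (c + d) else None   # undefined when no state is abnormal
    return T, P


def find_etl(x, y, F, T_min):
    """Return (E, T, P) maximizing P subject to T >= T_min, or None."""
    best = None
    for E in sorted(set(x)):               # assumption: candidates are observed values
        T, P = exactness_completeness(x, y, E, F)
        if T is None or P is None or T < T_min:
            continue
        if best is None or P > best[2]:    # strict '>' keeps the smallest E on ties
            best = (E, T, P)
    return best
```

With the hypothetical values `x = [0.2, 0.4, 0.5, 0.9, 1.2, 1.5, 1.8, 2.0]` and `y = [1.0, 3.5, 2.0, 2.5, 2.5, 4.0, 3.0, 4.5]`, the call `find_etl(x, y, F=2.75, T_min=0.75)` returns `(0.9, 0.75, 0.75)`: thresholds below 0.9 leave too many normal observations to the right of E, and larger thresholds do not increase the completeness.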

Such an analysis, which could be called "one-factor analysis", can be carried out for each factor. As an example, Fig. 3 presents the results obtained when determining the influence upon the ecological state of the factor "nitrogen concentration", one of 67 factors for which such an analysis was performed on the basis of the monitoring data for the Don water basins in 1975-91. The ecological state estimate ranges from 1 (completely normal state) to 5 (state of regress) with a precision of 0.5 (Abakumov 1991; Geletin et al. 1991). The value F=2.75 was taken as the boundary between norm and pathology and the value T=75% as the lower bound of the exactness. The maximum value of the completeness in this case was attained for a nitrogen concentration of E=0.986 mg l⁻¹, which is therefore the ecologically tolerable level sought.

If there are several factors, their one-factor ETL may be pooled. We shall classify an observation as abnormal if at least one of the considered factors X1, ..., Xm exceeds its ETL. Correspondingly, the values of a, b, c, and d are redefined as follows:

a is the number of observations for which Y ≤ F and Xi ≤ Ei for all Xi;

b is the number of observations for which Y ≤ F and Xi > Ei for at least one Xi;

c is the number of observations for which Y > F and Xi ≤ Ei for all Xi;

d is the number of observations for which Y > F and Xi > Ei for at least one Xi.

After this redefinition the formulas for T and P remain the same. The pooled completeness P cannot be less than any of the one-factor values, but the pooled exactness may fall below its prescribed lower bound.

It is worth noting that the expressions "for all Xi" and "for at least one Xi" mean "for all Xi having values for this observation" and "for at least one Xi having values for this observation". This makes it possible to increase the number of observations used in the analysis, which is very important for data having many missing values.
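
The pooled counts with this treatment of missing values can be sketched in Python as follows. The function name `pooled_counts`, the representation of an observation as a dictionary with `None` for a missing value, and the choice to skip observations that have no value for any factor are assumptions made for the illustration; only one-sided (upper) ETL are handled.

```python
def pooled_counts(observations, y, etl, F):
    """Return pooled (a, b, c, d) for several factors with one-sided ETL.

    observations : list of dicts mapping factor name -> value or None (missing)
    y            : state estimates, one per observation
    etl          : dict mapping factor name -> one-factor ETL
    F            : norm/pathology boundary for the state estimate
    """
    a = b = c = d = 0
    for obs, yi in zip(observations, y):
        # "for all Xi" / "at least one Xi" refer only to factors with a value
        present = [n for n in etl if obs.get(n) is not None]
        if not present:
            continue  # assumption: nothing to compare, observation is skipped
        exceeds = any(obs[n] > etl[n] for n in present)
        abnormal = yi > F
        if not abnormal and not exceeds:
            a += 1
        elif not abnormal and exceeds:
            b += 1
        elif abnormal and not exceeds:
            c += 1
        else:
            d += 1
    return a, b, c, d
```

An observation with a single measured factor still contributes to the counts, which is what allows sparse data tables to be used.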

As an alternative to the pooling procedure, a purely multifactor procedure may be used. It consists in determining a joint set of ETL that maximizes the joint completeness while satisfying a given constraint on the minimum admissible exactness. One advantage of this procedure is the possibility of obtaining ETL that are closer to optimal.

Unfortunately, the purely multifactor approach demands too much computation time as soon as the number of factors exceeds two or three. Because of this we propose a "combined" method, which proceeds as follows. Initially, by a purely two-factor method, we find the pair of factors that provides the maximum completeness while satisfying the constraint on the exactness (that it be not less than the given lower bound). After this, we try adding to this pair each of the remaining factors, calculating for each of them the optimal three-factor ETL and the corresponding three-factor completeness (these are not, however, purely three-factor parameters, because the ETL found for the initial pair are not recalculated). Finally, we take as the third factor the one that provides the maximum increase of completeness when it is joined to the initial pair of factors.

Analogously, the triplet of factors can be enlarged to a set of four factors, and so on. In comparison with the pooling procedure, this procedure guarantees that the constraint on the minimum admissible exactness is satisfied and can provide ETL that are closer to optimal (though possibly less so than those of the purely multifactor procedure).

3. Application example of the method

The combined procedure was used to assess the dependence of the ecological state of zooplankton in the Lower Don basin upon 87 environmental factors. Table 1 presents the one-factor ETL, exactness and completeness values for the 12 factors that provide the maximum completeness in one-factor analysis (biological oxidation, ammonium, nitrites, oil products, phenols, detergents, copper, water flow, zinc, chromium, pesticides, temperature).

The results of the purely two-factor analysis are given in Table 2. Fig. 4 illustrates this analysis for the pair of the most informative factors, "ammonium" and "oil products". Adding the second factor increases the completeness from 37 to 48 percent (the factor "oil products" was preferred to "zinc" because the data for it were more complete).

As a result of sequential joining of factors to the initial pair we found the following values for the completeness:

nitrites (3rd factor): P=53%;

biological oxidation (4th factor): P=56%;

detergents (5th factor): P=57%;

water flow (6th factor): P=59%;

phenols (7th factor): P=59%.

For 7 factors, the pooling procedure described earlier could provide a completeness P of 62%, but only at an exactness of 70%, whereas the combined procedure provides an exactness of 75%, though at a completeness of only 59%.

A disadvantage of the combined procedure is its inability to work with data tables having many missing values because, like the purely multifactor procedure, it requires the values of all factors to be present for all observations taken into account. As a more practical alternative, a procedure that could be called "stepwise" is proposed. It resembles the combined procedure but differs from it by using only one-factor ETL. At each new step the factor providing the maximum increase of completeness is sought, but its one-factor ETL is not recalculated. At the last step we obtain the pooled completeness and exactness. Again, the constraint on exactness may not be fulfilled.
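
The stepwise procedure can be sketched in Python as a greedy selection over fixed one-factor ETL. This is an illustrative sketch, not the authors' program: the names `pooled_tp` and `stepwise` are hypothetical, only one-sided (upper) ETL are handled, observations are dictionaries with `None` for missing values, and the stopping rule (stop when no remaining factor increases the pooled completeness) is an assumption, since the text does not state one.

```python
def pooled_tp(observations, y, etl, F, chosen):
    """Pooled exactness T and completeness P for the factors in `chosen`."""
    b = c = d = 0
    for obs, yi in zip(observations, y):
        present = [n for n in chosen if obs.get(n) is not None]
        if not present:
            continue  # assumption: skip observations lacking all chosen factors
        exceeds = any(obs[n] > etl[n] for n in present)
        abnormal = yi > F
        if exceeds and abnormal:
            d += 1
        elif exceeds:
            b += 1
        elif abnormal:
            c += 1
    T = d / (b + d) if (b + d) else 0.0
    P = d / (c + d) if (c + d) else 0.0
    return T, P


def stepwise(observations, y, etl, F):
    """Greedily add the factor giving the largest pooled completeness.

    The one-factor ETL in `etl` are never recalculated. Returns a list of
    (factor name, pooled T, pooled P) in order of inclusion.
    """
    chosen, history = [], []
    remaining = list(etl)
    best_P = 0.0
    while remaining:
        scores = {n: pooled_tp(observations, y, etl, F, chosen + [n])
                  for n in remaining}
        name = max(remaining, key=lambda n: scores[n][1])
        T, P = scores[name]
        if P <= best_P:
            break  # assumed stopping rule: no factor improves completeness
        chosen.append(name)
        remaining.remove(name)
        history.append((name, T, P))
        best_P = P
    return history
```

The order of inclusion gives a ranking of the factors, and the pooled exactness reported at each step shows whether the constraint on exactness is still met.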

If there are many missing values, it is more reasonable to modify the criterion for including factors in the stepwise procedure. In place of the completeness defined earlier we recommend using the following characteristic:

$$P_a = \frac{d}{N_2} \cdot 100\%$$

(N2 is the number of all observations for which Y>F), which can be called "absolute completeness". This prevents the inclusion, especially at the first steps, of factors that give a large increase of P but only for a limited number of observations.


One result of the stepwise procedure is a ranking of the factors in the order of their influence upon the state of the ecosystem. Another is information about interdependences between factors. For example, if a group of factors have high values of completeness at some step but, after one of them is included at this step, each of the remaining factors gives only a small increase of completeness at the next step, then this means that their influences upon Y are correlated (this is analogous to partial correlation analysis for quantitative variables). Information of this type is important for purposes of control: we must take into account the possibility that a factor estimated by our procedure as important may in fact only be correlated with another acting factor.

We have already noted that the stepwise procedure, in contrast to the purely multifactor and combined ones, does not guarantee fulfillment of the exactness constraint. To compensate for this disadvantage we propose calculating the one-factor ETL with a higher value of the minimum admissible exactness. For example, we took a one-factor exactness bound of 80% in the hope of obtaining not less than 75% for the pooled exactness.

4. Statistical reliability of the data

In addition to obtaining an estimate of the ETL, it is useful to assess the statistical reliability of this estimate. For this we used a method of statistical simulation called the "bootstrap". It proceeds as follows: k samples of data of the same dimension m+1 (m factors plus one state estimate) and of the same size N are generated by randomly drawing observations, with replacement, from the original data. After this, the algorithm that was used for finding E, T, and P is applied to each of the simulated data tables. As a result we obtain k sets of values for the ETL, exactness, and completeness: E1, ..., Ek; T1, ..., Tk; P1, ..., Pk. Then we calculate for these sets their means Em, Tm, Pm, their standard deviations Es, Ts, Ps, and their coefficients of variation Es/Em, Ts/Tm, Ps/Pm.

The number of samples k is a compromise between the reliability of the method and the amount of computation. We took k=10, which was quite sufficient given the large variability of the data. The coefficients of variation of the ETL, exactness and completeness can be used as additional characteristics for choosing the factors that are most significant from the point of view of their influence upon the ecological state (Tables 3, 4, 5).
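
A minimal Python sketch of this bootstrap for a single factor is given below. It is an illustration under stated assumptions, not the authors' program: the names `find_etl` and `bootstrap_etl` are hypothetical, resampling is done with replacement, the ETL search is a compact one-sided version that returns the smallest observed E meeting the exactness bound, and resamples for which no admissible E exists are simply dropped.

```python
import random
import statistics


def find_etl(x, y, F, T_min):
    """Smallest observed E with exactness d/(b+d) >= T_min, as (E, T, P).

    Completeness d/(c+d) does not increase as E grows, so the smallest
    admissible E also maximizes completeness. Returns None if none exists.
    """
    for E in sorted(set(x)):
        b = sum(1 for xi, yi in zip(x, y) if xi > E and yi <= F)
        c = sum(1 for xi, yi in zip(x, y) if xi <= E and yi > F)
        d = sum(1 for xi, yi in zip(x, y) if xi > E and yi > F)
        if d > 0 and d / (b + d) >= T_min:
            return E, d / (b + d), d / (c + d)
    return None


def bootstrap_etl(x, y, F, T_min, k=10, seed=0):
    """Mean, standard deviation and coefficient of variation of E, T, P."""
    rng = random.Random(seed)
    n = len(x)
    results = []
    for _ in range(k):
        # draw N observations with replacement from the original data
        idx = [rng.randrange(n) for _ in range(n)]
        est = find_etl([x[i] for i in idx], [y[i] for i in idx], F, T_min)
        if est is not None:  # assumption: failed resamples are skipped
            results.append(est)
    summary = {}
    for name, values in zip(("E", "T", "P"), zip(*results)):
        mean = statistics.mean(values)
        std = statistics.stdev(values) if len(values) > 1 else 0.0
        cv = std / mean if mean else float("nan")
        summary[name] = {"mean": mean, "std": std, "cv": cv}
    return summary
```

A small coefficient of variation for E indicates that the estimated ETL is stable under resampling of the observations.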


The analysis described above allows us, for each state estimate and for each factor, to find the values of ETL, exactness and completeness. As a result, all the factors are divided into two groups: "undersignificant factors", i.e. those for which the ETL lie between the minimum and maximum observed values of the corresponding factors, and "insignificant factors", i.e. those for which all ecological estimates are normal.

The ETL for insignificant factors, if they exist, lie somewhere outside the range of values in the analyzed data. The result of the analysis for these factors is their minimum and maximum values, which we call ecologically safe boundaries (ESB). When establishing ecological norms we can use these boundaries as an approximation to the complete region of ecological well-being, which cannot be found because of the incompleteness of the data.

The result of the analysis for undersignificant factors is their ETL. Further analysis makes it possible to single out those of them which we call "significant" and which can serve as the main basis for ecological normalization. We recognize a factor as significant by taking several criteria into account. The main criterion is a high value of completeness for this factor. The second is a sufficient number of observations (at least about 10) for normal as well as for abnormal states. Finally, expert considerations about the influence of a given factor upon the ecological state are taken into account.

For undersignificant factors that were not recognized as significant we can use their ESB for ecological normalization. Their ETL can also be used, but we must take into account that they are not sufficiently justified.

The significant factors can be ranked according to their completeness P. This criterion may be interpreted as follows: if, thanks to environmental protection measures, the value of a factor with completeness P enters the region bounded by its ETL, then P% of abnormal states will become normal (on condition that all other causes of abnormality are also eliminated).

The ETL method requires not only data on the values of factors but also data on ecological estimates. In accordance with the biotic conception of environment control (Levich, 1994), the state estimates must be based upon biotic characteristics. The estimates used in the examples in this paper are obtained by the so-called method of ecological modifications and are based on such characteristics as the number of species and their abundances, and indices of saprobity for phytoplankton, bacterioplankton, periphyton and zoobenthos. State estimates may also be based upon other characteristics recognized as adequate to the purposes of the study, for example, upon ichthyological characteristics of a water basin (Bulgakov et al., in press), upon hygienic characteristics of an environment, upon physiological characteristics of indicator organisms, and so on.

We recommend applying the principle of maximum severity when choosing between ETL corresponding to different types of state estimates or to different time lags, i.e. choosing the minimum value among all upper ETL and the maximum value among all lower ETL. The user can propose some other criterion; for example, one of the state estimates can be declared the most important, and then the ETL for this estimate must be preferred.
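
The principle of maximum severity amounts to intersecting the tolerable intervals obtained for the different state estimates or time lags. A minimal Python sketch follows; the function name `most_severe_etl` and the use of `None` for a boundary that was not determined are assumptions made for the illustration.

```python
def most_severe_etl(bounds):
    """Combine (lower ETL, upper ETL) pairs by the principle of maximum severity.

    bounds : list of (lower, upper) pairs, one per state estimate or time lag;
             a side that was not determined is given as None.
    Returns (max of the lower ETL, min of the upper ETL).
    """
    lowers = [lo for lo, _ in bounds if lo is not None]
    uppers = [up for _, up in bounds if up is not None]
    lower = max(lowers) if lowers else None   # strictest lower boundary
    upper = min(uppers) if uppers else None   # strictest upper boundary
    return lower, upper
```

For example, for the hypothetical pH boundaries `[(6.5, 8.9), (6.8, 9.2), (None, 8.7)]` the function returns `(6.8, 8.7)`.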

It is worth noting that, when the conception of limiting admissible concentrations is used, different norms are likewise applied for different purposes (medicine, fishery).

Not only instantaneous values of abiotic factors may be used for the analysis but also their averages over a year, a month or any other period, as well as their minimum and maximum values, any value lagged in time, and so on. As an example, Fig. 5 presents a chronogram of monthly averages of pH for the ecological estimate based on zoobenthos characteristics for the Lower Don.

We have already noted that the inclusion of a new factor in the stepwise procedure depends not only upon its own importance but also upon its correlations with other factors. We call the "essentiality" of a factor with respect to a set of other factors the increase of pooled completeness when this factor is added to the set. When choosing environmental protection measures we must take into account the essentiality of a factor along with other considerations, such as its amenability to regulation and its social impact.

The results obtained can also be used to assess any separate observation, related to a particular factor, location and time point, from the point of view of how unfavorable it is for the ecosystem state. For this it is convenient to express the value of the factor under consideration as a ratio to its ETL.

In conclusion we must note that the results obtained naturally depend essentially upon the particularities of the data used for the analysis. It may happen that the factors actually influencing the ecosystem state are not present among the factors we analyze, and then the monitoring program must be extended. Some of the factors included in the analysis must not be considered as causes of ecological disturbances, being themselves influenced by other factors. For example, indices of chemical and biological oxidation, pH and dissolved oxygen are partially of this type. Our formal analysis should therefore always be accompanied by additional expert analysis.


Abakumov, V.A. (1992). Ecological modifications and biocenosis development. In Ecological modification and criteria for ecological standardization, 15-37. St. Petersburg: Gidrometeoizdat.

Bulgakov, N.G., Levich, A.P., and Teriokhin, A.T. (1996, in press). A method for searching contingencies between hydrobiological characteristics and abiotic environmental factors. Notes of Moscow State University. Biological issue.

Cairns, J., Jr. and Niederlehner, B.R. (1993). Ecological function and resilience: neglected criteria for environmental impact assessment and ecological risk analysis. The Environmental Professional 15, 116-124.

Chesnokov, S.V. (1982). Determination analysis of socio-economic data. Moscow: Nauka (in Russian).

Geletin, Yu.V., Zamolodchikov, D.G., Levich, A.P., Volynov, A.M., Koreneva, I.B., and Yadkova, V.V. (1992). Estimation and prediction of water ecosystem state by ecological modification method. In Ecological modification and criteria for ecological standardization, 179-191. St. Petersburg: Gidrometeoizdat.

Karr, J.R. (1991). Biological integrity: a long-neglected aspect of water resource management. Ecological Applications 1, 66-84.

Levich, A.P. (1994). Biological conception of environment control. Doklady Biological Sciences 337, 257-259.