A Knowledge-Based Approach to Predicting Salinity in the South-West of Western Australia

¹P.A. Caccetta, ²H.T. Kiiveri, ²F.H. Evans and ³R. Ferdowsian
¹School of Computing, Curtin University of Technology
²CSIRO Division of Mathematics and Statistics
³Department of Agriculture Western Australia

Abstract
This paper describes the construction of a knowledge-based system and its application to predicting salinity in the Kent River catchment located in the south-west of Western Australia. The system incorporates remotely sensed data, data derived from digital elevation models, maps produced by experts and data combination rules based upon expert knowledge and expert-derived training data. Bayesian networks are used as the modelling environment, allowing reasoning with uncertainty in the process of combining data. Firstly, data were obtained for three time periods spaced roughly a decade apart. A knowledge-based system was then conceived and applied to the data, with data from one decade used to predict land condition in the following decade. The resulting prediction maps were then compared with independent validation data, giving good results. The maps were used to provide estimates of the historical spread of salinity in the catchment and its likely extent in the future.

1 Introduction
In much of the Western Australian agricultural region, the clearing of land has led to rising saline groundwater, resulting in the loss of previously productive land to salinity. Based on farmer surveys conducted in 1979 and 1989, the Australian Bureau of Statistics reported that 443000 ha (2.8%) of previously arable land was lost to salinity at a rate of about 18000 ha each year.

Aerial photo interpretation has been used to assess the extent of salinity in some areas [4], but this is time consuming and expensive. More recent efforts [6] [7] [8] have used Landsat TM data to estimate the extent of salinity. Typically, statistical approaches such as maximum likelihood classification are used to classify the data into different landcover classes, which are associated with the land being severely affected, slightly affected and not affected.

The above approaches provide a means of monitoring the extent of salinity, but they provide no insight into the likely effects that remedial actions such as tree planting will have on the problem, or into which areas, given the current land status, are at risk of salinisation in the longer term. The latter is imperative for the formulation of remedial measures.

To explore ways of achieving these aims, we developed a knowledge-based system for representing relationships relevant to salinisation, which can then be applied to mapping and predicting salinity. The system takes various input maps in a GIS and applies rules, which have varying degrees of confidence, to produce output maps of existing and likely future salinity. The system may also be used in an interactive fashion to answer what-if scenarios.

The system was developed in the Upper Kent Catchment (approximately 2000 sq. km), which lies in a high rainfall (500-750 mm) area of the agricultural region, approximately 350 km south east of Perth.

The methodology of our approach was to:

consult with experts to gain knowledge of the problem;
based on the expert knowledge, obtain data (where possible) that are relevant, possibly with further processing, to applying the knowledge over the catchment;
based on expert knowledge and available data sets, construct the knowledge-based system;
use the resulting model to interpret the data, i.e. produce salinity risk maps over the region; and
validate the results by comparing the system output to independent validation data and, if acceptable, form a catchment salinity summary.

These points are discussed in this paper.

2 Knowledge Elicitation and Relevant Data Sets
A workshop was held to provide a forum to quantify current knowledge on factors relevant to dryland salinisation in the south-west of Western Australia. The workshop was attended by 21 people, including experts on salinisation from the Department of Agriculture, the CSIRO and the West Australian Water Authority. The factors identified include time since clearing, depth to ground water, rate of ground water rise, distance to existing salinity, climate, degree of waterlogging, geology, salt storage, landform, position in flow path, historical and existing vegetation cover, depth to basement, hydrology (such as influenced by shear zones, faults and dykes) and land management.

Clearly, relevant data for some factors could not be obtained in a cost effective manner on a catchment scale; for instance, data relating to depth to ground water and rate of ground water rise typically are obtained from monitoring bore holes.

A time period of approximately one decade was chosen as the monitoring interval. Features derived from a digital elevation model (dem), Landsat MSS and Landsat TM satellite imagery and historical aerial photographs were the primary sources of data.

Historical estimates of landuse, and in particular clearing, were obtained for the years 1977, 1988 and 1994 from classification of Landsat MSS (September 1977) and Landsat TM (August 1988 and August 1994) data. A maximum likelihood classifier was used.
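As an illustration of maximum likelihood classification in general, the following minimal sketch fits one Gaussian per class and band from training pixels and assigns each pixel to the class under which it is most likely. It is not the classifier configuration used in this study: the bands are assumed independent (a diagonal covariance, chosen here only to keep the example short), and the class names, band values and function names are hypothetical.

```python
import math

def fit_class_stats(training):
    """Estimate per-band mean and variance for each class.

    training maps a class name to a list of pixel vectors (one value per band).
    """
    stats = {}
    for label, pixels in training.items():
        n = len(pixels)
        bands = list(zip(*pixels))
        means = [sum(b) / n for b in bands]
        # Maximum likelihood variance estimate, floored to avoid division by zero.
        variances = [max(sum((v - m) ** 2 for v in b) / n, 1e-6)
                     for b, m in zip(bands, means)]
        stats[label] = (means, variances)
    return stats

def log_likelihood(pixel, means, variances):
    """Log of a Gaussian density with independent bands (an assumption here)."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
               for x, m, var in zip(pixel, means, variances))

def classify(pixel, stats):
    """Return the class whose Gaussian gives the pixel the highest likelihood."""
    return max(stats, key=lambda label: log_likelihood(pixel, *stats[label]))

# Hypothetical two-band training pixels (values are illustrative only).
training = {
    "cleared": [(60, 40), (62, 43), (58, 41), (61, 39)],
    "bush": [(30, 70), (28, 72), (33, 69), (31, 74)],
}
stats = fit_class_stats(training)
labels = [classify(p, stats) for p in [(59, 42), (29, 71)]]
```

With these invented values the two test pixels fall clearly within the cleared and bush training clusters respectively.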

Elevation data were in the form of contours derived from stereo photo interpretation. This is the most common source of elevation data presently available in Western Australia. For the area considered, 5-metre contour data were available. These data were interpolated using cubic spline interpolation to form a raster dem.

A slope map was produced from the dem. Using the dem and the clearing information derived from the Landsat data, maps of up slope cleared area and percentage up slope cleared area were produced for each of the periods using an algorithm based on flow path predictions [5].
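The idea of accumulating cleared area along flow paths can be sketched as follows. This is a simplified stand-in, not the algorithm of [5]: each cell is assumed to drain entirely to its steepest downslope neighbour (single-direction D8 routing), a cell's own area is not counted as up slope of itself, and the grid, cell size and function names are hypothetical.

```python
import math

def upslope_cleared(dem, cleared, cell_area=1.0):
    """Return (up slope cleared area, percentage up slope cleared) grids.

    dem is a 2-D list of elevations and cleared a 2-D list of booleans.
    Each cell drains to its steepest downslope neighbour (D8 routing).
    """
    rows, cols = len(dem), len(dem[0])
    up_area = [[0.0] * cols for _ in range(rows)]  # total area up slope
    up_clr = [[0.0] * cols for _ in range(rows)]   # cleared area up slope

    def receiver(r, c):
        """Steepest downslope neighbour of (r, c), or None for a pit or flat."""
        best, best_slope = None, 0.0
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (dr or dc) and 0 <= rr < rows and 0 <= cc < cols:
                    slope = (dem[r][c] - dem[rr][cc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (rr, cc), slope
        return best

    # Visit cells from highest to lowest so that every cell has received all
    # of its up slope contributions before it passes them on.
    cells = sorted(((r, c) for r in range(rows) for c in range(cols)),
                   key=lambda rc: dem[rc[0]][rc[1]], reverse=True)
    for r, c in cells:
        target = receiver(r, c)
        if target is None:
            continue
        tr, tc = target
        up_area[tr][tc] += up_area[r][c] + cell_area
        up_clr[tr][tc] += up_clr[r][c] + (cell_area if cleared[r][c] else 0.0)

    per_clr = [[100.0 * up_clr[r][c] / up_area[r][c] if up_area[r][c] else 0.0
                for c in range(cols)] for r in range(rows)]
    return up_clr, per_clr

# Hypothetical 3 x 3 hillslope draining towards the centre of the bottom row.
dem = [[9, 9, 9],
       [6, 5, 6],
       [3, 1, 3]]
cleared = [[True, False, True],
           [False, True, False],
           [False, False, False]]
up_clr, per_clr = upslope_cleared(dem, cleared)
```

In this toy grid all eight other cells drain to the lowest cell, three of them cleared, so that cell has an up slope cleared area of 3 cells and 37.5% of its up slope area cleared.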

Independent training and validation data were obtained from an interpretation of historical aerial photographs, with the areas classified into the classes not saline, potentially saline and saline. The same training areas were interpreted for the years 1973, 1985 and 1994, thus enabling predictions to be made for the years 1985, 1994 and 2004. Unfortunately, some unavoidable date mismatches occurred between the available Landsat data and the aerial photographs used for the training and validation data, although the data were considered sufficient to model the general trends.

Aerial photo interpretation was also used to partition the catchment into landform patterns [2], from which surrogates for ground water salinity were derived.

The data used were:

Up slope cleared area (UpSlpClr): the total area of land up slope of a given location that has been cleared;
Percentage up slope cleared (PerClr): the percentage of the catchment above a given location that has been cleared;
Flowslope (FlowSlope): the slope of the land in the direction of surface flow;
Land cover (LandCov): land use and condition as identified using Landsat satellite imagery; and
Ground water salinity (GndWatSal): map of the expected ground water salinity, as derived from landform patterns.

Based on expert knowledge and available data, a knowledge-based system representing the process of salinisation was constructed. This is the topic of the next section.

3 Knowledge Representation
A Bayesian network, also referred to as a causal network or a (conditional) probability network, was used for representing knowledge and combining evidence. The network that was constructed may be represented graphically as in Figure 1. The ellipses represent variables (attributes) whose values are derived from maps, human input or inferred by the system. The directed edges represent relationships that exist between variables. For example, Disch94 is ground water discharge in 1994 and LndCov94 is land use/condition in 1994 derived from a classification of Landsat data.

Figure 1: Graphical representation of the Bayesian network
The knowledge embodied in the network is represented by the joint probability distribution of the variables, i.e. beliefs (rules) are expressed in terms of (conditional) probabilities. The arrows in Figure 1 depict the rules that need to be specified. The probabilities for a variable are specified as the conditional probability for that variable given all variables that have an edge directed toward the variable of interest. For example, in Figure 1, one set of rules that needed to be specified was P(Disch94 | FlowSlope, PerClr94, UpSlpClr94).
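One way a conditional probability table of this kind might be filled in from labelled samples is by relative frequencies, as in the minimal sketch below. It is an illustration only: the table is reduced to two parent variables for brevity, the discrete states and sample counts are invented, and the Laplace smoothing count alpha is an assumption of the sketch rather than something described in this paper.

```python
from collections import Counter, defaultdict

def estimate_cpt(samples, child, parents, child_states, alpha=1.0):
    """Estimate P(child | parents) by smoothed relative frequencies.

    samples is a list of dicts mapping variable names to discrete states.
    alpha is a Laplace smoothing count (an assumption of this sketch) that
    keeps unseen child states from receiving zero probability.
    """
    counts = defaultdict(Counter)
    for s in samples:
        counts[tuple(s[p] for p in parents)][s[child]] += 1
    cpt = {}
    for config, c in counts.items():
        total = sum(c.values()) + alpha * len(child_states)
        cpt[config] = {state: (c[state] + alpha) / total
                       for state in child_states}
    return cpt

# Hypothetical training pixels; states and counts are illustrative only.
samples = (
    [{"FlowSlope": "flat", "PerClr94": "high", "Disch94": "yes"}] * 7
    + [{"FlowSlope": "flat", "PerClr94": "high", "Disch94": "no"}] * 1
    + [{"FlowSlope": "steep", "PerClr94": "high", "Disch94": "no"}] * 8
)
cpt = estimate_cpt(samples, "Disch94", ["FlowSlope", "PerClr94"], ["yes", "no"])
```

Each row of the resulting table sums to one; parent combinations that never occur in the samples are simply absent and would have to be specified subjectively.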

Given evidence about the state of one or more variables, the system can be used to derive conclusions about the probable state of the remaining variables. Note that the system works with probabilities in a consistent manner, allowing probabilities to be estimated and interpreted as relative frequencies. This feature is particularly useful in remote sensing and GIS when using the system in an interactive mode, as these values relate directly to the tangible interpretation: given the available information, it is expected that x% of the land has the characteristic in question. For those interested, more technical details can be found in Caccetta et al [1] and Lauritzen and Spiegelhalter [3].
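The updating of beliefs given evidence can be illustrated on a toy network with brute-force enumeration of the joint distribution. This is a sketch under stated assumptions, not the system's inference engine: practical systems use local computation schemes such as that of [3], the four-variable structure below is a hypothetical fragment, and all probability values are invented.

```python
from itertools import product

# Hypothetical priors and conditional probability tables (illustrative values).
p_slope = {"flat": 0.5, "steep": 0.5}
p_clr = {"low": 0.5, "high": 0.5}
# P(Disch = yes | FlowSlope, PerClr)
p_disch_yes = {("flat", "low"): 0.2, ("flat", "high"): 0.6,
               ("steep", "low"): 0.1, ("steep", "high"): 0.2}
# P(LndCov = salt_affected | Disch)
p_cov_salt = {"yes": 0.8, "no": 0.1}

def joint(slope, clr, disch, cov):
    """Joint probability as the product of each variable given its parents."""
    pd = p_disch_yes[(slope, clr)]
    pc = p_cov_salt[disch]
    return (p_slope[slope] * p_clr[clr]
            * (pd if disch == "yes" else 1 - pd)
            * (pc if cov == "salt_affected" else 1 - pc))

def posterior_disch(evidence):
    """P(Disch = yes | evidence), summing the joint over unobserved variables."""
    totals = {"yes": 0.0, "no": 0.0}
    for slope, clr, disch, cov in product(p_slope, p_clr, ("yes", "no"),
                                          ("salt_affected", "productive")):
        state = {"FlowSlope": slope, "PerClr": clr, "LndCov": cov}
        if all(state[k] == v for k, v in evidence.items()):
            totals[disch] += joint(slope, clr, disch, cov)
    return totals["yes"] / (totals["yes"] + totals["no"])

prior = posterior_disch({})
updated = posterior_disch({"FlowSlope": "flat", "LndCov": "salt_affected"})
```

With these invented tables the belief in discharge rises from 0.275 with no evidence to about 0.84 once the location is known to be flat and its land cover is observed as salt affected.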

The system parameters (conditional probability tables) were usually estimated using the training data, although in some instances the parameters were defined subjectively.

4 Results
The data for the catchment were interpreted (processed) with the Bayesian network to produce probabilistic maps of land condition for each of the three time periods. A classification may be derived from the probabilistic maps, the probabilities giving an indication of the quality of the resulting choice of label. An example of the network output is given in Figure 2, along with the expert classification for the area. The risk map is graded from black to white, with black indicating a low probability of salinisation and white a high probability of salinisation. For the classifications, black represents not affected, grey potentially affected and white already affected.
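The step from a probabilistic map to a classification can be sketched as below, assuming the simplest rule of labelling each pixel with its most probable class and keeping that probability as an indication of label quality. The decision rule, the function name and the pixel probabilities are assumptions made for illustration.

```python
def classify_map(prob_map, classes):
    """Label each pixel with its most probable class and keep that probability."""
    labels, confidence = [], []
    for row in prob_map:
        label_row, conf_row = [], []
        for probs in row:
            best = max(range(len(classes)), key=lambda i: probs[i])
            label_row.append(classes[best])
            conf_row.append(probs[best])
        labels.append(label_row)
        confidence.append(conf_row)
    return labels, confidence

classes = ["not affected", "potentially affected", "already affected"]
# Hypothetical 2 x 2 probabilistic map: one probability per class and pixel.
prob_map = [[(0.90, 0.08, 0.02), (0.30, 0.45, 0.25)],
            [(0.10, 0.30, 0.60), (0.34, 0.33, 0.33)]]
labels, confidence = classify_map(prob_map, classes)
```

The last pixel is labelled not affected with a probability of only 0.34, which flags the label as a weak choice compared with the first pixel at 0.90.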

Figure 2: Risk map (left), Classification (middle), Expert's Class. (right)
The classification maps were compared with the validation data and a summary of the performance of the network is given in Table 1. In the table, Risk has the interpretation "is saline or will go saline in the next time period". The columns give the class predicted by the network and the rows the class identified by the expert, so each column sums to 100%. For example, in 1988, 79% of the pixels predicted to be at risk by the network were also identified to be at risk by the expert.

Table 1: Network Mapping Accuracy (%), Expert vs Network

Network	No Risk	Risk	No Risk	Risk	No Risk	Risk
Year	1977	1977	1988	1988	1994	1994
True No Risk	70	56	75	21	71	25
True Risk	30	44	25	79	29	75
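A table of this form can be produced by cross-tabulating expert and network labels and normalising each network column to 100%. The following is a minimal sketch of that computation; the function name and the ten validation labels are hypothetical and do not reproduce the figures in Table 1.

```python
def column_percentages(expert, network, classes=("No Risk", "Risk")):
    """Cross-tabulate expert and network labels, normalising each network column.

    Returns table[expert_class][network_class]: the percentage of pixels the
    network assigned to network_class that the expert assigned to expert_class.
    """
    counts = {e: {n: 0 for n in classes} for e in classes}
    for e, n in zip(expert, network):
        counts[e][n] += 1
    table = {e: {} for e in classes}
    for n in classes:
        column_total = sum(counts[e][n] for e in classes)
        for e in classes:
            table[e][n] = (100.0 * counts[e][n] / column_total
                           if column_total else 0.0)
    return table

# Hypothetical labels for ten validation pixels.
expert = ["Risk", "Risk", "Risk", "No Risk", "No Risk",
          "No Risk", "No Risk", "Risk", "No Risk", "No Risk"]
network = ["Risk", "Risk", "Risk", "Risk", "No Risk",
           "No Risk", "No Risk", "No Risk", "No Risk", "No Risk"]
table = column_percentages(expert, network)
```

Here the network labels four pixels as Risk, of which the expert agrees on three, so the Risk column reads 25% True No Risk and 75% True Risk.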

The maps were then used to provide a summary of the historical spread and likely extent of future salinity for the catchment. According to the network, 9% of the catchment was affected in 1977; this increased to 14% in 1988, to 20% in 1994 and is predicted to increase to 27% by the year 2004 if no remedial action is undertaken.
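To give these percentages a sense of scale, the short calculation below converts them to approximate areas and average rates of spread. It assumes the approximate catchment area of 2000 sq. km quoted in the Introduction, so the resulting areas are indicative only, and the 2004 value is the network's prediction rather than an observation.

```python
CATCHMENT_AREA_SQ_KM = 2000  # approximate figure given in the Introduction

# Percentage of the catchment affected according to the network (2004 predicted).
affected_percent = {1977: 9, 1988: 14, 1994: 20, 2004: 27}

# Approximate affected area in each year.
affected_sq_km = {year: CATCHMENT_AREA_SQ_KM * pct / 100
                  for year, pct in affected_percent.items()}

# Average increase in percentage points per year between successive dates.
years = sorted(affected_percent)
rate_per_year = {
    (a, b): (affected_percent[b] - affected_percent[a]) / (b - a)
    for a, b in zip(years, years[1:])
}
```

Under this assumption the affected area grows from roughly 180 sq. km in 1977 to a predicted 540 sq. km in 2004, at an average of between about 0.5 and 1 percentage point of the catchment per year.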

5 Conclusions
A combination of remotely sensed data, data derived from digital elevation models and expert knowledge may be used to form cost effective estimates of historical and future salinisation in the agricultural regions of Western Australia. The strategies and methods described here are useful for integrating many different sources of data with expert knowledge, particularly when the data have an associated element of uncertainty.

Acknowledgements
The work by P.A. Caccetta was supported by a grant from the DEC External Research Programme and further funding from the CSIRO Division of Mathematics and Statistics.

This work was part of the Predicting Salinity project carried out with funding provided by the Land and Water Resources Research and Development Corporation.

References

[1] P.A. Caccetta, N.A. Campbell, G. West, H. Kiiveri, and M. Gahegan. Aspects of reasoning with uncertainty in an agricultural GIS environment. To appear in New Review of Applied Expert Systems, 1995.
[2] R. Ferdowsian. Landform patterns of western forest area, on the south coast of W.A. Technical report, Department of Agriculture, Western Australia, 1993.
[3] S.L. Lauritzen and D.J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society B, 50(2):157-224, 1988.
[4] R.A. Nulsen. Salt-affected land in the shire of Wongan-Ballidu, Western Australia. Australian Journal of Soil Research, 19:87-91, 1981.
[5] P. Quinn, K. Beven, P. Chevallier, and O. Planchon. The prediction of hill slope flow paths for distributed hydrological modelling using digital terrain models. Hydrological Processes, 5:59-79, 1991.
[6] J.F. Wallace and G.A. Wheaton. Spectral discrimination and mapping of land degradation in Western Australia's agricultural region. In 5th Australasian Remote Sensing Conference, Perth, Western Australia, pages 1066-1073, 1990.
[7] G.A. Wheaton, J.F. Wallace, D.J. McFarlane, and N.A. Campbell. Mapping salt-affected land in Western Australia. In Proceedings of the 6th Australasian Remote Sensing Conference, volume 2, pages 369-377, Wellington, New Zealand, 1992.
[8] G.A. Wheaton, J.F. Wallace, D.J. McFarlane, N.A. Campbell, and P. Caccetta. Mapping and monitoring salt-affected land in Western Australia. In Proceedings of Resource Technology '94 Conference, pages 531-543, Melbourne, Australia, 1994.