THE COMPARISON OF SELECTED SAMPLING METHODS FOR DETERMINING THE AGE DISTRIBUTION OF FISH IN THE

Computer simulations were employed to determine the efficiency of selected sampling methods used to determine the age composition of the catches. Both proportional (PROP) and stratified sampling schemes were analyzed. In the stratified sampling, the efficiency of drawing a constant number of fish from the stratum (S_CON) as well as of drawing a number of fish proportional to the product of the fish numbers and the standard deviation of age in the stratum (S_PROP*SD) was investigated. The S_PROP*SD sampling method was proven to be the most effective. The proportional sampling was shown to be the best when taking into consideration the sampling efficiency and the practical possibilities of implementation of a given drawing scheme.


INTRODUCTION
The application of the majority of mathematical models for the assessment of the state of fish stocks requires data concerning the age composition of the catches. This is determined using fish samples from which otoliths or scales are collected in order to determine fish age. Sampling otoliths or scales and determining age from them are time-consuming activities. To achieve a relatively accurate estimate of the age composition of the catches based only on age, a rather large sample would be required. To lower sample numbers, the calculations are carried out using the so-called age-length key, which was first introduced by Fridrikson (1934, after Kimura 1977). This approach, which investigates both age and length composition, is thought to be less time-consuming, and the length distribution can be determined even by inexperienced personnel. Once the length distribution of the catches is established, the numbers in each length class can be divided into particular age group numbers using the age-length key derived from the samples used for age determination. The contributions of each age to particular length classes are then summed, which yields the contribution of particular age groups to the catches. Two sampling methods are usually used in this procedure for the determination of the age composition of the catches. They are the proportional method, in which fish are collected at random, and the stratified method, in which the same number of fish is collected from each stratum, the strata being defined by particular length classes.
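The age-length key calculation described above can be sketched in a few lines. This is only an illustration, not the program used in this work: the function name `age_composition_from_alk` and all numbers are hypothetical, and length classes from which no fish were aged are simply given a zero row in the key (an assumption made here for brevity).

```python
import numpy as np

def age_composition_from_alk(length_counts, aged_table):
    """Estimate the age composition of a catch with an age-length key.

    length_counts : number of measured fish in each length class.
    aged_table    : table (length classes x age groups) with the number of
                    aged fish in each length class and age group.
    Returns the estimated proportion of each age group in the catch.
    """
    length_counts = np.asarray(length_counts, dtype=float)
    aged_table = np.asarray(aged_table, dtype=float)

    # Age-length key: the share of each age group within a length class.
    # Length classes without aged fish keep a row of zeros (assumption).
    row_totals = aged_table.sum(axis=1, keepdims=True)
    alk = np.divide(aged_table, row_totals,
                    out=np.zeros_like(aged_table), where=row_totals > 0)

    # Length distribution of the catch as proportions per length class.
    length_props = length_counts / length_counts.sum()

    # Sum over length classes of (share of class) x (share of age in class).
    return length_props @ alk

# Hypothetical data: 3 length classes and 3 age groups.
length_counts = [500, 300, 200]
aged_table = [[18, 2, 0],
              [5, 10, 5],
              [0, 4, 16]]
print(age_composition_from_alk(length_counts, aged_table))
```

In this made-up example only 60 fish are aged while 1000 are measured, which is the saving in ageing effort that motivates the use of the key.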
The sample size necessary for determining the length distribution of the catches, and the numbers and methods of sampling for age, are the basic questions which must be answered when planning sampling for the estimation of the age composition of the catches. There are neither easy nor clear solutions to these questions. They may depend on the population structure of the catches or on the purpose for which the determined age composition will be used. In the literature there are a number of papers which address such problems, including those by, among others, Kutkuhn (1963), Kimura (1977), Quinn et al. (1983) and Gudmundsdottir et al. (1988). These authors, using both analytical methods and computer simulations, discussed the efficiency of various sampling methods and tried to determine a rational sample size. Most often, the results obtained indicated that proportional (random) sampling for fish age investigations is more effective than sampling the same number of fish in particular length classes. It was also indicated that the results may vary with respect to the species sampled and according to the purpose they are to serve.
This work attempts to determine the efficiency of various sampling methods and the rational size of the sample drawn to estimate the age composition of the Baltic cod catches.

MATERIAL AND METHODS
The comparison of various fish sampling methods for investigating the age composition of the catches, and the selection of the best method from among those analyzed, were based on computer simulations. First, the hypothetical length-age composition of the cod catches was established and served as the sampled statistical population. This was based on the age and length distribution of Polish cod catches made with trawls in the second quarter of 1997. The second quarter was chosen as it was the most representative of the data available and was characterized by a large number of samples and high catches.
The collection of fish samples to investigate the age and length distributions of the established population was then simulated with computer programs. Fish used for age and length determination were drawn independently. Samples for the length distribution of the catches were drawn at random, and three sampling methods were employed for age determination.
These methods were as follows:

1. Proportional (random) sampling (PROP)
Fish are collected at random from the whole population. In this type of sampling, the number of fish collected from particular length classes is approximately proportional to the numbers in these classes.

2. Stratified sampling, constant (S_CON)
The same numbers of fish are drawn at random from particular length classes. This method is usually applied in the sampling performed by the Sea Fisheries Institute.

3. Stratified sampling, proportional to the product of the fish numbers in length class and the standard deviation of age in length class (S_PROP*SD)
The number of fish collected at random from particular length classes is proportional to the product of length class numbers and the standard deviation of age in the length class. Therefore, fish from length classes with a wide age range (large fish) were more numerous in this method, in comparison to their numbers in the catches, than were fish from length classes with a narrow age range (small fish). Before the sample was collected using this method, the standard deviation of fish age was determined for particular length classes (Fig. 1). Due to the relatively large variability of this value, it was smoothed using linear regression. This sampling method is optimal when the number of elements in the strata is known (Pawlowski 1972). When determining the age composition of the catches, however, the numbers in the strata are unknown and are determined from the samples.
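The three drawing schemes differ only in how the total number of fish to be aged is spread over the length classes. The sketch below shows the expected allocation under each scheme. The function name `allocate`, the class numbers and the standard deviations are hypothetical; for PROP the result is only an expectation, since in that scheme fish are drawn at random from the whole population rather than by class.

```python
import numpy as np

def allocate(n_total, class_counts, class_sd=None, scheme="PROP"):
    """Expected number of fish to age in each length class (not rounded).

    n_total      : total number of fish to be aged.
    class_counts : number of fish in each length class.
    class_sd     : standard deviation of age in each length class
                   (needed only for the S_PROP*SD scheme).
    scheme       : "PROP", "S_CON" or "S_PROP*SD".
    """
    class_counts = np.asarray(class_counts, dtype=float)
    if scheme == "PROP":
        # Proportional to class numbers (holds on average for random draws).
        weights = class_counts
    elif scheme == "S_CON":
        # The same number of fish from every length class.
        weights = np.ones_like(class_counts)
    elif scheme == "S_PROP*SD":
        # Proportional to class numbers times the SD of age in the class.
        weights = class_counts * np.asarray(class_sd, dtype=float)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return n_total * weights / weights.sum()

# Hypothetical example: small, medium and large fish.
counts = [6000, 3000, 1000]
sd_age = [0.5, 1.0, 1.5]
for scheme in ("PROP", "S_CON", "S_PROP*SD"):
    print(scheme, allocate(500, counts, sd_age, scheme))
```

With these made-up values the S_PROP*SD scheme moves part of the ageing effort from the numerous small fish to the larger fish, whose ages vary more within a length class.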
Once the length and age of the sampled fish were determined, computations of the age composition of the catches were made using the age-length key. The results were compared with the age distribution of the hypothetical catches. In addition, the age composition of the catches for data collected using the proportional method was calculated by taking into consideration only the age and neglecting the length distribution of the catches. This computation method will be further referred to as PROPO.

Simulations were carried out for several options of sample size for the age determination and several options of sample size for the determination of length distribution. A total of 38 sample number combinations for both age and length investigations were made (Tab. 1). For each combination of sample size and sampling method one hundred sample drawings were simulated. Next, the variance and the coefficient of variation of the contribution of particular age groups in the catches were calculated. The simulations were based on programs written by the author in TBASIC 1.0. The sampling was conducted using the RND function. The results of the simulations were processed using a multi-factor analysis of variance in the STATGRAPHICS PLUS package.

Two measures of the efficiency of the sampling method were employed: the total variance of age distribution and the coefficient of variation of age contribution in the catches. Total variance has been defined as follows:

$$\mathrm{VarTot} = \sum_i \mathrm{var}_i,$$

where $\mathrm{var}_i$ denotes the variance of the contribution of age $i$ in the catches. The coefficient of variation for age $i$ has been defined as follows:

$$\mathrm{CV}_i = \frac{\sqrt{\mathrm{var}_i}}{\mathrm{average}_i},$$

where $\mathrm{average}_i$ denotes the average contribution of age $i$ in the catches. Sampling method (Method), sample size for the determination of length distribution (Measured), sample size for age investigation (Aged) and age group (Age) were accepted as factors which potentially influence the above defined measures of sampling efficiency.
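A rough re-creation of one cell of this simulation design is sketched below in Python rather than in the original TBASIC, and only for the PROP scheme. Everything named here is an assumption made for illustration: the function names, the small joint length-age table and the random generator. The coefficient of variation is taken in its usual sense, the standard deviation divided by the average.

```python
import numpy as np

def alk_estimate(length_counts, aged_table):
    """Age composition from a length sample and an aged sample (ALK)."""
    row_totals = aged_table.sum(axis=1, keepdims=True)
    alk = np.divide(aged_table, row_totals,
                    out=np.zeros_like(aged_table), where=row_totals > 0)
    return (length_counts / length_counts.sum()) @ alk

def simulate_prop(joint, n_measured, n_aged, n_rep=100, seed=0):
    """Repeat PROP sampling n_rep times; return VarTot and CV per age.

    joint : table (length classes x age groups) of population proportions,
            standing in for the hypothetical catch (assumed values).
    """
    rng = np.random.default_rng(seed)
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    length_probs = joint.sum(axis=1)

    estimates = np.empty((n_rep, joint.shape[1]))
    for r in range(n_rep):
        # Length sample and aged sample are drawn independently.
        length_counts = rng.multinomial(n_measured, length_probs).astype(float)
        aged = rng.multinomial(n_aged, joint.ravel()).reshape(joint.shape)
        estimates[r] = alk_estimate(length_counts, aged.astype(float))

    var_i = estimates.var(axis=0, ddof=1)      # variance of each age share
    average_i = estimates.mean(axis=0)         # average share of each age
    var_tot = var_i.sum()                      # total variance, VarTot
    cv_i = np.sqrt(var_i) / average_i          # coefficient of variation
    return var_tot, cv_i

# Hypothetical population: 3 length classes x 3 age groups.
joint = [[0.30, 0.05, 0.00],
         [0.10, 0.25, 0.05],
         [0.00, 0.10, 0.15]]
var_tot, cv = simulate_prop(joint, n_measured=1000, n_aged=200)
print("VarTot:", var_tot)
print("CV by age:", cv)
```

Running the function for several values of `n_measured` and `n_aged` gives a small table of VarTot and CV values of the kind that was then passed to the analysis of variance.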

RESULTS
The dependence of the total variance of the age distribution in the catches (VarTot) on the sampling method, the size of the sample used for determining fish age and the number of specimens used to determine the length distribution of the population is presented in Fig. 2. The results obtained indicate that the lowest total variance was usually obtained when samples of fish for age investigations were collected at random (PROP) or by the stratified method in which the size of the sub-sample in the stratum is proportional to the product of the number of fish in the stratum and the standard deviation of the age distribution in the stratum (S_PROP*SD). The stratified method in which a constant number of fish is collected from the strata (S_CON) was proven to be decidedly less effective. In several instances, this method produced even greater variance than the proportional method did when the age-length key was not employed to determine the age composition of the catches (PROPO).

Multi-factor analysis of variance revealed that total variance can be presented as a sum of the following factors:

VarTot = Constant + Method + Measured + Aged + error.

All of these factors were proven to be statistically significant at a level lower than 0.0001. The PROP and S_PROP*SD methods yield similar effects which do not differ statistically. However, the S_CON and PROPO methods have a much greater influence on the total variance of age (Fig. 3a). The influence of the factor Measured on the total variance quickly disappears (Fig. 3b) and the majority of its levels do not differ significantly. Therefore, there is no statistical reason to increase the number of fish sampled for length composition beyond 3000 when the efficiency of the method is measured using the quantity VarTot. In practice, samples of about 1000 fish used to determine length are sufficient to obtain a relatively low VarTot value. The factor Aged has the greatest influence on the VarTot variation and its influence decreases at levels 5-7, where only differences between levels 5 and 7 are statistically significant (Fig. 3c).

Another assumed measure of the efficiency of the sampling method was the coefficient of variation of the contribution of particular fish ages in the catches. It can be anticipated that the method evaluation resulting from simulations will be different for various age groups. In the stratified sampling method with the same number of elements collected from each stratum, age groups which are less numerous in the catches will be relatively well represented. In other methods they will be represented to a lesser extent. These expectations were confirmed by the results of simulations (Figs. 4a, b). The S_CON method was demonstrated to be the best for the oldest age groups, 8 and 9, which are not numerous in the catches.
Similarly to the VarTot analysis, the influence of particular factors on the coefficient of variation of the contribution of particular age groups in the catches was investigated using multi-factor analysis of variance. An additional factor, Age, was added to those analyzed above. The results of calculations indicate that the coefficient of variation can be presented as a sum of the following factors and their interactions:

ln(CV) = Constant + Method + Measured + Aged + Age + Method*Age + error.
The coefficient of variation was presented in logarithmic form. Otherwise, the distributions of residuals would not have fulfilled the assumptions of the multi-factor analysis of variance. It turned out that all the factors and the Method*Age interaction were statistically significant at a level lower than 0.0001. The conclusions drawn from these analyses, which refer to the value of particular sampling methods for age determination, are slightly different from those obtained while analyzing the quantity VarTot. The stratified sampling method with the sub-sample numbers from the stratum proportional to the product of the number of specimens and the standard deviation of age in the stratum (S_PROP*SD) was proven to be the best. The proportional sampling method (PROP) appeared to be less effective than the S_PROP*SD method, and its influence on CV is almost identical with the influence of the stratified sampling method with a constant number of elements in the strata (S_CON).
The practical implementation of the S_PROP*SD method may be difficult, since it requires determining the standard deviation of age in length classes which can be done, for example, using pilot samples. Prior to sampling for the age determination, the number of fish in the strata should be established. Then, the number of fish from each stratum which should be sampled can be determined. The application of this procedure at sea or even in port is troublesome enough that the proportional sampling scheme (PROP), being only slightly less effective than the S_PROP*SD method, would be more practical to realize.
It is easier to determine the value of particular sampling methods than to recommend the proper sample size. This difficulty results mainly from the lack of a definition of the allowable error of age composition from the point of view of the requirements for the assessment of fish resources. The simulations which were carried out reveal that the aging of even

Let us assume that the acceptable average error of the age composition of the catches is 1%. If the aim is to determine the annual age distribution of the catches, the appropriate number of specimens for aging would be 1000. However, in work which is currently being carried out in co-operation with ICES the age composition of the catches needs to be established quarterly. These compositions do not vary much, and greater differences occur in the youngest age groups which gradually enter exploitation throughout the year. Therefore, it would be advisable to determine the age of approximately 1500 fish, with the quarterly sample sizes proportional to the catch magnitude. This should be accompanied by about 3000 fish sampled for the determination of the length distribution.

DISCUSSION
The results obtained in this work are similar in many ways to those obtained by other authors. However, analytical considerations or simulations which present the efficiency of the S_PROP*SD method have not been found in the literature. In addition, the application of the multi-factor analysis of variance for interpreting the results obtained permitted the quantitative determination of the influence of particular factors on the precision of the age distribution and, most importantly, of the influence of the method applied.
Using equations derived by Cochran (1953) (in Kutkuhn 1963), Kutkuhn (1963) showed that the variation of the age distribution obtained from the S_CON method is most often lower than that obtained by means of the simple random sampling method. In further comparing the coefficients of variation of the derived age composition of individuals, he concluded that the increase of accuracy through the application of the S_CON method is so small that the usefulness of this method is questionable. According to Kutkuhn (1963), determining age composition by using age-length keys is worthwhile only when the cost of age determination is at least five times higher than the cost of length measurement.

Kimura (1977) used analytical methods to evaluate the efficiency of the application of age-length keys. The considerations which were carried out and the examples which were presented indicated that proportional sampling is more effective than stratified sampling, constant (S_CON). It may even occur that the age composition based on stratified constant sampling, the age-length key and the length distribution is less precise than the age composition based only on data from proportional sampling without employing length measurements in the calculations.

Quinn et al. (1983) analyzed sampling methods for age determination of Pacific halibut. They obtained results similar to those of the previously mentioned authors, namely that the proportional sampling method appeared to be better than the stratified method with a constant number of fish collected from the strata.
Simulations of the influence of various sampling methods on the determination of the age distribution of Icelandic cod and capelin were made by Gudmundsdottir et al. (1988) using the bootstrap method. Their results indicated that the proportional sampling method was more effective than the S_CON method when the quantity VarTot was used to measure efficiency. The authors emphasize that the S_CON method is more effective if the aim of the investigations is to determine the age composition of only the less numerous age groups, usually the youngest and the oldest. The simulations presented in this work reveal that the S_PROP*SD and PROP methods are better than the S_CON method when the determination of the age contribution of each age group is equally important.
The most important pragmatic conclusion which results from the simulations is that proportional sampling is more efficient than stratified sampling with a constant number of specimens drawn from the stratum, in cases when the aim of sampling is to determine the age composition of the catches. The latter method of sampling may be recommended when the aim is to determine the contribution of only the least numerous groups in the catches.

The simulations which were carried out were based on the assumption that the exploited population is characterized by the same distribution of age and length at every fishing ground throughout the year. This assumption does not affect the conclusions which concern the efficiency of various sampling methods. The influence of this assumption on the recommended number of specimens in samples is unknown. If fisheries at various fishing grounds exploit portions of the same population which differ in age and length distributions, then the sample size recommended in this paper may lead to an error in the age composition of the catches of more than 1%.

Fig. 1. Standard deviation of the age distribution in length classes, SD, smoothed standard deviation, SD_smoothed, and product of the number of specimens in length class and the smoothed standard deviation, Number*SD

Fig. 2. Total variance of the age distribution of the catches, VarTot, with respect to the sampling method applied, number of fish analysed for age (Number_aged) and number of fish analysed for length. Subsequent parts of the graph refer to the length measurement for 500, 1000, 1500, 2000, 3000, 5000, and 10000 specimens

Fig. 3. Influence of the factor: a] Method

The results are illustrated in Figure 5a. The influence of the factors Measured and Aged on the accuracy of the evaluation of age distribution is more distinct than when it is evaluated with the quantity VarTot. All levels of the factors Measured, Aged, and Age have significantly different influences on the error, with the exception of levels 2 and 3 in the case of factor Measured (Figs. 5b, c, d).

Fig. 5 .
Fig. 4. Coefficient of variation of age group contribution in the catches with respect to sampling method used, number of fish analysed for age (Number_aged), and number of fish analysed for length. Subsequent parts of the graph refer to the number of fish measured for length equal to 500, 1000, 1500, 2000, 3000, 5000, and 10000 specimens

Table 1
The combinations of sample size for age investigation (Aged) and length measurement (Measured) used in the computer simulations

Table 2
The standard deviation (‰) of the age distribution estimated for the analysed sampling methods and different sample sizes collected for age investigation (Aged) and length measurement (Measured)