Last update: 06-Jul-2004
Arch Hellen Med, 20(5), September-October 2003, 526-531
ORIGINAL PAPER
Theory of statistical models
T. KATOSTARAS
Faculty of Nursing, University of Athens, Athens, Greece

The application of the theory of statistical models in the health field attempts to answer several important questions, one of which concerns the causes that lead to the appearance of an illness. To this end, several assumptions are made about the properties of the population units. To express this group of assumptions quantitatively, a mathematical equation or a group of equations is defined, which is called a mathematical model. The model expresses how one feature of the population units changes when other specific features vary. Mathematical models specify a deterministic relation between their variables, that is, an exact relationship between cause and outcome, and consequently they refer to natural phenomena.

Apart from natural phenomena, there are also incidental (random) phenomena, which cannot be fully explained by the known mathematical models. The effort of mathematics is then concentrated on explaining the average variation of a feature (the dependent variable) when other features, both known (the independent variables) and unknown, change. This is achieved by the use of statistical models. These models constitute a group of assumptions about the properties of the population units that are considered to be true on average and are expressed through an equation or a group of equations. In a statistical model, the effects of the unknown and unmeasurable variables on the dependent variable are taken into account through a variable called the random error.

The linear regression model: With this model, the value of a quantitative feature is estimated from the values of other features (determinants, risk factors). The variation of the quantitative characteristic can also be estimated when one or more of the other quantitative characteristics vary. A fundamental assumption of the model is that the variation of the quantitative characteristic is linearly related to the variation of the other characteristics.
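As an illustration of the linear regression model, the following minimal sketch simulates a dependent variable as a linear function of one independent variable plus a random error, and then recovers the intercept and slope by ordinary least squares. The variable names (age, blood pressure), the "true" coefficients, and the error standard deviation are hypothetical values chosen only for this example; they do not come from the paper.

```python
import random


def fit_simple_linear_regression(x, y):
    """Return the least-squares (intercept, slope) for y = b0 + b1 * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical population: blood pressure depends linearly on age "on average".
rng = random.Random(0)  # fixed seed so the simulation is reproducible
TRUE_INTERCEPT, TRUE_SLOPE, ERROR_SD = 100.0, 0.5, 5.0

ages = [rng.uniform(20, 80) for _ in range(500)]
# The random error stands in for all unknown and unmeasured features.
pressures = [TRUE_INTERCEPT + TRUE_SLOPE * a + rng.gauss(0, ERROR_SD) for a in ages]

intercept, slope = fit_simple_linear_regression(ages, pressures)
print(f"estimated intercept = {intercept:.2f}, estimated slope = {slope:.3f}")
```

With the random error removed, the same equation would be a deterministic mathematical model; with it, the fitted line describes only the average change of the dependent variable per unit change of the independent variable.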
The logistic regression model: With this model, the probability that a characteristic (an illness) appears, and the change in that probability, can be estimated when the values of other characteristics change or when other characteristics (risk factors) are present. This is a non-linear model.

The discriminant analysis model: With this model, the population units are classified into categories, with the classification depending on the values of some of their characteristics or on the presence of certain features.

The non-linear regression models: There are countless non-linear statistical models, which originate from the respective mathematical models, depending on the form of the random error. With these models, the value of a quantitative characteristic is estimated from the values of other characteristics. The variation of the quantitative characteristic can also be estimated when one or more, mainly quantitative, characteristics change. A fundamental assumption is that the changing characteristics are related to the characteristic of interest through a non-linear relationship.
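As one concrete instance of the non-linear models described above, the logistic regression model can be sketched as follows. The sketch does not fit a model; it only shows how a linear combination of risk factors is turned into a probability of illness through the logistic function, and how that probability changes when a risk factor appears. The intercept, the coefficients, and the two risk factors (age and smoking) are hypothetical values assumed for illustration.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Probability of the outcome under a logistic model with the given coefficients."""
    linear_predictor = intercept + sum(b * v for b, v in zip(coefficients, values))
    # The logistic function maps any real number into the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def odds(probability):
    """Odds corresponding to a probability strictly between 0 and 1."""
    return probability / (1.0 - probability)


# Hypothetical coefficients: intercept, then age (years) and smoking (0 = no, 1 = yes).
INTERCEPT = -3.0
COEFFICIENTS = [0.04, 0.7]

p_non_smoker = logistic_probability(INTERCEPT, COEFFICIENTS, [50, 0])
p_smoker = logistic_probability(INTERCEPT, COEFFICIENTS, [50, 1])

# Under this model the odds ratio for smoking equals exp(smoking coefficient),
# whatever the age, even though the probability itself changes non-linearly.
odds_ratio = odds(p_smoker) / odds(p_non_smoker)

print(f"P(illness | age 50, non-smoker) = {p_non_smoker:.3f}")
print(f"P(illness | age 50, smoker)     = {p_smoker:.3f}")
print(f"odds ratio for smoking          = {odds_ratio:.3f}")
```

The design choice worth noting is that the model is linear on the log-odds scale and non-linear on the probability scale, which is why the same risk factor produces different absolute changes in probability at different baseline values.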

Key words: Dependence, Discriminant analysis, Logistic regression, Regression, Statistical model.