The mathematics behind rechecking clinical test results: the mean reversion phenomenon

　　In daily clinical work, when doctors doubt a laboratory result, they often collect a second specimen and send it for retesting. That leaves two results for the same test, so which one is more likely to be correct? The answer rests on a statistical principle: the mean reversion phenomenon, more commonly called regression to the mean.
　　I. The problem
　　A young patient with acute abdominal pain and suspected acute gastritis came to the emergency room. The physician ordered a routine blood biochemistry panel to rule out pancreatitis and to check the serum electrolytes. When the results came back, the physician was surprised to see a serum potassium of 7.8 mmol/L, against a normal range of 3.5-4.5 mmol/L.
　　According to general clinical guidelines, a potassium level this high can cause cardiac arrhythmias, which can be life-threatening and often require emergency treatment. On closer examination, however, the patient’s general condition was good, and there was no obvious history of conditions that could raise serum potassium (e.g. kidney disease or rhabdomyolysis). In short, the clinical picture did not suggest hyperkalemia.
　　What to do? For most doctors, the next step is to draw another blood sample and send it for retesting. The retest can turn out in one of two ways.
　　The first possibility: the serum potassium is still very high, consistent with the first result. This case is the easier one to handle. The patient really does have hyperkalemia, which should be treated immediately while its underlying cause is sought.
　　The second possibility: the result is in the normal range or only moderately elevated.
　　What then? The patient now has two lab results obtained within a short period, and they disagree. Is the first result accurate, or the second? Which one should we believe?
　　The answer is that the retest result is more likely to be accurate than the first, so the physician bases the treatment plan on the second result.
　　In this case, the patient’s second result came back at 4.7 mmol/L, slightly above normal. The plan was to keep the patient under observation and withhold treatment for the time being.
　　II. Mean reversion phenomenon
　　A mathematical principle explains why the retest result is more likely to be accurate than the first: the “mean reversion phenomenon” of statistics, more commonly called regression to the mean. Every observation is subject to measurement variability, because a measurement depends on both the performance of the instrument and the skill of the operator. This variation can be reduced by careful technique and adherence to procedure, but when measurements rely on human judgment rather than on instruments, it can be large and difficult to control. In the 19th century the German mathematician and physicist Gauss proposed that when the same object is measured repeatedly with the same instrument, the measurements follow a normal distribution, and the spread among the values reflects only random variation between measurements. The normal curve is a symmetrical bell-shaped curve centered on the true value (the actual level of the quantity being measured), so the further a value lies from the true value (that is, the further out in the tails of the bell curve), the less likely it is to occur.
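　　To put rough numbers on how quickly the bell curve thins out, the short Python sketch below computes the chance that a single normally distributed measurement lands more than k standard deviations from the true value. The function name `two_sided_tail` is an illustrative choice, and the calculation assumes purely Gaussian measurement error, as in the description above.

```python
import math

def two_sided_tail(k: float) -> float:
    """Probability that a normal measurement lies more than k standard
    deviations away from the true value, in either direction."""
    # For a standard normal Z, P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"more than {k} SD from the true value: {two_sided_tail(k):.4f}")
```

　　Under this assumption, roughly 32% of measurements fall more than one standard deviation from the true value, about 4.6% fall more than two, and only about 0.27% fall more than three.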
　　Therefore, the patients we single out for retesting in the clinic are those whose results do not fit the clinical picture. Their first results are extreme cases drawn from the tails of the bell-shaped curve. A repeat measurement is unlikely to be similarly extreme again and will usually fall nearer the center of the curve, which means the retest result is much more likely than the first result to be close to the true value. In other words, the retest is very likely to be the more accurate of the two. Although many tests are now run on fully automated instruments under strict quality control, this clinical dilemma has purely statistical causes and cannot be completely avoided.
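　　The selection effect described above can be checked with a small simulation. The Python sketch below is only an illustration under stated assumptions: the population mean and spread of true potassium values, the size of the measurement error, and the retest threshold are all made-up numbers, not laboratory data, and the function name `simulate_retests` is hypothetical. Each simulated patient is measured once; only those with a strikingly high first result are measured again.

```python
import random

def simulate_retests(n_patients=100_000, pop_mean=4.0, pop_sd=0.4,
                     error_sd=0.5, threshold=5.5, seed=0):
    """Simulate first tests and retests; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    selected = 0
    second_closer = 0
    sum_first = 0.0
    sum_second = 0.0
    for _ in range(n_patients):
        # Each patient has a true value; each test adds independent Gaussian error.
        true_value = rng.gauss(pop_mean, pop_sd)
        first = true_value + rng.gauss(0.0, error_sd)
        if first < threshold:
            continue  # unremarkable result, so no retest is ordered
        second = true_value + rng.gauss(0.0, error_sd)
        selected += 1
        sum_first += first
        sum_second += second
        if abs(second - true_value) < abs(first - true_value):
            second_closer += 1
    return {
        "selected": selected,
        "mean_first": sum_first / selected,
        "mean_second": sum_second / selected,
        "share_second_closer": second_closer / selected,
    }

result = simulate_retests()
print(f"patients retested:           {result['selected']}")
print(f"mean first result:           {result['mean_first']:.2f} mmol/L")
print(f"mean retest result:          {result['mean_second']:.2f} mmol/L")
print(f"retest closer to true value: {result['share_second_closer']:.1%}")
```

　　Among the retested patients, the average retest falls back toward the population mean, and the retest is closer to the true value in most cases. The exact figures depend entirely on the assumed parameters.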
　　III. Probability and normal distribution curve
　　Probability is a quantitative measure of how likely a random event is to occur. If an experiment is repeated independently many times, and the relative frequency of an event (the number of times it occurs divided by the total number of trials) settles around a fixed constant as the number of trials grows, that constant can be taken as the probability of the event. The probability of any event must lie between 0 and 1.
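　　The way a relative frequency settles around a constant can be seen in a quick simulation. The sketch below assumes an event with true probability 0.5 (a fair coin, chosen only for illustration) and prints the observed frequency for increasing numbers of trials. The function name `relative_frequency` and the seed are arbitrary.

```python
import random

def relative_frequency(n_trials, p=0.5, seed=1):
    """Fraction of n_trials independent trials in which the event occurs."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so the event occurs with probability p.
    hits = sum(1 for _ in range(n_trials) if rng.random() < p)
    return hits / n_trials

for n in (10, 100, 1_000, 100_000):
    print(f"{n:>7} trials: relative frequency = {relative_frequency(n):.4f}")
```

　　With few trials the frequency can sit well away from 0.5; with many trials it stays close to it.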
　　There is a class of random phenomena with two characteristics:
　　1. there are only a finite number of possible outcomes;
　　2. each outcome has the same probability of occurring.
　　A random phenomenon with these two characteristics is described by the “classical probability” model.
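　　Under the two characteristics listed above, the probability of an event is simply the number of favorable outcomes divided by the total number of outcomes. A minimal sketch, using a fair six-sided die as an assumed example (the helper name `classical_probability` is illustrative):

```python
from fractions import Fraction

def classical_probability(outcomes, event):
    """Favorable outcomes divided by all outcomes, assuming a finite set
    of equally likely outcomes."""
    outcomes = list(outcomes)
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

die = range(1, 7)  # faces 1 to 6, each equally likely
p_even = classical_probability(die, lambda face: face % 2 == 0)
print(p_even)  # 1/2
```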
　　In the objective world, there are a large number of random phenomena, and the results of random phenomena constitute random events. If a variable is used to describe each result of a random phenomenon, it is called a random variable.
　　A random variable may take finitely or infinitely many values. According to the values they can take, random variables are generally divided into discrete and non-discrete random variables. If all possible values can be listed in some order, the variable is called a discrete random variable; if the possible values fill an interval and cannot be listed one by one, it is called a non-discrete random variable, of which continuous random variables are the most important kind.
　　Among the probability distributions of discrete random variables, a relatively simple and widely used one is the binomial distribution. A continuous random variable is described by a distribution curve (its probability density curve). Both practice and theory show that one distribution is especially common and has a regular curve: the normal distribution. The shape of the normal curve is determined by numerical characteristics of the random variable, the most important of which are the mean and the degree of variation. The mean is also called the mathematical expectation, and the degree of variation is measured by the variance, whose square root is the standard deviation. Analysis of variance (ANOVA) uses the concept of variance to judge, from a limited number of tests, whether the differences between groups exceed what random variation alone would produce.
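　　As a concrete illustration of the binomial distribution and of the mean and variance, the sketch below builds the distribution for n = 10 trials with success probability p = 0.3 (both numbers are arbitrary assumptions) and computes its mean, variance and standard deviation directly from the definitions. The helper names are illustrative.

```python
import math

def binomial_pmf(n, p):
    """List of P(X = k) for k = 0..n, where X counts successes in n independent trials."""
    return [math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def mean_and_variance(pmf):
    """Mean (mathematical expectation) and variance of a distribution on 0, 1, 2, ..."""
    mean = sum(k * pk for k, pk in enumerate(pmf))
    variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    return mean, variance

pmf = binomial_pmf(10, 0.3)
mean, variance = mean_and_variance(pmf)
print(f"mean = {mean:.3f}, variance = {variance:.3f}, "
      f"standard deviation = {math.sqrt(variance):.3f}")
```

　　The computed values agree with the closed-form results for the binomial distribution, mean np and variance np(1 - p).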
　　Because random phenomena pervade practical human activity, probability and statistics have grown along with modern industry, agriculture, science and technology, giving rise to many important branches, such as stochastic processes, information theory, limit theory, experimental design and multivariate analysis. Understanding this body of knowledge is extremely important for developing sound clinical reasoning. In conclusion, mathematics is the foundation of every science, a statement that deserves careful consideration by us clinicians.