Classical statistics
• Experimental errors – random and systematic;
• Normal distribution of experimental results – the backbone of classical statistics;
• Statistical tests for normal distribution: F-test, Student’s t-test, etc. – use and significance;
• Analytical metrology based on classical statistics – uncertainty, sensitivity, limit of detection, selectivity, reliability of the analytical signal, calibration
Significance and validity check of the linear regression model
How to test/validate a regression model?
Regression models are powerful tools frequently used to predict a dependent variable from a set of predictors, and they are applied in many different contexts. An important question is whether the results of a regression analysis on a sample can be extended to the population from which the sample was drawn. If they can, we say that the model has a good fit, and we refer to this question as goodness-of-fit analysis, performance analysis or model validation analysis (Hosmer and Lemeshow, 2000; D’Agostino et al., 1998; Harrell et al., 1996; Stevens, 1996). Applying modelling techniques without a subsequent performance analysis of the resulting models can produce poorly fitting models that predict outcomes on new subjects inaccurately. This section deals with how to measure the quality of the fit of a given model and how to evaluate its performance in order to avoid poorly fitted models, i.e. models that inadequately describe the relationship between the predictors and the dependent variable in the population. First we state an important preliminary assumption and the aim of our work, and we introduce the concept of goodness-of-fit and the principle of optimism. Then we briefly review the various techniques of model validation. Next, we define a number of properties a model must have to be considered “good,” and a number of quantitative performance measures. Lastly, we describe a methodology for assessing the performance of a given model.
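The idea of checking a fitted model against data it has not seen can be sketched in a few lines. The example below is only an illustration under stated assumptions: the data are synthetic, the 40/20 split into fitting and validation subsets is arbitrary, and the helper names (fit_line, r_squared, f_statistic) are hypothetical. It fits a straight line by ordinary least squares, computes the apparent R² and the overall F statistic on the fitting data, and then compares the apparent R² with the R² obtained on the held-out data. The difference between the two is one simple way to quantify optimism.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

def r_squared(x, y, b0, b1):
    """Coefficient of determination of the line (b0, b1) on the data (x, y)."""
    residuals = y - (b0 + b1 * x)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def f_statistic(r2, n, p=1):
    """Overall F statistic for a model with p predictors fitted to n points."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# Synthetic data (an assumption for illustration): y = 1 + 2x + noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=60)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=60)

# Fit on the first 40 points, validate on the remaining 20.
x_fit, y_fit = x[:40], y[:40]
x_val, y_val = x[40:], y[40:]

b0, b1 = fit_line(x_fit, y_fit)
r2_apparent = r_squared(x_fit, y_fit, b0, b1)
r2_validation = r_squared(x_val, y_val, b0, b1)
f_value = f_statistic(r2_apparent, n=len(x_fit))
optimism = r2_apparent - r2_validation

print(f"slope={b1:.3f} intercept={b0:.3f}")
print(f"apparent R2={r2_apparent:.3f} validation R2={r2_validation:.3f}")
print(f"F={f_value:.1f} optimism={optimism:.3f}")
```

The F value would be compared with the critical value of the F distribution with 1 and n - 2 degrees of freedom to judge the significance of the regression. A single split is the simplest validation scheme; repeated splits or resampling give a more stable estimate of optimism.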
Goals of the statistical analysis
Statisticians help to design data collection plans, analyze data appropriately, and interpret and draw conclusions from those analyses. The central objective of the undergraduate major in Statistics is to equip students with the requisite quantitative skills that they can employ and build on in flexible ways.
Experimental Design and Modeling
Design of Experiments (DoE) is a very important component of process development and validation in many industries. DoE for process development and validation involves carrying out a number of tests repeatedly and consistently over a period of time and observing the responses.
DoE is important for process development and validation because it offers an understanding of the predictability and reproducibility of an experiment. Fundamentally, Design of Experiments for process development and validation seeks to rule out chance effects in the methods used to bring a product under control.
The actual experimental design
Fractional factorial design
A factorial experiment in which only an adequately chosen fraction of the treatment combinations required for the complete factorial experiment is selected to be run.
Even if the number of factors, k, in a design is small, the 2^k runs specified for a full factorial can quickly become very large. For example, a two-level, full factorial design with six factors requires 2^6 = 64 runs. To this design we need to add a good number of centerpoint runs, so the resource requirement can quickly become very large even with only a modest number of factors.
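The run counts above can be reproduced with a short sketch. It enumerates the full two-level factorial in coded levels (-1, +1) and builds one possible half fraction by setting the last factor equal to the product of the others. That generator (defining relation I = ABCDEF for six factors) is an assumption made for illustration; other fractions can be chosen depending on which effects may be confounded. The function names are hypothetical.

```python
from itertools import product

def full_factorial(k):
    """All 2^k runs of a two-level design in coded levels -1 / +1."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def half_fraction(k):
    """A 2^(k-1) fraction: full factorial in the first k-1 factors,
    with the last factor set to the product of the others (assumed generator)."""
    runs = []
    for base in full_factorial(k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + [last])
    return runs

full = full_factorial(6)
half = half_fraction(6)
print(len(full), len(half))  # 64 32
```

Halving the design in this way saves 32 runs at the price of confounding some interactions with each other.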
Plackett-Burman experimental design
Good experimental design is important in many studies of analytical and other chemical processes. Complete factorial designs, which study all the factors (experimental variables) affecting the system response, using at least two levels (values) for each factor, can give rise to an unacceptably large number of trial experiments. This is because even apparently simple processes may be affected by a large number of factors. Moreover these factors may affect the system response interactively, i.e. the effect of one factor may depend on the levels of others. Any interactions must also be distinguished from random measurement errors. So it is more common to use partial factorial designs in which some information, especially about interactions, may be sacrificed in the interests of a manageable number of experiments.
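As an illustration of how few runs such a design can need, the sketch below constructs the 12-run Plackett-Burman design, which screens up to 11 two-level factors. It assumes the commonly tabulated generator row for N = 12 (+ + - + + + - - - + -): the first 11 runs are cyclic shifts of that row and the last run sets every factor to its low level. The function name is hypothetical, and the orthogonality check at the end is included only to show the property that makes main effects separately estimable.

```python
import numpy as np

# Commonly tabulated generator row for the 12-run design (coded levels).
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

def plackett_burman_12():
    """Return the 12 x 11 design matrix in coded levels -1 / +1."""
    g = np.array(GENERATOR_12)
    # Runs 1..11: cyclic shifts of the generator row.
    rows = [np.roll(g, shift) for shift in range(11)]
    # Run 12: all factors at the low level.
    rows.append(-np.ones(11, dtype=int))
    return np.array(rows)

design = plackett_burman_12()

# Orthogonality check: distinct columns have zero inner product.
gram = design.T @ design
print(design.shape)
print(np.array_equal(gram, 12 * np.eye(11, dtype=int)))
```

Twelve runs compare with 2^11 = 2048 runs for the complete two-level factorial in 11 factors; the saving comes from giving up information about interactions.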
Central composite design
A Box-Wilson Central Composite Design, commonly called 'a central composite design,' contains an embedded factorial or fractional factorial design with center points that is augmented with a group of 'star points' that allow estimation of curvature. If the distance from the center of the design space to a factorial point is ±1 unit for each factor, the distance from the center of the design space to a star point is |α| > 1. The precise value of α depends on certain properties desired for the design and on the number of factors involved.
Similarly, the number of centerpoint runs the design is to contain also depends on certain properties required for the design.
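The structure of a central composite design can be sketched as follows, in coded units. The example assumes a full factorial core and, when no α is supplied, chooses α for rotatability, α = (2^k)^(1/4); rotatability is only one of the properties a design may be required to have, so this default is an assumption rather than the only valid choice. The number of center points is left as a free parameter, and the function name is hypothetical.

```python
from itertools import product

def central_composite(k, n_center=1, alpha=None):
    """Return (runs, alpha) for a central composite design in k factors.

    Runs are listed as factorial points, then star points, then center points.
    If alpha is None, the rotatable value (2^k)^(1/4) is used (an assumption).
    """
    if alpha is None:
        alpha = (2 ** k) ** 0.25
    # Factorial core: every combination of -1 / +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    # Star points: one factor at -alpha or +alpha, all others at the center.
    star = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            star.append(point)
    # Replicated center points.
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + star + center, alpha

runs, alpha = central_composite(k=2, n_center=3)
print(len(runs), round(alpha, 4))
for run in runs:
    print(run)
```

For k factors this layout gives 2^k factorial runs, 2k star runs and the chosen number of center runs.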