ATM 111 - Statistics in Guidance

Review: Using Statistics in the Guidance

Page Last modified: 4 February 2016

Model Output Statistics: MOS
- Takes advantage of past statistical correlations between quantities available at the time of the forecast and future events.
- "Predictors" X_i are known and used to estimate the unknown "Predictand" Y by means of a regression equation.
- Example of a linear regression equation was shown.
  - "regression coefficients" a_i relate the predictors to the predictand.
  - each a_i is uniquely determined from the past correlation between all the X_i's used and Y, by a procedure that minimizes the estimation error S.
    - The correlation is calculated using N past occurrences.
    - A matrix is inverted, so any a_i depends on all the predictors used.
    - Defining the matrix: the i^th row is obtained by taking the derivative of S with respect to a_i and setting it to zero.
    - for a small number of predictors (1 or 2), the a_i can instead be found by eliminating variables algebraically.
  - S is the sum of the N values of the squared difference between the estimate of Y generated by your regression equation (= "Y-hat") and the actual observed value of Y at N times in the past.
- Predictors may come from 3 sources:
  1. observations
  2. model output
  3. climatology
- Predictors can be of several types:
  1. continuous (e.g. Temperature)
  2. discrete (e.g. cloud cover: 0=clear, 1=scattered, 2=broken, 3=overcast)
  3. binary (e.g. 1=precip occurred, 0=no precip)
  4. conditional or discontinuous (e.g. include T at this model level only if it is < 0 C)
- To choose the predictors to use:
  - collect as many candidate predictors as you can think of
  - choose the first predictor X₁ as that candidate with the highest correlation with Y.
  - then add other predictors one at a time, perhaps as follows:
    1. identify candidates that are highly correlated with each other; call such similar candidates a "group"
    2. calculate the regression coefficients using just one member of the group at a time, repeating this for each member of the group
    3. calculate the reduction of variance (RV) each time
    4. keep the member of the group having the largest RV; RV measures the improvement gained by using a given predictor
- MOS products:
  - max T
  - min T
  - PoP
  - mean cloudiness
  - mean wind speed
  - conditional probability of snow
- MOS caveats:
  1. If your model changes, then so must the regression coefficients
  2. Good estimates of the regression coefficients require many past occurrences (many samples, i.e. N large)
  3. In practice, not much variance is explained
  4. Since climatology is often a predictor, MOS tends to asymptote back to climatology (underpredicts extreme events)
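
The regression and predictor-screening steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the operational MOS procedure: it assumes the regression has the form Y-hat = a_0 + a_1 X_1 + ... + a_k X_k (the intercept a_0 is an assumption), it takes RV to be 1 - S / (sum of squared departures of Y from its mean), which is one common definition, and the candidate names and all data values are made up.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b] so row operations act on both sides.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors may be redundant")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def fit_regression(X, y):
    """Return [a_0, a_1, ..., a_k] minimizing S over the N past cases.

    X is a list of N rows of k predictor values; y holds the N observed
    predictand values. Setting dS/da_i = 0 for every i gives one matrix
    row per coefficient, so each a_i depends on all predictors used.
    """
    rows = [[1.0] + list(r) for r in X]  # leading 1 carries the intercept a_0
    k = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yn for r, yn in zip(rows, y)) for i in range(k)]
    return solve_linear_system(A, b)


def predict(coeffs, x):
    """Evaluate Y-hat for one set of predictor values."""
    return coeffs[0] + sum(a * xi for a, xi in zip(coeffs[1:], x))


def reduction_of_variance(X, y, coeffs):
    """RV = 1 - S / total, with S the summed squared estimation error."""
    y_mean = sum(y) / len(y)
    S = sum((predict(coeffs, x) - yn) ** 2 for x, yn in zip(X, y))
    total = sum((yn - y_mean) ** 2 for yn in y)
    return 1.0 - S / total


# Synthetic example: two hypothetical candidates forming one "group".
y_demo = [3.5, 6.5, 7.5, 10.5, 12.5]
candidates = {
    "x_a": [1, 2, 3, 4, 5],
    "x_b": [2, 1, 4, 3, 5],
}

# Fit with one group member at a time and record the RV of each fit.
rv_by_candidate = {}
for name, column in candidates.items():
    X_one = [[v] for v in column]
    rv_by_candidate[name] = reduction_of_variance(
        X_one, y_demo, fit_regression(X_one, y_demo)
    )

# Keep the member of the group with the largest RV.
best = max(rv_by_candidate, key=rv_by_candidate.get)
print(best, rv_by_candidate)
```

With only N = 5 cases this fit is far too small to trust, which is the point of caveat 2; the sketch is meant to show the mechanics, not realistic skill.
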
Verification and Predictability
- verification refers to evaluations of skill
- predictability refers to applications of measures of skill to improve our knowledge
- three uses:
  1. quantifying forecast errors (e.g. to compare models, forecasts, etc.)
  2. using forecast errors to identify a problem (e.g. not just errors in a program, but where to target efforts to improve the model; also to understand the limits to forecasting)
  3. using forecast error to improve forecast guidance (e.g. estimating ahead of time the likely skill of a given forecast)
- Measures of skill:
  - want something quantifiable and calculable
    - "subjective" comparison of maps can be tedious and biased
    - prefer a measure that cannot be fooled: e.g. RMS error can be fooled by smoothing, since a smoothed forecast can score a lower RMS error
  - see forecast notebook (section V.B.) for definitions
  - for fields or arrays: AC, RMSE (over space), and S1 are used
  - for individual weather events (like MOS products): RMSE (over time), SE, B, SB, FB, and TS are used
- Both model output and MOS products tend to do better in winter than in summer when skill is normalized by the actual variation; frontal systems are easier to predict than mesoscale phenomena like convection.
- the 500 mb level is much better predicted than surface conditions because more complex balances occur at the surface
- skill has improved over the years (really!)
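
As a rough illustration of the event-type measures listed above, the sketch below computes RMSE over time, a mean-error bias, and a threat score. The definitions used are the commonly quoted ones and are an assumption here; the forecast notebook (section V.B.) remains the authority for AC, S1, SE, B, SB, FB, and TS as used in this course. All numbers are made up.

```python
import math


def rmse(forecasts, observations):
    """Root-mean-square error over a series of forecast/observation pairs."""
    n = len(forecasts)
    total = sum((f - o) ** 2 for f, o in zip(forecasts, observations))
    return math.sqrt(total / n)


def mean_error(forecasts, observations):
    """Average of (forecast - observation); positive means over-forecasting."""
    n = len(forecasts)
    return sum(f - o for f, o in zip(forecasts, observations)) / n


def threat_score(forecast_yes, observed_yes):
    """hits / (hits + misses + false alarms) for yes/no event forecasts."""
    pairs = list(zip(forecast_yes, observed_yes))
    hits = sum(1 for f, o in pairs if f and o)
    false_alarms = sum(1 for f, o in pairs if f and not o)
    misses = sum(1 for f, o in pairs if o and not f)
    denom = hits + misses + false_alarms
    # Undefined when the event was never forecast and never observed.
    return hits / denom if denom else float("nan")


# Hypothetical max T forecasts and verifying observations on four days.
fcst_t = [10.0, 12.0, 15.0, 9.0]
obs_t = [11.0, 12.0, 13.0, 10.0]
print(rmse(fcst_t, obs_t), mean_error(fcst_t, obs_t))

# Hypothetical precip yes/no forecasts and outcomes on five days.
fcst_precip = [True, True, False, False, True]
obs_precip = [True, False, True, False, True]
print(threat_score(fcst_precip, obs_precip))
```

Note that the mean error of the temperature example is zero even though individual forecasts miss by up to 2 degrees, which is one reason more than one measure is usually reported.
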
Consensus Forecasting
- over time, using an aggregate of forecasts is often superior to a single forecast
- a consensus forecast is an average of more than 1 forecast
  - applied to a numeric forecast (e.g. max T) not a description (e.g. "there will be a warming trend...")
  - can use any available source: MOS, other people, model values
- Ensemble forecasts use output from more than one model run
  1. compare various models, e.g. Eta, NGM, AVN/MRF, MM5, NOGAPS, ECMWF, etc.
  2. compare forecasts made from various initial conditions (ICs)
    - the ICs are chosen to sample the possible range of error in the initial conditions
    - greater divergence of the solutions probably implies less confidence in the guidance
  3. compare forecasts made at different times but valid for the same forecast time (e.g. yesterday's 36hr compared to today's 12hr; both verify at the same time)
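
A minimal sketch of the two ideas above: a consensus as the plain average of several numeric forecasts, and ensemble divergence measured by the standard deviation across members. Using the standard deviation as the measure of divergence is an assumption made for illustration, and the source names and values are invented.

```python
import statistics


def consensus(forecasts):
    """Consensus forecast: the average of more than one numeric forecast."""
    if len(forecasts) < 2:
        raise ValueError("a consensus needs more than one forecast")
    return statistics.mean(forecasts)


def spread(forecasts):
    """Divergence among members, here the population standard deviation."""
    return statistics.pstdev(forecasts)


# Hypothetical max T guidance (deg F) from three sources for the same day.
max_t_guidance = {"MOS": 70.0, "model value": 74.0, "another forecaster": 72.0}
print(consensus(list(max_t_guidance.values())))

# Two hypothetical ensembles started from perturbed initial conditions.
tight_ensemble = [71.0, 72.0, 73.0]
wide_ensemble = [65.0, 72.0, 79.0]

# Both have the same mean, but the wider one suggests less confidence.
print(spread(tight_ensemble), spread(wide_ensemble))
```

An unweighted mean treats every source as equally skillful; weighting by past verification scores is a natural refinement but is not something these notes specify.
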
