Evaluating atmospheric instability from high spectral resolution
IR satellite observations
P. Antonelli, A. Manzato, S. Puca, F. Zauli
April 11, 2011
Abstract
This document describes the activities performed to derive atmospheric instability in clear sky from high
spectral resolution IR data observed by the IASI instrument on board METOP-A. High spectral
resolution has been shown to carry information on the vertical structure of the atmosphere (temperature
and water vapor concentration) at higher vertical resolution than any other observation system operated
on board satellites. This information was expected to allow the characterization of the pre-convective
environment in clear sky conditions. The goal of the study was the development of an automated system for
the identification of potentially unstable air masses through the combination of IASI level 3 products
(instability indices derived from atmospheric temperature and water vapor vertical profiles obtained through
a physical retrieval) and linear combinations of IASI level 1 radiances. The results were compared
to those obtained using instability indices derived from high vertical resolution rawinsondes instead of
IASI data and products, and demonstrated the feasibility of an automatic system for the near-casting of
convective events in clear sky conditions. The outcomes of this study are expected to lead, in the future, to
the implementation of operational near-casting applications.
Ref.: PA/IIS/FR/2010/01
Contents
1 Technical Reports 6 and 7: Prediction of convective event: intercomparison of results obtained from IASI data with those obtained from rawinsondes, over Po Valley.
  1.1 Introduction
    1.1.1 Description of the Experiment
    1.1.2 Event climatology
  1.2 Nowcasting convective events from rawinsondes
    1.2.1 Full data set for Rawinsondes
    1.2.2 Predicting events from rawinsonde derived indices
      1.2.2.1 Empirical posterior probability for rawinsonde indices
      1.2.2.2 Forward selection algorithm
    1.2.3 Results
      1.2.3.1 Training
      1.2.3.2 Testing
  1.3 Nowcasting convective events from IASI
    1.3.1 Full data set for IASI
      1.3.1.1 Predictors derived from IASI retrievals
      1.3.1.2 Predictors derived from IASI principal components
    1.3.2 Predicting events from IASI derived indices and PCS
      1.3.2.1 Empirical posterior probability for IASI indices
      1.3.2.2 Forward selection algorithm
    1.3.3 Results
      1.3.3.1 Training
      1.3.3.2 Testing
    1.3.4 Discussion of the results
      1.3.4.1 Limited size of the IASI Full data set
      1.3.4.2 Increasing the Full IASI data set
  1.4 Conclusions
2 Technical Report 5: validation of Level 3 products derived from vertical rawinsonde and retrieval profiles with occurrence of convection as detected by Lightnings.
  2.1 Relationship between Instability Indices and convection occurrence
    2.1.1 Linear correlation between instability binary indices and convection occurrence
    2.1.2 Cross-entropy error between instability indices and convection occurrence
    2.1.3 Skill scores
  2.2 Results
    2.2.1 Udine Campoformido: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.2 Udine Campoformido: skill scores
    2.2.3 Pratica di Mare: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.4 Pratica di Mare: skill scores
    2.2.5 Cagliari: Linear correlation between instability binary indices and convection occurrence and cross-entropy
    2.2.6 Cagliari: skill scores
  2.3 Analysis of Results
  2.4 Conclusions
3 Technical Report 4: Comparison of Level 3 products (Instability Indices) derived from satellite observations and rawinsondes
  3.1 Generation of Instability Indices
    3.1.1 Lifted Parcel Theory assumptions
    3.1.2 Lifted Parcel Theory
    3.1.3 Selection of Instability Indices
  3.2 Results
    3.2.1 Udine Campoformido
    3.2.2 Pratica di Mare
    3.2.3 Cagliari
  3.3 Analysis of Results
    3.3.1 Forecast derived indices
    3.3.2 Time dependence of instability indices
  3.4 Conclusions
4 Technical Report 3: Validation of baseline Retrieval with rawinsondes
  4.1 Inversion with UWPHYSRET
    4.1.1 IASI observations used in retrieval
    4.1.2 A-priori Covariance: in-situ observations used for climatology
    4.1.3 Error Covariance
    4.1.4 Forward Model
    4.1.5 Minimization Scheme
    4.1.6 Convergence Test
    4.1.7 Retrieval Errors
  4.2 Validation strategy
    4.2.1 Spectral Validation
    4.2.2 Environmental Validation
      4.2.2.1 In situ observations used for environmental validation
      4.2.2.2 Statistical quantities used to characterize environmental validation
  4.3 Results
    4.3.1 Udine Campoformido
    4.3.2 Pratica di Mare
    4.3.3 Cagliari
  4.4 Analysis of Results
  4.5 Conclusions
5 Technical Report 2: Instability Indices (Level 3 Products) derived from IASI retrievals (Level 2 Products)
  5.1 Introduction
  5.2 Observations
  5.3 Retrievals
  5.4 Instability Indices
  5.5 Conclusions
6 Technical Report 1: Dataset description
  6.1 Introduction
  6.2 IASI data
    6.2.1 Pratica di Mare (lat: 41.65N, lon: 12.43E)
    6.2.2 Udine Campoformido (lat: 46.03N, lon: 13.18E)
    6.2.3 Cagliari (lat: 39.25N, lon: 9.05E)
  6.3 Lightning
    6.3.1 Pratica di Mare
    6.3.2 Udine, Campoformido
    6.3.3 Cagliari
  6.4 Rawinsondes
    6.4.1 Pressure interpolated profiles
    6.4.2 Pratica di Mare
    6.4.3 Udine, Campoformido
    6.4.4 Cagliari
7 Conclusions
Introduction
This document is a collection of seven technical reports (chapters) presented in reverse chronological order,
so that the reader can go through the final results without having to read all the preparatory parts
(which are nevertheless included). Chapter 1 includes the last two technical reports of the project, 6 and 7.
It focuses on the implementation of a prototype prediction system for the near-casting of convective events,
defined as the occurrence of more than 10 lightning strikes (with at least 1 second separation for nearby
strikes) over the area of interest (Po Valley in Italy), between 11:00 UTC and 17:00 UTC, for the annual
period April-October. The described effort aims to investigate and compare the capacity of predicting a
convective event using IASI data and products on one side, and rawinsonde products on the other. Chapter 2
(technical report 5), preparatory to chapter 1, is dedicated to the validation of the instability indices derived
from IASI data, and of those derived from high vertical resolution rawinsondes, against lightning occurrence
(convective event observations) in the areas of interest over Pratica di Mare, Cagliari, and Udine, Italy.
Chapter 3 (technical report 4) provides a comparison of the instability indices derived from satellite
observations and rawinsondes. Chapter 4 (technical report 3), going backwards, describes the results obtained
by validating the baseline retrievals (level 2 products) derived from IASI against the available high vertical
resolution rawinsondes launched within 50 km and about 100 minutes of the satellite observations. Chapters 5
and 6, which include technical reports 2 and 1 respectively, are dedicated to the description of the software
package used to derive instability indices from vertical profiles of atmospheric temperature and water vapor,
and to the dataset descriptions. Finally, chapter 7 restates the conclusions of the main chapter (chapter 1)
in a more general form.
Chapter 1
Technical Reports 6 and 7: Prediction of
convective event: intercomparison of results
obtained from IASI data with those obtained
from rawinsondes, over Po Valley.
Document: Technical Reports 6 and 7
Written by: Paolo Antonelli, A. Manzato
Date: 09 March 2011
Reference: PA/IIS/TR06/2011/01
1.1 Introduction
The presented effort aims to investigate and compare the capacity of predicting a convective event over the
Po Valley using: 1) rawinsonde products, 2) IASI data and products. In the study a convective event was
defined as the occurrence of more than 10 lightning strikes (with at least 1 second separation for nearby
strikes) over the area of interest, between 11:00 UTC and 17:00 UTC. In the first approach, 11:00 UTC
rawinsondes from the sites of Milano Linate and Udine Campoformido were used to generate sets
of 50 instability indices per site. Among these indices, 8 were selected as best predictors by a forward
selection algorithm and fed to an Artificial Neural Network (ANN), which produced two contingency tables,
for the Training and Test sets, associated with Peirce Skill Scores (PSS) of 0.67 and 0.68 respectively. In the
second approach, 30 instability indices (only the ones not depending on winds) were derived from IASI level 2
products generated by a physical retrieval approach (UWPHYSRET) over 2 areas centered on Milano and Udine.
Twenty linear combinations of IASI channels (Principal Component Scores, PCS) were added to the 30 indices as
potential predictors. The three best predictors were selected and fed to an ANN, which produced two contingency
tables, for the Training and Test sets, associated with PSS values of 0.67 and 0.33 respectively. The poor
results obtained in the generalization of the prediction of convective events from IASI data and products were
found to depend mostly on the limited size of the IASI database (available retrievals in clear sky conditions),
which is a factor of 10 smaller than the rawinsonde database (both for training and testing). However, a
general tendency of the retrievals to overestimate low level water vapor, which led to an overestimation of the
atmospheric instability, was found and should be further investigated. Finally, by focusing on a single area
of interest we were able to increase the size of the IASI database by a factor of two, and the PSS
on the test set reached the value of 0.57, indicating that nowcasting of convection from IASI data over
individual (smaller) areas is feasible and promising. Besides the final scores, the significance of the presented
material lies in the correlation found between some of the PCS and the occurrence of convection, and
in the validation of the IASI level 2 and level 3 products.
This document describes the statistical links found between the instability predictors, derived either from
11:00 UTC rawinsondes (instability indices [15]) or from morning overpass IASI data (instability indices
and principal component scores), and the occurrence of convective events, defined by the observation of
10 or more lightning strikes between 11:00 UTC and 17:00 UTC over the Po Valley, for the seasonal period
from the beginning of April to the end of October. The document is divided into four sections, respectively
dedicated to: 1) introduction to the experiment; 2) nowcasting convective events from rawinsondes; 3)
nowcasting convective events from IASI data; 4) conclusions on the achieved results.
1.1.1 Description of the Experiment
The goal of the experiment is to investigate and compare the capacity of predicting a convective event over
a designated area using:
• rawinsonde products;
• IASI data and products.
In the study a convective event is defined as the occurrence of more than 10 lightning strikes (with at
least 1 second separation if the LAT-LON coordinates are very close) over the region indicated by the
yellow area in figure 1.1, between 11:00 UTC and 17:00 UTC. The rawinsonde sites of Milano Linate (45.45°N,
9.27°E) and Udine Campoformido (46.02°N, 13.16°E) are indicated by the green markers, and are located
in an ideal position to characterize the boundary conditions for the area of interest.
The capacity of predicting was measured by calculating the Peirce skill score (PSS [8, 12, 13]) of a
binary classifier obtained from a set of continuous predictors, using thresholds to dichotomize the predictor
values into event occurrence and non-occurrence classes.
The strategy used to achieve the goal is summarized as follows:
• define the occurrence of a convective event by setting a threshold (of 10 strikes) to convert the
discrete distribution of lightning strikes into a binary output event yes/no (1/0);
• build a Full Dataset with the occurrence of a convective event (yes/no) and the values of all available
predictors;
– divide the Full Dataset into 2 subsets: 1) Total Set, to be used to build the classifier; 2) Test Set,
to be used for the final evaluation of the prediction capacity of the classifier;
– divide, in 12 different ways, the Total Set into 2 subsets: 1) Training Set (75%); 2) Validation
Set (25%); both to be used to subselect the optimal predictors using the Repeated Holdout
Technique [18];
• define, for each predictor, an empirical posterior probability, i.e. a mathematical relationship which
associates the probability of having an event with the continuous values of the predictor (figures 1.4,
1.5), and use it as pre-processing;
• implement a forward selection algorithm (based on Artificial Neural Networks (ANN), namely a single layer,
feedforward network trained with backpropagation [9, 14]) to choose an optimal subset of predictors.
The algorithm first chooses the one predictor that gives the best classification of the event occurrences
starting from its empirical probability distribution. Then it selects the predictor which gives the best
fit when used together with the first one. New predictors are added until the predictive skill of the system
stops increasing. The number of input predictors was chosen taking into consideration the mean skill
of the 12 ANNs built with the different instances of the Training Sets. The prediction skill of the ANN
was measured by the mean cross-entropy error (CEE):
\[ \mathrm{CEE} = -\frac{1}{N} \sum_{n=1}^{N} \left[ t_n \ln(y_n) + (1 - t_n) \ln(1 - y_n) \right] \tag{1.1} \]
where y_n is the output of the ANN, t_n is the boolean for the convective event (1 = yes, 0 = no), and N is
the number of cases; the mean is calculated over the 12 instances of the Validation Sets. The cross-entropy can be used as an error
measure when a network’s outputs can be thought of as representing independent hypotheses (e.g.
each node stands for a different concept), and the node activations can be understood as representing
the probability (or confidence) that each hypothesis might be true. In that case, the output vector
represents a probability distribution, and our error measure - cross-entropy - indicates the distance
between what the network believes this distribution should be, and what the teacher says it should be
[16]. CEE = 0 is a perfect score, while a CEE of about 1 or larger is a poor score.
• once the optimal subset of predictors was identified, the final ANN architecture was chosen among
different candidates (different numbers of hidden neurons in the hidden layer) as the one that has
the lowest combined CEE on the Total set (Training + Validation) without overfitting it, that is,
with similar performances also on the independent Test set;
• quantitative evaluation of the learning and generalization of the knowledge during the ANN supervised
training has been performed using the Receiver Operating Characteristic (ROC) [19, 12]. A
ROC curve summarizes the performance of a two-class classifier across the range of possible thresholds.
An ideal classifier hugs the left side and top side of the graph, and the area under the curve is 1.0.
A random classifier should achieve an area of approximately 0.5 and lies along the 45 degree bisector
(a classifier with an area less than 0.5 can be improved simply by flipping the class assignment).
The ROC curve is recommended for comparing classifiers, as it does not merely summarize performance
at a single arbitrarily selected decision threshold, but across all possible decision thresholds
[http://www.statsoft.com/textbook/statistics-glossary/r/button/r/]. Once the output of the ANN was
dichotomized using the event prior probability as threshold, the contingency table 1.1 was calculated,
and the different statistical scores (table 1.2) were determined.

                   Event (Y)   Event (N)
Prediction: YES    a           b
Prediction: NO     c           d
Table 1.1: Example of contingency table

Score     Expression
POD       a / (a + c)
POFD      b / (b + d)
FAR       b / (a + b)
HIT       (a + d) / (a + b + c + d)
BIAS      (a + b) / (a + c)
O         ad / (bc)
HSS       2(ad − bc) / [(a + c)(c + d) + (a + b)(b + d)]
PSS       (ad − bc) / [(a + c)(b + d)]
E(PSS)    sqrt{ [N^2 − 4(a + b)(c + d) KSS^2] / [4N(a + b)(c + d)] }
Table 1.2: Scores derived from the contingency table. N = a + b + c + d is the total number of cases, and
KSS (Kuipers skill score) is another name for the PSS.
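The error measure of eq. 1.1 and the scores of table 1.2 can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names and the clipping constant `eps` are assumptions introduced here, and the square root in the standard error E(PSS) is assumed because it reproduces the value of 0.02 reported for the Total set in section 1.2.3.1, whose counts are used in the example call.

```python
import math


def cross_entropy_error(targets, outputs, eps=1e-12):
    """Mean cross-entropy error (eq. 1.1) for binary targets t_n and outputs y_n."""
    total = 0.0
    for t, y in zip(targets, outputs):
        # Clip the output to avoid log(0); eps is an assumption of this sketch.
        y = min(max(y, eps), 1.0 - eps)
        total += t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return -total / len(targets)


def contingency_scores(a, b, c, d):
    """Scores of table 1.2: a = hits, b = false alarms, c = misses, d = correct negatives."""
    n = a + b + c + d
    pss = (a * d - b * c) / ((a + c) * (b + d))
    return {
        "POD": a / (a + c),
        "POFD": b / (b + d),
        "FAR": b / (a + b),
        "HIT": (a + d) / n,
        "BIAS": (a + b) / (a + c),
        "ODDS": (a * d) / (b * c),
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
        "PSS": pss,
        # Standard error of the PSS (square root assumed, see the text above).
        "E(PSS)": math.sqrt(
            (n ** 2 - 4 * (a + b) * (c + d) * pss ** 2) / (4 * n * (a + b) * (c + d))
        ),
    }


# Counts of the Total-set contingency table reported in section 1.2.3.1.
scores = contingency_scores(a=316, b=95, c=63, d=475)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

A constant output of 0.5 gives a CEE of ln 2 (about 0.69), which is a useful reference when reading the CEE values reported in the following sections.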
Figure 1.1: Event Area
1.1.2 Event climatology
The database of convective events for the period 2004-2010 was developed by the Osservatorio Meteorologico
Regionale (OSMER) of the Agenzia Regionale per la Protezione dell'Ambiente del Friuli Venezia
Giulia (ARPA-FVG), using lightning data previously purchased from the CESI/SIRF company. The climatology of
the observed number of lightning strikes in 6 h (excluding the cases with 0 strikes), for the period under
consideration and for the area of interest, is shown by the solid line in figure 1.2. The distribution of the
number of lightning strikes was also fit using the Pareto distribution:
\[ N(x) = 50 \cdot t \cdot x_{min}^{t} \cdot x^{-(t+1)} \tag{1.2} \]
with x_min = 50 and t = 0.33 (dashed line in figure 1.2). A threshold of 10 strikes was chosen to define
a convective event. The threshold value was selected subjectively, considering 10 strikes enough
to guarantee a significant (with respect to the area size) convective event. 600 cases in about 1500 days
(40.0%) showed at least 10 lightning strikes in the 6 h period between 11:00 and 17:00 UTC. The lightning
distribution was found to have Median = 1, Mean = 211.63, Standard Deviation = 647.19. No lightning
activity was recorded in 696 cases. In 898 cases lightning activity was associated with fewer than 10 strikes
over the whole area. The number of cases which had one strike only was 802, and the maximum activity was
observed on 30 Aug 2007, when 8174 strikes were counted. By setting the threshold of at least 10 lightning
strikes to define a convective event, the probability of an event over the whole dataset was estimated to
be around 40%.
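The dichotomization of the strike counts and the Pareto fit of eq. 1.2 can be sketched as follows. This is only an illustration: the function names are hypothetical, the sample counts are invented (apart from the maximum of 8174 strikes quoted above), and the event is taken as "at least 10 strikes", which is one of the two wordings used in the text.

```python
def event_flags(strike_counts, threshold=10):
    """Convert 6 h lightning strike counts into binary events (1 = event, 0 = no event).

    The comparison uses >= (at least `threshold` strikes), an assumption of this sketch.
    """
    return [1 if count >= threshold else 0 for count in strike_counts]


def prior_probability(flags):
    """Fraction of cases flagged as convective events."""
    return sum(flags) / len(flags)


def pareto_fit(x, t=0.33, x_min=50.0, scale=50.0):
    """Pareto curve of eq. 1.2 with the parameters quoted in the text."""
    return scale * t * x_min ** t * x ** (-(t + 1))


# Hypothetical strike counts for five days, for illustration only.
counts = [0, 1, 9, 10, 8174]
flags = event_flags(counts)
print(flags, prior_probability(flags))
print(pareto_fit(50.0))
```

With the real database the same prior probability (about 0.40) is the value later used as threshold on the continuous ANN output.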
Figure 1.2: Probability distribution of convective events
1.2 Nowcasting convective events from rawinsondes
Rawinsonde data for this study collected between 2004 and 2009 were provided by the Centro Nazionale per
la Meteorologia e la Climatologia Aeronautica (CNMCA), while rawinsonde data for 2010 were retrieved
from the University of Wyoming archive. The locations of the rawinsonde launching sites are shown by the
green markers in figure 1.1. It is worth emphasizing that only rawinsondes launched at 11:00 UTC were
taken into consideration in this study.
1.2.1 Full data set for Rawinsondes
The 1433 coincident rawinsondes available for Udine and Milano were used to generate 50 instability indices
(table 1.3) using Sound_Analys.py, the software package developed at OSMER [15, 11]. The rawinsonde
Full data set was built using the instability indices calculated over the two individual sites (2 subsets), their
differences (Milano - Udine, third subset), plus 4 combinations of some indices for each of the 3 subsets,
for a total of 50 × 3 + 12 = 162 variables. Including also the Julian date (JJJ) as a possible predictor, the
Full data set therefore contained 1433 cases with 163 candidate predictors and 1 boolean (event YES|NO)
to be predicted. However, candidate predictors with a high number of missing values, such as the Level of Free
Convection (LFC) and the Equilibrium Level (EL), which are defined only for potentially unstable profiles, were
not considered, effectively reducing the number of candidate predictors used to 157.
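The bookkeeping behind the 163 candidate predictors can be sketched as below. The function `assemble_case` and the placeholder variable names are hypothetical; the naming convention (suffixes u and m for Udine and Milano, prefix D_ for the Milano - Udine difference) is assumed from names such as KIu, KIm and D_CAP that appear elsewhere in this chapter.

```python
def assemble_case(udine, milano, julian_day):
    """Build one row of candidate predictors from the per-site values.

    `udine` and `milano` map a variable name to its value at that site.
    """
    row = {"JJJ": julian_day}
    for name, value in udine.items():
        row[f"{name}u"] = value
    for name, value in milano.items():
        row[f"{name}m"] = value
    # Third subset: differences Milano - Udine for the variables available at both sites.
    for name in udine.keys() & milano.keys():
        row[f"D_{name}"] = milano[name] - udine[name]
    return row


# Placeholder names: 50 indices plus 4 combinations per site (values are invented).
names = [f"IDX{i:02d}" for i in range(50)] + [f"COMBO{i}" for i in range(4)]
udine = {name: float(i) for i, name in enumerate(names)}
milano = {name: float(i) + 1.0 for i, name in enumerate(names)}

row = assemble_case(udine, milano, julian_day=182)
print(len(row))
```

The row length is (50 + 4) × 3 + 1, which matches the 163 candidate predictors quoted above before the variables with many missing values are dropped.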
1.2.2 Predicting events from rawinsonde derived indices
1.2.2.1 Empirical posterior probability for rawinsonde indices
According to the strategy described in sec. 1.1.1, once the Full data set was assembled, the Empirical
Posterior Probability (EPP) functions were determined by fitting ad-hoc curves to the distribution of the
event likelihood probabilities. Figures 1.3, 1.4, and 1.5 show examples of the EPP for the
Julian day and for two of the predictors (CAPE and DT500) respectively. The complete set of EPP plots for the
rawinsonde derived predictors can be found in ANNEX1.
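As an illustration of the idea behind the EPP, the sketch below estimates the observed event frequency in equal-width bins of a predictor and uses it as a step function. This is a simplification: the study fits ad-hoc curves to such frequencies, whereas the binning, the function names and the CAPE-like sample values used here are assumptions made only for the example.

```python
from bisect import bisect_right


def empirical_posterior(values, events, n_bins=5):
    """Return the bin edges and the event frequency observed in each equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    hits = [0] * n_bins
    counts = [0] * n_bins
    for value, event in zip(values, events):
        # The last bin is closed on the right so that the maximum value is included.
        k = min(bisect_right(edges, value) - 1, n_bins - 1)
        counts[k] += 1
        hits[k] += event
    freq = [h / c if c else None for h, c in zip(hits, counts)]
    return edges, freq


def apply_epp(value, edges, freq):
    """Map a predictor value to the event frequency of the bin it falls in."""
    k = min(max(bisect_right(edges, value) - 1, 0), len(freq) - 1)
    return freq[k]


# Invented CAPE-like values [J/kg] and event flags, for illustration only.
values = [0.0, 100.0, 200.0, 900.0, 1000.0, 1500.0, 1900.0, 2000.0]
events = [0, 0, 0, 1, 0, 1, 1, 1]
edges, freq = empirical_posterior(values, events, n_bins=2)
print(edges, freq, apply_epp(1200.0, edges, freq))
```

Replacing each predictor by its EPP value puts all inputs on a common probability-like scale before they are passed to the ANN, which is the pre-processing role described in sec. 1.1.1.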
Index [unit]            Description
P(JJJ) []               Julian day
BOY []                  Boyden index
BRI []                  Bulk Richardson number
BS850 [m s−1]           Bulk Shear 850 hPa - 100 m
CAP [°C]                Maximum cap (as Θes difference)
CAPE [J/kg]             Convective available potential energy
CIN [J/kg]              Convective inhibition
DT500 [°C]              Difference of temperature at 500 hPa
DTC [°C]                Core difference of temperature
DownPotm [K]            Downdraft potential
EHI []                  Energy–helicity index
HD [cm]                 Hail Diameter (derived from UpDr)
HLJD [m]                High–levels (6–12 km) jet depth
HLWu [m s−1]            U component of high–levels (6–12 km) wind
HLWv [m s−1]            V component of high–levels (6–12 km) wind
HRH [%]                 High–levels (500–300 hPa) relative humidity
HEL [J kg−1]            Helicity
KI [°C]                 K index
LCL [m]                 Lifting condensation level
LFC [m]                 Level of free convection height
LI [°C]                 Lifted index
LLJD [m]                Low–levels (lowest 6 km) jet depth
LLWu [m s−1]            U component of low-level wind (0.5 km)
LLWv [m s−1]            V component of low-level wind (0.5 km)
LRH [%]                 Mean relative humidity in the first 250 hPa
SWEAT []                Severe weather threat
MEL [m]                 Melting level (parcel at 0°C)
MLWu [m s−1]            U component of midlevel wind (6 km)
MLWv [m s−1]            V component of midlevel wind (6 km)
MRH [%]                 Mean relative humidity in the first 500 hPa
MaxBuo [K]              Maximum buoyancy
Mix [g/kg]              Most unstable parcel (MUP) mixing ratio
PBL [m]                 Planetary boundary layer estimated height
PWC [mm]                Precipitable water content of cloud
PWE [mm]                Precipitable water content of environment
Rel_Hel [m2 s−2]        Relative helicity
SWISS []                Stability and wind shear index for storms in Switzerland
Shear [s−1]             Wind shear in the lowest 12 km
Shear3 [s−1]            Wind shear in the lowest 3 km
ShowI [°C]              Showalter index
Tbase [°C]              Cloud-base (LCL) temperature
Thetae [K]              Most Unstable Parcel Θe
Trop [m]                Tropopause height
UpDr [m/s]              "Core updraft" (parcel at -15°C)
VFlux [m−2 s−1 kg]      Mean water vapor horizontal flux
VV [m s−1]              Radiosonde ascensional vertical velocity
VVstd [m s−1]           Std dev of radiosonde vertical velocity
WBZ [m]                 Environmental wet bulb zero height
b_PBL [cm/s2]           Mean buoyancy acceleration of the first 250 hPa
h_MUP [m]               MUP height
Table 1.3: Instability Indices

Figure 1.3: Empirical relationship between Julian day and lightning occurrence. Peak of the activity is
found in July.
Figure 1.4: Empirical relationship between CAPE and occurrence of at least 10 lightning strikes. On the left
figure CAPE was derived from Udine Campoformido rawinsondes, while on the right it was derived from
Milano Linate rawinsondes.
Figure 1.5: Empirical relationship between DT500 and occurrence of at least 10 lightning strikes. On the left
figure DT500 was derived from Udine Campoformido rawinsondes, while on the right it was derived from
Milano Linate rawinsondes.

1.2.2.2 Forward selection algorithm
The EPP of the indices generated over Udine and Milano, and those of their differences, were then
used to determine the optimal subset of predictors to forecast the occurrence of a convective event, still
following the procedure described in section 1.1.1. Figure 1.6 shows the best 8 predictors found among
the 157 rawinsonde derived indices according to the mean CEE values of the Validation Errors (VE), Training
Errors (TE), and Total Errors (Total E = 0.75 TE + 0.25 VE), also reported in table 1.4, where the letters u
and m at the end of the variable names stand for Udine and Milano respectively.

Variable   mean VE   mean TE   mean Total E
DTCu       0.488     0.484     0.485
SWISSm     0.433     0.428     0.429
WBZm       0.402     0.402     0.402
KIu        0.375     0.371     0.372
Mixu       0.363     0.356     0.358
PWCm       0.351     0.338     0.342
ShowIm     0.354     0.330     0.336
KIm        0.347     0.313     0.321
Table 1.4: Forward selection algorithm: results for the best rawinsonde derived predictors.
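The forward selection with repeated holdout described in section 1.1.1 can be sketched in a few lines. This is a minimal illustration under explicit assumptions, not the procedure actually run in the study: a single logistic unit trained by batch gradient descent stands in for the ANN, the data are synthetic, and all function names, the learning rate and the number of epochs are choices made here.

```python
import math
import random


def sigmoid_output(w, b, x):
    return 1.0 / (1.0 + math.exp(-(b + sum(wi * xi for wi, xi in zip(w, x)))))


def fit_logistic(rows, targets, epochs=200, lr=0.5):
    """Train a logistic unit (stand-in for the ANN) by batch gradient descent."""
    n_feat = len(rows[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for x, t in zip(rows, targets):
            err = sigmoid_output(w, b, x) - t
            for j in range(n_feat):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def mean_cee(model, rows, targets, eps=1e-12):
    """Mean cross-entropy error (eq. 1.1) of a trained model on a set of cases."""
    w, b = model
    total = 0.0
    for x, t in zip(rows, targets):
        y = min(max(sigmoid_output(w, b, x), eps), 1.0 - eps)
        total += t * math.log(y) + (1 - t) * math.log(1.0 - y)
    return -total / len(rows)


def holdout_cee(data, targets, columns, n_splits=12, train_frac=0.75, seed=0):
    """Mean validation CEE over repeated 75%/25% splits of the Total set."""
    rng = random.Random(seed)  # same seed, so every candidate sees the same splits
    idx = list(range(len(data)))
    errors = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        train, valid = idx[:cut], idx[cut:]
        pick = lambda subset: [[data[i][c] for c in columns] for i in subset]
        model = fit_logistic(pick(train), [targets[i] for i in train])
        errors.append(mean_cee(model, pick(valid), [targets[i] for i in valid]))
    return sum(errors) / n_splits


def forward_selection(data, targets):
    """Greedily add the predictor that most lowers the mean validation CEE."""
    selected, best = [], float("inf")
    remaining = list(range(len(data[0])))
    while remaining:
        trials = {c: holdout_cee(data, targets, selected + [c]) for c in remaining}
        candidate = min(trials, key=trials.get)
        if trials[candidate] >= best:  # skill stopped increasing
            break
        selected.append(candidate)
        best = trials[candidate]
        remaining.remove(candidate)
    return selected, best


# Synthetic data: column 0 is informative (EPP-like values), columns 1-2 are noise.
rng = random.Random(1)
targets = [i % 2 for i in range(80)]
data = [[(0.8 if t else 0.2) + rng.uniform(-0.1, 0.1), rng.random(), rng.random()]
        for t in targets]
selected, best = forward_selection(data, targets)
print(selected, round(best, 3))
```

The stopping rule used here (stop as soon as the mean validation CEE no longer decreases) is one simple reading of "until the predictive skill stops increasing"; the study chose the number of inputs by inspecting the mean skill over the 12 instances, as shown in figure 1.6.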
1.2.3 Results
During the input selection phase (forward selection algorithm) only the Total set was used; it included 949
cases and was used to train the different ANN candidates. To select the best architecture (number of hidden
neurons) for the prediction system (ANN), the consistency between the results obtained on the Total
and on the Test sets (the latter of 350 cases) was also taken into account. The chosen architecture has
8 inputs, 2 neurons in the hidden layer, and 1 output.
Figure 1.6: The Training-Validation (TV) diagram of the CEEs computed over the 12 bootstraps (instances
of training/validation sets) for the first 8 "best ANN-inputs" chosen by the classification forward selection
algorithm. The sets of 12 TV points of each variable are represented alternately by filled circles and
triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the 12-point
bootstraps.
1.2.3.1 Training
Application of the ANN to the Total set led to a Total CEE of 0.335, while applying the probability
threshold (0.40) to the continuous ANN output led to the following contingency table:

TOTAL              Event (Y)   Event (N)
Prediction: YES    316         95
Prediction: NO     63          475

The analysis of the contingency table gave:

TOTAL    POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score    0.83   0.83   0.23   0.17   1.08   0.67   0.66   0.67   0.02     25.08

1.2.3.2 Testing
Applying the ANN to the Test set led to a Test CEE of 0.375, while the entries found in the contingency table
were:

TEST               Event (Y)   Event (N)
Prediction: YES    114         33
Prediction: NO     23          180

The analysis of the contingency table gave:

TEST     POD    HIT    FAR    POFD   BIAS   TS     HSS    PSS    S(PSS)   ODDS
Score    0.83   0.84   0.22   0.15   1.07   0.67   0.67   0.68   0.04     27.03

These scores are even slightly better than those obtained applying the ANN to the Total Set, and indicate
a good generalization capacity of the network. The ROC curves for this ANN are shown in figure 1.7.
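For reference, a ROC curve such as the ones in figure 1.7 can be built from the continuous classifier outputs by sweeping the decision threshold and recording the (POFD, POD) pairs; the area under the curve then follows from the trapezoidal rule. The sketch below uses invented outputs and hypothetical function names, and is not the code used to produce the figure.

```python
def roc_points(targets, outputs):
    """Return the (POFD, POD) points obtained by sweeping the decision threshold."""
    positives = sum(targets)
    negatives = len(targets) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step, from the largest output to the smallest.
    for threshold in sorted(set(outputs), reverse=True):
        hits = sum(1 for t, y in zip(targets, outputs) if y >= threshold and t == 1)
        false_alarms = sum(1 for t, y in zip(targets, outputs) if y >= threshold and t == 0)
        points.append((false_alarms / negatives, hits / positives))
    return points


def roc_area(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented event flags and classifier outputs, for illustration only.
targets = [1, 1, 0, 1, 0, 0]
outputs = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
points = roc_points(targets, outputs)
print(points)
print(roc_area(points))
```

In this toy example 8 of the 9 event/non-event pairs are ranked correctly, so the area is 8/9; an area of 1.0 corresponds to the ideal classifier and about 0.5 to a random one, as recalled in section 1.1.1.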
1.3 Nowcasting convective events from IASI
The same procedure described in sec. 1.2 was repeated using predictors derived from IASI data.
1.3.1
Full data set for IASI
IASI coverage started in 2007 and only 154 coincident retrievals/observations were found for Udine and
Milano. The IASI Full data set was built using predictors calculated from IASI retrievals [15, 11, 6], and
predictors derived from IASI observed radiances. The Full set was built on 154 cases defined by 160
predictors (CAPu and D_CAP were not used because of the large fraction of missing values) and 1 boolean
(event YES|NO) to be predicted.
Figure 1.7: ROC obtained by selected ANN for Total and Test rawinsonde data
1.3.1.1 Predictors derived from IASI retrievals
Instability indices (the 30 which do not depend on wind direction and/or intensity) were generated from
IASI level 2 products [6]. The retrievals were obtained by inverting observations collected between April
and October, from 2007 to 2010, over the blue areas in figure 1.8, using UWPHYSRET [5, 3] with a local
climatology (which included rawinsondes launched from Milano and from Udine at 05:00 and 11:00 UTC)
for the characterization of the a-priori covariance. Retrievals were considered successful if the spectral
residuals were within the noise level. All successful retrievals were then divided into two categories:
those whose profiles did not show evidence of water vapor saturation (218 profiles over Udine, and 294
over Milano) and those which showed evidence of potential saturation (556 profiles over Udine, and 753
over Milano). The first category was considered more reliable; however, the potentially saturated profiles
had to be included because of the very limited number of favorable cases. In fact, even with the
potentially saturated profiles, after averaging all the cases observed in a single overpass (obtaining
283 cases over Udine, and 332 over Milano), and finding the intersection (coincident profiles) of the two
sets of Milano and Udine, only 154 cases were left for the whole period 2007-2010.
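
The reduction from individual retrievals to coincident cases can be illustrated with a short sketch. It assumes that each retrieval carries an overpass identifier (here a hypothetical date string) and a single scalar value; the function names and the sample values are illustrative only and are not taken from the study.

```python
from collections import defaultdict


def mean_per_overpass(records):
    """Average all retrieval values that share the same overpass identifier.

    records: iterable of (overpass_id, value) pairs for one area.
    Returns a dict mapping overpass_id to the mean value.
    """
    groups = defaultdict(list)
    for overpass_id, value in records:
        groups[overpass_id].append(value)
    return {key: sum(values) / len(values) for key, values in groups.items()}


def coincident_cases(site_a, site_b):
    """Keep only the overpasses available over both areas."""
    common = sorted(set(site_a) & set(site_b))
    return {key: (site_a[key], site_b[key]) for key in common}


# Hypothetical values, for illustration only.
udine = mean_per_overpass([("2008-07-01", 10.0), ("2008-07-01", 12.0), ("2008-07-02", 8.0)])
milano = mean_per_overpass([("2008-07-01", 9.0), ("2008-07-03", 7.0)])
print(coincident_cases(udine, milano))
```

Requiring coincidence over both areas keeps only the overpasses common to the two sets, which is why the number of usable cases drops well below the per-area counts.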
For each available retrieval the mean temperature and the mean water vapor mixing ratio, within the
lowest 200 hPa of the atmospheric profile, were compared to those derived from the rawinsondes. Results
showed that the mean retrieval temperature over Milano (figure 1.9) and Udine (figure 1.11) is, as expected,
generally 1 - 2 K colder than the mean rawinsonde temperature, since the retrieval is obtained about 90
minutes before the rawinsonde. Red diamonds, associated with potentially saturated profiles, show more
outliers and slightly worse statistics than the blue diamonds associated with non saturated
profiles. Results also showed that the water vapor mixing ratio retrieved from IASI is generally overestimated
with respect to the values provided by the rawinsondes (figures 1.10 and 1.12). This is expected to lead to an
overestimation of the instability because of the overestimation of Θe. Also for water vapor more outliers
were found in the potentially unstable profiles.
Figure 1.8: IASI retrieval areas
An additional sanity check on the data used to build part of the IASI Full data set was performed by
calculating the linear correlation and the bias between instability indices derived from the retrievals and
those derived from the rawinsondes. Results are shown in figures 1.13 and 1.14. Consistent with what is
described in [4], the indices which are not dependent on Lifted Parcel Theory showed higher correlation.
Also, in general, correlations were found to be higher over Milano than over Udine, indicating better
performance of the retrieval system and/or lower temporal and spatial variability of the atmospheric
conditions over Milano.
Figure 1.9: Comparison of mean atmospheric Temperature in the lowest 200 hPa between retrievals and
rawinsondes over Milano. Red and blue diamonds represent respectively mean values for potentially
saturated and non saturated profiles.
Figure 1.10: Comparison of mean atmospheric water vapor mixing ratio in the lowest 200 hPa between
retrievals and rawinsondes over Milano. Red and blue diamonds represent respectively mean values for
potentially saturated and non saturated profiles.
Figure 1.11: Comparison of mean atmospheric Temperature in the lowest 200 hPa between retrievals
and rawinsondes over Udine. Red and blue diamonds represent respectively mean values for potentially
saturated and non saturated profiles.
Figure 1.12: Comparison of mean atmospheric water vapor mixing ratio in the lowest 200 hPa between
retrievals and rawinsondes over Udine. Red and blue diamonds represent respectively mean values for
potentially saturated and non saturated profiles.
Figure 1.13: Correlation of Instability Indices derived from the retrievals and the rawinsondes for the
150 coincident cases (Milano-Udine). Black diamonds show the correlation obtained over Milano, while
magenta circles show the correlation found over Udine.
Figure 1.14: Bias (normalized by the standard deviation) of Instability Indices derived from the retrievals
and the rawinsondes for the 150 coincident cases (Milano-Udine). Black diamonds show the bias obtained
over Milano, while magenta circles show the bias found over Udine.
1.3.1.2 Predictors derived from IASI principal components
Besides the instability indices derived from IASI level 2 products (retrievals), 20 potential predictors
were generated using Principal Component Analysis. The procedure to generate the Principal Component
Scores (PCS) is fully described in [2, 1, 7]. The set of 20 PCS predictors was created using 10 PCS for the
IASI Long Wave (LW) band, 9 for the Mid Wave (MW) band, and 1 for the Short Wave (SW) band. The PCS
were associated with Principal Components number 1, 2, 4, 5, 6, 7, 8, 9, 10, 12 for the LW; 1, 2, 4, 6, 7,
8, 9, 10, 17 for the MW; and 24 for the SW. Selection was based on the spectral structure of the Principal
Components, looking for the ones which carried more information on weak water vapor lines (especially in
the LW). A more systematic analysis of individual components is part of the future activity. Some of the
PCS were found to have an empirical posterior probability distribution (sec. 1.3.2.1) which could fit the
lightning observations remarkably well. This was the case for the PCS associated with the 10th LW PC
(figure 1.17), which clearly appeared to be a good candidate predictor for the nowcasting of convective
events. In addition to the 20 PCS, 12 non-linear combinations of them (such as PCS_x^2 + PCS_y^2) were
also included in the list of candidate predictors.
1.3.2 Predicting events from IASI derived indices and PCS
1.3.2.1 Empirical posterior probability for IASI indices
According to the strategy described in sec. 1.1.1 and 1.2.2.1, once the Full data set was assembled, the EPP
functions were determined by fitting ad-hoc curves to the distribution of the events. Figures 1.15, 1.16,
and 1.17 show examples of the EPP for the Julian Day and for two of the predictors (CAPE and
PCS LW 10m), respectively. The complete set of EPP plots for the IASI derived predictors can be found in ANNEX2.
1.3.2.2 Forward selection algorithm
The forward selection algorithm, applied to the indices generated for the two areas around Udine and
Milano and to the 20 IASI Principal Component Scores, along with their differences (Milano - Udine),
produced the results described in table 1.5. Figure 1.18 shows the best 6 predictors found among the 164
rawinsonde derived indices according to the lowest mean values of the Validation Errors (VE). The Training
Errors (TE) and Total Errors are also reported in table 1.5, where the letters u and m at the end of the
variable names stand for Udine and Milano respectively, while the Greek letter ∆ indicates the difference
between values generated over Milano and over Udine.
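
The general idea of forward selection can be sketched as a greedy loop. This is a generic illustration, not the algorithm used in the study: it assumes a user-supplied function returning the mean validation error of a candidate set of inputs (in the report this would be a CEE averaged over the 12 bootstraps), and the names `forward_selection` and `toy_error` are hypothetical.

```python
def forward_selection(candidates, mean_validation_error, max_inputs):
    """Greedy forward selection of predictors.

    At each step the candidate that gives the lowest mean validation error,
    when added to the inputs already selected, is retained. The loop stops
    when max_inputs is reached or when no candidate lowers the error.
    Returns a list of (variable, error) pairs in order of selection.
    """
    selected = []
    history = []
    remaining = list(candidates)
    best_so_far = float("inf")
    while remaining and len(selected) < max_inputs:
        scored = [(mean_validation_error(selected + [c]), c) for c in remaining]
        error, best = min(scored, key=lambda pair: pair[0])
        if error >= best_so_far:
            break  # adding a further input no longer helps
        selected.append(best)
        remaining.remove(best)
        history.append((best, error))
        best_so_far = error
    return history


# Toy error function, for illustration only: each variable has a fixed weight.
def toy_error(subset):
    weights = {"A": 3, "B": 1, "C": 2}
    return 1.0 / (1.0 + sum(weights[v] for v in subset))


print(forward_selection(["A", "B", "C"], toy_error, max_inputs=2))
```

With this scheme the validation error decreases as inputs are added, which matches the decreasing pattern of the mean validation errors listed in table 1.5.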
Figure 1.15: Empirical relationship between Julian day and lightning occurrence for the IASI Full data
set. Peak of the activity is found in July.
Figure 1.16: Empirical relationship between CAPE and occurrence of at least 10 lightning. On the left
figure CAPE was derived from Udine Campoformido rawinsondes, while on the right it was derived from
Milano Linate rawinsondes.
Figure 1.17: Empirical relationship between PCSL10 and occurrence of at least 10 lightning. On the left
figure PCSL10 was derived from Udine Campoformido rawinsondes, while on the right it was derived from
Milano Linate rawinsondes.
Variable       <VE>     <TE>     <Total E>
PCS_L10m       0.340    0.338    0.339
ShowIu         0.298    0.277    0.282
WBZm           0.201    0.175    0.181
PCS_L10_M8m    0.198    0.173    0.180
∆PCS_M7        0.194    0.125    0.141
∆KI            0.066    0.046    0.051
Table 1.5: Forward selection algorithm: results for the best IASI derived predictors.
Figure 1.18: The Training-Validation (TV) diagram of the CEEs computed over the 12 bootstraps
(instances of training/validation sets) for the first 6 “best variables” chosen by the classification forward
selection algorithm. The sets of 12 TV points of each variable are represented alternatively by filled
circles and triangles, while the unfilled squares, connected by a dashed line, show the mean errors over the
12-point bootstraps.
1.3.3 Results
The Full data set (IASI-Lightnings) was divided into a Total set (which included 116 cases and was used
to train the different ANN candidates) and a Test Set (of 38 cases) used to select the best prediction system,
in terms of absolute scores and consistency between the results obtained on the Total and the Test sets.
The architecture chosen was an ANN with 3 inputs, 1 neuron in the hidden layer, and 1 output.
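
For readers unfamiliar with this kind of network, the forward pass of a small feed-forward ANN with one hidden layer can be sketched as below. The activation functions (tanh in the hidden layer, logistic at the output), the weight values, and the use of "greater than or equal" for the probability threshold are assumptions made for illustration; the report does not specify them, and the function names are hypothetical.

```python
import math


def logistic(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def ann_probability(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Continuous output of a one-hidden-layer network.

    hidden_weights holds one list of input weights per hidden neuron.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return logistic(z)


def forecast(probability, threshold=0.40):
    """Turn the continuous output into a YES/NO forecast."""
    return probability >= threshold


# 3 inputs, 1 hidden neuron, 1 output; placeholder weights, not fitted values.
p = ann_probability(
    inputs=[0.2, -1.0, 0.5],
    hidden_weights=[[0.8, -0.3, 0.1]],
    hidden_biases=[0.0],
    output_weights=[1.5],
    output_bias=-0.2,
)
print(round(p, 3), forecast(p))
```

With only one hidden neuron the network has very few free parameters, which is a natural choice when the training set contains little more than a hundred cases.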
1.3.3.1 Training
Application of the ANN on the Total set led to a Total CEE of 0.27 (on 116 cases), while applying the
probability threshold (0.40) on the continuous ANN output led to the following contingency table:

TOTAL         Forecast: YES    Forecast: NO
Event: YES    25               11
Event: NO     6                74
The analysis of the contingency table led to the following results:

TOTAL    POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score    0.81    0.85    0.31    0.13    1.16    0.59    0.64    0.68    0.08      28.03

1.3.3.2 Testing
Applying the ANN on the Test set led to a Test CEE of 0.81 (on 38 cases), while entries found in the
contingency table were:

TEST          Forecast: YES    Forecast: NO
Event: YES    7                7
Event: NO     7                17
The analysis of the contingency table led to the following results:

TEST     POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score    0.5     0.63    0.5     0.29    1       0.33    0.21    0.21    0.16      2.43

which are worse than the scores obtained applying the ANN on the Total Set, and indicate a poor
generalization capacity of the network. The ROC curves for this ANN are shown in figure 1.19. It is worth
emphasizing that, in this case, signs of overfitting are evident, and the performance on the Reduced Test
dataset is worse than that obtained on the Full rawinsonde Test dataset.
Figure 1.19: ROC obtained by selected ANN for Total and Test IASI data
1.3.4 Discussion of the results
The poor results obtained in forecasting the Test convective events from IASI data and products, with respect
to those obtained using rawinsondes, are likely due to:
1. the limited size of the IASI database (available retrievals in clear sky conditions), which is a factor
of 10 smaller than the rawinsonde database (both for training and testing);
2. a general tendency of the retrievals to overestimate low level water vapor, which led to overestimation
of the atmospheric instability.
The first hypothesis was investigated and the results are reported in the two following subsections.
Addressing the second hypothesis is part of the future work, as it requires a more detailed study of the
performance of the inversion system.
1.3.4.1 Limited size of the IASI Full data set
Considering that the nowcasting of the convective events using rawinsonde derived predictors was
significantly better, a simple set of experiments was done to address the relevance of the Full data set size on
the final skill figures. The experiments consisted essentially in replicating what is described in sec. 1.2,
but using a reduced version of the rawinsonde Total and Test data sets, obtained by retaining from the
Full rawinsonde data set only the cases represented in the Full IASI data set.
1. In the first experiment a new set of optimal predictors was determined using the Reduced Set of
rawinsondes (150 cases in total). The best ANN-inputs, shown in figure 1.20, were LIu, MELm,
HLVv, SWEATu, Helm, HDu. The best ANN architecture, using only the first 3 predictors and 1 hidden
neuron, led to a Total CEE of 0.37 and a Test CEE of 0.58, and to the following contingency tables
on the Total and Test sets:

RED. TOTAL    Forecast: YES    Forecast: NO
Event: YES    28               25
Event: NO     1                59

RED. TEST     Forecast: YES    Forecast: NO
Event: YES    13               12
Event: NO     1                11

The analysis of the contingency tables led to the following results:

RED. TOTAL    POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score         0.97    0.77    0.47    0.30    1.83    0.52    0.53    0.67    0.07      66.08

RED. TEST     POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score         0.93    0.65    0.48    0.52    1.79    0.5     0.35    0.41    0.16      11.92

and to the ROC shown in figure 1.21.
2. In the second experiment the ANN described in the previous point and developed on the Reduced
rawinsonde Total Set was applied to the Full rawinsonde Test Set. This led to a CEE of 0.55 on
415 cases, while applying the probability threshold (0.4) to the continuous ANN output led to the
following contingency table on the Test set:

FULL TEST     Forecast: YES    Forecast: NO
Event: YES    146              93
Event: NO     16               160

The analysis of the contingency table led to the following results:

FULL TEST    POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score        0.74    0.74    0.39    0.37    1.47    0.57    0.49    0.53    0.04      15.70

3. In the third experiment the ANN described in sec. 1.2 and developed on the Full rawinsonde set
(8 inputs, 2 hidden neurons, 1 output) was applied to the Reduced rawinsonde Total and Test sets.
This led to CEE values of 0.29 and 0.27 respectively, while applying the probability threshold (0.4)
to the continuous ANN output led to the following contingency table on the Test set:

RED TEST      Forecast: YES    Forecast: NO
Event: YES    11               0
Event: NO     3                21

The analysis of the contingency table led to the following results:

RED TEST    POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score       0.79    0.91    0.0     0.0     0.79    0.79    0.81    0.79    0.12      N/A

Figure 1.20: The TV CEEs computed over the 12 bootstraps (instances of training/validation sets) for
the first 6 “best ANN-inputs” chosen by the classification forward selection algorithm. The sets of 12
TV points of each variable are represented alternatively by filled circles and triangles, while the unfilled
squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.
Note that these results are consistent with those obtained by the 8-input, 2-neuron ANN on its Full
Test dataset and are independent of the very small sample size of the Reduced Test database used here.
The outcomes of the 3 experiments were expected, but it was important to quantify the degradation of the
performance due to the limited size of the total datasets. The first experiment showed that, also for the
rawinsondes, if the Full data Set is small (and therefore the Total and Test sets are small) the ANN does
a good job fitting the Total set but performs poorly on the Test set. The second experiment showed that
part of the degradation of the performance of the ANN trained on the small Total data set is actually
due to the lack of representativeness of the Reduced Test set: when the larger Full Test set is used,
the PSS improves noticeably. Finally, the third experiment confirmed that the representativeness of the Test
set is not crucial: the same ANN which was performing well on the Full Test set, when applied to
the Reduced Test Set, produced consistent scores, such as PSS = 0.79.
Figure 1.21: ROC obtained by selected ANN for the reduced Total and Test rawinsonde data
1.3.4.2 Increasing the Full IASI data set
By focusing on a single area of interest, it was possible to generate an ANN predictor trained on an
IASI dataset twice as large as the Full IASI Set used in sec. 1.3. This section aims to provide a short
description of the results obtained by nowcasting convective events over the area of Milano only. The goal
is to show that, by avoiding the need for coincident observations, more observations can be used in the
experiment and better results can be achieved using IASI products. A more extensive study of the single
area prediction scheme is left to future work.
1. MILANO: a new set of optimal predictors was determined using the Full IASI Set of retrievals for
Milano (331 cases in total). The best ANN-inputs, shown in figure 1.22, were KI, PCS LW 12, PCS
LW 17. The best ANN, with 3 inputs and 1 hidden neuron, led to a Total CEE of 0.40 (on 242 cases) and a
Test CEE of 0.42 (on 89 cases), and to the following contingency tables on the Total and Test sets:

MILANO TOTAL    Forecast: YES    Forecast: NO
Event: YES      36               41
Event: NO       25               140

MILANO TEST     Forecast: YES    Forecast: NO
Event: YES      17               13
Event: NO       6                53

The analysis of the contingency tables led to the following results:

MILANO TOTAL    POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score           0.59    0.73    0.53    0.23    1.26    0.35    0.33    0.36    0.06      4.91

MILANO TEST     POD     HIT     FAR     POFD    BIAS    TS      HSS     PSS     S(PSS)    ODDS
Score           0.74    0.79    0.43    0.2     1.30    0.47    0.35    0.49    0.10      11.55

and to the ROC shown in figure 1.23. The results obtained over Milano, using a Full dataset which
is twice as large as the Milano-Udine Full dataset, are much better than those obtained
for the whole Po Valley, and indicate that nowcasting of convection from IASI data over individual
(smaller) areas is feasible and promising.
Figure 1.22: The TV CEEs computed over the 12 bootstraps (instances of training/validation sets) for
the first 6 “best ANN-inputs” chosen by the classification forward selection algorithm. The sets of 12
TV points of each variable are represented alternatively by filled circles and triangles, while the unfilled
squares, connected by a dashed line, show the mean errors over the 12-point bootstraps.
Figure 1.23: ROC obtained by selected ANN for the Full Total and Test IASI data over Milano
1.4 Conclusions
This report describes the results obtained by two forecast systems for thunderstorms (events with more
than 10 lightning strikes between 11:00 and 17:00 UTC for the time period April - October) over the Po
Valley.
1. For the first system, rawinsondes launched in Milano Linate and Udine Campoformido between
2004 and 2010 were used to produce sets of 50 instability indices. Among these indices, 8 predictors
(DTCu, SWISSm, WBZm, KIu, Mixu, PWCm, ShowIm, KIm, where m stands for calculated over
Milano, and u for calculated over Udine) were fed to an ANN with 2 neurons in the hidden layer,
and 1 continuous output. Application of the network to the Total (or Training) and Test sets led
to CEE of 0.335 and 0.375 respectively. By setting the discretization threshold (event: YES/NO) to
0.40 the ANN produced two contingency tables for Total and Test associated with PSS scores of 0.667
and 0.677 respectively.
2. The second system was designed to replicate the first one but with two substantial differences: wind
dependent instability indices were not used, and 20 PCS were used as potential predictors. The
available dataset for the IASI observations turned out to be much smaller than the rawinsonde
database, because only a small fraction (about 30% per area of interest) of IASI observations were
found to be in clear sky, and because the required coincidence of retrievals over the two areas reduced
the available cases to 10% of the total cases. With the limited IASI dataset, it was possible to
describe the Total set properly with a 3-input ANN (PCS_L10m, ShowIu, and WBZm) with 1 neuron
in the hidden layer, but the capacity of the ANN to generalize on the Test Set was found to be poor: the
CEE of 0.267 on the Total Set becomes 0.813 on the Test Set. With a discrimination threshold of 0.40
the PSS of 0.722 on the Total set became 0.333 on the Test set. It is worth emphasizing that the first
predictors chosen were a combination of PCS and instability indices.
The poor results obtained in the generalization of the prediction of convective events from IASI data and
products were found to be mostly dependent on the limited size of the IASI database (available retrievals
in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and
testing). However, a general tendency of the retrievals to overestimate low level water vapor, which led
to overestimation of the atmospheric instability, was found and should be further investigated. Finally,
by focusing on a single area of interest we were able to increase the size of the IASI database by a factor
of two, and the prediction PSS score on the test set reached the value of 0.57, indicating that nowcasting of
convection from IASI data over individual (smaller) areas, even if not yet as good as the nowcasting from high
vertical resolution rawinsondes, is feasible and promising, especially in the perspective of using more polar
satellites (AQUA, METOP-B, etc.) and even more with future geostationary platforms (MTG). Besides the
final scores, the significance of the presented material relies on the correlation found between some of the PCS
and the occurrence of convection, and on the validation of the IASI level 2 and level 3 products.
Chapter 2
Technical Report 5: validation of Level 3
products derived from vertical rawinsonde and
retrieval profiles with occurrence of convection
as detected by Lightnings.
Document: Technical Report 5
Written by: Paolo Antonelli, A. Manzato
Date: 26 November 2010
Reference: PA/IIS/TR05/2010/01
2.1 Relationship between Instability Indices and convection occurrence
This document describes the statistical links found between the instability indices derived from
rawinsondes and IASI retrievals over the areas of Udine Campoformido, Pratica di Mare, and Cagliari
(sec. 2.2 PA/IIS/TR03/2010/01) and the occurrence of convective activity as detected by lightnings
(PA/IIS/TR01/2010/01). Statistical relationships were evaluated in different ways, as described in the
following sections, as a first step in investigating and comparing the skills of individual indices in predicting
the occurrence of convection. Following in part the concept described in [9], the three different approaches
used are based on linear regression (sec. 2.1.1), cross-entropy (sec. 2.1.2), and skill scores (sec. 2.1.3). It
is worth stressing that in the first and the third case the results are strongly dependent on the threshold
used to map the continuous instability variables into a boolean index (I_stable = 0; I_unstable = 1). Also, the
relationships described in this document refer to individual indices. This work has been done to demonstrate
the need for a statistical tool capable of combining all the indices to take advantage of their individual
skills.
2.1.1 Linear correlation between instability binary indices and convection occurrence
In a first approach the linear correlation between indices and occurrence of convection was computed by
mapping the continuous values of the indices and of the lightning counts into two boolean variables. For
every rawinsonde followed by more than 1 lightning count in the 10 hrs time span, the convection occurrence
variable L was set to yes (1). If no lightning activity was observed, the convection occurrence variable L
was set to no (0). The continuous values of the instability indices were also mapped into boolean
variables, by defining, for each index, a threshold value t which, if exceeded (not exceeded) by the index,
would lead to an instability variable I_t equal to yes (1) (or no (0)). For each index, derived from the
rawinsondes or from the retrievals, the threshold value t was determined by maximizing the correlation
between I_t and L. Once the thresholds were defined, the linear correlation R was calculated as
follows:
$$R = \frac{\mathrm{cov}(L, I_t)}{\sigma_L \, \sigma_{I_t}} \qquad (2.1)$$
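
The threshold search based on equation 2.1 can be sketched as follows. This is a minimal illustration, assuming that candidate thresholds are taken from the observed index values, that instability is flagged when the index exceeds t, and that the threshold with the largest absolute correlation is retained (which would accommodate indices whose correlation with lightning is negative). None of these details is stated in the report, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Linear correlation R = cov(x, y) / (sigma_x * sigma_y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sigma_x == 0.0 or sigma_y == 0.0:
        return 0.0  # correlation is undefined for a constant series
    return cov / (sigma_x * sigma_y)


def best_threshold(index_values, lightning, candidates=None):
    """Scan thresholds t and keep the one with the largest |R| between I_t and L."""
    if candidates is None:
        candidates = sorted(set(index_values))
    best_t, best_r = None, 0.0
    for t in candidates:
        unstable = [1 if value > t else 0 for value in index_values]
        r = pearson(unstable, lightning)
        if abs(r) > abs(best_r):
            best_t, best_r = t, r
    return best_t, best_r


# Hypothetical index values and lightning flags, for illustration only.
print(best_threshold([100, 200, 600, 800], [0, 0, 1, 1]))
```

Because the threshold is tuned on the same cases used to compute R, the resulting correlation is an optimistic estimate, which is one reason the scores derived from it depend strongly on the chosen threshold.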
2.1.2 Cross-entropy error between instability indices and convection occurrence
In the classification problem the error is usually taken to be the cross-entropy error (CEE), defined as:
$$CEE = -\frac{1}{N} \sum_{j=1}^{N} \left( L_j \ln(x) + (1 - L_j) \ln(1 - x) \right) \qquad (2.2)$$
where N is the total number of cases, Lj is a boolean representing the convection occurrence, and x is the
instability index under consideration.
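
A direct implementation of equation 2.2 is sketched below. It assumes that the value x is available for each case and lies between 0 and 1, for example after an index has been mapped to a probability; this mapping, the clipping constant `eps`, and the function name are assumptions and are not part of the report.

```python
import math


def cross_entropy_error(occurrence, probability, eps=1e-12):
    """Cross-entropy error between boolean occurrences and probabilities.

    occurrence: sequence of 0/1 values (L_j in equation 2.2).
    probability: sequence of values in [0, 1], one per case.
    Probabilities are clipped to (eps, 1 - eps) to avoid log(0).
    """
    if len(occurrence) != len(probability):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for l_j, x_j in zip(occurrence, probability):
        x_j = min(max(x_j, eps), 1.0 - eps)
        total += l_j * math.log(x_j) + (1 - l_j) * math.log(1.0 - x_j)
    return -total / len(occurrence)


# An uninformative forecast of 0.5 for every case gives ln(2), about 0.693.
print(cross_entropy_error([1, 0], [0.5, 0.5]))
```

A CEE well below ln(2) indicates probabilities that are more informative than a constant 0.5, while values above it indicate confident but frequently wrong probabilities.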
2.1.3 Skill scores
As a last step, contingency tables for each instability index were derived using the It which maximizes R in
equation 2.1. An example of the contingency table is shown in tbl. 2.1, where a and b represent the number
of cases in which there was instability according to the value of the instability index considered (for example
CAPE > It = 500) and lightning activity was observed in the 10 hrs following the rawinsonde/satellite
observation (a), or lightning activity was not observed (b); c and d represent cases in which the indices
indicated stability (for example CAPE < It = 500) and lightning activity was observed (c), or was not
observed (d). Contingency tables were derived for both rawinsondes and retrievals and for each individual
index. For each contingency table a set of 5 scores (tbl. 2.2) was calculated and compared.
           INST (Y)    INST (N)
LGT (Y)    a           c
LGT (N)    b           d
Table 2.1: Example of contingency table

Score    Expression
POD      a/(a+c)
POFD     b/(b+d)
FAR      b/(a+b)
HIT      (a+d)/(a+b+c+d)
PSS      (ad-bc)/[(a+c)(b+d)]
Table 2.2: Scores
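
As a short worked step, using only the expressions of table 2.2, the PSS can be seen to equal the difference between POD and POFD:

```latex
\mathrm{POD} - \mathrm{POFD}
  = \frac{a}{a+c} - \frac{b}{b+d}
  = \frac{a(b+d) - b(a+c)}{(a+c)(b+d)}
  = \frac{ad - bc}{(a+c)(b+d)}
  = \mathrm{PSS}
```

This identity gives a quick consistency check on the score tables: for each index, the PSS column should match POD minus POFD up to rounding.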
2.2 Results
2.2.1 Udine Campoformido: Linear correlation between instability binary indices and convection occurrence and cross-entropy
Using the Instability Indices derived from 1924 rawinsondes and 148 retrievals, the threshold values, It, needed to
map the continuous indices into a stability boolean index were derived by maximizing the linear correlation
as described in section 2.1.1. Figures 2.3, 2.4, and 2.5 show the linear correlation R as a function of the
threshold values It, for both the rawinsondes (magenta) and the retrievals (black). Out of the 1924
rawinsondes available, 628 were found to be associated with lightning activity in the 10 hrs following the
rawinsonde launch, while out of the 148 available retrievals, 19 were associated with lightning activity.
Values of the maximum linear correlation found for both rawinsondes and retrievals are shown in tbl. 2.6
(5th and 6th columns), while the values of cross-entropy are shown in the same table in the 7th and 8th
columns. The Group column represents the index family as described in document PA/IIS/TR04/2010/01.
2.2.2 Udine Campoformido: skill scores
Skill scores described in section 2.1.3, shown in tbl. 2.7, were calculated from rawinsonde (columns 2, 4,
6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores
are strongly dependent on the threshold values, It , used to map the continuous indices into boolean for
stability/instability, which were simply derived by maximizing the linear correlation.
Table 2.3: Udine, Campoformido: correlation between instability occurrence (as a function of threshold
values) and lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL
Table 2.4: Udine, Campoformido: correlation between instability occurrence (as a function of threshold
values) and lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe.
Table 2.5: Udine, Campoformido: correlation between instability occurrence (as a function of threshold
values) and lightning occurrence for LFC.

Index     Group      Units    Sonde R    IASI R    Sonde CEE    IASI CEE
CAPE      magenta    J/kg     0.51       0.46      1.60         0.26
CIN       magenta    J/kg     0.48       0.32      1.12         1.16
UdDr      magenta    m/s      0.53       0.45      1.84         0.31
LI        grey       °C       -0.53      -0.41     0.95         1.38
ShowI     grey       °C       -0.54      -0.43     0.94         1.38
DTC       grey       °C       -0.51      -0.47     1.02         1.31
DTC500    grey       °C       -0.55      -0.44     0.93         1.38
LFC       grey       m        0.13       0.26      0.98         0.61
LCL       grey       m        -0.36      -0.23     0.87         0.51
Tbase     grey       °C       0.43       0.30      0.94         1.19
MaxBuo    grey       °C       0.51       0.50      0.53         0.58
KI        green      °C       0.49       0.39      0.87         0.78
CAP       green      °C       -0.28      -0.29     0.96         0.50
MRH       green      %        0.27       0.28      0.72         0.64
LRH       green      %        0.21       0.21      0.84         0.94
PWE       green      mm       0.44       0.37      0.53         0.50
Θe        green      K        0.40       0.35      0.58         0.69
Table 2.6: Udine Campoformido. List of correlation values between Instability Indices derived from 1924
(628 with lightning) rawinsonde and 148 (19 with lightning) retrievals ([15]) and number of lightning counts
observed in the 10 hrs time span after rawinsonde launch.

Index     Sonde    IASI     Sonde    IASI     Sonde    IASI     Sonde    IASI     Sonde    IASI
          POD      POD      POFD     POFD     FAR      FAR      HIT      HIT      PSS      PSS
CAPE      0.73     0.79     0.20     0.19     0.36     0.62     0.78     0.81     0.53     0.60
CIN       0.92     0.89     0.42     0.41     0.48     0.76     0.69     0.63     0.50     0.48
UdDr      0.77     0.47     0.21     0.05     0.36     0.44     0.78     0.89     0.55     0.42
LI        0.25     0.11     0.80     0.70     0.87     0.98     0.22     0.28     -0.54    -0.59
ShowI     0.20     0.42     0.76     0.90     0.89     0.94     0.22     0.14     -0.56    -0.48
DTC       0.32     0.26     0.83     0.85     0.85     0.96     0.22     0.16     -0.52    -0.59
DTC500    0.15     0.16     0.74     0.77     0.91     0.97     0.23     0.22     -0.58    -0.61
LFC       0.27     0.79     0.16     0.40     0.55     0.78     0.65     0.62     0.11     0.39
LCL       0.14     0.00     0.51     0.31     0.88     1.00     0.38     0.60     -0.37    -0.31
Tbase     0.76     0.89     0.31     0.44     0.45     0.77     0.72     0.60     0.45     0.45
MaxBuo    0.79     0.53     0.25     0.05     0.40     0.41     0.76     0.89     0.54     0.47
KI        0.72     0.74     0.21     0.21     0.38     0.66     0.77     0.78     0.51     0.53
CAP       0.37     0.37     0.67     0.76     0.79     0.93     0.34     0.26     -0.30    -0.39
MRH       0.84     0.74     0.56     0.33     0.58     0.75     0.57     0.68     0.28     0.41
LRH       0.83     0.05     0.63     0.00     0.61     0.00     0.52     0.88     0.20     0.05
PWE       0.60     0.68     0.17     0.20     0.37     0.67     0.76     0.78     0.43     0.48
Θe        0.70     0.53     0.28     0.12     0.45     0.62     0.72     0.83     0.42     0.40
Table 2.7: Udine Campoformido. List of contingency table scores for Instability Indices derived from 1924
(628 with lightning) rawinsonde and 148 (19 with lightning) retrievals ([15]) and number of lightning counts
observed in the 10 hrs time span after rawinsonde launch.

2.2.3 Pratica di Mare: Linear correlation between instability binary indices and convection occurrence and cross-entropy
Using the Instability Indices derived from 1649 (574 associated with lightning activity) rawinsondes and 291
(64 associated with lightning activity) retrievals, the threshold values, It, needed to map the continuous indices
into a stability boolean index were derived by maximizing the linear correlation as described in section
2.1.1. Figures 2.8, 2.9, and 2.10 show the linear correlation R as a function of the threshold values It, for
both the rawinsondes (magenta) and the retrievals (black). Values of the maximum linear correlation
found for both rawinsondes and retrievals are shown in tbl. 2.11 (5th and 6th columns), while the values
of cross-entropy are shown in the same table in the 7th and 8th columns. The Group column represents the
index family as described in document PA/IIS/TR04/2010/01.
2.2.4 Pratica di Mare: skill scores
Skill scores described in section 2.1.3, shown in tbl. 2.12, were calculated from rawinsonde (columns 2, 4,
6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores
are strongly dependent on the threshold values, It , used to map the continuous indices into boolean for
stability/instability, which were simply derived by maximizing the linear correlation.
2.2.5 Cagliari: Linear correlation between instability binary indices and convection occurrence and cross-entropy
Using the Instability Indices derived from 1725 (381 with lightning) rawinsondes and 292 (27 with lightning)
retrievals, the threshold values, It, needed to map the continuous indices into a stability boolean index
were derived by maximizing the linear correlation as described in section 2.1.1. Figures 2.13, 2.14, and 2.15
show the linear correlation R as a function of the threshold values It, for both the rawinsondes (magenta) and
the retrievals (black). Values of the maximum linear correlation found for both rawinsondes and retrievals
are shown in tbl. 2.16 (5th and 6th columns), while the values of cross-entropy are shown in the same
table in the 7th and 8th columns. The Group column represents the index family as described in document
Table 2.8: Pratica di Mare: correlation between instability occurrence (as a function of threshold values)
and lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL
Table 2.9: Pratica di Mare: correlation between instability occurrence (as a function of threshold values)
and lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe .
Table 2.10: Pratica di Mare: correlation between instability occurrence (as a function of threshold values)
and lightning occurrence for LFC.
Index    Group    Units   Sonde R   IASI R   Sonde CEE   IASI CEE
CAPE     magenta  J/kg      0.33     0.30      2.24        0.88
CIN      magenta  J/kg      0.33     0.27      1.15        1.04
UpDr     magenta  m/s       0.38     0.32      2.52        0.80
LI       grey     °C       -0.35    -0.28      0.84        0.77
ShowI    grey     °C       -0.39    -0.25      0.84        0.85
DTC      grey     °C       -0.40    -0.29      1.01        0.85
DTC500   grey     °C       -0.35    -0.29      0.82        0.82
LFC      grey     m        -0.13    -0.10      1.07        1.04
LCL      grey     m        -0.21    -0.23      0.95        0.63
Tbase    grey     °C        0.13     0.19      1.17        1.54
MaxBuo   grey     °C        0.41     0.31      0.60        0.73
KI       green    °C        0.41     0.14      1.00        0.91
CAP      green    °C       -0.29    -0.25      0.99        0.44
MRH      green    %         0.39     0.25      0.59        0.59
LRH      green    %         0.30     0.24      0.68        0.76
PWE      green    mm        0.24     0.15      0.64        0.70
Θe       green    K         0.07    -0.06      0.76        0.96
Table 2.11: Pratica di Mare. List of correlation values between Instability Indices derived from 1649 (574
with lightning) rawinsonde and 291 (64 with lightning) retrievals ([15]) and number of lightning counts
observed in the 10 hrs time span after rawinsonde launch.
Index    Sonde   IASI    Sonde   IASI    Sonde   IASI    Sonde   IASI    Sonde   IASI
         POD     POD     POFD    POFD    FAR     FAR     HIT     HIT     PSS     PSS
CAPE     0.64    0.56    0.30    0.23    0.47    0.59    0.68    0.73    0.34    0.33
CIN      0.91    0.75    0.59    0.43    0.55    0.67    0.58    0.61    0.31    0.32
UpDr     0.62    0.75    0.24    0.37    0.42    0.63    0.71    0.66    0.38    0.38
LI       0.37    0.50    0.73    0.80    0.79    0.85    0.30    0.27   -0.36   -0.30
ShowI    0.37    0.28    0.77    0.58    0.79    0.88    0.28    0.39   -0.40   -0.30
DTC      0.14    0.52    0.55    0.82    0.88    0.85    0.34    0.25   -0.41   -0.30
DTC500   0.18    0.45    0.54    0.78    0.85    0.86    0.36    0.27   -0.36   -0.32
LFC      0.02    0.09    0.09    0.19    0.89    0.88    0.60    0.66   -0.07   -0.09
LCL      0.15    0.23    0.34    0.51    0.81    0.89    0.48    0.43   -0.20   -0.28
Tbase    0.82    0.28    0.70    0.11    0.62    0.59    0.48    0.75    0.12    0.17
MaxBuo   0.89    0.52    0.47    0.19    0.50    0.57    0.65    0.75    0.42    0.33
KI       0.58    0.98    0.18    0.89    0.36    0.76    0.74    0.31    0.40    0.10
CAP      0.37    0.36    0.67    0.66    0.77    0.87    0.34    0.35   -0.31   -0.30
MRH      0.82    0.52    0.42    0.24    0.49    0.62    0.67    0.71    0.41    0.28
LRH      0.57    0.70    0.26    0.41    0.46    0.67    0.68    0.62    0.31    0.29
PWE      0.29    0.17    0.10    0.07    0.39    0.58    0.69    0.77    0.19    0.11
Θe       0.96    0.44    0.92    0.51    0.64    0.80    0.38    0.48    0.04   -0.07
Table 2.12: Pratica di Mare. List of contingency table scores for Instability Indices derived from 1649 (574
with lightning) rawinsondes and 291 (64 with lightning) retrievals ([15]) and number of lightning counts
observed in the 10 hrs time span after rawinsonde launch.
PA/IIS/TR04/2010/01.
2.2.6 Cagliari: skill scores
Skill scores described in section 2.1.3, shown in tbl. 2.17, were calculated from rawinsondes (columns 2, 4,
6, 8, and 10) and from retrievals (columns 3, 5, 7, 9, and 11). It is important to emphasize that the scores
depend strongly on the threshold values, I_t, used to map the continuous indices into a boolean
stability/instability index; these thresholds were simply derived by maximizing the linear correlation.
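For reference, the scores listed in the contingency tables can be sketched as follows. The sketch assumes the standard 2x2 contingency table definitions (POD = a/(a+c), POFD = b/(b+d), FAR = b/(a+b), HIT = (a+d)/n, PSS = POD - POFD), which may differ in detail from those of section 2.1.3; the tabulated PSS values do agree with POD - POFD to within rounding. The function name skill_scores is hypothetical.

```python
def skill_scores(forecast, observed):
    """Contingency-table scores for boolean forecasts and observations.

    forecast: 1 if the thresholded index flags instability, 0 otherwise
    observed: 1 if lightning was observed, 0 otherwise
    """
    a = b = c = d = 0
    for f, o in zip(forecast, observed):
        if f and o:
            a += 1  # hit
        elif f and not o:
            b += 1  # false alarm
        elif not f and o:
            c += 1  # miss
        else:
            d += 1  # correct negative
    n = a + b + c + d
    nan = float("nan")
    pod = a / (a + c) if (a + c) else nan   # probability of detection
    pofd = b / (b + d) if (b + d) else nan  # probability of false detection
    far = b / (a + b) if (a + b) else nan   # false alarm ratio
    hit = (a + d) / n if n else nan         # fraction of correct forecasts
    pss = pod - pofd                        # Peirce skill score
    return {"POD": pod, "POFD": pofd, "FAR": far, "HIT": hit, "PSS": pss}
```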
2.3 Analysis of Results
Results showed that different statistical analyses produce different rankings of the forecasting skill of the
indices. In particular, over Udine Campoformido, the approach based on Max(R) (sec. 2.1.1) favors
UpDr, LI, ShowI and DTC500, which show sharp peaks in figures 2.3, 2.4, and 2.5, while
the second method, based on Min(CEE) (sec. 2.1.2), favors MaxBuo, PWE and Θe, where the
last two exhibit a broad maximum rather than a sharp one. Differences in the ranking should be further
investigated.
Over Pratica di Mare, the approach based on Max(R) rewards DTC, MaxBuo and KI, while the second
approach, Min(CEE), rewards MaxBuo, PWE and MRH. Over Cagliari both Max(R) and Min(CEE) reward
MaxBuo and MRH. The worst index according to Min(CEE) is consistently UpDr, while for Max(R) it is always
Table 2.13: Cagliari: correlation between instability occurrence (as a function of threshold values) and
lightning occurrence for CAPE, CIN, UpDr, LI, ShowI, DTC, DT500, and LCL
Table 2.14: Cagliari: correlation between instability occurrence (as a function of threshold values) and
lightning occurrence for Tbase, MaxBuo, CAP, MRH, PWE, LRH, KI, Θe .
Table 2.15: Cagliari: correlation between instability occurrence (as a function of threshold values) and
lightning occurrence for LFC.
Index    Group    Units   Sonde R   IASI R   Sonde CEE   IASI CEE
CAPE     magenta  J/kg      0.30     0.16      1.56        0.30
CIN      magenta  J/kg      0.35     0.21      1.28        0.61
UpDr     magenta  m/s       0.30     0.18      1.98        0.41
LI       grey     °C       -0.30    -0.23      0.78        0.89
ShowI    grey     °C       -0.35    -0.14      0.90        0.94
DTC      grey     °C       -0.36    -0.19      0.95        1.16
DTC500   grey     °C       -0.28    -0.18      0.79        0.88
LFC      grey     m        -0.07    -0.12      0.83        0.71
LCL      grey     m        -0.15    -0.21      0.69        0.50
Tbase    grey     °C        0.11     0.12      1.38        1.42
MaxBuo   grey     °C        0.40     0.18      0.58        0.45
KI       green    °C        0.35     0.21      1.15        0.84
CAP      green    °C       -0.28    -0.19      0.77        0.26
MRH      green    %         0.41     0.29      0.55        0.68
LRH      green    %         0.30     0.25      0.69        0.92
PWE      green    mm        0.14    -0.18      0.65        0.79
Θe       green    K        -0.11    -0.18      0.73        0.68
Table 2.16: Cagliari. List of correlation values between Instability Indices derived from 1725 (381 with
lightning) rawinsonde and 292 (27 with lightning) retrievals ([15]) and number of lightning counts observed
in the 10 hrs time span after rawinsonde launch.
Index    Sonde   IASI    Sonde   IASI    Sonde   IASI    Sonde   IASI    Sonde   IASI
         POD     POD     POFD    POFD    FAR     FAR     HIT     HIT     PSS     PSS
CAPE     0.87    0.85    0.51    0.58    0.67    0.87    0.57    0.46    0.36    0.27
CIN      0.82    0.81    0.39    0.45    0.63    0.84    0.65    0.58    0.43    0.37
UpDr     0.62    0.81    0.27    0.51    0.61    0.86    0.70    0.52    0.35    0.30
LI       0.18    0.56    0.54    0.86    0.91    0.94    0.40    0.18   -0.36   -0.30
ShowI    0.18    0.07    0.60    0.28    0.92    0.97    0.35    0.66   -0.42   -0.21
DTC      0.13    0.19    0.57    0.51    0.94    0.96    0.37    0.46   -0.43   -0.32
DTC500   0.24    0.96    0.58    1.00    0.90    0.91    0.38    0.09   -0.34   -0.04
LFC      0.06    0.11    0.11    0.29    0.86    0.96    0.71    0.65   -0.05   -0.18
LCL      0.15    0.19    0.31    0.54    0.88    0.97    0.57    0.43   -0.16   -0.35
Tbase    0.99    0.67    0.94    0.46    0.77    0.87    0.27    0.55    0.06    0.21
MaxBuo   0.93    0.63    0.46    0.34    0.63    0.84    0.63    0.66    0.48    0.29
KI       0.79    0.26    0.37    0.06    0.62    0.70    0.67    0.88    0.42    0.20
CAP      0.68    0.15    0.91    0.47    0.83    0.97    0.22    0.50   -0.23   -0.32
MRH      0.82    0.63    0.33    0.20    0.59    0.76    0.70    0.78    0.48    0.43
LRH      0.76    0.89    0.40    0.46    0.65    0.84    0.63    0.57    0.36    0.43
PWE      0.11    0.96    0.03    1.00    0.53    0.91    0.78    0.09    0.07   -0.04
Θe       0.85    0.11    0.93    0.41    0.79    0.97    0.25    0.54   -0.08   -0.30
Table 2.17: Cagliari. List of contingency table scores for Instability Indices derived from 1725 (381 with
lightning) rawinsonde and 292 (27 with lightning) retrievals ([15]) and number of lightning counts observed
in the 10 hrs time span after rawinsonde launch.
LFC. Before drawing the conclusions it is important to note that for the rawinsondes the probability of a
convective event (for example over Udine) is p_raw = 628/1924 = 33%, while for the retrievals it is
p_ret = 19/148 = 13%. The large discrepancy, p_raw ≫ p_ret, is likely due to the higher probability of having
clouds associated with convective events, since retrievals are only available in clear sky.
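The same arithmetic can be repeated for the three sites using the sample sizes quoted in the captions of tables 2.7, 2.11 and 2.16. The short sketch below only restates those counts; the helper name base_rate is hypothetical.

```python
# (soundings with lightning, total soundings), taken from the table captions
counts = {
    "Udine Campoformido": {"rawinsonde": (628, 1924), "retrieval": (19, 148)},
    "Pratica di Mare": {"rawinsonde": (574, 1649), "retrieval": (64, 291)},
    "Cagliari": {"rawinsonde": (381, 1725), "retrieval": (27, 292)},
}

def base_rate(with_lightning, total):
    """Fraction of soundings followed by lightning."""
    return with_lightning / total

for site, sources in counts.items():
    p_raw = base_rate(*sources["rawinsonde"])
    p_ret = base_rate(*sources["retrieval"])
    print(f"{site}: p_raw = {p_raw:.1%}, p_ret = {p_ret:.1%}")
```

At every site the event frequency is lower in the retrieval sample than in the rawinsonde sample, which should be kept in mind when comparing scores between the two data sets.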
2.4 Conclusions
This document described the results obtained by investigating the statistical links between the instability
indices derived from rawinsondes and IASI retrievals over the areas of Udine Campoformido, Pratica
di Mare, and Cagliari, and the occurrence of convective activity as detected by lightning. Statistical
relationships were evaluated in different ways as a first step in understanding and comparing the skill of
individual indices in predicting the occurrence of convection. Following in part the concept described in [9],
the three different approaches used were based on linear regression (sec. 2.1.1), cross-entropy (sec. 2.1.2),
and skill scores (sec. 2.1.3). It is worth stressing that in the first and the third case the results found were
strongly dependent on the threshold used to map the continuous instability variables into a boolean index
(I_stable = 0; I_unstable = 1). As expected, single indices have different skill in forecasting convective activity;
therefore the development of a statistical tool capable of combining all the indices to take advantage of
their individual skills is highly recommended.
In terms of comparing the information obtained from the rawinsonde and the retrievals, the results
shown in this document remain consistent with the conclusions drawn in the previous report (PA/IIS/TR04/2010/01),
which stated that overall the results obtained are encouraging for defining a procedure to operationally use
the IASI retrievals in the derivation of instability indices useful for forecasting and nowcasting of instability
activity.
Chapter 3
Technical Report 4: Comparison of Level 3
products (Instability Indices) derived from
satellite observations and rawinsondes
Document: Technical Report 4
Written by: Paolo Antonelli
Date: 4 November 2010
Reference: PA/IIS/TR04/2010/01
3.1 Generation of Instability Indices
A set of eighteen instability indices was derived both from the available rawinsondes (hereafter referred to as
ΥR) and from the available IASI retrievals (hereafter referred to as ΥI). The instability indices, listed in table 3.1,
were generated using the software Sound_Analys.py developed at the Osservatorio Meteorologico Regionale
(OSMER) of Friuli Venezia Giulia by Agostino Manzato [15]. Besides providing the capability to derive a
large set of instability indices, Sound_Analys.py implements three different schemes for evaluating the
buoyancy and performing the adiabatic lifting. The study described in this document uses only the results
obtained with the Tv scheme, which extends the simple T scheme (where the parcel is considered to
consist of dry air only, therefore neglecting the water vapor) by also accounting for the non-saturated vapor
during the "dry" lifting. In other words the Tv scheme is based on a "moist" adiabatic lifting process. Specific
details about the three approaches can be found in [11]. The remaining part of this section describes the
basic concepts of the lifted parcel theory; however, before proceeding further it is worth emphasizing that
not all the indices rely on this theory: for example, the K-index (KI) depends on the atmospheric state
only, regardless of the assumptions made on the vertical dynamics schemes.
Index    Units
CAPE     J/kg
CIN      J/kg
UpDr     m/s
LI       °C
ShowI    °C
DTC      °C
DTC500   °C
LFC      m
LCL      m
Tbase    °C
KI       °C
CAP      °C
MRH      %
LRH      %
PWE      mm
Θe       K
MaxBuo   °C
Swiss    −
Table 3.1: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the
individual indices can be found in [15].
3.1.1
Lifted Parcel Theory assumptions
The assumptions behind the calculation of the instability indices dependent on the lifted parcel theory can
be schematically synthesized as:
• rising parcel does not mix with the environment;
• parcel pressure always equal to environmental pressure at same height;
• parcel rises along moist adiabat until it becomes saturated, and afterwards it rises along a wet
adiabat;
• condensed water falls out of the parcel (pseudo-adiabatic process), so there is no freezing of the
condensed water.
3.1.2 Lifted Parcel Theory
The implementation of the lifted parcel theory in Sound_Analys.py starts with the selection of the initial
low level parcel that will rise, and that represents the moist and warm air that will create the cloud.
Sound_Analys.py defines a layer of 30 hPa and raises it stepwise through the lowest 250 hPa, computing for
each central layer pressure (p_c) the mean sounding values within the layer ([p_c − 15, p_c + 15] hPa). The layer
with the most unstable features, i.e. the largest pressure averaged equivalent potential temperature Θe0,
is selected as initial parcel (Most Unstable Parcel). Initial conditions for the parcel are (p0, T0, Td0). The
parcel is then lifted to a higher level (p, T, Td), along a moist adiabat, so that the initial potential temperature
Θe0 and q0 are conserved until the parcel becomes saturated at the Lifting Condensation Level (LCL). Above the LCL
the parcel is lifted along a wet adiabat, which conserves only the equivalent potential temperature Θe0 while
q = q_sat(T, p). If, at some level, the parcel becomes lighter than the environment, that level is called the Level of Free
Convection (LFC) and the sounding is said to be potentially unstable. Afterwards the parcel rises up to the
Equilibrium Level (EL), where the density of the parcel is equal to the environmental density.
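The selection of the Most Unstable Parcel can be sketched as follows. This is not the code of Sound_Analys.py: the function name, the 5 hPa step and the 1 hPa integration grid are assumptions made for illustration, and the equivalent potential temperature is taken as an input rather than computed from T and Td.

```python
import numpy as np

def most_unstable_layer(p, theta_e, depth=30.0, search=250.0, step=5.0):
    """Return (p_c, mean theta_e) of the layer with the largest pressure-averaged theta_e.

    p: pressure levels in hPa (any order); theta_e: equivalent potential temperature in K.
    A layer of `depth` hPa is moved in `step` hPa increments through the lowest `search` hPa.
    """
    p = np.asarray(p, dtype=float)
    theta_e = np.asarray(theta_e, dtype=float)
    order = np.argsort(p)  # np.interp needs increasing abscissae
    p, theta_e = p[order], theta_e[order]
    p_sfc = p[-1]
    half = depth / 2.0
    centers = np.arange(p_sfc - half, p_sfc - search + half - 1e-9, -step)
    best_pc, best_mean = None, -np.inf
    for pc in centers:
        grid = np.linspace(pc - half, pc + half, int(depth) + 1)  # 1 hPa spacing
        vals = np.interp(grid, p, theta_e)
        # trapezoidal pressure average over [pc - half, pc + half]
        mean = np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(grid)) / depth
        if mean > best_mean:
            best_pc, best_mean = float(pc), float(mean)
    return best_pc, best_mean
```

The returned central pressure identifies the layer whose mean values (p0, T0, Td0) would then be used as initial conditions for the lifting.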
3.1.3 Selection of Instability Indices
Manzato [10] shows that different instability indices have different sensitivity to the vertical resolution. Figure
3.1 shows the correlation, R², between the indices calculated from the same rawinsondes but sampled
at different vertical resolutions. The indices which are dependent on the lifted parcel theory show lower
correlations than the other indices, therefore it is likely that they will present the largest deviations between
the values calculated from the high vertical resolution rawinsondes and those calculated from the IASI retrievals.
Therefore throughout the whole document the indices have been divided into three main families or groups:
1. indices heavily dependent on lifted parcel theory (CAPE, CIN, UpDr), hereafter referred to as the magenta
group;
2. indices somewhat dependent on lifted parcel theory (LI, ShowI, DTC, DTC500, LFC, Tbase),
belonging to the grey group;
3. indices independent of lifted parcel theory (KI, CAP, MRH, LRH, Θe, MaxBuo, PWE), belonging
to the green group.
3.2 Results
The instability indices listed in table 3.1 were generated from sets, ΥR, of rawinsondes, at the highest available
vertical resolution, and from sets, ΥI, of IASI retrievals. In order to minimize the impact of the issues
described in technical report #3 (ref. PA/IIS/TR03/2010/01) the following procedure was used:
1. instability indices were derived for each satellite overpass with one or more spectrally successful IASI
retrievals by applying Sound_Analys.py to the mean profiles obtained from all the available retrievals
for that overpass;
2. the same instability indices were derived from the closest available rawinsonde, regardless of time
order but with a time difference constraint of 200 min;
3. the linear correlation between IASI derived and rawinsonde derived indices was calculated for each site
over the selected time period.
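Steps 1 and 2 of the procedure above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the retrieved profiles of one overpass share the same vertical grid so that they can be averaged level by level.

```python
from datetime import datetime, timedelta
import numpy as np

def mean_profile(profiles):
    """Level-by-level mean of all successful retrievals of one overpass."""
    return np.mean(np.asarray(profiles, dtype=float), axis=0)

def closest_sonde(overpass_time, sonde_times, max_gap=timedelta(minutes=200)):
    """Launch time of the closest rawinsonde (before or after the overpass),
    or None if no launch lies within max_gap."""
    if not sonde_times:
        return None
    best = min(sonde_times, key=lambda t: abs(t - overpass_time))
    return best if abs(best - overpass_time) <= max_gap else None
```

For example, an overpass at 09:30 UTC is 210 min from a 06:00 UTC launch and 150 min from a 12:00 UTC launch, so only the later sounding satisfies the 200 min constraint.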
Figure 3.1: Table of correlation, from [10], between instability indices calculated from the same rawinsonde
sampled at different vertical resolutions: soundings at full vertical resolution compared with soundings
reduced to the TEMP format (WMO code). Values highlighted in green refer to the indices which are
independent of Lifted Parcel Theory. Grey and magenta indices are respectively partially and highly dependent
on the Lifted Parcel Theory.
Index    Group    Units   R
CAPE     magenta  J/kg    .44
CIN      magenta  J/kg    .51
UpDr     magenta  m/s     .45
LI       grey     °C      .81
ShowI    grey     °C      .80
DTC      grey     °C      .60
DTC500   grey     °C      .79
LFC      grey     m       .26
LCL      grey     m       .61
Tbase    grey     °C      .86
MaxBuo   grey     °C      .58
KI       green    °C      .88
CAP      green    °C      .22
MRH      green    %       .78
LRH      green    %       .67
PWE      green    mm      .93
Θe       green    K       .94
Swiss    N/A      −       N/A
Table 3.2: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the
individual indices can be found in [15].
3.2.1 Udine Campoformido
Available data over Udine Campoformido include 2069 rawinsondes (as described in technical report #1,
ref. PA/IIS/TR01/2010/01) for the time period between July 2007 and December 2009, and 702 spectrally
successful retrievals associated with 217 METOP-A overpasses. Linear correlation coefficients calculated are
listed in tbl. 3.2.
3.2.2 Pratica di Mare
Available data over Pratica di Mare include 1757 rawinsondes (as described in technical report #1, ref.
PA/IIS/TR01/2010/01), and 1328 spectrally successful retrievals associated with 397 METOP-A overpasses.
Linear correlation coefficients calculated are listed in tbl. 3.3.
3.2.3 Cagliari
Available data over Cagliari include 1858 rawinsondes (as described in technical report #1, ref.
PA/IIS/TR01/2010/01), and 922 spectrally successful retrievals associated with 348 METOP-A overpasses.
Linear correlation coefficients calculated are listed in tbl. 3.4.
Index    Group    Units   R
CAPE     magenta  J/kg    .27
CIN      magenta  J/kg    .29
UpDr     magenta  m/s     .31
LI       grey     °C      .49
ShowI    grey     °C      .57
DTC      grey     °C      .45
DTC500   grey     °C      .57
LFC      grey     m       .47
LCL      grey     m       .37
Tbase    grey     °C      .70
MaxBuo   grey     °C      .40
KI       green    °C      .72
CAP      green    °C      .51
MRH      green    %       .74
LRH      green    %       .74
PWE      green    mm      .86
Θe       green    K       .91
Swiss    N/A      −       N/A
Table 3.3: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the
individual indices can be found in [15].
Index    Group    Units   R
CAPE     magenta  J/kg    .30
CIN      magenta  J/kg    .32
UpDr     magenta  m/s     .32
LI       grey     °C      .35
ShowI    grey     °C      .55
DTC      grey     °C      .48
DTC500   grey     °C      .59
LFC      grey     m       .45
LCL      grey     m       .25
Tbase    grey     °C      .58
MaxBuo   grey     °C      .40
KI       green    °C      .69
CAP      green    °C      .56
MRH      green    %       .80
LRH      green    %       .78
PWE      green    mm      .82
Θe       green    K       .88
Swiss    N/A      −       N/A
Table 3.4: List of Instability Indices derived from rawinsonde and retrievals. Detailed description of the
individual indices can be found in [15].
3.3 Analysis of Results
A first look at the results indicates that the green indices calculated from rawinsondes and averaged
retrievals show, as expected, higher correlation than grey and magenta indices, with the exception of CAP
(green group), which relies on the very high vertical resolution details of the lowest 400 hPa and shows a low
correlation.
In addition, results over Udine Campoformido show higher values of correlation than the results over
Pratica di Mare and Cagliari. This was expected from the discussion of the retrieval validation included in
technical report #3 (ref. PA/IIS/TR03/2010/01), which showed how retrievals over coastal areas were very
likely more contaminated by clouds, and also how the averaging of continental/ocean profiles in regimes
of breezes might not be an optimal approach.
Besides the presentation and analysis of the absolute values of the correlations, it was considered important
to relate the results obtained to the correlation existing between indices derived from forecast profiles (at
different times) and those derived from rawinsondes, as described in [10]. The fundamental reason for
this comparison is that the rawinsonde is generally considered more useful in operational
now-casting activities than in longer range forecasting activities, and for this reason operational
forecasters are more inclined to use instability indices derived from model forecasts than from rawinsondes.
Relating the results to what is used operationally provides a practical quality assessment criterion.
3.3.1
Forecast derived indices
The comparison of indices derived from forecast and from rawinsonde was presented by Manzato [10] and
it is shown in figure 3.2, where columns show the change in correlation (in terms of R2 ) between the
indices derived from the rawinsondes and the indices derived from forecast at different times (from 24 hrs
to 132 hrs in advance) for Udine Campoformido. Colored oval shapes show the results obtained for the
correlation between indices derived from rawinsondes and from IASI retrievals over the same site.
However, before proceeding any further with the comparison it is worth noting that the correlations
obtained over Udine Campoformido, Pratica di Mare, and Cagliari and described in sections 3.2.1, 3.2.2,
and 3.2.3, are affected by two factors: 1) the limited vertical resolution of the retrievals compared to the
vertical resolution of the rawinsondes; and 2) the time difference between the satellite overpass and the
rawinsonde launch, which on average is 116 min. While the first factor is the potentially limiting
factor in the use of IASI retrievals for instability forecasting, and was addressed in section 3.1.3, the second
factor is a spurious error due only to the time difference, and it has to be taken out of consideration
for a proper comparison of rawinsonde-retrieval and rawinsonde-forecast index correlations (see following
section).
3.3.2 Time dependence of instability indices
In order to define an upper bound of the time variability of the instability indices, Sound_Analys.py was applied
to a set of consecutive rawinsondes launched within 6 hrs, at 06:00 and 12:00 UTC. Correlations found
Figure 3.2: Table of correlation (in terms of R²) between instability indices calculated from forecasts
for different times (columns) and from rawinsondes (from [10]). Colored oval shapes show the results of the
correlation between indices derived from IASI retrievals and from rawinsondes over Udine Campoformido.
Values highlighted in green refer to the indices which are independent of Lifted Parcel Theory. Grey and
magenta indices are respectively partially and highly dependent on the Lifted Parcel Theory.
Index    Group    Units   R
CAPE     magenta  J/kg    .67
CIN      magenta  J/kg    .70
UpDr     magenta  m/s     .75
LI       grey     °C      .91
ShowI    grey     °C      .89
DTC      grey     °C      .81
DTC500   grey     °C      .87
LFC      grey     m       .70
LCL      grey     m       .74
Tbase    grey     °C      .90
MaxBuo   grey     °C      .79
KI       green    °C      .87
CAP      green    °C      .39
MRH      green    %       .81
LRH      green    %       .86
PWE      green    mm      .95
Θe       green    K       .97
Swiss    N/A      −       N/A
Table 3.5: List of Instability Indices derived from 267 pairs of consecutive rawinsondes launched at 06:00
and 12:00 UTC over Udine Campoformido. Detailed description of the individual indices can be found
in [15].
are listed in table 3.5 and shown in figure 3.3, in which red rectangles indicating the correlation values
found for consecutive rawinsondes are superimposed on the values displayed in figure 3.2. Considering the
errors due to the different vertical resolutions of retrievals and rawinsondes (as shown in figure 3.1) and
the errors due to time differences between satellite overpasses and rawinsonde launches of 116 min on
average (whose upper bounds are shown in figure 3.3 for a 360 min delay), it becomes clear that for several
indices a large fraction of the correlation degradation between retrievals and rawinsondes is actually due
to the time differences.
3.4 Conclusions
The study described by this technical report indicates that instability indices independent of, or weakly
dependent on, Lifted Parcel Theory tend to correlate well between retrievals and rawinsondes. Correlation
values, once the time difference has been accounted for, are comparable with the correlations found between
indices derived from rawinsondes and forecast profiles. It is important to stress that:
• at the report submission time, the retrievals used were labeled as spectrally successful but were
not quality controlled for potential cloud contamination, therefore the correlations over Udine were found
to be higher than those over Pratica di Mare and Cagliari (as described in technical report #3, ref.
PA/IIS/TR03/2010/01);
Figure 3.3: Table of correlation (in terms of R²) between instability indices calculated from forecasts
for different times (columns) and from rawinsondes (from [10]). Colored oval shapes show the results of the
correlation between indices derived from IASI retrievals and from rawinsondes over Udine Campoformido.
Values highlighted in green refer to the indices which are independent of Lifted Parcel Theory. Grey and
magenta indices are respectively partially and highly dependent on the Lifted Parcel Theory. Red rectangles
show the correlation between indices derived for a set of consecutive rawinsondes launched within 6 hrs.
• results over the three sites could be improved by performing a suitable cloud quality control;
• the availability of additional rawinsondes coincident in time and space with IASI observations would greatly
improve the significance of the presented work.
Overall the results obtained are encouraging for defining a procedure to operationally use the IASI retrievals
in the derivation of instability indices useful for forecasting and nowcasting of instability activity.
Chapter 4
Technical Report 3: Validation of baseline
Retrieval with rawinsondes
Document: Technical Report 3
Written by: Paolo Antonelli
Date: 30 October 2010
Reference: PA/IIS/TR03/2010/01
4.1 Inversion with UWPHYSRET
UWPHYSRET is a research tool built on a MATLAB implementation of a Bayesian retrieval system. It allows
for the retrieval of atmospheric parameters from high spectral resolution infrared observations. The package is
based on LBLRTM (version 11.7) and Optimal Spectral Sampling (OSS) for the computation of simulated
radiances and Jacobians. The system works with IASI observations; however, the modular nature of the
package makes it suitable for extension to other current and future high spectral resolution instruments
such as AIRS and MTG-IRS. It allows for simultaneous retrievals of vertical profiles of temperature,
water vapor mixing ratio, carbon dioxide and ozone, and of surface temperature and emissivity.
4.1.1
IASI observations used in retrieval
IASI L1C data were collected over an area of 1x1 degree around Pratica di Mare, Udine Campoformido,
and Cagliari, Italy. Before inversion data were thinned using the MAIA cloud mask. Only observations
labeled more than 98% clear were retained. Observations over Udine Campoformido and Cagliari, were
compressed and reconstructed according to the procedure described in PA/ScP/2010/04 Annex 1 using
Principal Components derived at EUMETSAT.
4.1.2 A-priori Covariance: in-situ observations used for climatology
Site dedicated a-priori covariances were obtained using available rawinsondes, observed prior to June 2007 in
Pratica di Mare, Italy. Rawinsondes were provided by the Centro Nazionale per la Meteorologia e la
Climatologia Aeronautica (CNMCA) of the Italian Air Force. Observed vertical profiles were extrapolated up
to 0.1 hPa using climatological profiles, and were quality controlled for saturation and/or missing values.
Ozone and carbon dioxide profiles were generated by random perturbations of climatological profiles. Surface
temperature was randomly generated from the lowest level temperature, constrained by surface type, time of
day, and latitude. The climatological covariance matrix was regularized using Singular Value Decomposition.
For this experiment the first guess emissivity was derived from the Dan Zhou emissivity atlas. Retrieval of surface
emissivity was enabled.
4.1.3
Error Covariance
Error covariance matrix for the 4021 channels, between 670 and 2200 cm−1 used in the inversion, was
obtained by increasing the IASI nominal noise as provided by CNES by 30% to account for forward model
errors. The error covariance matrix was considered purely diagonal.
4.1.4
Forward Model
Optimal Spectral Sampling forward model (from Atmospheric & Environmental Research, Inc.) was used
in this experiment to allow for faster computation of the retrievals.
4.1.5
Minimization Scheme
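Minimization scheme used to derive retrievals was based on the Levenberg–Marquardt approach:
$$x_{i+1} = x_i + \left[ (1+\gamma)\, S_a^{-1} + K_i^T S_\epsilon^{-1} K_i \right]^{-1} \left\{ K_i^T S_\epsilon^{-1} \left[ y - F(x_i) \right] - S_a^{-1} \left[ x_i - x_a \right] \right\} \tag{4.1}$$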
where x_i is the state vector at iteration i, x_a is the a-priori state, S_a is the a-priori covariance described in
sec. 4.1.2, K_i is the Jacobian of the forward model F evaluated at x_i, S_ε is the observation error covariance
matrix described in sec. 4.1.3, and γ is a variable damping parameter. The starting value was set to γ = 0.75;
γ was increased by a factor of 1.5 for a χ² ratio > 0.75 and decreased by a factor of 2 for a χ² ratio < 0.25, where
$$\chi^2 = \left[ y - F(x_i) \right]^T S_\epsilon^{-1} \left[ y - F(x_i) \right] \tag{4.2}$$
and y and F(x_i) are respectively the observed and simulated radiances.
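A single iteration of the scheme in eq. 4.1, together with the cost of eq. 4.2, can be sketched in NumPy as shown below. This is an illustration and not the UWPHYSRET (MATLAB) code: the function names and argument order are assumptions, the forward model F and its Jacobian K are supplied by the caller, and the update rule for γ is left out because the definition of the χ² ratio is not given here.

```python
import numpy as np

def lm_step(x_i, x_a, y, F, K, Sa_inv, Se_inv, gamma):
    """One Levenberg-Marquardt update of the state vector (eq. 4.1).

    K is the Jacobian of F evaluated at x_i; Sa_inv and Se_inv are the inverse
    a-priori and observation error covariance matrices.
    """
    lhs = (1.0 + gamma) * Sa_inv + K.T @ Se_inv @ K
    rhs = K.T @ Se_inv @ (y - F(x_i)) - Sa_inv @ (x_i - x_a)
    # solve the linear system instead of forming the explicit inverse
    return x_i + np.linalg.solve(lhs, rhs)

def chi2(x_i, y, F, Se_inv):
    """Radiance-space cost of eq. 4.2."""
    r = y - F(x_i)
    return float(r @ Se_inv @ r)
```

For a linear forward model and γ = 0, one step started from the a-priori state reaches the maximum a-posteriori solution; a larger γ shortens the step, which is the purpose of the damping.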
4.1.6
Convergence Test
Convergence test used to stop iterative retrieval process was done, as suggested by C. Rodgers [17], in the
atmospheric profile space (rate of change in retrieved profile):
$$d_i^2 = (x_i - x_{i+1})^T \hat{S}^{-1} (x_i - x_{i+1}) \ll n \tag{4.3}$$
Figure 4.1: Temperature retrieval error calculated according to Rodgers [17].
where close to convergence:
$$\hat{S} = \left( S_a^{-1} + K_i^T S_\epsilon^{-1} K_i \right)^{-1} \tag{4.4}$$
and n is the number of elements of the state vector.
4.1.7
Retrieval Errors
Retrieval total error was estimated according to Rodgers [17]:
$$\hat{S} = \left( K^T S_\epsilon^{-1} K + S_a^{-1} \right)^{-1} \tag{4.5}$$
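while its smoothing and measurement components were estimated according to the following equations:
$$S_s = \left( K^T S_\epsilon^{-1} K + S_a^{-1} \right)^{-1} S_a^{-1} \left( K^T S_\epsilon^{-1} K + S_a^{-1} \right)^{-1} \tag{4.6}$$
$$S_m = \left( K^T S_\epsilon^{-1} K + S_a^{-1} \right)^{-1} K^T S_\epsilon^{-1} K \left( K^T S_\epsilon^{-1} K + S_a^{-1} \right)^{-1} \tag{4.7}$$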
and are shown in figures 4.1, and 4.2.
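The total retrieval error covariance and its two components can be evaluated as in the sketch below, which follows Rodgers' decomposition (smoothing error Ŝ S_a⁻¹ Ŝ, measurement error Ŝ Kᵀ S_ε⁻¹ K Ŝ). The function name is hypothetical and explicit matrix inverses are used only for readability.

```python
import numpy as np

def retrieval_errors(K, Sa, Se):
    """Return (S_hat, S_s, S_m): total, smoothing and measurement error covariances."""
    Sa_inv = np.linalg.inv(Sa)
    Se_inv = np.linalg.inv(Se)
    info = K.T @ Se_inv @ K               # information brought by the measurement
    S_hat = np.linalg.inv(info + Sa_inv)  # total retrieval error covariance
    S_s = S_hat @ Sa_inv @ S_hat          # smoothing error component
    S_m = S_hat @ info @ S_hat            # measurement error component
    return S_hat, S_s, S_m
```

A useful consistency check is that the two components add up to the total, S_s + S_m = Ŝ, which follows directly from the definitions.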
Figure 4.2: Water Vapor retrieval error calculated according to Rodgers [17].
4.2
Validation strategy
In order to guarantee the accuracy of the results, the retrievals obtained with UWPHYSRET were first
tested using observed radiances (spectral validation). Only retrievals which passed the spectral test were
considered successful and useful for environmental validation.
4.2.1
Spectral Validation
Retrievals on available IASI observations were validated spectrally through the retrieval residuals,
calculated by subtracting the radiances simulated from the retrieved profiles (using the OSS forward model) from
the observed radiances reconstructed after PCA compression. In the comparison the reconstructed radiances
were preferred to the original observations because of the noise filtering properties of PCA compression [7].
The residuals obtained were averaged in 5 spectral regions and were compared (in terms of Brightness
Temperature) to the mean observation error (in BT) used in the retrieval process and described in sec.
4.1.3. The 5 spectral regions are indicated in tbl. 4.1 and were selected to guarantee that observations
were properly fit in the 14 µm carbon dioxide band (accuracy of the vertical profile of temperature), in the
9.7 µm ozone band (accuracy of the ozone vertical distribution), in the 6.7 µm water vapor band (accuracy
of the vertical distribution of water vapor), in the 12 µm window (accuracy of surface emissivity and surface
temperature), and in the 11 µm band (accuracy of surface emissivity and surface temperature, presence of a
high load of aerosols and/or presence of thin cirrus clouds). Only retrievals with average residuals ± one
standard deviation smaller than the average observation error in all 5 bands were considered successful.
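The acceptance test described above can be sketched as follows, using the band limits of tbl. 4.1. The reading of "average residual ± one standard deviation smaller than the average observation error" as |mean| + std < mean error is an assumption, as are the function name and the half-open band intervals.

```python
import numpy as np

# Band limits in cm^-1, from tbl. 4.1
BANDS = {
    "CO2": (670.0, 775.0),
    "Win": (775.0, 990.0),
    "O3": (990.0, 1070.0),
    "Win 2": (1090.0, 1120.0),
    "H2O": (1240.0, 2100.0),
}

def passes_spectral_test(wavenumber, residual_bt, obs_error_bt, bands=BANDS):
    """True if, in every band, |mean residual| + std is below the mean observation error.

    wavenumber: channel wavenumbers in cm^-1
    residual_bt: observed minus simulated brightness temperature (K)
    obs_error_bt: observation error in brightness temperature (K)
    """
    wn = np.asarray(wavenumber, dtype=float)
    res = np.asarray(residual_bt, dtype=float)
    err = np.asarray(obs_error_bt, dtype=float)
    for lo, hi in bands.values():
        m = (wn >= lo) & (wn < hi)
        if not m.any():
            return False  # no channels in this band: the test cannot be passed
        if abs(res[m].mean()) + res[m].std() >= err[m].mean():
            return False
    return True
```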
                 CO2        Win        O3          Win 2        H2O          Total
Range in cm⁻¹    670-775    775-990    990-1070    1090-1120    1240-2100    670-2200
Table 4.1: Spectral regions used to validate retrieval residuals.
4.2.2
Environmental Validation
Spectral validation was needed to verify that retrieved profiles indeed produced synthetic radiances whose
distance from real observations was within the observation error. However, the spectral validation tests do
not guarantee that the retrieved profiles are always accurate and/or realistic. For example, in the presence
of thin cirrus clouds, the inversion system is often capable of retrieving atmospheric state variables
associated with spectral residuals smaller than the observation noise; however, the effect of the cirrus cloud on
the observed radiances, not modeled in the forward model calculation, is spread over the retrieved variables,
causing errors of several K in temperature and of several g/kg in water vapor mixing ratio. For
this reason, before using the level 2 products to generate level 3 products (instability indices) it is important to
combine the spectral validation with an environmental validation. The remaining part of this document
describes the procedure used to perform the environmental validation and the results achieved.
4.2.2.1 In situ observations used for environmental validation
At each site, retrievals were compared with rawinsonde profiles observed between July 2007 and December
2009 and obtained from CNMCA. Profiles were extrapolated up to 0.1 hPa and were quality controlled
for saturation and/or missing values. Original rawinsonde observations at high vertical resolution (with
single profile measurements made every 2 s) were pressure averaged per layer according to the following
equation:
$$X_l = \frac{\sum_{i=1}^{N} \frac{X_i + X_{i-1}}{2}\,(P_i - P_{i-1})}{P_{low} - P_{high}} \qquad (4.8)$$
where X_l is the atmospheric parameter to be averaged (T, WV) in the layer l, i is the index of the N
sublevels that divide the layer l, and P_low and P_high are the pressure extremes of the layer under
consideration. The surface temperature associated with the profile was randomly generated from the lowest
level temperature, constrained by surface type, time of day, and latitude.
Retrievals labeled as successful by the spectral validation were compared to the available pressure averaged
rawinsondes. The comparison was done to ensure that the differences were within the ranges expected from
time and space differences and were not caused by other factors.
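The layer averaging of eq. 4.8 amounts to a trapezoidal integration in pressure divided by the layer thickness. A minimal sketch is given below; the function name is hypothetical, and it assumes that the sublevels passed in span exactly one layer, so that the first and last pressures are the layer extremes.

```python
import numpy as np


def layer_average(p, x):
    """Pressure-weighted mean of x over one layer, following eq. 4.8.

    p, x: values at the N+1 sublevel boundaries of the layer, ordered
    monotonically in pressure. Assumption: p[0] and p[-1] are the two
    pressure extremes of the layer.
    """
    p = np.asarray(p, dtype=float)
    x = np.asarray(x, dtype=float)
    mid = 0.5 * (x[1:] + x[:-1])   # (X_i + X_{i-1}) / 2
    dp = p[1:] - p[:-1]            # P_i - P_{i-1}
    return float(np.sum(mid * dp) / (p[-1] - p[0]))
```

Because numerator and denominator change sign together, the result does not depend on whether the sublevels are listed from the top down or from the bottom up.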
4.2.2.2 Statistical quantities used to characterize environmental validation
The mean of the differences between the M rawinsonde variables, Y, and the M retrieved variables, X, at
each level i:
$$\bar{\delta}_i = \frac{\sum_{j=1}^{M} (X_{i,j} - Y_{i,j})}{M} \qquad (4.9)$$
and their standard deviation:
$$\hat{\delta}_i = \sqrt{\frac{\sum_{j=1}^{M} \left[ (X_{i,j} - Y_{i,j}) - \overline{(X_{i,j} - Y_{i,j})} \right]^2}{M}} \qquad (4.10)$$
were calculated for each inverted spectrum. X and Y can be either temperature or water vapor mixing
ratio. The statistical quantities in eq. 4.9 and 4.10 provide an indication of how the retrieved profiles
compare to observations made by rawinsondes, generally launched within 3 hrs and 50 km from the satellite overpass.
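As an illustration of eq. 4.9 and 4.10, the sketch below computes the mean and the standard deviation of the retrieval minus rawinsonde differences at each level. The function name and the array layout (levels along the first axis, the M matched pairs along the second) are assumptions made for this example.

```python
import numpy as np


def difference_stats(retrieved, sonde):
    """Per-level mean (eq. 4.9) and standard deviation (eq. 4.10) of X - Y.

    retrieved, sonde: arrays of shape (n_levels, M), where column j holds the
    j-th retrieval and its matched rawinsonde (layout assumed for this sketch).
    """
    diff = np.asarray(retrieved, dtype=float) - np.asarray(sonde, dtype=float)
    mean = diff.mean(axis=1)
    # Population standard deviation: division by M, as in eq. 4.10.
    std = np.sqrt(((diff - mean[:, None]) ** 2).mean(axis=1))
    return mean, std
```

With this sign convention a negative mean indicates that the retrieval is, on average, colder (or drier) than the rawinsonde at that level.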
4.3 Results
This section describes the results obtained by comparing the rawinsondes, pressure averaged according to
the procedure outlined in section 4.2.2.1, with the retrieved profiles that passed the spectral validation test.
4.3.1 Udine Campoformido
Using the MAIA cloud mask, 2519 spectra were labeled as clear sky over an area of 1x1 degree centered
on Udine Campoformido (lat: 46.03N, lon: 13.18E) for the time period July 2007 - December
2009. After inversion of the PCA noise filtered radiances, 702 retrievals passed the spectral test described
in section 4.2.1. Of these, 236 retrievals (33.6% of the total number of spectrally successful retrievals)
were obtained from the morning orbit (AM), while 466 (66.4%) were associated with the
afternoon overpasses (PM). Mean and standard deviation of the distance between
the retrieved profiles and the closest rawinsonde launched after the satellite overpass (in any case within
200 min) are shown in figures 4.3, 4.4, 4.5, and 4.6. It is worth noting that:
1. the mean temperature distance at the surface for AM overpasses (blue line) was found to have the
expected negative sign, since δ̄, defined in eq. 4.9, is the difference between the retrieval and the
rawinsonde, and the retrieval was obtained, on average, 116 min before the sonde launch; its
magnitude (∼ 1 K) is smaller than the PM one (∼ 3 K), figure 4.3;
2. large mean temperature deviations (∼ 2 K) were found around the tropopause, figure 4.3;
3. results for AM overpasses were characterized by a large mean deviation in water vapor mixing ratio
(∼ 1.5 g/kg), figure 4.4;
4. the standard deviation of temperature was found to be large at the tropopause and at the surface, with
AM overpasses having larger values, figure 4.5.
None of the mentioned points seemed to indicate pathological behavior of the retrieval.
Figure 4.3: Udine Campoformido: mean temperature distance between retrieved profiles and closest
rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.4: Udine Campoformido: mean water vapor mixing ratio distance between retrieved profiles
and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.5: Udine Campoformido: standard deviation of temperature distance between retrieved profiles
and closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line
represents the distance for PM overpasses, and the black dashed line the mean over the whole set of
retrievals.
Figure 4.6: Udine Campoformido: standard deviation of water vapor mixing ratio distance between
retrieved profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red
line represents the distance for PM overpasses, and the black dashed line the mean over the whole set of
retrievals.
Figure 4.7: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde.
Blue line represents the mean distance for morning (AM) overpasses. Red line represents the distance for
PM overpasses, and the black dashed line the mean over the whole set of retrievals.
4.3.2 Pratica di Mare
Using the MAIA cloud mask, 4489 spectra were labeled as clear sky over an area of 1x1 degree centered
on Pratica di Mare (lat: 41.65N, lon: 12.43E) for the time period July 2007 - December 2009. After
inversion, 1328 retrievals passed the spectral test described in section 4.2.1. Of these, 430 retrievals
(32.4% of the total number of spectrally successful retrievals) were obtained from the morning orbit (AM),
while 898 (67.6%) were associated with the afternoon overpasses (PM).
Mean and standard deviation of the distance between the retrieved profiles and the closest rawinsonde
launched after the satellite overpass (in any case within 200 min) are shown in figures 4.7, 4.8, 4.9, and 4.10.
Results over Pratica di Mare showed some inconsistencies:
1. the mean temperature distance at the surface for AM overpasses (blue line) was found to have an
unexpected positive sign. As in the Udine case, δ̄, defined in eq. 4.9, represents the difference between
the retrieval and the rawinsonde, with the retrieval obtained, on average, 116 min before the sonde
launch. Positive values were expected for PM overpasses but not for the AM ones, as shown in
figure 4.7;
2. AM overpass results were characterized by a sharp peak in the mean deviation in water vapor
mixing ratio (∼ 2.5 g/kg) between 850 and 900 hPa, figure 4.8.
Both these results were unexpected and seemed to indicate problems with the retrievals.
Figure 4.8: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and closest
rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for
PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.9: Pratica di Mare: standard deviation of temperature distance between retrieved profiles and
closest rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line
represents the distance for PM overpasses, and the black dashed line the mean over the whole set of
retrievals.
Figure 4.10: Pratica di Mare: standard deviation of water vapor mixing ratio distance between retrieved
profiles and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents
the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
4.3.3 Cagliari
Using the MAIA cloud mask, 6314 spectra were labeled as clear sky over an area of 1x1 degree centered
on Cagliari (lat: 39.25N, lon: 9.05E) for the time period July 2007 - December 2009. After inversion,
1262 retrievals passed the spectral test described in section 4.2.1. Of these, 340 retrievals (26.9% of the
total number of spectrally successful retrievals) were obtained from the morning orbit (AM),
while 922 (73.1%) were associated with the afternoon overpasses (PM). Mean
and standard deviation of the distance between the retrieved profiles and the closest rawinsonde launched
after the satellite overpass (in any case within 200 min) are shown in figures 4.11, 4.12, 4.13, and 4.14.
Results over Cagliari showed the same inconsistencies that were found for Pratica di Mare:
1. the mean temperature distance at the surface for AM overpasses (blue line) has the unexpected positive
sign. As in the Udine case, δ̄, defined in eq. 4.9, represents the difference between the retrieval and
the rawinsonde, with the retrieval obtained, on average, 116 min before the sonde launch. Positive
values were expected for PM overpasses but not for the AM ones, as shown in figure 4.11;
2. AM overpasses are characterized by a sharp peak in the mean deviation in water vapor mixing ratio
(∼ 2.5 g/kg) between 850 and 900 hPa, figure 4.12.
Both these results were unexpected and seemed to indicate problems with the retrievals.
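The counts quoted for the three sites can be cross-checked with a few lines of arithmetic. The sketch below recomputes the AM and PM percentages from the numbers given in sections 4.3.1 to 4.3.3; the fraction of clear sky spectra that passed the spectral test is derived here for illustration and is not quoted in the report. The dictionary and function names are hypothetical.

```python
# Counts quoted in sections 4.3.1-4.3.3:
# (clear sky spectra, spectrally successful retrievals, AM, PM)
SITES = {
    "Udine Campoformido": (2519, 702, 236, 466),
    "Pratica di Mare": (4489, 1328, 430, 898),
    "Cagliari": (6314, 1262, 340, 922),
}


def site_fractions(clear, passed, am, pm):
    """Percentage of clear sky spectra that passed, and AM/PM shares of those."""
    if am + pm != passed:
        raise ValueError("AM and PM counts do not add up to the total")
    return {
        "passed_pct": round(100.0 * passed / clear, 1),
        "am_pct": round(100.0 * am / passed, 1),
        "pm_pct": round(100.0 * pm / passed, 1),
    }


for name, counts in SITES.items():
    print(name, site_fractions(*counts))
```

The AM and PM shares reproduce the percentages stated in the text for all three sites.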
Figure 4.11: Cagliari: mean temperature distance between retrieved profiles and closest rawinsonde. Blue
line represents the mean distance for morning (AM) overpasses. Red line represents the distance for PM
overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.12: Cagliari: mean water vapor mixing ratio distance between retrieved profiles and closest
rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the distance for
PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.13: Cagliari: standard deviation of temperature distance between retrieved profiles and closest
rawinsonde. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.14: Cagliari: standard deviation of water vapor mixing ratio distance between retrieved profiles
and closest rawinsonde. Blue line represents the mean distance for AM overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.15: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde
over water. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
4.4 Analysis of Results
To further investigate the inconsistencies in the results obtained over Pratica di Mare and Cagliari (both
coastal regions), the means of the distances for temperature and water vapor were calculated separately for
Fields of View (FOVs) with more and with less than 50% of water surface. Figures 4.15 and 4.16 show δ̄_T and
δ̄_MR over water for Pratica di Mare, while figures 4.17 and 4.18 show the same quantities over land. Positive
values at the surface for the AM overpasses were found in both cases, but the effect was found to be much
larger over land (with discrepancies of 5 – 6 K and 4 g/kg) than over water (with discrepancies of 1 K and
2.5 g/kg).
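The separation into water and land FOVs can be sketched as below. The function name and the array layout are hypothetical; the 50% threshold comes from the text, while the assignment of FOVs with exactly 50% water to the land group is an assumption of this example.

```python
import numpy as np


def mean_difference_by_surface(water_fraction, differences, threshold=50.0):
    """Mean retrieval minus rawinsonde difference per level, for water and land FOVs.

    water_fraction: percentage of water surface in each FOV, shape (n_fov,).
    differences: retrieval minus rawinsonde differences, shape (n_fov, n_levels).
    FOVs with exactly `threshold` percent water are counted as land (assumption).
    """
    water_fraction = np.asarray(water_fraction, dtype=float)
    differences = np.asarray(differences, dtype=float)
    over_water = water_fraction > threshold
    return {
        "water": differences[over_water].mean(axis=0),
        "land": differences[~over_water].mean(axis=0),
    }
```

If one of the two groups is empty, NumPy returns NaN for that group (with a runtime warning), so a production version would need to check the group sizes first.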
In order to verify whether the positive values at the surface are due to climatological effects,
all the rawinsondes available over Pratica di Mare at 06:00 and 18:00 UTC were compared to the
rawinsondes launched 6 hrs (360 min) later. The comparison was performed for the whole period between
July 2007 and December 2009. The climatological signal for AM overpasses was found to have large (5 K)
differences in temperature close to the surface (figure 4.19) and smaller (2 K) differences for PM overpasses.
Water vapor mixing ratio differences did not show any significant signal between 850 and 900 hPa, and
differences in proximity of the surface were larger for PM overpasses than for AM ones (figure 4.20).
By comparing the climatological rawinsonde signal in figure 4.19 to the values of δ̄_T and δ̄_MR obtained
Figure 4.16: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and
closest rawinsonde over water. Blue line represents the mean distance for AM overpasses. Red line represents
the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.17: Pratica di Mare: mean temperature distance between retrieved profiles and closest rawinsonde
over land. Blue line represents the mean distance for morning (AM) overpasses. Red line represents the
distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.18: Pratica di Mare: mean water vapor mixing ratio distance between retrieved profiles and
closest rawinsonde over land. Blue line represents the mean distance for AM overpasses. Red line represents
the distance for PM overpasses, and the black dashed line the mean over the whole set of retrievals.
Figure 4.19: Pratica di Mare: mean temperature distance between consecutive rawinsondes. Blue and red
lines represent the mean distance for AM and PM overpasses respectively.
Figure 4.20: Pratica di Mare: mean water vapor mixing ratio distance between consecutive rawinsondes.
Blue and red lines represent the mean distance for AM and PM overpasses respectively.
over Pratica di Mare and shown in figure 4.7, it was found that PM overpasses were in good agreement with
the climatological rawinsonde signal, while AM overpasses showed large discrepancies. It is worth emphasizing
that:
• the rawinsonde signal is associated with a 6 hrs time difference, 3 times larger than the time difference
between satellite overpass and rawinsonde launch; therefore the magnitude of the climatological signal
is expected to be larger than the retrieval-rawinsonde differences;
• the climatological signal extracted from rawinsondes is associated with land-only differences, while the
retrieval-rawinsonde differences were calculated over both land and ocean, with the ocean thermal
capacity being quite different from the land one.
Given that the PM differences were found to be realistic and close to what could be expected from
climatology, it was unlikely that the suspicious AM differences were caused by wrong surface emissivity or by
pathologies of the a-priori information (which would also affect the PM cases). One possible explanation was
identified in the higher probability of cloud contamination (especially from thin cirrus clouds) for the AM
overpasses and over land. To test this hypothesis, the retrieved profiles were screened for potential saturation
simply by retaining all the profiles that had relative humidity (calculated over water) smaller than altitude
dependent threshold values (empirically chosen to prove the concept). Figure 4.22 shows how neglecting the
retrievals with potential saturation causes a shift in temperature towards climatological (negative) values
close to the surface. The same effect was evident in the histograms of the temperature
differences calculated over the whole set and over the RH screened set, at 986 and 1014 hPa, shown in
figure 4.23. While the number of “good” retrievals decreased by about 1/3, and while all the AM retrievals
over land were labeled as contaminated by clouds, the distribution of the temperature differences
near the surface was found to move correctly towards negative values.
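The RH based screening can be illustrated with the sketch below. It is not the code used in the study: the function names are hypothetical, the saturation vapor pressure over water is computed with the Bolton (1980) approximation as one possible choice, and the altitude dependent thresholds (which the report does not list) must be supplied by the caller.

```python
import numpy as np


def relative_humidity(p_hpa, t_kelvin, w_gkg):
    """Relative humidity (%) over liquid water at each level.

    p_hpa: pressure in hPa; t_kelvin: temperature in K;
    w_gkg: water vapor mixing ratio in g/kg.
    """
    t_c = np.asarray(t_kelvin, dtype=float) - 273.15
    # Saturation vapor pressure over water in hPa (Bolton 1980 approximation).
    e_sat = 6.112 * np.exp(17.67 * t_c / (t_c + 243.5))
    w = np.asarray(w_gkg, dtype=float) / 1000.0          # kg/kg
    e = w * np.asarray(p_hpa, dtype=float) / (0.622 + w)  # vapor pressure, hPa
    return 100.0 * e / e_sat


def is_unsaturated(p_hpa, t_kelvin, w_gkg, rh_max):
    """True if RH stays below the level dependent threshold rh_max at every level.

    rh_max is a hypothetical array of thresholds in percent, one per level.
    """
    rh = relative_humidity(p_hpa, t_kelvin, w_gkg)
    return bool(np.all(rh < np.asarray(rh_max, dtype=float)))
```

Profiles for which `is_unsaturated` returns False would be discarded as potentially cloud contaminated before computing the difference statistics.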
Figure 4.21: Pratica di Mare: mean temperature distance between retrievals and rawinsondes: blue and red
solid lines represent the mean distance for AM and PM overpasses respectively. Mean temperature distance
between consecutive rawinsondes is shown by blue and red dashed lines for AM and PM overpasses
respectively.
However, while the RH based correction takes care of the issues near the surface, the problems between
800 and 900 hPa were found to persist both in temperature (figure 4.22) and in water vapor
mixing ratio (figure 4.24). It is possible that better thresholds have to be determined; however, the problem
might also be related to instability of the retrievals due to improper use of the a-priori covariance matrix.
UWPHYSRET, in its current form, retrieves all the variables, and the high level of correlation between
different levels could be a source of instability when calculating the inverse of the a-priori matrix (used in
the iterative equation 4.1). For this reason Principal Component Analysis should be applied to the profiles
before inversion, and the equation should be applied to a lower dimensional (compressed) state vector.
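The effect of the proposed compression on the conditioning of the a-priori covariance can be illustrated with a small numerical experiment. The covariance below is synthetic (an exponential correlation between levels, chosen only for illustration), the number of retained components is arbitrary, and the function name is hypothetical; none of this is taken from UWPHYSRET.

```python
import numpy as np


def pca_compress(cov, n_components):
    """Project a covariance matrix onto its leading principal components.

    Returns the matrix U of leading eigenvectors, shape (n_levels, n_components),
    and the compressed covariance U^T cov U.
    """
    eigval, eigvec = np.linalg.eigh(cov)              # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]   # indices of the largest ones
    u = eigvec[:, order]
    return u, u.T @ cov @ u


# Synthetic a-priori covariance with strong correlation between levels.
levels = np.arange(40)
cov = np.exp(-np.abs(levels[:, None] - levels[None, :]) / 15.0)

u, cov_c = pca_compress(cov, n_components=6)
print("condition number, full state:      ", np.linalg.cond(cov))
print("condition number, compressed state:", np.linalg.cond(cov_c))
```

In the compressed space the covariance is diagonal and its condition number is the ratio between the largest and the smallest retained eigenvalue, which is smaller than for the full matrix; a state vector x would be compressed as `u.T @ (x - x_mean)` before applying the iterative equation.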
The RH correction had a similar impact on the data retrieved over Cagliari, while it had a milder impact on
the data retrieved over Udine, indicating that AM cloudiness might be more of a factor over land in coastal
areas than in more inland cases (like Udine Campoformido).
4.5 Conclusions
Retrievals obtained for the three areas under investigation were validated spectrally to ensure proper
functioning of the inversion package used (UWPHYSRET). Spectral validation was performed by comparing
the retrieval residuals to the observation error used in the inversion process. Only retrievals whose spectral
residuals were found to be consistently smaller than the observation error were considered successful.
Figure 4.22: Pratica di Mare: mean temperature distance between retrievals and rawinsondes: blue
diamonds and solid lines represent the mean distance for AM overpasses for retrievals before (left) and after
(right) RH correction. Mean temperature distance between consecutive rawinsondes is shown in cyan circles.
Figure 4.23: Pratica di Mare: histogram of the mean temperature distance between retrievals and
rawinsondes at 986 (blue) and 1014 hPa (red) before (left) and after (right) RH correction.
Figure 4.24: Pratica di Mare: mean water vapor mixing ratio distance between retrievals and rawinsondes: blue
diamonds and solid lines represent the mean distance for AM overpasses for retrievals before (left) and after
(right) RH correction. Mean distance between consecutive rawinsondes is shown in cyan circles.
Spectral validation was followed by an environmental validation, where the retrieved temperature and
water vapor profiles were compared to available rawinsondes launched in the areas of interest within
200 minutes from the satellite overpass. While the differences between retrievals and rawinsondes obtained
over Udine seemed to be within expected values, anomalies were found for Pratica di Mare and Cagliari,
especially for morning overpasses. A detailed analysis of the issue indicated that a significant part of the
AM retrievals might have been contaminated by clouds, especially over land. Removing the potentially
cloud contaminated retrievals improved the quality of the results near the surface, but did not have
an impact on the anomalies found in both temperature and water vapor mean differences between 800 and
900 hPa. This issue should be further investigated. The final outcome of the validation study is that the
retrieved profiles could be used to generate instability indices (level 3 products); however, it is strongly
recommended that all the profiles available over the 1x1 degree areas for a given overpass be averaged before
generating level 3 data, to minimize the possible side effects introduced by clouds.
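The recommended averaging of all the profiles available for a given overpass can be sketched as follows. The function name and the use of an overpass identifier as grouping key are assumptions of this example; any label that uniquely identifies the overpass (for instance its date and orbit) would serve.

```python
from collections import defaultdict

import numpy as np


def mean_profile_per_overpass(overpass_ids, profiles):
    """Average all retrieved profiles that share the same overpass identifier.

    overpass_ids: one hashable label per profile (hypothetical key).
    profiles: one array per profile, all defined on the same levels.
    Returns a dict mapping each overpass label to its mean profile.
    """
    groups = defaultdict(list)
    for oid, prof in zip(overpass_ids, profiles):
        groups[oid].append(np.asarray(prof, dtype=float))
    return {oid: np.mean(profs, axis=0) for oid, profs in groups.items()}
```

The instability indices would then be computed from the mean profile of each overpass rather than from the individual retrievals.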
Chapter 5
Technical Report 2: Instability Indices (Level 3 Products) derived from IASI retrievals (Level 2 Products)
Document: Technical Report 2
Written by: Paolo Antonelli and Silvia Puca
Date: 25 August 2010
Reference: PA/IIS/TR02/2010/02
5.1 Introduction
This document is the second report of activities for the project on atmospheric instability derived from
IASI observations. The document follows up the first report (Reference: PA/IIS/2010/01) and describes
preliminary results obtained after the optimization of UWPHYSRET. For this study about 1500 IASI
observations were collected in a 1 by 1 degree box centered on Pratica di Mare, Italy, for the time
period July – September 2007. The clear sky observations were inverted, and instability indices were
derived from the retrievals and compared to those derived from available rawinsondes launched in
Pratica di Mare at 12:00 and 00:00 UTC. Good agreement was found between instability derived from
satellite and from rawinsondes. Good agreement was also found between electric activity (lightning) and
instability derived from the rawinsondes. Future investigation will make use of a convection detection
system based on satellite (SEVIRI) data.
5.2 Observations
For this study 1565 IASI L1C observations were collected in an area of 1x1 degree around Pratica di Mare,
Italy (lat : 41.65N, lon : 12.43E) for the time period July – September 2007. Before inversion data
Figure 5.1: Spatial distribution of retrievals (red diamonds) used for assessment of PCA impact on level 2
product accuracy over Pratica di Mare. Blue circle indicates location of rawinsonde launches.
were thinned using the MAIA cloud mask: 757 observations were labeled as clear sky (corresponding to
FOVs more than 98% clear) and 469 led to convergence in the retrieval process. However, part of these
observations were found to be contaminated by clouds not detected by MAIA, and after the spectral and
environmental validation only 250 retrieved profiles (shown in figure 5.1, and hereafter indicated with
M) were considered usable to derive instability indices. Along with the IASI observations, a total of
154 rawinsondes launched at 12:00 (11:00 UTC actual launch) and 00:00 UTC (23:00 UTC actual
launch) were also collected. The location of the rawinsonde site is shown in blue in figure 5.1. Rawinsonde
data were made available by CNMCA of the Italian Air Force. CNMCA also provided for this study an estimate
of the lightning activity during the time period of the study. Lightning data were used, at this stage, without
adequate quality control, and some problems were reported for July and September 2007. For this reason
further investigation of the lightning observations will be required.
Figure 5.2: Mean Temperature distance (black line) and Standard Deviation of Temperature distance
between rawinsondes and retrievals.
5.3 Retrievals
Retrievals were performed with UWPHYSRET, a physical retrieval package developed at the Space Science
and Engineering Center of the University of Wisconsin – Madison. Out of the 757 clear sky observations, 250
led to successful retrievals (locations shown in red in figure 5.1). Retrievals were considered successful
when the spectral residuals were within the estimated observation error and no saturation was found in the
retrieved profiles. Statistics of the temperature and water vapor mixing ratio distances between retrievals
and time co-located rawinsondes (Retrieval – Rawinsonde) are shown in figures 5.2 and 5.3. Considering
that the mean time difference between the IASI retrieval and the effective launch time of the rawinsonde is
about 160 minutes, and that the distance between IASI observations and the rawinsonde launch site ranges
from 0 to 50 km, the quality of the retrievals was considered satisfactory. A temperature deviation of 1.5 K
in the boundary layer was expected because 65% of the successful retrievals were associated with the evening
overpasses (around 19:30 UTC) and were therefore compared to colder (in the boundary layer) rawinsonde
profiles. Large discrepancies between 100 and 200 hPa have to be further investigated but do not impact
the study of lower tropospheric instability.
5.4 Instability Indices
Instability Indices were derived from rawinsondes and retrieved profiles with Sound_Analys, a Python
based software package developed by A. Manzato at the Osservatorio Meteorologico Regionale (OSMER)
of the Agenzia Regionale per la Protezione dell’Ambiente del Friuli Venezia Giulia (ARPA-FVG). Time
series of CAPE and Lifted Index (LI) are shown for the months of July (figures 5.4 and 5.7), August
(figures 5.5 and 5.8), and September (figures 5.6 and 5.9) 2007. Red circles indicate the values of CAPE (LI)
derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for all the
retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the IASI derived
Figure 5.3: Mean Water Vapor Mixing Ratio distance (black line) and Standard Deviation of Water Vapor
Mixing Ratio distance between rawinsondes and retrievals.
CAPE among all the retrievals co-located with a given rawinsonde; black squares show the number of
lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered over Pratica di
Mare. In figures 5.7, 5.8, and 5.9 the black squares indicate the log of the number of lightning events. The
agreement between the values of CAPE and LI derived from rawinsondes and from retrievals is encouraging at
this stage. More conclusive results will be available only after processing data for the summers of 2008, 2009
and 2010. General agreement between rawinsonde derived CAPE and LI and the detection of electric activity
was also found. More conclusive results will be achieved with the use of NEFODINA, a convection detection
scheme developed at CNMCA.
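The quantities plotted in the time series can be summarized with the sketch below, which uses only the Python standard library. The function names are hypothetical; the use of the population standard deviation for the error bars, the base 10 logarithm, and the handling of periods without lightning are assumptions, since the report does not specify them.

```python
import math
import statistics


def summarize_colocated(values):
    """Mean and spread of an index (CAPE or LI) over the retrievals
    co-located with one rawinsonde.

    Assumption: the spread is the population standard deviation.
    """
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values) if len(values) > 1 else 0.0
    return mean, spread


def log_lightning(count):
    """Base 10 log of the lightning count; None when no lightning was observed."""
    return math.log10(count) if count > 0 else None
```

For each rawinsonde, the mean would correspond to the blue diamond and the spread to the blue error bar, while `log_lightning` gives the value plotted as a black square in the LI figures.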
5.5 Conclusions
Results presented in this document represent the outcome of a preliminary study on instability derived from
satellite (IASI) data. About 1500 IASI observations were collected in a 1 by 1 degree box centered on Pratica
di Mare, Italy, for the time period July – September 2007. After using the MAIA cloud detection scheme,
the clear sky observations were inverted with UWPHYSRET, a physical inversion scheme developed
at the University of Wisconsin-Madison, and instability indices (CAPE and LI) were derived from the
retrievals with Sound_Analys, a software package developed at OSMER. Instability indices derived from
IASI retrievals were compared to those derived from available rawinsondes launched in Pratica di Mare
at 12:00 and 00:00 UTC. Good agreement was found between instability derived from satellite and from
rawinsondes. Good agreement was also found between electric activity (lightning) and instability derived
from the rawinsondes. Future investigation will make use of a convection detection system based on satellite
(SEVIRI) data.
Figure 5.4: Time Series of CAPE (J/kg) for the month of July 2007. Red circles indicate the values of
CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE for
all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the
IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show the
number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered over
Pratica di Mare.
Figure 5.5: Time Series of CAPE (J/kg) for the month of August 2007. Red circles indicate the values
of CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE
for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of
the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show
the number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered
over Pratica di Mare.
Figure 5.6: Time Series of CAPE (J/kg) for the month of September 2007. Red circles indicate the values
of CAPE derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of CAPE
for all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of
the IASI derived CAPE among all the retrievals co-located with a given rawinsonde; black squares show
the number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered
over Pratica di Mare.
Figure 5.7: Time Series of Lifted Index for the month of July 2007. Red circles indicate the values of
LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for all
the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the
IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log
of the number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered
over Pratica di Mare.
Figure 5.8: Time Series of Lifted Index for the month of August 2007. Red circles indicate the values
of LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for
all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the
IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log
of the number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered
over Pratica di Mare.
Figure 5.9: Time Series of Lifted Index for the month of September 2007. Red circles indicate the values
of LI derived from 00:00 and 12:00 UTC rawinsondes; blue diamonds show the mean values of LI for
all the retrievals co-located with a given rawinsonde; blue error bars are associated with the variability of the
IASI derived LI among all the retrievals co-located with a given rawinsonde; black squares show the log
of the number of lightning events observed within 10 hours from rawinsonde launch in a 1x1 degree box centered
over Pratica di Mare.
Chapter 6
Technical Report 1: Dataset description
Document: Technical Report 1
Written by: Paolo Antonelli
Date: 16 October 2010
Reference: PA/IIS/TR01/2010/01
6.1 Introduction
This document describes the data collected for the project Evaluating atmospheric instability from high
spectral resolution IR satellite observations, supported under contract EUM/CO/10/4600000746/SAT. The
dataset used and delivered to EUMETSAT consists of observations over three areas of 1x1 degree centered
on Pratica di Mare, Udine, and Cagliari, as shown in figure 6.1. Observations include: 1) IASI data; 2)
lightning data; 3) rawinsonde data for 00:00 and 12:00 UTC, launched by CNMCA from the air force bases
located in the three areas. Data collected and delivered cover the time period from 01 July 2007 to 30
September 2009. At the time of delivery only a subset of NEFODINA is available, and it is not included in
the delivery.
6.2 IASI data
For the study 13322 IASI L1C observations were collected in the areas of 1x1 degree around Pratica di Mare,
Udine Campoformido, and Cagliari, Italy, for the time period July 2007 – September 2009. Before inversion,
IASI data were thinned using the MAIA cloud mask. After the spectral and environmental validation
only a fraction of the retrieved profiles (shown in figures 6.2, 6.5, and 6.8) were considered usable to
derive instability indices. The number of available retrievals for each satellite overpass ranges from 0 to about
10. IASI spectra were PCA noise filtered for spectral validation purposes. Both original and PCA noise
filtered observations are contained in the netCDF files XXX_IASI_ORIGINAL_RAD_SUCC_RET.nc and
Figure 6.1: Areas of Interest
XXX_IASI_FILTERED_RAD_SUCC_RET.nc (where XXX can be PDM, UDI, CAG). An example of
the file structure as generated by ncdump can be found in Annex 1.
6.2.1 Pratica di Mare (lat: 41.65N, lon: 12.43E)
A total of 1328 spectra passed the spectral validation (described in PA/IIS/TR03/2010/01) over Pratica
di Mare: 430 (corresponding to 32.4%) were associated with morning overpasses, while 898 (corresponding
to 67.6%) were collected in the evening. Figure 6.3 shows the mean (blue curve) and the mean±std (red
curves) of the observed radiances. Figure 6.4 shows the distribution, for the whole dataset, of: 1) the Field
Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated FOV surface elevation;
4) the FOV IGBP Land Cover Classification; 5) the array detector index.
6.2.2 Udine Campoformido (lat: 46.03N, lon: 13.18E)
After the spectral validation only 702 retrieved profiles (shown in figure 6.5) were considered usable to
derive instability indices. Out of the 702 spectra, 236 (corresponding to 33.6%) were collected in the
morning overpass, while 466 (corresponding to 66.4%) were observed in the evening. Figure 6.6 shows
the mean (blue curve) and the mean±std (red curves) of the observed radiances. Figure 6.7 shows the
distribution of: 1) the Field Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated
FOV surface elevation; 4) the FOV IGBP class; 5) the array detector index. Both original and
PCA noise filtered observations are contained in the netcdf files UDI_IASI_ORIG_RAD_SUCC_RET.nc
and UDI_IASI_FILT_RAD_SUCC_RET.nc.
Figure 6.2: Spatial distribution of IASI observations (red dots) collected over Pratica di Mare.
Figure 6.3: Mean and Standard Deviation of IASI observations collected over Pratica di Mare.
Figure 6.4: Statistics of IASI observations collected over Pratica di Mare.
Figure 6.5: Spatial distribution of IASI observations (red dots) collected over Udine Campoformido.
Figure 6.6: Mean and Standard Deviation of IASI observations collected over Udine Campoformido.
Figure 6.7: Statistics of IASI observations collected over Udine Campoformido.
6.2.3 Cagliari (lat: 39.25N, lon: 9.05E)
After the spectral validation only 1262 retrieved profiles (shown in figure 6.8) were considered usable to
derive instability indices. IASI spectra were PCA noise filtered for spectral validation purposes. Both original
and PCA noise filtered observations are contained in the netcdf files CAG_IASI_ORIG_RAD_SUCC_RET.nc
and CAG_IASI_FILT_RAD_SUCC_RET.nc.
Out of the 1262 spectra, 340 (corresponding to 26.9%) were collected in the morning overpass, while
922 (corresponding to 73.1%) were observed in the evening. Figure 6.9 shows the mean (blue curve) and
the mean±std (red curves) of the observed radiances. Figure 6.10 shows the distribution of: 1) the Field
Of View (FOV) angle; 2) the estimated FOV water percentage; 3) the estimated FOV surface elevation;
4) the FOV IGBP class; 5) the array detector index.
Figure 6.8: Spatial distribution of IASI observations (red dots) collected over Cagliari.
Figure 6.9: Mean and Standard Deviation of IASI observations collected over Cagliari.
Figure 6.10: Statistics of IASI observations collected over Cagliari.
6.3 Lightning
Lightning data were provided by CNMCA in ASCII format, for areas of 1x1 degree centered on Pratica
di Mare, Udine, and Cagliari. Observations were made by LAMPINET, the lightning network of Servizio
Meteorologico Aeronautica Militare. Observed variables are the Magnetic Direction Finding (MDF) and
the Time Of Arrival (TOA). Geolocation was performed through both TOA and MDF. Network coverage
is shown in figure 6.11. CNMCA personnel reported that some of the sensors were not working properly
during the summer of 2007, in particular in July and August 2007; the number of events for this time
period is therefore considered underestimated. The netcdf files with lightning data
contain: 1) the number of occurrences over the 1x1 degree areas within the 10 hr following the rawinsonde
launch; 2) the time of the rawinsonde launch; 3) the mean latitude and longitude of lightning events observed in
the 10 hr following the rawinsonde launch; 4) the standard deviation of latitude and longitude of lightning
events observed in the 10 hr following the rawinsonde launch. Delivered files are XXX_LGT_DATA.nc
(where XXX can be PDM, UDI, CAG). The file structure as generated by ncdump of PDM_LGT_DATA.nc
can be found in Annex 2.
Figure 6.11: LAMPINET network coverage with isolines of estimated geolocation error. Courtesy of
CNMCA.
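The four quantities stored in the lightning files can be illustrated with a short sketch. It assumes that each stroke is available as a (time, latitude, longitude) tuple, that the 1x1 degree box extends 0.5 degrees either side of the station, and that the population standard deviation is used; the function name, the input layout, and the sample strokes are hypothetical and are not taken from the delivered files.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def lightning_summary(strokes, launch_time, center_lat, center_lon,
                      window_hours=10.0, half_width_deg=0.5):
    """Summarise strokes inside the box during the window after launch.

    strokes: iterable of (time, lat, lon) tuples (assumed layout).
    Returns the count plus mean and standard deviation of lat and lon.
    """
    t_end = launch_time + timedelta(hours=window_hours)
    selected = [
        (lat, lon)
        for (t, lat, lon) in strokes
        if launch_time <= t <= t_end
        and abs(lat - center_lat) <= half_width_deg
        and abs(lon - center_lon) <= half_width_deg
    ]
    if not selected:
        # No events: the location statistics are undefined.
        return {"count": 0, "mean_lat": None, "mean_lon": None,
                "std_lat": None, "std_lon": None}
    lats = [lat for lat, _ in selected]
    lons = [lon for _, lon in selected]
    return {
        "count": len(selected),
        "mean_lat": mean(lats),
        "mean_lon": mean(lons),
        # Population standard deviation is an assumption of this sketch.
        "std_lat": pstdev(lats),
        "std_lon": pstdev(lons),
    }


# Made-up strokes around Pratica di Mare (41.65N, 12.43E), 12:00 UTC launch.
launch = datetime(2008, 7, 15, 12, 0)
sample = [
    (datetime(2008, 7, 15, 13, 0), 41.7, 12.5),    # inside box and window
    (datetime(2008, 7, 15, 21, 59), 41.5, 12.3),   # inside box and window
    (datetime(2008, 7, 15, 23, 30), 41.6, 12.4),   # later than launch + 10 hr
    (datetime(2008, 7, 15, 14, 0), 43.0, 12.4),    # outside the box
]
summary = lightning_summary(sample, launch, 41.65, 12.43)
print(summary["count"])  # 2
```

Only the first two sample strokes satisfy both the time and the location criteria, so the count is 2.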
6.3.1 Pratica di Mare
Figure 6.12 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde
launch over Pratica di Mare.
6.3.2 Udine, Campoformido
Figure 6.13 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde
launch over Udine, Campoformido.
Figure 6.12: Pratica di Mare: Time series of lightning occurrences in the 10 hr following the rawinsonde
launch.
Figure 6.13: Udine: Time series of lightning occurrences in the 10 hr following the rawinsonde launch.
Figure 6.14: Cagliari: Time series of lightning occurrences in the 10 hr following the rawinsonde launch.
6.3.3 Cagliari
Figure 6.14 shows the time series of the lightning occurrences within the 10 hr following the rawinsonde
launch over Cagliari.
6.4 Rawinsondes
The original data provided by CNMCA were obtained with VAISALA RS-92 sondes. Since July 2005 the Italian
Meteorological Service has replaced the VAISALA RS-90 rawinsonde with the RS-92
sonde. Technical specifications of the rawinsonde are shown in the original VAISALA data sheet in figure
6.15. Observations were collected at Pratica di Mare, Udine Campoformido, and Cagliari.
Figure 6.15: VAISALA RS-92 specs from http://www.vaisala.com/
6.4.1 Pressure interpolated profiles
For this study the rawinsonde profiles were extrapolated up to 0.1 hPa and were quality controlled
for saturation and/or missing values. The original rawinsonde observations at high vertical resolution (with
single profile measurements made every 2 s) were pressure averaged per layer according to the following
equation:
$$X_l = \frac{\sum_{i=1}^{N} \frac{X_i + X_{i-1}}{2}\,(P_i - P_{i-1})}{P_{low} - P_{high}} \qquad (6.1)$$
where X_l is the average over the layer l of the atmospheric parameter X (T, WV), X_i and P_i are the
parameter value and the pressure at the i-th of the N sublevels that divide the layer l, and P_low and
P_high are the pressure extremes of the layer in consideration.
The surface temperature associated with each profile was randomly generated from the lowest level temperature,
constrained by surface type, time of day, and latitude.
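Equation 6.1 is a trapezoidal, pressure-weighted mean, and can be sketched as follows. The sketch assumes that the sublevel samples of one layer are passed in order from one pressure extreme of the layer to the other, so that the denominator equals the difference between the last and the first pressure; the function name and the sample values are illustrative and are not taken from the processing code used in the study.

```python
def layer_average(pressures, values):
    """Pressure-weighted (trapezoidal) mean of `values` over one layer.

    pressures, values: the N+1 sublevel samples (index 0..N) that bound
    the N sub-intervals of the layer, ordered from one pressure extreme
    of the layer to the other (assumed ordering).
    """
    if len(pressures) != len(values) or len(pressures) < 2:
        raise ValueError("need at least two matching sublevel samples")
    total = 0.0
    for i in range(1, len(pressures)):
        # Mean of the two bounding samples times the pressure thickness.
        total += 0.5 * (values[i] + values[i - 1]) * (pressures[i] - pressures[i - 1])
    # Signs cancel, so the result does not depend on the ordering direction.
    return total / (pressures[-1] - pressures[0])


# Temperature (K) decreasing linearly with pressure (hPa) across the layer.
print(layer_average([1000.0, 950.0, 900.0], [290.0, 288.0, 286.0]))  # 288.0
```

Because each sub-interval is weighted by its pressure thickness, unevenly spaced sonde samples do not bias the layer mean towards the more densely sampled part of the layer.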
Observations collected at Pratica di Mare, Udine Campoformido, and Cagliari are stored in netcdf files
(XXX_RAWINSONDE_101L_PROF.nc, where XXX can be PDM, UDI, CAG). The file structure as generated
by ncdump can be found in Annex 4.
6.4.2 Pratica di Mare
The 1757 rawinsondes collected at Pratica di Mare airport are stored in the PDM_RAWINSONDE_101L_PROF.n
Examples of the statistical properties of the profiles for Pratica di Mare are shown in figures 6.16 (Mean
Temperature), 6.17 (Standard Deviation of Temperature), 6.18 (Mean Water Vapor Mixing Ratio), and
6.19 (Standard Deviation of Water Vapor Mixing Ratio). The figures show profiles for different launch times
(06:00 UTC in blue, 12:00 UTC in red, 18:00 UTC in magenta, 00:00 UTC in green) and for the
whole day (black dashed line).
6.4.3 Udine, Campoformido
The 2069 rawinsondes collected at Udine Campoformido are stored in the UDI_RAWINSONDE_101L_PROF.nc file.
6.4.4 Cagliari
The 1858 rawinsondes collected at Cagliari Elmas airport are stored in the CAG_RAWINSONDE_101L_PROF.nc file.
Figure 6.16: Pratica di Mare: Mean of Temperature profiles interpolated on 101 levels.
Figure 6.17: Pratica di Mare: Standard deviation of Temperature profiles interpolated on 101 levels.
Figure 6.18: Pratica di Mare: Mean and Standard deviation of Water Vapor Mixing Ratio profiles
interpolated on 101 levels.
Figure 6.19: Pratica di Mare: Standard deviation of Water Vapor Mixing Ratio profiles interpolated on
101 levels.
Chapter 7
Conclusions
This document describes the results obtained by two forecast systems for thunderstorms (events with more
than 10 lightning strikes between 11:00 and 17:00 UTC in the time period April - October) over the Po
Valley, and all the work performed to implement the two systems. The first system,
based on an artificial neural network (ANN) trained using instability indices derived from rawinsondes launched
from Milano Linate and Udine Campoformido between 2004 and 2010, led to excellent results, with a prediction
Peirce Skill Score (PSS) of 0.68. The second system was designed to replicate the first one, but with predictors
derived only from IASI level 3 products (instability indices) and level 1 data (radiances). The capacity of the
IASI trained ANN to predict convection was found to be poor, leading to a final PSS of 0.21.
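For readers who want to relate the PSS values to forecast outcomes, the sketch below assumes the standard definition of the Peirce Skill Score for a 2x2 contingency table (hit rate minus false alarm rate) and the event threshold quoted above. The function names and the counts in the example are illustrative only and are not results of this study.

```python
def is_thunderstorm_event(n_strikes, threshold=10):
    """Event definition used in the text: more than 10 lightning strikes."""
    return n_strikes > threshold


def peirce_skill_score(hits, false_alarms, misses, correct_negatives):
    """PSS = POD - POFD for a 2x2 contingency table.

    POD  = hits / (hits + misses)                       (hit rate)
    POFD = false_alarms / (false_alarms + correct_negatives)
    """
    observed_yes = hits + misses
    observed_no = false_alarms + correct_negatives
    if observed_yes == 0 or observed_no == 0:
        raise ValueError("PSS is undefined without both events and non-events")
    pod = hits / observed_yes
    pofd = false_alarms / observed_no
    return pod - pofd


# Made-up counts: 40 observed events, 160 observed non-events.
print(peirce_skill_score(hits=30, false_alarms=20, misses=10,
                         correct_negatives=140))  # 0.625
```

A PSS of 1 corresponds to a perfect forecast and a PSS of 0 to no skill, which gives a scale for comparing the 0.68 obtained from rawinsondes with the 0.21 obtained from IASI.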
The poor results obtained in generalizing the prediction of convective events from IASI data and
products were found to depend mostly on the limited size of the IASI database (available retrievals
in clear sky conditions), which is a factor of 10 smaller than the rawinsonde database (both for training and
testing). However, a general tendency of the retrievals to overestimate low level water vapor, which led to
overestimation of the atmospheric instability, was found and should be further investigated.
Finally, by focusing on a single area of interest over Milano, Italy, we were able to increase the size
of the IASI database by a factor of two, and the prediction PSS on the test set reached the value of
0.49, indicating that nowcasting of convection by IASI data over individual (smaller) areas, and therefore
with larger datasets, improves considerably. In other words, the experiment over Milano demonstrated the
feasibility and potential of the satellite based nowcasting system.
Besides the final scores, the significance of the presented material lies in the correlation found between
some of the IASI radiances and the occurrence of convection, and in the validation of the IASI level 2 and
level 3 products.
Authors
Paolo Antonelli is a researcher who works at 60% for SSEC of the University of Wisconsin - Madison.
His expertise is in the area of high spectral resolution data inversion and data compression.
Agostino Manzato is a scientist of OSMER ARPA FVG. His expertise ranges from meteorology to
statistical learning.
Silvia Puca is a scientist of Italian Department of Civil Protection. Her expertise is in the areas of
convection and statistical learning.
Lt. Col. Francesco Zauli is the head of the satellite group at CNMCA. His expertise is in the area of
satellite meteorology.
Acknowledgments
The authors wish to thank Dr. R. Stuhlmann and Dr. S. Tjemkes of EUMETSAT for their constructive
comments and their continuous support; Cap. A. Vocino of CNMCA for reviewing the documents
and managing the program; Mr. R. Garcia of SSEC for his invaluable help in gathering IASI data; Cap.
Davide Melfi and Daniele Biron of CNMCA for their kind help in gathering lightning and auxiliary data;
and Dr. A. Van Delden for reviewing the core part of this document and for taking part
in the final presentation. Activities described in this report were supported by EUMETSAT, through
grant EUM/CO/10/4600000746/SAT, and SSEC, through grants NOAA NA06NES4400002 and NASA
NNX07AK89G, for the use of IASI data in predicting instability and for the main development of UWPHYSRET
respectively. Finally, many thanks go to OSMER ARPA FVG for making available precious human
resources and expertise to the project.
Bibliography
[1] Paolo Antonelli. Experiment on pca compression impact on iasi level 2 product for atmospheric tem­
perature and water vapor mixing ratio, and for surface temperature. Technical report, EUMETSAT,
2010.
[2] Paolo Antonelli. Statistical properties of iasi noise reconstructed after pca compression. Technical
report, EUMETSAT, 2010.
[3] Paolo Antonelli. Validation of baseline retrieval with rawinsondes. Technical report, EUMETSAT,
2010.
[4] Paolo Antonelli. Validation of level 3 products derived from vertical rawinsonde and retrieval profiles
with occurrence of convection as detected by lightnings. Technical report, EUMETSAT, 2010.
[5] Paolo Antonelli, R. Knuteson, R. Garcia, S. Bedka, D. Tobin, J.Taylor, W. Smith, and H. Revercomb.
Uwphysret an ssec inversion package for high resolution infrared data based on lblrtm. 4th Workshop
on Sounding from High Spectral Resolution Infrared Observations, Madison, WI, 15-18 September
2008, September 2008.
[6] Paolo Antonelli and Silvia Puca. Instability indices (level 3 products) derived from iasi retrievals (level
2 products). Technical report, EUMETSAT, 2010.
[7] Paolo Antonelli, H. E. Revercomb, W. L. Smith, R.O. Knuteson, L. Sromovsky, D.C. Tobin, R. K.
garcia, H. B. Howell, H.-L. Huang, and F.A. Best. A principal component noise filter for high spectral
resolution infrared measurements. Journal of Geophysical Research, 109, 2004.
[8] Ian T. Jolliffe and David B. Stephenson. Forecast verification: A practitioner’s guide in atmospheric
science. International Journal of Forecasting, 22(2):403–405, 2006.
[9] A. Manzato. The use of sounding-derived indices for a neural network short-term thunderstorm.
WEATHER AND FORECASTING, 20:896–916, 2004.
[10] A. Manzato. A verification of numerical model forecasts for sounding-derived indices above udine,
northeast italy. Weather Forecasting, pages 477–495, 2007.
[11] Agostino Manzato. A climatology of instability indices derived from friuli venezia giulia soundings,
using three different methods. Atmospheric Research, 67(68):417–454, 2003.
110
Ref.: PA/IIS/FR/2010/01
[12] Agostino Manzato. An odds ratio parameterization for roc diagram and skill score indices. WEATHER
AND FORECASTING, 20:918–930, 2005.
[13] Agostino Manzato. A note on the maximum peirce skill score. WEATHER AND FORECASTING,
22(5):1148–1154, 2007.
[14] Agostino Manzato. Sounding-derived indices for neural network based short-term thunderstorm and
rainfall forecasts. Atmospheric Research, 83:349–365, 2007.
[15] Agostino Manzato and Griffith Morgan Jr. Evaluating the sounding instability with the lifted parcel
theory. Atmospheric Research, 67(68):455–473, 2003.
[16] Plunkett and Elman. Exercises in Rethinking Innateness. MIT Press, 1997.
[17] Clive Rodgers. Inverse Methods for Atmospheric Soundings: Theory and Practice, volume 2. World
Scientific Publishing Co. Pte. Ltd., 2000.
[18] Ian H. Witten and Eibe Frank. Data mining: practical machine learning tools and techniques. Morgan
Kaufmann, 2005.
[19] MH Zweig and G. Campbell. Receiver-operating characteristic (roc) plots: a fundamental evaluation
tool in clinical medicine. Clin Chem, 39:561–577, 1993.
111
