Experiences of variance estimation for relative
poverty measures and inequality indicators in
official sample surveys
Claudia Rinaldelli
ISTAT, Servizio Condizioni economiche delle famiglie
via Adolfo Ravà 150, 00142 Roma
[email protected]
Summary. This paper reports ISTAT's experience in evaluating the sampling variance of the relative poverty measures and inequality indicators
estimated from the Household Budget survey and the Statistics on
Income and Living Conditions survey (EU-SILC survey).
Key words: sampling variance, linearization, resampling, relative poverty,
inequality
1 Introduction
ISTAT calculates the sampling variance of common estimates (frequencies,
means, totals) using the standard methodology¹ [SSW92] [Wol85]; two
software procedures were developed in SAS to implement this methodology
for complex sampling designs [FR98] [IST05]. Over the last three years
ISTAT has paid special attention to the dissemination of complex
statistics, that is, statistics that are complex because they are estimated
from complex sample surveys and, above all, because they are expressed by
non-linear functions. The standard methodology, and therefore the current
software procedures for variance estimation, cannot be directly applied to
these statistics. This paper reports ISTAT's experience in evaluating the
sampling variance of a set of complex cross-sectional measures, namely the
relative poverty measures and inequality indicators estimated from the
Household Budget survey [CPR05] and the Statistics on Income and Living
Conditions survey (EU-SILC survey) [Rin05]. Sections 2 and 3 report the
experience of the Household Budget survey and of the EU-SILC survey
respectively. Section 4 contains the conclusions.

¹ Standard methodology means a well-known and widely implemented methodology; the literature on sampling theory for finite populations provides formulas for calculating the sampling variance under the most commonly used sampling designs and estimators.
2 The experience of the Household Budget survey
Official poverty estimates in Italy are calculated yearly by ISTAT using
the Household Budget survey data. A household whose monthly consumption
expenditure is equal to or below a threshold called the (relative) poverty
line is defined as poor. The poverty line for a two-member household is the
monthly average per capita consumption expenditure. An equivalence
scale is used to adjust the poverty line for households of other
sizes [IST02]. This procedure means that poverty is not observed directly
on households but is a function of the observed sample: it depends on the
poverty line and on the distribution of consumption expenditure among the
sampled households. The incidence of relative poverty
for households (defined as the percentage of households with a monthly
consumption expenditure equal to or below the poverty line) is a target poverty measure:
\[ \hat{I}_{pov} = \frac{\sum_{j \in S} I_j w_j}{\text{Households}} \times 100 \qquad (1) \]
‘Households’ is the total number of resident households (a known demographic
figure), j is the household index, S is the collected sample, wj is the
weight of household j, and Ij is a binary variable defined as:
1


I j =
0


if y j≤ poverty line
(2)
otherwise
yj is the monthly consumption expenditure of household j.
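Equations (1) and (2) can be illustrated with a minimal sketch. The function name, the toy data and the single scalar poverty line are assumptions made for illustration only; in particular, the adjustment of the line by the equivalence scale is not reproduced here.

```python
def poverty_incidence(y, w, poverty_line, total_households):
    """Equation (1): weighted percentage of households at or below the line.

    The poverty line is treated as a fixed scalar (illustrative assumption;
    the equivalence-scale adjustment by household size is left out).
    """
    # Sum of weights of households with I_j = 1, see equation (2).
    poor_weight = sum(wj for yj, wj in zip(y, w) if yj <= poverty_line)
    return 100.0 * poor_weight / total_households


# Hypothetical toy sample: five households.
y = [800.0, 1200.0, 950.0, 2000.0, 600.0]   # monthly consumption expenditure
w = [100.0, 150.0, 120.0, 80.0, 50.0]       # sampling weights (sum to 500)
incidence = poverty_incidence(y, w, poverty_line=900.0, total_households=500.0)
print(incidence)
```

With these toy figures, two households (weights 100 and 50) fall at or below the line, so the incidence is 150 out of 500 households, that is 30%.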
The complexity of this measure limits the use of the standard methodology,
which can be applied only by assuming that poverty is observed directly on
households, that is, by treating the poverty line as a fixed value.
Linearization and resampling approaches were studied and tested in order to
take the whole sampling variance into account [DDF03] [PR03]; sampling
errors of the incidence of relative poverty were calculated using:
1. the standard methodology, assuming that the poverty line is not affected
by sampling variability;
2. linearization;
3. a resampling approach based on the Balanced Repeated Replications (BRR)
technique.
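The idea behind the third approach can be sketched as follows. This is only an illustration under stated assumptions, not the procedure implemented at ISTAT: every stratum is assumed to contain exactly two primary sampling units (coded 0 and 1), the balanced half-samples come from a Sylvester-type Hadamard matrix, and the names `sylvester_hadamard`, `brr_variance`, `total` and `incidence` are hypothetical. The poverty line used in `incidence` is a stand-in (the weighted mean of y), chosen only to show that the line is re-estimated in every replicate.

```python
def sylvester_hadamard(order):
    """Hadamard matrix of size `order` (a power of two), Sylvester construction."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def brr_variance(estimator, w, stratum, psu):
    """BRR variance of estimator(weights); assumes two PSUs (0/1) per stratum."""
    strata = sorted(set(stratum))
    column = {h: i + 1 for i, h in enumerate(strata)}  # column 0 is all +1: skipped
    n_rep = 1
    while n_rep <= len(strata):  # need more replicates than strata
        n_rep *= 2
    theta = estimator(w)
    sum_sq = 0.0
    for row in sylvester_hadamard(n_rep):
        # Keep PSU 0 where the entry is +1 and PSU 1 where it is -1,
        # doubling the weights of the retained half-sample.
        w_rep = [2.0 * wk if (row[column[h]] == 1) == (p == 0) else 0.0
                 for wk, h, p in zip(w, stratum, psu)]
        sum_sq += (estimator(w_rep) - theta) ** 2
    return theta, sum_sq / n_rep


# Hypothetical toy data: two strata, two PSUs each, one household per PSU.
y = [1.0, 3.0, 2.0, 6.0]
w = [1.0, 1.0, 1.0, 1.0]
stratum = [0, 0, 1, 1]
psu = [0, 1, 0, 1]


def total(weights):
    """Linear estimator, used as a check on the replication scheme."""
    return sum(wk * yk for wk, yk in zip(weights, y))


def incidence(weights):
    """Incidence (%) with the line re-estimated from the replicate weights."""
    line = total(weights) / sum(weights)  # stand-in poverty line (assumption)
    poor = sum(wk for wk, yk in zip(weights, y) if yk <= line)
    return 100.0 * poor / sum(weights)


theta_total, var_total = brr_variance(total, w, stratum, psu)
theta_incidence, var_incidence = brr_variance(incidence, w, stratum, psu)
```

For a linear estimator such as `total`, a fully balanced set of half-samples reproduces the usual stratified variance estimate (here the sum over strata of the squared difference between the two weighted PSU totals), which gives a simple check of the scheme. For `incidence`, each replicate recomputes the line, so the variability of the line is carried into the variance estimate.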
Table 1 shows the results of these approaches [DDF03] [PR03].
Table 1. Incidence of relative poverty for households (%) and relative sampling
errors (%) by standard methodology, BRR and linearization - year 2002

                 inc. relative      relative sampling errors (%)
                 poverty %       standard methodology    BRR    linearization
Piemonte              7.0               12.0             12.0       12.2
Valle d’Aosta         7.1               18.4             18.6       14.8
Lombardia             3.7               10.5             11.7       12.1
Trentino AA           9.9                9.9             10.0       11.7
Veneto                3.9               12.6             13.4       15.8
Friuli VG             9.8               11.4             11.5       12.3
Liguria               4.8               14.4             14.9       18.1
Emilia-R              4.5               14.0             14.4       14.9
Toscana               5.9               12.2             12.4       13.8
Umbria                6.4               17.1             17.4       21.0
Marche                4.9               12.5             13.1       15.0
Lazio                 7.8                9.3              9.4        9.7
Abruzzo              18.0               15.0             14.2       15.5
Molise               26.2                6.4              6.4        6.9
Campania             23.5                6.1              6.0        6.8
Puglia               21.4                8.6              8.4        9.6
Basilicata           26.9               11.6             11.5       14.0
Calabria             29.8                6.6              6.5        7.4
Sicilia              21.3                5.8              5.6        5.6
Sardegna             17.1                8.8              8.8        8.7
ITALY                11.0                2.4              2.4        2.4
The applications showed that the relative sampling errors obtained by the
standard methodology are not far from those obtained by the BRR technique
and by linearization. The standard methodology can underestimate the
sampling variance of poverty measures because it does not take into account
the sampling variance of the poverty line. However, this underestimation is
slight because the sampling variability of the poverty line is small. The
standard methodology (under simplification) does not produce severe
differences compared with more suitable (but more complex) techniques for
variance estimation. For this reason, sampling errors of the incidence of
relative poverty have been calculated by the standard methodology since 2003.
3 The experience of the EU-SILC survey
The EU-SILC survey² aims at producing estimates on income, living
conditions and poverty. Among these are the EU (European) relative
poverty measures and inequality indicators, which are estimated together
with their sampling errors [Reg03]; most of them are complex in both of the
senses mentioned in section 1. The EU complex statistics dealt with here are
summarized below, where k is the unit (person) index, yk is the value of
variable Y on unit k, wk is the weight of unit k, and Ŷβ is the estimated
βth quantile of variable Y (0 ≤ β ≤ 1):
• At-Risk-of-Poverty Threshold (RPT) is 60% of the national median
income:
\[ \mathrm{RPT} = 0.6 \, \hat{Y}_{0.5} \qquad (3) \]
• At-Risk-of-Poverty Rate (RPR) is the percentage of persons (over the total population) with an income below RPT:
\[ \mathrm{RPR} = \frac{\sum_{k \in S} I_k w_k}{\sum_{k \in S} w_k} \times 100 \qquad (4) \]
in equation (4) S represents the collected sample and Ik is a binary variable
defined as:
\[ I_k = \begin{cases} 1 & \text{if } y_k < \mathrm{RPT} \\ 0 & \text{otherwise} \end{cases} \qquad (5) \]

² This survey is carried out under European Regulation; the first survey year was 2004.

• Inequality of income distribution, Gini index:
\[ G = 100 \times \left[ \frac{2 \sum_{k=\text{first unit}}^{\text{last unit}} \left( y_k w_k \sum_{i=\text{first unit}}^{\text{unit } k} w_i \right) - \sum_{k=\text{first unit}}^{\text{last unit}} y_k w_k^2}{\left( \sum_{k=\text{first unit}}^{\text{last unit}} w_k \right) \left( \sum_{k=\text{first unit}}^{\text{last unit}} y_k w_k \right)} - 1 \right] \qquad (6) \]
in equation (6) the units are sorted in ascending order of yk, so that the inner sum is the cumulative weight up to unit k;
• Gender Pay Gap (GPG) is the difference between men’s and women’s
average gross hourly earnings as a percentage of men’s average gross
hourly earnings (the population consists of all paid employees aged 16-64 at work 15+ hours per week):
\[ \mathrm{GPG} = \frac{\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k} - \dfrac{\sum_{k \in F} y_k w_k}{\sum_{k \in F} w_k}}{\dfrac{\sum_{k \in M} y_k w_k}{\sum_{k \in M} w_k}} \times 100 \qquad (7) \]
in equation (7) M is the set of male sampled paid employees aged 16-64 at
work 15+ hours per week and F is the set of female sampled paid employees aged
16-64 at work 15+ hours per week;
• Relative Median at Risk-of-Poverty Gap (RPG) is the difference between RPT and the median income of poor units, expressed as a percentage of RPT:
\[ \mathrm{RPG} = \frac{\mathrm{RPT} - \hat{Y}_{0.5}^{poor}}{\mathrm{RPT}} \times 100 \qquad (8) \]
in equation (8) Ŷ0.5^poor is the estimated median income of poor units (units
with Ik=1, see equation (5));
• Income Quintile Share Ratio (QSR) is the ratio of total income received
by the 20% of the country’s population with the highest income (top quintile) to that received by the 20% of the country’s population with the lowest
income (lowest quintile):
\[ \mathrm{QSR} = \frac{\left( \sum_{k \in T} y_k w_k \right) / \left( \sum_{k \in T} w_k \right)}{\left( \sum_{k \in L} y_k w_k \right) / \left( \sum_{k \in L} w_k \right)} \qquad (9) \]
in equation (9) T is the set of sampled units with yk > Ŷ0.80 and L is the set
of sampled units with yk ≤ Ŷ0.20.
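The point estimates in equations (3)-(6), (8) and (9) can be sketched in a few lines. The sketch rests on assumptions that are not in the text: the weighted quantile is taken as the smallest value whose cumulative weight reaches the requested share (the official quantile convention may differ), at least one unit is assumed to be poor, and the names `weighted_quantile`, `weighted_mean` and `eu_indicators` are hypothetical.

```python
def weighted_quantile(y, w, beta):
    """Smallest y whose cumulative weight reaches beta * total weight.

    One common convention, assumed here for illustration.
    """
    target = beta * sum(w)
    cum = 0.0
    for yk, wk in sorted(zip(y, w)):
        cum += wk
        if cum >= target:
            return yk
    return max(y)


def weighted_mean(y, w, keep):
    """Weighted mean of y over the units flagged True in `keep`."""
    num = sum(yk * wk for yk, wk, f in zip(y, w, keep) if f)
    den = sum(wk for wk, f in zip(w, keep) if f)
    return num / den


def eu_indicators(y, w):
    rpt = 0.6 * weighted_quantile(y, w, 0.5)                        # equation (3)
    poor = [yk < rpt for yk in y]                                   # equation (5)
    rpr = 100.0 * sum(wk for wk, p in zip(w, poor) if p) / sum(w)   # equation (4)

    # Gini index, equation (6): units sorted by income, cumulative weights.
    cum_w = cross = squares = total_yw = 0.0
    for yk, wk in sorted(zip(y, w)):
        cum_w += wk
        cross += yk * wk * cum_w
        squares += yk * wk * wk
        total_yw += yk * wk
    gini = 100.0 * ((2.0 * cross - squares) / (cum_w * total_yw) - 1.0)

    # Relative median gap, equation (8): median income among poor units.
    poor_y = [yk for yk, p in zip(y, poor) if p]
    poor_w = [wk for wk, p in zip(w, poor) if p]
    rpg = 100.0 * (rpt - weighted_quantile(poor_y, poor_w, 0.5)) / rpt

    # Quintile share ratio, equation (9).
    q20 = weighted_quantile(y, w, 0.2)
    q80 = weighted_quantile(y, w, 0.8)
    qsr = (weighted_mean(y, w, [yk > q80 for yk in y])
           / weighted_mean(y, w, [yk <= q20 for yk in y]))
    return {"RPT": rpt, "RPR": rpr, "Gini": gini, "RPG": rpg, "QSR": qsr}


# Hypothetical toy sample: six persons with equal weights.
incomes = [10.0, 20.0, 30.0, 40.0, 50.0, 100.0]
weights = [1.0] * 6
result = eu_indicators(incomes, weights)
```

Every measure depends on at least one estimated quantile or on the ordering of the sample, which is why none of them can be handled as a simple total or mean.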
Linearization and resampling approaches were tested to evaluate the sampling
errors of these complex measures [MR05] [MPR06]. First, four of these
measures (at risk of poverty threshold, at risk of poverty rate, Gini
index, gender pay gap) were linearized by the estimating equations and
Taylor–Woodruff methods [KB97] [Dev99]; a resampling approach by the BRR
technique was then also considered, so that sampling errors could be
obtained experimentally³ both by linearized variables and by BRR [Mcc69]
[PR03] [SSW92]. The results of these applications (reported in Table 2) show
that the different approaches lead to similar values. The linearizations of
the ‘Relative median at risk of poverty gap’ and the ‘Income quintile share
ratio’ were provided by EUROSTAT, which recommended the use of this approach
for all the EU measures. Sampling errors of the EU measures have been
calculated using linearization since 2005, taking into account the results
in Table 2, the EUROSTAT recommendation and the computational workload of
resampling.
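As an illustration of the linearization idea, the sketch below applies a first-order Taylor expansion to the gender pay gap of equation (7). It is not necessarily the form used in the applications cited above. The variance formula assumes a single-stage, with-replacement design without stratification, which is a strong simplification of a real survey design, and all function names are hypothetical.

```python
import math


def group_mean(y, w, flag):
    """Weighted mean of y over units where `flag` is True, and the group weight."""
    size = sum(wk for wk, f in zip(w, flag) if f)
    return sum(yk * wk for yk, wk, f in zip(y, w, flag) if f) / size, size


def gpg(y, w, male):
    """Equation (7): gap between men's and women's means, in % of men's mean."""
    mean_m, _ = group_mean(y, w, male)
    mean_f, _ = group_mean(y, w, [not m for m in male])
    return 100.0 * (mean_m - mean_f) / mean_m


def gpg_linearized(y, w, male):
    """Linearized variable z_k: first-order effect of unit k's weight on GPG."""
    mean_m, n_m = group_mean(y, w, male)
    mean_f, n_f = group_mean(y, w, [not m for m in male])
    z = []
    for yk, mk in zip(y, male):
        if mk:
            z.append(100.0 * mean_f * (yk - mean_m) / (n_m * mean_m ** 2))
        else:
            z.append(-100.0 * (yk - mean_f) / (n_f * mean_m))
    return z


def variance_with_replacement(z, w):
    """Variance of the weighted total of z under the simplified design."""
    t = [wk * zk for wk, zk in zip(w, z)]
    n = len(t)
    mean_t = sum(t) / n
    return n / (n - 1) * sum((tk - mean_t) ** 2 for tk in t)


# Hypothetical toy sample: two men and two women, hourly earnings.
y = [20.0, 30.0, 15.0, 25.0]
w = [1.0, 1.0, 1.0, 1.0]
male = [True, True, False, False]

estimate = gpg(y, w, male)
z = gpg_linearized(y, w, male)
rel_error = 100.0 * math.sqrt(variance_with_replacement(z, w)) / abs(estimate)
```

Once the linearized variable is available, the non-linear statistic is treated as a weighted total of z, so the variance software for totals can be reused. This is the practical appeal of linearization compared with the repeated re-estimation required by resampling.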
4 Conclusions
Two different solutions were adopted for estimating the sampling variance
of the relative poverty measures and inequality indicators described above:
the standard methodology under simplification was preferred in the
Household Budget survey, and a linearization approach was implemented
in the EU-SILC survey. In addition to the results reported in sections 2-3,
the following reasons were taken into account in choosing a satisfactory
variance estimation solution:
1. the Household Budget survey has been carried out since 1997, so using
the standard methodology (under simplification) avoided severe changes to
the current data process; by contrast, the planning of the new survey
(EU-SILC) made it possible to build a more complex approach into the data
process;
2. more complex measures are estimated in the EU-SILC survey than the
single one disseminated from the Household Budget survey; simplification
can be reasonable for one measure but not for a whole set of complex
statistics;
3. the EU-SILC survey is carried out under European Regulation, so the
EUROSTAT recommendation has to be considered.

³ Data from the ECHP (European Community Household Panel survey) were used because those from EU-SILC were not available.
Table 2. Relative sampling errors (%) of EU measures by linearization and BRR

                                           relative sampling errors (%)
                                           linearization      BRR
At risk of poverty threshold                    1.72          1.73
At risk of poverty rate                         3.55          3.53
Gini index                                      1.71          1.60
Gender pay gap                                 30.51         32.61
Relative median at risk of poverty gap            --          6.61
Income quintile share ratio                       --          2.88
References
[CPR05] Coccia, G., Pannuzi, N., Rinaldelli, C.: Poor and non poor households:
the estimation from sample surveys. In: Book of Short papers,
Cladag2005. MUP editore, 69-72 (2005)
[DDF03] De Vitiis, C., Di Consiglio, L., Falorsi, S., Pauselli, C., Rinaldelli, C.: La
valutazione dell’errore di campionamento delle stime di povertà relativa.
Final report, Conference ‘Povertà Regionale ed Esclusione Sociale’,
Roma 17-12-03 (2003)
[Dev99] Deville, J.C.: Variance Estimation for complex statistics and estimators:
linearization and residual techniques. In: Survey Methodology, 25, 2,
193-203 (1999)
[FR98] Falorsi, S., Rinaldelli, C.: Un software generalizzato per il calcolo delle
stime e degli errori di campionamento. In: Statistica Applicata, 10, 2,
217-234 (1998)
[IST02] ISTAT: La stima ufficiale della povertà in Italia 1997-2000. Argomenti
24 (2002)
[IST05] ISTAT: GENESEES 3.0 Manuale Utente e Aspetti Metodologici (2005)
[KB97] Kovacevic, M.S., Binder, D.A.: Variance estimation for measures of income inequality and polarization – The estimating equations approach.
In: Journal of Official Statistics, 13, 1, 41-58 (1997)
[Mcc69] McCarthy, P.J.: Pseudoreplication: Further evaluation and application of
the balanced Half-Sample Technique. In: Vital and Health Statistics, Series 2, 31, National Center for Health Statistics, Public Health Service,
Washington, D.C. (1969)
[MR05] Moretti, D., Rinaldelli, C.: Variance estimation for relative poverty
measures and inequality indicators from complex sample surveys. In:
Atti del Quarto Convegno S.Co.2005, Cluep editrice Padova, 67-72
(2005)
[MPR06] Moretti, D., Pauselli, C., Rinaldelli, C.: La stima della varianza
campionaria di indicatori complessi di povertà e disuguaglianza. In: Statistica Applicata, to be printed (2006)
[PR03] Pauselli, C., Rinaldelli, C.: La valutazione dell’errore di campionamento
delle stime di povertà relativa secondo la tecnica Replicazioni Bilanciate
Ripetute. In: Rivista di Statistica Ufficiale, 2/2003, 7-22 (2003)
[Reg03] Regulation (EC) No 1177/2003 of the EUROPEAN PARLIAMENT and
of the COUNCIL of 16 June 2003 concerning Community statistics on
income and living conditions (EU-SILC), 3.7.2003 L 165/1 Official
Journal of the European Union
[Rin05] Rinaldelli, C.: Statistiche complesse e software. In: Statistica & Società,
3, 2, 01.2005, 27-29 (2005)
[SSW92] Särndal, C.E., Swensson, B., Wretman, J.: Model assisted survey sampling. New York Springer-Verlag (1992)
[Wol85] Wolter, K.M.: Introduction to variance estimation. New York Springer-Verlag (1985)