Public Service Spending: Efficiency and Distributional Impact — Lessons from Asia

Efficiency and equity are cornerstone concepts in rational service delivery in the public sector. This paper benchmarks efficiency and equity in public spending on health, education and social protection in a broad group of Asian Development Bank (ADB) member economies with varying levels of development. We describe public expenditure trends in health, education and social protection in the region. Following Herrera and Pang (2005), we conduct a formal efficiency benchmarking exercise using Data Envelopment Analysis and available input and output data from WDI, GFS, and ADB databases to deconstruct each member economy’s efficiency changes in health and education spending. We next turn to review service provision inequality within ADB economies using utilization rates and benefit incidence, and note the deficiency of pro-poor spending in some sectors.

Figures
4 Change in Government Education Spending (% of GDP)
5 Change in Government Education Spending (% of total expenditure)
6 Change in Government Health Spending (% of GDP)
7 Change in Government Health Spending (% of total expenditure)
8 Change in Government Social Security and Welfare Spending (% of GDP)
9 Change in Government Social Security and Welfare Spending (% of total expenditure)
10 Constant Returns to Scale and Variable Returns to Scale Data Envelopment Analysis
11 Illustration of Data Envelopment Analysis Methodology
12 Illustration of Data Envelopment Analysis, Variable Returns to Scale
13 Illustration of Input and Output Efficiency
14 Health Expenditure Efficiency by Region, 1995

The two major goals of public service spending are: (i) achieving targets (i.e., MDG goals) at the lowest cost through an optimal mix of inputs, that is, efficiency; and (ii) ensuring that public services reach those who need them most, such as the poorest segment of the population, or equity. Decisions of policymakers on where and how to spend have important implications on both efficiency and equity.
Efficiency consists of two components: allocative efficiency and technical efficiency. Allocative efficiency looks at finding the cost-minimizing mix of inputs to achieve a certain level of output. For instance, many studies in the 2006 Disease Control Priorities Network publication indicated that public health interventions are more cost-effective than curative care (inpatient and outpatient visits). Technical efficiency looks at minimizing the total cost of inputs to achieve a given level of output. Most of the literature on the efficiency of public expenditure focuses on this concept, particularly on differences in the cost of achieving a certain level of output, which may vary across countries or across subcategories in a given sector.
In many countries, the public sector is heavily involved in providing education and health care. Hence, to maximize return, it is crucial that countries provide these services at a certain level and quality, and at the minimum cost.
There is also concern regarding equity in public spending. On the one hand, this reflects the belief that the government's role in society involves some redistribution toward groups and areas that require development and aid. On the other, there is an increasing realization that allocative efficiency can be improved by allocating public money toward the poor or neglected groups.[1]

This paper is organized as follows. Section II presents trends on public expenditure in Asia. Section III shows a framework for analyzing expenditure efficiency using data envelopment analysis. Section IV looks at the distribution of public spending, and Section V concludes.

II. TRENDS IN PUBLIC SPENDING IN ASIA
On average, the share of the total government budget devoted to goods and services declined between 2000 and 2010 (Figure 1). Notable decreases are observed in the Kyrgyz Republic, Pakistan, and Maldives. Plotted against income per capita, the share decreases as income rises. Countries that are spending below the average for their income group are Nepal, India, Indonesia, and Japan. Outlier countries, where budget shares are two to three times those of other countries with similar incomes, are Afghanistan and Maldives.

[1] Reallocating expenditures and resources across rich and poor districts to lower average cost of provision is a theme in many Public Expenditure Reviews.

Note: x-axis log scale. See Appendix Table 3 for the definition of the codes. Source: World Bank, World Development Indicators (accessed 1 June 2013).
At least half of Asian countries spend less than 5% of their gross domestic product (GDP) on health (Figure 2). Studies (James, C. D., D. Bayarsaikhan, and H. Bekedam 2010; WHO 2010) have shown that, on average, countries that spend 5% or more of their GDP on health achieve better financial risk protection and exhibit good population health outcomes. Countries spending above 5% are predominantly island countries like the Marshall Islands, Tuvalu, Kiribati, and Palau as well as Organisation for Economic Co-operation and Development (OECD) countries like the Republic of Korea, Australia, and Japan. Countries with the lowest total health expenditure relative to their GDP per capita are Bangladesh, Pakistan, and Singapore. Spending increases with income for both total and public spending on health.

Note: See Appendix Table 3 for the definition of the codes. Sources: World Bank, World Development Indicators; World Health Organization, Global Health Expenditure database (both accessed 1 June 2013).
Unlike health, education expenditure as a share of GDP remains relatively constant across incomes (Figure 3). Asian countries spending above the average are Solomon Islands, the Kyrgyz Republic, and Mongolia, while Pakistan, Armenia, and Azerbaijan spent the least compared to other economies at the same income levels.

Figures 4 through 9 illustrate the changes in public expenditures in education, health, and social protection in various economies. Spending on public education shows some convergence: economies starting with a larger share before 2005 tend to see their spending (as a share of GDP or as a share of total government outlay) shrink, and those with a smaller share tend to see it rise. The education shares (of GDP and of the total budget) of Timor-Leste, Armenia, and Samoa grew over the decade, while those of Mongolia, Brunei Darussalam, and Azerbaijan fell.

Most economies in the World Development Indicators database experienced an increase in health spending as a percent of GDP, even those starting with a larger share (Figure 6). The countries that showed the largest increases are Thailand, Armenia, Samoa, and Kiribati, while Brunei Darussalam and Mongolia showed the largest contractions in health spending as a percent of GDP. As shares of government expenditure, Thailand, Cambodia, the Republic of Korea, and Kiribati showed the highest growth, while health seemed to be a diminishing priority of governments in Mongolia and Maldives (Figure 7).

Government social protection spending[2] showed very modest increases for most economies (Figures 8 and 9). Except for Mongolia, most economies experienced only a slight increase in social protection spending during the last decade. Sri Lanka and Azerbaijan exhibited contractions in the share of social protection in their budget and as a percent of GDP.

III. EFFICIENCY OF PUBLIC EXPENDITURE
There are a number of empirical benchmarking studies that focus on the efficiency of public expenditure on health and education. Most studies were concerned with expenditure efficiency (Herrera and Pang 2005) or technical efficiency using physical inputs (Afonso and St. Aubyn 2004). Hollingsworth (2003) wrote an exhaustive review of efficiency studies on health care. Thus, this section will focus only on recent empirical studies on both expenditure efficiency and technical efficiency. Most of the efficiency studies focused on relative efficiency at the cross-country level, with very few conducting single-country analysis. An overview of selected studies is presented in Table 1.
Expenditure Efficiency

Gupta and Verhoeven (2001) examined the efficiency of government expenditure on health in 37 African countries from 1984 to 1995. Using free disposal hull (FDH), they calculated efficiencies of African countries relative to each other, and relative to other countries in Asia and the Western Hemisphere. Per capita education and health spending by the government in purchasing power parity (PPP) terms was taken as the input measure. Health output measures included life expectancy; infant mortality; and diphtheria, pertussis, and tetanus (DPT) and measles immunization rates. Education output measures were primary school enrollment, secondary school enrollment, and adult illiteracy. They found that there was a wide variation in the way government spending impacted on health and education outcome indicators. While government expenditure was associated with relatively high educational attainment for Zambia, Guinea, Ethiopia, and Lesotho, the same was not true for Botswana, Cameroon, Côte d'Ivoire, and Kenya. They also concluded that on the average, African countries were less efficient in providing health and education services compared to countries in Asia and the Western Hemisphere.

De Sijpe and Rayp (2004) estimated government efficiency for 52 developing countries using Data Envelopment Analysis (DEA). Their input measure was central government expenditure per capita (in PPP). Outputs were infant mortality, immunization against measles, youth illiteracy rates, secondary enrollment, and government effectiveness. To allow for a lagged effect of public spending, they averaged expenditures over the period 1990-1994 and evaluated the outputs in the second half of the 1990s. Input efficiency score for the countries in the sample was on the average 0.50 implying that output indicators could be increased by 50%, keeping inputs fixed.
The People's Republic of China (PRC) and the Russian Federation were identified as countries on the frontier, followed closely by Sri Lanka and Thailand. To explain efficiency scores, they also estimated a semi-log model in which the efficiency score was the dependent variable. Explanatory variables included GDP per capita, percent of population aged 0-14, private health expenditure per capita, urban population, perception of corruption, rule of law, political stability, voice and accountability measures, ethnic fractionalization, political rights, civil liberties, political constraints, dummy for armed conflict, Official Development Assistance (ODA) per capita, dummy for the International Monetary Fund (IMF) program, money and quasi money growth, liquid liabilities as percent of GDP, export of goods and services as percent of GDP, and foreign direct investment (FDI) inflows as percent of GDP. They concluded that efficiency was affected primarily by governance and political variables such as rule of law and political instability. Also, countries with a large youth population, high adult illiteracy, and low private health spending found it difficult to register good health and education outcomes. Finally, economic variables, such as trade openness, FDI inflows, and ODA, did not seem to affect the efficiency of countries in providing services.

Afonso and St. Aubyn (2004) examined the efficiency of expenditures in the health and education sectors for a sample of OECD countries using FDH and DEA. They estimated efficiency frontiers using two kinds of inputs: (i) expenditure and (ii) quantifiable input measures such as instruction time in hours per year for 12- to 14-year-olds, number of teachers per student in public and private institutions for secondary education, inpatient beds, medical technology indicators, and health employment.
Output for education was measured by the performance of 15-year-old students in reading, math, and science literacy scales in 2000, while infant mortality and life expectancy were used as health outputs. They found that in general, DEA and FDH results were not very different, with efficient countries in DEA being a subset of those identified as efficient under FDH. Another finding was that efficiency attainments differed depending on whether inputs were measured in terms of financial resources or physical inputs. For instance, among OECD countries, Sweden was efficient when inputs were measured in physical terms, but became inefficient when measured in expenditure terms due to relatively higher prices in the country. On the other hand, the Czech Republic and Poland were shown to be spending efficiently, but were not technically efficient. The reason cited was the cheaper cost of labor in the two countries; thus, they became frontier countries when inputs were measured in financial terms.

Afonso, Schuknecht, and Tanzi (2005) computed public sector performance and public sector efficiency indicators for 23 OECD countries using the FDH. Included in their indicators were secondary school enrollment and educational achievement for the education sector, and infant mortality and life expectancy for the health sector. The United States, Japan, and Luxembourg were identified as the most efficient countries in utilizing public expenditures in producing social services outcomes.
Using DEA and FDH in the first stage, Herrera and Pang (2005) examined the efficiency of public spending in providing social services among developing countries. The input was public expenditure on health and education. Output indicators for education were primary school enrollment, secondary school enrollment, first and second level completion rates, and learning scores. Health output indicators were life expectancy at birth, DPT and measles immunization rates, and disability-adjusted life expectancy (DALE). They used Tobit analysis in the second stage to explain variations in efficiency. Among the variables used in explaining efficiency scores were wages and salaries as percent of total public expenditure, total government expenditure as percent of GDP, share of publicly financed expenditure in health and education, constant GDP per capita, urban population, Gini coefficient, ODA as percent of fiscal revenue, and prevalence of AIDS. Their main conclusion was that countries found to be inefficient usually had higher expenditure levels and wage bills, higher ratios of public to private financing of services provision, and higher inequality levels, as well as high aid dependency ratios.
Among the more popular applications of parametric methodologies was the worldwide assessment of the effectiveness of health care delivery carried out by the World Health Organization (WHO) and presented in its World Health Report in 2000. Based on the study of Evans et al. (2000), the report presented a ranking of productive efficiencies of health care systems in 191 countries. Evans et al. (2000) used a fixed effects stochastic frontier methodology for a 5-year panel covering the period 1993-1997. Per capita public and private health expenditures and average years of education of the population were used as inputs, and two measures of health care attainment, DALE and a composite measure of health care delivery, were used as outputs.[3] They found that Oman was the best performer in terms of DALE, while France performed best in the health care delivery composite. Among their conclusions was that, contrary to the popular belief that the PRC and Sri Lanka were efficient in providing health, they were in fact performing poorly compared to other developing countries. The poorest performers were those that had civil unrest during the study period and those with high AIDS prevalence.
The WHO report (2000), and subsequently the study of Evans et al. (2000), were met with many criticisms. One of the major criticisms was that the fixed effects model used did not capture the heterogeneity of the countries in the data. The wide variation in cultural and economic characteristics of the sample of countries produced a large amount of unmeasured heterogeneity in the data (Greene, 2003a). Hollingsworth and Wildman (2003) reestimated the rankings with DEA and Stochastic Frontier Analysis (SFA) using the same dataset. They noted that the WHO estimation procedure was too narrow so that contextual information was hidden by the use of only one method. The sample was also stratified by the OECD and non-OECD membership to determine the impact of more developed countries in the sample. They concluded that non-OECD countries showed more variation than OECD countries; therefore, it was important that the whole sample be divided into countries with similar characteristics.
Greene (2003a) also reestimated the study using the same dataset with recently developed alternatives to the SFA, which allowed for the incorporation of heterogeneity. He found that the results differed substantially from the WHO estimates when heterogeneity was taken into account. With DALE as an output, Japan was identified as the best performer rather than Oman, while Greece was identified as the best performer in the health delivery composite instead of France. Such conflicting findings illustrate the difficulty of analyzing cross-country data.
One of the few studies that focused on an individual country was the study of Sampaio de Sousa and Stosic (2005) on Brazilian municipalities. Using DEA and FDH, they evaluated how public resources were utilized by local governments in a decentralized environment. Their input indicators were current spending, number of teachers, infant mortality rate, and hospital and health services. The output indicators were literate population, enrollment per school, student attendance per school, students who got promoted to the next grade per school, students in the right grade per school, and households with access to safe water, sewerage system, and garbage collection. Their main conclusion was that smaller municipalities tended to be less efficient than the larger ones.

A. Data Envelopment Analysis
The most widely used nonparametric approach for benchmarking is data envelopment analysis (DEA). Two decades after Farrell's (1957) proposal of a piecewise linear convex hull approach to frontier estimation, a study by Charnes, Cooper, and Rhodes (1978) proposed a method of estimation. DEA involves the use of linear programming techniques to determine which firms form an envelopment surface or efficient frontier. Firms are considered efficient if there are no other firms, or linear combinations of firms, which produce more of at least one output (given the inputs) or use less of at least one input (given the outputs). The firms that lie on the surface are considered efficient, whereas the firms below the surface are termed inefficient, and their distance to the frontier provides a measure of their relative efficiency or inefficiency.
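The dominance idea behind this definition can be sketched in a few lines of Python. The sketch is a simplification under stated assumptions: it compares each firm only with other single firms (the free disposal hull logic used in some of the studies reviewed above) and ignores linear combinations of firms, which DEA also admits. The firm labels and the input and output values are hypothetical and are not taken from the paper's data.

```python
# Hypothetical firms: name -> (inputs, outputs). Values are illustrative only.
firms = {
    "A": ((2,), (10,)),
    "B": ((4,), (20,)),
    "C": ((3,), (30,)),
    "D": ((5,), (40,)),
    "E": ((6,), (50,)),
}


def dominates(a, b):
    """True if a uses no more of every input and produces no less of every
    output than b, with a strict improvement in at least one dimension."""
    (xa, ya), (xb, yb) = a, b
    no_worse = (all(p <= q for p, q in zip(xa, xb))
                and all(p >= q for p, q in zip(ya, yb)))
    strictly_better = (any(p < q for p, q in zip(xa, xb))
                       or any(p > q for p, q in zip(ya, yb)))
    return no_worse and strictly_better


def undominated(units):
    """Names of the units that no other single unit dominates."""
    return [name for name, unit in units.items()
            if not any(dominates(other, unit)
                       for other_name, other in units.items()
                       if other_name != name)]


print(undominated(firms))  # B is dropped because C dominates it
```

In this hypothetical data, firm D is not dominated by any single firm, yet an equal mix of C and E would produce 40 units of output from 4.5 units of input. DEA, which allows such combinations, would therefore treat D as inefficient, in line with the finding reported earlier that DEA-efficient units are a subset of FDH-efficient units.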
The original Charnes, Cooper, and Rhodes (1978) specification of the ratio form of DEA was:

$$
\max_{u,v}\; h_0 = \frac{\sum_{r=1}^{s} u_r y_{r0}}{\sum_{i=1}^{m} v_i x_{i0}}
\quad \text{subject to} \quad
\frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1,\; j = 1, 2, \ldots, n; \qquad
u_r, v_i \ge \varepsilon > 0. \tag{1}
$$

The relative performance of a unit (referred to as a decision making unit or DMU in DEA literature) was evaluated based on the observed performance of the other units $j = 1, 2, \ldots, n$.

Observed amounts of output and input were represented by $y_{rj}$ and $x_{ij}$, respectively. In the specification, $y_{rj}$ and $x_{ij}$ were constants representing the outputs and inputs of the jth firm, which utilizes $i = 1, 2, \ldots, m$ inputs to produce $r = 1, 2, \ldots, s$ outputs. The u's and v's are the variables of the problem and were constrained to be greater than or equal to some small positive quantity $\varepsilon$ in order to avoid any input or output being ignored in computing the efficiency. The solution to the above model gave a value $h_0$, the efficiency of the unit being evaluated. If $h_0 = 1$, then the unit was efficient relative to the others. If it was less than 1, then some other units were more efficient than this unit, even though each unit determines its own most favorable set of weights. This flexibility was viewed as a weakness because even the judicious choice of weights by a unit, which is unrelated to the value of any input or output, may allow a unit to appear efficient. Another problem was that the ratio form has an infinite number of solutions: if $(u^*, v^*)$ is a solution, then $(\alpha u^*, \alpha v^*)$ is also a solution for any $\alpha > 0$ (Coelli 1996).
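To avoid both problems, the 1981 study of Charnes et al. imposed the constraint $\sum_{i=1}^{m} v_i x_{i0} = 1$, which led to the following specification:

$$
\max_{u,v}\; h_0 = \sum_{r=1}^{s} u_r y_{r0}
\quad \text{subject to} \quad
\sum_{i=1}^{m} v_i x_{i0} = 1; \qquad
\sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0,\; j = 1, 2, \ldots, n; \qquad
u_r, v_i \ge \varepsilon. \tag{2}
$$

Using the duality in linear programming, an equivalent envelopment form of this multiplier form was derived (written here without the $\varepsilon$-weighted slack terms):

$$
\min_{\theta,\lambda}\; \theta
\quad \text{subject to} \quad
\sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0},\; r = 1, \ldots, s; \qquad
\theta x_{i0} - \sum_{j=1}^{n} \lambda_j x_{ij} \ge 0,\; i = 1, \ldots, m; \qquad
\lambda_j \ge 0,\; j = 1, \ldots, n. \tag{3}
$$

The dual variables $\lambda_j$ of the envelopment form were the shadow prices related to the constraints limiting the efficiency of each unit to be less than or equal to 1. The value of $\theta$ solved will be the efficiency score for the firm under evaluation, with a value of 1 indicating a point on the frontier and, therefore, a technically efficient firm, following Farrell's definition. The linear programming problem was solved n times, once for each firm in the sample, and an efficiency value was obtained for each one. This envelopment form was specified as an input orientation which assumed constant returns to scale (CRS).[4]

The CRS assumption is appropriate only when all firms are operating at an optimal scale. Many factors, such as imperfect competition and financial constraints, however, may lead a firm not to operate at the optimal scale (Coelli 1996). In their 1984 paper, Banker, Charnes, and Cooper suggested an extension of the CRS-DEA model to account for situations with variable returns to scale (VRS). The specification was:

$$
\min_{\theta,\lambda}\; \theta
\quad \text{subject to} \quad
\sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0},\; r = 1, \ldots, s; \qquad
\theta x_{i0} - \sum_{j=1}^{n} \lambda_j x_{ij} \ge 0,\; i = 1, \ldots, m; \qquad
\sum_{j=1}^{n} \lambda_j = 1; \qquad
\lambda_j \ge 0,\; j = 1, \ldots, n. \tag{4}
$$

[4] The envelopment form of the model is generally the preferred form to solve DEA.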
The additional constraint imposed ensured that a firm was compared against other firms with similar size. The use of CRS-DEA even when firms were not operating on an efficient scale resulted in technical efficiency (TE) measures that included scale efficiencies (SE). The use of VRS-DEA allowed the separation of TE and SE.
Figure 10 below illustrates CRS and VRS concepts. The CRS frontier is the straight line OC with firm F being on the efficient frontier. The line VV is the VRS frontier which allows the optimal level of outputs and inputs to vary with the size of the firm in the sample.
The DEA approach will be illustrated using hypothetical data for provinces in Table 2. The provinces in the sample have a medical staff ranging from 100 to 300 and barangay health stations (BHS) from 100 to 600. The number of treated patients in 1 month ranges from 50 to 150. To compare the five provinces, the inputs were translated into the number of treated mothers and children per input (represented by columns 4 and 5 in the table). Given variations in inputs and outputs, it is difficult to facilitate comparison by numbers alone.

The figure below (Figure 11) plots the data for medical staff and health stations per treated mothers and children. Provinces 1, 4, and 5, which are closest to the origin, are identified as the most efficient, meaning they are able to treat the largest number of patients given their relatively smaller inputs. Provinces 2 and 3 are considered less efficient because, when compared to other provinces, they can still reduce their input use given their current output. The TE score of province 2 is shown in the figure as line segment 2'2. Its numerical value is 0.67, meaning the province could reduce its input usage by 33% and still treat 200 patients. To operate at the hypothetical point 2', it needs to reduce its medical staff to 200 and its health stations to 400. Point 2' is derived from the combination of provinces 1 and 5, the provinces with the most similar production structure to province 2. Identification of these "peers" is one of the advantages of DEA for benchmarking purposes because it allows comparison of similar units.
Province 3 is another inefficient unit that needs to reduce its input usage. Province 4 has the most similar production characteristics to province 3 among the units in the data. Province 3 has a TE score of 0.75, meaning its input use has to be reduced by one-fourth to reach the hypothetical point 3'. Medical staff at point 3' is 187.5 and the number of health stations is 75. However, point 3' is not yet fully efficient because the number of medical staff can be reduced further, to that of province 4, while keeping the number of health stations constant. Thus, to be fully efficient, province 3 has to reduce one of its inputs by more than one-fourth: medical staff has to be reduced by a further 37.5, to 150. This reduction is called input "slack" in DEA literature.
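The radial contraction and the input slack described above can be reproduced with a short Python sketch. It rests on several assumptions: the five provinces below are hypothetical values chosen for illustration and are not the entries of Table 2, the function name `crs_input_te` is ours, and the frontier is found by enumerating single peers and pairs of peers (which suffices for two inputs and one output under CRS) rather than by the linear programs a DEA package would solve.

```python
from itertools import combinations

# Hypothetical provinces: (medical staff, health stations, treated patients).
provinces = {
    1: (100, 400, 100),
    2: (300, 600, 150),
    3: (320, 50, 40),
    4: (200, 50, 50),
    5: (200, 200, 100),
}


def crs_input_te(data, k):
    """Input-oriented CRS technical efficiency for two inputs, one output."""
    # Work with inputs per unit of output, as in the illustration.
    pts = [(x1 / y, x2 / y) for x1, x2, y in data.values()]
    x1k, x2k, yk = data[k]
    a, b = x1k / yk, x2k / yk
    # Candidate 1: a single peer, scaled until it uses no more of either input.
    best = min(max(p / a, q / b) for p, q in pts)
    # Candidate 2: a mix of two peers lying on the ray from the origin to k.
    for (p1, q1), (p2, q2) in combinations(pts, 2):
        denom = (q2 - q1) * a - (p2 - p1) * b
        if denom == 0:
            continue  # segment is parallel to the ray
        t = (p1 * b - q1 * a) / denom
        if 0 <= t <= 1:
            best = min(best, (p1 + t * (p2 - p1)) / a)
    return best


scores = {k: crs_input_te(provinces, k) for k in provinces}
for k, te in scores.items():
    print(f"province {k}: TE = {te:.3f}")

# Slack for province 3, whose only peer in this hypothetical data is province 4.
staff3, stations3, patients3 = provinces[3]
radial_staff = scores[3] * staff3                      # staff after radial contraction
peer_staff = provinces[4][0] * patients3 / provinces[4][2]  # peer scaled to same output
staff_slack = radial_staff - peer_staff
print(f"province 3 staff slack = {staff_slack:.1f}")
```

In this data, provinces 1, 4, and 5 score 1, province 2 is projected onto a mix of provinces 1 and 5, and province 3 still carries a staff slack after its radial contraction, mirroring the pattern described in the text even though the numbers differ.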
It is easy to illustrate and compute the methodology when the data structure is simple, in this case, two inputs and one output. In the presence of economies of scale and multiple outputs and inputs, the analysis becomes so complicated that automated computation is necessary.
The illustration presented above depicts efficiency scores assuming CRS, implying that the size of the province is not relevant to its efficiency. This assumption might not be relevant to the health sector because of the presence of overheads. The VRS assumption of DEA allows the frontier to vary with the size of the units in the sample. This concept will be illustrated using hypothetical data on rural health centers (RHCs) presented in Table 3.

Figure 12 below illustrates both CRS and VRS frontiers using one input and one output. The CRS frontier is represented by the line 0C, which depicts the highest attainable output when the size of the health center is not considered. Line V'V is the VRS frontier. It passes through health centers 1, 3, and 5, the units with the highest prenatal care to medical staff ratios, given adjustments in the size of health centers. The distance between the VRS and CRS frontiers represents the scale efficiency of each unit.
It should be noted that the CRS-TE can be obtained by multiplying the VRS-TE and the SE. In other words, the CRS-TE is decomposed into pure TE (the VRS-TE) and SE. From Table 4, it can be seen that only health center 3 is efficient under both assumptions, implying that it has the optimal scale among the units in the sample. Health centers 1 and 5 are technically efficient but scale inefficient, with RHC 1 operating under increasing returns to scale and RHC 5 operating under decreasing returns to scale. RHCs 2 and 4 are both technically and scale inefficient.

Increasing returns to scale implies that there is room to increase size because when inputs are doubled, the resulting output is more than doubled. Decreasing returns to scale, on the other hand, implies that the unit is operating above the optimal level: a proportional increase in inputs leads to a less than proportional increase in output. This suggests that these units are potential candidates for downsizing.
DEA is also able to calculate both input efficiency scores and output efficiency scores. Input efficiency implies finding the least amount of input that can produce a given output level. Thus, when the major concern is to save cost, estimating input efficiency scores is a good choice. On the other hand, output efficiency means finding the highest possible output that can be attained without having to increase any of the inputs. Input and output efficiencies are illustrated in Figure 13 below using RHC 2 as an example. Assuming VRS, the input efficiency of RHC 2 is the ratio of the frontier input level at its current output to its observed input level. Output efficiency is the ratio of its observed output to the frontier output level at its current input. Units identified as efficient remain efficient regardless of the orientation chosen. For inefficient units, however, the TE values will be different. For instance, the input efficiency score of RHC 2 is 0.625, while its output efficiency score is 0.545. The input efficiency score implies that RHC 2 can reduce its medical staff by 37.5% and still give prenatal care to its current level of 20 mothers. The output efficiency score means that, with its current medical staff of 4, RHC 2 delivers only about 55% of the attainable prenatal care, a shortfall of about 45% relative to the frontier.
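The decomposition of CRS-TE into VRS-TE and SE, and the difference between the two orientations, can be checked with a small Python sketch. The data are an assumption: Table 3 is not reproduced here, so the five (medical staff, mothers given prenatal care) pairs below are hypothetical values chosen only to agree with the figures quoted in the text for RHC 2. The function names are ours, and the frontier is found by enumeration, which is valid for one input and one output, rather than by linear programming.

```python
# Hypothetical RHC data: id -> (medical staff, mothers given prenatal care).
# Assumed values, picked to match the RHC 2 figures quoted in the text.
rhc = {1: (2, 10), 2: (4, 20), 3: (3, 30), 4: (5, 40), 5: (6, 50)}


def crs_te(k):
    """CRS technical efficiency: own output/input ratio relative to the best."""
    best_ratio = max(y / x for x, y in rhc.values())
    x, y = rhc[k]
    return (y / x) / best_ratio


def vrs_min_input(y0):
    """Smallest input on the VRS frontier that still yields at least y0."""
    pts = list(rhc.values())
    candidates = [x for x, y in pts if y >= y0]
    for xa, ya in pts:
        for xb, yb in pts:
            if ya < y0 <= yb:  # mix of two units producing exactly y0
                t = (y0 - ya) / (yb - ya)
                candidates.append(xa + t * (xb - xa))
    return min(candidates)


def vrs_max_output(x0):
    """Largest output on the VRS frontier using at most x0 of the input."""
    pts = list(rhc.values())
    candidates = [y for x, y in pts if x <= x0]
    for xa, ya in pts:
        for xb, yb in pts:
            if xa <= x0 < xb:  # mix of two units using exactly x0
                t = (x0 - xa) / (xb - xa)
                candidates.append(ya + t * (yb - ya))
    return max(candidates)


results = {}
for k, (x, y) in rhc.items():
    vrs_in = vrs_min_input(y) / x    # input-oriented VRS-TE
    vrs_out = y / vrs_max_output(x)  # output-oriented VRS-TE
    crs = crs_te(k)
    results[k] = {"crs": crs, "vrs_in": vrs_in, "se": crs / vrs_in,
                  "vrs_out": vrs_out}
    print(k, {name: round(value, 3) for name, value in results[k].items()})
```

Under these assumed values, RHC 2 obtains an input-oriented score of 0.625 and an output-oriented score of about 0.545, RHCs 1, 3, and 5 lie on the VRS frontier, and only RHC 3 has a scale efficiency of 1.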
The peers for RHC 2 are also different. Under input orientation, the peers of RHC 2 are health centers 1 and 3, and under output orientation, they are 3 and 5. That health center 3 is a peer under both orientations implies that health centers 2 and 3 have very similar production compositions.

Fried, Lovell, and Schmidt (1993); Coelli, Rao, and Battese (1998); and Bhat, Verma, and Reuben (2001) provide a list of questions which DEA can help answer:
• How can appropriate models which will serve as benchmarks be selected?
• Which units are the most efficient?
• If all the units were to perform according to the best practice, how much more output could be produced and how much inputs could be reduced?
• What are the characteristics of the efficient units?
• What is the optimum scale of operations and how much can be saved if the units are operating at the optimal point?
• How can the differences in scale of operations be accounted for in performance benchmarking?

By calculating the relative efficiencies of each unit, DEA is useful in identifying efficient units that will serve as models, which other units can follow. By providing input and output projections, it can guide policymakers as to what the appropriate targets are to improve performance.

Bowlin (2000) outlined the advantages and disadvantages of DEA. The first advantage of DEA is that, unlike traditional regression approaches, it does not require explicit specification of the functional forms relating inputs to outputs. More than one cost or production function is admitted, and the solution can be interpreted as providing a local approximation to whatever function is applicable based on outputs and inputs of the firms being evaluated. DEA is therefore more flexible in recognizing differences in production functions between firms.
Second, DEA is oriented toward individual firms in which it conducts n optimizations, one for each firm, in place of the single optimization that is usually associated with the regressions used in traditional efficiency analyses. Thus, the solution obtained from DEA is unique for each firm under evaluation. Third, a deficiency of all of the regression approaches is the inability to identify sources of inefficiency and to estimate the amounts of inefficiency associated with these sources. There is, therefore, no inference as to how corrective action will be provided even when the results show that inefficiencies are indeed present. DEA provides both the sources (input and output) and amounts of any inefficiency. Finally, DEA can also examine the effect of environmental variables which can further enhance the analysis when comparing heterogeneous firms (Yaisawarng and Klein 1994).
Among the drawbacks of DEA is that there is no consideration of random error, that is, no error term in the model as there is in regression. Thus, DEA may tend to confuse random fluctuations with inefficiencies represented in the data, and the estimates lack statistical properties, making hypothesis testing impossible. Also, since a subset of the available data defines the efficient frontier, while the rest of the observations have no impact on the shape of the envelopment surface, the results are very sensitive to measurement errors in the frontier firms. Further, the number of efficient firms on the frontier is sensitive to the number of inputs and outputs. As the ratio of the number of variables to the sample size grows, the ability of DEA to discriminate among firms is sharply reduced, because it becomes more likely that a certain firm will find some set of weights to apply to its outputs and inputs which will make it appear efficient (Yunos and Hawdon 1997). For instance, a number of firms might be labeled 100% efficient not because they are really more efficient than other firms, but just because there are no other firms or combinations of firms against which they can be compared when there are many dimensions of comparison.
DEA was chosen as the main methodology in this paper because, based on previous studies, it is more suitable for the health sector than Stochastic Frontier Analysis (SFA). Many studies have compared DEA and SFA using simulated data. Gong and Sickles (1992) found that SFA performs better than DEA when the assumed technology and inefficiency distribution closely follow those used in the data generating process. However, when the underlying technologies and efficiency distributions are unknown, DEA performs better than SFA. Even in the presence of heteroscedasticity, DEA-based estimators are bound to give better results (Banker, Chang, and Cooper 2004). According to Resti (2001), DEA also performs better when the dataset is composed of small samples.

B. Malmquist Data Envelopment Analysis
This section is based on the discussion of the Malmquist index in Coelli, Rao, and Battese (1998). With panel data, it is possible to run a variant of DEA which calculates the Malmquist Total Factor Productivity (TFP) index to measure productivity change, in order to provide information about how the best practice frontier has moved over time. The Malmquist TFP index measures the TFP change between two data points by calculating the ratio of the distances of each data point relative to a common technology. The Malmquist (input-oriented) TFP change index between period s (the base period) and period t is given by
$$M_i(y_s, x_s, y_t, x_t) = \left[ \frac{d_i^s(y_t, x_t)}{d_i^s(y_s, x_s)} \times \frac{d_i^t(y_t, x_t)}{d_i^t(y_s, x_s)} \right]^{1/2} \tag{5}$$
where the notation $d_i^s(y_t, x_t)$ represents the distance from the period t observation to the period s technology. A value of $M_i$ greater than one indicates positive TFP growth from period s to period t, while a value less than one indicates a TFP decline. This index is the geometric mean of two TFP indices, with the first evaluated with respect to period s technology, and the second with respect to period t technology. The index can also be specified as:
$$M_i(y_s, x_s, y_t, x_t) = \frac{d_i^t(y_t, x_t)}{d_i^s(y_s, x_s)} \left[ \frac{d_i^s(y_t, x_t)}{d_i^t(y_t, x_t)} \times \frac{d_i^s(y_s, x_s)}{d_i^t(y_s, x_s)} \right]^{1/2}$$
where the ratio outside the square brackets measures the change in the input-oriented measure of TE between periods s and t. Thus, efficiency change is the ratio of the TE in period t to the TE in period s. The ratios inside the brackets measure technical change or frontier shift, which captures the shifts in the technology frontier. It is the geometric mean of the shift in technology between the two periods, evaluated at $x_t$ and also at $x_s$. To specify the decomposition,
$$\text{Efficiency change} = \frac{d_i^t(y_t, x_t)}{d_i^s(y_s, x_s)}$$
$$\text{Technical change} = \left[ \frac{d_i^s(y_t, x_t)}{d_i^t(y_t, x_t)} \times \frac{d_i^s(y_s, x_s)}{d_i^t(y_s, x_s)} \right]^{1/2}$$
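The decomposition described in this section can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `malmquist` and the four distance values are hypothetical and are not taken from the paper's data. In practice, each distance would come from a DEA-like linear program.

```python
import math

def malmquist(d_s_s, d_s_t, d_t_s, d_t_t):
    """Malmquist TFP change and its two components.

    d_a_b is the distance of the period-b observation measured
    against the period-a technology (hypothetical naming).
    """
    # Efficiency change: TE in period t relative to TE in period s
    eff_change = d_t_t / d_s_s
    # Technical change: geometric mean of the frontier shift,
    # evaluated at the period-t and the period-s observations
    tech_change = math.sqrt((d_s_t / d_t_t) * (d_s_s / d_t_s))
    # TFP change is the product of the two components
    tfp_change = eff_change * tech_change
    return eff_change, tech_change, tfp_change

# Hypothetical distances for one country (not from the paper)
effch, techch, tfpch = malmquist(d_s_s=0.80, d_s_t=0.90, d_t_s=0.75, d_t_t=0.85)

# The product form must equal the geometric-mean form of the index
direct = math.sqrt((0.90 / 0.80) * (0.85 / 0.75))
print(round(effch, 4), round(techch, 4), round(tfpch, 4))
```

A TFP change above one would be read as productivity growth, and the two components show whether it came from catching up to the frontier or from a shift of the frontier itself.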

V. THE EFFICIENCY OF SOCIAL SERVICES EXPENDITURE IN ASIAN COUNTRIES
Until a country reaches a level of social services provision that is accessible to everyone, the correlation between expenditures and outcomes is high. If there are major inequalities in income within the country, the poorer segment of the population can only rely on government facilities. Thus, when the government cuts its expenditure on health and education, the ones in great need are gravely affected.
This section attempts to measure the efficiency of Asian countries in utilizing public resources for health and education. A major problem in many countries is the allocation of scarce government resources to provide social services in the most efficient manner. The importance of examining public sector expenditure efficiency is particularly pronounced when countries are experiencing fiscal deficits. When services are publicly provided, performance measurement becomes an indispensable management tool. The government needs to identify poorly performing units, since the market mechanism cannot weed them out. When inefficiency continues, the constituents of that inefficient unit suffer.

A. Data
The input utilized is health and education expenditure per capita. The outputs considered are health and education outcomes. For health, the outputs are life expectancy and DPT and measles immunization rates, and data are available for the period 1995-2010. For education, the outputs are the completion rates for primary and secondary education (Table 5). Due to data availability issues in education completion data, we averaged the years from 2006 to 2012 to perform the DEA.

B. Issues in the Estimation
Cross-country comparison suffers from factor heterogeneity problems. A main concern in this paper is that financially richer countries will have better outcomes. Tobit analysis on the efficiency scores will be conducted to assess the effects of income and other environmental variables.
When there is higher private expenditure on health and education in some countries, and the private sector provides better service, the efficiency score might be higher. As proxy for private expenditure, the percentages of private spending on health and education are included in the Tobit analysis.
Finally, as with most studies on health and education, the indicators used for outcomes or outputs might not necessarily reflect how healthy the people are or how much actual learning takes place in the country. It should be emphasized that it is not simple to identify the effects of public sector spending on outcomes accurately. It is difficult to assess to what extent a benchmark level of life expectancy is due to government intervention as opposed to other factors such as dietary habits and healthy lifestyle (Afonso, Schuknecht, and Tanzi 2005).

C. Health and Education Efficiencies
In 2010, Asian countries could, on average, have used 93% of their budget to attain their current level of health outcomes. In terms of output efficiency, the average score of 0.96 implies that, with the same level of health expenditure per capita, the three outcome indicators could be increased by a further 4% if all countries were expenditure efficient (Table 6). Singapore, Fiji, Vanuatu, and Thailand dominate the DEA frontier for health expenditure estimation. On average, countries could have spent only 78%-98% of their budget to achieve current life expectancy, measles and DPT immunization rates.
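To make the interpretation of an input efficiency score concrete, the following Python sketch solves the input-oriented, variable returns to scale (VRS) DEA linear program with SciPy's `linprog`. It is a minimal sketch under stated assumptions: the function name `dea_input_vrs` and the four-country dataset (one input, one output) are hypothetical and are not the paper's data or software.

```python
import numpy as np
from scipy.optimize import linprog

def dea_input_vrs(X, Y):
    """Input-oriented VRS DEA scores.

    X: (n, m) array of inputs, Y: (n, s) array of outputs.
    For each unit o, minimize theta subject to
      sum_j lambda_j * x_j <= theta * x_o
      sum_j lambda_j * y_j >= y_o
      sum_j lambda_j = 1, lambda_j >= 0
    """
    n, m = X.shape
    s = Y.shape[1]
    scores = np.empty(n)
    for o in range(n):
        # Decision variables: [theta, lambda_1, ..., lambda_n]
        c = np.r_[1.0, np.zeros(n)]
        A_in = np.c_[-X[o].reshape(m, 1), X.T]    # input constraints
        A_out = np.c_[np.zeros((s, 1)), -Y.T]     # output constraints
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.r_[np.zeros(m), -Y[o]]
        A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)  # VRS convexity constraint
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores[o] = res.x[0]
    return scores

# Hypothetical data: spending per capita (input) and life expectancy (output)
X = np.array([[50.0], [100.0], [200.0], [150.0]])
Y = np.array([[60.0], [70.0], [75.0], [65.0]])
scores = dea_input_vrs(X, Y)
print(scores)  # a score of 0.5 means the same output is attainable with half the input
```

Read this way, an average input efficiency score of 0.93 corresponds to a potential input saving of 7% at unchanged outcomes.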
Brunei Darussalam, the Republic of Korea, Palau, and Sri Lanka were tagged as efficient only because they do not have peers in the group. The methodology is unable to determine whether they are indeed efficient or simply have different input and output structures compared to the rest. Appendix Table 1 contains the year-by-year DEA results in health from 1995 to 2010, and Appendix Table 2 contains the list of peer economies.
For education, the DEA was performed using the average over 2006 to 2012 due to sparseness of data for education completion rates (Table 7). For the countries included in the analysis, input-oriented DEA indicates that this set of countries overspends by 27% to achieve its average level of output over this time period. In terms of output, the set can raise its output (here, a combination of primary and lower secondary school completion rates) by 6%. The input-oriented DEA tagged Bangladesh, Nepal, and Cambodia as efficient among their peers, but the output-oriented DEA did not. The output-oriented DEA tagged Maldives and Samoa as efficient among their peers, but the input-oriented DEA did not. Table 8 below shows the most efficient and least efficient countries by outcome and by methodology. What drives these efficiency scores will be explored further in the Tobit analysis.

D. Efficiency Change over Time
Health input efficiency scores are aggregated by region for the period 1995-2010 (Figure 14). The region with the highest input efficiency score is East Asia, followed by the Pacific and Southeast Asia. South Asia and Central Asia are the regions where the most input-inefficient countries are located. The Malmquist index makes it possible to distinguish whether productivity change between periods is due to changes in efficiency or changes in technology. The Malmquist TFP calculations are based on DEA-like linear programs, as described in the methodology section. The Malmquist TFP result for health expenditure is presented below (Table 9). The overall TE change (shown in column 2) represents changes in TE (position relative to the frontier), and this is made up of pure TE change (column 4) and SE change (column 5). The technical change index number (column 3) indicates how far the frontier against which TE is assessed has moved (frontier shift). Overall TFP growth (column 6) is a combination of TE change (column 2) and frontier shift or technical change (column 3).
The interpretation of the Malmquist index numbers presented in Table 9 is explained using Afghanistan as an example. Afghanistan had a TFP growth of 1.2% from 1995 to 2010 (represented by the index number of 1.012 in column 6). This is made up of an overall TE change of 1.003 and a technical change of 1.01. The overall TE change can be further decomposed into a pure TE change of 0.988 and an SE change of 1. On average, TFP increased by 0.2% from 1995 to 2010. There was an improvement in TE of 0.5%, implying that countries that were inefficient in 1995 had slightly better efficiency scores in 2010. TFP growth was held back by the low technical change score of 0.996. This implies that the direction of the frontier shift was inwards: countries on the frontier increased health expenditure levels but health outcome increases were slow. This also indicates that the efficiency improvements at the frontier were much slower than the rate of improvement in less efficient countries.
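As a quick check on how the columns of Table 9 combine, the TFP index should be approximately the product of the TE change and the technical change. The worked arithmetic below uses only the figures quoted in the text and assumes that the published components are rounded, so small discrepancies in the last digit are expected.

```latex
\begin{align*}
\text{TFP change} &= \text{TE change} \times \text{technical change} \\
\text{Afghanistan:}\quad 1.003 \times 1.010 &\approx 1.013 \quad (\text{reported: } 1.012) \\
\text{Average:}\quad 1.005 \times 0.996 &\approx 1.001 \quad (\text{reported: about } 1.002)
\end{align*}
```

The average case shows the mechanism described in the text: a 0.5% efficiency gain is almost entirely offset by a technical change index below one.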
For education, the Malmquist index was calculated using two time periods, averaging educational completion rates from 1990 to 2005 and from 2006 to 2012. This averaging was necessary to maximize the number of countries included in the benchmarking exercise. In Table 10, total factor productivity fell by an average of almost 50% between these two time periods. There are differences across countries: some countries exhibited efficiency increases over time and others did not. The reason why TFP fell for all countries is that technical change is negative (an inward shift of the frontier) and overwhelmed even those countries that posted efficiency gains. This is reflected in the raw data. Public education budgets grew significantly during the period, while improvements in primary and lower secondary completion rates were small or nonexistent.
Performance varies among countries. Among the countries on the frontier, Singapore and Vanuatu had an increase in TFP of 1.1% from 1995 to 2010. Afghanistan was identified as one of the most input-inefficient countries in 2010. Table 11 shows, however, that from 1995 to 2010 Afghanistan increased its life expectancy by 4 years, its DPT immunization rate by 46 percentage points, and its measles immunization rate by 21 percentage points, despite a relatively modest increase in government health expenditure per capita. This is in contrast with other countries that were inefficient in 1995, such as India, Indonesia, Nepal, and Azerbaijan. Their health budgets increased, but their relative increase in outcomes over the past 15 years has been slow.
Looking at education's productivity change in Table 12, we find TFP decreases for frontier and inefficient countries alike. The smallest decrease is in Bhutan, which was found to be input and output inefficient. Compared to the other countries in the table, including the frontier countries, Bhutan increased its primary and lower secondary completion rates by a significant margin (48% to 85% and 20% to 50%, respectively). In contrast, Vanuatu's primary completion rate fell, while its spending on education almost doubled ($68 to $103 per capita). This illustrates the usefulness of the Malmquist index in efficiency benchmarking. By comparing productivity changes in two or more periods, the truly efficient and inefficient countries can be identified.

E. Explaining Efficiency Scores
This section seeks factors that explain differences in efficiency scores among countries. Using regression techniques, it identifies statistical associations between efficiency scores and environmental variables. Environmental variables are factors that are not included in the efficiency estimation but might influence efficiency scores. The environmental variables used as regressors are GDP per capita, the percentage of total health expenditure coming from external funds, and the percentage of paved roads.
A major concern in this kind of analysis is the correlation between the input variables used in the DEA estimation and the environmental variables used in the regression analysis. When the variables are highly correlated, the coefficients will be inconsistent and biased (Herrera and Pang 2005, Ravallion 2003, Grosskopf 1996). Table 13 shows that, with the exception of GDP per capita, there is no high correlation between input variables and environmental variables, making them suitable for regression analysis. Efforts will be made in future drafts to look for proxy variables for GDP and to find other explanatory variables for the analysis.

F. Method
The Tobit model was used in the regression because of the censored nature of efficiency scores. According to Maddala (1987) and Greene (2003b), the fixed effects estimator for panel data with short periods will have inconsistent coefficients and biased variance. Thus, random effects Tobit estimation was chosen. The Tobit estimation is as follows:
$$y_{it} = \beta_0 + \beta_1\, gdp_{it} + \beta_2\, ext_{it} + \beta_3\, roads_{it} + u_i + \varepsilon_{it}$$
where
$y_{it}$ = health input efficiency scores from VRS-DEA
$gdp_{it}$ = GDP per capita
$ext_{it}$ = % of total health expenditure from external donors
$roads_{it}$ = % of roads paved
$u_i$ = country-specific random effect, and $\varepsilon_{it}$ = idiosyncratic error term
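The censoring logic behind a Tobit regression on efficiency scores can be sketched in Python. This is a simplified illustration, not the paper's estimator: it fits a pooled Tobit (no random effects) by maximum likelihood, assumes scores are censored from above at 1, and uses synthetic data. All names, coefficient values, and sample sizes are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, X, y, upper=1.0):
    """Negative log-likelihood of a Tobit model censored from above at `upper`."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)  # parameterized in logs so sigma stays positive
    xb = X @ beta
    censored = y >= upper
    ll = np.where(
        censored,
        norm.logsf((upper - xb) / sigma),            # P(latent score >= upper)
        norm.logpdf((y - xb) / sigma) - log_sigma,   # density of uncensored scores
    )
    return -ll.sum()

# Synthetic data: intercept plus three stand-ins for gdp, ext, and roads
rng = np.random.default_rng(0)
n = 500
X = np.c_[np.ones(n), rng.normal(size=(n, 3))]
true_beta = np.array([0.8, 0.10, -0.05, 0.08])   # hypothetical coefficients
y_star = X @ true_beta + rng.normal(scale=0.15, size=n)
y = np.minimum(y_star, 1.0)                      # efficiency scores cannot exceed 1

# Start from OLS estimates and maximize the likelihood
start = np.r_[np.linalg.lstsq(X, y, rcond=None)[0], np.log(y.std())]
fit = minimize(tobit_negloglik, start, args=(X, y), method="BFGS")
beta_hat, sigma_hat = fit.x[:-1], np.exp(fit.x[-1])
print(np.round(beta_hat, 3), round(float(sigma_hat), 3))
```

A random effects version would add a country-specific term and integrate it out of the likelihood, which is typically left to a dedicated panel-data routine.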

G. Estimation Results
The results of the Tobit regression for factors explaining efficiency scores are presented below (Table 14). In general, higher GDP per capita and a higher percentage of paved roads are associated with higher efficiency, while the effect of external aid is unclear.
As with any study, the results need to be interpreted with caution. There are various supply and demand factors, such as price, access, and quality, among others, that affect health outcomes which are not included in the paper. Since the goal of this paper is to measure the expenditure efficiency, such factors were not included. Further examination including those factors might be needed to have a more holistic picture of differences among countries.
The same analysis we have done for a panel of ADB countries can also be done for a panel or cross-section of provinces within countries. Box Tables 1, 2, and 3 show the DEA results for Indonesian and Indian provinces for health and education efficiencies.

VI. DISTRIBUTION OF PUBLIC SPENDING
A. Inequalities in Utilization
Inequalities still exist in Asia, even for the most basic health services such as immunization and skilled birth attendance. Table 15 shows utilization rates of full immunization broken down by quintiles. While utilization rates differ widely, from 25.4% in Tajikistan to almost full coverage of the population in Thailand, what is common in these countries is that the poorest quintiles have lower utilization compared to the richest quintiles. The same trend is true for the attendance of skilled professionals during childbirth. In Bangladesh, only 18.2% of women benefit from skilled birth attendance; among the richest quintile, 50% are able to access skilled professionals, while only 5% of the poorest are able to do so. On the other end of the spectrum, Albania, Armenia, and the Philippines have near universal coverage in this indicator, but some pockets of the population, specifically those in the poorest two quintiles, are lagging behind (Table 16). The same unequal trends are observed in outpatient and inpatient utilization. It is worth noting, though, that in Sri Lanka, Malaysia, and the Lao People's Democratic Republic, the poorer quintiles utilize inpatient services more than the richer quintiles (Tables 17 and 18).

B. Inequalities in Government Subsidies
Given the trends in utilization, it is not surprising that most government subsidies go to the richest quintiles (Table 19). In the 1990s, the largest subsidies went to hospital care for the richest quintile. This seems to have prevailed in Bangladesh in 2005, but it no longer seems to be the case for Mongolia and Viet Nam in 2006. In Mongolia, there appears to be a differentiation between soum and aimag facilities, with aimag facilities being more pro-poor. In Viet Nam, the poor benefit most from subsidies in commune health centers, while richer quintiles benefit from services that are hospital based.
Benefit incidence analysis in various economies shows that in education, the poorest quintiles benefit more from subsidies in primary education spending, while richer quintiles benefit more from secondary and tertiary expenditures. With the exception of Armenia, this trend is consistent in all economies in Table 20. This trend was also observed in the region for the 1990s.
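The basic arithmetic of a benefit incidence calculation can be sketched as follows. The sketch assumes the common approach of multiplying each quintile's utilization by a unit subsidy per service. The function name `benefit_incidence`, the visit counts, and the unit subsidies are all hypothetical and do not come from Tables 19 or 20.

```python
def benefit_incidence(visits, unit_subsidy):
    """Share of public subsidy received by each income quintile.

    visits: service -> list of utilization by quintile (poorest first)
    unit_subsidy: service -> subsidy per unit of utilization
    """
    shares = {}
    n_quintiles = len(next(iter(visits.values())))
    overall = [0.0] * n_quintiles
    for service, by_quintile in visits.items():
        benefits = [v * unit_subsidy[service] for v in by_quintile]
        total = sum(benefits)
        shares[service] = [b / total for b in benefits]
        overall = [o + b for o, b in zip(overall, benefits)]
    grand_total = sum(overall)
    shares["all_services"] = [o / grand_total for o in overall]
    return shares

# Hypothetical utilization (thousands of visits), poorest to richest quintile
visits = {
    "primary_care": [120, 110, 100, 90, 80],
    "hospital": [20, 30, 40, 60, 100],
}
unit_subsidy = {"primary_care": 5.0, "hospital": 40.0}  # hypothetical cost per visit

shares = benefit_incidence(visits, unit_subsidy)
for service, s in shares.items():
    print(service, [round(x, 3) for x in s])
```

In this made-up example, primary care is pro-poor (the poorest quintile receives more than 20% of the subsidy), but the higher unit cost of hospital care tilts total spending toward the richest quintile, which mirrors the pattern described in the text.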

VII. CONCLUSIONS AND POLICY RECOMMENDATIONS
The paper analyzed the efficiency of health and education expenditure in Asian countries using DEA. The results indicate that countries could achieve higher health and education outcomes given their expenditures. The health output efficiency score of 0.96 implies that the three health outcome indicators can be increased further by 4%. On average, in terms of input efficiency, Asian countries could use 93% of their budget to attain current levels of health indicators. Since these figures are just indicative, it is important to identify the factors that cause the variations in efficiency scores. The Tobit regression identified variables that show statistical association with efficiency scores. Results show that countries that do better on other factors affecting access, such as roads, are more efficient.
To improve the allocative efficiency of public spending and at the same time make expenditures pro-poor, there is a need to reallocate expenditures from hospitals to public health, and from tertiary and secondary to primary education. It has become evident in this study that many countries still favor curative health spending and tertiary education spending despite evidence that the benefits of these expenditures do not accrue to the poor.
Adoption of performance-based budgeting will help align incentives with performance as well as improve the efficiency of the input mix. It was found that most countries normally practice historical budgeting, which does not necessarily reflect the current needs of institutions and does not encourage improvements in performance.
In health, TE will improve if referral systems are put in place. Most countries, even relatively good performers like Sri Lanka, appear to be struggling to ensure that patients do not bypass primary care. Countries like the United Kingdom have done this successfully by assigning gate-keeping functions to primary care providers, but for low- and middle-income countries, this would also entail improvements in the perception of primary care providers.
Improvements in budget execution need citizen participation. Experiences of Bangladesh and the Philippines show that the accountability of ministries and politicians improves when civil organizations are involved in expenditure tracking. Leakages will also be minimized when the public is given access to information, such as Public Expenditure Tracking Surveys. Also, as nongovernment organizations and other sectors get involved in service provision, governments should ensure that regulatory frameworks are drawn up to standardize the delivery of quality services.
Lastly, better provision of public services entails a sustained tax reform process. Encouraging the participation of civil society groups in tax-watch efforts will help ensure that taxes are indeed being spent on the public services that the poor need most.