Aid and Growth. New Evidence Using an Excludable Instrument

We use an excludable instrument to test the effect of bilateral foreign aid on economic growth in a sample of 96 recipient countries over the 1974-2009 period. We interact donor government fractionalization with a recipient country’s probability of receiving aid. The results show that fractionalization increases donors’ aid budgets, representing the over-time variation of our instrument, while the probability of receiving aid introduces variation across recipient countries. Controlling for country- and period-specific effects that capture the levels of the interacted variables, the interaction provides a powerful and excludable instrument. Making use of the instrument, our results show no significant effect of aid on growth in the overall sample. We also investigate the effect of aid on consumption, savings, and investments, and split the sample according to the quality of economic policy, democracy, and the Cold War period. With the exception of the post-Cold War period (where abundant aid reduces growth), we find no significant effect of aid on growth in any of these sub-samples. None of the other outcomes are affected by aid.


Introduction
In a previous paper we began with an apology for adding yet another paper investigating the effect of foreign aid on economic growth to what is already a long list of articles. We frankly admitted that we were unable to provide an unbiased estimate of aid's effect on growth, as is true for most of the preceding literature. Since then, a number of innovative contributions have added to our understanding of whether and to what extent aid causally affects growth and institutions. Jackson (2014) suggests using natural disasters in countries receiving aid from the same donor as an instrument. Galiani et al. (2017) instrument aid flows with the International Development Association's (IDA) threshold for receiving concessional aid. While these identification strategies are interesting and innovative, we remain unconvinced by them. Jackson's suggestion of increased short-term aid for countries unaffected by disaster as a consequence of disasters in other aid recipient countries from the same donor, while empirically powerful, lacks a theoretical foundation, and is thus potentially spurious. 1 Galiani et al.'s instrument could be correlated with growth for reasons other than aid, as countries' rates of growth might be influenced by factors other than aid at the time they exceed the IDA's income threshold. 2 The lack of a plausibly excludable instrument for aid in a large sample of donor and recipient countries continues to plague the aid effectiveness literature at large. The question of whether aid affects recipient countries' economic growth thus remains wide open. 3 In this paper, we aim to fill this gap. We are inspired by the identification strategies of Werker et al. (2009), Nunn and Qian (2014) and Ahmed (2016). These studies rely on plausibly excludable variables that do not vary at the recipient country level and interact them with a proxy for the probability of receiving aid.
We borrow from Ahmed (2016) who exploits variation in the composition of the United States' House of Representatives to instrument US aid in explaining recipient country democracy. To the extent that fractionalization leads to larger government budgets and larger overall budgets lead to an increase in the aid budget, fractionalization can serve as a powerful instrument.
1 On the significance of "false positives," see Chaudoin et al. (2014).
2 This would hold even if the decision to pass the IDA's income threshold could not be manipulated by aid-receiving governments. Consider a reform-oriented government that achieves substantially higher growth rates for some years that eventually lead to passing the exogenous threshold. Growth dynamics will be different in these years compared to the years in which the country does not grow, even with an exogenous income threshold. What is more, governments can manipulate GDP data, which makes reaching the threshold potentially endogenous (see Kerner et al. 2017, who show this for aid-dependent countries). Galiani et al. test for these possibilities. Using a smoothed income trajectory to rule out the effect of shocks they find results that are similar to their main analysis. They find no evidence of data manipulation. However, their sample only covers 35 countries. Dreher and Lohmann (2015) focus on regional growth within countries. Their instrument for aid is an interaction of the IDA income threshold with a region's probability to receive aid, in a sample of 21 countries.
3 Among prominent recent attempts to investigate the effect of aid, Clemens et al. (2012) do not use instruments and Brückner (2013) relies on rainfall and commodity price shocks, which can easily violate the exclusion restriction. See Werker (2012) and Doucouliagos (2016) for recent surveys.
In line with Nunn and Qian (2014) and Ahmed (2016) we introduce variation at the recipient country level by interacting fractionalization with the share of years a country receives aid from its donors. 4 To the extent that variables correlated with donor fractionalization do not affect recipients' rates of growth differently in regular and irregular recipients of aid, controlling for country and period fixed effects and a battery of control variables, the resulting instrument is excludable. Unlike Nunn and Qian (2014) and Ahmed (2016), we focus on growth rather than democracy or conflict, and on aid from a group of major donors rather than (food) aid from the United States exclusively. Unlike Werker et al. (2009), we focus on a broad set of donor countries. As we outline in more detail in Section 2, we investigate the link between government fractionalization and the effectiveness of aid as a chain of cause-and-effect relationships. Starting with the effect of fractionalization on government budgets, we further illustrate the relation between overall budgets and aid budgets.
In addition to investigating the effect of aid on growth, this paper's contribution is the introduction of an instrument for aid from a large number of donors and years that can be used to address a substantial number of questions in the aid effectiveness literature. Though still new, our instrument has already been used in Bluhm et al. (2016) to investigate the effect of aid on conflict, and  in the context of democracy. 5 In the conclusion, we suggest a number of additional research questions where we think our instrument helps overcome the endogeneity of aid.
We describe our data and method in more detail in Section 3. To foreshadow our results (shown in Section 4), we find that the interaction of government fractionalization and a country's probability of receiving aid is a powerful instrument for aid. Using this instrument, we find no positive effect of aid on growth in the overall sample. Section 5 splits the sample along a number of important dimensions (the quality of economic policy, democracy, and the Cold War) and tests whether the impact of aid differs across these groups. With the exception of the post-Cold War period (where abundant aid reduces growth), we find no significant effect of aid on growth in any of these sub-samples. We also investigate the effect of aid on components of GDP rather than growth (in Section 6). Savings, investment, and consumption are all unaffected by aid. The final section summarizes and concludes the paper.

The argument
Most of the previous literature pursues one of three strategies to identify the effect of aid on growth. One group of papers relies on instruments that relate to the size of the recipient country's population (as a proxy for the ease to exercise power, e.g., Rajan and Subramanian 2008). A second group of papers focuses on bilateral political relations, for example employing voting coincidence in the United Nations General Assembly to instrument for aid (Bjørnskov 2013). The third uses internal instruments and estimates difference or system GMM regressions (Minoiu and Reddy 2010). Each of these strategies is misguided.
Population size can affect growth through many channels that researchers cannot control for and is thus not excludable (Bazzi and Clemens 2013). Lagged levels and differences of aid are also hardly excludable to growth, invalidating them as (internal) instruments. Political-relations based variables might be excludable, but to the extent that the motive for granting aid affects the outcome, the resulting Local Average Treatment Effect (LATE) reflects the effects of politically motivated aid rather than those of all aid.
A couple of recent papers suggest alternative identification strategies, based on interactions between an excludable instrument and a potentially endogenous variable (Werker et al. 2009, Nunn and Qian 2014, Ahmed 2016). Of these, only Werker et al. (2009) investigate the question that we address in this paper: the effect of foreign aid on economic growth. Werker et al. make use of oil price fluctuations that substantially increase the aid budgets of oil-producing Arab donors, in particular to Muslim countries.
Specifically, their instrument for Arab aid is the interaction of the oil price with a binary indicator for Muslim recipient countries, which receive the bulk of Arab donors' aid. They find recipient country growth to be unrelated to aid. While we are convinced by Werker et al.'s identification strategy, their results can hardly be generalized to represent the effects of aid more broadly. As they point out, their results show the LATE for oil-price-induced increases in aid to Muslim countries, which might be unrepresentative of aid from a broader set of donors to a broader set of recipients. In particular, the modalities of aid delivery as well as the political motivations of this aid might reduce its effectiveness, as might the specific set of policies and institutions in the largely authoritarian recipient countries of aid from Arab donors (Werker et al. 2009). We rely on Werker et al.'s identification strategy, closely following Nunn and Qian (2014) and, in particular, Ahmed (2016), but focusing on aid's effect on growth for a large set of aid donors and recipients, over a long period of time.
We rely on two additional strands of previous literature to motivate our instrument for aid. The first investigates the effect of government fractionalization on governments' budgets. Roubini and Sachs (1989) propose that coalition governments will be more reluctant to reduce expenditures compared with single-party governments, as each party of the coalition will resist pressure to cut expenditure in its own area, even if they are in favor of overall spending cuts. Volkerink and de Haan (2001) and Scartascini and Crain (2002) show that legislature fragmentation increases governments' expenditures. We make use of the relationship between fractionalization and government budgets, hypothesizing that the larger budgets arising due to fractionalization increase aid budgets, which in turn affect aid disbursements at the recipient country level. Most importantly, controlling for period fixed effects, recipient fixed effects, and other control variables, government fractionalization in donor countries is arguably excludable in growth regressions at the recipient country level.
The second well-established strand of literature we draw from addresses the relationship between overall government budgets and their aid budgets. Brech and Potrafke (2014) and Round and Odedokun (2004) show that overall expenditures as a share of GDP significantly determine aid budgets.
Interestingly, in line with our hypothesis in this paper, Round and Odedokun's (2004) regressions excluding government expenditures show that government fractionalization increases aid budgets, "apparently to satisfy the various interests of the coalition" (p. 308). 6 Obviously, larger overall aid budgets increase aid disbursements to recipient countries, on average (e.g., Dreher and Fuchs 2011).
We use fractionalization interacted with the probability of receiving aid as our instrument for bilateral aid, and argue that it is excludable to recipient country growth. As Nunn and Qian (2014: 1632, 1638) explain, this holds even though the probability of receiving aid itself is endogenous. As they point out, the resulting regressions resemble a difference-in-difference approach, where we compare the effect of aid on growth in regular and irregular recipients of aid as donor fractionalization changes. We explain our identification strategy in more detail in the next section, where we introduce our data and method of estimation. One might consider two alternative instruments resulting from our hypothesized transmission channels: government expenditures and aid budgets. These instruments are however not necessarily excludable, given that growth shocks in recipient countries could directly affect donors' aid budgets (and thus their overall budgets), while growth shocks in non-recipient countries might not. For example, Rodella-Boitreaud and Wagner (2011) show that donors' total aid budgets increase with natural disasters in developing countries, indicating that donors adjust their total aid budget in response to shocks rather
6 Overall government budgets and government fractionalization do not turn out to be robust determinants of aid budgets in the large-scale robustness analysis in Fuchs et al. (2014). Their regressions however include various measures of fractionalization and fiscal policy at the same time, setting a high bar on the identification of the individual effects.
$$Growth_{i,t} = \beta_1 Aid_{i,t-1} + \beta_2 Aid_{i,t-1}^2 + X_{i,t}\beta_3 + \beta_4 \eta_i + \beta_5 \tau_t + \varepsilon_{i,t}, \quad (1)$$
where $Growth_{i,t}$ is recipient country i's average yearly real GDP per capita growth over a four-year period t. 10 $Aid_{i,t-1}$ denotes the amount of net aid (as a percentage of GDP) disbursed by the 28 bilateral donors of the OECD's Development Assistance Committee (DAC) in the previous period.
Some specifications also include aid squared to test for decreasing returns to aid, following Clemens et al. (2012). $\eta_i$ and $\tau_t$ represent recipient country and period fixed effects, respectively, and $\varepsilon_{i,t}$ is the error term. Standard errors are bootstrapped based on pairwise recipient country clusters. 11 All regressions include the set of contemporaneous control variables used in Burnside and Dollar (2000), which we denote as $X_{i,t}$: Initial GDP/capita, Ethnic Fractionalization, Assassinations, Ethnic Fractionalization*Assassinations, dummies for Sub-Saharan Africa and East Asia, Institutional Quality, M2/GDP (lagged), and Policy. 12 Some words of caution are in order. The instrumental variables approach that we explain in more detail below does not rely on these control variables: our instrument does not violate the exclusion restriction in their absence. We thus face a trade-off between increasing the efficiency of the estimator and introducing bias via the potential endogeneity of the control variables and their correlation with predicted aid. While we include the control variables in the main analysis, note that our results are qualitatively unchanged when we exclude them. 13 A skeptical reader might also be concerned about the Nickell bias arising from the inclusion of initial GDP per capita. When we exclude initial GDP per capita, our results remain robust. When we correct
10 We include recipient countries that have been on at least one "DAC List of ODA Recipients" between 1997 and 2013. Appendix E shows these countries. The results are unchanged when we instead estimate the aid-growth relationship in a dyadic setting.
11 However, even though we are using a constructed instrument, IV standard errors are consistently estimated as long as the second-stage error term is not correlated with our donor-recipient-specific instrument ($FRAC_{j,t} \times \bar{P}_{i,j}$) from the zero-stage regression (Wooldridge 2010). In line with Atkinson and Cornwell (2011) we also employ wild bootstrap at the second-stage to test robustness (using cgmwildboot, Cameron et al. 2008). Inference is based on the bootstrapped p-values, as these rather than standard errors are pivotal. Our results do not change when using alternative bootstrap approaches or when not bootstrapping standard errors.
12 To reduce clutter, we do not show them in the main tables. Note that the time-invariant variables are removed here (as in Clemens et al. 2012) through the inclusion of country fixed effects. Also note that we do not control for Burnside and Dollar's measure of good policy, given that improvements in policy might be an important transmission channel by which aid affects growth. We lose about 200 observations when we include the good policy indicator. Our results however do not depend on its exclusion. While the first-stage F-statistics are somewhat lower in the reduced sample, the coefficients of interest are within the respective Anderson-Rubin 90%-confidence intervals. We also estimate regressions including an imputed good policy indicator to avoid losing observations. Our results are again unchanged. Appendix A reports the sources and definitions of all variables, while we show descriptive statistics in Appendix B. Appendix D reports the full specifications for the main regressions.
13 See Table C1 in the Appendix.
unlikely, to ensure that our results do not depend on this modelling choice we add the levels of the interaction term and donor-recipient fixed effects to equation (2) in a robustness test. 15 We also compared the different modelling choices of the zero-stage in case of one endogenous variable in a simulation analysis. What is more, we compared the findings to the approach when predicting aid relying on all coefficients of the zero-stage regression including the levels of the interacted instrumental variable, country-pair and time fixed effects.
In balanced samples, we find these different methods to lead to the same second-stage results. Note that after aggregating over all donors the donor-recipient-specific probability is then captured by recipient-country fixed effects (when proceeding as in equation 2). When instead controlling for time fixed effects in the zero-stage, the probability is captured by the time fixed effects at this level. The donor-specific time-varying measure of government fractionalization is the same across recipients and is consequently captured in the time fixed effects after we have aggregated the data over all donors. The only variation that remains at the first- and second-stage level is the exogenous variation introduced by the interaction term. This holds irrespective of the three different modelling choices: a) including only the interacted instrument as in equation 2; b) predicting aid relying on $\gamma_1$ from a zero-stage regression which also includes the levels of the interacted instruments and fixed effects; and c) the same regression as in b) but predicting aid from all coefficients. 16 One might also be concerned about the fact that we do not control for the second-stage covariates in the dyadic equation (2). The dyadic zero-stage equation constructs an instrument from exogenous variation, which we then use in the usual 2SLS procedure at the recipient level. After aggregating over all donors, we use the constructed instrument and control for the second-stage covariates in the first-stage regression. Thus, the remaining variation is the exogenous variation introduced by our constructed instrument conditional on all second-stage covariates.
The intuition of our approach is that of a difference-in-difference approach, where we investigate a differential effect of donor fractionalization on the amount of aid to countries with a high compared to a low probability of receiving aid. The identifying assumption is that growth in countries with differing probabilities of receiving aid will not be affected differently by changes in fractionalization, other than via the impact of aid, controlling for recipient country and period fixed effects and the other variables in the model. In other words, as in any difference-in-difference setting, we rely on an exogenous treatment and the absence of different pre-trends across groups. Controlling for period fixed effects, donor-government fractionalization cannot be correlated with the error term and is thus clearly exogenous to aid. In order for different pre-trends to exist, these trends across countries with a high compared to a low probability to receive aid would have to vary in tandem with period-to-period changes in donor fractionalization.
Given that donor fractionalization follows no obvious trend in our data, we consider this implausible. 17 In order to ensure that our result is not driven by omitted variables that affect regular and irregular recipients of aid differently, we also control for recipient country characteristics such as economic freedom and trade (as a percentage of GDP), both as a level and interacted with the probability of receiving aid, respectively. The effect of aid on growth is unaffected and F-statistics remain around the threshold of 10. Moreover, the dyadic instrument remains strong at the zero-stage regression when controlling for a number of donor and recipient country characteristics such as economic freedom, ideology, overall trade, bilateral imports and exports and donor GDP per capita growth. 18 We aggregate equation (2) across donors for each recipient and period, resulting in the fitted value of aid as a share of GDP at the recipient-period level (in analogy with Rajan and Subramanian 2008, for example):
$$\widehat{Aid}_{i,t} = \sum_{j} \left[ \hat{\gamma}_1 \, FRAC_{j,t} \times \bar{P}_{i,j} + [\ldots] \right]. \quad (3)$$
17 Following Christian and Barrett (2017) we plot the variation in government fractionalization in tandem with the variation in aid and growth for two different groups that are defined according to the mean of the probability to receive aid. Figure 4 in Appendix F plots these graphs. They give no reason to believe that the parallel trend assumption is violated in our case. More precisely, the probability-specific trends in aid and growth, respectively, seem rather parallel across the regular recipients (those with a probability to receive aid that is above the mean) and the irregular recipients (with the probability to receive aid being below the mean). There is also no obvious non-linear trend in regular compared to irregular recipients that is similar for aid and growth. What is more, these trends do not overlap with the trend in government fractionalization. In analogy to Christian and Barrett (2017), our identification strategy would be at risk in the presence of a non-linear trend in government fractionalization that is similar to the trends in aid and growth for the group of regular recipients. A common trend in all three variables that is not different for regular and irregular recipients would, to the contrary, be captured by our time fixed effects.
18 The detailed results are available on request.
We then instrument $Aid_{i,t-1}$ in equation (1) with our constructed instrument $\widehat{Aid}_{i,t-1}$ from equation (3) at the recipient-period level. 19 We instrument $Aid_{i,t-1}^2$ with the square of predicted aid to GDP from the first-stage, following Wooldridge (2010: 268). Our results are robust when we instead use the square of fitted aid to GDP (from equation 3) as an instrument for aid squared (from equation 1).
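The three steps described above (dyadic zero stage, aggregation over donors, 2SLS at the recipient-period level) can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the data are synthetic, the parameter values are arbitrary, aid enters contemporaneously instead of lagged, no control variables are included, the zero stage contains only an intercept and the interaction, and the first-stage F-statistic is the simple homoskedastic one.

```python
import numpy as np


def two_way_demean(x):
    """Within transformation for a balanced recipient-by-period panel
    (removes recipient and period fixed effects)."""
    return x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + x.mean()


def simulate_and_estimate(n_recipients=200, n_donors=10, n_periods=9,
                          beta=0.5, seed=0):
    """Synthetic illustration of the zero-stage / aggregation / 2SLS sequence."""
    rng = np.random.default_rng(seed)

    # Hypothetical donor fractionalization FRAC_{j,t} and aid probability P_{i,j}.
    frac = rng.uniform(0, 1, size=(n_donors, n_periods))
    prob = rng.uniform(0, 1, size=(n_recipients, n_donors))
    interaction = prob[:, :, None] * frac[None, :, :]       # shape (i, j, t)

    # A confounder that raises both aid and growth, making OLS biased.
    confounder = rng.normal(size=(n_recipients, n_periods))
    dyadic_aid = (2.0 * interaction
                  + confounder[:, None, :] / n_donors
                  + rng.normal(scale=0.3, size=interaction.shape))

    # Zero stage: dyadic aid on an intercept and the interaction only.
    design = np.column_stack([np.ones(interaction.size), interaction.ravel()])
    gamma, *_ = np.linalg.lstsq(design, dyadic_aid.ravel(), rcond=None)
    fitted_dyadic = gamma[0] + gamma[1] * interaction

    # Aggregate over donors to the recipient-period level.
    aid = dyadic_aid.sum(axis=1)
    fitted_aid = fitted_dyadic.sum(axis=1)

    growth = (beta * aid + confounder
              + rng.normal(size=(n_recipients, 1))           # recipient effects
              + rng.normal(size=(1, n_periods))              # period effects
              + rng.normal(scale=0.5, size=aid.shape))

    y, x, z = (two_way_demean(v).ravel() for v in (growth, aid, fitted_aid))

    # First stage: actual aid on fitted aid, after removing both fixed effects.
    pi = (z @ x) / (z @ z)
    resid = x - pi * z
    dof = x.size - n_recipients - n_periods
    first_stage_f = pi ** 2 / (resid @ resid / dof / (z @ z))

    return {
        "ols": (x @ y) / (x @ x),
        "iv": (z @ y) / (z @ x),      # just-identified 2SLS
        "first_stage_F": first_stage_f,
    }


if __name__ == "__main__":
    print(simulate_and_estimate())
```

In this synthetic setting the levels of the interacted variables are absorbed by the recipient and period effects, so only the interaction identifies the aid coefficient, which mirrors the argument in the text; the OLS estimate is pushed upward by the simulated confounder while the IV estimate is not.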
A priori, it is unclear whether legislature or government fractionalization is more suitable as an instrument. As Ahmed (2016) points out for the United States, the "funding and allocation of bilateral economic aid involves both the executive branch and Congress" and the same is true for the other donor countries in our sample. As it is the government that drafts the budget plan and not the legislature, we measure donor fractionalization as the probability that two randomly chosen deputies from among the parties forming the government represent different parties (Beck et al. 2001). This comes at the disadvantage that there is no variation in government fractionalization for the United States and Canada across our period of observation. We therefore replace government fractionalization with legislature fractionalization for these countries. 20 Our results are unchanged when we (i) do not replace these values, (ii) omit the two countries, and (iii) use legislature instead of government fractionalization for all countries.
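The fractionalization measure defined above can be sketched as a short function. The sketch assumes that the two deputies are drawn without replacement; whether the underlying database computes the index exactly this way is an assumption, and the function name and the seat counts in the example are hypothetical.

```python
from typing import Sequence


def fractionalization(seats: Sequence[int]) -> float:
    """Probability that two deputies drawn at random (without replacement,
    an assumption) from the given parties belong to different parties."""
    total = sum(seats)
    if total < 2:
        raise ValueError("need at least two deputies")
    same_party_pairs = sum(n * (n - 1) for n in seats)
    return 1.0 - same_party_pairs / (total * (total - 1))


if __name__ == "__main__":
    print(fractionalization([120]))         # single-party government
    print(fractionalization([50, 50]))      # two equally sized coalition partners
    print(fractionalization([80, 30, 10]))  # one dominant party, two junior partners
```

A single-party government yields a value of zero, which is why a donor governed by one party throughout the sample contributes no over-time variation to the instrument.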
We proxy a country's probability of receiving aid with the percentage of years the country received aid from a particular donor over the sample period, following Ahmed (2016) and Nunn and Qian (2014). Specifically, the probability of receiving aid from a particular donor j is $\bar{P}_{i,j} = \frac{1}{36} \sum_{y=1}^{36} P_{i,j,y}$, with $P_{i,j,y}$ indicating whether recipient i received positive amounts of aid from donor j in year y. To test robustness we alternatively included the probability to receive aid over each four-year period (and its interaction with fractionalization) rather than those over the whole sample period. 21
19 This follows Rajan and Subramanian (2008) and, in the context of trade rather than aid, Frankel and Romer (1999). Our results are unchanged when we include donor-recipient pair and period fixed effects in the zero stage regression (with first-stage F-statistics becoming stronger). They are also unchanged when we instead replace
Figure 1: Probability to receive aid and average aid, 1974-2009 period
We argue that the extent to which changes in aid budgets affect aid receipts depends on a country's probability of receiving aid. Both Nunn and Qian (2014) and Ahmed (2016) show that the probability of receiving aid is indeed significantly correlated with the amount of US (food) aid a country receives. The same holds for our sample, for a broad set of donors, as can be seen in Figure 1. The Figure plots the average probability of receiving aid (i.e., recipient i's probability of receiving aid from any donor over the whole sample period) on the horizontal axis and the average aid received from all donors as a percentage of GDP on the vertical axis. The correlation between the two is 0.31, significant at the one-percent level.
For example, the figure shows that Afghanistan received aid in 63 percent of the years in the 1974-2009 period, amounting to about 37 percent of its GDP. On the lower end of the scale, Kuwait received 0.0085 percent of its GDP as aid, and received aid in 12 percent of the years in the sample.
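The probability proxy described in this section, the share of sample years in which a recipient received positive aid from a given donor, can be sketched as follows. The function name and the example series are hypothetical and serve only to show the calculation.

```python
from typing import Sequence


def aid_probability(yearly_aid: Sequence[float]) -> float:
    """Share of years with strictly positive aid from one donor to one recipient."""
    if len(yearly_aid) == 0:
        raise ValueError("need at least one year of data")
    years_with_aid = sum(1 for amount in yearly_aid if amount > 0)
    return years_with_aid / len(yearly_aid)


if __name__ == "__main__":
    # Hypothetical donor-recipient pair observed over eight years.
    series = [0.0, 1.2, 0.0, 3.4, 0.5, 0.0, 0.0, 2.1]
    print(aid_probability(series))  # 4 of 8 years with positive aid
```

Because the proxy is computed over the whole sample period, it is constant over time for each donor-recipient pair, so its level is absorbed by fixed effects and only its interaction with fractionalization varies across periods.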

Main results
Before discussing the IV results presented in Table 2, it is important to note that the interaction of donor fractionalization and the probability of receiving aid is statistically significant at the 1%-level in the zero-stage regression (equation (2), Table C2 in Appendix C). The corresponding F-statistic of the interaction term is 101. When taking the alternative approach to equation (2) by including donor-recipient pair fixed effects, the respective F-statistic drops to 14.7, which is still clearly above the threshold of 10. The coefficient of the dyadic instrument in equation (2) amounts to 0.363 with a standard error of 0.035. An increase in fractionalization from zero to one thus increases bilateral aid to recipient countries that receive aid in all years by 0.363 percentage points of GDP. The dyadic instrument provides the exogenous variation that we use to calculate the exogenous part of bilateral aid (as a percentage of GDP). After aggregating over all donors, we use the sum of fitted bilateral aid (fitted aid to GDP, over all 28 DAC donors) in order to measure its causal effect on growth at the recipient-period level. Table 2 shows the results at the recipient-period level using fitted aid to GDP as an instrument for actual aid. The control variables from Table 1 are included in all first- and second-stage regressions, but we exclude them from the table to reduce clutter. 27 Column 1 focuses on contemporaneous aid, instrumented with $\widehat{Aid}_{i,t}$, in analogy to equation (3). The table also shows the corresponding first-stage results.
24 Note that to facilitate comparison we restrict the sample to those observations that are also included in the 2SLS regressions below.
25 Specifically, their estimated coefficient is 0.096 (in column 4 of their Table 7), which is however not significant at conventional levels.
26 The coefficient for the linear aid term is 0.361 and for aid squared -0.008 in the comparable regression in Clemens et al. (2012), both significant at the five-percent level (in column 7 of their Table 7).
27 Appendix D shows the full results.
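As a small worked illustration of how the zero-stage coefficient translates into aid for recipients with different probabilities of receiving aid, the sketch below multiplies the reported coefficient of 0.363 by a change in fractionalization and by the probability. Only the coefficient comes from the text; the function name and the second scenario (a 0.2 increase in fractionalization for a recipient with a probability of 0.5) are hypothetical.

```python
GAMMA_1 = 0.363  # zero-stage coefficient on the interaction, as reported in the text


def implied_aid_change(delta_frac: float, prob: float,
                       gamma_1: float = GAMMA_1) -> float:
    """Implied change in bilateral aid (percentage points of GDP) for a change
    in donor fractionalization, given the recipient's aid probability."""
    return gamma_1 * delta_frac * prob


if __name__ == "__main__":
    # Fractionalization rises from zero to one; recipient receives aid in all years.
    print(round(implied_aid_change(1.0, 1.0), 4))
    # Hypothetical scenario: smaller change, recipient aided in half of the years.
    print(round(implied_aid_change(0.2, 0.5), 4))
```

The second scenario shows why irregular recipients contribute less identifying variation: the same change in donor fractionalization moves their aid receipts by proportionally less.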
As can be seen in the table, the Cragg-Donald and Kleibergen-Paap first-stage F-statistics are above Staiger and Stock's (1997) rule-of-thumb threshold of ten. 28 The underidentification test (Kleibergen-Paap LM statistic) clearly rejects the null hypothesis that the equation is underidentified.
Column 2 includes aid squared, which we instrument with the square of predicted aid to GDP of the first-stage. The test statistics given in column 2 of Table 2 refer to this instrument; statistics for aid itself are equivalent to those shown in column 1. The results show strong first-stage F-statistics; underidentification is again easily rejected.
Columns 3 and 4 show results for our preferred specifications, replacing contemporaneous values of aid with their lagged values (equation 1). The statistics indicate that for the linear and squared term the instrument for aid is strong. The results show no significant effect of aid or aid squared on growth.
There is no evidence that aid causally affects growth. 29 The significant correlations shown in Table 1 and in Clemens et al. (2012) are thus likely to be spurious. Potentially, donors anticipate growth-promoting policies -due to more reform-oriented politicians assuming power, for example -and increase their aid to such countries.
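To make concrete what a specification with aid and aid squared implies, the following sketch computes the marginal effect of aid and the level of aid at which it changes sign. It uses the Clemens et al. (2012) coefficients quoted in footnote 26 (0.361 and -0.008) purely as example inputs; the function names are hypothetical, and the calculation says nothing about whether those correlations are causal.

```python
def marginal_effect(aid: float, beta_1: float, beta_2: float) -> float:
    """Derivative of beta_1 * aid + beta_2 * aid**2 with respect to aid."""
    return beta_1 + 2.0 * beta_2 * aid


def turning_point(beta_1: float, beta_2: float) -> float:
    """Aid level (percent of GDP) at which the marginal effect equals zero."""
    if beta_2 == 0:
        raise ValueError("no turning point without a squared term")
    return -beta_1 / (2.0 * beta_2)


if __name__ == "__main__":
    b1, b2 = 0.361, -0.008  # example inputs taken from footnote 26
    print(round(marginal_effect(10.0, b1, b2), 4))  # marginal effect at aid = 10% of GDP
    print(round(turning_point(b1, b2), 4))          # aid level where the effect turns negative
```

With these inputs the implied marginal effect declines linearly in aid and reaches zero at roughly 22.6 percent of GDP, which is the sense in which the squared term captures decreasing returns.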
We conclude that there is no evidence that aid increases growth and offer a number of explanations. First, aid or growth might not be measured precisely enough to capture the effects of aid in a rather small sample of fewer than 800 observations. Second, even if aid were measured precisely, the small number of observations implies that our tests are underpowered. For our tests to detect an existing effect of aid with 80 percent probability, we would require more than 6000 observations rather than the roughly 800 that we have. 30 This is an unfortunate feature that we share with the aid effectiveness literature at large (Ioannidis et al. 2016). 31 Third, the effects of aid might be spread over different horizons, and our four-year averages might be inadequate to capture these effects. 32 Fourth, aid might be effective in some groups of countries but not in others, and our pooled sample could hide such effects. We turn to this in the next section. Finally, of course, aid might simply not increase growth.

Heterogeneous effects of aid
Our instrumental variables regressions estimate the effect of variation in bilateral aid flows that go disproportionately to regular and irregular recipients of aid as a result of differences in government fractionalization. We have no reason to believe that the LATE cannot be generalized to be representative of bilateral aid more broadly. However, the previous literature suggests that the effects of aid vary with a recipient country's policies and institutions. Most importantly, it has been suggested that aid is effective in countries with good economic policies (Burnside and Dollar 2000), in democracies (Svensson 1999), or after the end of the Cold War (Headey 2008), but not otherwise. All of these interactions have been shown to be fragile (e.g., Doucouliagos and Paldam 2009), but none of these earlier studies investigates causal relationships. Rather than introducing interaction effects, we split the sample according to the median of Burnside and Dollar's (2000) good policy index (based on inflation, the budget balance, and openness to trade), Cheibub et al.'s (2010) binary indicator of democracy, and the years before 1991 and after 1990, respectively. Table 3 shows the results. As can be seen, aid has no significant linear effect on growth in any of the samples. With one exception, the results also show that there is no significant non-linear effect of aid on growth. The exception is the regression in column 6 where we split along the Cold War dimension. Aid squared is significant (at the five-percent level) after the end of the Cold War. However, the coefficient is negative with a level effect that is also negative, indicating that if aid had any effect at all it would reduce growth.
Overall, our results show no positive effects of aid on growth in any of the sub-samples and a negative effect of abundant aid on growth after the Cold War period.

28 Stock and Yogo (2005) propose more specific sets of critical values for weak identification tests based on the number of endogenous regressors, the number of instruments, and the acceptable maximum bias of the 2SLS relative to the OLS regression or the maximum Wald test size distortion. For example, a 20-percent 2SLS size distortion of a five-percent Wald test is associated with a critical value of 6.66 and a lower value of 4.42 for a 20-percent LIML (limited information maximum likelihood) size distortion.
29 We also used logged aid/GDP rather than the level of aid along with its square, which allows for a decreasing marginal effect of aid even though it does not allow its effect to change sign. Our results are unchanged.
30 This high number of required observations is driven by our fixed effects setting, as both country and time fixed effects in tandem with the set of covariates capture most of the variation in the dependent variable so that the variation caused by aid conditional on these variables is rather small.
31 According to Ioannidis et al. (2016), only about one percent of the 1779 estimates in the aid and growth literature surveyed have adequate power (see also Doucouliagos 2016).
32 A detailed analysis of longer lags is beyond the scope of this paper. When we include further lags of our aid variables, the second lag stays insignificant (8 years), but there is some evidence that growth might increase with even longer lags (from 12 years on). The number of observations in these regressions is, however, comparatively low, and we did not investigate the robustness of these results.
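The power argument made earlier in this section (roughly 800 observations against more than 6000 required for 80 percent power) can be illustrated with a back-of-the-envelope calculation. The sketch below is not the calculation used in the paper. It assumes a two-sided five-percent test, a normal approximation, and standard errors that shrink with the square root of the number of observations; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_two_sided(noncentrality, alpha=0.05):
    """Power of a two-sided z-test when the estimate's expected z-score is `noncentrality`."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return (STD_NORMAL.cdf(noncentrality - z_crit)
            + STD_NORMAL.cdf(-noncentrality - z_crit))


def power_at_n(n, n_required, target_power=0.80, alpha=0.05):
    """Approximate power with n observations, given that n_required observations
    would deliver target_power (assumption: standard errors scale with 1/sqrt(n))."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(target_power)
    # Expected z-score that yields the target power at n_required
    # (the negligible probability mass in the opposite tail is ignored here).
    z_required = z_crit + z_power
    # Rescale the expected z-score to the sample actually available.
    z_available = z_required * (n / n_required) ** 0.5
    return power_two_sided(z_available, alpha)


# Stylized numbers taken from the text: about 800 observations available,
# more than 6000 needed for 80 percent power.
power_800 = power_at_n(800, 6000)
print(f"Approximate power with 800 observations: {power_800:.2f}")
```

Under these simplifying assumptions, an effect that needs 6000 observations for 80 percent power would be detected with a probability of only roughly 18 percent in a sample of 800. The figure is illustrative and depends entirely on the stated assumptions.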

Where does the aid go?
In the final substantive section of the paper we investigate the effects of aid on components of GDP, with the aim of testing where aid is spent. The insignificant effect of aid on GDP per capita growth could be the result of aid being spent on consumption rather than investment. Alternatively, aid could increase investment, but investments might be ineffective in increasing economic growth. The policy implications of these results would be substantially different. 33 We investigate the effect of aid on investment, overall consumption, private sector consumption, and government consumption. We also investigate the effect of aid on domestic savings, testing whether aid inflows are offset by equivalent decreases in domestic savings. Specifically, we focus on gross capital formation (in percent of GDP), household final consumption expenditure (in percent of GDP) and government final consumption expenditure (in percent of GDP), with overall consumption being the sum of the two, and gross domestic savings (in percent of GDP). We use the same covariates and timing as in our aid-growth regressions above. Table 4 shows the results. As can be seen, aid has no significant effect on any of the variables in any period. Specifically, there is no effect of aid on consumption, savings, or investment in the overall sample, in countries with good or bad policies, in democratic or undemocratic countries, or during or after the Cold War period. Overall, our results therefore contrast with those of the previous literature. Boone (1996), for example, reports that aid increases consumption, but not savings and investment. Werker et al. (2009) find that household and government consumption both increase with aid, that savings decrease with aid, and that investment is unaffected (all focusing exclusively on Arab donors and the recipients of their aid). Temple and Van de Sijpe (2014) confirm the positive impact of aid on total consumption, which seems to be driven mainly by household consumption. This shows the importance of the choice of identification strategy, as well as the sample of donors and recipients, for testing the effect of aid on the outcomes of interest.

Conclusion
This paper has proposed an excludable instrument to identify whether and to what extent foreign aid affects economic growth. Temporal variation arises due to changes in aid disbursements following differences in donor countries' government fractionalization. Cross-sectional variation is introduced by interacting fractionalization with the probability of a certain country receiving aid. The approach resembles a difference-in-difference approach, the difference being that our treatment variable (fractionalization) is a continuous rather than a binary indicator.
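The interaction of donor fractionalization with a recipient's probability of receiving aid can be sketched on a toy panel. This is only an illustration of the logic, not the paper's estimation code. It assumes that the probability is the share of periods in which a recipient received positive aid from a donor, and all names and numbers below are hypothetical.

```python
# Toy illustration only: all names and numbers below are hypothetical.

def receiving_probability(aid_flows):
    """Share of periods with positive aid from a donor to a recipient
    (assumed definition for this sketch)."""
    return sum(1 for flow in aid_flows if flow > 0) / len(aid_flows)


periods = [1, 2]

# Donor government fractionalization by (donor, period).
fractionalization = {("DonorA", 1): 0.2, ("DonorA", 2): 0.6}

# Aid flows by (donor, recipient), one entry per period.
aid = {
    ("DonorA", "Regular"): [5.0, 3.0],    # receives aid in every period
    ("DonorA", "Irregular"): [0.0, 2.0],  # receives aid half of the time
}

# Time-invariant probability of receiving aid for each donor-recipient pair.
prob = {pair: receiving_probability(flows) for pair, flows in aid.items()}

# Instrument: fractionalization (donor, period) times probability (donor, recipient).
instrument = {
    (donor, recipient, t): fractionalization[(donor, t)] * prob[(donor, recipient)]
    for (donor, recipient) in aid
    for t in periods
}

# Change in the instrument between period 1 and period 2 for each recipient.
change = {
    recipient: instrument[(donor, recipient, 2)] - instrument[(donor, recipient, 1)]
    for (donor, recipient) in aid
}
print(change)
```

In this toy example the same rise in fractionalization moves the instrument by about 0.4 for the regular recipient and by about 0.2 for the irregular one. This differential response is what the comparison to a difference-in-difference design with a continuous treatment refers to.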
Using aid disbursement data for all bilateral donors of the OECD's DAC to a maximum of 96 recipient countries over the 1974-2009 period, we find our instrument to be powerful. For the average recipient country this represents roughly quadrupling the amount of current (bilateral) aid. In contrast, countries that receive aid only half of the time can expect an increase in aid inflows of 0.183 percentage points.
Applying the instrument to our growth models, we find bilateral aid to be ineffective in increasing economic growth in the overall sample and various sub-samples, split along the quality of economic policies, democracy, and the Cold War period. In the years after the end of the Cold War, we find growth to decrease with abundant aid. We also investigate the effect of aid on savings, consumption, and investment, and do not find any effect of aid in the overall sample or our sub-samples.
Our results show that bilateral aid has no robust effect on short-term growth. We would like to stress that this finding does not imply that aid is necessarily ineffective. One might argue that aid is measured imprecisely, and that standard errors are too large. Statistical power might be too low for the estimators to find a significant effect, even if it were there (Ioannidis et al. 2016). We agree that these are two possible explanations for our insignificant results. We still believe that it is important to show, and publish, these results, as the published literature on the effectiveness of aid tends to be over-optimistic, due to institutional biases of the authors in the aid effectiveness literature and the well-known bias of journals to publish (only) significant results (Doucouliagos and Paldam 2009, Doucouliagos 2016). As the lack of power holds independently of the significance of the results, there is arguably no reason to dismiss ours on the grounds of large standard errors, compared to a number of recent papers finding significant (and positive) results. We therefore urge readers to evaluate this paper on its methodological improvements over the previous literature, rather than its results.
At least one other important reason can explain the insignificant results: Donors pursue a multitude of objectives when granting aid, with economic growth being just one of them. To the extent that donors prioritize geo-strategic goals over developmental ones, the effects of "true" developmental aid will be higher than those of all aid. Aid would then need to be evaluated based on progress towards its "true" goals. While we did not investigate such outcomes here, the effects of aid on a number of alternative outcomes have been documented, including on terror (Azam and Thelen 2008), voting behavior in international organizations (Vreeland and Dreher 2014), and conflict (Nunn and Qian 2014).

Table 1: Aid and Growth, 1974-2009, OLS
Notes: Data are averaged over four years at the recipient-period level. Recipient- and period-fixed effects are included. Standard errors are in parentheses (clustered at the recipient country level; significance levels: * 0.10, ** 0.05, *** 0.01). Models are based on Burnside and Dollar (2000).

Table 2: Aid and Growth, 1974-2009, IV
Notes: Data are averaged over four years at the recipient-period level. Recipient- and period-fixed effects are included. First- and second-stage include as control variables: Log initial GDP/capita, Assassinations, Ethnic*Assassinations, and M2/GDP (lagged). Pairs cluster bootstrap standard errors with 500 replications are in parentheses in the second-stage regressions (clustered at the recipient country level). Standard errors are in parentheses in the first-stage regressions (clustered at the recipient country level). Models are based on Burnside and Dollar (2000). The first-stage statistics reported in columns 2 and 4 refer to the squared aid term. The statistics for the linear term in columns 2 and 4 are identical to columns 1 and 3, respectively. Standard errors are in parentheses in the zero-stage regression (clustered at the donor-recipient level). Significance levels: * 0.10, ** 0.05, *** 0.01.

Table 3: Aid and Growth, 1974-2009, IV, Different Samples
Notes: Data are averaged over four years at the recipient-period level. Recipient- and period-fixed effects are included. The first- and second-stages include as control variables: Log initial GDP/capita, Assassinations, Ethnic*Assassinations, and M2/GDP (lagged). The bad/good policy sample includes countries below/above the median according to the Burnside-Dollar good policy index. Democracy is measured with the binary indicator of Cheibub et al. (2010). Pairs cluster bootstrap standard errors with 500 replications are used (clustered at the recipient country level; significance levels: * 0.10, ** 0.05, *** 0.01). Models are based on Burnside and Dollar (2000).

Notes: The dependent variables (all as a percentage of GDP) are Overall Consumption, government final consumption expenditure (Gov. Consumption), household final consumption expenditure (Private Consumption), gross capital formation (Investment), and gross domestic savings (Savings). The coefficients shown refer to contemporaneous and lagged Aid as a percentage of GDP. The bad/good policy sample includes countries below/above the median according to the Burnside-Dollar good policy index. Democracy is measured with the binary indicator of Cheibub et al. (2010). Data are averaged over four years at the recipient-period level. Recipient- and period-fixed effects are included. The first- and second-stages include as control variables: Log Initial GDP/capita, Assassinations, Ethnic*Assassinations, and M2/GDP (lagged). Pairs cluster bootstrap standard errors with 500 replications are used (clustered at the recipient country level; significance levels: * 0.10, ** 0.05, *** 0.01). Models are based on Burnside and Dollar (2000).

Table C1: Aid and Growth, 1974-2009, IV, no covariates
Notes: Data are averaged over four years at the recipient-period level. Recipient- and period-fixed effects are included. Pairs cluster bootstrap standard errors with 500 replications are in parentheses in the second-stage regressions (clustered at the recipient country level). Standard errors are in parentheses in the first-stage regressions (clustered at the recipient country level). The first-stage statistics reported in columns 2 and 4 refer to the squared aid term. The statistics for the linear term in columns 2 and 4 are identical to columns 1 and 3, respectively.

Notes: Data are averaged over four years at the donor-recipient-period level in the zero-stage regression and at the recipient-period level in the second-stage regression. Standard errors are in parentheses in the zero-stage regression (clustered at the donor-recipient level). Pairs cluster bootstrap standard errors with 500 replications are in parentheses in the second-stage regressions (clustered at the recipient country level). Significance levels: * 0.10, ** 0.05, *** 0.01.