Quantile Regression with Generated Regressors

This paper studies estimation and inference for linear quantile regression models with generated regressors. We suggest a practical two-step estimation procedure, where the generated regressors are computed in the first step. The asymptotic properties of the two-step estimator, namely, consistency and asymptotic normality are established. We show that the asymptotic variance-covariance matrix needs to be adjusted to account for the first-step estimation error. We propose a general estimator for the asymptotic variance-covariance, establish its consistency, and develop testing procedures for linear hypotheses in these models. Monte Carlo simulations to evaluate the finite-sample performance of the estimation and inference procedures are provided. Finally, we apply the proposed methods to study Engel curves for various commodities using data from the UK Family Expenditure Survey. We document strong heterogeneity in the estimated Engel curves along the conditional distribution of the budget share of each commodity. The empirical application also emphasizes that correctly estimating confidence intervals for the estimated Engel curves by the proposed estimator is of importance for inference.


Introduction
Since the seminal work of Koenker and Bassett (1978), quantile regression (QR) models have provided a valuable tool in economics, finance, and statistics as a way of capturing heterogeneous effects of covariates on the outcome of interest, exposing a wide variety of forms of conditional heterogeneity under weak distributional assumptions. Also importantly, QR provides a framework for robust inference.
Applied researchers are commonly confronted with the absence of observable regressors in practice. In some cases, proxies for the unobservable variables can be found in data, while in other cases, these regressors need to be estimated. Thus, a very common strategy to deal with unobservable variables is to replace them with estimated values, that is, generated regressors (GR). 1 These GR have important implications for the reliability of general standard estimation and inference procedures. Pagan (1984) and Murphy and Topel (2002) point out that even though consistent estimates of parameters of interest are produced when the unobserved regressors are replaced with their estimated values, the conventional ways to estimate standard errors are incorrect. Recently, Mammen, Rothe, and Schienle (2012) provide a general theory for the impact of GR on the final estimator's asymptotic properties in nonparametric mean regression. Hahn and Rider (2013) derive the asymptotic distribution of three-step estimators of a finite-dimensional parameter in a semiparametric mean regression models where GR is estimated parametrically or nonparametrically in the first-step.
A number of important statistical applications requires estimation of a conditional quantile function when some of the covariates are not directly observed, but are estimated in a preliminary step. Examples include, among others, stochastic volatility time-series models, triangular simultaneous equation models, censoring, sample selection models, and treatment effects models. 2 This paper contributes to the literature by systematically studying and formalizing estimation and inference for linear QR models with a general semiparametric GR. The main contributions are as following. First, we suggest a practical two-step estimation procedure to estimate the parameters of interest in the QR-GR model. The first-step applies a general (semi)-parametric estimator to compute the GR. The second step uses the GR as regressor variables in a QR estimation. We establish the asymptotic properties of the two-step QR-GR estimator, namely, consistency and asymptotic normality. We show that the asymptotic variance-covariance matrix needs to be adjusted to account for the first-step estimation error. The basic idea is as follows. Since the first-step estimation of the unobservable regressors produces consistent estimates of the corresponding true parameters, the GR are consistently estimated. However, the GR are included in the QR model of interest with sampling error, which introduces additional noise into the asymptotic variance-covariance matrix of the coefficients of interest. In other words, the sampling error from the first stage contaminates the second stage estimation. Therefore, the usual way of calculating the QR variance-covariance matrix fails to account for the additional source of error. Under some general conditions, the estimated limiting distribution of the first-step is used to consistently estimate the variance-covariance matrix of the parameters of interest.
The second contribution of the paper is to develop inference procedures for the QR model with GR. Though estimation and inference for conditional average models with GR have been widely studied and used in practice, the literature on QR with GR is more sparse.
We develop testing procedures for general linear hypotheses in these models based on Waldtype tests, and derive their associated limiting distributions. To implement the tests, we propose an estimator for the asymptotic variance-covariance of the QR-GR coefficients, and formally establish its consistency. An important advantage of the proposed tests is that they are simple to compute and implement in practice.
Compared to the existing procedures, the QR-GR methods proposed in this paper present several distinctive advantages for applied researchers. First, instead of imposing a linear structure in the first step, as in Xiao and Koenker (2009), or imposing triangular structure between two steps, as in Ma and Koenker (2006) and Lee (2007), we work with the general case of QR-GR where no structural restrictions between the two steps or any specific functional forms and estimation strategies in the first step are imposed. That is useful in applied as the score of QR. work since practitioners have a large range of alternatives to construct the unobservable variables by using different estimation strategies or even different data sets. Second, we establish the asymptotic properties of the QR-GR estimator for non-iid data under weak conditions. This is an important generalization for practitioners since it allows for inference in a more general class of models. Third, we develop practical inference procedures. Finally, the weak conditions we imposed for QR-GR allow for simple computational implementation. 3 Linear QR models have been the workhorse of the applied research and the methods lead to a simple algorithm that can be conveniently implemented in empirical applications. Researchers can simply use existing software packages for the first-step estimation and to construct the regressors needed, for example, MLE, OLS, QR or GMM, and then apply the QR procedure with our described variance-covariance matrix adjustment.
Monte Carlo simulations assess the finite-sample properties of the proposed methods. We evaluate the QR-GR estimator in terms of empirical bias and root mean squared error, and compare its performance with methods that are not designed for dealing with GR issues.
In addition, we compute the corresponding standard errors of the QR-GR estimator and evaluate its bias. The experiments suggest that the proposed approach performs very well in finite samples and effectively removes the bias of the standard errors induced by the GR.
Thus, the proposed variance estimator is approximately unbiased and approximates well the true variance.
Finally, to motivate and illustrate the applicability of the methods, we consider an empirical application to study demand models, using data from the UK Family Expenditure Survey. Demand models (also known as Engel curves) represent the relationship between total expenditure and the share of various commodities to total expenditure. The QR approach is a useful tool in this example because it allows us to capture the heterogeneity in the expenditures of the different commodities along the conditional distribution of each commodity share. The first step in this exercise is to estimate the unobserved motives for budget shares of different commodities using the factor analysis proposed by Barigozzi and Moneta (2016). The unobserved factors consist of the motives of consumption as necessities, luxuries, and unitary elasticity goods. In the second step, we apply the proposed QR-GR estimator to obtain Engel curves for the commodities such as food, housing and leisure, by regressing each commodity share on the estimated motives from the first step. We found that the motives of consumption play different roles for various commodities and contributions of motives to each budget share vary over the total expenditure. Furthermore, the empirical results document important heterogeneity in the Engel curves. The estimated curves present strong heterogenous effect of the consumption motives on the budget share along the conditional distribution of the budget share in most commodities. Importantly, the empirical study underscores the importance of obtaining correct confidence intervals for the estimated Engel curves by taking into account the GR issue.
The remainder of the article is organized as follows. Section 2 introduces the QR model with GR. Section 3 establishes consistency and asymptotic normality of the two-step estimator. Section 4 proposes a consistent estimator of the variance-covariance matrix. Section 5 presents Monte Carlo simulations. Section 6 presents the empirical application to Engel curves. Section 7 concludes the article. Technical proofs are included in the Appendix.

Quantile Regression with Generated Regressors
This section describes the quantile regression (QR) model with generated regressors (GR) we consider in this paper and the two-step estimation procedure.

Model
For each fixed τ P p0, 1q, we consider the following model where y i is a response variable, x i " px i1 , . . . , x ik q is a k-dimensional vector of explanatory variables, β 0 pτ q is a kˆ1 vector of parameters, and the innovation term u i has conditional When all the regressors x i in model (2.1) are observable, the model can be written as the following standard QR model where Q τ py i |x i q is the conditional τ -quantile of y i given x i . In general, β 0 can depend on τ . The model is semiparametric in the sense that the functional form of the conditional distribution of y i given x i is left unspecified.
The parameter of interest for the researcher is β 0 pτ q in model (2.2). However, in many applications one or more elements of the vector of regressors x i may not be directly observable, but instead estimated from a model with other given variables, that is, the GR. In this paper, we assume that some of the regressors in the vector x i are not observable to the researcher, i.e., px i1 , . . . , x iq q are not directly observable, but px iq`1 , . . . , x ik q are observed, where q ď k. In particular, the GR are assumed to have the following form where the function g j p¨,¨q is differentiable and known up to the unknown p jˆ1 parameter vector θ for j " 1,¨¨¨, q, and the variable w i " pw i1 , ..., w iq q is a q-dimensional vector of observables, and where v ij is an error term. Thus, one can still estimate the QR model in (2.2) by replacing x ij with the GR p x ij for j " 1,¨¨¨, q. The GR p x ij is obtained from the following first-step estimation where p θ j satisfies very general weak conditions as: , and rp¨q is a generic function satisfying Err i pθ j qs " 0. Equation (2.3) defines the GR and could be satisfied by different models, we provide a few examples below. Notice that most estimators used in empirical applications satisfy this first-order representation. We will discuss these conditions more formally below. To complete the model, together with (2.2) and (2.3), we assume that F´1 τ pu i |w i1 ,¨¨¨, w iq , x iq`1 ,¨¨¨, x ik q " 0.
Remark 2.1. One or more elements of the regressors x i may be estimated in the first step.
Each of the GR could be related to different observable variables in different functional forms.
For simplicity, each of the GR equations can be estimated separately, and any dependence between the GR's would be captured by the variance covariance matrix.
Remark 2.2. We note that the estimation of quantile regression models with mismeasured regressors leads to inconsistent estimates and a few methodologies have been proposed to overcome this drawback (e.g., Wei andCarroll (2009), Wang, Stefanski, andZhu (2012), and Firpo, Galvao, and Song (2017)). Thus, the problem is different from the quantile regression with generated regressors where the standard quantile estimation is still consistent.
To motivate the existence of GR for QR models, and illustrate how a QR-GR framework could appear in practice, we include the following motivating examples: Example 2.1 (Two-stage regression with proxy variables). A very important example of GR occurs when the variables x i are not directly observable. For instance, assume that the variables px i1 , . . . , x iq q are related to additional observable variables, w ij , as following where pr x i1 , . . . , r x iq q are proxy variables, or endogenous observables, w ij is a vector of exogenous observable variables, θ j is a p jˆ1 vector, the functions g j p¨q are unknown up to the vector θ j , and v ji are mutually independent innovation terms, for j " 1,¨¨¨, q.
In this case, to complete the definition of the GR one needs to impose more structure on the innovation term v ji to estimate the parameters θ j , for j " 1,¨¨¨, q. As special cases of a more general procedure which generates the regressors, consider a simple but commonly used linear model to generate one regressor, x i1 , as a function of several variables w i , that is, r Example 2.1(b) (Conditional quantile). Another simple model for the GR is a linear conditional quantile. In this case, for a given quantile τ 1 specified by the researcher, the GR model is defined as Assume that w i1 are valid instruments. In the control function approach, one could model the endogenous explanatory variable as following: where w i1 contains exogenous regressors z i . The source of the endogeneity in the model (2.4) comes from the correlation between u i and v 1i . Thus, the endogeneity can be solved by controlling for v 1i in the model. Since we do not observe v 1i , we can replace v 1i with where p v 1i could be obtained by either mean or quantile regression.
This gives where η i " i`γ rg 1 pw i1 , θ 1 q´g 1 pw i1 , p θ 1 qs which depends on the sampling error in p θ 1 .
Notice that p v 1i is a generated regressor. Moreover, it is worthwhile noting that although the control function approach for the endogeneity issue is one of examples for the GR as in e.g., Lee (2007), we are not particularly attempting to solve the endogeneity issue in this paper. Instead, we provide a general framework and establish the asymptotic properties for the quantile models with GR.

Estimation
The estimation of the parameters of interest in model (2.2) involves a two-step estimation.
In the first step one estimates and computes the GR from model (2.3). In the second step one uses the GR and other regressors, and computes the QR of interest from (2.2).
Estimation of the unobservables usually comes from the same sample of data -may come from a different dataset or even from the parameter estimates by another researcher.
Models used to estimate the unknown parameter θ j may generally include linear or nonlinear models. Also, the parameters can be estimated by various strategies. The QR-GR two-step estimation procedure is as following: Step 1 Estimate θ j from (2.3) and compute the fitted values p x ij " g j pw ji , p θ j q for j " 1,¨¨¨, q, and then obtain the generated regressors p x i " pp x i1 ,¨¨¨, p x iq , x iq`1 ,¨¨¨, x ik q J for i " 1,¨¨¨, n.
Thus, one uses the estimates of θ j , denoted by p θ j , in the first step, to obtain the GR p x i . In the examples for the average and quantile models discussed previously we have the following for the first step.
Example 2.1(a) (Average continued). In this example, one employs the standard OLS estimator and obtains p θ " pw J wq´1w J r x 1 and computes p x i1 " w J p θ, and also p x i " pp x i1 , x i2 , ..., x ik q J . Then, the τ -th QR estimator p βpτ q can be obtained by (2.6).
Example 2.1(b) (Quantile continued). In this case, for a given quantile τ 1 one applies the usual QR procedure to estimate θpτ 1 q and obtain p x i1 " w J i p θpτ 1 q. Thus, the first-step estimation is given by the following QR Then, in the second step, for a given τ -quantile, the QR estimator p βpτ q can be obtained by (2.6).
These two practically-common cases illustrate the simple implementation of QR model with GR. We have made R routines for the QR-GR estimator and inference in the QR-GR framework available for the practitioners.

Asymptotic Properties
We now establish consistency and asymptotic normality of the QR-GR two-step estimator, p βpτ q, defined in the previous section. Proofs are collected in the Appendix. We consider the following regularity conditions: A1. tpy i , x i , w i qu n i"1 is independent across i. The conditional distribution functions of the error term tF i pu|x i qu have continuous densities f i pu|x i q with a unique conditional τ -th quantile equal to 0, and f i p0|x i q are uniformly bounded away from 0 and 8.
A2. There exist positive definite matrices D 0 and D 1 such that where r i p¨q is a continuous function which satisfies that Err i pθqs " 0 and V arrr i pθqs " V .
A5. There exists a positive definite matrix Conditions A1 and A2 are the usual conditions in the QR literature. Assumptions A1 allows for non-iid sampling, and A2 requires limiting matrices to be well defined. Assumptions A3-A5 refer to the GR estimation in the first step and need to be verified for each empirical application. These conditions are very mild. Assumptions A3 and A4 impose consistency and asymptotic normality, respectively, for the first-step estimator of the GR.
These conditions hold for most estimators employed in empirical work in the family of Mand Z-estimators. We note that no restrictions are imposed on the functional form of g j p¨q except condition A5 which is a weak smoothness condition needed only for nonlinear models.
The following result states the consistency of the QR-GR estimator. Remark 3.1. A consistent estimate of the unknown parameter θ j for j " 1,¨¨¨, q in the first step suffices for the consistency of QR with GR. In other words, replacing x by p x in a quantile regression still gives us a consistent QR estimator.
The next result establishes the asymptotic normality of the QR-GR two-step estimator.
Theorem 3.2 (Asymptotic normality). Consider the model in (2.2) and (2.3). Under conditions A1-A5, as n Ñ 8, we have that, The result in Theorem 3.2 shows that the limiting distribution of ? np p βpτ q´β 0 pτ qq generally depends on the statistical properties of ? np p θ´θq since the sampling error of generating regressor in model (2.3) contaminates the variance-covariance estimation in the second stage. Hence, the asymptotic variance-covariance of the two-step estimator needs to be adjusted when the regressors are generated. The adjusted variance-covariance matrix is larger, since the sampling error V from the first-step estimation enters variance-covariance matrix. Nevertheless, the sampling variation of p θ j can be ignored, at least asymptotically, if the coefficients β j of the GR are zero, i.e. D j 12 " 0 and the corresponding part in M becomes zero. In that case, replacing the true regressors with GR will not impact the asymptotic distribution of the estimator.
Remark 3.2. In general, the consistency of QR estimator with GR holds when the true regressor is replaced by GR. However, the asymptotic variance-covariance matrix of the estimator p βpτ q needs to be adjusted because of the sampling variation introduced by estimation of p θ j in the first step.
Remark 3.3. Comparing the QR-GR estimator p βpτ q with regular QR estimator with the true unobserved regressors, we see that the additional second term appears in the variancecovariance matrix coming from the first-step estimation. The first-step estimation contaminates the variance-covariance matrix of p βpτ q in two ways: one is the sampling error V j of coefficient estimates θ j in the first stage, the other is the gradient of model specification with respect to the parameter in the first stage. For the simple linear model in the first stage, the gradient is simply the regressors in the first stage. But for nonlinear model in the first stage, the coefficient estimates show up in D j 12 which makes the variance-covariance matrix of p βpτ q larger than when the true regressors are used for estimation.
The two-step estimation procedure in this paper is easy to implement in practice. First, since weak conditions are needed for QR-GR procedure, different estimation strategies may be used in the first step to construct the GR. Most common estimators in practice satisfy the weak conditions: for example, the simple OLS, QR, MLE methods, etc. Second, both linear and nonlinear model specifications in the first step are allowed. Finally, it is important to notice that weak conditions in the QR step include non-iid models which allow practitioners to proceed inference in a general class of models.
To further illustrate the estimation of QR models with GR, we further discuss the two common models, OLS and QR with GR. We discuss the case where only one regressor is generated in the first step. For both models, after estimating p θ which satisfies the condition ? np where Err i pθqs " 0 and V arrr i pθqs " V in the first step, one obtains the GR: p x The following examples derive the asymptotic variance-covariance matrix for both OLS and QR with GR: Example 3.1 (OLS with GR). In this example, p β is estimated from the model Since the OLS estimator has a closed-form solution, one obtains p β "`ř n i"1 p Finally, the variance-covariance matrix is given by the following result which is proved in the Appendix, for completeness. Note that according to Proposition 2 in the Appendix, under conditions C1-C5, When there is no GR or the coefficients of the GR are zero, the asymptotic variance displayed above simplifies since G " 0. In the case of V arpu|Xq " σ 2 u I n , where σ 2 u ą 0 and I n is an nˆn identity matrix, we obtain that ?
Example 3.2 (QR with GR). In this example, p βpτ q is estimated from the model Unlike the simple closed-form solution in the above OLS estimate, one applies the usual QR procedure to estimate where D 0 , D 1 , and D 12 are defined in conditions A2 and A5, and M is simplied to M " In addition, one can notice that the above asymptotic variance-covariance matrix can be simplified to To conclude this section we have two remarks. First, from the second example, one can notice that the variance for the QR with GR has the additional terms D´1 1 D 12 V D J 12 D´1 1 and´2D´1 1 M D´1 1 relative to the standard QR model without GR, which has the variance τ p1´τ qD´1 1 D 0 D´1 1 . Second, intuitively, as shown in Theorem 3.1 and Proposition 1, in both OLS and QR frameworks, estimation with GR still produces consistent estimates. However, as shown in Theorem 3.2 and Proposition 2, the corresponding variance-covariance matrices need to account for two sources of error -the usual estimation error in the OLS or QR method, and the sampling error in generating the regressors.

Inference
In this section, we turn our attention to inference in the QR-GR model. First, we suggest an estimator for the asymptotic variance-covariance matrix of the QR-GR estimator. Second, we propose a Wald-type test for general linear hypotheses.

Variance-Covaraince Matrix Estimation
In applications, the variance-covariance matrices are unknown and need to be estimated. Now we suggest an estimator for the corresponding variance of the QR-GR estimator. The estimator is closely related to those suggested by Hendricks andKoenker (1991) andPowell (1991) and given as the following form To establish the consistency of p Ωpτ q we impose the following assumptions: B1. There exists a positive sequence of bandwidths tc n u such that c n Ñ 0 and ? nc n Ñ 8.
B2. Ep}x i } 4 q ď H 1 ă 8 for all i and for some constant H 1 .
B4. There exists a bounded function A f px i q such that f i pλ|x i q ď A f px i q for all i and for some λ near zero a.s. 4 B5. f i pλ|x i q satisfies the Lipschitz condition that |f i pλ 1 |x i q´f i pλ 2 |x i q| ď L i |λ 1´λ2 | for some constant L i ă 8 and for all i.
Assumption B1 is a restriction on the bandwidth c n , which is commonly used to estimate unknown conditional densities. B2 imposes some moment condition, which is typical in the QR literature. Assumption B3 imposes smoothness and dominance conditions that are typical for nonlinear models (see, e.g., Powell (1991)). B4 imposes some local restriction on the conditional density function near zero and satisfies some moment bounds, which can be weakened at the cost of moment bounds for various cross products of the bounding functions.
(see, e.g., Assumption ER of Buchinsky and Hahn (1998)). B5 imposes smoothness condition on the conditional densities and is standard in the QR literature. 4 We note that x i " px i1 ,¨¨¨, x iq , x iq`1 ,¨¨¨, x ik q where x ij " g j pw ji , θ j q for j " 1,¨¨¨, q. And we also note that for any θ j P Θ j and for all i, The next result states the consistency of the variance-covariance QR-GR estimator. In other words, p D 0 defined in conditions A2 and A5, and M is defined in Theorem 3.2.
As a special case of more general procedure which generates the regressors, consider a linear model r is reduced to w, in the following examples for the average and quantile models: Example 2.1(a) (Average continued). In this example, one employs the standard OLS estimator in the first step to obtain Finally, with the second step estimation p β, the estimator of the corresponding variance-covariance matrix is simply p with p v being the residual from the first-step estimation.
Example 2.1(b) (Quantile continued). In this example, for a given quantile τ 1 one applies the usual QR procedure to estimate θpτ 1 q and obtains p x 1 . Thus, the first-step estimation is given by the following QR So one obtains the standard error p V and computes p The final formula for the estimator of the variance covariance matrix is analogous to the previous example.

Testing
In the independent and identically distributed setting, the conditional quantile functions of the response variable, given the covariates, are all parallel, implying that covariate effects shift the location of the response distribution but do not change the scale or shape. However, slope estimates often vary across quantiles, implying that it is important to test for equality of slopes across quantiles. Wald tests designed for this purpose were suggested by Koenker and Bassett (1982a), Koenker and Bassett (1982b), and Koenker and Machado (1999). It is possible to formulate a wide variety of tests using variants of the proposed Wald test, from simple tests on a single quantile regression coefficient to joint tests involving many covariates and distinct quantiles at the same time.
General hypotheses on the vector βpτ q can be accommodated by Wald-type tests. The Wald process and associated limiting theory provide a natural foundation for the hypothesis where R is a full-rank matrix imposing s number of restrictions on the parameters and r is a column vector of s elements. We consider a Wald-type test where we test the coefficients for selected quantiles of interest. For simplicity, we use the model stated in equation (2.2) with a single variable in the x matrix. The following example is a hypotheses that may be considered in the former framework.
In general, for given τ , the regression Wald process can be constructed as In order to implement the test it is necessary to estimate Ωpτ q consistently. It is possible to obtain such an estimator as suggested in Theorem 4.1 in the previous section, and the main components of p Ωpτ q can be obtained as in equations (4.2)-(4.4).
Given the results on consistency of p Ωpτ q, if we are interested in testing Rβpτ q " r at a particular quantile τ " τ 0 , a Chi-square test can be conducted based on the statistic W n pτ 0 q.
Under H 0 , the statistic W n is asymptotically χ 2 s with s-degrees of freedom, where s is the rank of the matrix R. The limiting distribution of the test is summarized in the following theorem.

Monte Carlo Simulations
In this section, we evaluate the performance of two-step QR-GR estimator and compare its performance with the usual QR estimator which does not account for the first-step estimation. Also, the performance of the proposed variance-covariance estimator is evaluated. The computational results are obtained in the R language.

Monte Carlo Design
We consider the following model as a data generating process: where " N p0, 1q, and β 0 and β 1 are the parameters of interest. We set pβ 0 , β 1 q " p4, 3q.
The parameter γ captures the heterogeneity, hence we let γ " t0, 1u. When γ " 0 we have a location-shift model and for γ " 1 the location-scale-shift. Thus, for the later model, we have that β 1 pτ q " β 1`γ F´1 pτ q.
For comparison, we consider two estimators of β 1 : (i) the standard (infeasible) QR using the unobserved regressors x˚, which we label QR; (ii) the QR estimates with the GR as described above, which is defined as QR-GR. For the two-step QR-GR estimator, the estimation process is as following. In step 1, using the OLS estimation, we obtain the generated regressor (GR) from the model: p x " p θ 0`p θ 1 w`p θ 2 z, where px, w, zq are observables. In step 2, for each τ " p0.1, 0.3, 0.5, 0.7, 0.9q, we estimate β 1 using the QR-GR estimator of y on p x.
We also present results for the corresponding standard errors (SE) of the estimators. For the QR-GR estimator we use the estimator in equation c n " k¨pΦ´1pτ`h n q´Φ´1pτ´h n qq where k " mintSEpp uq, pquantilepp u, 0.75q´quantilepp u, 0.25qq{1.34u, p u is the residual of the QR estimation, and h n is the default bandwidth in the R package. k is a robust estimate of the scale and the bandwidth c n is also commonly chosen in the R package.

Location Shift Model
The results for the location-shift model are provided in Tables 1-2. The bias, SE, and root mean squared error (RMSE) of both QR-GR and QR estimators are presented in Table   1. Different sample sizes n " 100 and n " 1000 in the experiments are reported in the table. According to the theorems in Section 3, both estimators QR-GR and QR should be consistent. As shown in Table 1, both estimators have empirical bias very close to zero, even for small samples.
Table 1 also shows that the standard error decreases as the sample size increases. However, the QR-GR have substantially larger standard error and RMSE than the regular QR with the true regressor. This result is expected from the theoretical results. These observations reflect the fact that having a GR estimation in the first step does not affect the bias performance but induces a substantially larger variance. This confirms that the sampling error of obtaining by the GR contaminates the standard error in the second-stage QR estimation.
In Table 2 we evaluate the performance of proposed variance-covariance estimator discussed in Section 4 for n " 1000. We report three statistics in Table 2. First, in column 1, we report the sample standard deviation of the QR-GR estimates based on the Monte Carlo repetitions, which approximates the true standard error of the parameter β 1 of interest. Sec-ond, the average of the proposed standard error of the QR-GR estimator is reported in the column 2. Finally, for comparison, the standard error of the usual (infeasible) QR is given in the column 3.
By comparing columns 1 and 2 of Table 2, we can see that the estimated standard errors (SE) of the proposed QR-GR is very closely to the true value given in column 1. However, the average of the estimated standard error calculated in the usual way is severely biased downwards. This result confirms the theoretical predictions and reflects that the sampling error from the first-step estimation induces a larger variance-covariance matrix in the second step. Thus, the estimated standard errors from the conventional formula without considering the GR problem underestimates the population counterpart, and in turn severely affects the inference procedure.

Location-scale Shift Model
The results for the bias and RMSE for sample sizes n " 100 and n " 1000 are reported in Table 3. For the location-scale shift model, both QR and QR-GR estimators have small bias so that they are close to the population value of 3`F´1 pτ q. However, as expected, the QR-GR have larger standard error and root mean square error (RMSE) than the regular QR with the true regressor for both sample sizes. Thus, the results corroborates the theoretical findings that the GR variable has no asymptotic effects in the bias performance but induces a larger variance.
As in the previous case, in Table 4 we assess the performance of the proposed variancecovariance estimator discussed in Section 4 for the location-scale model. The results are analogous. We see that the proposed QR-GR standard errors in column 2 closely approximates the true standard error in column 1. However, the estimated standard error calculated in the usual way in column 3 is severely smaller and does not approximate well the true standard error.    (1857) and Working (1943)). They are regression functions where the dependent variable is the level or the budget share of total expenses used to purchase a commodity of goods or services, and the explanatory variable, total expenditure, is usually used as a proxy for income. A very robust empirical result referred to as 'Engel's law' states that the poorer a family is, the larger the budget share it spends on food. Other categories of expenditure present a less robust pattern.
Many researchers explored different functional forms for Engel curves which better fit the data. For example, Lewbel (1997) proposed a functional form for Engel curves that contains a linear function of logarithm of total expenditure and some nonlinear function of total expenditure. Nonparametric estimations also have been incorporated in the estimation of Engel curves like Blundell, Chen, and Kristensen (2007). Methods for comparing different regression functions are discussed in Lewbel (2008). These various shapes of Engel curves suggest a deeper understanding of underlying motives which drive household expenditure decisions. However, it is problematic to assume that only one motive drives the consumption of one particular category of goods or services (Chai and Moneta (2010)). For example, luxury, as a relative concept, is possible in all sorts of consumption. Barigozzi and Moneta (2016) studied a system of budget share and extracted multiple factors that span the same space of basic Engel curves. To understand how the patterns of consumption may be driven by a mixture of different motives, we use the quantile regression with generated regressor (QR-GR) framework laid out in Section 2, in order to explore the heterogeneous effects of motives over different commodity.
To be specific, we estimate unobserved factors which represent underlying motives of expenditure in the first step using the factor model in Barigozzi and Moneta (2016). We then apply quantile regression (QR) methods to the Engel curve model in the second step where budget share is regressed on the estimated factors. The proposed QR-GR model is used to study how each type of household expenditure is driven by different underlying motives. The QR-GR model has two advantages. First, an important difficulty with the model is that the estimator of the variance-covariance matrix needs to take into account the fact that unobserved factors are estimated. And the proposed method is able to provide statistically reliable inference via correct estimation of the variance-covariance matrix. Second, we accommodate possible heterogeneity in the effects of total expenditure over the conditional distribution of budget shares. We find that the estimated Engel curves of budget shares are driven by a mixture of three underlying forces which are motives for a household to consume necessities, luxuries, and goods or services on which is spent the same percentage of the total budget. Also, we find heterogeneity in each motive along the conditional distribution of budget shares.

Data Description
The data we use in this paper is the household data from the UK Family Expenditure In this paper, we study a sample of about 4000 households which have 2 to 4 family members, and the budget shares of these categories are pooled over ten years. Table 5 reports some descriptive statistics for total expenditure and 13 categories of expenditures.
As shown in the

Empirical Analysis
Barigozzi and Moneta (2016) study a system of budget shares that are driven by latent factors. Using the factor analysis in Bai and Ng (2002), they determine the number of basic Engel curves (i.e., the rank of the system) and found that budget shares of each expenditure can be approximated by a three-factor model: (i) a decreasing function (necessities), (ii) an increasing function (luxuries), (iii) a constant function over the total expenditure (unitary elasticity goods).
We estimate the following conditional quantile function: Q τ py i |x 1i , x 2i q " β 0 pτ q`β 1 pτ qx 1i`β2 pτ qx 2i , (6.1) where the quantity y i denotes the household budget share of a commodity, and x 1i and x 2i are the first and second motives of total expenditure, respectively. Since the third motive is constant over the total expenditure, a constant is also included.
In order to estimate the quantile model in (6.1), we first obtain the motives x 1 and x 2 using a regression of factors on total expenditure. We compute these generated regressors by following the functional specifications of each motives established in Barigozzi and Moneta (2016). Denote factors by f 1 , f 2 , and f 3 " 1, and denote the total expenditure by z. Let w 1 " p1, logpzqq J and w 2 " p1, z logpzqq J . We use the following estimation steps: Step 1 Obtain the fitted values (generated regressors) of the first two motives by regressing the corresponding factors on functions of the total expenditure as following: Step 2 Run a quantile regression of budget shares y on the three motives where the third motive is associated with a constant: with β " pβ 0 , β 1 , β 2 q.
Remark 6.1. We note that, in step 1, the motives p x 1 and p x 2 are estimated by regressing the factors on the total expenditures where the factors are obtained by the principal component analysis. For the sake of notational simplicity, we assume the factors and total expenditures are observed without errors.
Remark 6.2. In step 2, each motive represents specific reactions to consumption changes.
The first one x 1 is interpreted as a motive to consume necessities since it is decreasing in the total expenditure, while the second one x 2 represents the motive to consume luxuries, which is increasing in the total expenditure. The third one x 3 " 1 is related to unitary elasticity goods which represents goods or services allocated the same percentage of total budget by both rich and poor households.
We estimate the coefficients in model (6.1) across different quantiles. For conciseness, we only report the results for food, housing and leisure. The main results in Figure 1  The 95% confidence intervals in dashed lines are based on our proposed estimator of the asymptotic variance-covariance matrix, while the 95% confidence intervals in dotted lines are calculated by conventional QR variance-covariance estimation. Clearly, the adjusted confidence bands for QR-GR are much larger than the conventional bands which do not adjust for the GR issue. This can be observed more clearly in Table 6 which summarizes the estimated standard errors for a subset of quantiles τ P t0.25, 0.5, 0.75u. The table reports that the estimated standard errors from the proposed estimator (QR-GR SE) is always larger than those from the conventional estimator (QR SE). These results underscore the importance of estimating standard errors correctly by taking into account the GR issue in Engel curves.  Table 6: Coefficient estimates of motives 1 and 2 and their standard errors at quantiles τ " 0.25, 0.5, 0.75 for (a) food, (b) housing and (c) leisure. Columns 1 and 4: coefficient estimates; Columns 2 and 5: standard error calculated in adjusted QR estimation; column 3 and 6: standard error calculated in conventional QR estimation.
Since both x 1 and x 2 are normalized measures of different motives, comparing the relative magnitudes of their coefficient estimates helps us to tell which motive plays a leading role in a particular commodity. For example, Figure 1(a) shows that the motive to consume food as necessity clearly plays a leading role relative to luxuries at all quantiles. In Figure 1(c), the motive to consume leisure as luxuries is dominating a role as necessity. However, consuming housing in Figure 1 To help further interpret how the patterns of budget shares are related to these unobservable motives, we show how the budget share of a particular commodity varies over the total expenditure in Figure 2 through Figure 4. We estimate the contribution of each motive to the budget share at different quantiles τ P t0.25, 0.5, 0.75u. It is interesting to see how the motive to consume a particular commodity as necessity or luxury contributes to its budget share at different levels of total expenditure. For instance, Figure 2 shows that, as total expenditure increases, the contribution of the first motive, x 1 (necessity), to food budget share decreases at all τ , while the second motive, x 2 (luxuries), increases, as well documented in the literature. This implies that as total expenditure increases, the motive to consume food as a necessary good decreases, while the motive to consume food as a luxury good increases. The pattern in Figure 3, which is very different from food, reflects that individuals consume housing more as necessity but less as luxury, as their total expenditure increases. The motives to consume food and housing as necessity or luxuries play opposite roles in the budget share as total expenditure changes.
However, as shown in Figure 4, both motives to consume leisure increase as total expenditure increases as well. Another interesting point is that the magnitudes of those estimated factors or motives are different across different quantiles of the conditional distribution of budget shares in all three commodities.  Figure 5(a) shows that the budget share of food decreases as total expenditure increases, which is classically referred to as Engel's law. In Figure   5(b) we see a concave shape curve, that is, the budget share of housing increases for low expenditures, but decreases as total expenditure increases. In addition, individuals always raise the budget share of leisure as they consume more, as shown in Figure 5(c). Finally, and importantly, in all three commodities, the shapes of the curves vary over different quantiles τ . In all, the results show strong evidence that individual heterogeneity in the shape of Engel curves is uncovered by the QR.

Conclusion
We study estimation and inference for linear quantile regression (QR) models with generated regressors (GR). We propose a QR-GR two-step estimator for the parameters of interest and an estimator for the corresponding asymptotic variance-covariance matrix. We establish the asymptotic properties of the estimators. Monte Carlo simulation and estimation of the Engel curves using data from the UK Family Expenditure Survey confirm that taking into account the GR problem in the QR framework is essential for correct inference. Furthermore, the empirical application shows strong heterogeneity of the Engel curves over different quantiles of the conditional distribution of budget shares in most commodities.

Proof of the Main Results
Proof of Theorem 3.1 (Consistency). Let p βpτ q be the two-step quantile regression (QR) with generated regressors (GR) estimator, and q βpτ q be the usual (infeasible) QR estimator with true unobservable variables. Consider p βpτ q´β 0 pτ q " p p βpτ q´q βpτ qq`p q βpτ q´β 0 pτ qq.
Letx q " px 1 , ..., x q q J and g " pg 1 , ..., g q q J . By noticing that p β " p βpp x 1 ,¨¨¨, p x q , x q`1 ,¨¨¨, x k q and q β " q βpx 1 ,¨¨¨, x q , x q`1 ,¨¨¨, x k q, and by expanding the first term in equation (8.1) p βpτ q´q βpτ q, we have where the third equality follows from an Taylor expansion for g j p¨q for j " 1,¨¨¨, q and where B q βpτ q Bx J q is qˆk matrix, ∇ θ gpw, θq is ř q j"1 p jˆq matrix, and p p θ´θq is ř q j"1 p jˆ1 vector.
As seen in (8.3), q βpτ q can be rewritten as By taking the derivative of q βpτ q with respect to x j , one can notice that the contribution of the ψ τ py i´x J i β 0 pτ qq term is o p p1q.
Now we apply Lemma 2 about G-inverse from Ma and Koenker (2006), and obtain that It follows that Note that plim nÑ8 p θ " θ by assumption A3 and D´1 1 D 12 is bounded by assumptions A2 and A5, we have that´D´1 1 D 12 p p θ´θq " o p p1q. Therefore, as n Ñ 8, we have  (8.6) and satisfies a Central Limit Theorem such that where ψ τ puq :" τ´Ipu ď 0q.
By noticing that p β " p βpp x 1 ,¨¨¨, p x q , x q`1 ,¨¨¨, x k q and q β " q βpx 1 ,¨¨¨, x q , x q`1 ,¨¨¨, x k q, and by expanding the first term in equation (8.5) ? np p βpτ q´q βpτ qq, we have As seen in (8.6), the Bahadur representation for ? n q βpτ q can be rewritten as Taking the derivative of ? n q βpτ q with respect to x j and noticing that the contribution of the np p βpτ q´q βpτ qq "´D´1 1 1 ? n Substituting A4 where p θ´θ " 1 n ř n i"1 rpθq in the above equation, we obtain that where M " plim nÑ8 1 n ř n i"1 tψ τ py i´x J i β 0 pτ qqf i p0|x i qx i rpθq J ∇ θ gpw i , θq JxJ i βpτ q J u. Finally, combining the above covariance term with (8.7) and (8.10) with (8.5), we obtain Proof of Theorem 4.1 (Consistency of variance-covariance matrix). (i) Claim that p D 0 in other words, By definition: We show that n´1 ř n i"1 p x i p x J i and n´1 ř n i"1 x i x J i are close to each other element by element: (a) For j " 1,¨¨¨, q, j 1 " 1,¨¨¨, q , we have Assumption B3 implies the following uniform convergence by applying the uniform weak law of large number (UWLLN): (Ep¨q means expectation in terms of joint distribution) and assumption A3 that p θ j Ñ θ j for j " 1,¨¨¨, q implies that It follows that (b) For j " 1,¨¨¨, q, s " q`1,¨¨¨, k, similarly by applying UWLLN, we have n´1 n ÿ i"1 x ij x is " n´1 n ÿ i"1 g j pw ji , θ j qx is p Ñ Erg j pw j , θ j qx is s, and the consistency of θ j implies that After claiming (a) and (b), we have To obtain a consistent estimator of D 1 which contains unknown conditional density function, the estimator p D 1 replaces the conditional density functions by uniform kernel weights.
We can find a bound for } p D 1´r D 1 } such that it can be arbitrarily small in probability if δ is chosen sufficiently small. Since the bound for T 2 is easy to derive, we only show that To show this, applying assumption B3, (8.12) and (8.13), we have Inequality (8.11) can be rewritten as |Ip|p i | ă c n q´Ip| i | ă c n q| ď Ip| i´cn | ă δc n r}p x i }`max j D j pw j qsq (8.14) Ip| i`cn | ă δc n r}p x i }`max j D j pw j qsq.

Then we have
EpT 1 q ď E # p2nc n q´1 n ÿ i"1 " Ip| i´cn | ă δc n r}p x i }`max j D j pw j qsq where A is some bound for Erp}p x i } 3`m ax j D j pw j q}p x i } 2 qA f px i qs by assumption B4.
(b) We now show r D 1´D1 " o p p1q. Note that r D 1´D1 " p2nc n q´1 For the first term, it has zero expectation and if a and b are k-dimensional vectors with }a} " }b} " 1 and consider the variance of the first term, Er}x i }}x i }s 2 " p4nc 2 n q´1H 1 " op1q, where the second equality holds because all the cross-product terms are zero by the law of iterated expectations, and H 1 is some bound for Ep}x i } 4 q by assumption B2. Therefore, the first term converges in quadratic mean to zero, which implies that it converges to zero in probability.
For the second term, |p2c n q´1ErIp| i | ă c n q|x i s´f i p0|x i q| "ˇˇˇˇp2c n q´1 ż cń cn f i pλ|x i qdλ´f i p0|x i qˇˇˇď |p2c n q´12c n f i pλ˚|x i q´f i p0|x i q| ď L i |c n | " o p p1q, where f i pλ˚|x i q " max λPr´cn,cns f i pλ|x i q and the last inequality uses assumption B5.
(iii) Claim that p D 12´D12 " o p p1q: p D 12 " p2nc n q´1 n ÿ i"1 Ip|p i | ă c n q p βpτ q p x i ∇ θ gpw i , p θq J , Define r D 12 " p2nc n q´1 n ÿ i"1 Ip| i | ă c n qβpτ qx i ∇ θ gpw i , θq J , where p i " y i´p x J i p βpτ q and i " y i´x J i β 0 pτ q. Similar to Part (ii), we need to show that } p D 12´r D 12 } " o p p1q and } r D 12´D12 } " o p p1q. } r D 12´D12 } " o p p1q is easy to derive by following the idea in Part (ii)(b), we only show that } p D 12´r D 12 } " o p p1q.
Combining the above two inequalities, assumption B3 and (8.14), we have } p D 12´r D 12 } ď p2nc n q´1 n ÿ i"1 # " Ip| i´cn | ă δc n r} p x i }`max j 1 D j 1 pw j 1 qsq Ip| i`cn | ă δc n r} p x i }`max j 1 D j 1 pw j 1 qsq  }βpτ q}¨} p x i }D j pw j q Ip| i | ă c n qδc n } p x i }D j pw j q Ip| i | ă c n qδc n max j 1 D j 1 pw j 1 qD j pw j q Ip| i | ă c n qδc n J} p x i } * " N 1`N2`N3`N4 .
Similar to Part (ii)(a), we can find a bound for } p D 12´r D 12 }, such that it can be arbitrarily small in probability if δ is chosen sufficiently small. Since the bounds for N 2 , N 3 and N 4 are easy to derive, we only show that EpN 1 q " Opδq as follows: Ip| i´cn | ă δc n r} p x i }`max j 1 D j 1 pw j 1 qsq Ip| i`cn | ă δc n r} p x i }`max j 1 D j 1 pw j 1 qsq  }βpτ q}¨} p x i }D j pw j q + ď E # }βpτ q}¨} p x i }D j pw j qp2nc n q´1 n ÿ i"1 2 ż cn`δcnr} p x i }`max j 1 D j 1 pw j 1 qs cn´δcnr} p x i }`max j 1 D j 1 pw j 1 qs f i pλ|x i qdλ + ď E # }βpτ q}¨} p x i }D j pw j qp2nc n q´14δc n n ÿ i"1 r} p x i }`max j 1 D j 1 pw j 1 qsA f px i q + ď n´1 n ÿ i"1 2}βpτ q}¨δ¨B j " Opδq, for some B j satisfying ErD j pw j qt}x i } 2`m ax j 1 D j 1 pw j 1 q}x i }uA f px i qs ă B j ă 8, where the last inequality holds by assumption B4.
(iv) Claim that x

OLS with Generated Regressors
In order to compare our quantile regression with generated regressor framework with the OLS with generated regressor (OLS-GR), we include the detailed assumptions and propositions for the OLS case in this section.
Let p δ be a ? n-consistent estimator of δ, and we obtain the generated regressors as p x i " hpw i , p δq. Let p β be the OLS-GR estimator from the equation where p x i " hpw i , p δq. We impose the following regularity conditions.
We first state the consistency of the OLS-GR estimator.
Proof of Proposition 1. Recall that Thus,