Journal of Economic Perspectives, Volume 15, Number 4, Fall 2001, Pages 29–42

Semiparametric Censored Regression Models

Kenneth Y. Chay and James L. Powell

A regression model is censored when the recorded data on the dependent variable cuts off outside a certain range with multiple observations at the endpoints of that range. When the data are censored, variation in the observed dependent variable will understate the effect of the regressors on the “true” dependent variable. As a result, standard ordinary least squares regression using censored data will typically result in coefficient estimates that are biased toward zero.

Traditional statistical analysis uses maximum likelihood or related procedures to deal with the problem of censored data. However, the validity of such methods requires correct specification of the error distribution, which can be problematic in practice. In the past two decades, a number of semiparametric alternatives for dealing with censored data have been proposed. In a semiparametric approach, part of the functional form of the model (usually the regression function) is parametrically specified by the researcher based upon plausible assumptions, while the rest of the model is not parameterized.[1] While the theoretical literature has produced several semiparametric estimators for the censored data model, published applications of these estimators to empirical problems in economics have lagged far behind.

This paper reviews the intuition and computation of a handful of semiparametric estimators proposed for the censored regression model. The various estimators are used to examine changes in black-white earnings inequality during the 1960s, around the time of the passage of the Civil Rights Act of 1964, based on longitudinal Social Security Administration (SSA) earnings records. These earnings records are censored at the taxable maximum; that is, anyone earning more than the maximum that was taxable under Social Security is recorded as having earned at the maximum. Thus, above the maximum, the data on earnings do not accurately reflect actual earnings. Ordinary least squares analysis of these data implies little convergence in the earnings of black and white workers during the 1960s. On the other hand, the estimates from the semiparametric models that account for censoring suggest that significant black-white earnings convergence did occur after 1964. Comparisons of the results from parametric and semiparametric procedures help pinpoint sources of misspecification in the parametric approach.

[1] In this issue, the paper by DiNardo and Tobias offers further discussion of nonparametric and semiparametric analysis.

Kenneth Y. Chay is Assistant Professor of Economics and James L. Powell is Professor of Economics, University of California, Berkeley, California.

Censored Regression Models and Estimators

The Social Security Administration dataset that we analyze suffers from the simplest form of data censoring, interval censoring, for which the values of the “true” dependent variable, $y^*$, are observed only if they fall within some known, often one-sided, interval $[a, b]$. Otherwise, the closest endpoint of the interval is observed instead of $y^*$. Tobin (1958) used this model to analyze consumer expenditures on automobiles, with $a = 0$ and $b = \infty$, and economists generally refer to regression models with nonnegativity constraints as Tobit models. Other typical applications of these censored regression models are to right-censored data, where $a = -\infty$ and $b$ represents a maximum recordable value for the dependent variable. Such models arise for top-coded data, where sufficiently large values of the true variable $y^*$ are recorded as “at least equal to $b$.” In our primary empirical application, the dependent variable, the logarithm of annual earnings, is “top-coded,” or censored from above, with $b$ equal to the logarithm of the maximum annual earnings subject to Social Security taxes in a given year.

Algebraically, the model for the observed dependent variable $y$ under interval censoring is

$$
y = \begin{cases}
a & \text{if } x'\beta + \varepsilon < a, \\
b & \text{if } x'\beta + \varepsilon > b, \\
x'\beta + \varepsilon & \text{otherwise,}
\end{cases}
$$

where $y$ is the observed value of the dependent variable, $x$ is a vector of observed explanatory variables, $\beta$ is a vector of unknown regression coefficients to be estimated, $\varepsilon$ is an unobserved error term, and $a$ and $b$ are the censoring interval endpoints. While the true dependent variable $y^*$ satisfies a standard linear regression model, the observed variable $y$ clearly does not when $y^*$ lies outside $[a, b]$. Because $y$ does not vary with the regressors $x$ when it is censored (unlike the true variable $y^*$), standard least squares regression will underestimate the magnitude of the regression slope coefficients.
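The attenuation described above can be seen in a small simulation. The following is only an illustrative sketch: the data-generating values (a single regressor, slope 1, normal errors, top-coding at 1) and the function names `ols_slope` and `simulate` are assumptions made for the example and are not taken from the paper.

```python
import random

def ols_slope(xs, ys):
    """Slope from a least squares regression of ys on xs (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def simulate(n=5000, beta=1.0, b=1.0, seed=0):
    """Draw x, the latent y*, and the top-coded y = min(y*, b) (a = -infinity)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 2.0) for _ in range(n)]
    y_star = [beta * x + rng.gauss(0.0, 0.5) for x in xs]
    y = [min(v, b) for v in y_star]
    return xs, y_star, y

xs, y_star, y = simulate()
slope_latent = ols_slope(xs, y_star)    # regression on the "true" variable
slope_censored = ols_slope(xs, y)       # regression on the recorded variable
share_censored = sum(v >= 1.0 for v in y) / len(y)

print("share top-coded:", round(share_censored, 3))
print("OLS slope using y*:", round(slope_latent, 3))
print("OLS slope using censored y:", round(slope_censored, 3))
```

In this hypothetical design the slope estimated from the censored variable is pulled toward zero relative to the slope estimated from the latent variable, because the top-coded observations no longer move with $x$.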

If the distribution of the error terms $\varepsilon$ given the regressors has a known parametric form (for example, normally distributed and homoskedastic errors), it is straightforward to derive and maximize the likelihood function. This provides a consistent and approximately normal estimator of the regression coefficients $\beta$ (see, for example, Amemiya, 1985, chapter 10). However, in many empirical problems, the distribution of the errors is not known or is subject to heteroskedasticity of unknown form. In such cases, the maximum likelihood estimator will not provide a consistent estimate (Goldberger, 1983; Arabmazar and Schmidt, 1981, 1982). Also, for censored panel data with fixed effects (that is, censored data with repeated observations on individuals over time and intercept terms that are allowed to vary freely across individuals), maximum likelihood estimation methods will generally be inconsistent even when the parametric form of the conditional error distribution is correctly specified (Honoré, 1992).
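For the normal, homoskedastic, top-coded case, the likelihood mentioned above has two kinds of terms: a normal density for each uncensored observation and an upper-tail probability for each top-coded one. The sketch below writes this out for a one-regressor model without intercept and maximizes it by a crude grid search. The simulated data, the grid, and the names `norm_sf` and `tobit_loglik` are assumptions for illustration only; they are not the authors' implementation.

```python
import math
import random

def norm_sf(z):
    """Upper-tail probability 1 - Phi(z) of the standard normal."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def tobit_loglik(beta, sigma, xs, ys, b):
    """Log-likelihood of y = min(beta*x + eps, b) with eps ~ N(0, sigma^2)."""
    total = 0.0
    for x, y in zip(xs, ys):
        if y >= b:
            # Top-coded observation: contributes P(y* >= b | x).
            total += math.log(norm_sf((b - beta * x) / sigma))
        else:
            # Uncensored observation: contributes the normal density of y.
            z = (y - beta * x) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)
    return total

# Hypothetical data: slope 1, error s.d. 0.5, top-coding at 1.
rng = random.Random(1)
b_top, beta_true, sigma_true = 1.0, 1.0, 0.5
xs = [rng.uniform(-1.0, 2.0) for _ in range(1000)]
ys = [min(beta_true * x + rng.gauss(0.0, sigma_true), b_top) for x in xs]

# Crude grid search standing in for a proper numerical optimizer.
beta_grid = [0.50 + 0.02 * k for k in range(51)]
sigma_grid = [0.30 + 0.05 * k for k in range(11)]
loglik_hat, beta_hat, sigma_hat = max(
    (tobit_loglik(bv, sv, xs, ys, b_top), bv, sv)
    for bv in beta_grid
    for sv in sigma_grid
)
print("grid maximizer: beta =", round(beta_hat, 2), "sigma =", round(sigma_hat, 2))
```

The example works because the simulated errors really are normal and homoskedastic; the point of the surrounding discussion is that this estimator loses consistency when that assumption fails.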

Thus, it is important to develop estimation methods that provide consistent estimates for censored data even when the error distribution is nonnormal or heteroskedastic. Here, we focus on describing three particular semiparametric estimators for the censored regression model, with acronyms CLAD, SCLS and ICLAD. All three estimators can be computed by alternating between a “recensoring” step, in which the data are “trimmed” (using the current parameter estimates) to compensate for the censoring problem, and a “regression” step using the trimmed data to obtain coefficient estimates. More complete algebraic derivations and discussions of the various alternatives are available in Powell (1994, section 5.3). Further details on large-sample properties and standard error formulae can be found in the cited references.

The censored least absolute deviations (CLAD) estimation method was proposed by Powell (1984). For the linear model, the method of least absolute deviations obtains regression coefficient estimates by minimizing the sum of absolute residuals. It is a generalization of the sample median to the regression context just as least squares is a generalization of the sample mean to the linear model. If the true dependent variable $y^*$ were observed, then its median would be the regression function $x'\beta$ under the condition that the errors have a zero median. Least absolute deviations could then be used to estimate the unknown coefficients.

When the dependent variable $y$ is censored, its median is unaffected by the censoring if the regression function $x'\beta$ is in the uncensored region (that is, if $x'\beta$ is in the interval $[a, b]$). However, if the regression function $x'\beta$ is below the lower threshold $a$ (or above the upper threshold $b$), then more than 50 percent of the distribution will “pile up” at $a$ (or $b$). In this case, the median of $y$ is that interval endpoint, which does not depend on $x'\beta$. Thus, computation of the CLAD estimator alternates between deleting observations with estimates of the regression function $x'\beta$ that are outside the uncensored region $[a, b]$ (the “recensoring” step) and estimating the regression coefficients by applying least absolute deviations to the remaining observations (the “regression” step), as described by Buchinsky (1994).[2]

The symmetrically censored least squares (SCLS) estimation method, proposed by Powell (1986b), is based on a “symmetric trimming” idea. For simplicity, suppose that $a = -\infty$, so that the data are “top-coded” at $b$ (as in our empirical application), and assume that the true dependent variable $y^*$ is symmetrically distributed around the regression function $x'\beta$. Due to the censoring, the observed dependent variable $y$ has an asymmetric distribution, since its upper tail is “piled up” at the censoring point $b$. This situation is illustrated in Figure 1. However, symmetry can be restored by “symmetrically censoring” the dependent variable $y$ from below at the point $2x'\beta - b$. Now the regression function is equidistant from both censoring points. Since this new “recensored” dependent variable is symmetrically distributed around the regression function, the regression coefficients can be estimated by least squares. Iterating between this “symmetric censoring” of the dependent variable using the current estimates (which drops observations with values of the regression function above $b$) and least squares estimation of the regression coefficients using the “symmetrically trimmed” data yields the SCLS estimator.
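The two alternating schemes just described (drop-and-refit for CLAD, recensor-from-below-and-refit for SCLS) can be sketched for top-coded data in a deliberately simplified setting: one positive regressor and no intercept, so that the least absolute deviations step reduces to a weighted median of $y/x$. Everything specific here is an assumption for illustration: the simulated design, the heteroskedastic normal errors, the starting values, and the function names. It is not the authors' Stata code.

```python
import random

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = 0.5 * sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    return pairs[-1][0]

def lad_slope(xs, ys):
    """LAD fit of y = beta*x with x > 0: weighted median of y/x, weights x."""
    return weighted_median([y / x for x, y in zip(xs, ys)], xs)

def ols_slope(xs, ys):
    """Least squares fit of y = beta*x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def clad(xs, ys, b, max_iter=100):
    beta = lad_slope(xs, ys)                       # start from LAD on all data
    for _ in range(max_iter):
        # "Recensoring" step: drop observations whose fitted value is at or above b.
        kept = [(x, y) for x, y in zip(xs, ys) if beta * x < b]
        # "Regression" step: LAD on the remaining observations.
        new_beta = lad_slope([x for x, _ in kept], [y for _, y in kept])
        if abs(new_beta - beta) < 1e-12:
            break
        beta = new_beta
    return beta

def scls(xs, ys, b, max_iter=500):
    beta = ols_slope(xs, ys)                       # start from OLS on all data
    for _ in range(max_iter):
        # Censor y from below at 2*x*beta - b; drop fitted values at or above b.
        kept = [(x, max(y, 2.0 * beta * x - b)) for x, y in zip(xs, ys) if beta * x < b]
        new_beta = ols_slope([x for x, _ in kept], [y for _, y in kept])
        if abs(new_beta - beta) < 1e-12:
            break
        beta = new_beta
    return beta

# Hypothetical data: slope 1, symmetric heteroskedastic errors, top-coding at 1.5.
rng = random.Random(2)
b_top = 1.5
xs = [rng.uniform(0.5, 2.0) for _ in range(4000)]
ys = [min(x + (0.2 + 0.2 * x) * rng.gauss(0.0, 1.0), b_top) for x in xs]

beta_ols, beta_clad, beta_scls = ols_slope(xs, ys), clad(xs, ys, b_top), scls(xs, ys, b_top)
print("OLS:", round(beta_ols, 3), "CLAD:", round(beta_clad, 3), "SCLS:", round(beta_scls, 3))
```

With several regressors and an intercept, the regression steps would call a general LAD or least squares routine instead of the one-dimensional shortcuts used here; the alternation itself is unchanged.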

Finally, the identically censored least absolute deviations (ICLAD) and identically censored least squares (ICLS) estimation methods were proposed by Honoré and Powell (1994). The motivation for these estimators is similar to the “symmetric trimming” idea used to derive the SCLS estimator, but involves recensoring the dependent variable for pairs of observations so that their density functions have the same shape. Suppose that the dependent variables for two observations, $y_1$ and $y_2$, are censored from above at $b$, as depicted in Figure 2. While the shape of the densities for these two observations would be the same in the absence of censoring, the censored densities have different shapes. Also, the distances from the regression functions, $x_1'\beta$ and $x_2'\beta$, to the censoring point $b$ are different.

However, the second observation $y_2$, which has the smaller regression function $x_2'\beta$, can be artificially “recensored” at the point $x_2'\beta - x_1'\beta + b \equiv \Delta x'\beta + b$. The resulting “identically censored” density for $y_2$ will have the same shape as the density for $y_1$. Further, the difference between the two identically censored variables will be symmetrically distributed around the difference in their regression functions, $\Delta x'\beta$. As a result, the regression coefficients can be estimated by finding the value of $\beta$ that minimizes the sum of absolute (ICLAD) or squared (ICLS) differences of the “identically censored” residuals across all distinct pairs of observations. As with the estimators discussed above, the ICLS and ICLAD estimators can be calculated by repeated application of linear least squares or least absolute deviations regression programs.[3]

Figure 1. Density of $y$ and “Symmetrically Censored” Density

Figure 2. Densities of $y_1$ and $y_2$ and “Identically Censored” Densities

[2] Quantile regression, as discussed by Koenker and Hallock in this issue, is based on a weighted version of the least absolute deviations approach (Koenker and Bassett, 1978). Indeed, it is straightforward to extend the CLAD approach to analyze the case of censored quantile regression (CRQ) estimation, as proposed by Powell (1986a).

[3] In the empirical application, we focus on the least absolute deviations version of the estimator (ICLAD) rather than the least squares version (ICLS), since simulation evidence suggests that it performs better in small samples (Honoré and Powell, 1994).

Honoré (1992) originally proposed the concept behind the ICLAD and ICLS estimators for censored panel data with individual-specific intercepts (also known as “fixed effects”). Instead of identically censoring observations for pairs of individuals, the approach can be applied to pairs of observations across time periods for each individual. Differencing the identically censored observations for a fixed individual eliminates the fixed effect, just as time differences eliminate fixed effects in the standard linear panel data model. In fact, this approach yields consistent estimates even when correctly specified maximum likelihood would not.
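The pairwise recensoring idea can be checked numerically. The sketch below does not compute the ICLAD estimator; it only illustrates the property the estimator exploits, namely that at the true coefficient the two “identically censored” residuals of a pair are exchangeable, so their difference is symmetric about zero even when the errors themselves are skewed. The design (one regressor, slope 1, shifted exponential errors, top-coding at 1.5) and all function names are assumptions for the example.

```python
import random

def identically_censored_residuals(y1, x1, y2, x2, beta, b):
    """Residuals of a pair after censoring both at the same point.

    With top-coding at b, the residual y - x*beta is censored at b - x*beta.
    Censoring both residuals at the smaller of the two points gives them a
    common censoring point c.
    """
    c = b - max(beta * x1, beta * x2)
    return min(y1 - beta * x1, c), min(y2 - beta * x2, c)

def share_positive(beta, pairs, b):
    """Share of nonzero residual differences e1 - e2 that are positive."""
    pos = neg = 0
    for (x1, y1), (x2, y2) in pairs:
        e1, e2 = identically_censored_residuals(y1, x1, y2, x2, beta, b)
        if e1 > e2:
            pos += 1
        elif e1 < e2:
            neg += 1
    return pos / (pos + neg)

def draw(rng, beta, b):
    """One observation with a skewed, identically distributed error."""
    x = rng.uniform(0.0, 2.0)
    eps = rng.expovariate(2.0) - 0.5
    return x, min(beta * x + eps, b)

rng = random.Random(3)
b_top, beta_true = 1.5, 1.0
pairs = []
for _ in range(20000):
    first, second = draw(rng, beta_true, b_top), draw(rng, beta_true, b_top)
    if first[0] < second[0]:
        first, second = second, first   # order each pair so that x1 >= x2
    pairs.append((first, second))

share_true = share_positive(beta_true, pairs, b_top)
share_wrong = share_positive(0.0, pairs, b_top)
print("share of positive differences at the true beta:", round(share_true, 3))
print("share of positive differences at beta = 0:", round(share_wrong, 3))
```

At the true coefficient the share of positive differences should be close to one half; at a wrong coefficient the differences remain systematically related to the ordering of the regressors, which is the information the pairwise estimators use.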

Each of these estimation procedures imposes a particular assumption on the underlying error distribution. The SCLS estimator is based on the assumption that the error terms are symmetrically distributed around zero, which implies that their median (and mean) is zero. While compatible with the traditional assumption of normally distributed and homoskedastic errors, the symmetry assumption is less restrictive and provides consistent estimates when these parametric conditions fail to hold. However, it is stronger than the “zero median” restriction exploited by the CLAD estimator, which permits nonnormal, heteroskedastic and asymmetric errors. The ICLAD and ICLS estimators assume that the error terms are identically (but not necessarily symmetrically) distributed, ruling out heteroskedasticity, but permitting asymmetry of the error distribution.[4]

As with the choice of regressors, it is ultimately up to the empirical researcher to determine which assumption is most plausible for the particular application. In practice, though, computation of several parametric and semiparametric estimators provides a useful guide to the sensitivity of the results to the identifying assumptions, either through casual comparison of the coefficient estimates and standard errors or more formal specification tests of the kind described by Newey (1987). In other words, the different estimation approaches give the researcher additional ways to “cut the data” to see which results are robust to alternative specifications.

The interval censored regression model is a special case of the more general censored selection model, in which the dependent variable $y$ is generated as

$$
y = d \times (x'\beta + \varepsilon),
$$

where $d$ is an observable binary (“dummy”) variable indicating whether the true dependent variable $y^* = x'\beta + \varepsilon$ is observed ($d = 1$) or “censored” ($d = 0$). For the special case of interval censoring considered above, $d$ is an indicator for whether the true variable $y^*$ is in the uncensored region $[a, b]$. More generally, though, $d$ will depend on regressors and error terms that are related to, but distinct from, those in the equation for $y^*$. For example, market wages may only be observed for individuals with positive labor supply, which is a different form of censoring than top coding in wages.

In such selection models, parametric estimation methods for the coefficients $\beta$ are typically based on maximum likelihood (Gronau, 1973) or the “two-step” strategy proposed by Heckman (1976, 1979). When the error distribution is not parametrically specified, however, semiparametric estimation of the regression coefficients generally involves explicit nonparametric estimation of density or regression functions, unlike the simpler methods for interval censored data described above.[5] Also, semiparametric identification of the censored selection model generally requires an “exclusion restriction”: a regressor that is included in the set of regressors for the binary variable $d$ must be excluded from the list of regressors $x$ in the equation of interest. This exclusion restriction (or instrumental variable), which is not required for the interval-censoring model, may not be plausible in many empirical applications. Due to these difficulties, empirical applications of semiparametric selection models are even less common than applications of the semiparametric censored regression models described above.[6]

[4] Much of the theoretical literature on semiparametric estimation of censored regression models has focused on this assumption. However, some simulation evidence suggests that heteroskedasticity causes greater bias in standard maximum likelihood estimation than nonnormality (Powell, 1986b).

An Empirical Application: Relative Earnings of Black Men in the South During the 1960s

Title VII of the Civil Rights Act of 1964, which went into effect on July 2, 1965, outlawed discrimination against black and female workers and established the Equal Employment Opportunity Commission to monitor compliance with Title VII and to enforce its statutes. Executive Order 11246, signed by President Johnson on September 24, 1965, prohibited discrimination by federal contractors and created its enforcement arm, the Office of Federal Contract Compliance, to monitor contractors. Many states had also adopted their own fair employment practice laws before 1964 forbidding discrimination among employers located within the state. These laws were similar to Title VII and established state-level commissions to hear individual discrimination claims. However, none of the 21 states with enforceable state laws before the passage of the 1964 Civil Rights Act were in the South.

We use longitudinal data on earnings to estimate the impact of these civil rights policies. In a joint project of the Census Bureau and the Social Security Administration (SSA), respondents to the 1973 and 1978 March Current Population Surveys were matched by their Social Security numbers to their Social Security earnings histories. The resulting files contain survey responses on race, gender, education, age and region of residence as of the survey year for persons in the March surveys linked to any earnings for which they paid Social Security taxes.

We examine the pooled data containing earnings information from 1958 to 1974, with a particular focus on the years 1963, 1964, 1970 and 1971. We use a sample of black and white men living in southern states who were born in the period 1910–1939 (the youngest man in the sample was 24 in 1963, while the oldest was 61 in 1971). We focus on men in the South because none of the states in the South had fair employment practice laws before 1965, and Title VII enforcement activity was primarily directed at racial discrimination in the South. A within-cohort analysis is used to control for the changing composition of workers over time. The final sample consists of 10,105 men, and our analysis uses all men with nonzero earnings in a given year.[7]

[5] An exception is given by Honoré, Kyriazidou and Udry (1997).

[6] One example of an early implementation is provided by Newey, Powell and Walker (1990), which applied semiparametric estimation methods to the Mroz (1987) data on the labor supply of married women.

A significant shortcoming of the earnings data is that many records are censored at the Social Security maximum taxable earnings level. In addition, the real value of the tax ceiling changed substantially during the time period of interest, rising from a little over $15,000 (in 1982–1984 dollars) in 1963–1964 to about $20,000 in 1970–1971. At least 32 percent of the sample is classified as earning the top-coded amount from 1958 to 1974, and this share fluctuates considerably during the key periods, reaching a peak of 54 percent in 1965. Consequently, any estimates of the impact of Title VII on the black-white earnings gap that do not explicitly account for censoring at the tax ceiling and changes in it could be severely biased.

We use several approaches to estimate the interval censoring model. The dependent variable in each case is the natural logarithm of annual taxable earnings, and the explanatory variables are race, level of education, age and age-squared. Table 1 presents the estimation results for the race and education coefficients based on the various estimators. The first column, headed OLS1, contains the ordinary least squares estimates based on all of the data. The second column, headed OLS2, presents the least squares results using only the observations that are not censored. The third column contains the Tobit maximum likelihood estimates under the assumption that the errors are normally distributed and homoskedastic. The remaining columns present the results for the three semiparametric estimators: CLAD, SCLS and ICLAD. The Tobit, CLAD and SCLS estimators were implemented using the Stata software package, while the ICLAD estimator was calculated using the Gauss package. For each estimator, we have created Stata “ado” files that are available at http://elsa.berkeley.edu/~kenchay.[8]

It is clear from Table 1 that the least squares and maximum likelihood estimates of the black-white log-earnings gap and the returns to education are extremely biased when compared to the semiparametric estimators. We think of the CLAD estimator as the natural benchmark, since it is consistent under the normality of errors assumption justifying the maximum likelihood estimator, under the independence of errors assumption justifying the ICLAD estimator, and under the conditional symmetry of errors assumption justifying the SCLS estimator. When compared to the CLAD benchmark, the least squares estimator based on all of the data (OLS1) actually does better than the maximum likelihood estimator. For this application, it appears that misspecifying the errors as being normally distributed and using maximum likelihood estimation results in more biased estimates than ignoring the censoring problem entirely and using least squares estimation. A more formal test of the normality assumption also suggests that it is violated for the log-earnings model.[9]

Table 1
Estimated Effects of Race and Education on Log-Earnings
(estimated standard errors in parentheses)

                        OLS1             OLS2             MLE              CLAD             SCLS             ICLAD
Black-White Gap
  1963                  -0.355 (0.033)   -0.183 (0.038)   -0.629 (0.044)   -0.416 (0.027)   -0.444 (0.031)   -0.474 (0.032)
  1964                  -0.349 (0.032)   -0.154 (0.038)   -0.674 (0.044)   -0.428 (0.033)   -0.444 (0.036)   -0.473 (0.031)
  1970                  -0.262 (0.032)   -0.115 (0.037)   -0.508 (0.044)   -0.278 (0.020)   -0.302 (0.031)   -0.338 (0.029)
  1971                  -0.242 (0.031)   -0.111 (0.038)   -0.486 (0.044)   -0.244 (0.022)   -0.287 (0.032)   -0.312 (0.031)
Returns to Education
  1963                   0.041 (0.003)    0.012 (0.004)    0.102 (0.004)    0.051 (0.004)    0.068 (0.007)    0.073 (0.003)
  1964                   0.040 (0.003)    0.013 (0.005)    0.103 (0.004)    0.064 (0.006)    0.079 (0.007)    0.075 (0.003)
  1970                   0.037 (0.003)    0.003 (0.005)    0.101 (0.004)    0.055 (0.003)    0.066 (0.006)    0.071 (0.003)
  1971                   0.035 (0.002)    0.002 (0.004)    0.100 (0.004)    0.054 (0.003)    0.065 (0.005)    0.070 (0.003)

Notes: The dependent variable is the natural logarithm of annual taxable earnings. Regressions also include a constant and age and age-squared as explanatory variables. Observations with nonpositive earnings are dropped from the analysis. The sample sizes for 1963, 1964, 1970 and 1971 are 8525, 8529, 8391 and 8275, respectively. The OLS2 specification also drops top-coded observations, leading to sample sizes of 4632, 4267, 4485 and 4163. MLE is Tobit maximum likelihood; CLAD is censored least absolute deviations; SCLS is symmetrically censored least squares; ICLAD is identically censored least absolute deviations.

[7] Over 84 percent of the men in the sample have positive earnings in 1963–1964. This figure is 83 percent in 1970–1971.

[8] The standard errors for OLS, MLE and ICLAD were calculated using standard approximations. The standard errors for CLAD and SCLS were calculated using the bootstrap techniques discussed by Brownstone and Valletta in this issue. It is much more efficient to calculate the ICLAD estimator using Gauss instead of Stata.

[9] Chay and Honoré (1998) calculate the test statistics for nonnormality and for heteroskedasticity in censored regression models discussed by Chesher and Irish (1987). The test statistic for detecting nonnormality ranges from 900.47 to 1200.68. Under the null, this statistic has an (asymptotic) $\chi^2(2)$ distribution with a 1 percent critical value of 9.21. Therefore, we easily reject the hypothesis that the errors are normally distributed. The test for heteroskedasticity yields statistics between 84.71 and 90.85. Under the null, these have (asymptotic) $\chi^2(12)$ distributions with a 1 percent critical value of 26.22. We therefore also reject the null of no heteroskedasticity.

There are sizeable differences in the estimated effects of education on earnings across the three semiparametric estimators. While the ICLAD and SCLS estimators of the education premium are similar, they are always greater than the CLAD estimator. These differences are significant given the precision of the estimates and range from 17 percent to 43 percent for the ICLAD estimator and 20 percent to 33 percent for the SCLS estimator. The differences in the estimates of the black-white earnings gap are smaller, with the CLAD estimator about 11 percent to 28 percent and 4 percent to 18 percent smaller in magnitude than the ICLAD and SCLS estimators, respectively. Strikingly, the semiparametric approaches all result in more precise estimates of the race coefficient than maximum likelihood estimation. For example, the standard errors of the CLAD estimator are 25 percent to 55 percent smaller than the standard errors of the Tobit estimator.[10]

The differences in the coefficient estimates across the various estimators can be used as a sort of specification check, similar in spirit to the Newey (1987) specification analysis mentioned earlier. For the education coefficient, the large differences between the maximum likelihood and semiparametric estimates suggest that nonnormal errors are an important source of bias in the Tobit estimator. Further, the significant differences among the semiparametric estimates imply that heteroskedasticity and asymmetry of the errors are also sources of misspecification in the maximum likelihood estimator of the education premium. Conversely, for the black-white earnings gap, the smaller differences among the semiparametric estimates suggest that nonnormality is the biggest source of bias in the Tobit estimator, with heteroskedasticity and asymmetry playing smaller roles.
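The percentage ranges quoted above can be recomputed from the Table 1 entries. In the sketch below the numbers are copied from Table 1; the choice to express each difference relative to the CLAD estimate (and each standard error reduction relative to the Tobit standard error) is an assumption inferred from the arithmetic, since it is the convention that reproduces the quoted ranges. The helper names are hypothetical.

```python
# Table 1 entries, ordered 1963, 1964, 1970, 1971.
education = {
    "CLAD": [0.051, 0.064, 0.055, 0.054],
    "SCLS": [0.068, 0.079, 0.066, 0.065],
    "ICLAD": [0.073, 0.075, 0.071, 0.070],
}
gap = {
    "CLAD": [-0.416, -0.428, -0.278, -0.244],
    "SCLS": [-0.444, -0.444, -0.302, -0.287],
    "ICLAD": [-0.474, -0.473, -0.338, -0.312],
}
se_gap = {
    "MLE": [0.044, 0.044, 0.044, 0.044],
    "CLAD": [0.027, 0.033, 0.020, 0.022],
}

def pct_excess(other, base):
    """Percent by which |other| exceeds |base|, year by year."""
    return [100.0 * (abs(o) - abs(b)) / abs(b) for o, b in zip(other, base)]

def pct_reduction(smaller, larger):
    """Percent by which `smaller` falls short of `larger`, year by year."""
    return [100.0 * (1.0 - s / l) for s, l in zip(smaller, larger)]

def rounded_range(values):
    return round(min(values)), round(max(values))

educ_iclad = rounded_range(pct_excess(education["ICLAD"], education["CLAD"]))
educ_scls = rounded_range(pct_excess(education["SCLS"], education["CLAD"]))
gap_iclad = rounded_range(pct_excess(gap["ICLAD"], gap["CLAD"]))
gap_scls = rounded_range(pct_excess(gap["SCLS"], gap["CLAD"]))
se_clad = rounded_range(pct_reduction(se_gap["CLAD"], se_gap["MLE"]))

print("education, ICLAD vs CLAD (%):", educ_iclad)
print("education, SCLS vs CLAD (%):", educ_scls)
print("gap, ICLAD vs CLAD (%):", gap_iclad)
print("gap, SCLS vs CLAD (%):", gap_scls)
print("race-coefficient s.e., CLAD vs Tobit (%):", se_clad)
```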

To examine the question of specification in more detail, we estimated the distribution of the error terms derived from the CLAD estimates, using the Kaplan and Meier (1958) estimator. The resulting estimated error distribution for log-earnings has fatter tails than does a normal distribution. The maximum likelihood estimator is sensitive to values in the tails, while the least absolute deviations estimator, which focuses on the median value, is unaffected by extreme observations. Since black men are more likely to be in the left-hand tail of the distribution, fat tails can explain the consistently larger (in magnitude) maximum likelihood estimates of the race coefficient. They can also explain the larger sampling errors of the Tobit estimator relative to the semiparametric estimators. Thus, abnormally long tails in the log-earnings distribution may be the major source of misspecification in the maximum likelihood estimates of the black-white earnings gap (Chay, 1995; Chay and Honoré, 1998).
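As a reminder of what the Kaplan and Meier (1958) product-limit estimator computes, here is a minimal sketch for right-censored values. In the setting above the values would be residuals $y - x'\hat\beta$ from the CLAD fit, flagged as censored when $y$ is top-coded; how the authors organized that calculation is not described in the text, so the function name, its interface and the toy data are assumptions for illustration.

```python
def kaplan_meier(values, observed):
    """Product-limit estimate of S(t) = P(T > t).

    values   : recorded values (true value, or the censoring point if censored)
    observed : True where the value is uncensored, False where right-censored
    Returns a list of (t, S(t)) at each distinct uncensored value.
    """
    data = sorted(zip(values, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        ties = 0
        # Group all records that share the same value t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            ties += 1
            i += 1
        if events:
            survival *= 1.0 - events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= ties   # censored records leave the risk set without an event
    return curve

# Toy data: the second record is right-censored at 2.
toy_curve = kaplan_meier([1.0, 2.0, 3.0, 4.0], [True, False, True, True])
print(toy_curve)
```

One minus the estimated survival function gives the estimated distribution function of the errors, whose tails can then be compared with those of a fitted normal distribution.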

[10] The standard errors for the CLAD and SCLS estimators were calculated using 500 bootstrap replications. Applying the bootstrap to the Tobit maximum likelihood estimator results in standard errors that are nearly identical to those presented in Table 1, which are based on the asymptotic approximation. Thus, the bootstrap technique is not the source of the differences in the estimated standard errors.

Figure 3. Top-Code Rate and Estimates of Black-White Log-Earnings Gap, 1958–1974

Based on the series of cross-sectional estimators for the four years shown in Table 1, the maximum likelihood and semiparametric approaches yield very similar estimates of changes in black-white relative earnings during the late 1960s. The maximum likelihood and semiparametric estimates all imply that the black-white earnings gap narrowed about 0.15 log points from 1963–1964 to 1970–1971. We conclude from this that while there is bias in the Tobit estimator of the race coefficient, this bias is fixed over time. Thus, it is “differenced out” when one examines changes in the estimated race coefficient. However, the two ordinary least squares estimators imply that relative earnings only converged between 0.06 (OLS2) and 0.10 (OLS1) log points during the period of interest. Not accounting for the severe censoring in the earnings data results in downwardly biased estimates of the impact of Title VII. Also, although the CLAD estimator imposes the weakest stochastic restrictions on the error terms, it results in the most precise estimates of the policy effects.
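The convergence figures quoted above follow from the race coefficients in Table 1. The sketch below averages each estimator's coefficients over 1963–1964 and over 1970–1971 and takes the difference; averaging the two years in each period is an assumption made here for illustration, since the text does not say exactly how the period figures were formed.

```python
# Black-white gap estimates from Table 1, ordered 1963, 1964, 1970, 1971.
gap = {
    "OLS1": [-0.355, -0.349, -0.262, -0.242],
    "OLS2": [-0.183, -0.154, -0.115, -0.111],
    "MLE": [-0.629, -0.674, -0.508, -0.486],
    "CLAD": [-0.416, -0.428, -0.278, -0.244],
    "SCLS": [-0.444, -0.444, -0.302, -0.287],
    "ICLAD": [-0.474, -0.473, -0.338, -0.312],
}

def convergence(estimates):
    """Change in the gap from the 1963-64 average to the 1970-71 average."""
    before = (estimates[0] + estimates[1]) / 2.0
    after = (estimates[2] + estimates[3]) / 2.0
    return after - before

narrowing = {name: convergence(values) for name, values in gap.items()}
for name, change in narrowing.items():
    print(f"{name}: gap narrowed by {change:.3f} log points")
```

Under this convention the two least squares columns give roughly 0.06 and 0.10 log points, while the maximum likelihood and semiparametric columns all come out near 0.15 to 0.16.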

Figures 3A and 3B provide a more detailed picture of the various estimators. The top panel shows the percentage of workers in the sample recorded as earning at the taxable maximum (the top-code rate) from 1958 to 1974, separately by race. The bottom panel plots the estimated black-white log-earnings gaps from the OLS1, OLS2, MLE and CLAD estimators for the series of cross-sections from 1958–1974. There is a striking correspondence between changes in the top-code rate in the top panel and changes in the ordinary least squares estimates in the bottom panel. Indeed, it seems that most of the changes in the ordinary least squares estimates over time are being driven by changes in the amount of censoring varying by race. This results in a severe understatement of the racial earnings convergence in the late 1960s. The time series of the CLAD and maximum likelihood estimates, on the other hand, have no association with the top-code rates, although, as noted earlier, the maximum likelihood estimates systematically overstate the size of the black-white earnings gap when compared to the CLAD estimates. The maximum likelihood and CLAD estimates imply substantial black economic progress in the South after 1964, a result that is masked by the ordinary least squares estimates.

Table 2
Fixed-Effects ICLAD Estimates of Effects of Race and Education
(estimated standard errors in parentheses)

Change in               1963–64          1963–70          1963–71          1964–70          1964–71          1970–71
Black-White Gap          0.011 (0.007)    0.102 (0.017)    0.136 (0.021)    0.095 (0.020)    0.108 (0.019)    0.015 (0.007)
Returns to Education     0.002 (0.001)    0.001 (0.002)    0.000 (0.003)    0.000 (0.003)   -0.003 (0.002)    0.000 (0.001)

Notes: See notes to Table 1. The sample is the 7,435 men with positive earnings in all four years. For each pair of years, the absolute error loss function was used to estimate the identically censored panel data model with fixed effects. The estimates represent the change in the coefficients between the two years.

Finally, Table 2 presents the fixed-effects estimation results based on the ICLAD estimator for each of the six possible pairs of time periods in our panel data. The table entries give the estimated change in the race and education coefficients between the two specified periods. The analysis includes age and age-squared as explanatory variables. To compare the parameter estimates across the six columns, the sample is restricted to the 7,435 men with positive earnings in all four years. This reduces the sample size by about 10 percent to 13 percent relative to the cross-sectional samples.

The black-white earnings gap narrowed substantially during the period of interest, even after accounting for individual-specific fixed effects. The relative earnings of black men increased about 0.12–0.14 log-points from 1963 to 1971. These estimates are similar to those implied by the series of maximum likelihood and semiparametric estimates of the cross-sectional censored regression model in Table 1. A key assumption underlying the fixed-effects ICLAD estimator is that the distribution of the unobservables is the same in all time periods for a given individual. A formal specification test did not reject the restrictions implied by this assumption at conventional levels of significance (Chay and Honoré, 1998).


Conclusion

When data are censored, ordinary least squares regression can provide misleading estimates. The results from the semiparametric models show that there was significant earnings convergence among black and white men in the American South after the passage of the 1964 Civil Rights Act, a result that was masked by least squares analysis. The semiparametric methods can also provide information on the sources of misspecification in parametric estimation approaches. In the log-earnings model, it appears that abnormally long tails are the major source of bias in the Tobit maximum likelihood estimates.

We thank Bo Honoré, David Lee, and especially Alan Krueger, Timothy Taylor and Michael Waldman for their helpful comments. Marcelo Moreira provided outstanding research assistance. Support from the Alfred P. Sloan Foundation and the Center for Advanced Study in the Behavioral Sciences is gratefully acknowledged.

References

Amemiya, Takeshi. 1985. Advanced Econometrics. Cambridge: Harvard University Press.

Arabmazar, Abbas and Peter Schmidt. 1981. “Further Evidence on the Robustness of the Tobit Estimator to Heteroskedasticity.” Journal of Econometrics. November, 17:2, pp. 253–58.

Arabmazar, Abbas and Peter Schmidt. 1982. “An Investigation of the Robustness of the Tobit Estimator to Non-Normality.” Econometrica. July, 50:4, pp. 1055–63.

Buchinsky, Moshe. 1994. “Changes in the U.S. Wage Structure 1963–1987: Application of Quantile Regression.” Econometrica. March, 62:2, pp. 405–58.

Chay, Kenneth Y. 1995. “Evaluating the Impact of the 1964 Civil Rights Act on the Economic Status of Black Men using Censored Longitudinal Earnings Data.” Unpublished manuscript, Department of Economics, Princeton University.

Chay, Kenneth Y. and Bo E. Honoré. 1998. “Estimation of Semiparametric Censored Regression Models: An Application to Changes in Black-White Earnings Inequality During the 1960s.” Journal of Human Resources. Winter, 33:1, pp. 4–38.

Chesher, Andrew and Margaret Irish. 1987. “Residual Analysis in the Grouped and Censored Normal Linear Model.” Journal of Econometrics. January/February, 34:1-2, pp. 33–61.

Goldberger, Arthur S. 1983. “Abnormal Selection Bias,” in Studies in Econometrics, Time Series, and Multivariate Statistics. S. Karlin et al., eds. New York: Academic Press, pp. 67–84.

Gronau, Reuben. 1973. “The Effect of Children on the Housewife’s Value of Time.” Journal of Political Economy. March/April, 81:2, pp. S168–S199.

Heckman, James J. 1976. “The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models.” Annals of Economics and Social Measurement. Fall, 5:4, pp. 475–92.

Heckman, James J. 1979. “Sample Selection Bias as a Specification Error.” Econometrica. January, 47:1, pp. 153–61.

Honoré, Bo E. 1992. “Trimmed LAD and Least Squares Estimation of Truncated and Censored Regression Models with Fixed Effects.” Econometrica. May, 60:3, pp. 533–65.

Honoré, Bo E. and James L. Powell. 1994. “Pairwise Difference Estimators for Censored and Truncated Regression Models.” Journal of Econometrics. September/October, 64:1-2, pp. 241–78.

Honoré, Bo E., Ekaterini Kyriazidou and Christopher Udry. 1997. “Estimation of Type 3 Tobit Models Using Symmetric Trimming and Pairwise Comparisons.” Journal of Econometrics. January/February, 76:1-2, pp. 107–28.

Kaplan, E. L. and P. Meier. 1958. “Nonparametric Estimation from Incomplete Observations.” Journal of the American Statistical Association. 53, pp. 457–81.

Koenker, Roger and Gilbert S. Bassett, Jr. 1978. “Regression Quantiles.” Econometrica. January, 46:1, pp. 33–50.

Mroz, Thomas A. 1987. “The Sensitivity of an Empirical Model of Married Women’s Hours of Work to Economic and Statistical Assumptions.” Econometrica. July, 55:4, pp. 765–99.

Newey, Whitney K. 1987. “Specification Tests for Distributional Assumptions in the Tobit Model.” Journal of Econometrics. January/February, 34:1-2, pp. 125–45.

Newey, Whitney K., James L. Powell and James M. Walker. 1990. “Semiparametric Estimation of Selection Models: Some Empirical Results.” American Economic Review. May, 80:2, pp. 324–28.

Powell, James L. 1984. “Least Absolute Deviations Estimation for the Censored Regression Model.” Journal of Econometrics. July, 25:3, pp. 303–25.

Powell, James L. 1986a. “Censored Regression Quantiles.” Journal of Econometrics. June, 32:1, pp. 143–55.

Powell, James L. 1986b. “Symmetrically Trimmed Least Squares Estimation for Tobit Models.” Econometrica. November, 54:6, pp. 1435–60.

Powell, James L. 1994. “Estimation of Semiparametric Models,” in Handbook of Econometrics, Volume IV. Robert F. Engle and Daniel L. McFadden, eds. Amsterdam: North Holland, pp. 2443–521.

Tobin, James. 1958. “Estimation of Relationships for Limited Dependent Variables.” Econometrica. January, 26, pp. 24–36.
