
Value-Driven Procurement in the TAC Supply Chain Game

CHRISTOPHER KIEKINTVELD, MICHAEL P. WELLMAN, SATINDER SINGH, and VISHAL SONI
University of Michigan

The TAC supply-chain game presents automated trading agents with challenging decision problems, including procurement of supplies across multiple periods using multiattribute negotiations. The procurement process involves substantial uncertainty and competition among multiple agents. Our agent, DeepMaize, generates requests for components based on deviations from a reference inventory trajectory defined by estimated market conditions. It then selects among supplier offers by optimizing a value function over potential inventory profiles. This approach offered strategic flexibility and achieved competitive performance in the TAC-03 tournament.

Categories and Subject Descriptors: I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence – intelligent agents and multiagent systems; J.4 [Computer Applications]: Social and Behavioral Sciences – economics

General Terms: Algorithms, Economics

Additional Key Words and Phrases: Trading Agents, E-Commerce, Supply Chains

1. INTRODUCTION

The Trading Agent Competition Supply Chain Management scenario (TAC/SCM) was designed to pose a complex multi-tiered, multi-period problem for automated trading agents in a plausible supply chain game [Sadeh et al. 2003]. The TAC/SCM environment is challenging for many reasons. One is that agents are faced with substantial uncertainty: about the local state of other agents in the game, as well as the underlying demand and supply processes. The environment is also strategic, comprising six profit-maximizing producer agents. Agents must negotiate multiattribute deals with suppliers and customers, so they must be able to reason about the relative values of those attributes. Finally, the SCM game forces agents to make decisions over multiple stages, and on different time scales. Agents must make decisions (e.g., component procurement) before all relevant uncertainty is resolved (e.g., customer demand, future component prices).

We designed the University of Michigan's agent, DeepMaize, to participate in the 2003 TAC/SCM tournament. DeepMaize employs distributed feedback control to coordinate its various modules and operate robustly despite dynamic uncertainty. Its overall feedback-control approach and the specific methods by which DeepMaize sets its reference inventory trajectory are defined elsewhere [Kiekintveld et al. 2004]. Here we focus on how the agent manages its procurement actions given the reference trajectory, by optimizing a value function over potential inventory profiles.

Author's address: Computer Science and Engineering Division, University of Michigan, Ann Arbor, MI 48109.
Email: {ckiekint,wellman,baveja,soniv}@umich.edu
Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. © 2004 ACM 0000-0000/2004/0000-0009 $5.00
ACM SIGecom Exchanges, Vol. 4, No. 3, February 2004, Pages 9–18.

Value function representations have been used to make decisions in other manufacturing contexts. For example, Schneider et al. [1998] formulate a production scheduling problem as a Markov Decision Process (MDP) and use reinforcement learning methods to approximate the value function of this MDP. The decision problem we address does not fit the assumptions of the MDP model and must be made in the context of a larger agent making many distributed but interrelated decisions. Our approach uses a heuristic value function representation that incorporates values from several different sources in support of a single decision.

2. PROCUREMENT DECISIONS IN TAC/SCM

Each day SCM agents must make several decisions, two of which comprise the procurement policy: (1) what RFQs to issue to component suppliers and (2) of the offers received from suppliers, which to accept. There are eight suppliers, each producing two component types from four categories of components: CPU, motherboard, memory, and hard disk. The four CPU types are each sold by a single supplier; the two types of components in all other categories are each sold by two suppliers. Each supplier has a nominal capacity to produce 500 per day of each component type it supplies. Actual production varies about this capacity in a random walk. To acquire components, an agent sends RFQs to a supplier (up to ten per supplier per day, in priority order), each specifying a desired quantity and due date. The supplier responds the next day with offers specifying quantity, due date, and price, and reserves sufficient capacity to meet these commitments. If projected capacity is insufficient to meet the requested quantity and date, the supplier instead offers a partial quantity at the requested date and/or the full quantity at a later date. Suppliers assume nominal capacity when projecting future availability. To generate responses, suppliers execute the following until all RFQs are exhausted: (1) randomly choose an agent, (2) take the highest-priority RFQ remaining on its list, (3) generate an offer, if possible. Agents must accept or decline each offer the day they receive it.

Suppliers set prices based on an analysis of available capacity. The TAC/SCM component catalog [Arunachalam et al. 2003] associates every component $c$ with a base price, $b_c$. The correspondence between price and quantity for component supplies is defined by the suppliers' pricing formula. The price offered by a supplier at day $d$ for an order to be delivered on day $d+i$ is

$$p_c(d+i) = b_c - 0.5\, b_c\, \frac{\kappa_c(d+i)}{500\, i}, \tag{1}$$

where $\kappa_c(j)$ denotes the cumulative capacity for $c$ the supplier projects to have available from the current day through day $j$. The denominator, $500i$, represents the nominal capacity controlled by the supplier over $i$ days, not accounting for any capacity committed to existing orders.
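As a rough illustration of Eq. (1), the following sketch computes the offered price from a base price, the projected cumulative free capacity, and the lead time in days. The function and parameter names are illustrative assumptions, not part of the TAC/SCM specification or of DeepMaize; only the constant 500 and the factor 0.5 come from the formula itself.

```python
NOMINAL_DAILY_CAPACITY = 500  # nominal units per day per component type (from the text)


def offered_price(base_price: float, projected_capacity: float, lead_days: int) -> float:
    """Price per Eq. (1) for delivery `lead_days` days ahead.

    `projected_capacity` plays the role of kappa_c(d+i): the cumulative
    capacity the supplier projects to have free through the delivery day.
    """
    if lead_days <= 0:
        raise ValueError("lead_days must be positive")
    free_fraction = projected_capacity / (NOMINAL_DAILY_CAPACITY * lead_days)
    return base_price - 0.5 * base_price * free_fraction
```

With all nominal capacity still free the price falls to half the base price, and with no free capacity it equals the base price. For a hypothetical base price of 1000 and a 10-day lead time, `offered_price(1000, 5000, 10)` gives 500.0 and `offered_price(1000, 0, 10)` gives 1000.0.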


3. DEEPMAIZE REFERENCE INVENTORY TRAJECTORY

The reference inventory trajectory is the sum of three sources of component requirements: (1) outstanding customer orders, (2) expected future component utilization, and (3) baseline buffer levels. First, outstanding customer orders entail a known requirement for specific components in time to produce the orders.

Second, we derive the time series of expected future component utilization, based on projections of future customer demand for PCs and market equilibrium calculations. The projection of customer demand uses a dynamic Bayesian network model to estimate the underlying demand state and project these values forward using the specified system dynamics. Market equilibrium is derived for each day by calculating the prices (based on Eq. (1)) at which the supply would equal the projected customer demand. DeepMaize assumes that it will garner, on average, new orders covering 1/6 of the equilibrium quantity $Q_d$ for day $d$, evenly distributed across the possible PC types. To account for unpredictability in the demand trends, we set a somewhat more conservative reference, based on the demand quantity $Q'$ satisfying $\Pr(Q_d \ge Q') = 0.63$ (0.63 was chosen somewhat arbitrarily). Over the range where existing and prospective customer orders overlap, we phase in the expected utilization proportionately.

The final contributor to our inventory reference is a baseline buffer level, maintained to mitigate short-term noise in procurement and sales activity and allow the agent to act more opportunistically. For the tournament, DeepMaize set the baseline level at 6.0 times the current expected daily consumption (also somewhat arbitrary). This level is scaled gradually to zero at the end of the game, at which point inventories become worthless. The sum of these requirements represents the trajectory of gross inventory requirements. To determine the net reference trajectory, we subtract the current inventory of components (including those contained in finished PCs) plus anticipated deliveries of components already purchased from the gross requirements.
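One way to read this construction is as per-day bookkeeping. The sketch below assumes, hypothetically, that order needs, expected utilization, and deliveries are given as per-day quantities that accumulate over time, and that the buffer is a per-day target level. These representational choices and all names are assumptions made for illustration, not DeepMaize's actual data structures.

```python
from itertools import accumulate


def net_reference_trajectory(order_needs, expected_use, buffer_level,
                             inventory, deliveries):
    """Net requirement per day: gross requirements minus what is on hand or due.

    order_needs, expected_use, deliveries: per-day quantities (assumed layout).
    buffer_level: target buffer for each day.
    inventory: components currently held, including those inside finished PCs.
    """
    cum_orders = accumulate(order_needs)
    cum_use = accumulate(expected_use)
    cum_deliveries = accumulate(deliveries)
    return [
        orders + use + buffer - (inventory + delivered)
        for orders, use, buffer, delivered
        in zip(cum_orders, cum_use, buffer_level, cum_deliveries)
    ]
```

A positive entry marks a day on which the agent would still need to procure components to reach the reference; a negative entry marks a surplus.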

4. DEEPMAIZE PROCUREMENT POLICY

Each day agents issue RFQs to suppliers and accept or reject supplier offers received in response to the previous day's RFQs. DeepMaize applies the same RFQ-generation and offer-acceptance policies to every day of the game, with one significant exception: the very beginning of the game (day 0). The supplier pricing formula provides a strong incentive to procure large quantities of components on this day, as prices are at their lowest and availability is as high as it will ever be.¹ As a result, in TAC-03 agents employed increasingly aggressive day-0 procurement policies, leading to a mutually destructive overcapacity of components for the aggregate system. We anticipated this effect and introduced our own preemptive day-0 strategy that neutralized this behavior somewhat and, in effect, reestablished a setting wherein agents had to procure supplies throughout the game. The details of our day-0 strategy and its effects are described in a separate account [Estelle et al. 2003]. For our present purpose, it suffices to note that we employed special RFQ generation and offer acceptance strategies at the very beginning of the game.

¹ Due to its distorting effects, this will be modified for the 2004 tournament.


4.1 RFQ Generation

DeepMaize generates RFQs to reduce the difference between the current and reference inventory trajectories. Three elements of the reference trajectory are considered in turn: outstanding customer orders, baseline buffer level, and expected future component utilization. This prioritization takes into account the immediacy of current orders and subsequent opportunities to procure components for future consumption. TAC/SCM limits agents to ten RFQs per day per supplier, and DeepMaize uses all these slots. It generates ten RFQs (split across two suppliers) for each non-CPU component type, and five for each CPU.

4.1.1 RFQs for outstanding customer orders. Figure 1 depicts the process of generating order-related RFQs. We compute the current inventory trajectory including components in assembled PCs as well as known future component arrivals. For each customer order that cannot be filled using current inventory, we generate an RFQ for the corresponding deficit quantity and due date. If more than 8 RFQs are generated, those with nearby due dates are merged to stay within this quota.

[Figure 1: panels (a) and (b), each plotting quantity against due date. Plot labels: Inventory Projection; Quantity Required (for outstanding customer orders); Previous Inventory Projection; New Inventory Projection; Deficit Quantities.]

Fig. 1. Generating order-related RFQs for a particular component. (a) Quantities required and (b) final component inventory projection. RFQs are created for the deficit quantities in (b).
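The deficit computation can be sketched as follows, under simplifying assumptions: orders for one component are processed in due-date order, inventory and known arrivals are consumed greedily, and each shortfall becomes one (due date, quantity) request. The function name and data layout are hypothetical, and the merging of requests beyond the quota of 8 is not shown.

```python
def order_deficit_rfqs(inventory, arrivals, orders):
    """Return (due_day, deficit_quantity) pairs for orders that cannot be filled.

    inventory: units on hand, including components in assembled PCs.
    arrivals:  dict mapping day -> quantity already purchased and due that day.
    orders:    list of (due_day, quantity) customer requirements.
    """
    rfqs = []
    available = inventory
    pending = sorted(arrivals.items())
    next_arrival = 0
    for due_day, quantity in sorted(orders):
        # Count every known arrival that lands on or before the due date.
        while next_arrival < len(pending) and pending[next_arrival][0] <= due_day:
            available += pending[next_arrival][1]
            next_arrival += 1
        if available >= quantity:
            available -= quantity
        else:
            rfqs.append((due_day, quantity - available))
            available = 0
    return rfqs
```

For instance, with 50 units on hand, 30 more arriving on day 3, and orders of 40, 60, and 20 units due on days 2, 4, and 6, the sketch requests 20 units for day 4 and 20 units for day 6.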

4.1.2 RFQs for baseline buffer level. The current inventory trajectory is modified by removing PCs and components already committed to outstanding customer orders. An RFQ is generated for the first day this trajectory drops below the baseline buffer level to make up the difference. If this quantity is large, the RFQ is split in two, with half the quantity sent to each supplier for the component type.

4.1.3 RFQs for expected component utilization. Any remaining RFQ slots are used to request components addressing the long-term expected component utilization. The same current inventory trajectory used for generating baseline RFQs is used again, and the expected component utilization curve is subtracted from this trajectory. A potential RFQ is generated for each day where this quantity is negative. A subset of these RFQs are selected to fill the available RFQ slots. The selection is probabilistically biased towards days when components are expected to be cheaper. To estimate available prices DeepMaize maintains an assessment of each supplier's available capacity profile, based on the offers seen and the supplier pricing function. Each offer yields information about the supplier's current capacity. For example, a partial completion offer means that a supplier has exactly the offered quantity available by the requested due date, and no more than the offered quantity available on any previous day. Calculating the implications of every offer yields an upper bound on capacity (and thus lower bound on price) for each day.

4.1.4 Probe RFQs. If RFQ slots are available after all reference inventory trajectory needs have been met, DeepMaize issues probe RFQs (each for a single component on a random date) to garner additional information about supplier state.

4.2 Offer Acceptance

The offer acceptance mechanism selects which supplier offers to accept based on the reference inventory trajectory. DeepMaize makes its acceptance decisions separately for each component type. For each offer received, it may have three choices: reject (R), accept complete (AC), or accept partial (AP). The third option is applicable only if the offer includes this option due to the supplier's inability to provide the full quantity by the requested date.

Given a set of $n$ offers, let $o = \langle o_1 \cdots o_n \rangle$, $o_i \in \{R, AC, AP\}$, denote a decision vector. The agent's optimization problem is to identify

$$\arg\max_{o \in \{R, AC, AP\}^n} \; V(s, o) - \sum_{i=1}^{n} C_i(o_i), \tag{2}$$

where $V(s, o)$ is the value of the inventory trajectory starting from current state $s$ plus the orders accepted according to $o$. The state comprises all information relevant to the reference inventory trajectory, including current inventory of components and PCs, anticipated component deliveries, and outstanding customer orders. $C_i(AC)$ (respectively, $C_i(AP)$) denotes the cost of accepting the complete (resp. partial) order $i$. For all $i$, $C_i(R) = 0$.
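For small n, Eq. (2) can be evaluated by brute force. The sketch below enumerates every decision vector and keeps the one with the highest net value. The value function is passed in as a callable standing in for V(s, ·) with the state held fixed; the toy value function, offer quantities, and costs in the usage lines are invented purely for illustration.

```python
from itertools import product

R, AC, AP = "R", "AC", "AP"


def best_decision(costs, value_fn):
    """Exhaustively maximize V(s, o) - sum_i C_i(o_i) over decision vectors o.

    costs[i] maps each acceptance choice available for offer i (AC, and AP
    when a partial option exists) to its cost; rejecting is always allowed
    and costs nothing.
    """
    choices = [[R] + [c for c in (AC, AP) if c in offer] for offer in costs]
    best, best_net = None, float("-inf")
    for o in product(*choices):
        net = value_fn(o) - sum(costs[i].get(oi, 0.0) for i, oi in enumerate(o))
        if net > best_net:
            best, best_net = o, net
    return best, best_net


# Hypothetical example: two offers, only the first has a partial option.
quantities = [{AC: 10, AP: 5}, {AC: 10}]
costs = [{AC: 600.0, AP: 250.0}, {AC: 900.0}]


def toy_value(o):
    # Assumed toy V: each of the first 10 units is worth 80, extras are worthless.
    units = sum(quantities[i].get(oi, 0) for i, oi in enumerate(o))
    return 80.0 * min(units, 10)


decision, net = best_decision(costs, toy_value)  # ("AC", "R") with net value 200.0
```

Enumeration grows as 3 to the power n, which is why it is only practical for a handful of offers.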

The three sources of component requirements contribute differentially to the value function. Components required to fill existing customer orders are valued at the entire price of the order plus the penalty charges avoided by meeting the order. The penalty amount specified in the customer RFQ is paid each day an order is late until it expires, so the penalty charges saved vary depending on when the order can be met. These values are quite high compared to the cost of an individual component, implying that (almost) any offer necessary to meet an existing customer order will be accepted, regardless of price. Including penalty charges allows the agent to reason about cases where an order can be met earlier by accepting one offer over another, reducing penalty payments. Components that address future expected consumption are valued at the expected equilibrium price for the projected day of consumption. This reflects an assumption that the market will be in equilibrium, and thus any components obtained for less than the equilibrium price will lead to profitable future production.


Components that fill baseline inventory are valued at the equilibrium price for the current day plus a baseline premium. The baseline premium is defined on a sliding scale, with higher premium values for the first components.² The premium values are intended to heuristically account for two factors: (1) the fact that components actually have value only when combined with other components and (2) the opportunity cost of having production constrained by available components. The baseline values are decayed by a constant multiplicative factor each day to represent a bias towards achieving the baseline inventory as soon as possible.

To value an inventory trajectory, we start by creating a sorted list of possible component values for each future day. Unit values calculated from the current reference trajectory according to the rules above are inserted into the list for the last day the need can be met. Each component is then valued in order of arrival. The algorithm looks forward from the day of arrival to find the maximum possible value that can be assigned to the component and removes this value from the corresponding list. If the value is from a later day than the component arrives, the value is replaced with the highest value from an earlier day. This is necessary to ensure that the maximal sum is always assigned. To see why this is so, consider a highly simplified example with single components arriving on days 0 and 2 and the following unit values for the next three days:

          Original Values   Without Replacement   With Replacement
Day 0:    20, 5             20, 5                 5
Day 1:    15, 10, 10        15, 10, 10            15, 10, 10
Day 2:    30, 15, 5         15, 5                 20, 15, 5

That is, having one unit on day 0 is worth 20, and the second is worth an additional 5. The center and right columns reflect the values after assigning (and removing) one value, without and with replacement, respectively. Without replacement, the value 30 is assigned to the first component and 15 to the second. These values are not maximal since the first component could be assigned the value 20 and the second assigned the value 30 with both components still arriving in time. By replacing the value 30 with the value 20 when it is removed we can represent the possibility that a later arriving component could fill this need, freeing the first component to fill an earlier high-valued need.
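The example can be reproduced with a small sketch of the valuation loop. It is a simplification under stated assumptions: propagation of unused values between days, penalties, and decay are omitted; the replacement value is drawn from the days between the component's arrival and the day whose value it took; and ties go to the earlier day. The function name and data layout are hypothetical.

```python
def value_trajectory(day_values, arrival_days, replace=True):
    """Sum the unit values assigned to components, valued in order of arrival.

    day_values[d] lists the unit values of needs whose last feasible day is d.
    arrival_days gives the arrival day of each single component.
    """
    lists = [sorted(values, reverse=True) for values in day_values]
    total = 0
    for arrival in sorted(arrival_days):
        # Look forward from the arrival day for the largest remaining value.
        candidates = [(lists[d][0], -d) for d in range(arrival, len(lists)) if lists[d]]
        if not candidates:
            continue
        value, neg_day = max(candidates)
        day = -neg_day
        lists[day].pop(0)
        total += value
        if replace and day > arrival:
            # Move the best earlier value into the later day's list, so that a
            # later-arriving component can still be credited with it.
            earlier = [(lists[d][0], -d) for d in range(arrival, day) if lists[d]]
            if earlier:
                moved, neg_source = max(earlier)
                lists[-neg_source].pop(0)
                lists[day].append(moved)
                lists[day].sort(reverse=True)
    return total


values = [[20, 5], [15, 10, 10], [30, 15, 5]]
without_replacement = value_trajectory(values, [0, 2], replace=False)  # 30 + 15 = 45
with_replacement = value_trajectory(values, [0, 2], replace=True)      # 30 + 20 = 50
```

With replacement the total matches the best feasible assignment in the example (20 for the first component and 30 for the second), even though the individual credits are recorded in a different order.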

When all components arriving on a given day have been valued, any remaining order-based or baseline values are propagated to the next day, accounting for penalties paid, order expirations, and decay of baseline values. Expected utilization values are never propagated. Once the entire inventory trajectory is processed, its total value is simply the sum of the values assigned to its components. The net value of the order-acceptance decision is this value minus the cost of accepted orders (Eq. 2).

² Values from 25–100% of the component base price were used during the tournament.

Table I. DeepMaize tournament procurement by RFQ type. Prices are normalized to a percentage of the base price.

                         Strategic   Orders   Baseline   Utilization   Probe
Percentage of RFQs       0.8%        1.7%     6.6%       27.7%         63.3%
Acceptance Rate          32.3%       10.3%    24.7%      53.0%         18.4%
Percentage of Quantity   51.2%       0.7%     14.0%      33.1%         9.4%
Average Price            57.6        84.8     68.5       65.3          61.4

Using this procedure to evaluate candidate choices, we set up a search problem for each component type. The search space is defined by the possible acceptance decisions. Since there are at most three possible decisions and $n \le 10$ original RFQs, there are up to $3^{10}$ inventory trajectories to evaluate. This is too many given the limited time available (15 seconds) for each day's decisions, so we perform a local optimization using hill-climbing search. To the extent that an offer is worthwhile or not independent of which others are accepted, hill-climbing should quickly find a near-optimal solution.
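A hill-climbing search of this kind can be sketched as below. The sketch starts from accepting every offer in full, moves to the best improving neighbor with one decision changed, and widens to two changed decisions only when no single change helps. Choosing the best neighbor rather than the first improving one, the absence of a time limit, and the toy net-value function are all assumptions made for illustration.

```python
from itertools import combinations, product


def neighbors(o, choices, k):
    """Yield every decision vector differing from o in exactly k positions."""
    for idxs in combinations(range(len(o)), k):
        alternatives = [[c for c in choices[i] if c != o[i]] for i in idxs]
        for combo in product(*alternatives):
            candidate = list(o)
            for i, c in zip(idxs, combo):
                candidate[i] = c
            yield tuple(candidate)


def hill_climb(choices, net_value):
    """Local search over acceptance decisions; returns (decision, net value)."""
    current = tuple("AC" for _ in choices)  # start by accepting everything
    current_val = net_value(current)
    while True:
        for k in (1, 2):  # single changes first, then pairs
            best = max(neighbors(current, choices, k), key=net_value, default=None)
            if best is not None and net_value(best) > current_val:
                current, current_val = best, net_value(best)
                break
        else:
            return current, current_val  # neither neighborhood improves


# Hypothetical offers: (quantity, cost) per available choice; rejecting is free.
offers = [{"AC": (10, 600.0), "AP": (5, 250.0)}, {"AC": (10, 900.0)}]
choices = [["R"] + list(offer) for offer in offers]


def toy_net_value(o):
    # Assumed toy objective: the first 10 units are worth 80 each, minus costs.
    picked = [offers[i][c] for i, c in enumerate(o) if c != "R"]
    units = sum(quantity for quantity, _ in picked)
    return 80.0 * min(units, 10) - sum(cost for _, cost in picked)


decision, net = hill_climb(choices, toy_net_value)  # ("AC", "R") with net value 200.0
```

In this toy case a single move (rejecting the expensive second offer) reaches the optimum, which reflects the situation where each offer's worth is largely independent of the others.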

The search starts from a node representing the state maximally accepting offers. At each node, the neighborhood of states with one decision changed is examined. If a higher-valued state is found, that state becomes the new current search node, and its neighborhood is expanded. When no higher-valued state is found in this neighborhood, the search is extended to a neighborhood including states with two decisions changed. Search terminates when the extended search fails to find a higher-valued state, or when time runs out for making a decision. To allow equal opportunity to find reasonable acceptance sets for all component types we run searches for all 10 types in parallel, evaluating one state each turn. Once all searches terminate, orders are sent for the highest-valued acceptance sets.

5. DISCUSSION

In this section we present some results from the TAC-03 tournament to show how DeepMaize's procurement strategy works in practice. Table I shows a breakdown of how the different types of RFQs DeepMaize sends contributed to its procurement during the semi-final and final rounds.³ Procurement quantities were split almost evenly between strategic and steady-state behaviors. Order-based needs were the most expensive to fill, followed by baseline and future utilization needs. This is consistent with the relative values assigned to those needs during the offer acceptance process, and indicates that the premium values change the acceptance decisions in the intended way. DeepMaize purchased the bulk of its components using the strategic and future utilization mechanisms, avoiding the need to pay high premiums for components in all but rare instances.

³ Includes all games played by DeepMaize except 1241, 1269, and 1429 when DeepMaize experienced network problems (a total of 28 games). We also note that DeepMaize used somewhat different day-0 procurement strategies in the three rounds. The pattern of results presented here holds when the rounds are considered individually instead of in aggregate.

We also compared the average prices paid and quantities procured by DeepMaize to the other five agents in the tournament finals. The results are broken down into three time periods: day 0, days 1–2, and the rest of the game. Day 0 is when most of the strategic interactions played out, but RedAgent and DeepMaize both had fallback strategies that procured substantial additional quantities immediately after day 0. Days 3–219 represent the days when our procurement module was using its steady-state policy. In Table II we see that all agents purchased a substantial fraction of their components on day 0. Whitebear purchased components only on day 0, but ended up purchasing far fewer total components than most other agents. All other agents purchased throughout the game, with significant procurement activity from days 3–219.

Table II. Component quantities purchased during different game periods in the TAC-03 final round. Checkmarks indicate a statistically significant difference with DeepMaize (p ≤ 0.05), while x indicates a less significant difference (p ≤ 0.10). Agents are listed in the order they placed in during the tournament finals.

             Day 0       Days 1–2   Days 3–219   Ave. Total Purchased
RedAgent     40.9% x     19.5% ✓    39.7%        254101 ✓
deepmaize    31.1%       29.4%      39.5%        226423
TacTex       33.0%       0.8% ✓     66.3% ✓      80396 ✓
Botticelli   41.2% x     0.0% ✓     58.8% ✓      213340
PackaTAC     20.4% ✓     3.8% ✓     75.8% ✓      117545 ✓
whitebear    100.0% ✓    0.0% ✓     0.0% ✓       53571 ✓

Table III. Average normalized prices paid for components during the TAC-03 final round. Checkmarks indicate a statistically significant difference with DeepMaize (p ≤ 0.05).

             Days 0–219   Days 1–2   Days 3–219
RedAgent     0.633        0.822 ✓    0.676 ✓
deepmaize    0.632        0.767      0.630
TacTex       0.641        0.656 ✓    0.722 ✓
Botticelli   0.575 ✓      -.-        0.630
PackaTAC     0.734 ✓      0.854 ✓    0.796 ✓
whitebear    0.500 ✓      -.-        -.-

Some information about the procurement strategies of the other agents after day 0 can be found in their published accounts. RedAgent [Keller and Duguay 2004] and PackaTAC [Dahlgren 2003] both used variations of a strategy based on maintaining a threshold inventory level with a reorder policy if inventory levels below the threshold were detected. Both used fairly simple offer acceptance strategies that did not reject offers on the basis of price. TacTex [Pardoe and Stone 2004] had a procurement strategy somewhat similar to DeepMaize's. It projected inventory needs for the next 50 days and issued RFQs to fill those needs for days with low expected prices. Offers were accepted if marginal values exceeded the cost of the order (independently from other offers). TacTex projects needs and calculates marginal values based on historical data, while DeepMaize uses projections of future customer demand and market analysis. Botticelli [Benisch et al. 2004] did not consider the procurement decision as part of their overall optimization approach because of the day-0 effects, and Whitebear did not purchase any components after day 0.

Looking at Table III we see that DeepMaize achieved significantly better prices than every agent except Botticelli from days 3–219. Botticelli and DeepMaize had almost identical performance. In the first column we see that the top three placing agents paid very similar average prices for components over the entire game. PackaTAC paid significantly higher prices, while Botticelli and Whitebear paid lower prices (in Whitebear's case because they only purchased on day 0). Interestingly, looking only at these procurement numbers Botticelli appears to have the most efficient overall procurement strategy. It purchased a similar total quantity of components as the top two agents, and purchased them for lower prices. That this agent placed fourth overall is testament to the fact that average prices do not tell the full story; when components arrive and what the agent does with them also matter.

This analysis shows that DeepMaize had a competitive overall procurement policy. The steady-state part of this policy was based on value-driven procurement guided by a reference inventory trajectory. It was flexible enough to express many types of procurement requirements, and robust enough to be used in conjunction with several initial procurement strategies. The tournament behavior of the procurement module corresponded well to the intended behavior represented by the value function. This resulted in DeepMaize achieving average price performance equal to or better than all other finals agents when the steady-state policy was active. One caveat to consider is that the importance of day 0 procurement in TAC-03 may have caused many of the other agent developers to put less effort into developing policies for procurement during the rest of the game. Since changes will be made for TAC-04 to reduce the impact of day 0 procurement, we look forward to the 2004 competition as a better test for DeepMaize's steady-state procurement policy.

There are many possibilities for improvements to DeepMaize's procurement approach. For instance, some of the parameters defining the reference inventory trajectory (e.g., baseline level and premiums) were set somewhat arbitrarily and could undoubtedly be improved by additional tuning, analysis, or learning methods. Our estimates of supplier capacity could also be used more effectively to target suppliers with available capacity at low prices, potentially improving offer acceptance rates and component prices. These same estimates could also be used to improve our projections of future component consumption by identifying times when production will be impossible because key components are not available or are very expensive. We believe that these and other improvements will be interesting avenues for further research on value-driven procurement in future TAC/SCM competitions.

ACKNOWLEDGMENTS

We gratefully acknowledge the help of many people who made the TAC/SCM tournament possible including R. Arunachalam, J. Eriksson, N. Finne, S. Janson, and N. Sadeh. At the University of Michigan, DeepMaize was designed and implemented with the additional help of Joshua Estelle, Yevgeniy Vorobeychik, Matthew Rudary, Kevin O'Malley, Thede Loder, and Shih-Fen Cheng. This work was supported in part by NSF grant IIS-0205435.

REFERENCES

Arunachalam, R., Eriksson, J., Finne, N., Janson, S., and Sadeh, N. 2003. The TAC supply chain management game. Tech. rep., Swedish Institute of Computer Science. Draft Version 0.62.

Benisch, M., Greenwald, A., Naroditskiy, V., and Tschantz, M. 2004. A stochastic programming approach to scheduling in TAC SCM. In Fifth ACM Conference on Electronic Commerce. New York.

Dahlgren, E. 2003. PackaTAC: A conservative trading agent. M.S. thesis, Lund University.

Estelle, J., Vorobeychik, Y., Wellman, M. P., Singh, S., Kiekintveld, C., and Soni, V. 2003. Strategic interactions in a supply chain game. Tech. rep., University of Michigan.

Keller, P. W. and Duguay, F.-O. 2004. RedAgent - winner of TAC SCM 2003. SIGecom Exchanges 4, 3.

Kiekintveld, C., Wellman, M. P., Singh, S., Estelle, J., Vorobeychik, Y., Soni, V., and Rudary, M. 2004. Distributed feedback control for decision making on supply chains. In Fourteenth International Conference on Automated Planning and Scheduling. Whistler, BC.

Pardoe, D. and Stone, P. 2004. TacTex-03: A supply chain management agent. SIGecom Exchanges 4, 3.

Sadeh, N., Arunachalam, R., Eriksson, J., Finne, N., and Janson, S. 2003. TAC-03: A supply-chain trading competition. AI Magazine 24, 1, 92–94.

Schneider, J. G., Boyan, J. A., and Moore, A. W. 1998. Value function based production scheduling. In Fifteenth International Conference on Machine Learning. Madison, WI, 522–530.

Received January 2004; Revised February 2004; Accepted February 2004

