Representative Literature
Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling
Abstract:
The volume of information that needs to be processed in big data clusters increases rapidly nowadays. It is critical to execute the data analysis in a time-efficient manner. However, simply adding more computation resources may not speed up the data analysis significantly. The data analysis jobs usually consist of multiple stages which are organized as a directed acyclic graph (DAG). The precedence relationships between stages cause scheduling challenges. General DAG scheduling is a well-known NP-hard problem. Moreover, we observe that in some parallel computing frameworks such as Spark, the execution of a stage in DAG contains multiple phases that use different resources. We notice that carefully arranging the execution of those resources in pipeline can reduce their idle time and improve the average resource utilization. Therefore, we propose a resource pipeline scheme with the objective of minimizing the job makespan. For perfectly parallel stages, we propose a contention-free scheduler with detailed theoretical analysis. Moreover, we extend the contention-free scheduler for three-phase stages, considering the computation phase of some stages can be partitioned. Additionally, we are aware that job stages in real-world applications are usually not perfectly parallel. We need to frequently adjust the parallelism levels during the DAG execution. Considering reinforcement learning (RL) techniques can adjust the scheduling policy on the fly, we investigate a scheduler based on RL for online arrival jobs. The RL-based scheduler can adjust the resource contention adaptively. We evaluate both contention-free and RL-based schedulers on a Spark cluster. In the evaluation, a real-world cluster trace dataset is used to simulate different DAG styles. Evaluation results show that our pipelined scheme can significantly improve CPU and network utilization.
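The pipelining idea in the abstract (overlapping phases that use different resources so that neither sits idle) can be illustrated with a small sketch. This is not the authors' scheduler: it assumes independent stages with exactly two phases each, a network phase followed by a CPU phase, one unit of each resource, and a fixed stage order. The function names and the stage durations are hypothetical.

```python
from typing import List, Tuple

# Each stage is (network_time, cpu_time); the values are made up for illustration.
Stage = Tuple[float, float]


def serial_makespan(stages: List[Stage]) -> float:
    """Run stages one after another with no overlap between resources."""
    return sum(net + cpu for net, cpu in stages)


def pipelined_makespan(stages: List[Stage]) -> float:
    """Overlap the network phase of the next stage with the CPU phase
    of the current one (a two-resource pipeline in the given order)."""
    net_free = 0.0  # time at which the network becomes idle
    cpu_free = 0.0  # time at which the CPU becomes idle
    for net, cpu in stages:
        net_free += net                           # network phases run back to back
        cpu_free = max(cpu_free, net_free) + cpu  # CPU waits for its own input
    return cpu_free


stages = [(2.0, 3.0), (4.0, 1.0), (1.0, 2.0)]
print(serial_makespan(stages))     # 13.0
print(pipelined_makespan(stages))  # 9.0
```

In this toy instance the pipelined order finishes in 9 time units instead of 13 because the network transfers for later stages proceed while the CPU is busy. Precedence constraints between stages, resource contention, and three-phase stages, which the abstract lists as the real difficulties, are deliberately left out.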
Authors:
Yubin Duan; Ning Wang; Jie Wu
Affiliations:
Department of Computer and Information Sciences, Temple University, Philadelphia 19122, U.S.A.; Department of Computer Science, Rowan University, Glassboro 08028, U.S.A.
Citation format:
[1] Yubin Duan; Ning Wang; Jie Wu. Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling[J]. Journal of Computer Science and Technology, 2022(04): 852-868