Dynamic Core Allocation for Malleable Jobs with Unknown Speed-up Parameters
S. A. Bodas, J. L. Dorsman, M. Mandjes, L. Ravner
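AI summary: For malleable jobs with unknown speed-up parameters in a multicore system, the paper proposes an iterative learning-and-control framework that estimates the unknown parameters by maximum likelihood and solves a Markov decision process to update the allocation policy, with the aim of minimizing the long-run mean number of jobs.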
We study dynamic resource allocation in a multicore computing system with a fixed number of processing cores and a stream of malleable jobs. Each job may adjust its level of parallelism during execution, allowing adaptive redistribution of resources across concurrently active jobs. Jobs belong to one of two observable classes, each characterized by a distinct speed-up function with unknown parameters. The objective is to learn a core-allocation policy that minimizes the long-run mean number of jobs in the system, or equivalently the mean response time in steady state.

To address this uncertainty, we develop an iterative learning-and-control framework. The system alternates between two steps: estimating the unknown speed-up parameters from observed job completions, and solving the associated Markov decision process (MDP) to update the allocation policy. Within each job class, cores are shared equally among active jobs; the fraction of capacity assigned to each class is obtained from the MDP formulation of \cite{berg2017}, evaluated at the current parameter estimates. We construct a maximum likelihood estimator based on state-dependent inter-departure times and prove its strong consistency under a fixed allocation policy. We further propose two learning algorithms that combine this estimation step with dynamic programming-based policy updates, and illustrate their performance through numerical experiments.
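
The estimation step can be illustrated with a minimal sketch for a single job class. It is not the authors' estimator. It assumes, purely for illustration, a power-law speed-up s(x) = x^p with unknown exponent p, exponential job sizes with a known rate `MU`, a fixed number `K` of cores given to the class, and a hypothetical rate `other` for events that interrupt a sojourn before a departure. None of these choices is stated in the abstract. With n active jobs sharing the cores equally, the class departure rate is n * MU * (K/n)^p, and the log-likelihood of the observed departures and exposure times per state is concave in p, so a ternary search suffices.

```python
import math
import random

K = 8      # cores given to the class (assumed fixed for this sketch)
MU = 1.0   # assumed known rate of the exponential job sizes


def departure_rate(n, p):
    """Class departure rate when n jobs share K cores equally and s(x) = x**p."""
    return n * MU * (K / n) ** p


def simulate(p_true, samples, rng):
    """Return {n: [departures, exposure_time]} from censored sojourns in state n."""
    stats = {}
    for _ in range(samples):
        n = rng.randint(1, 6)            # number of active jobs in this sojourn
        rate = departure_rate(n, p_true)
        other = 1.0                      # hypothetical rate of interrupting events
        t = rng.expovariate(rate + other)              # time until the first event
        departed = rng.random() < rate / (rate + other)  # was it a departure?
        entry = stats.setdefault(n, [0, 0.0])
        entry[0] += int(departed)
        entry[1] += t
    return stats


def log_likelihood(p, stats):
    """Sum over states of D_n * log r_n(p) - r_n(p) * T_n (terms free of p dropped)."""
    return sum(
        d * math.log(departure_rate(n, p)) - departure_rate(n, p) * t
        for n, (d, t) in stats.items()
    )


def mle(stats, lo=0.0, hi=1.0, iters=100):
    """Maximize the concave log-likelihood over p in [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if log_likelihood(m1, stats) < log_likelihood(m2, stats):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


rng = random.Random(0)
stats = simulate(p_true=0.5, samples=20000, rng=rng)
p_hat = mle(stats)
print(f"estimated p = {p_hat:.3f}")
```

In a learning-and-control loop of the kind described above, an estimate such as `p_hat` would be passed to the MDP solver to recompute the allocation policy before more completions are observed. The sketch covers only the estimation half.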