How MapReduce Works: MapReduce Computation Principles and Purpose (Part 12)

Diannao Zatan (电脑杂谈)  Published: 2017-03-13 14:37:59  Source: compiled from the web

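When a slot becomes free on a TaskTracker, the scheduler selects, in order, a queue, a job from that queue, and a task from that job, and assigns the slot to that task. The policies used to select the queue, job, and task are as follows: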

① Selecting a queue: all queues are sorted in ascending order of resource usage (numSlotsOccupied/capacity) and processed in that order until a suitable job is found.

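② Selecting a job: within the current queue, jobs are sorted by submission time and job priority (priority is used only when priority scheduling is enabled; it is off by default and must be turned on in the configuration file). The scheduler considers each job in turn and selects one that meets two conditions: [1] the user who owns the job has not reached the resource usage limit; [2] the node running the TaskTracker has enough free memory for the job's task.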

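③ Selecting a task: as in most schedulers, the task's locality and resource usage are taken into account (that is, the scheduler calls the obtainNewMapTask()/obtainNewReduceTask() methods of JobInProgress).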

Putting this together, the capacity scheduler can be expressed in pseudocode as:

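// CapacityTaskScheduler: a TaskTracker reports a free slot; find a suitable task for it
List<Task> assignTasks(TaskTrackerStatus taskTracker) {
  sortQueuesByResourceUsage(queues);          // ascending numSlotsOccupied / capacity
  for (Queue queue : queues) {
    sortJobsByTimeAndPriority(queue);         // priority only if enabled for the queue
    for (Job job : queue.getJobs()) {
      // condition [1]: user below limit; condition [2]: enough free memory on the node
      if (underUserLimit(job, queue) && matchesMemoryRequirements(job, taskTracker)) {
        Task task = job.obtainNewTask();      // takes locality into account
        if (task != null) return Collections.singletonList(task);
      }
    }
  }
  return Collections.emptyList();             // no schedulable task for this slot
}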
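
The selection order above can also be tried out as a small, self-contained Java program. This is only a sketch under stated assumptions, not Hadoop's actual implementation: the classes `JobQueue` and `Job`, the method `pickJob`, and all sample values are hypothetical, and a larger `priority` number is assumed to mean higher priority.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical, simplified stand-ins for the scheduler's internal state.
class SchedulerSketch {
    static class Job {
        final String name;
        final int priority;            // assumption: larger value = higher priority
        final long submitTime;
        final boolean userUnderLimit;  // condition [1] from the text
        final int taskMemoryMb;        // memory one task of this job needs
        Job(String name, int priority, long submitTime, boolean userUnderLimit, int taskMemoryMb) {
            this.name = name; this.priority = priority; this.submitTime = submitTime;
            this.userUnderLimit = userUnderLimit; this.taskMemoryMb = taskMemoryMb;
        }
    }

    static class JobQueue {
        final String name;
        final int capacity;
        final int slotsOccupied;
        final List<Job> jobs = new ArrayList<>();
        JobQueue(String name, int capacity, int slotsOccupied) {
            this.name = name; this.capacity = capacity; this.slotsOccupied = slotsOccupied;
        }
        double usage() { return (double) slotsOccupied / capacity; }
    }

    // Returns the name of the job that would receive the free slot, or null if none fits.
    static String pickJob(List<JobQueue> queues, int freeMemoryMb) {
        List<JobQueue> sortedQueues = new ArrayList<>(queues);
        sortedQueues.sort(Comparator.comparingDouble(JobQueue::usage)); // least-used queue first
        for (JobQueue queue : sortedQueues) {
            List<Job> jobs = new ArrayList<>(queue.jobs);
            // Higher priority first, then earlier submission time.
            jobs.sort(Comparator.comparingInt((Job j) -> -j.priority)
                                .thenComparingLong(j -> j.submitTime));
            for (Job job : jobs) {
                if (job.userUnderLimit && job.taskMemoryMb <= freeMemoryMb) {
                    return job.name;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        JobQueue hive = new JobQueue("hive", 30, 27);     // 90% of its capacity in use
        JobQueue hadoop = new JobQueue("hadoop", 30, 6);  // 20% of its capacity in use
        hadoop.jobs.add(new Job("big", 1, 100L, true, 4096));
        hadoop.jobs.add(new Job("small", 1, 200L, true, 512));
        hive.jobs.add(new Job("query", 5, 50L, true, 512));
        List<JobQueue> queues = new ArrayList<>();
        queues.add(hive);
        queues.add(hadoop);
        System.out.println(pickJob(queues, 1024));
    }
}
```

In this sample the hadoop queue is visited first because its usage ratio is lower. Its earliest job needs more memory than the node has free, so the slot goes to the next job in the same queue.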

4. Capacity Scheduler configuration example

①. Copy $HADOOP_HOME/contrib/capacity-scheduler/hadoop-capacity-scheduler.jar into the $HADOOP_HOME/lib directory

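②. Edit the conf/mapred-site.xml file on the namenode node (the scheduler setting is read by the JobTracker, which often runs on the same host)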

  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
  </property>
  <property>
    <name>mapred.queue.names</name>
    <value>default,hadoop,hive</value>
  </property>

③. Edit the conf/capacity-scheduler.xml configuration file

<?xml version="1.0"?>
 
<!-- This is the configuration file for the resource manager in Hadoop. -->
<!-- You can configure various scheduling parameters related to queues. -->
<!-- The properties for a queue follow a naming convention,such as, -->
<!-- mapred.capacity-scheduler.queue.<queue-name>.property-name. -->
 
<configuration>
  <!-- Capacity scheduler Job Initialization configuration parameters -->
  <property>
    <name>mapred.capacity-scheduler.init-poll-interval</name>
    <value>5000</value>
    <description>The amount of time in milliseconds which is used to poll the job queues for jobs to initialize.
    </description>
  </property>
  <property>
    <name>mapred.capacity-scheduler.init-worker-threads</name>
    <value>5</value>
    <description>Number of worker threads used by the
    initialization poller to initialize jobs in a set of queues.
    If the number given in this property equals the number of job queues,
    a single thread initializes the jobs of each queue. If it is lower,
    each thread is assigned a set of queues. If it is
    greater, the number of threads equals the number of
    job queues.
    </description>
  </property>
 
  <property>
     <name>mapred.capacity-scheduler.maximum-system-jobs</name>
     <value>30</value>
     <description>Maximum number of jobs in the system which can be initialized,
     concurrently, by the Capacity Scheduler.
     </description>
  </property>
 
<!-- hadoop -->
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.capacity</name>
    <value>30</value>
    <description>Percentage of the number of slots in the cluster that are to be available for jobs in this queue.
    </description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-capacity</name>
    <value>-1</value>
    <description>
    </description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.supports-priority</name>
    <value>true</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.minimum-user-limit-percent</name>
    <value>100</value>
    <description> </description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.user-limit-factor</name>
    <value>3</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hadoop.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.default-maximum-initialized-jobs-per-user</name>
    <value>5</value>
    <description>The maximum number of jobs to be pre-initialized for a user
    of the job queue.
    </description>
  </property>
 
<!-- hive -->
  <property>
    <name>mapred.capacity-scheduler.queue.hive.capacity</name>
    <value>30</value>
    <description></description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-capacity</name>
    <value>-1</value>
    <description></description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.supports-priority</name>
    <value>true</value>
    <description>If true, priorities of jobs will be taken into account in scheduling decisions.
    </description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.minimum-user-limit-percent</name>
    <value>100</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.user-limit-factor</name>
    <value>4</value>
    <description>The multiple of the queue capacity which can be configured to allow a single user to acquire more slots.
    </description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.hive.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
 
<!-- default -->
  <property>
    <name>mapred.capacity-scheduler.queue.default.capacity</name>
    <value>40</value>
    <description></description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-capacity</name>
    <value>-1</value>
    <description></description>   
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.supports-priority</name>
    <value>true</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
    <value>100</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.user-limit-factor</name>
    <value>4</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks</name>
    <value>200000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks-per-user</name>
    <value>100000</value>
    <description></description>
  </property>
 
  <property>
    <name>mapred.capacity-scheduler.queue.default.init-accept-jobs-factor</name>
    <value>10</value>
    <description></description>
  </property>
 
</configuration>
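
To get a feel for what these percentages mean in slots, the arithmetic can be sketched in Java. This is an illustrative calculation only: the cluster size of 200 slots, the class `CapacityMath`, and its methods are hypothetical, and the single-user cap is modeled simply as the queue's share multiplied by user-limit-factor, bounded by the cluster size because maximum-capacity is -1 (no upper limit) in the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for reasoning about the example configuration.
class CapacityMath {
    // Slots guaranteed to a queue: its capacity percentage of all cluster slots.
    static int guaranteedSlots(int totalSlots, int capacityPercent) {
        return totalSlots * capacityPercent / 100;
    }

    // Simplified upper bound for one user: queue share times user-limit-factor,
    // never more than the whole cluster.
    static int singleUserCap(int totalSlots, int capacityPercent, int userLimitFactor) {
        return Math.min(totalSlots, guaranteedSlots(totalSlots, capacityPercent) * userLimitFactor);
    }

    // Sum of all queue capacities; it should not exceed 100.
    static int totalCapacity(Map<String, Integer> capacities) {
        int sum = 0;
        for (int c : capacities.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        int totalSlots = 200; // assumption: cluster size chosen for illustration

        // Capacity and user-limit-factor values taken from the configuration above.
        Map<String, Integer> capacities = new LinkedHashMap<>();
        capacities.put("hadoop", 30);
        capacities.put("hive", 30);
        capacities.put("default", 40);
        Map<String, Integer> factors = new LinkedHashMap<>();
        factors.put("hadoop", 3);
        factors.put("hive", 4);
        factors.put("default", 4);

        System.out.println("total capacity = " + totalCapacity(capacities) + "%");
        for (Map.Entry<String, Integer> e : capacities.entrySet()) {
            String queue = e.getKey();
            int cap = e.getValue();
            System.out.println(queue
                + ": guaranteed=" + guaranteedSlots(totalSlots, cap)
                + ", single-user cap=" + singleUserCap(totalSlots, cap, factors.get(queue)));
        }
    }
}
```

Under these assumptions the three queues together account for exactly 100% of the cluster, and the hadoop queue is guaranteed 60 of the 200 slots.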


This article is from Diannao Zatan (电脑杂谈). When reposting, please cite this article's URL:
http://www.pc-fly.com/a/jisuanjixue/article-37299-12.html
