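YARN app max attempts in depth: smooth ApplicationMaster restart for long-running services
When developing a long-running service on YARN you need to pay attention to fault tolerance. This article examines one parameter that affects smooth restart of the ApplicationMaster (AM) and explains how to set it so that the AM can restart smoothly.
yarn-site.xml has the following parameter: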
/**
 * The maximum number of application attempts.
 * It's a global setting for all application masters.
 */
yarn.resourcemanager.am.max-attempts
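In yarn-site.xml this is an ordinary property entry. The fragment below is only a sketch: the value 10000 is the figure this article suggests in its summary for long-running services, not a general recommendation. If the property is not set, the default (`DEFAULT_RM_AM_MAX_ATTEMPTS`) is 2.

```xml
<!-- yarn-site.xml on the ResourceManager; the value is illustrative -->
<property>
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>10000</value>
</property>
```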
This is a global limit on the number of ApplicationMaster attempts. When submitting an application to YARN, you can also set a maximum number of attempts for that individual application:
/**
 * Set the number of max attempts of the application to be submitted. WARNING:
 * it should be no larger than the global number of max attempts in the Yarn
 * configuration.
 * @param maxAppAttempts the number of max attempts of the application
 *                       to be submitted.
 */
@Public
@Stable
public abstract void setMaxAppAttempts(int maxAppAttempts);
When an attempt fails and the application has set keepContainersAcrossAppAttempts, the ResourceManager decides whether the containers of the previous attempt are still retained.
boolean keepContainersAcrossAppAttempts = false;
switch (finalAttemptState) {
  case FINISHED:
  {
    appEvent = new RMAppFinishedAttemptEvent(applicationId,
        appAttempt.getDiagnostics());
  }
  break;
  case KILLED:
  {
    // don't leave the tracking URL pointing to a non-existent AM
    appAttempt.setTrackingUrlToRMAppPage();
    appAttempt.invalidateAMHostAndPort();
    appEvent =
        new RMAppFailedAttemptEvent(applicationId,
            RMAppEventType.ATTEMPT_KILLED,
            "Application killed by user.", false);
  }
  break;
  case FAILED:
  {
    // don't leave the tracking URL pointing to a non-existent AM
    appAttempt.setTrackingUrlToRMAppPage();
    appAttempt.invalidateAMHostAndPort();
    if (appAttempt.submissionContext
        .getKeepContainersAcrossApplicationAttempts()
        && !appAttempt.submissionContext.getUnmanagedAM()) {
      // See if we should retain containers for non-unmanaged applications
      if (!appAttempt.shouldCountTowardsMaxAttemptRetry()) {
        // Premption, hardware failures, NM resync doesn't count towards
        // app-failures and so we should retain containers.
        keepContainersAcrossAppAttempts = true;
      } else if (!appAttempt.maybeLastAttempt) {
        // Not preemption, hardware failures or NM resync.
        // Not last-attempt too - keep containers.
        keepContainersAcrossAppAttempts = true;
      }
    }
    appEvent =
        new RMAppFailedAttemptEvent(applicationId,
            RMAppEventType.ATTEMPT_FAILED, appAttempt.getDiagnostics(),
            keepContainersAcrossAppAttempts);
  }
}
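The FAILED branch reduces to a small predicate over four booleans. The sketch below restates it as a standalone function so that the individual cases can be checked in isolation. The class and method names (`KeepContainersSketch`, `shouldKeep`) are made up for illustration and are not part of the YARN code base.

```java
// Illustrative restatement of the FAILED branch; not YARN code.
final class KeepContainersSketch {

  /**
   * @param keepRequested    the application asked to keep containers across attempts
   * @param unmanagedAM      the application uses an unmanaged AM
   * @param countsAsFailure  the failure counts towards the max-attempts limit
   *                         (false for preemption, hardware failure, NM resync)
   * @param maybeLastAttempt the failed attempt was flagged as possibly the last one
   */
  static boolean shouldKeep(boolean keepRequested, boolean unmanagedAM,
      boolean countsAsFailure, boolean maybeLastAttempt) {
    if (!keepRequested || unmanagedAM) {
      return false;
    }
    // Uncounted failures always keep containers; counted failures keep them
    // only if the failed attempt was not flagged as the last one.
    return !countsAsFailure || !maybeLastAttempt;
  }

  public static void main(String[] args) {
    // Ordinary failure with retries left.
    System.out.println(shouldKeep(true, false, true, false));
    // Ordinary failure on the attempt flagged as last.
    System.out.println(shouldKeep(true, false, true, true));
    // Preemption on the attempt flagged as last.
    System.out.println(shouldKeep(true, false, false, true));
  }
}
```

The case that matters for long-running services is the second one: once an attempt is flagged as possibly the last, an ordinary AM failure no longer preserves its containers.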
Note the variable appAttempt.maybeLastAttempt: how does the RM determine whether this attempt is the last one?
private void createNewAttempt() {
  ApplicationAttemptId appAttemptId =
      ApplicationAttemptId.newInstance(applicationId, attempts.size() + 1);
  RMAppAttempt attempt =
      new RMAppAttemptImpl(appAttemptId, rmContext, scheduler, masterService,
          submissionContext, conf,
          // The newly created attempt maybe last attempt if (number of
          // previously failed attempts(which should not include Preempted,
          // hardware error and NM resync) + 1) equal to the max-attempt
          // limit.
          maxAppAttempts == (getNumFailedAppAttempts() + 1), amReq);
  attempts.put(appAttemptId, attempt);
  currentAttempt = attempt;
}
Each time a new attempt is constructed, the expression maxAppAttempts == (getNumFailedAppAttempts() + 1) checks whether the number of attempts that have already failed, plus one, has reached the maxAppAttempts limit.
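To see when the flag flips, here is a minimal sketch of the same comparison outside YARN. `LastAttemptSketch` and its method are hypothetical names used only for this illustration.

```java
// Illustrative model of the maybeLastAttempt flag; not YARN code.
final class LastAttemptSketch {

  /** Mirrors the expression maxAppAttempts == (getNumFailedAppAttempts() + 1). */
  static boolean maybeLastAttempt(int maxAppAttempts, int numFailedAttempts) {
    return maxAppAttempts == numFailedAttempts + 1;
  }

  public static void main(String[] args) {
    int maxAppAttempts = 3;
    // Walk through the attempts that would be created after 0, 1 and 2
    // counted failures.
    for (int failed = 0; failed < maxAppAttempts; failed++) {
      System.out.println("counted failures=" + failed
          + " -> maybeLastAttempt=" + maybeLastAttempt(maxAppAttempts, failed));
    }
  }
}
```

With a limit of 3, only the attempt created after two counted failures is flagged. With a limit of 1, the very first attempt is already flagged, so a counted failure never keeps its containers.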
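The maxAppAttempts value itself is the smaller of the global and the per-application setting; a per-application value that is unset (<= 0) or larger than the global limit falls back to the global value.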
int globalMaxAppAttempts = conf.getInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS,
    YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS);
int individualMaxAppAttempts = submissionContext.getMaxAppAttempts();
if (individualMaxAppAttempts <= 0 ||
    individualMaxAppAttempts > globalMaxAppAttempts) {
  this.maxAppAttempts = globalMaxAppAttempts;
  LOG.warn("The specific max attempts: " + individualMaxAppAttempts
      + " for application: " + applicationId.getId()
      + " is invalid, because it is out of the range [1, "
      + globalMaxAppAttempts + "]. Use the global max attempts instead.");
} else {
  this.maxAppAttempts = individualMaxAppAttempts;
}
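The practical consequence is that a large per-application value has no effect unless the global limit is at least as large. A small standalone sketch of the same rule follows; `MaxAttemptsSketch` and `effectiveMaxAttempts` are illustrative names, not YARN APIs.

```java
// Illustrative restatement of how the effective limit is chosen; not YARN code.
final class MaxAttemptsSketch {

  /** Returns the per-application value if it lies in [1, globalMax], else globalMax. */
  static int effectiveMaxAttempts(int globalMax, int individualMax) {
    if (individualMax <= 0 || individualMax > globalMax) {
      return globalMax;
    }
    return individualMax;
  }

  public static void main(String[] args) {
    // Global limit 2, application asks for 10000: the request is ignored.
    System.out.println(effectiveMaxAttempts(2, 10000));
    // Global limit raised to 10000: the application's own value of 100 is used.
    System.out.println(effectiveMaxAttempts(10000, 100));
  }
}
```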
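Summary:
If you want the ApplicationMaster to be able to restart repeatedly and take over the containers of the previous attempt, set yarn.resourcemanager.am.max-attempts as high as practical, for example 10000. When submitting the application, also set the maximum number of attempts on the submission context, together with the window after which the failure count is refreshed. With these in place, the requirements of a long-running service on YARN are essentially met.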
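The "refresh window" most likely refers to the attempt-failures validity interval that an application can set on its submission context (`setAttemptFailuresValidityInterval` on `ApplicationSubmissionContext`, as far as I know available since Hadoop 2.6). That reading is an assumption, since the article does not name the setting. Under it, only failures that finished inside the window count towards the limit. The sketch below is a simplified model of that counting, with hypothetical names, and is not the ResourceManager's implementation.

```java
// Simplified model of counting failed attempts inside a validity window.
// Assumption: failures older than the window are ignored. Not YARN code.
final class FailureWindowSketch {

  /**
   * Counts the failures that still matter at time nowMs.
   * A non-positive windowMs means no window: every failure counts forever.
   */
  static int countedFailures(long[] failureTimesMs, long nowMs, long windowMs) {
    int counted = 0;
    for (long finishedAt : failureTimesMs) {
      if (windowMs <= 0 || finishedAt > nowMs - windowMs) {
        counted++;
      }
    }
    return counted;
  }

  public static void main(String[] args) {
    long[] failures = {1_000L, 50_000L, 120_000L};
    // Without a window all three failures count.
    System.out.println(countedFailures(failures, 130_000L, 0L));
    // With a 60 s window only the failure at 120 s is recent enough.
    System.out.println(countedFailures(failures, 130_000L, 60_000L));
  }
}
```

With a window in place, a service that fails only occasionally never accumulates enough counted failures to reach the limit, which is what makes a finite max-attempts value workable for long-running applications.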