Designing a Python Interface for Machine Learning Engineering
In order to do machine learning engineering, a model must first be deployed, in most cases as a prediction API. In order to make this API work in production, model serving infrastructure must first be built. This includes load balancing, scaling, monitoring, updating, and much more.
At first glance, all of this work seems familiar. Web developers and DevOps engineers have been automating microservice infrastructure for years now. Surely, we can repurpose their tools?
Unfortunately, we can’t—at least, not precisely.
While ML infrastructure is similar to traditional DevOps, it is just ML-specific enough to make standard DevOps tools suboptimal. This is why we built Cortex, our open source platform for machine learning engineering.
At a very high level, Cortex is designed to make it easy to deploy a model locally or to the cloud, automating all the underlying infrastructure. A central component of the platform is the Predictor interface—a programmable Python interface through which developers can write prediction APIs.
Designing a Python interface specifically for serving predictions as web requests was a challenge we spent months on (and are still improving). Here, I want to share some of the design principles we’ve developed so far:
1. A Predictor is just a Python class
At the core of Cortex’s architecture is our concept of a Predictor, which is essentially a prediction API, including all of the request handling code and dependencies. The Predictor interface enforces some simple requirements for those prediction APIs.
Because Cortex takes a microservices approach to model serving, the Predictor interface is strictly concerned with two things:
- Initializing models
- Serving predictions
In that spirit, Cortex’s Predictor interface requires two functions, __init__() and predict(), that do more or less what you’d expect:
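The code sample embedded in the original post is not reproduced here. As a stand-in, here is a minimal sketch of the interface, assuming a pickled scikit-learn style model; the model path and payload fields are illustrative, not part of Cortex.

```python
# A minimal sketch of a Predictor, assuming a pickled scikit-learn model.
# The model path and payload fields are illustrative assumptions.
import pickle


class PythonPredictor:
    def __init__(self, config):
        # Called once, when the API starts: load the model into memory.
        with open("/mnt/model.pkl", "rb") as f:
            self.model = pickle.load(f)

    def predict(self, payload):
        # Called on every request: turn the JSON payload into model input,
        # and return a JSON-serializable prediction.
        features = [[payload["sepal_length"], payload["sepal_width"]]]
        label = self.model.predict(features)[0]
        return {"label": int(label)}
```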
After initialization, you can think of a Predictor as just a Python object, whose single predict() function gets called when users query an endpoint.
One of the big benefits of this approach is that it is intuitive to anyone with software engineering experience. There’s no need to touch your data pipeline or model training code. A model is just a file, and a Predictor is just an object that imports it and runs a predict() method.
Beyond its syntactic appeal, however, this design offers some key benefits in how it complements Cortex’s broader architecture.
2. A prediction is just an HTTP request
One of the complexities of building an interface for serving predictions in production is that inputs will almost certainly differ, at least in format, from the model’s training data.
This is true on two levels:
- The body of a POST request is not a NumPy array, or whatever data structure your model was trained to process.
- Machine learning engineering is all about using models to build software, which often means using models on data they were not trained to process, e.g. using GPT-2 to write folk music.
The Predictor interface, therefore, cannot be opinionated about the inputs and outputs of a prediction API. A prediction is just an HTTP request, and the developer is free to process it however they want. If, for example, they want to deploy a multi-model endpoint and query different models based on request params, they can do that:
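The original post includes a code sample at this point; the sketch below is a hedged reconstruction of the idea, assuming (as the text implies) that predict() also receives the request’s query parameters. The model names and files are hypothetical.

```python
# A sketch of a multi-model endpoint: a query parameter on the request
# decides which model serves the prediction. Model names are hypothetical.
import pickle


class PythonPredictor:
    def __init__(self, config):
        # Load every model this endpoint can serve.
        self.models = {}
        for name in ("sentiment", "topic"):
            with open(f"/mnt/{name}.pkl", "rb") as f:
                self.models[name] = pickle.load(f)

    def predict(self, payload, query_params):
        # Route the request based on a query param, e.g. ?model=topic
        model = self.models[query_params.get("model", "sentiment")]
        return {"prediction": model.predict([payload["text"]])[0]}
```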
And while this interface gives developers freedom in terms of what they can do with their API, it also provides some natural scoping that enables Cortex to be more opinionated on the infrastructure side.
For example, under the hood Cortex uses FastAPI to set up request routing. At this layer, Cortex sets up a number of processes relating to autoscaling, monitoring, and other infrastructure features, which could become very complex if developers were required to implement routing themselves.
But, because each API has a single predict() method, every API has the same number of routes—one. Being able to assume this allows Cortex to do a lot more at the infrastructure level without limiting engineers.
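As an illustration of how small that surface is, single-route wiring in FastAPI can look like the following. This is a simplified sketch, not Cortex’s actual source.

```python
# A simplified sketch (not Cortex's actual source) of single-route wiring:
# every Predictor sits behind exactly one POST route.
from fastapi import FastAPI, Request

app = FastAPI()
predictor = PythonPredictor(config={})  # the single-model Predictor sketched earlier


@app.post("/predict")
async def handle(request: Request):
    payload = await request.json()
    return predictor.predict(payload)
```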
3. A served model is just a microservice
Scale is a chief concern for anyone using machine learning in production. Models can get huge (GPT-2 is roughly 6 GB), are computationally expensive, and can have high latency. Especially for realtime inference, scaling up to handle traffic is a challenge—even more so if you’re budget constrained.
To solve for this, Cortex treats Predictors as microservices, which can be horizontally scaled. More specifically, when a developer hits $ cortex deploy, Cortex containerizes the API, spins up a cluster provisioned for inference, and deploys. Then, it exposes the API as a web service behind a load balancer, and configures autoscaling, updating, and monitoring:
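The original post embeds a diagram from the Cortex docs here. The practical upshot is that the deployed Predictor can be queried like any other web service; the endpoint URL and payload below are placeholders, not a real deployment.

```python
# Querying a deployed Predictor like any other web service.
# The endpoint URL is a placeholder, not a real deployment.
import requests

endpoint = "https://example-lb.amazonaws.com/iris-classifier"
payload = {"sepal_length": 5.1, "sepal_width": 3.5}

response = requests.post(endpoint, json=payload)
print(response.json())  # e.g. {"label": 0}
```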
The Predictor interface is fundamental to this process, even though it is “just” a Python interface.
What the Predictor interface does is enforce a packaging of code in such a way that it becomes a single, atomic unit of inference. All the request handling code you need for a single API is contained within a Predictor. This makes it easy for Cortex to scale Predictors.

In this way, engineers don’t have to do any extra work—unless they want to tweak things, of course—to prepare an API for production. A Cortex deployment is production-ready by default.
An interface for machine learning engineering—emphasis on the “engineering”
A constant theme in all of the above points is the balance of flexibility and ease-of-use. If the goal is to allow machine learning engineers to build whatever they want, Cortex needs to be:
- Flexible enough for engineers to implement any idea.
- Opinionated enough to automate the infrastructure work that obstructs engineers.
Striking this balance is a constant challenge, particularly as the world of machine learning engineering is young and constantly changing.
However, by focusing solely on what it takes to build software out of models—the “engineering” in machine learning engineering—we believe we can walk that tightrope.
If contributing to the machine learning engineering ecosystem sounds interesting to you, we’re always happy to meet new contributors at Cortex.
Translated from: https://towardsdatascience.com/designing-a-python-interface-for-machine-learning-engineering-ae308adc4412