Deploying a Machine Learning Model to a Serverless Backend with SAM CLI

Introduction
Over the last year at CMRA, we have been incorporating more machine learning models into our applications. There are plenty of data science blogs about developing models. However, there are very few tutorials about deploying models into production. Fortunately, there are options available for this and it continues to get easier.
Popular options for deploying models developed in Python include using services written in Flask or Django for client applications to interface with. AWS, Azure, and Google Cloud each offer their own brand of Machine Learning as a Service (MLaaS) products. These services offer convenient workflows, cloud-based notebooks, and marketplaces for models. As nice as both options are, there are drawbacks when it comes to cost and scaling.
We chose to create a Serverless application using the popular Lambda service from AWS. A Serverless architecture offers several compelling advantages over the other options, most notably pay-per-invocation pricing and automatic scaling.
Prerequisites
You will need to have the following technologies installed to follow along with this tutorial:
Python 3.8.3 64 Bit (Preferably with Anaconda)
A free AWS Account
SAM CLI by following the instructions here
Important note — Make sure that you configure your AWS CLI with the IAM account that you created as part of the SAM CLI instructions. You can configure the AWS CLI by running $ aws configure and filling in all of the prompts with your IAM credentials.
Background
Earlier this year, CMRA was working on developing several virtual activities designed to teach learners how ML technologies will work in an Advanced Robotics Manufacturing context.
One such activity was a game where you are in charge of a factory with machines that periodically break down. Machine breakdowns result in lost money, so the player must predict when a machine will break down and assign a technician to fix it.
[Figure: Machine Factory screenshot]

Fortunately, there is a dataset available which contains machine sensor data. The sensor values have a simple linear relationship to the state of the machine — breaking or working. Predicting the breaking state allows the player to put a technician on a machine before the machine goes offline and requires even more time to repair.
The dataset pictured below is pretty simple. Each instance of the data sample has a labeled state, and each individual feature has a linear relationship to the state. The other features of the dataset include temperature, vibration, current, and noise. Noise, no pun intended, has no linear relationship to the working or breaking state of the machines.
```json
[
    {
        "state": "Breaking",
        "temp": "88",
        "vibration": "79",
        "current": "12",
        "noise": "84"
    },
    {
        "state": "Working",
        "temp": "27",
        "vibration": "47",
        "current": "59",
        "noise": "48"
    },
    ...
    {
        "state": "Working",
        "temp": "73",
        "vibration": "11",
        "current": "84",
        "noise": "29"
    }
]
```
Training a Model
We won’t spend a lot of time on this topic since the primary focus of this article is to deploy a model.
Pictured below is the script that trains on the machine factory dataset located here. A simple Python script will load the data, process it, train a model, and save a pickled version of that model. I recommend running the training script with Python 3.8 in a Jupyter Notebook. The linked repository at the end of this article includes the script in notebook form. I also recommend using VS Code's wonderful Python package to take advantage of built-in notebooks with linting and VS Code keybindings.
[Figure: Factory Data Training Script]

This script converts the JSON data into a Pandas DataFrame object, normalizes the features, and encodes the 2 states into 0 and 1.
The model is built using Scikit-learn's LogisticRegression model with the 'liblinear' default solver. The accuracy comes in at an impressive 97%! As we previously said, the machine states are pretty predictable and probably wouldn't require an ML model in the real world.
The model and the encoding (a mapping of the state labels to the encoded values) are pickled and saved to a local file. The resulting pickled file can be set aside for now. This is the last time we will use the above training script. Assume that this was work completed by your talented data scientist.
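Since the training script only appears as an image above, here is a minimal sketch of what it might look like. The file name, the min-max normalization constant, and the train/test split are assumptions for illustration; the actual script lives in the linked repository.

```python
import json
import pickle

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the machine sensor samples (file name is an assumption).
with open("factory_data.json") as f:
    records = json.load(f)

df = pd.DataFrame(records)
features = ["temp", "vibration", "current", "noise"]
df[features] = df[features].astype(float)

# Encode the two states and keep the mapping so we can decode predictions later.
encoding = {"Working": 0, "Breaking": 1}
y = df["state"].map(encoding)

# Normalize each feature to roughly the 0-1 range (simple scaling, an assumption).
X = df[features] / 100.0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(solver="liblinear")
model.fit(X_train, y_train)
print(f"Accuracy: {model.score(X_test, y_test):.2%}")

# Pickle the model together with the encoding map for the Lambda function to load.
with open("model.pkl", "wb") as f:
    pickle.dump({"model": model, "encoding": encoding}, f)
```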
Deploying the Model with SAM CLI
Meet SAM
[Figure: Sam]

Now that we have SAM CLI installed, let's take it for a quick test drive. We will start by initializing our application:
```
$ sam init

Which template source would you like to use?
        1 - AWS Quick Start Templates
        2 - Custom Template Location
Choice: 1

Which runtime would you like to use?
        1 - nodejs12.x
        2 - python3.8
        ...
Runtime: 2

Project name [sam-app]: your_app_name

Cloning app templates from https://github.com/awslabs/aws-sam-cli-app-templates.git

AWS quick start application templates:
        1 - Hello World Example
        ...
Template selection: 1
```
This will be our actual app, but to start out we will choose the quick start template, select the Python 3.8 runtime, name our project, and use the 'Hello World Example'.
We will start by invoking the HelloWorld Lambda function. The function can be invoked by simply running $ sam local invoke in the project's root directory. If everything is wired up properly, SAM CLI should run a Docker container that simulates the AWS Lambda environment and return a 200 response with the message 'Hello World'.
Opening up HelloWorld/app.py reveals the Lambda function. All Lambda functions have an event and a context parameter. The event parameter is usually a Python dict with data about the event that triggered the function. This can be data from an API request or from a direct invocation coming from any event that is configured to trigger the Lambda function (e.g. an S3 file upload). The context parameter will go unused in our application; it provides a Context object with data about the environment that the function is running in.
The HelloWorld function uses neither parameter. Instead, it returns the requisite response that all Lambda functions are required to return when invoked by API Gateway: a JSON response with an HTTP status code and a message body.
```python
def lambda_handler(event, context):
    return {
        "statusCode": 200,
        "body": json.dumps({
            "message": "hello world",
            # "location": ip.text.replace("\n", "")
        }),
    }
```
Creating Our Own Lambda Function
We are ready to write our own function now that we have a basic feel for how to invoke Lambda functions locally. Let's rename the folder containing the HelloWorldFunction to 'model_inference' and replace the app.py code with the following snippet:
[Figure: Model Inference app.py]

There's a lot to unpack here. First of all, we want to take the pickled model that we uploaded to S3 and load it into memory. Lambda provides a convenient Python library called Boto3 that can directly interface with any service that the Lambda function is authorized to work with. We will set up the Lambda's policy later, but for right now let's assume that load_pickle will return the pickled model and encoding map when given the S3 bucket and key (file name).
The load_pickle function instantiates an S3 client and calls its download_file method, which saves the pickle to a specified local path.
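Since the app.py snippet is only pictured, here is a minimal sketch of what load_pickle might look like. The local path and the shape of the pickled artifact (a dict holding the model and encoding, matching the training sketch above) are assumptions for illustration.

```python
import pickle

import boto3

def load_pickle(bucket, key, local_path="/tmp/model.pkl"):
    """Download a pickled artifact from S3 and unpickle it.

    Lambda only allows writes under /tmp, so the file is staged there.
    """
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, local_path)
    with open(local_path, "rb") as f:
        artifact = pickle.load(f)
    # Assumption: the training script pickled {"model": ..., "encoding": ...}.
    return artifact["model"], artifact["encoding"]
```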
Once the model is loaded, the event is processed and the JSON request payload is converted to a Python dict. The rest of the code mirrors the training script: the features are extracted by name, normalized, and then fed into the model's predict method. The prediction is mapped back to the state of 'Working' or 'Breaking' and is passed to the JSON response message.
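Putting that description together, a minimal sketch of the handler might look like the following. The bucket name, key, normalization constant, and the parse_event helper (shown later in this article) are assumptions based on the surrounding text, not the article's exact code.

```python
import json

# Load the model once per container so warm invocations skip the S3 download.
MODEL, ENCODING = load_pickle("your-model-bucket", "model.pkl")
DECODING = {v: k for k, v in ENCODING.items()}
FEATURES = ["temp", "vibration", "current", "noise"]

def lambda_handler(event, context):
    data = parse_event(event)
    # Extract the features by name and normalize them the same way as in training.
    x = [[float(data[name]) / 100.0 for name in FEATURES]]
    # Map the 0/1 prediction back to the 'Working'/'Breaking' label.
    prediction = DECODING[MODEL.predict(x)[0]]
    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": prediction}),
    }
```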
Configuring the Template
Getting this all to work requires a few more steps. SAM CLI projects include a template.yaml file at the root of the project. This template has 3 major sections — Globals, Resources, and Outputs.
The Globals section holds all of the global settings that apply to all Lambda functions in your project. The warmup request takes a little bit of time, so to be on the safe side we increased the Timeout for all functions to 60 seconds.
Resources specifies each function in the project. Every function has a type; the ModelInferenceFunction has a type of AWS::Serverless::Function. Properties of the function are set below each resource. The CodeUri matches the function's parent directory name (model_inference). Policies for the Lambda function are set under Policies. We would like this function to read from S3 and download the pickled file that we uploaded, so we gave this function an S3ReadPolicy and specified the bucket name. Other policies can be added to allow access to other AWS services as needed.
All functions are triggered by an event. In our case, the API Gateway service will trigger our Lambda invocations. That is to say that once deployed, API Gateway will give us an endpoint that we can send HTTP requests to in order to trigger our Lambda function. We also specify the API's properties, which include the path name and the HTTP method.
The Outputs section specifies the values for your application's components once the stack is deployed. Calling the aws cloudformation describe-stacks command will return the API endpoint, Lambda ARN, and Lambda IAM Role ARN. SAM CLI applications are deployed using the CloudFormation service, which acts as an orchestration tool for spinning up and connecting multiple AWS services that work together.
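The template itself isn't reproduced in this article, so below is a minimal sketch of what the relevant parts of template.yaml might look like given the description above. The bucket name is a placeholder assumption; the path and method match the curl examples later in the article.

```yaml
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Timeout: 60  # give the warmup request enough time to download the model

Resources:
  ModelInferenceFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: model_inference/
      Handler: app.lambda_handler
      Runtime: python3.8
      Policies:
        - S3ReadPolicy:
            BucketName: your-model-bucket  # assumption: bucket holding the pickle
      Events:
        Inference:
          Type: Api
          Properties:
            Path: /inferences
            Method: post

Outputs:
  InferenceApi:
    Description: API Gateway endpoint URL for the inference function
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/inferences/"
```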
Installing Dependencies
If you run the updated function by calling sam local invoke, you will be in for a disappointing surprise: none of the included dependencies are available. Fortunately, installing dependencies for a SAM CLI application is pretty straightforward.
We need to create a requirements.txt file in the root directory for all of the requisite dependencies that will be used by our function. In our case, we just need to install Pandas and Sklearn.
It works best if we create a pristine environment and freeze only the dependencies that we need for the project in the root directory. We can create a virtual environment by running python3 -m venv env. This will create an env/ folder that holds all of the data for our virtual environment. We can activate the environment by running source env/bin/activate.
You can tell that the environment is activated when the environment name (env) appears to the left of the command-line prompt. We can now install our dependencies and 'freeze' them into a requirements.txt file.
```
(env) $ pip install sklearn pandas
(env) $ pip freeze > requirements.txt
```
After we are done, we can exit the environment by simply calling the deactivate command.
Now we can build our Lambda function with the included dependencies by pointing to the requirements.txt manifest. This will create a build directory under .aws-sam/ that will be packaged and sent to AWS when we are ready to deploy.
```
$ sam build -m requirements.txt
```

Invoking Custom Events
We still need to pass inference data to our Lambda function. We can do this by creating a custom event called single_inference.json and saving it within the events folder. This JSON file will be passed to the Lambda function upon invocation by calling sam local invoke -e events/single_inference.json.
{"data": {
"temp": "10",
"vibration": "1.0",
"current": "0",
"noise": "78"
}
}
Testing the API Endpoint Locally
SAM also offers a convenient local server that allows us to perform a full integration test of our API endpoint and Lambda function. We can start the server at localhost:3000 by calling sam local start-api.
When testing the API endpoint, you will notice that the event data looks slightly different than in a direct invocation. This is why we process all incoming events with the parse_event function. API invocations arrive as JSON, and the 'body' and 'data' of the request need to be extracted from it. A direct invocation only requires us to select the 'data' key of a Python dict. Other services that can be used to trigger functions likely have different types or shapes of event data. It's best to log the event and become familiar with what a particular service's event looks like.
```python
def parse_event(event):
    if 'body' in event.keys():
        return json.loads(event['body'])['data']
    else:
        return event['data']
```
The local endpoint can now be called with a simple curl request that includes the factory data in JSON format.
```
$ curl --request POST \
  --url http://localhost:3000/inferences \
  --header 'content-type: application/json' \
  --data '{
    "data": {
      "temp": "1",
      "vibration": "1.0",
      "current": "88",
      "noise": "23"
    }
  }'

=> {"prediction": "Working"}
```
Deploy to AWS
We are now ready to deploy our little ML Serverless application to AWS! Deploying an application couldn’t be simpler. Simply call sam deploy --guided and follow the prompts to deploy to your AWS account.
Once the deploy completes, you can sign into the AWS console and see your newly created Lambda function, API Gateway endpoint, and CloudFormation stack. At this point, we can also test our production endpoint by calling it at the given endpoint URL. The invocation URL can be found at API Gateway -> API -> Stages -> Prod. On this screen you will see the 'invocation url', and you can test the endpoint by sending another curl request.
```
$ curl --request POST \
  --url https://<your-endpoint-url>/Prod/inferences \
  --header 'content-type: application/json' \
  --data '{
    "data": {
      "temp": "1",
      "vibration": "1.0",
      "current": "88",
      "noise": "23"
    }
  }'

=> {"prediction": "Working"}
```
Conclusion
Congratulations! We have successfully deployed a simple logistic regression model that can now be invoked on demand without running a server.
[Figure: Sam is impressed]

In review, we used SAM CLI to quickly set up an AWS-connected project, then replaced the provided template code with our own so that it would load a saved (pickled) model and run inference with it. We then configured the YAML template to point to the right functions and installed the necessary dependencies. We set up single-instance inference testing, tested it locally, and finally deployed it to AWS.
We can add more Lambda functions to handle other tasks. For instance, it wouldn't take a lot of effort to create a TrainingFunction that is invoked each time a new dataset is uploaded to a particular S3 bucket, as sketched below. The training set can be loaded within the function and the model can be saved for the InferenceFunction to use.
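The article doesn't implement this, so the following is only a sketch of how such an S3 trigger might be declared in template.yaml. The resource names and paths are placeholders, and note that SAM's S3 event source requires the bucket to be defined as a resource in the same template.

```yaml
Resources:
  DatasetBucket:
    Type: AWS::S3::Bucket

  TrainingFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: model_training/
      Handler: app.lambda_handler
      Runtime: python3.8
      Events:
        DatasetUpload:
          Type: S3
          Properties:
            Bucket: !Ref DatasetBucket  # must reference a bucket defined in this template
            Events: s3:ObjectCreated:*
```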
I hope this article demystifies what it takes to deploy an ML model to a serverless backend. Do not hesitate to respond in the comments with any feedback or questions.
This material is based upon work supported by the National Science Foundation under Grant Number 1937063.
Github Repository
Translated from: https://medium.com/carnegie-mellon-robotics-academy/going-serverless-for-your-ml-backend-with-sam-cli-5332912019ef