当前位置: 首页 > news >正文

在亚马逊云科技上安全、合规地创建AI大模型训练基础设施并开发AI应用服务

项目简介:

小李哥将继续每天介绍一个基于亚马逊云科技AWS云计算平台的全球前沿AI技术解决方案,帮助大家快速了解国际上最热门的云计算平台亚马逊云科技AWS AI最佳实践,并应用到自己的日常工作里。

本次介绍的是如何在亚马逊云科技利用Service Catalog服务创建和管理包含AI大模型的应用产品,并通过权限管理基于员工的身份职责限制所能访问的云资源,并创建SageMaker机器学习托管服务并在该服务上训练和部署大模型,通过VPC endpoint节点私密、安全的加载模型文件和模型容器镜像。本架构设计全部采用了云原生Serverless架构,提供可扩展和安全的AI解决方案。本方案的解决方案架构图如下:

方案所需基础知识 

什么是 Amazon SageMaker?

Amazon SageMaker 是亚马逊云科技提供的一站式机器学习服务,旨在帮助开发者和数据科学家轻松构建、训练和部署机器学习模型。SageMaker 提供了从数据准备、模型训练到模型部署的全流程工具,使用户能够高效地在云端实现机器学习项目。

什么是亚马逊云科技 Service Catalog?

亚马逊云科技 Service Catalog 是一项服务,旨在帮助企业创建、管理和分发经过批准的云服务集合。通过 Service Catalog,企业可以集中管理已批准的资源和配置,确保开发团队在使用云服务时遵循组织的最佳实践和合规要求。用户可以从预定义的产品目录中选择所需的服务,简化了资源部署的过程,并减少了因配置错误导致的风险。

利用 SageMaker 构建 AI 服务的安全合规好处

符合企业合规性要求

使用 SageMaker 构建 AI 服务时,可以通过 Service Catalog 预先定义和管理符合公司合规标准的配置模板,确保所有的 AI 模型和资源部署都遵循组织的安全政策和行业法规,如 GDPR 或 HIPAA。

数据安全性

SageMaker 提供了端到端的数据加密选项,包括在数据存储和传输中的加密,确保敏感数据在整个 AI 模型生命周期中的安全性。同时可以利用VPC endpoint节点,私密安全的访问S3中的数据,加载ECR镜像库中保存的AI模型镜像容器。

访问控制和监控

通过与亚马逊云科技的身份和访问管理(IAM)集成,可以细粒度地控制谁可以访问和操作 SageMaker 中的资源。再结合 CloudTrail 和 CloudWatch 等监控工具,企业可以实时跟踪和审计所有的操作,确保透明度和安全性。

本方案包括的内容

1. 通过VPC Endpoint节点,私有访问S3中的模型文件

2. 创建亚马逊云科技Service Catalog资源组,统一创建、管理用户的云服务产品。

3. 作为Service Catalog的使用用户创建一个SageMaker机器学习训练计算实例

项目搭建具体步骤:

1. 登录亚马逊云科技控制台,进入无服务器计算服务Lambda,创建一个Lambda函数“SageMakerBuild”,复制以下代码,用于创建SageMaker Jupyter Notebook,训练AI大模型。

import json
import boto3
import requests
import botocore
import time
import base64## Request Status ##
global ReqStatusdef CFTFailedResponse(event, status, message):print("Inside CFTFailedResponse")responseBody = {'Status': status,'Reason': message,'PhysicalResourceId': event['ServiceToken'],'StackId': event['StackId'],'RequestId': event['RequestId'],'LogicalResourceId': event['LogicalResourceId']}headers={'content-type':'','content-length':str(len(json.dumps(responseBody)))	 }	print('Response = ' + json.dumps(responseBody))try:	req=requests.put(event['ResponseURL'], data=json.dumps(responseBody),headers=headers)print("delete_respond_cloudformation res "+str(req))		except Exception as e:print("Failed to send cf response {}".format(e))def CFTSuccessResponse(event, status, data=None):responseBody = {'Status': status,'Reason': 'See the details in CloudWatch Log Stream','PhysicalResourceId': event['ServiceToken'],'StackId': event['StackId'],'RequestId': event['RequestId'],'LogicalResourceId': event['LogicalResourceId'],'Data': data}headers={'content-type':'','content-length':str(len(json.dumps(responseBody)))	 }	print('Response = ' + json.dumps(responseBody))#print(event)try:	req=requests.put(event['ResponseURL'], data=json.dumps(responseBody),headers=headers)except Exception as e:print("Failed to send cf response {}".format(e))def lambda_handler(event, context):ReqStatus = "SUCCESS"print("Event:")print(event)client = boto3.client('sagemaker')ec2client = boto3.client('ec2')data = {}if event['RequestType'] == 'Create':try:## Value Intialization from CFT ##project_name = event['ResourceProperties']['ProjectName']kmsKeyId = event['ResourceProperties']['KmsKeyId']Tags = event['ResourceProperties']['Tags']env_name = event['ResourceProperties']['ENVName']subnet_name = event['ResourceProperties']['Subnet']security_group_name = event['ResourceProperties']['SecurityGroupName']input_dict = {}input_dict['NotebookInstanceName'] = event['ResourceProperties']['NotebookInstanceName']input_dict['InstanceType'] = event['ResourceProperties']['NotebookInstanceType']input_dict['Tags'] = event['ResourceProperties']['Tags']input_dict['DirectInternetAccess'] = event['ResourceProperties']['DirectInternetAccess']input_dict['RootAccess'] = event['ResourceProperties']['RootAccess']input_dict['VolumeSizeInGB'] = int(event['ResourceProperties']['VolumeSizeInGB'])input_dict['RoleArn'] = event['ResourceProperties']['RoleArn']input_dict['LifecycleConfigName'] = event['ResourceProperties']['LifecycleConfigName']except Exception as e:print(e)ReqStatus = "FAILED"message = "Parameter Error: "+str(e)CFTFailedResponse(event, "FAILED", message)if ReqStatus == "FAILED":return None;print("Validating Environment name: "+env_name)print("Subnet Id Fetching.....")try:## Sagemaker Subnet ##subnetName = env_name+"-ResourceSubnet"print(subnetName)response = ec2client.describe_subnets(Filters=[{'Name': 'tag:Name','Values': [subnet_name]},])#print(response)subnetId = response['Subnets'][0]['SubnetId']input_dict['SubnetId'] = subnetIdprint("Desc sg done!!")except Exception as e:print(e)ReqStatus = "FAILED"message = " Project Name is invalid - Subnet Error: "+str(e)CFTFailedResponse(event, "FAILED", message)if ReqStatus == "FAILED":return None;## Sagemaker Security group ##print("Security GroupId Fetching.....")try:sgName = env_name+"-ResourceSG"response = ec2client.describe_security_groups(Filters=[{'Name': 'tag:Name','Values': [security_group_name]},])sgId = response['SecurityGroups'][0]['GroupId']input_dict['SecurityGroupIds'] = [sgId]print("Desc sg done!!")except Exception as e:print(e)ReqStatus = "FAILED"message = "Security Group ID Error: "+str(e)CFTFailedResponse(event, "FAILED", message)if ReqStatus == "FAILED":return None;    try:if kmsKeyId:input_dict['KmsKeyId'] = kmsKeyIdelse:print("in else")print(input_dict)instance = client.create_notebook_instance(**input_dict)print('Sagemager CLI response')print(str(instance))responseData = {'NotebookInstanceArn': instance['NotebookInstanceArn']}NotebookStatus = 'Pending'response = client.describe_notebook_instance(NotebookInstanceName=event['ResourceProperties']['NotebookInstanceName'])NotebookStatus = response['NotebookInstanceStatus']print("NotebookStatus:"+NotebookStatus)## Notebook Failure ##if NotebookStatus == 'Failed':message = NotebookStatus+": "+response['FailureReason']+" :Notebook is not coming InService"CFTFailedResponse(event, "FAILED", message)else:while NotebookStatus == 'Pending':time.sleep(200)response = client.describe_notebook_instance(NotebookInstanceName=event['ResourceProperties']['NotebookInstanceName'])NotebookStatus = response['NotebookInstanceStatus']print("NotebookStatus in loop:"+NotebookStatus)## Notebook Success ##if NotebookStatus == 'InService':data['Message'] = "SageMaker Notebook name - "+event['ResourceProperties']['NotebookInstanceName']+" created succesfully"print("message InService :",data['Message'])CFTSuccessResponse(event, "SUCCESS", data)else:message = NotebookStatus+": "+response['FailureReason']+" :Notebook is not coming InService"print("message :",message)CFTFailedResponse(event, "FAILED", message)except Exception as e:print(e)ReqStatus = "FAILED"CFTFailedResponse(event, "FAILED", str(e))if event['RequestType'] == 'Delete':NotebookStatus = Nonelifecycle_config = event['ResourceProperties']['LifecycleConfigName']NotebookName = event['ResourceProperties']['NotebookInstanceName']try:response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)NotebookStatus = response['NotebookInstanceStatus']print("Notebook Status - "+NotebookStatus)except Exception as e:print(e)NotebookStatus = "Invalid"#CFTFailedResponse(event, "FAILED", str(e))while NotebookStatus == 'Pending':time.sleep(30)response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)NotebookStatus = response['NotebookInstanceStatus']print("NotebookStatus:"+NotebookStatus)if NotebookStatus != 'Failed' and NotebookStatus != 'Invalid' :print("Delete request for Notebookk name: "+NotebookName)print("Stoping the Notebook.....")if NotebookStatus != 'Stopped':try:response = client.stop_notebook_instance(NotebookInstanceName=NotebookName)NotebookStatus = 'Stopping'print("Notebook Status - "+NotebookStatus)while NotebookStatus == 'Stopping':time.sleep(30)response = client.describe_notebook_instance(NotebookInstanceName=NotebookName)NotebookStatus = response['NotebookInstanceStatus']print("NotebookStatus:"+NotebookStatus)except Exception as e:print(e)NotebookStatus = "Invalid"CFTFailedResponse(event, "FAILED", str(e))else:NotebookStatus = 'Stopped'print("NotebookStatus:"+NotebookStatus)if NotebookStatus != 'Invalid':print("Deleting The Notebook......")time.sleep(5)try:response = client.delete_notebook_instance(NotebookInstanceName=NotebookName)print("Notebook Deleted")data["Message"] = "Notebook Deleted"CFTSuccessResponse(event, "SUCCESS", data)except Exception as e:print(e)CFTFailedResponse(event, "FAILED", str(e))else:print("Notebook Invalid status")data["Message"] = "Notebook is not available"CFTSuccessResponse(event, "SUCCESS", data)if event['RequestType'] == 'Update':print("Update operation for Sagemaker Notebook is not recommended")data["Message"] = "Update operation for Sagemaker Notebook is not recommended"CFTSuccessResponse(event, "SUCCESS", data)

2. 接下来我们创建一个yaml脚本,复制以下代码,上传到S3桶中,用于通过CloudFormation,以IaC的形式创建SageMaker Jupyter Notebook。

AWSTemplateFormatVersion: 2010-09-09
Description: Template to create a SageMaker notebook
Metadata:'AWS::CloudFormation::Interface':ParameterGroups:- Label:default: Environment detailParameters:- ENVName- Label:default: SageMaker Notebook configurationParameters:- NotebookInstanceName- NotebookInstanceType- DirectInternetAccess- RootAccess- VolumeSizeInGB- Label:default: Load S3 Bucket to SageMakerParameters:- S3CodePusher- CodeBucketName- Label:default: Project detailParameters:- ProjectName- ProjectIDParameterLabels:DirectInternetAccess:default: Default Internet AccessNotebookInstanceName:default: Notebook Instance NameNotebookInstanceType:default: Notebook Instance TypeENVName:default: Environment NameProjectName:default: Project SuffixRootAccess:default: Root accessVolumeSizeInGB:default: Volume size for the SageMaker NotebookProjectID:default: SageMaker ProjectIDCodeBucketName:default: Code Bucket Name        S3CodePusher:default: Copy code from S3 to SageMaker
Parameters:SubnetName:Default: ProSM-ResourceSubnetDescription: Subnet Random StringType: StringSecurityGroupName:Default: ProSM-ResourceSGDescription: Security Group NameType: StringSageMakerBuildFunctionARN:Description: Service Token Value passed from Lambda StackType: StringNotebookInstanceName:AllowedPattern: '[A-Za-z0-9-]{1,63}'ConstraintDescription: >-Maximum of 63 alphanumeric characters. Can include hyphens (-), but notspaces. Must be unique within your account in an AWS Region.Description: SageMaker Notebook instance nameMaxLength: '63'MinLength: '1'Type: StringNotebookInstanceType:ConstraintDescription: Must select a valid notebook instance type.Default: ml.t3.mediumDescription: Select Instance type for the SageMaker NotebookType: StringENVName:Description: SageMaker infrastructure naming conventionType: StringProjectName:Description: >-The suffix appended to all resources in the stack.  This will allowmultiple copies of the same stack to be created in the same account.Type: StringRootAccess:Description: Root access for the SageMaker Notebook userAllowedValues:- Enabled- DisabledDefault: EnabledType: StringVolumeSizeInGB:Description: >-The size, in GB, of the ML storage volume to attach to the notebookinstance. The default value is 5 GB.Type: NumberDefault: '20'DirectInternetAccess:Description: >-If you set this to Disabled this notebook instance will be able to accessresources only in your VPC. As per the Project requirement, we haveDisabled it.Type: StringDefault: DisabledAllowedValues:- DisabledConstraintDescription: Must select a valid notebook instance type.ProjectID:Type: StringDescription: Enter a valid ProjectID.Default: QuickStart007S3CodePusher:Description: Do you want to load the code from S3 to SageMaker NotebookDefault: 'NO'AllowedValues:- 'YES'- 'NO'Type: StringCodeBucketName:Description: S3 Bucket name from which you want to copy the code to SageMaker.Default: lab-materials-bucket-1234Type: String    
Conditions:BucketCondition: !Equals - 'YES'- !Ref S3CodePusher
Resources:SagemakerKMSKey:Type: 'AWS::KMS::Key'Properties:EnableKeyRotation: trueTags:- Key: ProjectIDValue: !Ref ProjectID- Key: ProjectNameValue: !Ref ProjectNameKeyPolicy:Version: '2012-10-17'Statement:- Effect: AllowPrincipal:AWS: !Sub 'arn:aws:iam::${AWS::AccountId}:root'Action: - 'kms:Encrypt'- 'kms:PutKeyPolicy' - 'kms:CreateKey' - 'kms:GetKeyRotationStatus' - 'kms:DeleteImportedKeyMaterial' - 'kms:GetKeyPolicy' - 'kms:UpdateCustomKeyStore' - 'kms:GenerateRandom' - 'kms:UpdateAlias'- 'kms:ImportKeyMaterial'- 'kms:ListRetirableGrants' - 'kms:CreateGrant' - 'kms:DeleteAlias'- 'kms:RetireGrant'- 'kms:ScheduleKeyDeletion' - 'kms:DisableKeyRotation' - 'kms:TagResource' - 'kms:CreateAlias' - 'kms:EnableKeyRotation' - 'kms:DisableKey'- 'kms:ListResourceTags'- 'kms:Verify' - 'kms:DeleteCustomKeyStore'- 'kms:Sign' - 'kms:ListKeys'- 'kms:ListGrants'- 'kms:ListAliases' - 'kms:ReEncryptTo' - 'kms:UntagResource' - 'kms:GetParametersForImport'- 'kms:ListKeyPolicies'- 'kms:GenerateDataKeyPair'- 'kms:GenerateDataKeyPairWithoutPlaintext' - 'kms:GetPublicKey' - 'kms:Decrypt' - 'kms:ReEncryptFrom'- 'kms:DisconnectCustomKeyStore' - 'kms:DescribeKey'- 'kms:GenerateDataKeyWithoutPlaintext'- 'kms:DescribeCustomKeyStores' - 'kms:CreateCustomKeyStore'- 'kms:EnableKey'- 'kms:RevokeGrant'- 'kms:UpdateKeyDescription' - 'kms:ConnectCustomKeyStore' - 'kms:CancelKeyDeletion' - 'kms:GenerateDataKey'Resource:- !Join - ''- - 'arn:aws:kms:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':key/*'- Sid: Allow access for Key AdministratorsEffect: AllowPrincipal:AWS: - !GetAtt SageMakerExecutionRole.ArnAction:- 'kms:CreateAlias'- 'kms:CreateKey'- 'kms:CreateGrant' - 'kms:CreateCustomKeyStore'- 'kms:DescribeKey'- 'kms:DescribeCustomKeyStores'- 'kms:EnableKey'- 'kms:EnableKeyRotation'- 'kms:ListKeys'- 'kms:ListAliases'- 'kms:ListKeyPolicies'- 'kms:ListGrants'- 'kms:ListRetirableGrants'- 'kms:ListResourceTags'- 'kms:PutKeyPolicy'- 'kms:UpdateAlias'- 'kms:UpdateKeyDescription'- 'kms:UpdateCustomKeyStore'- 'kms:RevokeGrant'- 'kms:DisableKey'- 'kms:DisableKeyRotation'- 'kms:GetPublicKey'- 'kms:GetKeyRotationStatus'- 'kms:GetKeyPolicy'- 'kms:GetParametersForImport'- 'kms:DeleteCustomKeyStore'- 'kms:DeleteImportedKeyMaterial'- 'kms:DeleteAlias'- 'kms:TagResource'- 'kms:UntagResource'- 'kms:ScheduleKeyDeletion'- 'kms:CancelKeyDeletion'Resource:- !Join - ''- - 'arn:aws:kms:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':key/*'- Sid: Allow use of the keyEffect: AllowPrincipal:AWS: - !GetAtt SageMakerExecutionRole.ArnAction:- kms:Encrypt- kms:Decrypt- kms:ReEncryptTo- kms:ReEncryptFrom- kms:GenerateDataKeyPair- kms:GenerateDataKeyPairWithoutPlaintext- kms:GenerateDataKeyWithoutPlaintext- kms:GenerateDataKey- kms:DescribeKeyResource:- !Join - ''- - 'arn:aws:kms:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':key/*'- Sid: Allow attachment of persistent resourcesEffect: AllowPrincipal:AWS: - !GetAtt SageMakerExecutionRole.ArnAction:- kms:CreateGrant- kms:ListGrants- kms:RevokeGrantResource:- !Join - ''- - 'arn:aws:kms:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':key/*'Condition:Bool:kms:GrantIsForAWSResource: 'true'KeyAlias:Type: AWS::KMS::AliasProperties:AliasName: 'alias/SageMaker-CMK-DS'TargetKeyId:Ref: SagemakerKMSKeySageMakerExecutionRole:Type: 'AWS::IAM::Role'Properties:Tags:- Key: ProjectIDValue: !Ref ProjectID- Key: ProjectNameValue: !Ref ProjectNameAssumeRolePolicyDocument:Statement:- Effect: AllowPrincipal:Service:- sagemaker.amazonaws.comAction:- 'sts:AssumeRole'Path: /Policies:- PolicyName: !Join - ''- - !Ref ProjectName- SageMakerExecutionPolicyPolicyDocument:Version: 2012-10-17Statement:- Effect: AllowAction:- 'iam:ListRoles'Resource:- !Join - ''- - 'arn:aws:iam::'- !Ref 'AWS::AccountId'- ':role/*'- Sid: CloudArnResourceEffect: AllowAction:- 'application-autoscaling:DeleteScalingPolicy'- 'application-autoscaling:DeleteScheduledAction'- 'application-autoscaling:DeregisterScalableTarget'- 'application-autoscaling:DescribeScalableTargets'- 'application-autoscaling:DescribeScalingActivities'- 'application-autoscaling:DescribeScalingPolicies'- 'application-autoscaling:DescribeScheduledActions'- 'application-autoscaling:PutScalingPolicy'- 'application-autoscaling:PutScheduledAction'- 'application-autoscaling:RegisterScalableTarget'Resource:- !Join - ''- - 'arn:aws:autoscaling:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':*'- Sid: ElasticArnResourceEffect: AllowAction:- 'elastic-inference:Connect'Resource:- !Join - ''- - 'arn:aws:elastic-inference:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':elastic-inference-accelerator/*'  - Sid: SNSArnResourceEffect: AllowAction:- 'sns:ListTopics'Resource:- !Join - ''- - 'arn:aws:sns:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':*'- Sid: logsArnResourceEffect: AllowAction:- 'cloudwatch:DeleteAlarms'- 'cloudwatch:DescribeAlarms'- 'cloudwatch:GetMetricData'- 'cloudwatch:GetMetricStatistics'- 'cloudwatch:ListMetrics'- 'cloudwatch:PutMetricAlarm'- 'cloudwatch:PutMetricData'- 'logs:CreateLogGroup'- 'logs:CreateLogStream'- 'logs:DescribeLogStreams'- 'logs:GetLogEvents'- 'logs:PutLogEvents'Resource:- !Join - ''- - 'arn:aws:logs:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':log-group:/aws/lambda/*'- Sid: KmsArnResourceEffect: AllowAction:- 'kms:DescribeKey'- 'kms:ListAliases'Resource:- !Join - ''- - 'arn:aws:kms:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':key/*'- Sid: ECRArnResourceEffect: AllowAction:- 'ecr:BatchCheckLayerAvailability'- 'ecr:BatchGetImage'- 'ecr:CreateRepository'- 'ecr:GetAuthorizationToken'- 'ecr:GetDownloadUrlForLayer'- 'ecr:DescribeRepositories'- 'ecr:DescribeImageScanFindings'- 'ecr:DescribeRegistry'- 'ecr:DescribeImages'Resource:- !Join - ''- - 'arn:aws:ecr:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':repository/*'- Sid: EC2ArnResourceEffect: AllowAction:        - 'ec2:CreateNetworkInterface'- 'ec2:CreateNetworkInterfacePermission'- 'ec2:DeleteNetworkInterface'- 'ec2:DeleteNetworkInterfacePermission'- 'ec2:DescribeDhcpOptions'- 'ec2:DescribeNetworkInterfaces'- 'ec2:DescribeRouteTables'- 'ec2:DescribeSecurityGroups'- 'ec2:DescribeSubnets'- 'ec2:DescribeVpcEndpoints'- 'ec2:DescribeVpcs'Resource:- !Join - ''- - 'arn:aws:ec2:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':instance/*'- Sid: S3ArnResourceEffect: AllowAction: - 's3:CreateBucket'- 's3:GetBucketLocation'- 's3:ListBucket'       Resource:- !Join - ''- - 'arn:aws:s3::'- ':*sagemaker*'                  - Sid: LambdaInvokePermissionEffect: AllowAction:- 'lambda:ListFunctions'Resource:- !Join - ''- - 'arn:aws:lambda:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':function'- ':*'- Effect: AllowAction: 'sagemaker:InvokeEndpoint'Resource:- !Join - ''- - 'arn:aws:sagemaker:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':notebook-instance-lifecycle-config/*'Condition:StringEquals:'aws:PrincipalTag/ProjectID': !Ref ProjectID- Effect: AllowAction:- 'sagemaker:CreateTrainingJob'- 'sagemaker:CreateEndpoint'- 'sagemaker:CreateModel'- 'sagemaker:CreateEndpointConfig'- 'sagemaker:CreateHyperParameterTuningJob'- 'sagemaker:CreateTransformJob'Resource:- !Join - ''- - 'arn:aws:sagemaker:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':notebook-instance-lifecycle-config/*'Condition:StringEquals:'aws:PrincipalTag/ProjectID': !Ref ProjectID'ForAllValues:StringEquals':'aws:TagKeys':- Username- Effect: AllowAction:- 'sagemaker:DescribeTrainingJob'- 'sagemaker:DescribeEndpoint'- 'sagemaker:DescribeEndpointConfig'Resource:- !Join - ''- - 'arn:aws:sagemaker:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':notebook-instance-lifecycle-config/*'Condition:StringEquals:'aws:PrincipalTag/ProjectID': !Ref ProjectID- Effect: AllowAction:- 'sagemaker:DeleteTags'- 'sagemaker:ListTags'- 'sagemaker:DescribeNotebookInstance'- 'sagemaker:ListNotebookInstanceLifecycleConfigs'- 'sagemaker:DescribeModel'- 'sagemaker:ListTrainingJobs'- 'sagemaker:DescribeHyperParameterTuningJob'- 'sagemaker:UpdateEndpointWeightsAndCapacities'- 'sagemaker:ListHyperParameterTuningJobs'- 'sagemaker:ListEndpointConfigs'- 'sagemaker:DescribeNotebookInstanceLifecycleConfig'- 'sagemaker:ListTrainingJobsForHyperParameterTuningJob'- 'sagemaker:StopHyperParameterTuningJob'- 'sagemaker:DescribeEndpointConfig'- 'sagemaker:ListModels'- 'sagemaker:AddTags'- 'sagemaker:ListNotebookInstances'- 'sagemaker:StopTrainingJob'- 'sagemaker:ListEndpoints'- 'sagemaker:DeleteEndpoint'Resource:- !Join - ''- - 'arn:aws:sagemaker:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':notebook-instance-lifecycle-config/*'Condition:StringEquals:'aws:PrincipalTag/ProjectID': !Ref ProjectID- Effect: AllowAction:- 'ecr:SetRepositoryPolicy'- 'ecr:CompleteLayerUpload'- 'ecr:BatchDeleteImage'- 'ecr:UploadLayerPart'- 'ecr:DeleteRepositoryPolicy'- 'ecr:InitiateLayerUpload'- 'ecr:DeleteRepository'- 'ecr:PutImage'Resource: - !Join - ''- - 'arn:aws:ecr:'- !Ref 'AWS::Region'- ':'- !Ref 'AWS::AccountId'- ':repository/*sagemaker*'- Effect: AllowAction:- 's3:GetObject'- 's3:ListBucket'- 's3:PutObject'- 's3:DeleteObject'Resource:- !Join - ''- - 'arn:aws:s3:::'- !Ref SagemakerS3Bucket- !Join - ''- - 'arn:aws:s3:::'- !Ref SagemakerS3Bucket- /*Condition:StringEquals:'aws:PrincipalTag/ProjectID': !Ref ProjectID- Effect: AllowAction: 'iam:PassRole'Resource:- !Join - ''- - 'arn:aws:iam::'- !Ref 'AWS::AccountId'- ':role/*'Condition:StringEquals:'iam:PassedToService': sagemaker.amazonaws.comCodeBucketPolicy:Type: 'AWS::IAM::Policy'Condition: BucketConditionProperties:PolicyName: !Join - ''- - !Ref ProjectName- CodeBucketPolicyPolicyDocument:Version: 2012-10-17Statement:- Effect: AllowAction:- 's3:GetObject'Resource:- !Join - ''- - 'arn:aws:s3:::'- !Ref CodeBucketName- !Join - ''- - 'arn:aws:s3:::'- !Ref CodeBucketName- '/*'Roles:- !Ref SageMakerExecutionRoleSagemakerS3Bucket:Type: 'AWS::S3::Bucket'Properties:BucketEncryption:ServerSideEncryptionConfiguration:- ServerSideEncryptionByDefault:SSEAlgorithm: AES256Tags:- Key: ProjectIDValue: !Ref ProjectID- Key: ProjectNameValue: !Ref ProjectNameS3Policy:Type: 'AWS::S3::BucketPolicy'Properties:Bucket: !Ref SagemakerS3BucketPolicyDocument:Version: 2012-10-17Statement:- Sid: AllowAccessFromVPCEndpointEffect: AllowPrincipal: "*"Action:- 's3:Get*'- 's3:Put*'- 's3:List*'- 's3:DeleteObject'Resource:- !Join - ''- - 'arn:aws:s3:::'- !Ref SagemakerS3Bucket- !Join - ''- - 'arn:aws:s3:::'- !Ref SagemakerS3Bucket- '/*'Condition:StringEquals:"aws:sourceVpce": "<PASTE S3 VPC ENDPOINT ID>"EFSLifecycleConfig:Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'Properties:NotebookInstanceLifecycleConfigName: 'Provisioned-LC'OnCreate:- Content: !Base64 'Fn::Join':- ''- - |#!/bin/bash - |aws configure set sts_regional_endpoints regional - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/configOnStart:- Content: !Base64 'Fn::Join':- ''- - |#!/bin/bash  - |aws configure set sts_regional_endpoints regional - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config  EFSLifecycleConfigForS3:Type: 'AWS::SageMaker::NotebookInstanceLifecycleConfig'Properties:NotebookInstanceLifecycleConfigName: 'Provisioned-LC-S3'OnCreate:- Content: !Base64 'Fn::Join':- ''- - |#!/bin/bash - |# Copy Content- !Sub >aws s3 cp s3://${CodeBucketName} /home/ec2-user/SageMaker/ --recursive - |# Set sts endpoint- >aws configure set sts_regional_endpoints regional - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/configOnStart:- Content: !Base64 'Fn::Join':- ''- - |#!/bin/bash  - |aws configure set sts_regional_endpoints regional - yes | cp -rf ~/.aws/config /home/ec2-user/.aws/config  SageMakerCustomResource:Type: 'Custom::SageMakerCustomResource'DependsOn: S3PolicyProperties:ServiceToken: !Ref SageMakerBuildFunctionARNNotebookInstanceName: !Ref NotebookInstanceNameNotebookInstanceType: !Ref NotebookInstanceTypeKmsKeyId: !Ref SagemakerKMSKeyENVName: !Join - ''- - !Ref ENVName- !Sub Subnet1IdSubnet: !Ref SubnetNameSecurityGroupName: !Ref SecurityGroupNameProjectName: !Ref ProjectNameRootAccess: !Ref RootAccessVolumeSizeInGB: !Ref VolumeSizeInGBLifecycleConfigName: !If [BucketCondition, !GetAtt EFSLifecycleConfigForS3.NotebookInstanceLifecycleConfigName, !GetAtt EFSLifecycleConfig.NotebookInstanceLifecycleConfigName]  DirectInternetAccess: !Ref DirectInternetAccessRoleArn: !GetAtt - SageMakerExecutionRole- ArnTags:- Key: ProjectIDValue: !Ref ProjectID- Key: ProjectNameValue: !Ref ProjectName
Outputs:Message:Description: Execution StatusValue: !GetAtt - SageMakerCustomResource- MessageSagemakerKMSKey:Description: KMS Key for encrypting Sagemaker resourceValue: !Ref KeyAliasExecutionRoleArn:Description: ARN of the Sagemaker Execution RoleValue: !Ref SageMakerExecutionRoleS3BucketName:Description: S3 bucket for SageMaker Notebook operationValue: !Ref SagemakerS3BucketNotebookInstanceName:Description: Name of the Sagemaker Notebook instance createdValue: !Ref NotebookInstanceNameProjectName:Description: Project ID used for SageMaker deploymentValue: !Ref ProjectNameProjectID:Description: Project ID used for SageMaker deploymentValue: !Ref ProjectID

3. 接下来我们进入VPC服务主页,进入Endpoint功能,点击Create endpoint创建一个VPC endpoint节点,用于SageMaker私密安全的访问S3桶中的大模型文件。

4. 为节点命名为“s3-endpoint”,并选择节点访问对象类型为AWS service,选择s3作为访问服务。

5. 选择节点所在的VPC,并配置路由表,最后点击创建。

6. 接下来我们进入亚马逊云科技service catalog服务主页,进入Portfolio功能,点击create创建一个新的portfolio,用于统一管理一整个包括不同云资源的服务。

7. 为service portfolio起名“SageMakerPortfolio“,所有者选为CQ。

8. 接下来我们为Portfolio添加云资源,点击"create product"

9. 我们选择通过CloudFormation IaC脚本的形式创建Product云资源,为Product其名为”SageMakerProduct“,所有者设置为CQ。

10. 在Product中添加CloudFormation脚本文件,我们通过URL的形式,将我们在第二步上传到S3中的CloudFormation脚本URL填入,并设置版本为1,最后点击Create创建Product云资源。

11.接下来我们进入到Constraints页面,点击create创建Constraints,用于通过权限管理限制利用Service Catalog Product对云资源的操作。

12. 选择限制我们刚刚创建的的Product: "SageMakerProduct",选择限制的类型为创建。

13. 为限制添加IAM角色规则,IAM角色中配置了对Product权限管理规则,再点击Create创建。

14. 接下来我们点击Access,创建一个Access来限制可以访问Product云资源的用户。

15. 我们添加了角色”SCEndUserRole“,用户代替用户访问Product创建云资源。

16. 接下来我们开始利用Service Catalog Product创建一些列的云资源。选中我们刚创建的Product,点击Launch

17. 为我们要创建的云资源Product起一个名字”DataScientistProduct“, 选择我们前一步创建的版本号1。

18. 为将要通过Product创建的SageMaker配置参数,环境名以及实例名

19. 添加我们在最开始创建的Lambda函数ARN ID,点击Launch开始创建。

20. 最后回到SageMaker服务主页,可以看到我们利用Service Catalog Product功能成功创建了一个新的Jupyter Notebook实例。利用这个实例,我们就可以开发我们的AI服务应用。

以上就是在亚马逊云科技上利用亚马逊云科技安全、合规地训练AI大模型和开发AI应用全部步骤。欢迎大家未来与我一起,未来获取更多国际前沿的生成式AI开发方案。

相关文章:

  • 北京网站建设多少钱?
  • 辽宁网页制作哪家好_网站建设
  • 高端品牌网站建设_汉中网站制作
  • 传输层安全性 ——TLS(Transport Layer Security)简介
  • Web Image scr图片从后端API获取基本实现
  • 【源码+论文】基于SpringBoot的网上订餐系统
  • ThinkPHP5漏洞分析之文件包含
  • UE开发中的设计模式(三) —— 对象池模式
  • duilib 之 窗口枚举显示实例 一
  • 高阶数据结构(Java):AVL树插入机制的探索
  • spring boot-18
  • 前端工程化16-什么是节流防抖
  • C#进阶-ASP.NET实现可以缩放和旋转的图片预览页
  • 《Linux运维总结:基于x86_64架构CPU使用docker-compose一键离线部署etcd 3.5.15容器版分布式集群》
  • Python办公自动化:使用openpyxl对工作表进行基本操作
  • 【性能优化】DNS解析优化
  • Python,Spire.Doc模块,处理word、docx文件,极致丝滑
  • 数据结构-排序的概念、应用及其算法实现1(直接插入排序、希尔排序、选择排序、堆排序、冒泡排序)
  • 【Leetcode】101. 对称二叉树
  • @angular/forms 源码解析之双向绑定
  • [js高手之路]搞清楚面向对象,必须要理解对象在创建过程中的内存表示
  • 【笔记】你不知道的JS读书笔记——Promise
  • CNN 在图像分割中的简史:从 R-CNN 到 Mask R-CNN
  • CSS 提示工具(Tooltip)
  • CSS3 聊天气泡框以及 inherit、currentColor 关键字
  • dva中组件的懒加载
  • HTTP中GET与POST的区别 99%的错误认识
  • JAVA SE 6 GC调优笔记
  • Java到底能干嘛?
  • Linux学习笔记6-使用fdisk进行磁盘管理
  • Redux 中间件分析
  • Sass 快速入门教程
  • Spark in action on Kubernetes - Playground搭建与架构浅析
  • Spring框架之我见(三)——IOC、AOP
  • Vue源码解析(二)Vue的双向绑定讲解及实现
  • windows下使用nginx调试简介
  • 回顾 Swift 多平台移植进度 #2
  • 基于HAProxy的高性能缓存服务器nuster
  • 看完九篇字体系列的文章,你还觉得我是在说字体?
  • 前端学习笔记之原型——一张图说明`prototype`和`__proto__`的区别
  • 微信小程序--------语音识别(前端自己也能玩)
  • 小程序测试方案初探
  • 写给高年级小学生看的《Bash 指南》
  • mysql面试题分组并合并列
  • Redis4.x新特性 -- 萌萌的MEMORY DOCTOR
  • # 达梦数据库知识点
  • #100天计划# 2013年9月29日
  • #基础#使用Jupyter进行Notebook的转换 .ipynb文件导出为.md文件
  • (1)Jupyter Notebook 下载及安装
  • (笔试题)分解质因式
  • (二)linux使用docker容器运行mysql
  • (没学懂,待填坑)【动态规划】数位动态规划
  • (万字长文)Spring的核心知识尽揽其中
  • .bat批处理(八):各种形式的变量%0、%i、%%i、var、%var%、!var!的含义和区别
  • .NET 8.0 发布到 IIS
  • .NET CORE 第一节 创建基本的 asp.net core
  • .NET/C# 如何获取当前进程的 CPU 和内存占用?如何获取全局 CPU 和内存占用?
  • .net程序集学习心得