Before you can begin creating templates for deployment by the Project Administration team, you will need a shared services VPC to host a Python package mirror (PyPI) for use by data science teams. The mirror will host a collection of approved Python packages. This workshop does not cover shared services VPCs or PyPI mirrors in detail; both are assumed to be common practice among AWS customers. After you have created the shared services VPC and PyPI mirror you will, as the Cloud Platform Engineering Team, create a Service Catalog Portfolio that project administrators can use to deploy data science environments in support of new projects.
This lab assumes that other recommended security practices, such as enabling AWS CloudTrail and capturing VPC Flow Logs, are already in place. The contents of this lab focus solely on controls and guard rails directly related to data science resources.
In this section you will get started by deploying a shared PyPI mirror for use by data science project teams. In addition to deploying the shared service, the template also creates IAM roles for use by AWS Service Catalog and by the project administrators who are responsible for creating data science project environments.
The shared PyPI mirror will be hosted in a shared services VPC and exposed to project environments using a PrivateLink-powered endpoint. The mirror will host approved Python packages and can be used by all internal Python applications, such as machine learning code running on SageMaker.
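To make the idea concrete, the sketch below shows one way a project environment might point pip at the mirror by rendering a `pip.conf` file. The hostname `pypi.example.internal` and the `/simple/` path are hypothetical placeholders, not values from this workshop; substitute the DNS name of the PrivateLink endpoint in your environment.

```python
import configparser
import io
from urllib.parse import urlparse


def render_pip_conf(index_url: str) -> str:
    """Return pip.conf text that makes pip use the given package index."""
    host = urlparse(index_url).hostname
    if host is None:
        raise ValueError(f"not a valid URL: {index_url!r}")
    # Interpolation is disabled so that '%' characters in a URL are kept as-is.
    config = configparser.ConfigParser(interpolation=None)
    config["global"] = {
        "index-url": index_url,
        # trusted-host is only needed when the mirror is served without valid TLS.
        "trusted-host": host,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Hypothetical endpoint address, used for illustration only.
MIRROR_URL = "http://pypi.example.internal/simple/"
print(render_pip_conf(MIRROR_URL))
```

Because `index-url` replaces the default index rather than adding to it, pip resolves packages only from the mirror, which is what keeps installs limited to the approved set.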
The resulting architecture will look like this:
As a cloud platform engineering team member, deploy the CloudFormation template linked below to provision a shared services VPC and IAM roles.
|Oregon (us-west-2)||Deploy to AWS Oregon|
|Ohio (us-east-2)||Deploy to AWS Ohio|
|N. Virginia (us-east-1)||Deploy to AWS N. Virginia|
|Ireland (eu-west-1)||Deploy to AWS Ireland|
|London (eu-west-2)||Deploy to AWS London|
|Sydney (ap-southeast-2)||Deploy to AWS Sydney|
Deployment should take around 5 minutes.
Under Capabilities, click the 2 check boxes indicating you understand that the CloudFormation template will create IAM resources.
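If you prefer to script the deployment rather than use the console, the check boxes correspond to the `Capabilities` argument of CloudFormation's `CreateStack` call. The following is a minimal sketch using boto3. The stack name and template URL are supplied by you, and the set of capabilities shown is an assumption; check which acknowledgements the template actually requires.

```python
def build_stack_request(stack_name: str, template_url: str) -> dict:
    """Build keyword arguments for CloudFormation's CreateStack call."""
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        # Assumed acknowledgements for a template that creates IAM resources;
        # adjust this list to match what the template needs.
        "Capabilities": ["CAPABILITY_IAM", "CAPABILITY_NAMED_IAM"],
    }


def deploy(stack_name: str, template_url: str, region: str) -> str:
    """Create the stack, wait for it to complete, and return the stack ID."""
    import boto3  # imported lazily so the request builder has no dependencies

    cfn = boto3.client("cloudformation", region_name=region)
    response = cfn.create_stack(**build_stack_request(stack_name, template_url))
    # Blocks until the stack reaches CREATE_COMPLETE or the waiter fails.
    cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)
    return response["StackId"]
```

Without the capability acknowledgements, CloudFormation rejects a template that creates IAM resources, which is the scripted equivalent of leaving the check boxes empty.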
With the shared services VPC online and available, you now need to provide the project administration team with a configured Service Catalog to provision data science project environments. To start, visit the AWS Service Catalog console and create a Portfolio. Grant the DataScienceAdmin role permissions to access the portfolio and then use the appropriate CloudFormation template linked below to create a Data Science Environment product. Ensure that the product has a constraint applied to it that uses the IAM role ServiceCatalogLaunchRole to launch the product upon request. This will give the Service Catalog service the permissions needed to create a Data Science Environment.
Service Catalog Product Templates by Region:
Data Science Project Portfolio
Cloud Operations Team
Groups, roles, and users tab
Add groups, roles, users
DataScienceAdmin into the search field
Upload new product
Data Science Environment
Cloud Operations Team
Use a CloudFormation template
CloudFormation template URL: enter the appropriate URL from the list below:
Actions drop down select
Add product to portfolio
Add Product to Portfolio
Portfolios on the left
Constraints tab in the portfolio detail page
Product drop down select your product
Select IAM role
IAM role drop down select
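The same portfolio setup can also be scripted. The sketch below is one possible boto3 equivalent of the console steps, not the workshop's own tooling. It assumes that "Data Science Project Portfolio" is the portfolio name and "Cloud Operations Team" is the provider and owner, that you pass in the ARNs of the DataScienceAdmin and ServiceCatalogLaunchRole roles, and that `template_url` is the product template URL for your region. The version label `v1` is a hypothetical placeholder.

```python
import json
import uuid


def create_data_science_portfolio(sc, admin_role_arn, launch_role_arn, template_url):
    """Create the portfolio, product, and launch constraint; return their IDs.

    `sc` is a Service Catalog client, e.g. boto3.client("servicecatalog").
    """
    portfolio_id = sc.create_portfolio(
        DisplayName="Data Science Project Portfolio",
        ProviderName="Cloud Operations Team",
        IdempotencyToken=str(uuid.uuid4()),
    )["PortfolioDetail"]["Id"]

    # Grant the project administrators access to the portfolio.
    sc.associate_principal_with_portfolio(
        PortfolioId=portfolio_id, PrincipalARN=admin_role_arn, PrincipalType="IAM"
    )

    product_id = sc.create_product(
        Name="Data Science Environment",
        Owner="Cloud Operations Team",
        ProductType="CLOUD_FORMATION_TEMPLATE",
        ProvisioningArtifactParameters={
            "Name": "v1",  # hypothetical version label
            "Info": {"LoadTemplateFromURL": template_url},
            "Type": "CLOUD_FORMATION_TEMPLATE",
        },
        IdempotencyToken=str(uuid.uuid4()),
    )["ProductViewDetail"]["ProductViewSummary"]["ProductId"]

    sc.associate_product_with_portfolio(ProductId=product_id, PortfolioId=portfolio_id)

    # The launch constraint tells Service Catalog which role to assume
    # when a project administrator launches the product.
    sc.create_constraint(
        PortfolioId=portfolio_id,
        ProductId=product_id,
        Type="LAUNCH",
        Parameters=json.dumps({"RoleArn": launch_role_arn}),
        IdempotencyToken=str(uuid.uuid4()),
    )
    return portfolio_id, product_id
```

Passing the client in as an argument keeps the function easy to exercise against a stub before running it against a real account.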
In addition to the Service Catalog Portfolio and product you have also created the following AWS resources to support the project administration team. Please take a moment and review these resources and their configuration.
AWS IAM roles
The IAM roles for the project administration team and the Service Catalog have been created. Visit the AWS IAM console and review the permissions granted to these two roles.
AWS Lambda Detective Control
An AWS Lambda function has been deployed and configured to execute whenever an Amazon SageMaker resource is deployed. The Lambda function will act as a detective control, inspecting launched resources to ensure that each resource is configured correctly. To inspect the Lambda function and its triggers visit the AWS Lambda console. Can you determine exactly what types of events will cause the Lambda function to execute?
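To illustrate the general shape of a detective control, here is a minimal sketch of a handler that inspects a CloudTrail-style event. It is not the function deployed by this lab. The event layout (an event whose `detail` holds a CloudTrail record), the choice of `CreateTrainingJob`, the `vpcConfig` key, and the rule that training jobs must be attached to a VPC are all assumptions made for illustration.

```python
def find_violations(event: dict) -> list:
    """Return human-readable findings for one CloudTrail-style event."""
    detail = event.get("detail", {})
    if detail.get("eventSource") != "sagemaker.amazonaws.com":
        return []  # not a SageMaker API call, nothing to inspect
    params = detail.get("requestParameters") or {}
    findings = []
    if detail.get("eventName") == "CreateTrainingJob":
        # Assumed policy: training jobs must be attached to a VPC.
        if not params.get("vpcConfig"):
            findings.append("training job launched without a VPC configuration")
    return findings


def lambda_handler(event, context):
    """Lambda entry point: log each finding and return them to the caller."""
    findings = find_violations(event)
    for finding in findings:
        print(f"VIOLATION: {finding}")
    return {"violations": findings}
```

A detective control of this kind only reports after the fact; it does not block the resource from being created, so a real implementation would typically follow up by notifying someone or stopping the offending resource.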
Parameters added to Parameter Store
A collection of parameters has been added to Parameter Store. Can you see what parameters have been added? How would you use these values?
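One way to explore the parameters outside the console is to list everything under a path prefix. The sketch below uses the Systems Manager `get_parameters_by_path` call and follows `NextToken` until all pages are read. The path you pass in is up to you; this lab does not state the parameter names, so none are assumed here.

```python
def read_parameters(ssm, path: str) -> dict:
    """Return {name: value} for every parameter stored under `path`.

    `ssm` is a Systems Manager client, e.g. boto3.client("ssm").
    """
    values = {}
    kwargs = {"Path": path, "Recursive": True}
    while True:
        page = ssm.get_parameters_by_path(**kwargs)
        for parameter in page["Parameters"]:
            values[parameter["Name"]] = parameter["Value"]
        token = page.get("NextToken")
        if not token:
            return values
        # Request the next page of results.
        kwargs["NextToken"] = token
```

Templates and scripts can read shared values this way instead of hard-coding them.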
Shared Services VPC
The template has created a VPC that will house our shared applications. Visit the VPC console. Which services are accessible from within the VPC?
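If you would rather answer this from a script, VPC endpoints are one place to look. The following sketch lists the service names of the endpoints attached to a VPC using the EC2 `describe_vpc_endpoints` call. The VPC ID is a value you look up yourself, and for brevity the sketch reads only the first page of results.

```python
def endpoint_services(ec2, vpc_id: str) -> list:
    """Return the sorted service names of the VPC endpoints in one VPC.

    `ec2` is an EC2 client, e.g. boto3.client("ec2").
    """
    response = ec2.describe_vpc_endpoints(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
    return sorted(endpoint["ServiceName"] for endpoint in response["VpcEndpoints"])
```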
PyPI Mirror Service
A Python package mirror has been deployed as a containerised service in the Shared Services VPC. This service is running on an Amazon Elastic Container Service (ECS) cluster using AWS Fargate, which means there are no Amazon EC2 servers for you to manage. Visit the ECS console to check whether the service is up and running. You can also view the task logs from the container through the ECS console to check its status.
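The same health check can be done programmatically. The sketch below calls the ECS `describe_services` API and treats the service as healthy when it is `ACTIVE` and running at least as many tasks as desired. The cluster and service names are not given in this lab, so look them up in the ECS console and pass them in.

```python
def service_is_healthy(ecs, cluster: str, service: str) -> bool:
    """Return True if the ECS service is active and running its desired tasks.

    `ecs` is an ECS client, e.g. boto3.client("ecs").
    """
    response = ecs.describe_services(cluster=cluster, services=[service])
    services = response["services"]
    if not services:
        return False  # the service was not found in this cluster
    svc = services[0]
    return svc["status"] == "ACTIVE" and svc["runningCount"] >= svc["desiredCount"] > 0
```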
With these resources created you can now move on to Lab 2 where you will, as a project administrator, deploy a secure data science environment for a new project team.