The cloud platform engineering team has now minted a new secure environment for us to begin supporting data science teams. To get started, as the Data Science Administrator, use AWS Service Catalog to create an IAM role for a new data science team. This IAM role will be used as the execution role for all SageMaker notebooks created by the data scientists on the team. The execution role controls which AWS resources, such as S3 buckets, a SageMaker notebook can access. We will also use AWS CloudFormation to create the rest of the team resources, including an AWS Service Catalog portfolio. The portfolio will contain a SageMaker Notebook product that allows data scientists to self-service and deploy their own secure SageMaker notebooks.
Assume the role of the Data Science Administrator and use Service Catalog to launch the SageMakerNotebookExeRole product. After the product has launched, create a Service Catalog portfolio specifically for the data science team.
In the Account field, enter your 12-digit AWS account ID. You can find it on the My Account page.
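If you switch roles often, a console Switch Role link saves retyping the account ID. The snippet below is a minimal sketch that checks the 12-digit format and builds such a link. The role name passed in is an assumption; use whatever name your administrator role was given in Lab 1.

```python
import re
from urllib.parse import urlencode


def switch_role_url(account_id: str, role_name: str) -> str:
    """Build a console Switch Role link for the given account and role."""
    # AWS account IDs are exactly 12 decimal digits.
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("AWS account ID must be exactly 12 digits")
    query = urlencode({"account": account_id, "roleName": role_name})
    return f"https://signin.aws.amazon.com/switchrole?{query}"
```

For example, `switch_role_url("123456789012", "MyAdminRole")` returns a link you can paste into the browser; both values here are placeholders.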
Open the menu for the SageMakerNotebookExeRole product and choose to launch it.
Give the product a name such as ml-product-team-nb-role (LOWER CASE ONLY), click Next, and then enter a unique TeamName such as team-<PRODUCT NAME> or a similar value of your choosing.
Click Next on the next 3 screens and then launch the product.
You will land on a Provisioned Product page and can periodically click the Refresh button to see the status of the product deployment.
Once the Status for the product shows Succeeded you can move on to the next step.
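The console steps above can also be scripted. The following is a rough sketch, assuming a boto3 Service Catalog client created with `boto3.client("servicecatalog")` is passed in as `client`. The product version name `v1` is a placeholder, since this lab does not state which version is published, and the record status `SUCCEEDED` is assumed to correspond to the Succeeded status shown in the console.

```python
import time


def build_provision_request(product_name, artifact_name, provisioned_name, team_name):
    """Assemble keyword arguments for the provision_product call."""
    if provisioned_name != provisioned_name.lower():
        raise ValueError("provisioned product name must be lower case")
    return {
        "ProductName": product_name,
        "ProvisioningArtifactName": artifact_name,  # product version (assumed value)
        "ProvisionedProductName": provisioned_name,
        "ProvisioningParameters": [{"Key": "TeamName", "Value": team_name}],
    }


def wait_for_record(client, record_id, delay=15, max_checks=40):
    """Poll the launch record until it reaches SUCCEEDED or FAILED."""
    for _ in range(max_checks):
        status = client.describe_record(Id=record_id)["RecordDetail"]["Status"]
        if status in ("SUCCEEDED", "FAILED"):
            return status
        time.sleep(delay)
    raise TimeoutError(f"record {record_id} still in progress after {max_checks} checks")


def launch_role_product(client, team_name, delay=15):
    """Launch the execution role product and wait for the outcome."""
    request = build_provision_request(
        "SageMakerNotebookExeRole", "v1", "ml-product-team-nb-role", team_name
    )
    record = client.provision_product(**request)["RecordDetail"]
    return wait_for_record(client, record["RecordId"], delay=delay)
```

Polling the launch record does the same job as clicking Refresh on the Provisioned Product page.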
With the role created, it’s now time to create resources for a data science product team. The team will need Amazon S3 buckets, KMS encryption keys, and a Service Catalog portfolio so that team members can self-service and create Jupyter notebooks. To create these resources, use the Deploy to AWS button for your region below to launch a CloudFormation template. Please ensure that the Team Name you specified in Service Catalog above is also used with CloudFormation.
As in the previous step, all of the parameters should have reasonable defaults, but you can change them to your preference provided those changes are in line with the stack deployment in Lab 1.
You will be redirected to the CloudFormation console where you can see it provisioning resources on your behalf. When it shows CREATE_COMPLETE for the status you can proceed to the next step.
Be sure to launch into the same region you used during the previous step and to use the same team name as defined above.
|Region|Launch template|
|---|---|
|Oregon (us-west-2)|Deploy to AWS Oregon|
|Ohio (us-east-2)|Deploy to AWS Ohio|
|N. Virginia (us-east-1)|Deploy to AWS N. Virginia|
|Ireland (eu-west-1)|Deploy to AWS Ireland|
|London (eu-west-2)|Deploy to AWS London|
|Sydney (ap-southeast-2)|Deploy to AWS Sydney|
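If you prefer to check the stack from a script rather than the console, something along these lines may help. It is a sketch only: it assumes a boto3 CloudFormation client is passed in, and that the template exposes the team name under a parameter key called `TeamName`, which this lab does not confirm. The stack name is whatever you entered when launching the template.

```python
def stack_summary(cfn_client, stack_name):
    """Return the stack status and its parameters as a plain dict."""
    stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
    params = {
        p["ParameterKey"]: p.get("ParameterValue")
        for p in stack.get("Parameters", [])
    }
    return stack["StackStatus"], params


def check_team_stack(cfn_client, stack_name, expected_team, team_key="TeamName"):
    """Return True once the stack is CREATE_COMPLETE, False while it is still creating."""
    status, params = stack_summary(cfn_client, stack_name)
    # The team name must match the one used in Service Catalog.
    if params.get(team_key) != expected_team:
        raise ValueError(f"stack uses team {params.get(team_key)!r}, expected {expected_team!r}")
    if status == "CREATE_COMPLETE":
        return True
    if status == "CREATE_IN_PROGRESS":
        return False
    raise RuntimeError(f"stack is in unexpected state {status}")
```

Create the client with the same region you used for Service Catalog, for example `boto3.client("cloudformation", region_name="us-west-2")`, so both steps land in one region.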
You have now created multiple AWS resources to support the data science team. Please take a moment to review these resources and their configuration.
Amazon S3 buckets for training data and trained models
Visit the S3 console and see the Amazon S3 buckets that have been created for the team. Take note of the bucket policy that has been applied to the data bucket.
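A bucket policy can be easier to read once it is boiled down to one row per statement. The helper below is a small sketch that takes the policy as a JSON string, which is how `get_bucket_policy` in boto3 returns it under the `Policy` key, and lists each statement's effect, actions, and condition keys. It makes no assumption about what the workshop's policy actually contains.

```python
import json


def summarize_policy(policy_json):
    """Return (sid, effect, actions, conditions) for every policy statement."""
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    rows = []
    for statement in statements:
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        # Flatten {"Operator": {"key": value}} into "Operator key" strings.
        conditions = sorted(
            f"{operator} {key}"
            for operator, keys in statement.get("Condition", {}).items()
            for key in keys
        )
        rows.append((statement.get("Sid", ""), statement["Effect"], sorted(actions), conditions))
    return rows
```

With a real client this could be called as `summarize_policy(s3.get_bucket_policy(Bucket=name)["Policy"])`, where `name` is the team's data bucket.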
AWS KMS key for encrypting data at rest
A KMS key has been created to encrypt data at rest in the data science environment. Visit the KMS console and review the key policy: who is allowed to take which actions on the key?
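To answer the question of who can do what, one option is to group the allowed actions in the key policy by principal. The sketch below assumes the policy JSON has already been fetched, for instance with `kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"]` in boto3. It only looks at Allow statements and ignores conditions, so treat the result as a first pass rather than a full access analysis.

```python
import json
from collections import defaultdict


def actions_by_principal(policy_json):
    """Map each principal to the sorted actions its Allow statements grant."""
    statements = json.loads(policy_json)["Statement"]
    if isinstance(statements, dict):
        statements = [statements]
    granted = defaultdict(set)
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if isinstance(principal, str):  # e.g. "*"
            names = [principal]
        else:  # e.g. {"AWS": "arn..."} or {"AWS": ["arn...", "arn..."]}
            names = []
            for value in principal.values():
                names.extend([value] if isinstance(value, str) else value)
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for name in names:
            granted[name].update(actions)
    return {name: sorted(actions) for name, actions in granted.items()}
```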
Parameters added to Parameter Store
A parameter has been added to the collection in Parameter Store. Can you see what parameter has been added? How would you use this value?
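One way to spot the new parameter without scanning the console is to sort the collection by modification time. This is a minimal sketch, assuming a boto3 SSM client is passed in as `ssm_client`; it pages through `describe_parameters` and returns the most recently modified entries.

```python
def list_parameters(ssm_client):
    """Collect every parameter description, following NextToken pagination."""
    parameters, token = [], None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = ssm_client.describe_parameters(**kwargs)
        parameters.extend(page["Parameters"])
        token = page.get("NextToken")
        if not token:
            return parameters


def most_recent(parameters, count=1):
    """Return the `count` parameters with the latest LastModifiedDate."""
    ordered = sorted(parameters, key=lambda p: p["LastModifiedDate"], reverse=True)
    return ordered[:count]
```

Once you have the name, `ssm_client.get_parameter(Name=name)["Parameter"]["Value"]` reads the value, which is also how a script or notebook could consume it.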
Service Catalog Jupyter Notebook product
A Service Catalog portfolio containing a best-practice Jupyter notebook product has been configured to give data science team members the ability to create resources on demand.
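As an administrator you can also confirm the portfolio and its products from code. The sketch below assumes a boto3 Service Catalog client and, as a guess, that the portfolio's display name contains the team name; adjust the match if your portfolio is named differently. Pagination is left out for brevity.

```python
def team_portfolios(sc_client, team_name):
    """Return portfolios whose display name mentions the team (assumed convention)."""
    portfolios = sc_client.list_portfolios()["PortfolioDetails"]
    return [p for p in portfolios if team_name.lower() in p["DisplayName"].lower()]


def portfolio_products(sc_client, portfolio_id):
    """Return the names of the products in one portfolio."""
    details = sc_client.search_products_as_admin(PortfolioId=portfolio_id)["ProductViewDetails"]
    return [d["ProductViewSummary"]["Name"] for d in details]
```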
With the team's resources created, let’s move on to Lab 3 where we will, as a data scientist, self-service and create a Jupyter notebook server.