Lab 2: Team Resources

The cloud platform engineering team has now minted a new secure environment in which we can begin supporting data science teams. To get started, as the Data Science Administrator, use the AWS Service Catalog to create an IAM role for a new data science team. This IAM role will be used as the execution role for all SageMaker notebooks created by the data scientists on the team. The execution role controls which AWS resources, such as S3 buckets, a SageMaker notebook can access. We will also use AWS CloudFormation to create the rest of the team resources, including an AWS Service Catalog portfolio. The portfolio will contain a SageMaker Notebook product that allows data scientists to self-service and deploy their own secure SageMaker notebooks.

Enable the data science team

Assume the role of the Data Science Administrator and use the Service Catalog to launch the SageMakerNotebookExeRole product. After the product has launched, create a Service Catalog portfolio specifically for the data science team.
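The lab performs this launch in the console. For readers who prefer scripting, the following is a minimal sketch of the same launch using boto3. Only the product name SageMakerNotebookExeRole comes from this lab; the parameter key `TeamName`, the artifact name `v1`, the provisioned product naming pattern, and both function names are assumptions made for illustration and should be checked against the actual product definition.

```python
def build_provision_request(team_name, product_name="SageMakerNotebookExeRole"):
    """Build keyword arguments for Service Catalog's provision_product call.

    Assumptions (not confirmed by the lab): the product exposes a
    parameter named "TeamName" and has a provisioning artifact named "v1".
    """
    return {
        "ProductName": product_name,
        "ProvisioningArtifactName": "v1",  # hypothetical version name
        "ProvisionedProductName": f"{team_name}-notebook-exe-role",  # hypothetical naming pattern
        "ProvisioningParameters": [
            {"Key": "TeamName", "Value": team_name},  # hypothetical parameter key
        ],
    }


def launch_execution_role(team_name, region):
    """Launch the product; requires boto3 and Data Science Administrator credentials."""
    import boto3  # imported lazily so the request builder can be used without boto3

    client = boto3.client("servicecatalog", region_name=region)
    return client.provision_product(**build_provision_request(team_name))
```

Keeping the request builder separate from the API call makes it easy to inspect the exact parameters before anything is created in the account.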

Step-by-step instructions

With the role created, it’s now time to create resources for the data science team. The team will need Amazon S3 buckets, KMS encryption keys, and a Service Catalog portfolio from which team members can self-service and create Jupyter notebooks. To create these resources, use the Deploy to AWS button for your region below to launch a CloudFormation template. Please ensure that the Team Name you specify in CloudFormation is the same one you specified in Service Catalog above.

As in the previous step, all of the parameters have reasonable defaults, but you can change them to your preference provided those changes are consistent with the stack deployment in Lab 1.

Step-by-step instructions

Be sure to launch into the same region you used during the previous step, and use the same team name as defined above.

Region                      Launch Template
Oregon (us-west-2)          Deploy to AWS Oregon
Ohio (us-east-2)            Deploy to AWS Ohio
N. Virginia (us-east-1)     Deploy to AWS N. Virginia
Ireland (eu-west-1)         Deploy to AWS Ireland
London (eu-west-2)          Deploy to AWS London
Sydney (ap-southeast-2)     Deploy to AWS Sydney
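The Deploy to AWS buttons open the CloudFormation console with the template preloaded. As a rough scripted equivalent, the sketch below builds a `create_stack` request and rejects regions that are not in the table above. The template URL must be supplied by the caller because it is not given in this text; the stack name pattern, the `TeamName` parameter key, and the need for `CAPABILITY_NAMED_IAM` are assumptions.

```python
# Regions for which this lab provides a launch button.
SUPPORTED_REGIONS = {
    "us-west-2": "Oregon",
    "us-east-2": "Ohio",
    "us-east-1": "N. Virginia",
    "eu-west-1": "Ireland",
    "eu-west-2": "London",
    "ap-southeast-2": "Sydney",
}


def build_stack_request(team_name, region, template_url):
    """Build keyword arguments for CloudFormation's create_stack call."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"{region} is not one of the lab regions")
    return {
        "StackName": f"ds-team-{team_name}",  # hypothetical naming pattern
        "TemplateURL": template_url,          # supplied by the caller
        "Parameters": [
            # Must match the team name used in Service Catalog (key name assumed).
            {"ParameterKey": "TeamName", "ParameterValue": team_name},
        ],
        "Capabilities": ["CAPABILITY_NAMED_IAM"],  # assumed; only needed if the template creates IAM resources
    }


def deploy_team_stack(team_name, region, template_url):
    """Create the stack; requires boto3 and suitable AWS credentials."""
    import boto3  # imported lazily so the request builder can be used without boto3

    cfn = boto3.client("cloudformation", region_name=region)
    return cfn.create_stack(**build_stack_request(team_name, region, template_url))
```

Passing the same `team_name` value to this request and to the Service Catalog launch is what keeps the two sets of resources linked.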

Review team resources

You have now created multiple AWS resources to support the data science team. Please take a moment and review these resources and their configuration.

  • Amazon S3 buckets for training data and trained models

    Visit the S3 console and see the Amazon S3 buckets that have been created for the team. Take note of the bucket policy that has been applied to the data bucket.

  • AWS KMS key for encrypting data at rest

    A KMS key has been created to encrypt data at rest in the data science environment. Visit the KMS console and review the key policy: who is allowed to take which actions on the key?

  • Parameters added to Parameter Store

    A parameter has been added to the collection in Parameter Store. Can you see what parameter has been added? How would you use this value?

  • Service Catalog Jupyter Notebook product

    A Service Catalog portfolio containing a best-practice Jupyter notebook product has been configured to give data science team members the ability to create resources on demand.
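The review above can also be done from a script. The sketch below is one way to answer "who is allowed to take what actions" for the bucket policy and the key policy: it flattens a policy document into (effect, principal, actions) rows. The bucket name and key ID are placeholders you would pass in yourself, and the function names are made up for this example.

```python
import json


def summarize_policy(policy_json):
    """Return one (effect, principal, actions) tuple per policy statement."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    rows = []
    for stmt in statements:
        actions = stmt.get("Action", stmt.get("NotAction", []))
        if isinstance(actions, str):
            actions = [actions]
        rows.append((stmt.get("Effect"), stmt.get("Principal"), sorted(actions)))
    return rows


def review_team_policies(bucket_name, key_id, region):
    """Fetch and summarize the data bucket policy and the KMS key policy."""
    import boto3  # imported lazily so summarize_policy can be used without boto3

    s3 = boto3.client("s3", region_name=region)
    kms = boto3.client("kms", region_name=region)
    bucket_policy = s3.get_bucket_policy(Bucket=bucket_name)["Policy"]
    key_policy = kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"]
    return {
        "bucket": summarize_policy(bucket_policy),
        "key": summarize_policy(key_policy),
    }
```

Both API calls return the policy as a JSON string, which is why the summary function parses text rather than a dictionary.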

With the team’s resources created, let’s move on to Lab 3 where we will, as a data scientist, self-service and create a Jupyter notebook server.