Data And Cloud (Infrastructure as Code)

Slide 1: Introduction

About

  • Twitter: @akpanydre
  • Software Engineering Consultant: Linux, DevOps, Cloud Computing, Machine Learning
  • Self-taught in the industry since 2003, about 15 years of professional experience
  • Expertise: software problem-solving, creation of custom solutions
  • Polyglot (French/English and also programming languages)
  • Passionate about technology, African culture, visual arts, the djembe drum, traveling, good food, basketball, and gymnastics.

Presenter Notes: Why infrastructure as code (IaC)?

  • State the problem and what it solves
  • Explain the importance of infrastructure as code
  • Define infrastructure as code
  • Discuss the benefits of using infrastructure as code

Slide 2: What is Infrastructure as Code?

Definition

Infrastructure as code (IaC) is the practice of managing and provisioning computer infrastructure through code rather than through manual processes. It addresses several problems that organizations face when managing their infrastructure by hand, such as inconsistent environments, slow provisioning, and undocumented changes.

Benefits

  • Consistency: With IaC, infrastructure is deployed the same way across environments, so the same configuration is applied every time.
  • Scalability: IaC helps organizations scale their infrastructure quickly and efficiently by automating the provisioning of new resources as needed.
  • Speed: By automating infrastructure deployment and management tasks, IaC reduces the time it takes to set up and maintain infrastructure.
  • Collaboration: IaC makes it easier for teams to collaborate on infrastructure management by providing a shared, version-controlled repository of infrastructure code.
  • Versioning: With IaC, infrastructure code can be versioned, tracked, and tested just like application code. Organizations can roll back to an earlier version of their infrastructure configuration if needed, and changes are documented and tested.

Overall, IaC helps organizations manage their infrastructure in a more automated, consistent, and scalable manner, which can improve the efficiency, reliability, and agility of their IT operations.

Code familiarity

  • Configuration-like (JSON, YAML, HCL, etc.)
  • Code-like (Python, Go, TypeScript, JavaScript, C#, etc.)
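
To make the two styles concrete, here is a minimal sketch in Python. The resource keys and the names STATIC_CONFIG and notebook_config are hypothetical and do not follow any real provider schema; the point is only that a code-like definition can generate the description that a configuration-like file would spell out by hand.

```python
import json

# Configuration-like: a static description, as it might appear in a JSON or YAML file.
# The keys below are illustrative, not a real provider schema.
STATIC_CONFIG = {
    "name": "science-notebook-staging",
    "instance_type": "ml.c5.xlarge",
}


def notebook_config(environment: str, instance_type: str = "ml.c5.xlarge") -> dict:
    """Code-like: the same description produced by a function, so that
    every environment is generated from a single definition."""
    return {
        "name": f"science-notebook-{environment}",
        "instance_type": instance_type,
    }


if __name__ == "__main__":
    # One definition, several environments.
    for env in ("staging", "production"):
        print(json.dumps(notebook_config(env), indent=2))
```

The function form is what tools such as AWS CDK and Terraform CDK build on: the program runs, and its output is the declarative configuration.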

Terms

  • CI: Continuous Integration
  • CD: Continuous Delivery
  • CDK: Cloud Development Kit

Presenter's Notes: See it in action (Dagger.io for CI, Terraform CDK for Provisioning and CD)

Use Staging Environment to:

  • Show a monorepo with familiar code for both application and infrastructure
  • Show Dagger in Action in Google Cloud Build -- CI
  • Show Terraform CDK in Action in AWS and GCP -- Provisioning and CD

Slide 3: AWS Sagemaker Overview

Presenter's Notes: Sagemaker and Julia

  • Give a brief overview of AWS Sagemaker
  • Explain how it can be used to create a Julia environment

Slide 4: AWS CDK Overview

Presenter's Notes: Overview of AWS CDK

  • Introduce AWS Cloud Development Kit (CDK)
  • Discuss how it can be used to define cloud infrastructure

Slide 5: Terraform CDK Overview

Presenter's Notes: Overview of CDKTF

  • Introduce Terraform CDK
  • Explain how it can be used to define infrastructure as code

Slide 6: Dagger.io Overview

Presenter's Notes: Overview of Dagger.io

  • Introduce Dagger.io
  • Explain how it can be used to automate cloud infrastructure

Slide 7: Demo: Setting Up a Julia Environment in AWS Sagemaker

  • Walk through the steps of setting up a working Julia environment in AWS SageMaker using infrastructure as code
  • Show how to train a simple machine learning model and deploy it as a microservice in the cloud

Setting up a Cloud Environment for your Data Science Experiments

Infrastructure as Code

Python Dependencies with Pyenv and Poetry

Let's start by installing Pyenv on your system. Then let's pick a recent version of Python for setting up a virtual environment managed by Poetry for our Data and Cloud project.

#!/bin/bash
set -x #echo on

PYTHON_VERSION=3.11.1

echo "Setting up Python v$PYTHON_VERSION"

echo $PYTHON_VERSION > .python-version
pyenv install $PYTHON_VERSION
pyenv global $PYTHON_VERSION
pyenv local $PYTHON_VERSION
pyenv shell $PYTHON_VERSION

echo "Poetry installation and Virtual Environment creation"

pyenv exec pip install poetry
pyenv exec poetry init --no-interaction # accept the defaults instead of prompting
pyenv exec poetry install
pyenv exec poetry shell

echo "Python dependencies installation into the Virtual Environment"

pyenv exec poetry add black --group dev # Add Dev dependencies

echo "Virtual Environment visual check"

pyenv exec poetry show -v
pyenv exec poetry env info -p
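
As a quick sanity check after the script above, the following sketch compares the running interpreter with the version pinned in .python-version. It assumes the file holds a plain numeric version such as 3.11.1 (pyenv also accepts names like system, which this sketch does not handle); the helper names are hypothetical.

```python
import sys
from pathlib import Path


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a string such as '3.11.1' into (3, 11, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def interpreter_matches(pinned: str, actual: tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """Return True when the interpreter version equals the pinned version."""
    return parse_version(pinned) == tuple(actual)


if __name__ == "__main__":
    version_file = Path(".python-version")  # written by the setup script
    if version_file.exists():
        pinned = version_file.read_text(encoding="utf-8")
        print("match" if interpreter_matches(pinned) else "mismatch")
    else:
        print("no .python-version file found")
```

Run it with pyenv exec python from the project folder so that the interpreter being checked is the one pyenv selects.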

Next, install a recent version of Node.js on your machine, followed by the AWS CDK Toolkit CLI, which depends on Node.js.

node -v
# v18.14.0
npm install -g aws-cdk
# added 2 packages in 1s

Now let's create an infra folder from which we will bootstrap an AWS CDK project

#!/bin/bash
set -x #echo on

echo "Bootstrap AWS CDK Python application"

mkdir `pwd`/infra && cd `pwd`/infra && cdk init app --language python

Now let's add the dependencies listed in the AWS CDK project under infra to the Poetry Virtual Environment that we created earlier.

#!/bin/bash
set -x #echo on

echo "Install dependencies from infra/requirements.txt"

cat `pwd`/infra/requirements.txt | xargs pyenv exec poetry add

echo "Install dev dependencies from infra/requirements-dev.txt"

cat `pwd`/infra/requirements-dev.txt | xargs pyenv exec poetry add --group dev
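
The cat and xargs pipeline above passes every whitespace-separated token of the file to poetry add, which includes comments or pip options if the file contains any. A minimal Python sketch of a stricter reader follows; the function name requirement_specs is hypothetical, and the comment handling is simplified (a # is treated as a comment only at the start of a line or after a space).

```python
def requirement_specs(text: str) -> list[str]:
    """Return the requirement specifiers found in a requirements file,
    skipping blank lines, comments, and pip options such as '-r other.txt'."""
    specs = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        line = line.split(" #", 1)[0].strip()  # drop a trailing comment
        if line.startswith("-"):
            continue  # pip option, not a package
        specs.append(line)
    return specs


if __name__ == "__main__":
    sample = "aws-cdk-lib==2.70.0\nconstructs>=10.0.0,<11.0.0\n"
    for spec in requirement_specs(sample):
        print(spec)
```

Each returned specifier can then be handed to poetry add as a single, quoted argument.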

Sagemaker Environment Provision

Amazon SageMaker is a fully managed AWS service for building, training, and deploying machine learning (ML) models for any use case, with managed infrastructure, tools, and workflows. The SageMaker Science environment itself relies on Conda, a package, dependency, and environment manager for languages such as Fortran, C/C++, Python, and R. For the purpose of our Data and Cloud workshop, we are interested in Julia. Note that the name Jupyter is a loose acronym for Julia, Python, and R.

Let's start by bootstrapping the AWS CDK environment:

#!/bin/bash
set -x #echo on

cd `pwd`/infra && cdk bootstrap

The stack that provisions the environment is defined in infra_stack.py:

from pathlib import Path

from aws_cdk import (
    Fn as fn,
    Stack,
    aws_iam as iam,
    aws_sagemaker as sagemaker,
)
from constructs import Construct

def read_text(file_path: str, encoding="utf-8"):
    """Return the contents of file_path, or an empty string when no path is given."""
    text = ""
    if file_path is not None:
        text = Path(file_path).read_text(encoding=encoding)
    return text

class InfraStack(Stack):

    def __init__(
            self, scope: Construct, construct_id: str,
            on_create_script_path=None, on_start_script_path=None, **kwargs,
    ) -> None:

        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here

        onCreateScript = read_text(on_create_script_path)

        onStartScript = read_text(on_start_script_path)

        role = iam.Role(self, "DataAndCloudScienceRole", **dict(
            assumed_by=iam.ServicePrincipal("sagemaker.amazonaws.com"),
            managed_policies=[
                iam.ManagedPolicy.from_aws_managed_policy_name("AmazonSageMakerFullAccess"),
            ],
        ))

        lifecycle_config = sagemaker.CfnNotebookInstanceLifecycleConfig(
            self, "DataAndCloudScienceLifecycleConfig", **dict(
                notebook_instance_lifecycle_config_name="DataAndCloudScienceLifecycleConfig",
                on_create=[
                    dict(
                        content_type="text/x-shellscript",
                        content=fn.base64(onCreateScript),
                    ),
                ],
                on_start=[
                    dict(
                        content_type="text/x-shellscript",
                        content=fn.base64(onStartScript),
                    ),
                ],
            )
        )

        fastai_fastbook_repository = "FastAIFastBookRepository"

        sagemaker.CfnCodeRepository(
            self, "DataAndCloudFastAICodeRepository",
            code_repository_name=fastai_fastbook_repository,
            git_config=sagemaker.CfnCodeRepository.GitConfigProperty(
                branch="master",
                repository_url="https://github.com/fastai/fastbook.git",
            ),
        )

        julia_academy_datascience_repository = "JuliaAcademyDataScienceRepository"

        sagemaker.CfnCodeRepository(
            self, "DataAndCloudScienceCodeRepository",
            code_repository_name=julia_academy_datascience_repository,
            git_config=sagemaker.CfnCodeRepository.GitConfigProperty(
                branch="main",
                repository_url="https://github.com/JuliaAcademy/DataScience.git",
            ),
        )

        fluxml_fastaijl_repository = "FluxMLFastAIjlRepository"

        sagemaker.CfnCodeRepository(
            self, "DataAndCloudFluxMLFastAICodeRepository",
            code_repository_name=fluxml_fastaijl_repository,
            git_config=sagemaker.CfnCodeRepository.GitConfigProperty(
                branch="master",
                repository_url="https://github.com/FluxML/FastAI.jl.git",
            ),
        )

        sagemaker.CfnNotebookInstance(
            self, "DataAndCloudScienceNotebookCPU", **dict(
                lifecycle_config_name=lifecycle_config.notebook_instance_lifecycle_config_name,
                role_arn=role.role_arn,
                instance_type="ml.c5.xlarge", # "ml.p2.xlarge", # "ml.c5.2xlarge",
                additional_code_repositories=[
                  fastai_fastbook_repository,
                  julia_academy_datascience_repository,
                  fluxml_fastaijl_repository,
                ],
            ),
        )

        # sagemaker.CfnNotebookInstance(
        #     self, "DataAndCloudScienceNotebookGPU", **dict(
        #         lifecycle_config_name=lifecycle_config.notebook_instance_lifecycle_config_name,
        #         role_arn=role.role_arn,
        #         instance_type="ml.p2.xlarge",  # "ml.c5.xlarge", # "ml.c5.2xlarge",
        #         additional_code_repositories=[julia_academy_datascience_repository],
        #     ),
        # )

The entry point, app.py, instantiates the stack and passes it the paths of the lifecycle scripts:

#!/usr/bin/env python3
import os

import aws_cdk as cdk

from infra.infra_stack import InfraStack

project_path = os.path.dirname(os.path.realpath(__file__))
scripts_path = f"{project_path}/../scripts"
app = cdk.App()
InfraStack(app, "InfraStack",
    # If you don't specify 'env', this stack will be environment-agnostic.
    # Account/Region-dependent features and context lookups will not work,
    # but a single synthesized template can be deployed anywhere.

    # Uncomment the next line to specialize this stack for the AWS Account
    # and Region that are implied by the current CLI configuration.

    #env=cdk.Environment(account=os.getenv('CDK_DEFAULT_ACCOUNT'), region=os.getenv('CDK_DEFAULT_REGION')),

    # Uncomment the next line if you know exactly what Account and Region you
    # want to deploy the stack to.

    #env=cdk.Environment(account='123456789012', region='us-east-1'),

    # For more information, see https://docs.aws.amazon.com/cdk/latest/guide/environments.html

    # Science Environment Lifecycle Scripts
    on_create_script_path=f"{scripts_path}/aws/onCreate.sh",
    on_start_script_path=f"{scripts_path}/aws/onStart.sh",
)

app.synth()
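
To see what app.synth() produces for the lifecycle configuration, the sketch below builds an approximation of the corresponding CloudFormation resource as a plain dictionary. This is an illustration only: the function name is hypothetical, and the real synthesized template also carries logical IDs and metadata, so inspect the output of cdk synth for the exact result.

```python
import json


def lifecycle_config_resource(name: str, on_create: str, on_start: str) -> dict:
    """Approximate the CloudFormation resource for the lifecycle configuration."""
    return {
        "Type": "AWS::SageMaker::NotebookInstanceLifecycleConfig",
        "Properties": {
            "NotebookInstanceLifecycleConfigName": name,
            # Fn.base64(...) in CDK becomes the Fn::Base64 intrinsic function,
            # so CloudFormation encodes the script at deploy time.
            "OnCreate": [{"Content": {"Fn::Base64": on_create}}],
            "OnStart": [{"Content": {"Fn::Base64": on_start}}],
        },
    }


if __name__ == "__main__":
    resource = lifecycle_config_resource(
        "DataAndCloudScienceLifecycleConfig",
        "#!/bin/bash\nset -e\n",
        "#!/bin/bash\nset -e\n",
    )
    print(json.dumps(resource, indent=2))
```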

Now let's go ahead and provision the Amazon Sagemaker Science Environment:

#!/bin/bash
set -x #echo on

cd `pwd`/infra && cdk deploy

We can terminate the Sagemaker Science Environment by executing the following:

#!/bin/bash
set -x #echo on

cd `pwd`/infra && cdk destroy

Sagemaker Environment Customization

Julia installation script

Let's add Julia to our Sagemaker Science Environment using the following script:

#!/bin/bash
set -x #echo on

JULIA_RELEASE=1.8
JULIA_VERSION=1.8.5
JULIA_DOWNLOAD_URL=https://julialang-s3.julialang.org/bin/linux/x64/$JULIA_RELEASE/julia-$JULIA_VERSION-linux-x86_64.tar.gz
JULIA_INSTALL_DIR=~/SageMaker/envs/julia
JULIA_DEPOT_PATH=$JULIA_INSTALL_DIR/depot

conda create --yes --prefix $JULIA_INSTALL_DIR

wget -c $JULIA_DOWNLOAD_URL -O - | tar -xz

cp -R julia-$JULIA_VERSION/* $JULIA_INSTALL_DIR

mkdir -p $JULIA_INSTALL_DIR/etc/conda/activate.d

# Double quotes so that the depot path is expanded now and written literally into env.sh
echo "export JULIA_DEPOT_PATH=$JULIA_DEPOT_PATH" >> $JULIA_INSTALL_DIR/etc/conda/activate.d/env.sh
echo -e 'empty!(DEPOT_PATH)\npush!(DEPOT_PATH,raw"/home/ec2-user/SageMaker/envs/julia/depot")' >> /home/ec2-user/SageMaker/envs/julia/etc/julia/startup.jl

conda run --prefix $JULIA_INSTALL_DIR/ julia --eval 'using Pkg; Pkg.add("IJulia"); using IJulia; IJulia.installkernel("Julia")'

rm -rf julia-$JULIA_VERSION
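
The script keeps JULIA_RELEASE and JULIA_VERSION as two separate variables that must stay in sync. As a small sketch, the release directory can be derived from the full version instead; the function name julia_download_url is hypothetical, and the URL layout is the one used in the script above.

```python
def julia_download_url(version: str) -> str:
    """Build the Linux x86_64 tarball URL for a Julia version such as '1.8.5'."""
    release = ".".join(version.split(".")[:2])  # '1.8.5' -> '1.8'
    return (
        "https://julialang-s3.julialang.org/bin/linux/x64/"
        f"{release}/julia-{version}-linux-x86_64.tar.gz"
    )


if __name__ == "__main__":
    print(julia_download_url("1.8.5"))
```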

SageMaker can run scripts at lifecycle events of the Science Environment: once when the notebook instance is created (On-Create) and every time it starts (On-Start). Both scripts begin as minimal stubs.

On-Create lifecycle event script

#!/bin/bash
set -x #echo on

set -e

On-Start lifecycle event script

#!/bin/bash
set -x #echo on

set -e

Let's apply our Julia v1.8.5 installation script during the creation phase of our SageMaker Science Environment.

sudo -u ec2-user -i <<'EOF'

JULIA_RELEASE=1.8
JULIA_VERSION=1.8.5
JULIA_DOWNLOAD_URL=https://julialang-s3.julialang.org/bin/linux/x64/$JULIA_RELEASE/julia-$JULIA_VERSION-linux-x86_64.tar.gz
JULIA_INSTALL_DIR=/home/ec2-user/SageMaker/envs/julia
JULIA_DEPOT_PATH=$JULIA_INSTALL_DIR/depot

conda create --yes --prefix $JULIA_INSTALL_DIR

wget -c $JULIA_DOWNLOAD_URL -O - | tar -xz

cp -R julia-$JULIA_VERSION/* $JULIA_INSTALL_DIR

mkdir -p $JULIA_INSTALL_DIR/etc/conda/activate.d

# Double quotes so that the depot path is expanded now and written literally into env.sh
echo "export JULIA_DEPOT_PATH=$JULIA_DEPOT_PATH" >> $JULIA_INSTALL_DIR/etc/conda/activate.d/env.sh
echo -e 'empty!(DEPOT_PATH)\npush!(DEPOT_PATH,raw"/home/ec2-user/SageMaker/envs/julia/depot")' >> /home/ec2-user/SageMaker/envs/julia/etc/julia/startup.jl

conda run --prefix $JULIA_INSTALL_DIR/ julia --eval 'using Pkg; Pkg.add("IJulia"); using IJulia; IJulia.installkernel("Julia")'

rm -rf julia-$JULIA_VERSION

EOF
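
Lifecycle scripts run as root, which is why the installer above is wrapped in a here-document executed as ec2-user. The following sketch shows one way the On-Create script could be assembled from the stub header and the installer body; the function names are hypothetical and this is not how the project builds its scripts.

```python
def run_as_ec2_user(body: str, delimiter: str = "EOF") -> str:
    """Wrap shell commands in a quoted here-document executed as ec2-user.

    Quoting the delimiter ('EOF') stops the outer root shell from expanding
    $VARIABLES, so they are expanded by the ec2-user login shell instead.
    """
    if delimiter in body.splitlines():
        raise ValueError("body contains the here-document delimiter on its own line")
    return f"sudo -u ec2-user -i <<'{delimiter}'\n{body.rstrip()}\n{delimiter}\n"


def on_create_script(installer: str) -> str:
    """Prefix the wrapped installer with the stub header used by the lifecycle scripts."""
    header = "#!/bin/bash\nset -x\nset -e\n\n"
    return header + run_as_ec2_user(installer)


if __name__ == "__main__":
    print(on_create_script("JULIA_VERSION=1.8.5\necho $JULIA_VERSION\n"))
```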

Notes:

  1. Reset Pyenv Shell: pyenv shell system
  2. Synchronize Poetry Lock file: pyenv exec poetry update package

Slide 8: Conclusion

  • Recap the main points of the presentation
  • Encourage attendees to start using infrastructure as code for their cloud deployments

Appendix

Final Code Summary

Before I start, I want to clarify some key terms and concepts that will be used in the tutorial:

  • AWS CDK: The AWS Cloud Development Kit is an open-source software development framework that lets developers define cloud infrastructure as code in familiar programming languages such as Python, TypeScript, and Java.
  • AWS SageMaker: Amazon SageMaker is a fully-managed service that provides developers and data scientists with the ability to build, train, and deploy machine learning models in the cloud. SageMaker provides pre-built machine learning algorithms, notebooks, and other tools to help developers quickly build and deploy machine learning models.
  • Shell script: A shell script is a computer program designed to be run by the Unix shell, a command-line interpreter. Shell scripts are used to automate tasks, such as setting up an environment, installing software packages, and running commands.

Now let's dive into the code tutorial!

Step 1:

Provisioning Data Science environments in AWS SageMaker using AWS CDK

The initial project uses the AWS CDK Python API to provision Data Science environments in AWS SageMaker. The code for this is located in the infra_stack.py file.

Step 2:

Custom installation of a Julia Kernel into the Sagemaker Conda environment

To install a Julia kernel into the SageMaker Conda environment, two shell scripts are used: onCreate.sh and onStart.sh. These scripts are located in the scripts/aws directory.

  • onCreate.sh: This script is executed once, when a SageMaker notebook instance is created. It installs Julia and the Julia kernel into the SageMaker Conda environment.
  • onStart.sh: This script is executed every time a SageMaker notebook instance is started. In this project it only sets the shell options.

Step 3:

Installing Julia kernel into SageMaker Conda environment

To install the Julia kernel into the SageMaker Conda environment, the onCreate.sh script is used. Here are the steps that the script takes:

  1. Set up variables: The script sets up variables for the Julia release, the Julia version, the download URL, the installation directory, and the depot path.
  2. Create Conda environment: The script creates a Conda environment for the Julia installation.
  3. Download and extract Julia: The script downloads the Julia tarball from the Julia download site, extracts it, and copies the result into the installation directory.
  4. Set up Julia environment variables: The script exports the Julia depot path from the env.sh activation file and sets the same path in startup.jl.
  5. Install IJulia: The script installs IJulia, a package that provides a Julia kernel for Jupyter notebooks, and adds it to the kernel list.
  6. Clean up: The script removes the extracted Julia directory.

Step 4:

Setting up AWS CDK and project dependencies

The cdk init command generates a requirements.txt file that lists the dependencies of the AWS CDK project. In this project, the requirements.txt file includes the following dependencies:

  • aws-cdk-lib==2.70.0: AWS CDK library version 2.70.0.
  • constructs>=10.0.0,<11.0.0: Constructs library version 10.x.
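
The two entries use different kinds of constraint: an exact pin and a bounded range. The sketch below shows how such constraints can be evaluated, under the assumption of plain three-part numeric versions and only the ==, >= and < operators; the helper names are hypothetical, and the packaging library is the better choice for complete version handling.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '10.0.0' into (10, 0, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a version against comma-separated ==, >= and < clauses."""
    candidate = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith("=="):
            ok = candidate == parse_version(clause[2:])
        elif clause.startswith(">="):
            ok = candidate >= parse_version(clause[2:])
        elif clause.startswith("<") and not clause.startswith("<="):
            ok = candidate < parse_version(clause[1:])
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        if not ok:
            return False
    return True


if __name__ == "__main__":
    print(satisfies("2.70.0", "==2.70.0"))
    print(satisfies("10.1.0", ">=10.0.0,<11.0.0"))
```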

Step 5:

In infra_stack.py, a SageMaker Notebook instance lifecycle configuration is created using the sagemaker.CfnNotebookInstanceLifecycleConfig class. This configuration specifies the scripts to be run when a notebook instance is created or started.

Step 6:

The fastai_fastbook_repository and julia_academy_datascience_repository repositories are created using the sagemaker.CfnCodeRepository class. These repositories contain the code that will be used by the notebook instance.

Step 7:

The fluxml_fastaijl_repository repository is also created using the sagemaker.CfnCodeRepository class. This repository contains the Julia code that will be used by the notebook instance.
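
Since the three repositories differ only in their names, branch, and URL, they could be described as data and created in a loop. This is a possible refactoring sketch, not the project's code; the CodeRepository dataclass and repository_names helper are hypothetical, while the values are the ones used in infra_stack.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRepository:
    construct_id: str
    name: str
    branch: str
    url: str


REPOSITORIES = (
    CodeRepository("DataAndCloudFastAICodeRepository", "FastAIFastBookRepository",
                   "master", "https://github.com/fastai/fastbook.git"),
    CodeRepository("DataAndCloudScienceCodeRepository", "JuliaAcademyDataScienceRepository",
                   "main", "https://github.com/JuliaAcademy/DataScience.git"),
    CodeRepository("DataAndCloudFluxMLFastAICodeRepository", "FluxMLFastAIjlRepository",
                   "master", "https://github.com/FluxML/FastAI.jl.git"),
)


def repository_names(repositories) -> list[str]:
    """Names to pass as additional_code_repositories of the notebook instance."""
    return [repo.name for repo in repositories]


# Inside InfraStack.__init__, the loop would look like this:
# for repo in REPOSITORIES:
#     sagemaker.CfnCodeRepository(
#         self, repo.construct_id,
#         code_repository_name=repo.name,
#         git_config=sagemaker.CfnCodeRepository.GitConfigProperty(
#             branch=repo.branch, repository_url=repo.url,
#         ),
#     )

if __name__ == "__main__":
    print(repository_names(REPOSITORIES))
```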

Step 8:

Finally, the InfraStack class is instantiated in app.py, passing in the on_create_script_path and on_start_script_path arguments to specify the paths to the onCreate.sh and onStart.sh scripts, respectively.

Step 9:

The app.synth() method is called to synthesize the CloudFormation template for the stack, which can then be deployed to AWS.

That's the end of the documentation for this code. Keep in mind that this project uses the AWS CDK Python API to provision Data Science environments in AWS SageMaker, and that the onCreate.sh and onStart.sh lifecycle scripts are the hook through which a custom Julia kernel is installed into the SageMaker Conda environment.

Join us on Discord

The Ubuntu TechHive on Discord
