Amazon S3

This page shows how to set up a connection to Amazon S3 object storage using Python, and how to download files from and upload files to a bucket. Before your Python script can interact with Amazon S3, you first need AWS credentials.

Get access keys for Amazon

To connect to Amazon S3 object storage, you need an access key ID and a secret access key.

AWS CLI

If you have the AWS CLI installed, you can use it to create a configuration file interactively. In your terminal, run:

aws configure

Follow the prompts. Afterwards, you can find your credentials in the file ~/.aws/credentials.
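The credentials file is a plain INI file with one section per profile. As a minimal sketch, assuming the default profile and the standard key names aws_access_key_id and aws_secret_access_key, the values can be read with Python's standard library. The function name read_aws_credentials is hypothetical and only used for illustration.

```python
import configparser
from pathlib import Path


def read_aws_credentials(path=None, profile="default"):
    """Return (access_key_id, secret_access_key) for a profile.

    `read_aws_credentials` is a hypothetical helper; the default path
    assumes the standard location of the AWS credentials file.
    """
    if path is None:
        path = Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    # parser.read() skips missing files and returns the files it could read
    if not parser.read(path) or profile not in parser:
        raise KeyError(f"Profile {profile!r} not found in {path}")
    section = parser[profile]
    return section["aws_access_key_id"], section["aws_secret_access_key"]
```

Boto3 also reads this file on its own when no keys are passed to boto3.client, so a helper like this is only useful when you need the values elsewhere, for example to fill a .env file.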

AWS Console

To create a user in the IAM console, follow the steps in Creating IAM users.

With an IAM user, you can get access keys by following the steps in Managing access keys.

Python

The examples below show how to connect to Amazon S3 object storage using Python, and how to download and upload files. To run them, you need Python 3.7 or newer and the following packages:

boto3 (required to set up the connection)
python-dotenv (used to load environment variables from a .env file)

We recommend using a virtual environment and adding the packages to a requirements.txt file. In this file, you can add the following:

boto3          # tested with version 1.20.4
python-dotenv  # tested with version 0.18.0

Download from Amazon S3

See also Downloading files in the Boto3 documentation.

import os
import boto3

# More about dotenv in the section `Configure dotenv`
from dotenv import find_dotenv, load_dotenv
load_dotenv(find_dotenv())


bucket_name = "S3 BUCKET NAME"      # Bucket to download from
source_object_name = "OBJECT NAME"  # S3 object name
target_file_name = "PATH TO FILE"   # File to save the object to

s3_client = boto3.client(
    "s3",
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

s3_client.download_file(bucket_name, source_object_name, target_file_name)
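The download call writes to a local path, and the example assumes that the target folder already exists. As a minimal sketch, a small wrapper can create missing folders before downloading. The name download_object is hypothetical; the s3_client argument is assumed to be a client created with boto3.client("s3", ...) as in the example.

```python
from pathlib import Path


def download_object(s3_client, bucket_name, object_name, target_file_name):
    """Download an S3 object, creating the local target folder first.

    `download_object` is a hypothetical helper; `s3_client` is expected
    to be a client created with `boto3.client("s3", ...)`.
    """
    target_path = Path(target_file_name)
    # Create missing parent folders so the download has somewhere to write
    target_path.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(bucket_name, object_name, str(target_path))
    return target_path
```

With the variables from the example, the call would be download_object(s3_client, bucket_name, source_object_name, target_file_name).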

Upload to Amazon S3

See also Uploading files in the Boto3 documentation.

import os
import boto3
from botocore.exceptions import ClientError

# More about dotenv in the section `Configure dotenv`
from dotenv import find_dotenv, load_dotenv
load_dotenv(find_dotenv())

bucket_name = "S3 BUCKET"        # Bucket to upload to
source_file_name = "PATH FILE"   # File to upload

s3_client = boto3.client(
    "s3",
    aws_access_key_id=os.getenv("AWS_ACCESS_KEY_ID"),
    aws_secret_access_key=os.getenv("AWS_SECRET_ACCESS_KEY"),
)

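# upload_file needs an object name (key); here we reuse the file's base name
target_object_name = os.path.basename(source_file_name)
s3_client.upload_file(source_file_name, bucket_name, target_object_name)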
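The upload_file method takes the local file name, the bucket, and the object name (key) under which the file is stored. As a minimal sketch, the helper below derives the object name from the file name and optionally places it under a key prefix. The names upload_file_to_bucket and prefix are hypothetical; the s3_client argument is assumed to be a client created with boto3.client("s3", ...).

```python
import os


def upload_file_to_bucket(s3_client, source_file_name, bucket_name, prefix=""):
    """Upload a local file and return the object name (key) that was used.

    `upload_file_to_bucket` and `prefix` are hypothetical names; `s3_client`
    is expected to be a client created with `boto3.client("s3", ...)`.
    """
    # Use the file's base name so local folders do not end up in the key
    object_name = os.path.basename(source_file_name)
    if prefix:
        # S3 keys use forward slashes, regardless of the local operating system
        object_name = f"{prefix.strip('/')}/{object_name}"
    s3_client.upload_file(source_file_name, bucket_name, object_name)
    return object_name
```

Returning the object name makes it easy to log where the file ended up, or to pass the same name to a later download.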

Configure dotenv

In the Python examples, we use:

from dotenv import find_dotenv, load_dotenv
load_dotenv(find_dotenv())

These two lines let you develop your Python code locally and run the same code in AskAnna without changes. In AskAnna, the project variables you add become available as environment variables in the run environment.

Locally, you can add a .env file. When you run the Python code on your machine, the environment variables are loaded from this file. Read more about this on the project page of python-dotenv.

To run the examples above, you need a .env file with:

AWS_ACCESS_KEY_ID={ACCESS_KEY}
AWS_SECRET_ACCESS_KEY={SECRET_KEY}

For how to get the values of these variables, see Get access keys for Amazon.
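Because os.getenv returns None for a variable that is not set, a missing entry in the .env file only shows up later, when Boto3 tries to find credentials. As a minimal sketch, the check below reports missing variables up front. The names REQUIRED_VARIABLES and missing_variables are hypothetical and only used for illustration.

```python
import os

# Variable names used in the examples on this page
REQUIRED_VARIABLES = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_variables(names=REQUIRED_VARIABLES, environ=None):
    """Return the names that are unset or empty in the environment.

    `missing_variables` is a hypothetical helper for illustration.
    """
    if environ is None:
        environ = os.environ
    return [name for name in names if not environ.get(name)]


# Example use, after load_dotenv(find_dotenv()):
#
# missing = missing_variables()
# if missing:
#     raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
```

The same check works in AskAnna, where the values come from project variables instead of a .env file.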

More examples

In the Boto3 documentation, you can find more Amazon S3 examples.

Add AskAnna project variables

To run the examples above as a job in AskAnna, add project variables. On the project page, go to the Variables tab, where you can create new variables. Add variables with the following names and their corresponding values:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY

For how to get the values of these variables, see Get access keys for Amazon.

Warning

Make sure to set the variables to masked. You don't want to expose these values.