How to Upload And Download Files From AWS S3 Using Python (2022)
Learn how to use cloud resources in your Python scripts
I am writing this post out of sheer frustration.
Every post I've read on this topic assumed that I already had an account in AWS, an S3 bucket, and a mound of stored data. They simply show the code but kindly skip over the most important part: making the code work through your AWS account.
Well, I could've figured out the code easily, thank you very much. I had to sift through many SO threads and the AWS docs to get rid of every nasty authentication error along the way.
So that you won't feel the same and do the hard work, I will share all the technicalities of managing an S3 bucket programmatically, right from account creation to adding permissions to your local machine to access your AWS resources.
Step 1: Set up an account
Right, let's start with creating your AWS account if you haven't already. Nothing unusual, just follow the steps from this link:
Then, we will go to the AWS IAM (Identity and Access Management) console, where we will be doing most of the work.
You can easily switch between different AWS services, create users, add policies, and allow access to your user account from the console. We will do each one by one.
Step 2: Create a user
For one AWS account, you can create multiple users, and each user can have various levels of access to your account's resources. Let's create a sample user for this tutorial:
In the IAM console:
- Get to the Users tab.
- Click on Add users.
- Enter a username in the field.
- Tick the "Access key — Programmatic access" field (essential).
- Click "Next" and "Attach existing policies directly."
- Tick the "AdministratorAccess" policy.
- Click "Next" until you see the "Create user" button.
- Finally, download the given CSV file of your user's credentials.
It should look like this:
Store it somewhere safe because we will be using the credentials later.
Step 3: Create a bucket
Now, let's create an S3 bucket where we can store data.
In the IAM console:
- Click Services in the top left corner.
- Scroll down to Storage and select S3 from the right-hand list.
- Click "Create bucket" and give it a name.
You can choose any region you want. Leave the rest of the settings and click "Create bucket" again.
Step 4: Create a policy and add it to your user
In AWS, access is managed through policies. A policy can be a set of settings or a JSON file attached to an AWS object (user, resource, group, role), and it controls which aspects of the object you can use.
Below, we will create a policy that enables us to interact with our bucket programmatically, i.e., through the CLI or in a script.
In the IAM console:
- Go to the Policies tab and click "Create a policy."
- Click the "JSON" tab and insert the code below:
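A minimal policy of the shape described here (full `s3:*` access scoped to a single bucket) could look like the following; `your-bucket-name` is a placeholder:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowFullBucketAccess",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::your-bucket-name",
                "arn:aws:s3:::your-bucket-name/*"
            ]
        }
    ]
}
```

Note that two ARNs are listed: the first covers bucket-level operations (like listing), and the second, with `/*`, covers the objects inside it.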
replacing your-bucket-name with your own. If you pay attention, in the Action field of the JSON, we are putting s3:* to allow any interaction with our bucket. This is very broad, so you may want to allow only specific actions. In that case, check out this page of the AWS docs to learn how to limit access.
This policy is only attached to the bucket, and we should connect it to the user as well so that your API credentials work correctly. Here are the instructions:
In the IAM console:
- Go to the Users tab and click on the user we created in the last section.
- Click the "Add permissions" button.
- Click the "Attach existing policies" tab.
- Filter them by the policy we just created.
- Tick the policy, review it, and click "Add" one final time.
Step 5: Download the AWS CLI and configure your user
We download the AWS command-line tool because it makes authentication so much easier. Kindly go to this page and download the executable for your platform:
Run the executable and reopen any active terminal sessions to let the changes take effect. Then, type aws configure:
Insert your AWS Access Key ID and Secret Access Key, along with the region you created your bucket in (use the CSV file). You can find the region name of your bucket on the S3 page of the console:
Just press "Enter" when you reach the Default Output Format field in the configuration. There won't be any output.
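The configuration session looks roughly like this (the values shown are placeholders, not real credentials):

```
$ aws configure
AWS Access Key ID [None]: AKIA****************
AWS Secret Access Key [None]: ****************************************
Default region name [None]: us-east-1
Default output format [None]:
```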
Step 6: Upload your files
We are nearly there.
Now, we upload a sample dataset to our bucket so that we can download it in a script later:
It should be easy once you go to the S3 page and open your bucket.
Step 7: Check if authentication is working
Finally, pip install the Boto3 package and run this snippet:
If the output contains your bucket name(s), congratulations! You now have full access to many AWS services through boto3, not just S3.
Using Python Boto3 to download files from the S3 bucket
With the Boto3 package, you have programmatic access to many AWS services such as SQS, EC2, SES, and many aspects of the IAM console.
However, as a regular data scientist, you will mostly need to upload and download data from an S3 bucket, so we will only cover those operations.
Let's start with the download. After importing the package, create an S3 client using the client function:
To download a file from an S3 bucket and immediately save it, we can use the download_file function:
There won't be any output if the download is successful. You should pass the exact file path of the file to be downloaded to the Key parameter. The Filename should contain the path you want to save the file to.
Uploading is also very straightforward:
The function is upload_file, and you only have to change the order of the parameters from the download function.
Conclusion
I suggest reading the Boto3 docs for more advanced examples of managing your AWS resources. It covers services other than S3 and contains code recipes for the most common tasks with each one.
Thanks for reading!
You can become a premium Medium member using the link below and get access to all of my stories and thousands of others:
Or just subscribe to my email list:
You can reach out to me on LinkedIn or Twitter for a friendly chat about all things data. Or you can just read another story from me. How about these: