Managing access to your Amazon S3 data is crucial for ensuring security and efficiency in your cloud architecture. You typically have three options for granting client applications access to S3 data: serving it through an Amazon CloudFront distribution, issuing S3 presigned URLs, or routing requests through your backend APIs.
In this blog post, we will delve into S3 presigned URLs as an effective method for implementing tiered access to your S3 data. We'll discuss their benefits and limitations and walk through a straightforward example of their use.
What Are S3 Presigned URLs?
S3 presigned URLs are an S3 feature that allows you to grant temporary access to your S3 data without modifying bucket policies. When access to an S3 object is requested, your backend generates a URL whose query string carries the signature and credential information. Anyone holding this URL can upload or retrieve data at the specified S3 location until the URL expires.
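As a quick sketch, the AWS CLI can presign a download URL in a single call (`my-bucket` and the key below are placeholders; note the CLI's `presign` command only supports GET downloads, while SDKs such as boto3 can presign other operations):

```bash
# generate a presigned GET URL for a single object, valid for 15 minutes;
# "my-bucket" and "example.jpg" are placeholder names
aws s3 presign "s3://my-bucket/example.jpg" --expires-in 900
```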
Benefits and Limitations
Benefits:
Programmatic Access: Access is granted programmatically, with no need to update infrastructure or bucket policies.
Flexibility: Authorization checks are handled by your backend API, which issues the presigned URL.
Offloading Backend Tasks: AWS manages data downloads/uploads, reducing the load on your backend.
Cost Efficiency: No need to set up and maintain a CloudFront distribution.
Reusability: The URL can be reused multiple times within its validity period.
Expiration Control: URLs can be configured with expiration times to limit access duration.
Data Integrity: Support for checksums ensures data integrity during transfers.
Limitations:
File-Specific Access Only: Each URL grants access to a single object; prefix- or folder-level access is not supported.
Operation Constraints: A URL is signed for one specific operation, typically PUT or GET; a single URL cannot authorize multiple operations.
URL Changes: URLs are generated dynamically and differ on every request, so clients cannot hard-code or cache them indefinitely.
Architecture
Fig 1. Presigned URLs architecture
To implement presigned URLs, you need an API that issues these URLs. This API is responsible for authenticating users and determining if they are authorized to access the requested object.
The flow works as follows:

1. The user requests PUT/GET access to an S3 object.
2. The API authenticates the user and verifies their permissions for the requested S3 object.
3. If the user is authorized, the backend generates a presigned URL with PUT or GET permissions for the object and returns it to the client; if not, it returns a 403 Forbidden status code.
4. The user then uses the presigned URL to interact with the S3 bucket directly via GET/PUT HTTP methods.

Note that the service issuing the presigned URLs, such as an AWS Lambda function, must itself have the corresponding permissions on the S3 bucket, since a presigned URL can grant no more access than the credentials that signed it.
Example: Tiered S3 Access With Presigned URLs
In this example, we will deploy the sample architecture shown in Fig 1 and simulate tiered access to the S3 bucket.
You will need programmatic access to your AWS account and the AWS SAM CLI installed.
API With Cognito Authorizer
First, let’s define the API Gateway with a Cognito authorizer. To simplify the process, we will add two users directly into the SAM template. However, note that this approach is not recommended for a production environment.
Different access levels will be granted to the two users: the first user will be able to access only tier 1 content, while the second user will have access to both tier 1 and tier 2 content.
The Lambda function below authorizes the request and issues the presigned URL:

```python
import json
import os
import re

import boto3
from botocore.exceptions import ClientError


def handler(event: dict, context: object) -> dict:
    key = event["queryStringParameters"]["key"]
    username = event["requestContext"]["authorizer"]["claims"]["email"]

    # access only to tier 1 and tier 2 folders permitted
    if not re.match(r"^(tier 1|tier 2)/.*jpg$", key):
        return {
            "statusCode": 403,
            "body": json.dumps(
                {"message": "Not authorized to access non tier folders."}
            ),
        }

    # limit first user to tier 1 content
    if key.startswith("tier 2") and username != os.environ["TIER_TWO_USERNAME"]:
        return {
            "statusCode": 403,
            "body": json.dumps({"message": "Not authorized to access tier 2 data"}),
        }

    try:
        # generate the presigned url
        s3_client = boto3.client("s3")
        presigned_url = s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": os.environ["BUCKET_NAME"], "Key": key},
            ExpiresIn=3600,
        )
        return {
            "statusCode": 200,
            "body": json.dumps({"url": presigned_url}),
        }
    except ClientError as e:
        print(f"Error: {str(e)}")
        return {
            "statusCode": 500,
            "body": json.dumps({"message": "Internal server error"}),
        }
```
To keep the example self-contained, the same function is defined inline in the SAM template.
```yaml
Transform: AWS::Serverless-2016-10-31

Parameters:
  CognitoUserOneEmail:
    Description: Email address of the first created user
    Type: String
  CognitoUserTwoEmail:
    Description: Email address of the second created user
    Type: String
  StageName:
    Description: Api stage name
    Type: String
    Default: "dev"

Resources:
  MediaBucket:
    Type: AWS::S3::Bucket
    Properties: {}

  MyApi:
    Type: AWS::Serverless::Api
    Properties:
      StageName: !Ref StageName
      Cors:
        AllowMethods: "'*'"
        AllowHeaders: "'*'"
        AllowOrigin: "'*'"
      Auth:
        Authorizers:
          CognitoAuthorizer:
            UserPoolArn: !GetAtt UserPool.Arn

  UserPool:
    Type: AWS::Cognito::UserPool
    Properties:
      Policies:
        PasswordPolicy:
          MinimumLength: 8
          RequireLowercase: true
          RequireNumbers: true
          RequireSymbols: true
          RequireUppercase: true
      UsernameAttributes:
        - email
      Schema:
        - AttributeDataType: String
          Name: email
          Required: false

  UserPoolClient:
    Type: AWS::Cognito::UserPoolClient
    Properties:
      UserPoolId: !Ref UserPool
      GenerateSecret: false
      ExplicitAuthFlows:
        - ALLOW_USER_PASSWORD_AUTH
        - ALLOW_REFRESH_TOKEN_AUTH

  UserPoolUserOne:
    Type: AWS::Cognito::UserPoolUser
    Properties:
      DesiredDeliveryMediums:
        - EMAIL
      Username: !Ref CognitoUserOneEmail
      UserPoolId: !Ref UserPool

  UserPoolUserTwo:
    Type: AWS::Cognito::UserPoolUser
    Properties:
      DesiredDeliveryMediums:
        - EMAIL
      Username: !Ref CognitoUserTwoEmail
      UserPoolId: !Ref UserPool

  IssuerFunction:
    Type: AWS::Serverless::Function
    Properties:
      Runtime: python3.10
      Handler: index.handler
      Policies:
        - S3CrudPolicy:
            BucketName: !Ref MediaBucket
      Events:
        ApiEvent:
          Type: Api
          Properties:
            RestApiId: !Ref MyApi
            Path: /presignedurl
            Method: GET
            RequestParameters:
              - method.request.querystring.key:
                  Required: true
                  Caching: false
            Auth:
              Authorizer: CognitoAuthorizer
      Environment:
        Variables:
          BUCKET_NAME: !Ref MediaBucket
          TIER_ONE_USERNAME: !Ref UserPoolUserOne
          TIER_TWO_USERNAME: !Ref UserPoolUserTwo
      InlineCode: |
        import json
        import os
        import re

        import boto3
        from botocore.exceptions import ClientError


        def handler(event: dict, context: object) -> dict:
            key = event["queryStringParameters"]["key"]
            username = event["requestContext"]["authorizer"]["claims"]["email"]

            # access only to tier 1 and tier 2 folders permitted
            if not re.match(r"^(tier 1|tier 2)/.*jpg$", key):
                return {
                    "statusCode": 403,
                    "body": json.dumps(
                        {"message": "Not authorized to access non tier folders."}
                    ),
                }

            # limit first user to tier 1 content
            if key.startswith("tier 2") and username != os.environ["TIER_TWO_USERNAME"]:
                return {
                    "statusCode": 403,
                    "body": json.dumps({"message": "Not authorized to access tier 2 data"}),
                }

            try:
                # generate the presigned url
                s3_client = boto3.client("s3")
                presigned_url = s3_client.generate_presigned_url(
                    "get_object",
                    Params={"Bucket": os.environ["BUCKET_NAME"], "Key": key},
                    ExpiresIn=3600,
                )
                return {
                    "statusCode": 200,
                    "body": json.dumps({"url": presigned_url}),
                }
            except ClientError as e:
                print(f"Error: {str(e)}")
                return {
                    "statusCode": 500,
                    "body": json.dumps({"message": "Internal server error"}),
                }

Outputs:
  MyApiUrl:
    Description: Url of api gateway
    Value: !Sub "https://${MyApi}.execute-api.${AWS::Region}.amazonaws.com/${StageName}"
  CognitoUserPoolClientId:
    Description: Cognito user pool client id
    Value: !Ref UserPoolClient
  UserPoolId:
    Description: User pool id
    Value: !Ref UserPool
  BucketName:
    Description: Bucket name
    Value: !Ref MediaBucket
```
Assign valid email addresses to both users. These addresses must be functional since temporary passwords will be sent to them. Proceed to deploy the infrastructure.
```bash
# set your emails here
STACK_NAME="s3-pre-signed-urls"
COGNITO_USER_ONE_EMAIL="me+user1@example.com"
COGNITO_USER_TWO_EMAIL="me+user2@example.com"

# deploy the stack
sam deploy \
  --parameter-overrides CognitoUserOneEmail=$COGNITO_USER_ONE_EMAIL CognitoUserTwoEmail=$COGNITO_USER_TWO_EMAIL \
  --capabilities CAPABILITY_IAM \
  --stack-name $STACK_NAME \
  -t template.sam.yaml
```
Let's copy a cute cat picture to the tier 1 folder and a super cute cat picture to the tier 2 folder in the S3 bucket.
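One way to do this, assuming the two pictures are saved locally as cute-cat.jpg and super-cute-cat.jpg (the key names the test requests below expect):

```bash
# read the bucket name from the stack outputs
BUCKET_NAME=$(aws cloudformation describe-stacks --stack-name $STACK_NAME \
  --query "Stacks[0].Outputs[?OutputKey=='BucketName'].OutputValue" --output text)

# copy the images into the tier folders (note the space in each key prefix)
aws s3 cp cute-cat.jpg "s3://$BUCKET_NAME/tier 1/cute-cat.jpg"
aws s3 cp super-cute-cat.jpg "s3://$BUCKET_NAME/tier 2/super-cute-cat.jpg"
```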
Next, set the API URL and the authorization token, then test access for both tiers. Whenever a presigned URL is returned, use it to download the image.
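A sketch of one way to set these variables, assuming user one has already signed in and replaced the temporary password emailed by Cognito (the permanent password is held here in the placeholder $USER_ONE_PASSWORD):

```bash
# read the api url and cognito app client id from the stack outputs
API_URL=$(aws cloudformation describe-stacks --stack-name $STACK_NAME \
  --query "Stacks[0].Outputs[?OutputKey=='MyApiUrl'].OutputValue" --output text)
COGNITO_CLIENT_ID=$(aws cloudformation describe-stacks --stack-name $STACK_NAME \
  --query "Stacks[0].Outputs[?OutputKey=='CognitoUserPoolClientId'].OutputValue" --output text)

# exchange user one's credentials for an id token
TOKEN_ID=$(aws cognito-idp initiate-auth \
  --auth-flow USER_PASSWORD_AUTH \
  --client-id "$COGNITO_CLIENT_ID" \
  --auth-parameters USERNAME=$COGNITO_USER_ONE_EMAIL,PASSWORD=$USER_ONE_PASSWORD \
  --query "AuthenticationResult.IdToken" --output text)
```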
```bash
# send the token as part of the Authorization header when requesting resources
curl -G -H "Authorization: Bearer $TOKEN_ID" \
  --data-urlencode "key=tier 1/cute-cat.jpg" "$API_URL/presignedurl"

TIER1_GET_URL=$(curl -G -H "Authorization: Bearer $TOKEN_ID" \
  --data-urlencode "key=tier 1/cute-cat.jpg" "$API_URL/presignedurl" | jq -r '.url')
curl -L --output cute-cat.jpg "$TIER1_GET_URL"

curl -G -H "Authorization: Bearer $TOKEN_ID" \
  --data-urlencode "key=tier 2/super-cute-cat.jpg" "$API_URL/presignedurl"

# if authorized, download the image
TIER2_GET_URL=$(curl -G -H "Authorization: Bearer $TOKEN_ID" \
  --data-urlencode "key=tier 2/super-cute-cat.jpg" "$API_URL/presignedurl" | jq -r '.url')
curl -L --output super-cute-cat.jpg "$TIER2_GET_URL"
```
Repeat the same steps for user two; unlike user one, user two should also receive a presigned URL for the tier 2 image.
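A sketch of switching users, again assuming the temporary password has been replaced and the permanent one is held in the placeholder $USER_TWO_PASSWORD:

```bash
# exchange user two's credentials for an id token
TOKEN_ID=$(aws cognito-idp initiate-auth \
  --auth-flow USER_PASSWORD_AUTH \
  --client-id "$COGNITO_CLIENT_ID" \
  --auth-parameters USERNAME=$COGNITO_USER_TWO_EMAIL,PASSWORD=$USER_TWO_PASSWORD \
  --query "AuthenticationResult.IdToken" --output text)

# this time the tier 2 request should return a presigned url
curl -G -H "Authorization: Bearer $TOKEN_ID" \
  --data-urlencode "key=tier 2/super-cute-cat.jpg" "$API_URL/presignedurl"
```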
Cleanup
When you are done testing, clean up the resources.
```bash
# delete files in the s3 bucket
aws s3 rm "s3://$BUCKET_NAME" --recursive

# delete the stack
sam delete --stack-name $STACK_NAME
```
Conclusion
Leveraging S3 presigned URLs can streamline access management and offload operational tasks to AWS, allowing you to focus on building robust and scalable applications.