1. Introduction
In today’s software landscape, interacting with cloud storage services such as Amazon Simple Storage Service (S3) has become a fundamental aspect of many applications. One common requirement is downloading files stored in S3 using a provided URL.
In this article, we’ll explore a simplified approach to achieve this using Java, Spring Boot, and the AWS SDK for Java.
2. Setup
First, we need to configure the AWS credentials that grant access to the S3 bucket. This can be done in several ways. For development purposes, we can set our credentials in the application.properties file:
aws.accessKeyId= <your_access_key_id>
aws.secretKey= <your_secret_access_key>
aws.region= <your_region>
Next, let’s include the S3 module of the AWS SDK for Java as a Maven dependency:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
<version>${amazon.s3.version}</version>
</dependency>
3. Configuring S3 Client
An S3 client is a piece of software or a library that lets us interact with Amazon S3. Since we’re using the AWS SDK for Java, we’ll create the client with the builder API it provides:
S3Client s3Client = S3Client.builder()
.region(Region.US_EAST_1)
.credentialsProvider(DefaultCredentialsProvider.create())
.build();
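The S3 client handles authentication and authorization when interacting with Amazon S3. It uses the credentials provided to authenticate requests to the S3 service. In this case, we’re configuring the client with the default credentials provider, which walks a chain of sources such as Java system properties, environment variables, and the shared credentials file. Note that this provider doesn’t read Spring’s application.properties by itself, so the properties from our setup have to be passed to the client explicitly or exposed through one of those sources. Likewise, the region is hard-coded here rather than taken from aws.region.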
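As a configuration fragment, here is one way the properties from the setup section could be wired into the client in a Spring Boot application. This is a minimal sketch rather than the article’s actual implementation: the class name S3Config is hypothetical, and it assumes the property names aws.accessKeyId, aws.secretKey, and aws.region shown earlier. Static credentials like these are suitable for local development only.

```java
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import software.amazon.awssdk.auth.credentials.AwsBasicCredentials;
import software.amazon.awssdk.auth.credentials.StaticCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;

// Hypothetical configuration class; the name and structure are assumptions.
@Configuration
public class S3Config {

    @Bean
    public S3Client s3Client(
            @Value("${aws.accessKeyId}") String accessKeyId,
            @Value("${aws.secretKey}") String secretKey,
            @Value("${aws.region}") String region) {

        // Wrap the key pair from application.properties in a fixed credentials provider
        AwsBasicCredentials credentials = AwsBasicCredentials.create(accessKeyId, secretKey);

        return S3Client.builder()
            .region(Region.of(region))
            .credentialsProvider(StaticCredentialsProvider.create(credentials))
            .build();
    }
}
```

With a bean like this in place, the service can receive the S3Client through constructor injection instead of building it inline.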
4. Defining Download Service
Let’s define a service that interacts with S3Client for downloads.
First, let’s define the method contract for the service:
public interface FileService {
FileDownloadResponse downloadFile(String s3Url) throws IOException, S3Exception;
}
Now, let’s go through the download steps in sequence.
4.1. Extracting Key And Bucket from URL
In this step, let’s focus on extracting essential information from a valid S3 URL, specifically the bucket name and the object key.
Let’s suppose that we have a file stored in an S3 bucket called “baeldung” with the following path: “path/to/my/s3article.txt”. This represents a hierarchical structure within the S3 bucket, where the object “s3article.txt” is nested within the directories “path”, “to”, and “my”.
To extract this information programmatically, we’ll decode the S3 URL into its components using Java’s URI class. Then, we’ll separate the hostname (the bucket name) and the path (the object key):
URI uri = new URI(s3Url);
String bucketName = uri.getHost();
String objectKey = uri.getPath()
.substring(1);
Considering the earlier example, for the URI “s3://baeldung/path/to/my/s3article.txt”, the extracted bucket name is “baeldung”. The path returned by getPath() is “/path/to/my/s3article.txt”. By using substring(1), we remove the leading “/” character, so the object key becomes “path/to/my/s3article.txt”, which is the format S3 expects for object keys.
In summary, we can now identify the location of the file within the S3 bucket, which lets us construct requests and perform operations on the desired object next.
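The extraction step can be tried in isolation with nothing but the JDK. The following is a minimal sketch, not the article’s implementation: the S3UrlParser and S3Location names are hypothetical, and the scheme check and the getAuthority() fallback are additions made on the assumption that the input might not be a well-formed s3:// URL or that the bucket name might not be a valid hostname (for example, one containing an underscore, for which getHost() returns null).

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper that splits an s3:// URL into bucket and key.
final class S3UrlParser {

    /** Simple holder for the two parts of an S3 location. */
    static final class S3Location {
        final String bucket;
        final String key;

        S3Location(String bucket, String key) {
            this.bucket = bucket;
            this.key = key;
        }
    }

    static S3Location parse(String s3Url) {
        URI uri;
        try {
            uri = new URI(s3Url);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + s3Url, e);
        }
        if (!"s3".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Expected an s3:// URL but got: " + s3Url);
        }
        // getHost() is null when the authority is not a valid hostname,
        // so fall back to the raw authority component in that case.
        String bucket = uri.getHost() != null ? uri.getHost() : uri.getAuthority();
        String path = uri.getPath();
        if (bucket == null || path == null || path.length() <= 1) {
            throw new IllegalArgumentException("URL must contain a bucket and an object key: " + s3Url);
        }
        // Drop the leading "/" so the key matches the form S3 expects
        return new S3Location(bucket, path.substring(1));
    }

    public static void main(String[] args) {
        S3Location location = parse("s3://baeldung/path/to/my/s3article.txt");
        System.out.println(location.bucket); // baeldung
        System.out.println(location.key);    // path/to/my/s3article.txt
    }
}
```

Failing fast on a malformed URL keeps the error close to its cause instead of surfacing later as a confusing S3 error.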
4.2. Building GetObjectRequest
Now, let’s build a GetObjectRequest using the AWS SDK:
GetObjectRequest getObjectRequest = GetObjectRequest.builder()
.bucket(bucketName)
.key(objectKey)
.build();
GetObjectRequest has the information needed to retrieve an object from S3, such as the bucket name and the key of the object to retrieve. It also allows developers to specify additional parameters like version ID, range, response headers, etc., to customize the behavior of the object retrieval process.
4.3. Sending GetObjectRequest
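With the GetObjectRequest prepared, we’ll send it to Amazon S3 to retrieve the object data. The s3ResponseReader used below is a helper that isn’t shown in this article; it’s presumably a thin wrapper around the configured client’s s3Client.getObject(getObjectRequest) call, which returns the same ResponseInputStream<GetObjectResponse> type: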
ResponseInputStream<GetObjectResponse> responseInputStream = s3ResponseReader.readResponse(getObjectRequest);
GetObjectResponse getObjectResponse = responseInputStream.response();
4.4. Response Data and MetaData
After receiving the response from Amazon S3 as ResponseInputStream<GetObjectResponse>, we’ll extract the file content with the associated metadata.
Let’s extract the file content as a byte array first:
byte[] fileContent = IOUtils.toByteArray(responseInputStream);
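Since ResponseInputStream is an InputStream backed by the open HTTP connection, it’s worth closing it once the content has been read. The sketch below illustrates the pattern with try-with-resources and the JDK’s readAllBytes() (Java 9+) instead of IOUtils. The StreamReadSketch class and readFully method are hypothetical names, and a plain InputStream stands in for the S3 response so the example runs without the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; a plain InputStream stands in for ResponseInputStream.
final class StreamReadSketch {

    // Reads the whole stream into memory and closes it, even if reading fails.
    static byte[] readFully(InputStream in) throws IOException {
        try (InputStream stream = in) {
            return stream.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream fake = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        byte[] content = readFully(fake);
        System.out.println(content.length); // 5
    }
}
```

Loading everything into a byte array is convenient for small objects; for large files, streaming the content to its destination would avoid holding the whole object in memory.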
Next, let’s inspect some of the useful metadata using the response:
// Get object metadata
String contentType = getObjectResponse.contentType();
String contentDisposition = getObjectResponse.contentDisposition();
String key = getObjectRequest.key();
String filename = extractFilenameFromKey(key);
String originalFilename = contentDisposition == null ? filename : contentDisposition.substring(contentDisposition.indexOf("=")+1);
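The extractFilenameFromKey helper isn’t shown above, and taking everything after the first “=” of Content-Disposition keeps any surrounding quotes or trailing parameters. Here is a minimal sketch of how both pieces might look, under the assumption that the filename is simply the last segment of the key. The FilenameSketch class and resolveFilename method are hypothetical names, and the parsing is deliberately simple: it ignores the extended filename* form and doesn’t handle semicolons inside quoted values.

```java
// Hypothetical helpers for deriving a download filename.
final class FilenameSketch {

    private static final String FILENAME_PREFIX = "filename=";

    // Assumed behavior: the filename is whatever follows the last "/" in the key.
    static String extractFilenameFromKey(String key) {
        return key.substring(key.lastIndexOf('/') + 1);
    }

    // Prefers the filename parameter of Content-Disposition, else falls back to the key.
    static String resolveFilename(String contentDisposition, String key) {
        String fallback = extractFilenameFromKey(key);
        if (contentDisposition == null) {
            return fallback;
        }
        for (String part : contentDisposition.split(";")) {
            String trimmed = part.trim();
            // Case-insensitive check for a "filename=" parameter
            if (trimmed.regionMatches(true, 0, FILENAME_PREFIX, 0, FILENAME_PREFIX.length())) {
                String value = trimmed.substring(FILENAME_PREFIX.length()).trim();
                // Strip one pair of surrounding double quotes, if present
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }
}
```

For example, resolveFilename("attachment; filename=\"report.pdf\"", "path/to/my/s3article.txt") yields report.pdf, while a null header falls back to s3article.txt.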
Some metadata will be required when sending the response back to the client. We’ll create an abstraction FileDownloadResponse that encapsulates file content as bytes, contentType, and originalFilename:
@Builder
@Data
@RequiredArgsConstructor
public class FileDownloadResponse {
private final byte[] fileContent;
private final String originalFilename;
private final String contentType;
}
If we wish to perform integration tests, we can consider emulating S3 locally using Testcontainers.
5. Conclusion
In this article, we took a quick look at how to download a file from S3 using a provided URL. We used the S3Client from the AWS SDK for Java to access and download S3 resources securely. The SDK simplifies managing S3 buckets and objects by providing a convenient and consistent API for performing various operations on them.
As always, the full implementation of this article can be found over on GitHub.