# AWS S3

The AWS S3 connector provides Apache Pekko Stream sources and sinks to connect to [Amazon S3](https://aws.amazon.com/s3/).
S3 stands for Simple Storage Service and is an object storage service with a web service interface.

@@project-info{ projectId="s3" }

## Artifacts

@@dependency [sbt,Maven,Gradle] {
  group=org.apache.pekko
  artifact=pekko-connectors-s3_$scala.binary.version$
  version=$project.version$
  symbol2=PekkoVersion
  value2=$pekko.version$
  group2=org.apache.pekko
  artifact2=pekko-stream_$scala.binary.version$
  version2=PekkoVersion
  symbol3=PekkoHttpVersion
  value3=$pekko-http.version$
  group3=org.apache.pekko
  artifact3=pekko-http_$scala.binary.version$
  version3=PekkoHttpVersion
  group4=org.apache.pekko
  artifact4=pekko-http-xml_$scala.binary.version$
  version4=PekkoHttpVersion
}

The table below shows direct dependencies of this module and the second tab shows all libraries it depends on transitively.

@@dependencies { projectId="s3" }

## Configuration

The settings for the S3 connector are read by default from `pekko.connectors.s3` configuration section.
Credentials are loaded as described in the @javadoc[DefaultCredentialsProvider](software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider) documentation.
Therefore, if you are using Apache Pekko Connectors S3 connector in a standard environment, no configuration changes should be necessary.
However, if you use a non-standard configuration path or need multiple different configurations, please refer to @ref[the attributes section below](s3.md#apply-s3-settings-to-a-part-of-the-stream) to see how to apply different configuration to different parts of the stream.
All of the available configuration settings can be found in the @github[reference.conf](/s3/src/main/resources/reference.conf).

## Store a file in S3

A file can be uploaded to S3 by creating a source of @apidoc[org.apache.pekko.util.ByteString] and running that with a sink created from @apidoc[S3.multipartUpload](S3$).

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SinkSpec.scala) { #upload }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #upload }

## Download a file from S3

A source for downloading a file can be created by calling @apidoc[S3.download](S3$).
It will emit an @scala[`Option`]@java[`Optional`] that will hold file's data and metadata or will be empty if no such file can be found.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #download }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #download }

In order to download a range of a file's data you can use overloaded method which
additionally takes `ByteRange` as argument.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #rangedDownload }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #rangedDownload }

File metadata (@apidoc[ObjectMetadata](org.apache.pekko.stream.connectors.s3.ObjectMetadata)) holds content type, size and other useful information about the object.
Here's an example of using this metadata to stream an object back to a client in Apache Pekko Http.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #downloadToPekkoHttp }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #downloadToPekkoHttp }

## Access object metadata without downloading object from S3

If you do not need the object itself, you can query for only object metadata using a source from @apidoc[S3.getObjectMetadata](S3$).

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #objectMetadata }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #objectMetadata }

## List bucket contents

To get a list of all objects in a bucket, use @apidoc[S3.listBucket](S3$).
When run, this will give a stream of @scaladoc[ListBucketResultContents](org.apache.pekko.stream.connectors.s3.ListBucketResultContents).

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #list-bucket }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #list-bucket }

## List bucket contents and common prefixes

To get a list of the contents and common prefixes for one hierarchy level using a delimiter, use @scala[@scaladoc[S3.listBucketAndCommonPrefixes](org.apache.pekko.stream.connectors.s3.scaladsl.S3$)]@java[@scaladoc[S3.listBucketAndCommonPrefixes](org.apache.pekko.stream.connectors.s3.javadsl.S3$)].
When run, this will give a tuple stream of (Seq[@apidoc[ListBucketResultContents]], Seq[@apidoc[ListBucketResultCommonPrefixes]]).

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #list-bucket-and-common-prefixes }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #list-bucket-and-common-prefixes }

## Copy upload (multi part)

Copy an S3 object from source bucket to target bucket using @scala[@scaladoc[S3.multipartCopy](org.apache.pekko.stream.connectors.s3.scaladsl.S3$)]@java[@scaladoc[S3.multipartCopy](org.apache.pekko.stream.connectors.s3.javadsl.S3$)].
When run, this will emit a single @scaladoc[MultipartUploadResult](org.apache.pekko.stream.connectors.s3.MultipartUploadResult) with the information about the copied object.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SinkSpec.scala) { #multipart-copy }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #multipart-copy }

If your bucket has versioning enabled, you could have multiple versions of the same object.
By default AWS identifies the current version of the object to copy.
You can optionally specify a specific version of the source object to copy by adding the `sourceVersionId` parameter.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SinkSpec.scala) { #multipart-copy-with-source-version }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #multipart-copy-with-source-version }

Different options are available for server side encryption in the @scaladoc[ServerSideEncryption](org.apache.pekko.stream.connectors.s3.headers.ServerSideEncryption$) factory.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SinkSpec.scala) { #multipart-copy-sse }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #multipart-copy-sse }

More S3 specific headers and arbitrary HTTP headers can be specified by adding to the @scaladoc[S3Headers](org.apache.pekko.stream.connectors.s3.S3Headers) container.

## Apply S3 settings to a part of the stream

It is possible to make one part of the stream use different @apidoc[S3Settings$] from the rest of the graph.
This can be useful, when one stream is used to copy files across regions or even different S3 compatible endpoints.
You can attach a custom `S3Settings` instance or a custom config path to a graph using attributes from @apidoc[S3Attributes$]:

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #list-bucket-attributes }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #list-bucket-attributes }


## Bucket management

Bucket management API provides functionality for both Sources and Futures / CompletionStages. 
In case of the Future API user can specify attributes to the request in the method itself and as for Sources it can be done via method `.withAttributes`.
For more information about attributes see: @apidoc[S3Attributes$] and @apidoc[Attributes](org.apache.pekko.stream.Attributes)

### Make bucket
In order to create a bucket in S3 you need to specify its unique name. This value has to be set accordingly to the [requirements](https://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html#bucketnamingrules).
The bucket will be created in the region specified in the settings.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #make-bucket }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #make-bucket }


### Delete bucket
To delete a bucket you need to specify its name and the bucket needs to be empty.

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #delete-bucket }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #delete-bucket }


### Check if bucket exists
It is possible to check if a bucket exists and the user has rights to perform a `listBucket` operation.

There are 3 possible outcomes:

- The user has access to the existing bucket, then it will return `AccessGranted`
- The user doesn't have access but the bucket exists so `AccessDenied` will be returned
- The bucket doesn't exist, the method will return `NotExists`

Scala
: @@snip [snip](/s3/src/test/scala/docs/scaladsl/S3SourceSpec.scala) { #check-if-bucket-exists }

Java
: @@snip [snip](/s3/src/test/java/docs/javadsl/S3Test.java) { #check-if-bucket-exists }


## Running the example code

The code in this guide is part of runnable tests of this project. You are welcome to edit the code and run it in sbt.

Scala
:   ```
    sbt
    > s3/test
    ```

Java
:   ```
    sbt
    > s3/test
    ```

> Some test code requires [docker](https://www.docker.com/) to be installed and running. Please read either the
> [official instructions](https://www.docker.com/get-started/) or refer to your Linux distro.
