Java SDK for SQL API of Azure Cosmos DB
- Consuming the official Microsoft Azure Cosmos DB Java SDK
- Prerequisites
- API Documentation
- Usage Code Sample
- Guide for Prod
- Future, CompletableFuture, and ListenableFuture
- Checking out the Source Code
- FAQ
- Release changes
- Contribution and Feedback
- License
Consuming the official Microsoft Azure Cosmos DB Java SDK
This project provides an SDK library in Java for interacting with the SQL API of the Azure Cosmos DB Database Service. The project also includes samples, tools, and utilities.
Jar dependency binary information for Maven and Gradle can be found here at maven.
For example, using Maven, you can add the following dependency to your pom file:
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>azure-cosmosdb</artifactId>
<version>2.6.11</version>
</dependency>
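If you build with Gradle instead, the equivalent dependency uses the same coordinates as the Maven snippet above (the configuration name, `compile` vs. `implementation`, depends on your Gradle version):

```groovy
// build.gradle -- same group/artifact/version as the Maven dependency above
dependencies {
    compile 'com.microsoft.azure:azure-cosmosdb:2.6.11'
}
```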
Useful links:
- Sample Get Started APP
- Introduction to Resource Model of Azure Cosmos DB Service
- Introduction to SQL API of Azure Cosmos DB Service
- SDK JavaDoc API
- RxJava Observable JavaDoc API
- SDK FAQ
Prerequisites
- Java Development Kit 8
- An active Azure account. If you don't have one, you can sign up for a free account. Alternatively, you can use the Azure Cosmos DB Emulator for development and testing. Because the emulator's HTTPS certificate is self-signed, you need to import it into the Java trusted certificate store, as explained here.
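As a sketch, that import can be done with the JDK's keytool; the alias, certificate file name, and keystore path below are placeholders you would adjust to your JDK install and to wherever you exported the emulator certificate (see the linked emulator docs for the exact steps):

```shell
# Import the exported emulator certificate into the JDK's default trust store.
# "changeit" is the default cacerts password; paths below are examples only.
keytool -importcert -alias cosmosdb-emulator \
    -file cosmosdb-emulator.cer \
    -keystore "$JAVA_HOME/jre/lib/security/cacerts" \
    -storepass changeit
```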
- (Optional) SLF4J is a logging facade.
- (Optional) SLF4J binding is used to associate a specific logging framework with SLF4J.
- (Optional) Maven
SLF4J is only needed if you plan to use logging. In that case, also download an SLF4J binding, which links the SLF4J API with the logging implementation of your choice. See the SLF4J user manual for more information.
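For example, to route SLF4J to log4j 1.2 you could add a binding dependency such as the following to your pom (the version number is illustrative; pick one current for your project):

```xml
<!-- SLF4J-to-log4j binding; any other SLF4J binding works equally well -->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.25</version>
</dependency>
```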
API Documentation
Javadoc is available here.
The SDK provides a Reactive Extensions (Rx) Observable-based async API. You can read more about RxJava and the Observable APIs here.
Usage Code Sample
Code Sample for creating a Document:
import com.microsoft.azure.cosmosdb.rx.*;
import com.microsoft.azure.cosmosdb.*;
ConnectionPolicy policy = new ConnectionPolicy();
policy.setConnectionMode(ConnectionMode.Direct);
AsyncDocumentClient asyncClient = new AsyncDocumentClient.Builder()
.withServiceEndpoint(HOST)
.withMasterKeyOrResourceToken(MASTER_KEY)
.withConnectionPolicy(policy)
.withConsistencyLevel(ConsistencyLevel.Eventual)
.build();
Document doc = new Document(String.format("{ 'id': 'doc%d', 'counter': '%d'}", 1, 1));
Observable<ResourceResponse<Document>> createDocumentObservable =
asyncClient.createDocument(collectionLink, doc, null, false);
createDocumentObservable
.single() // we know there will be one response
.subscribe(
documentResourceResponse -> {
System.out.println(documentResourceResponse.getRequestCharge());
},
error -> {
System.err.println("an error happened: " + error.getMessage());
});
We have a get started sample app available here.
We also have more examples, in the form of standalone unit tests, in the examples project.
Guide for Prod
To achieve better performance and higher throughput, there are a few tips that are helpful to follow:
Use Appropriate Scheduler (Avoid stealing Eventloop IO Netty threads)
The SDK uses netty for non-blocking IO, with a fixed number of IO netty eventloop threads (as many as your machine has CPU cores) for executing IO operations.
The Observable returned by the API emits its result on one of these shared IO eventloop netty threads, so it is important not to block them. Doing CPU-intensive work or a blocking operation on an IO eventloop netty thread may cause a deadlock or significantly reduce SDK throughput.
For example, the following code executes CPU-intensive work on the eventloop IO netty thread:
Observable<ResourceResponse<Document>> createDocObs = asyncDocumentClient.createDocument(
collectionLink, document, null, true);
createDocObs.subscribe(
resourceResponse -> {
//this is executed on eventloop IO netty thread.
//the eventloop thread is shared and is meant to return back quickly.
//
// DON'T do this on eventloop IO netty thread.
veryCpuIntensiveWork();
});
After the result is received, if you want to do CPU-intensive work on it, you should avoid doing so on the eventloop IO netty thread. Instead, provide your own Scheduler whose threads run your work:
import rx.schedulers.Schedulers;
Observable<ResourceResponse<Document>> createDocObs = asyncDocumentClient.createDocument(
collectionLink, document, null, true);
createDocObs.observeOn(Schedulers.computation())
    .subscribe(
        resourceResponse -> {
            // this is executed on threads provided by Schedulers.computation()
            // Schedulers.computation() should be used only when the work is CPU intensive
            // and you are not doing blocking IO, thread sleeps, etc. on these threads.
            veryCpuIntensiveWork();
        });
Based on the type of your work, you should use the appropriate existing RxJava Scheduler. Read more about RxJava Schedulers here.
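The same hand-off pattern can be illustrated with JDK types alone (no SDK or RxJava involved): the thread that delivers a result should only hand it to a worker pool, never run the heavy work itself. A minimal, illustrative sketch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OffloadExample {
    // Stand-in for CPU-intensive post-processing of a response.
    static int veryCpuIntensiveWork(int input) {
        return input * input;
    }

    // The delivering thread (here: the caller) only submits; the heavy work
    // runs on a dedicated worker pool, keeping the delivering thread free.
    public static int processOffloaded(int input) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            Future<Integer> result = workers.submit(() -> veryCpuIntensiveWork(input));
            return result.get();
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processOffloaded(7)); // 49
    }
}
```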
Disable netty's logging
Netty library logging is very chatty and needs to be turned off (suppressing it in the logging configuration may not be enough) to avoid additional CPU costs. If you are not in debugging mode, disable netty's logging altogether. For example, if you are using log4j, add the following line to your codebase to remove the additional CPU cost incurred by org.apache.log4j.Category.callAppenders() from netty:
org.apache.log4j.Logger.getLogger("io.netty").setLevel(org.apache.log4j.Level.OFF);
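If you prefer to keep this in configuration rather than code, the equivalent log4j 1.x entry (assuming a log4j.properties file on your classpath) is:

```properties
# Silence netty's chatty logging
log4j.logger.io.netty=OFF
```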
OS Open files Resource Limit
Some Linux systems (such as Red Hat) have an upper limit on the number of open files, and therefore on the total number of connections. Run the following to view the current limits:
ulimit -a
The number of open files (nofile) needs to be large enough to accommodate your configured connection pool size and the other files opened by the OS. It can be modified to allow for a larger connection pool size.
Open the limits.conf file:
vim /etc/security/limits.conf
Add/modify the following lines:
* - nofile 100000
Use native SSL implementation for netty
Netty can use OpenSSL directly as its SSL implementation stack to achieve better performance. In the absence of this configuration, netty falls back to Java's default SSL implementation.
On Ubuntu:
sudo apt-get install openssl
sudo apt-get install libapr1
then add the following dependency to your project's Maven dependencies:
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-tcnative</artifactId>
<version>2.0.20.Final</version>
<classifier>linux-x86_64</classifier>
</dependency>
For other platforms (Red Hat, Windows, Mac, etc.), please refer to the instructions on netty's wiki.
Using system properties to modify default Direct TCP options
We have added the ability to modify the default Direct TCP options used by the SDK. They are taken, in priority order, from:
- The JSON value of the system property azure.cosmos.directTcp.defaultOptions. Example:
java -Dazure.cosmos.directTcp.defaultOptions={\"idleEndpointTimeout\":\"PT24H\"} -jar target/cosmosdb-sdk-testing-1.0-jar-with-dependencies.jar Direct 10 0 Read
- The contents of the JSON file located by the system property azure.cosmos.directTcp.defaultOptionsFile. Example:
java -Dazure.cosmos.directTcp.defaultOptionsFile=/path/to/default/options/file -jar Direct 10 0 Query
- The contents of the JSON resource file named azure.cosmos.directTcp.defaultOptions.json. Specifically, the resource file is read from this stream:
RntbdTransportClient.class.getClassLoader().getResourceAsStream("azure.cosmos.directTcp.defaultOptions.json")
Example contents of the resource file azure.cosmos.directTcp.defaultOptions.json:
{
  "bufferPageSize": 8192,
  "connectionTimeout": "PT1M",
  "idleChannelTimeout": "PT0S",
  "idleEndpointTimeout": "PT1M10S",
  "maxBufferCapacity": 8388608,
  "maxChannelsPerEndpoint": 10,
  "maxRequestsPerChannel": 30,
  "receiveHangDetectionTime": "PT1M5S",
  "requestExpiryInterval": "PT5S",
  "requestTimeout": "PT1M",
  "requestTimerResolution": "PT0.5S",
  "sendHangDetectionTime": "PT10S",
  "shutdownTimeout": "PT15S"
}
Values that are in error are ignored.
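The timeout values in these options ("PT24H", "PT1M10S", ...) are ISO-8601 durations, which the JDK can parse directly. A small sketch for sanity-checking a value before putting it in the options JSON:

```java
import java.time.Duration;

public class DirectTcpDurations {
    // Parse an ISO-8601 duration string (the format used by idleEndpointTimeout,
    // requestTimeout, etc.) and return its length in milliseconds.
    public static long toMillis(String iso8601) {
        return Duration.parse(iso8601).toMillis();
    }

    public static void main(String[] args) {
        System.out.println(toMillis("PT1M10S")); // 70000
        System.out.println(toMillis("PT0.5S"));  // 500
    }
}
```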
Common Perf Tips
There is a set of common performance tips written for our sync SDK. The majority of them also apply to the async SDK; they are available here.
Future, CompletableFuture, and ListenableFuture
The SDK provides a Reactive Extensions (Rx) Observable-based async API.
The Rx API has advantages over Future-based APIs, but if you wish to use Futures,
you can translate Observables to Java Futures:
// You can convert an Observable to a ListenableFuture.
// ListenableFuture (part of google guava library) is a popular extension
// of Java's Future which allows registering listener callbacks:
// https://github.com/google/guava/wiki/ListenableFutureExplained
import rx.observable.ListenableFutureObservable;
Observable<ResourceResponse<Document>> createDocObservable = asyncClient.createDocument(
collectionLink, document, null, false);
// NOTE: if you are going to do CPU-intensive work
// on the result thread, consider changing the scheduler; see the
// "Use Appropriate Scheduler (Avoid stealing Eventloop IO Netty threads)" section
ListenableFuture<ResourceResponse<Document>> listenableFuture =
ListenableFutureObservable.to(createDocObservable);
ResourceResponse<Document> rrd = listenableFuture.get();
For this to work you will need the RxJava Guava library dependency. More information is available on RxJavaGuava's GitHub.
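From there, results can also be moved into the JDK's own CompletableFuture (for example by completing one from the subscribe callbacks, or via Guava adapters), after which the standard combinators apply. A JDK-only sketch of combining two such results (the charge values are made up):

```java
import java.util.concurrent.CompletableFuture;

public class ChargeTotals {
    // Combine two async request-charge results once both complete.
    public static double combinedCharge(CompletableFuture<Double> first,
                                        CompletableFuture<Double> second) throws Exception {
        return first.thenCombine(second, Double::sum).get();
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Double> a = CompletableFuture.completedFuture(3.5);
        CompletableFuture<Double> b = CompletableFuture.completedFuture(4.0);
        System.out.println(combinedCharge(a, b)); // 7.5
    }
}
```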
Checking out the Source Code
The SDK is open source and is available here sdk.
Clone the Repo
git clone https://github.com/Azure/azure-cosmosdb-java.git
cd azure-cosmosdb-java
How to Build from Command Line
Run the following maven command to build:
mvn clean package -DskipTests
Running Tests from Command Line
Running the tests requires Azure Cosmos DB endpoint credentials:
mvn test -DACCOUNT_HOST="https://REPLACE_ME_WITH_YOURS.documents.azure.com:443/" -DACCOUNT_KEY="REPLACE_ME_WITH_YOURS"
Import into Intellij or Eclipse
- Load the main parent project pom file in Intellij/Eclipse (that should automatically load the examples).
- For running the samples you need a proper Azure Cosmos DB endpoint. The endpoints are picked up from TestConfigurations.java. There is a similar endpoint config file for the SDK tests here.
- You can pass your endpoint credentials as VM arguments in the Eclipse JUnit run config:
-DACCOUNT_HOST="https://REPLACE_ME.documents.azure.com:443/" -DACCOUNT_KEY="REPLACE_ME"
or simply put your endpoint credentials in TestConfigurations.java.
- The SDK tests are written using the TestNG framework. If you use Eclipse, you may have to add the TestNG plugin to your IDE as explained here; Intellij has built-in support for TestNG.
- Now you can run the tests in your Intellij/Eclipse IDE.
FAQ
We have a frequently asked questions (FAQ) page, maintained here.
Release changes
Release changelog is available here.
Contribution and Feedback
This is an open source project and we welcome contributions.
If you would like to become an active contributor to this project please follow the instructions provided in Azure Projects Contribution Guidelines.
We have a Travis CI build which should pass for any PR.
If you encounter any bugs with the SDK please file an issue in the Issues section of the project.
License
MIT License. Copyright (c) 2018 Microsoft Corporation.
Attribution
This project includes the MurmurHash3 algorithm, which came with the following notice: "The MurmurHash3 algorithm was created by Austin Appleby and placed in the public domain. This java port was authored by Yonik Seeley and also placed into the public domain. The author hereby disclaims copyright to this source code."