BigQuery Maven plugin
This Maven plugin provides goals to create and remove datasets, tables and views in Google BigQuery.
How to use
Add the following plugin to the build section of your application's pom.xml:
<plugin>
  <groupId>io.allune</groupId>
  <artifactId>bigquery-maven-plugin</artifactId>
  <version>1.0.0</version>
  <configuration>
    ...
  </configuration>
</plugin>
Supported goals
Goal | Description |
---|---|
bigquery:create | Creates the dataset, tables and views defined in the plugin configuration. |
bigquery:create-dataset | Creates the dataset defined in the plugin configuration. |
bigquery:clean | Removes the dataset, tables and views defined in the plugin configuration. |
bigquery:help | Displays help information on the plugin. |
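
The goals above are addressed through the bigquery prefix. Inside a project that declares the plugin in its pom.xml, Maven should resolve that prefix on its own. To call a goal such as bigquery:help from elsewhere, standard Maven behavior is to look up prefixes in the plugin groups listed in the user's settings.xml. A minimal sketch, assuming the plugin is published under the io.allune group ID shown in this README and follows Maven's default prefix convention (neither is verified here):

```xml
<!-- ~/.m2/settings.xml: sketch only, merge into your existing settings -->
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <pluginGroups>
    <!-- Assumption: the group ID matches the one used in this README -->
    <pluginGroup>io.allune</pluginGroup>
  </pluginGroups>
</settings>
```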
Example plugin configuration
<plugin>
  <groupId>io.allune</groupId>
  <artifactId>bigquery-maven-plugin</artifactId>
  <version>1.0.0</version>
  <configuration>
    <projectId>your_project_id</projectId>
    <credentialsFile>/credentials.json</credentialsFile>
    <datasetName>your_dataset</datasetName>
    <dataLocation>EU</dataLocation>
  </configuration>
  <executions>
    <execution>
      <id>create</id>
      <goals>
        <goal>create</goal>
      </goals>
      <phase>pre-integration-test</phase>
      <configuration>
        <skip>${skipTests}</skip>
        <createDataset>true</createDataset>
        <sourceUri>gs://folder/data.json</sourceUri>
        <formatOptions>CSV</formatOptions>
        <nativeSchemaLocations>file://${project.basedir}/src/main/resources/bigquery/schemas/dir1</nativeSchemaLocations>
        <externalSchemaLocations>classpath:/bigquery/schemas/dir2</externalSchemaLocations>
        <viewLocations>file://${project.basedir}/src/main/resources/bigquery/views</viewLocations>
      </configuration>
    </execution>
    <execution>
      <id>clean</id>
      <goals>
        <goal>clean</goal>
      </goals>
      <phase>post-integration-test</phase>
      <configuration>
        <skip>${skipTests}</skip>
        <deleteDataset>true</deleteDataset>
        <forceDeleteDataset>true</forceDeleteDataset>
      </configuration>
    </execution>
  </executions>
</plugin>
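The same execution pattern should carry over to the other goals. As a minimal sketch, the fragment below binds only the create-dataset goal, for builds that need an empty dataset but no tables or views. It assumes that create-dataset reads the same projectId, credentialsFile, datasetName and dataLocation parameters as the example above, which this README does not confirm; the execution id is an arbitrary label chosen for illustration.

```xml
<plugin>
  <groupId>io.allune</groupId>
  <artifactId>bigquery-maven-plugin</artifactId>
  <version>1.0.0</version>
  <configuration>
    <!-- Assumption: create-dataset uses these shared parameters; values are placeholders -->
    <projectId>your_project_id</projectId>
    <credentialsFile>/credentials.json</credentialsFile>
    <datasetName>your_dataset</datasetName>
    <dataLocation>EU</dataLocation>
  </configuration>
  <executions>
    <execution>
      <!-- Hypothetical id, any unique label works -->
      <id>create-dataset-only</id>
      <goals>
        <goal>create-dataset</goal>
      </goals>
      <phase>pre-integration-test</phase>
    </execution>
  </executions>
</plugin>
```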
Schema definition
A JSON schema file consists of a JSON array with one object per column. Each object contains the following:
- (Optional) The column’s description
- The column name
- The column’s data type
- (Optional) The column’s mode (if unspecified, mode defaults to NULLABLE)
Table name
The schema file name is used as the table name.
Example
test_table
[
  {
    "name": "id",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "subject",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "from",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "to",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "cc",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "body",
    "mode": "NULLABLE",
    "type": "STRING"
  },
  {
    "name": "time",
    "mode": "NULLABLE",
    "type": "TIME"
  },
  {
    "name": "timestamp",
    "mode": "NULLABLE",
    "type": "TIMESTAMP"
  },
  {
    "name": "fields",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
      {
        "name": "field1",
        "type": "STRING",
        "mode": "REQUIRED"
      },
      {
        "name": "moreFields",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [
          {
            "name": "field1",
            "type": "STRING",
            "mode": "REQUIRED"
          },
          {
            "name": "field2",
            "type": "STRING",
            "mode": "REQUIRED"
          }
        ]
      }
    ]
  }
]