Datasource
This section is about operating datasources in Timbr. Datasources are used as the execution engine for all the queries run on an ontology. Timbr connects to a datasource using a JDBC driver, and can theoretically connect to any datasource that has a JDBC driver. The types of datasources that are currently supported out-of-the-box are:
Supported Datasources
Amazon Athenawith driver:com.simba.athena.jdbc.DriverAmazon Redshiftwith driver:com.amazon.redshift.jdbc.DriverApache Drillwith driver:org.apache.drill.jdbc.DriverApache Hivewith driver:org.apache.hive.jdbc.HiveDriverApache Sparkwith driver:org.apache.hive.jdbc.HiveDriverAzure Blob Storagewith JSON connection templateAzure Datalake Storagewith driver:org.apache.drill.jdbc.DriverClickhousewith driver:com.clickhouse.jdbc.ClickHouseDriverDatabrickswith driver:com.simba.spark.jdbc.DriverGoogle BigQuerywith driver:com.simba.googlebigquery.jdbc42.DriverGoogle Cloud Storagewith JSON connection templateMicrosoft SQL Serverwith driver:com.microsoft.sqlserver.jdbc.SQLServerDriverMySQLwith driver:com.mysql.jdbc.DriverOraclewith driver:oracle.jdbc.driver.OracleDriverOracle 11with driver:oracle.jdbc.driver.OracleDriverPostgreSQLwith driver:org.postgresql.DriverPrestowith driver:com.facebook.presto.jdbc.PrestoDriverS3with driver:org.apache.drill.jdbc.DriverSAP Hanawith driver:com.sap.db.jdbc.DriverSnowflakewith driver:net.snowflake.client.jdbc.SnowflakeDriverTrinowith driver:io.trino.jdbc.TrinoDriverVertica Analytics Platformwith driver:com.vertica.jdbc.Driver
Create a datasource
Required information when querying to create a new datasource (the curly
brackets {} should not be an input, they are used only as a
variable substitution):
- {db_name} - The name of the datasource as it will be presented in Timbr
- {target_url} - The JDBC URL of the datasource, usually including a protocol, hostname, port and sometimes database/catalog
- {datasource_type} - The type of the datasource, as its presented in the list of supported datasources above
- {user} - The username credentials for the datasource
- {password} - The password credentials for the datasource
To create a datasource you can run the command:
CREATE DATASOURCE `{db_name}` DESCRIPTION '{description}' OPTIONS(URL '{target_url}', DRIVER '{datasource_driver}', TYPE '{datasource_type}', USER '{user}', PASSWORD '{password}') WITH VIRTUALIZATION
Optional keys are:
- DESCRIPTION '{description}' - Provide a string with description for the datasource
- WITH VIRTUALIZATION - Works only in the cases where you have the Timbr Virtualization Service or if the datasource type is Apache Spark or Databricks. This setting indicates whether the datasource serves as a virtualization engine. If you are using the Timbr Virtualization Service then your datasource type is Apache Spark
Create a datasource based on a json string
Some datasources like S3, Azure Datalake Storage, Google Cloud Storage, Azure Blob Storage require a JSON connection configuration template.
Required information when querying to create a new datasource (the curly brackets {} should not be an input, they are used only as a variable substitution):
- {db_name} - The name of the datasource as it will be presented in Timbr
- {json} - A stringify version of the template JSON used for the connection
- {datasource_type} - The type of the datasource, as its presented in the list of supported datasources above
- {json_secret} - The password credentials for the datasource
If you want to create such datasource requiring a JSON string, you can do it programmatically by running
CREATE DATASOURCE `{db_name}` DESCRIPTION '{description}' OPTIONS(JSON '{json}', TYPE '{datasource_type}', PASSWORD '{json_secret}') WITH VIRTUALIZATION
Optional keys are:
- DESCRIPTION '{description}' - Provide a string with description for the datasource
- WITH VIRTUALIZATION - Works only in the cases where you have the Timbr Virtualization Service or if the datasource type is Apache Spark or Databricks. This setting indicates whether the datasource serves as a virtualization engine. If you are using the Timbr Virtualization Service then your datasource type is Apache Spark
Show all datasources
Shows all the datasources associated with an ontology
SHOW DATASOURCES
Change the description of a datasource
Required information when changing the description of a datasource (the curly brackets {} should not be an input, they are used only as a variable substitution):
- {datasource} - The name of the datasource as it will be presented in Timbr
- {description} - The description given to the datasource
When connected to a current ontology that has the datasource set as active, you can run the command
ALTER DATASOURCE `{datasource}` SET DESCRIPTION = '{description}'
Setting the virtualization of a datasource on / off
If a datasource was created with a WITH VIRTUALIZATION option, you can set it ON or OFF
Required information when setting the virutalization of a datasource (the curly brackets {} should not be an input, they are used only as a variable substitution):
- {datasource} - The name of the datasource as its presented in Timbr
- {value} -
1will set it ON and0will set it OFF
ALTER DATASOURCE `{datasource}` SET VIRTUALIZATION = {value}
Removing a datasource
Removes a datasource from Timbr.
In order to remove a datasource from timbr you need to make sure it is not associated/connected to any ontology. If the datasource is associated/connected to an ontology, you should first remove it from that ontology and then remove it from Timbr.
Required information for removing a datasource (the curly brackets {}
should not be an input, they are used only as a variable
substitution):
- {datasource} - The name of the datasource as its presented in Timbr
DROP DATASOURCE `{datasource}`