Single Origin offers several low-effort ways to migrate or bootstrap from other systems by parsing queries in bulk. Bulk importing follows the same process as importing a single SQL query, with three differences:
- If imported view/data entities do not have validation issues or errors that require a user's action, they will automatically be added to production. No drafts will be created.
- In a bulk import we auto-generate names and descriptions for entities. The methodology for generating this information varies by import source, so please see "Naming Entities" below.
- Views and data entities are deduplicated across both:
  - the views/data entities in the bulk import
  - the views/data entities already in Single Origin.
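As a rough illustration of deduplicating across both sets, the sketch below drops any import query that matches an earlier query in the same import or one already present. How Single Origin actually decides that two entities are duplicates is not described here, so the whitespace-normalized SQL key and the function names `normalize` and `dedupe` are assumptions made for the example.

```python
def normalize(sql: str) -> str:
    """Collapse whitespace and drop a trailing semicolon (assumed comparison key)."""
    return " ".join(sql.split()).rstrip(";").strip()


def dedupe(import_queries: list[str], existing_queries: list[str]) -> list[str]:
    """Return the import queries that are new, keeping first occurrences in order."""
    # Start from what already exists, then grow the set as the import is scanned,
    # so duplicates are caught both against existing entities and within the import.
    seen = {normalize(q) for q in existing_queries}
    unique = []
    for query in import_queries:
        key = normalize(query)
        if key not in seen:
            seen.add(key)
            unique.append(query)
    return unique


if __name__ == "__main__":
    incoming = ["SELECT 1", "SELECT  1;", "SELECT 2"]
    print(dedupe(incoming, existing_queries=["SELECT 2"]))
```

A single `seen` set covers both cases in the list above: it is seeded with the existing entities and extended with each new query from the import.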
The bulk import process happens asynchronously, and you can find the summary for an import by clicking the run on your "Imports" page.
While we try to minimize the number of drafts generated by a bulk import, multiple drafts may be created. For example, if your bulk import attempts to create an entity with the same name as an entity already in production, then a draft will be created so that you can give the new entity a unique name.
Bulk imports have the following requirements:
- only users with the Admin role can run bulk imports
- for BigQuery we require tables to be referenced using
- for BigQuery we only support standard SQL (not Legacy SQL)
- for BigQuery we require that you add a WHERE clause when querying a table with a wildcard. For more on querying wildcard tables with _TABLE_SUFFIX, please see the BigQuery documentation.
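To make the wildcard requirement concrete, here is a minimal pre-check you could run on your own queries before submitting them. It is only a sketch: the function name `needs_suffix_filter`, the regular expressions, and the sample project and table names are hypothetical, and it assumes the WHERE clause is expected to filter on _TABLE_SUFFIX. It is not how Single Origin validates queries.

```python
import re

# A backtick-quoted table reference ending in "*", e.g. `project.dataset.events_*`
WILDCARD_TABLE = re.compile(r"`[^`]*\*`")
# A WHERE keyword followed, anywhere later in the query, by _TABLE_SUFFIX
SUFFIX_FILTER = re.compile(r"\bWHERE\b.*\b_TABLE_SUFFIX\b", re.IGNORECASE | re.DOTALL)


def needs_suffix_filter(sql: str) -> bool:
    """True if the query reads a wildcard table but has no _TABLE_SUFFIX filter."""
    return bool(WILDCARD_TABLE.search(sql)) and not SUFFIX_FILTER.search(sql)


# Hypothetical queries for illustration only.
FILTERED = """
SELECT user_id
FROM `my-project.analytics.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'
"""
UNFILTERED = "SELECT user_id FROM `my-project.analytics.events_*`"

if __name__ == "__main__":
    print(needs_suffix_filter(FILTERED))    # wildcard with a suffix filter
    print(needs_suffix_filter(UNFILTERED))  # wildcard without any filter
```

A regex check like this is deliberately crude (it does not parse SQL, so comments or string literals can fool it), but it is enough to flag the most common omission before an import is started.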
Running and Viewing Bulk Imports
Navigate to Bulk Imports by selecting Manage in the navigation bar and then clicking Imports. From here, administrators can:
- create new bulk imports. Click the "New Import" button in the top right, and select "Import from Collection."
- view a history of previously run imports. Click on a row in the table to see more details about that import.
You can also start a bulk import from the results page for a Query Audit by selecting the "Import Queries" option in the "Actions" menu.
Naming Entities
If you are running a bulk import based on a CSV or Compliant Query Table, then we will name the entity based on the first tag of the associated SQL. If you are running a bulk import based on a Query History Table, then we will name the entity based on the insertId column of the associated SQL. Since we require names to be in snake case, we will remove special characters, and any numbers at the beginning of the name, when necessary; however, in the description for the entity we will keep the original tag (including any special characters and numbers). This way, you can still search for the original tag and find the associated entities.
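The naming rule above can be approximated as follows. This is a sketch under assumptions, not the product's implementation: the helper names `entity_name_from_tag` and `describe`, and the exact cleanup rules (lowercasing, replacing runs of special characters with an underscore), are illustrative guesses at what "snake case" cleanup involves.

```python
import re


def entity_name_from_tag(tag: str) -> str:
    """Derive a snake_case name from a tag (illustrative rules only)."""
    # Lowercase, then turn every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", tag.lower())
    # Drop leading digits and underscores so the name starts with a letter.
    name = re.sub(r"^[0-9_]+", "", name)
    # Remove any underscore left over from trailing special characters.
    return name.rstrip("_")


def describe(tag: str) -> dict[str, str]:
    """Pair the cleaned name with a description that keeps the original tag."""
    return {"name": entity_name_from_tag(tag), "description": tag}


if __name__ == "__main__":
    print(describe("2023 Weekly-Revenue!"))
```

Keeping the untouched tag in the description is what lets a search for the original tag still find the entity, even though the name itself has been cleaned.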
Finishing a Batch Import
Once you start a batch import from one of the sources above, it will appear on your "Batch Import" history page with a "To Be Confirmed" status. Since batch imports can potentially create hundreds of entities in production at once, you can review the summary statistics of your import before finishing it.
If you want to proceed and finish the import, click "Finish Import." After this, your entities will be added to production!
When you click "Finish Import," you may see a modal warning that duplicate names were detected. This happens when an entity named using the "Naming Entities" methodology described above has the same name as an entity already in production. We require entities to have globally unique names, so on this modal you can either (a) add a suffix to try to make the entity names from your import unique, or (b) proceed without adding a suffix. If you choose option (b), then drafts will be created where you can update the name.
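The two options on the modal can be pictured with a small sketch. The function `resolve_names`, the underscore-joined suffix format, and the choice to apply the suffix only to colliding names are all assumptions made for illustration; the source does not specify these details.

```python
from typing import Optional


def resolve_names(
    import_names: list[str],
    production_names: set[str],
    suffix: Optional[str] = None,
) -> tuple[list[str], list[str]]:
    """Split import names into ones ready for production and ones needing a draft."""
    taken = set(production_names)
    ready, drafts = [], []
    for name in import_names:
        candidate = name
        # Option (a): try the suffix, but only when the plain name collides.
        if candidate in taken and suffix:
            candidate = f"{name}_{suffix}"
        if candidate in taken:
            # Still not unique (or option (b) was chosen): a draft is needed.
            drafts.append(name)
        else:
            taken.add(candidate)
            ready.append(candidate)
    return ready, drafts


if __name__ == "__main__":
    print(resolve_names(["orders", "users"], {"orders"}, suffix="v2"))
    print(resolve_names(["orders", "users"], {"orders"}))
```

The second call corresponds to option (b): without a suffix, the colliding name ends up in the draft list so it can be renamed by hand. A suffix only "tries" to fix the collision, which is why a suffixed name that is also taken falls back to a draft.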
There are a few reasons we may fail to process a query during a batch import, including:
- The SQL query is invalid.
- The SQL query is valid, but it references a table that is in a different project/warehouse than your Single Origin Connector. In this case, we do not have schema metadata for the table.