Bulk Import
Single Origin provides several ways to migrate or bootstrap from other systems with little effort by parsing queries in bulk. Bulk importing follows the same process as importing a single SQL query, with three differences:
- If imported view/data entities do not have validation issues or errors that require a user's action, they will automatically be published to your Catalog. No "Pending" entities will be created.
- In a bulk import, we auto-generate names and descriptions for entities. The methodology for generating this information varies by import source, so please see "Naming Entities" below.
- Views and data entities are deduplicated across both the views/data entities in the bulk import and the views/data entities already in Single Origin.
The bulk import process happens asynchronously, and you can find the summary for an import by clicking the run on your "Imports" page.
Note
- Only users with the Admin role can run bulk imports.
- For BigQuery, we require tables to be referenced using project.dataset.table notation, e.g. sandbox-demo-db.thelook_ecommerce.order_items.
- For BigQuery, we only support standard SQL (not Legacy SQL).
- For BigQuery, we require that you add a _TABLE_SUFFIX to your WHERE clause when querying a table with a wildcard. For more on querying wildcard tables with _TABLE_SUFFIX, please see this BigQuery documentation.
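To catch the BigQuery requirements above before starting an import, a local pre-check along the following lines may help. This is only a sketch and not part of Single Origin: the function name check_query, the regular expressions, and the my-project.my_dataset.events_* table are all hypothetical, and the check only looks at simple FROM/JOIN references rather than parsing SQL fully.

```python
import re

# Hypothetical pre-flight check; this is NOT Single Origin's own validation.
# Captures bare or backtick-quoted table references that follow FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+`?([\w\-.*]+)`?", re.IGNORECASE)

# True when _TABLE_SUFFIX appears somewhere after a WHERE keyword.
SUFFIX_IN_WHERE = re.compile(r"\bWHERE\b.*_TABLE_SUFFIX", re.IGNORECASE | re.DOTALL)


def check_query(sql: str) -> list[str]:
    """Return a list of human-readable problems found in one query."""
    problems = []
    for ref in TABLE_REF.findall(sql):
        # Tables must be written as project.dataset.table (three parts).
        if len(ref.split(".")) != 3:
            problems.append(f"{ref}: expected project.dataset.table notation")
        # Wildcard tables need a _TABLE_SUFFIX condition in the WHERE clause.
        if ref.endswith("*") and not SUFFIX_IN_WHERE.search(sql):
            problems.append(f"{ref}: wildcard table without _TABLE_SUFFIX in WHERE")
    return problems


ok = "SELECT * FROM `sandbox-demo-db.thelook_ecommerce.order_items`"
wildcard = (
    "SELECT * FROM `my-project.my_dataset.events_*` "  # hypothetical table
    "WHERE _TABLE_SUFFIX BETWEEN '20240101' AND '20240131'"
)
print(check_query(ok))        # no problems expected
print(check_query(wildcard))  # no problems expected
```

Queries that this check flags are likely candidates for the processing failures described in the FAQ, so fixing them first may reduce the number of entries in an import's logs.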
We try to minimize the number of "Pending" entities generated by a bulk import, but some may be created. For example, if your bulk import attempts to create an entity with the same name as an entity already in production, then a "Pending" entity will be created so that you can give the new entity a unique name.
Naming Entities
If you are running a bulk import based on a CSV or Compliant Query Table, then we will name the entity based on the first tag of the associated SQL. If you are running a bulk import based on a Query History Table, then we will name the entity based on the insertId column of the associated SQL. Since we require names to be in snake case format, we will remove special characters and numbers from the beginning of the name when necessary; however, in the description for the entity we will keep the original tag (including any special characters and numbers). This way, you can still search for the original tag and find the associated entities.
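To get a rough idea of what a generated name might look like, the naming rule above can be approximated as follows. This is a sketch under assumptions: the exact normalization Single Origin applies is not documented here, so the function to_entity_name, the choice to replace special characters with underscores, and the sample tag are all hypothetical.

```python
import re


def to_entity_name(tag: str) -> str:
    """Approximate a snake case entity name from a tag (assumed rules)."""
    # Assumption: each run of non-alphanumeric characters becomes one underscore.
    name = re.sub(r"[^0-9a-zA-Z]+", "_", tag).lower()
    # Drop digits and underscores from the beginning of the name.
    name = re.sub(r"^[0-9_]+", "", name)
    # Remove any trailing underscore left behind by special characters.
    return name.strip("_")


tag = "2023 Q1: Revenue-by-Region!"  # hypothetical tag
name = to_entity_name(tag)
description = tag  # the original tag is kept in the description for search
print(name)  # q1_revenue_by_region
```

Because the description keeps the original tag unchanged, searching for "2023 Q1: Revenue-by-Region!" would still be expected to surface the entity even though its name differs.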
Running and Viewing Bulk Imports
Navigate to Bulk Imports by selecting Manage in the navigation bar and then clicking Imports. From here, administrators can:
- Create new bulk imports. Click the "New Import" button in the top right, and select "Import from Collection."
- View a history of previously run imports. Click on a row in the table to see more details about that import.
You can also start a bulk import from the results page for a Query Audit by selecting the "Import Queries" option in the "Actions" menu.
FAQ
There are a few reasons we may fail to process a query during a bulk import, including:
- The SQL query is invalid.
- The SQL query is valid but references a table in a different project/warehouse than your Single Origin Connector. In this case, we do not have schema metadata for the table.
Check the "Logs" tab on a bulk import's details page to learn more.