Welcome back to the Snowflake Interview Question & Answers series Part 2: ! In case you missed it, we recommend taking a few moments to review Part 1 before proceeding
1.Which scenario did you use time travel?
Ans.
Restore data that got accidentally deleted using UNDROP.
Backing up of data from key points in the past i.e data a week back, month back. Query data in the past that has since been updated or deleted.
2.What are the different ways of getting/accessing/querying data from Time travel?
Ans.
It can be done in 3 ways
- By timestamp : check data at a specific point in time
SELECT * FROM my_table AT(timestamp => ‘Mon, 01 May 2021 08:00:00 -0700’::timestamp_tz);
SELECT * FROM my_table BEFORE(timestamp => ‘Mon, 01 May 2021 08:00:00 -0700’::timestamp_tz);
- By offset →In this example, we select the historical data from a table as of 15 minutes ago:
SELECT * FROM my_table AT(offset => -60*15);
SELECT * FROM my_table BEFORE(offset => -60*15);
- By query statement ID→ We can see the Query Statement ID in the history tab
SELECT * FROM my_table BEFORE(STATEMENT => ‘8e5d0ca9-005e-44e6-b858-a8f5b37c5726’);
SELECT * FROM my_table AT(STATEMENT => ‘8e5d0ca9-005e-44e6-b858-a8f5b37c5726’);
Accessing data with AT => Data at that point
Accessing data with BEFORE => Data before that point
3.What’s are different types of tables
Ans.
Permanent, Temporary, transient and external tables
4.What’s different between permanent, transient and temporary tables?
Ans.
- Retention period is 0-90 days for permanent tables, 0-1 days for temporary and transient tables
- Fail safe is not available for temporary and transient tables
- Scope is until explicitly dropped for permanent and transient tables, available only for that session for temporary tables
5.What are the different data loading options?
Ans.
Data Loading Options
- Bulk Data loading with Copy Command
- Continuous data loading with Snowpipe
(Through WebUI/Wizard also we can load data but we didn’t use it much in our projects) – Maximum 50 MB files can only be loaded using UI)
6.If comma is defined as column delimiter and if we get extra comma in the data how to handle this scenario?
Ans.
Add ESCAPE_UNENCLOSED_FIELD and FIELD_OPTIONALLY_ENCLOSED_BY as single quotes, double quotes in file format options
Snowflake stores all data internally in the UTF-8 character set, IN SQL Server it will be in UTF-16 encoding so while migrating from there you might face issue so change it to UTF-16
7.What are types of stages?
Ans.
Stages in Snowflake specify where data files are stored (staged) so that the data in the files can be loaded into a table. It is the location of the files before moving them to Snowflake tables.
- External stage
- Internal stage: Internal stages are of three types
- User stage
- Table stage
- Named internal stage
8.What are the different ways of accessing or listing files from stages?
Ans.
Table stage can be accessed as LIST @%tablename
User stage can be accessed as LIST @~username
Named internal/External stage can be accessed as LIST @Stagename
9.Can we run PUT command in UI or worksheet?
Ans.
No, PUT command is not supported in UI/worksheet. We will have use SnowSQL CLI and run PUT commands
10.Will snowflake charge for data stored in AWS S3/Azure blob container/Google cloud container?
Ans.
No snowflake won’t charge any storage cost. (But it will charged by cloud provider through cloud accounts i.e AWS/Azure)
11.Will snowflake charge for data stored in Internal Stages?
Ans.
Yes, as internal stage is snowflake provided storage.
Also data that is there in time travel and fail safe will incur storage costs
12.How to unload data from Snowflake?
Ans.
Use copy command, directly download data into AWS S3/Azure blob/Google cloud storage
If we have to download/unload data into local machine then we will have to unload data into internal stage using copy command and using GET command we can unload it into our local machine from internal stage
13.How will you connect from snowflake to AWS S3/Azure blob/Google cloud storage?
Ans.
- We can create external stage and pass s3 path url and AWS key Access key and secret key
- Create storage integration and pass AWS IAM ROLE ARN and s3 path or url and use that storage integration in external stage
14.Can we read data directly from S3 bucket/External/Internal Stage?
Ans.
Yes, we can read data directly from S3 bucket/External/Internal stage as follows.
Select $1,$2,$3 from @my_external_stage/dataload/emp.csv.gz (file_format => ‘my_csv_format’)
Select $1,$2,$3 from @my_internal_stage/dataload/emp.csv.gz (file_format => ‘my_csv_format’)
15.Can we reload a file that is already loaded into a table using copy command?
Ans.
No, because based on filename, Metadata information will stored for 64 days
16.How can we reload the same file (same name) that already loaded?
Ans.
In copy command, we can specify FORCE=TRUE to load an already loaded file
17.How to check if file has got any errors before loading the files to Snowflake tables?
Ans.
Use Validation mode= return_errors| return_all_errors| return_n_rows copy option to check for errors in a file before loading into snowflake tables.
18.How to capture error records into error table during data loads?
Ans.
Use Validate function to capture error records into error table during data loads
19.How to continue load even in case of error?
Ans.
Specify ON_ERROR = CONTINUE in copy command, file will get partially loaded in case of error
20.How to skip the file as soon as error is encountered?
Ans.
Using ON_ERROR= SKIP_FILE in Copy command, error file will be skipped
The possible values for ON_ERROR copy option is as follows
ON_ERROR = { CONTINUE | SKIP_FILE | SKIP_FILE_<num> | SKIP_FILE_<num>% | ABORT_STATEMENT }
21.What is the default on_error option for COPY command in case if you are not specifying it?
Ans.
Default value for ON_ERROR copy options is ABORT STATEMENT
In snowpipe, the default ON_ERROR option for COPY Command is SKIP_FILE
22.How can you remove file from stage as soon as the load completes?
Ans.
Use PURGE =TRUE in copy command to remove files form the stage as soon as the load completes.
23.What is SCD Type1?
Ans.
Maintaining Current state of the data in the target table is referred as SCD Type1.
Based on key field it will overwrite the data
24.What is SCD Type2, how will you implement SCD Type2?
Ans.
Maintaining historical state of data in the target table is referred as SCD Type2.
We can have additional fields like START_DATE, END_DATE, CURRENT_FLAG
Using streams and Merge statement, we can implement SCD Type2 in snowflake
25.How do we implement SCD type1 and type2?
Ans.
Using streams, Merge statement and Tasks, we can implement SCD Type1 in snowflake.