SQL remains an essential tool for data professionals due to its unrivaled ability to handle large datasets with precision and efficiency. However, its mastery lies not just in SQL’s capabilities, but in the skill of the user. Advanced SQL techniques can greatly enhance data handling and exploration, giving you a competitive edge.
In this article, we will explore 5 such advanced SQL techniques tailored for Snowflake users. We will explore advanced and unique SQL techniques that may be unfamiliar to you. These techniques will not only help you work smarter but also allow you to fully leverage Snowflake’s true hidden potential.
5 advanced Snowflake SQL techniques
1) Auto-generating ordered sequences in Snowflake
Sequences let you generate unique, sequential numbers across multiple sessions and statements. Common use cases include primary keys, surrogate keys and auto-increment identifiers. Snowflake sequences guarantee uniqueness but, like most distributed systems, don’t guarantee gap-free values. If your process requires 1, 2, 3 with no skips, a sequence isn’t the right tool; use ROW_NUMBER() instead.
One important note: since the 2024_01 behavior change bundle, new Snowflake sequences default to NOORDER. That means values aren’t guaranteed to be strictly increasing when multiple INSERT operations run concurrently. This improves performance in high-throughput scenarios. If you need ordered values, explicitly specify ORDER when creating your sequence.
To create a sequence in Snowflake, you use the CREATE SEQUENCE statement. The following is an example of how to create a Snowflake sequence:
Step 1—Create a database and basic sequence
To begin, we need to create a database to house the objects for this sequence generation. Execute the following commands either in Snowflake WebUI or SnowSQL command-line client:
CREATE DATABASE my_awesome_db;
CREATE SEQUENCE my_sequence;

Step 2—Retrieve a sequence value
Now, let’s retrieve a value from the newly created sequence. You can now use the NEXTVAL function with sequence to generate unique numbers.
For example, the following statement will return the next value in the my_sequence sequence:
SELECT my_sequence.NEXTVAL;

Step 3—Increment and validate uniqueness
After the sequence has been incremented, running the same SELECT statement again will provide the next value in the sequence.
SELECT my_sequence.NEXTVAL;

This time, the query will return the next value, which should be 2.
Here’s where Snowflake’s behavior differs from many other databases. When you reference NEXTVAL multiple times in a single SELECT statement, each reference gets a distinct value. Other databases often return the same value for repeated calls within one statement:
SELECT my_sequence.NEXTVAL, my_sequence.NEXTVAL, my_sequence.NEXTVAL, my_sequence.NEXTVAL, my_sequence.NEXTVAL, my_sequence.NEXTVAL;

Depending on the sequence’s current value, each column will display a different value, which demonstrates the uniqueness of the values generated by the sequence.
Step 4—Creating a custom sequence
Sequences can also be used to populate columns in a table.
Let’s create a custom sequence with custom start and increment values. The following example shows how to create a sequence & table and populate a column with values from a sequence:
CREATE SEQUENCE MY_CUSTOM_SEQUENCE
START WITH 5
INCREMENT BY 5;

(optional) To test the sequence with the custom start and increment values, execute the following SQL:
SELECT MY_CUSTOM_SEQUENCE.NEXTVAL, MY_CUSTOM_SEQUENCE.NEXTVAL, MY_CUSTOM_SEQUENCE.NEXTVAL, MY_CUSTOM_SEQUENCE.NEXTVAL;
The output should confirm that the sequence starts at 5 and increments by 5.
Step 5—Populate table columns with a sequence
Sequences can be used to populate incremental values in table columns. Let’s create a new table and insert data into one of its columns using a sequence. Execute the following SQL statements:
CREATE TABLE MY_TABLE (
ID INT,
NAME STRING
);

Let’s write a command that inserts some rows into the MY_TABLE table. We will populate the ID column using the MY_CUSTOM_SEQUENCE sequence, which ensures that each row receives a unique and incremental identifier. Also, we will fill the NAME column with custom string values. You can either add your own custom strings or use random strings if desired.
INSERT INTO MY_TABLE (ID, NAME)
SELECT MY_CUSTOM_SEQUENCE.NEXTVAL, t.NAME
FROM (VALUES ('Apple'), ('Banana'), ('Cherry'), ('Potato'), ('Mango'), ('Tomato')) AS t(NAME);

Step 6—Query the populated table
To view the data inserted into the table, execute the following SQL:
SELECT * FROM MY_TABLE;

As you can see, the output displays the inserted data, with the ID column showing auto-incremented values.
Alternatively
Step 7—Set a sequence as a column default
In Snowflake, you can define default values for table columns using sequences during table creation and column definition. Follow these SQL statements to achieve this:
CREATE TABLE MY_TABLE (
ID INT DEFAULT MY_CUSTOM_SEQUENCE.NEXTVAL,
NAME STRING
);

Now, let’s write a simple command to insert rows into the MY_TABLE table. As the ID column is already populated with the default value from the sequence, manual population is not required. Finally, all we have to do is fill the NAME column with custom string values. You have the flexibility to provide your own custom strings or opt for random strings if preferred.
INSERT INTO MY_TABLE (NAME)
VALUES ('Apple'), ('Banana'), ('Cherry'), ('Poatato'), ('Mango'), ('Tomato') ;

Step 8—View the populated table
To view the data inserted into the table, execute the following SQL:
SELECT * FROM MY_TABLE;

Creating sequences in Snowflake is incredibly easy and efficient. As you can see, you can quickly generate auto-incrementing sequences in Snowflake with minimal effort.
2) Working with timestamp data in Snowflake
Timestamp handling is one of those areas where Snowflake’s design is genuinely thoughtful, but easy to get wrong if you don’t understand the underlying model. Let’s clear it up.
Snowflake has three distinct TIMESTAMP variants, not just one:
- TIMESTAMP_NTZ: No time zone. The value is stored and returned exactly as entered, with no offset applied. This is the default when you use the TIMESTAMP alias.
- TIMESTAMP_LTZ: Local time zone. Snowflake stores the value in UTC internally and converts it to the session time zone on output.
- TIMESTAMP_TZ: Stores the value along with an explicit UTC offset.
The TIMESTAMP_TYPE_MAPPING session parameter controls which variant the TIMESTAMP alias resolves to. Its default is TIMESTAMP_NTZ.
Let’s explore practical examples of working with date, time and timestamp data types, focusing on how Snowflake handles time zones. In real-world scenarios, data is often received in raw text files, which contain date and time data as strings. We will start by converting these string values into date values, as shown in the following steps:
Step 1—Store date values
In Snowflake, date values are stored using the DATE data type. The standard format for dates is “YYYY-MM-DD”. To store a date value as a string, we can create a table and use the TO_DATE function to convert the string to a valid formatted date.
Here’s an example:
First, let’s create a database and a table named date_test with a date_id and date_value column of type DATE.
CREATE DATABASE snowflake_timestamp_db;
CREATE TABLE date_test (
date_id INTEGER,
date_value DATE
);

Now, let’s insert some value and use the TO_DATE function to convert the string value ‘2019-12-19’ into a valid date format.
INSERT INTO date_test (date_id, date_value)
VALUES (1, TO_DATE('2023-07-11', 'YYYY-MM-DD'));

Step 2—Insert a timestamp into a DATE column
Snowflake allows storing timestamp values in the DATE data type columns as well. But remember that the time component of the input will be ignored.
Here’s an example:
INSERT INTO date_test (date_id, date_value)
VALUES (2, TO_TIMESTAMP('2023.07.11 02:21:10', 'YYYY.MM.DD HH:MI:SS'));


As you can see, a timestamp value is inserted into the date_value column. Although the value includes a time component is ignored.
Step 3—Insert a time-only value into a DATE column
If you pass only a time value, Snowflake sets the date to the Unix epoch: January 1, 1970.
INSERT INTO date_test (date_id, date_value)
VALUES (3, TO_DATE('02:20:20', 'HH:MI:SS'));

That’s a deliberate behaviour, not a bug, but it can cause confusion if you’re not expecting it.
Step 4—Query the table
To view the data in the table, you can execute a simple SELECT statement:
select * from date_test;

As you can see, DATE data type completely disregards the time component.
Step 5—Create a TIMESTAMP table and inspect session time zone settings
Creating a table with a Timestamp column:
Now, let’s create a new table, TIMESTAMP_TEST, with a TIMESTAMP column to demonstrate how Snowflake manages time zones.
CREATE TABLE TIMESTAMP_TEST (timestamp_id INTEGER, timestamp_value TIMESTAMP);

We will explore the session object and how it can be used to manage time zones. The session contains various objects, each with a default value. To view the values of all the time zone-related objects in the session, run the following command:
SHOW PARAMETERS LIKE '%TIMEZONE%' IN SESSION;

As you can see, this shows the TIMEZONE parameter (default is America/Los_Angeles) and the TIMESTAMP_TYPE_MAPPING parameter (default is TIMESTAMP_NTZ).
Step 6—Modify the session time zone
You can change the value of the TIMEZONE parameter using the ALTER SESSION command. The value should be set to a valid timezone string. You can check out the valid values from here.
sssALTER SESSION SET TIMEZONE='Antarctica/Mawson';

Step 7—Confirm the updated parameter
After modifying the time zone parameter, you can recheck the session parameters to view the updated value:
SHOW PARAMETERS LIKE '%TIMEZONE%' IN SESSION;

See? This query will display the updated value of the time zone parameter.
Step 8—Insert timestamp values
To demonstrate how Snowflake manages time zones, let’s insert a timestamp value in our previously created table:
INSERT INTO TIMESTAMP_TEST (timestamp_id, timestamp_value)
VALUES (1, '2023-07-19 22:00:00.000'),
(2, '2023-07-19 23:00:00.000'),
(3, '2023-07-20 00:00:00.000'),
(4, '2023-07-20 01:00:00.000'),
(5, '2023-07-20 02:00:00.000'),
(6, '2023-07-20 03:00:00.000'),
(7, '2023-07-20 04:00:00.000'),
(8, '2023-07-20 05:00:00.000'),
(9, '2023-07-20 06:00:00.000'),
(10, '2023-07-20 07:00:00.000');

Step 9—Query the table with time zones
To observe how Snowflake handles time zones, execute a query to retrieve data from the table:
Select * from TIMESTAMP_TEST;

Step 10—Modify Timestamp_Type_Mapping
Let’s change another session parameter. We can change how Snowflake manages timestamp data by modifying the TIMESTAMP_TYPE_MAPPING parameter. Here’s a simple example:
ALTER SESSION SET TIMESTAMP_TYPE_MAPPING = 'TIMESTAMP_TZ';
ALTER SESSION SET TIMEZONE = 'Asia/Thimphu';
CREATE OR REPLACE TABLE timestamp_test_2 (tsp TIMESTAMP);
INSERT INTO timestamp_test_2 VALUES ('2023-07-20 06:00:00.000');
SELECT tsp FROM timestamp_test_2;

As you can see, the query above modifies the TIMESTAMP_TYPE_MAPPING parameter and sets it to ‘TIMESTAMP_TZ’ (default value is ‘TIMESTAMP_NTZ’ ). It then updates the time zone and creates a table timestamp_test_2 with a ts column of type TIMESTAMP. It then inserts a timestamp value and retrieves it through a query.
Learn more about TIMESTAMP_TYPE_MAPPING from here.
When working with timestamp data in Snowflake, it is important to be aware of the following:
- A TIMESTAMP value inserted into a DATE column loses its time component
- A time-only value inserted into a DATE column gets the date 1970-01-01
- TIMESTAMP is an alias; its behavior depends on TIMESTAMP_TYPE_MAPPING (default: TIMESTAMP_NTZ)
- Session-level TIMEZONE affects TIMESTAMP_LTZ and TIMESTAMP_TZ, but not TIMESTAMP_NTZ
3) Working with unique counts
Counting distinct values in a distributed system is expensive. Snowflake’s architecture makes exact COUNT(DISTINCT …) queries slow on large tables because they require shuffling data across nodes to identify duplicates. For many analytical use cases, you don’t need an exact count. That’s where approximate counting comes in.
Snowflake gives you three main options:
- COUNT(): Counts all rows, including duplicates
- COUNT(DISTINCT): Returns exact distinct count, slower at scale
- APPROX_COUNT_DISTINCT(): Uses the HyperLogLog algorithm for a fast, approximate distinct count
Step 1—Query sample data
First, we will perform a simple SELECT query on the Lineitem table, which is available in the SNOWFLAKE_DEMO_DB database and the TPCH_SF1000 schema. We will query 5000 rows from the dataset.
SELECT *
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1000"."LINEITEM"SAMPLE ROW (5000 ROWS);

Note that it took 1 minute and 11 seconds for Snowflake to query the 1.5 billion rows in the table. Snowflake has already queried the table, collected metadata and cached some data, which will assist in subsequent queries.

Step 2—Count SUPPKEY grouped by shipping instruction
Next, we will execute a count on the L_SUPPKEY column while applying grouping to L_SHIPINSTRUCT. The analytic that we are trying to calculate here is the popularity of order priority for customers. This does not mean we are looking for an interesting insight as we are dealing with synthetic data here.
SELECT
L_SHIPINSTRUCT,
COUNT(L_SUPPKEY)
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1000"."LINEITEM"GROUP BY 1
ORDER BY 1;
We obtain the results in approximately 30 seconds


Step 3—Count DISTINCT SUPPKEY per shipping instruction
Now, let’s calculate the same count, but this time with a variation. We will apply the DISTINCT function to the “L_SUPPKEY” column to ensure that customers with repeated orders are counted only once. To prevent Snowflake from utilizing results from its cache, we need to reset it by using “USE_CACHED_RESULT” to “FALSE” in the session.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
SELECT
L_SHIPINSTRUCT,
COUNT(DISTINCT L_SUPPKEY)
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1000"."LINEITEM"GROUP BY 1
ORDER BY 1;
We obtain the results in approximately 4 minutes and 18 seconds

Step 4—Approximate distinct count
Now, let’s try the same thing, but this time with the APPROX_COUNT_DISTINCT function on the L_SUPPKEY column, rather than with the COUNT(DISTINCT L_SUPPKEY) function we used in previous step.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
SELECT
L_SHIPINSTRUCT,
APPROX_COUNT_DISTINCT(L_SUPPKEY)
FROM "SNOWFLAKE_SAMPLE_DATA"."TPCH_SF1000"."LINEITEM"GROUP BY 1
ORDER BY 1;

As you can see, the result is similar to the previous step’s outcome, but it is not the same. Let’s look at the count for the COLLECT COD category of L_SHIPINSTRUCT. The count provided by the approximate method is around 9.93 million, while the result from the exact count was 10 million. This is an error of around 0.65 percent, while at the same time, the query execution time has come down to 21 seconds from almost 5 minutes.
How HyperLogLog actually works:
APPROX_COUNT_DISTINCT uses Snowflake’s bias-corrected HyperLogLog (HLL) implementation. HLL works by hashing input values into a large number of sub-streams (registers). For each register, it tracks the maximum number of leading zeros in the hash values seen so far. It then uses a harmonic mean across all registers to produce a final cardinality estimate. The use of multiple sub-streams and harmonic mean (rather than a simple maximum) is what distinguishes HyperLogLog from the older LogLog algorithm and gives it better accuracy.
Snowflake’s HLL implementation has an average relative error of 1.62%, per Snowflake’s documentation. That means for a query where the exact count would return 1,000,000, HyperLogLog typically returns a value between approximately 983,767 and 1,016,233. At the 1.62% error tolerance, the tradeoff against a multi-minute runtime is almost always worth it for exploratory analytics.
HLL is also memory-efficient. Snowflake’s implementation uses at most 4,096 bytes per aggregation group (2^12 sub-streams at 1 byte each), making it practical at any scale.
4) Extracting day, month, year and other calendar components from date data
In this section, we’ll look at the various Snowflake SQL functions for extracting specific parts of date data, such as the day, day of the week, week of the year, day name and important dates in business reporting cycles, such as the last day of the month or the first day of the month. Because time and date are widely employed as dimensions in data warehousing, these features are critical. We will also discuss some issues with configuring the week’s start in Snowflake and show how to create a utility calendar table with several date-related columns.
To begin, we will start with a straightforward date generator as part of this guide. The generator function will generate calendar dates for the entire year (365 rows). We will use the generated date field to derive additional fields in the table. Let’s start by generating some dates:
Step 1—Generate calendar dates
Let’s start by generating some dates using seq4() Snowflake SQL function to generate a sequence of numbers. These numbers will be added to the first date of the year 2023. To generate 365 rows, we can use the following code, which utilizes the GENERATOR() function:
SELECT (TO_DATE('2023-01-01') + SEQ4()) AS gen_calendar_data
FROM TABLE(GENERATOR(ROWCOUNT => 365));

Step 2—Extract day, month and year
Now that we have the foundation set up, we can begin adding Snowflake SQL functions to extract different parts of the date. In this step, we will extract the day, month and year using the DATE_PART() function:
SELECT
(TO_DATE('2023-01-01') + SEQ4()) AS gen_calendar_data,
DATE_PART(day, gen_calendar_data) AS Day_,
DATE_PART(month, gen_calendar_data) AS Month_,
DATE_PART(year, gen_calendar_data) AS Year_
FROM TABLE(GENERATOR(ROWCOUNT => 365));

Step 3—Add first and last day of the month
Now that we have extracted the basic date components, we can enhance it further by adding additional fields. We will include the first and last day of the month for each date in our dataset. Two columns will be added, as demonstrated in the following code:
SELECT
(TO_DATE('2023-01-01') + SEQ4()) AS gen_calendar_data,
DATE_PART(day, gen_calendar_data) AS Day_,
DATE_PART(month, gen_calendar_data) AS Month_,
DATE_PART(year, gen_calendar_data) AS Year_,
DATE_TRUNC('month', gen_calendar_data) AS first_day_of_month,
DATEADD('day', -1, DATEADD('month', 1, DATE_TRUNC('month', gen_calendar_data))) AS last_day_of_month
FROM TABLE(GENERATOR(ROWCOUNT => 365));

Step 4—Add English month names
Moving on, let’s add the English names of the months to the existing dataset. We already have the month available in the CAL_MONTH column. We can simply use the DECODE() function to obtain the English names of the months, as shown in the following code:
SELECT
(TO_DATE('2023-01-01') + SEQ4()) AS gen_calendar_data,
DATE_PART(day, gen_calendar_data) AS Day_,
DATE_PART(month, gen_calendar_data) AS Month_,
DATE_PART(year, gen_calendar_data) AS Year_,
DATE_TRUNC('month', gen_calendar_data) AS first_day_of_month,
DATEADD('day', -1, DATEADD('month', 1, DATE_TRUNC('month', gen_calendar_data))) AS last_day_of_month,
DECODE(Month_,
1, 'January',
2, 'February',
3, 'March',
4, 'April',
5, 'May',
6, 'June',
7, 'July',
8, 'August',
9, 'September',
10, 'October',
11, 'November',
12, 'December') AS month_name
FROM TABLE(GENERATOR(ROWCOUNT => 365));

Step 5—Capture quarter-end dates
You may take it a step further by including a column in the dataset that captures the end of the quarter for each date. To determine the last day of the month and the last day of the quarter for each date in the table (make use the DATEADD() Snowflake SQL function). The procedure is similar to what we did before.
Simply add the following code snippet to the preceding code.
DATEADD('day', -1, DATEADD('month', 3, DATE_TRUNC('quarter', CAL_DT))) AS cal_qtr_end

5) Running ordered analytics for marketing insights and large dataset management
This section will cover Snowflake’s capability to perform ordered analytics over subsets of rows, particularly useful in marketing analytics applications where trends are identified using moving averages or cumulative functions. These features are essential for data scientists working with large datasets.
Step 1—Generate sample data and create a view with data generation logic
To begin, let’s generate some sample data. We will create a view that includes the necessary logic to generate the data. The following query generates the base data for implementing the view. We will be using the OrderedAnalytics database for this purpose.
CREATE DATABASE IF NOT EXISTS OrderedAnalytics;
USE OrderedAnalytics ;
Now let’s create a view that generates 365 records, where customer_id consists of five unique values (0 to 4) that repeat throughout the records. The deposit_amount column contains values from the sets {100, 200, 300, 400 and 500}. The deposit_date column represents the dates, starting from the current date and going back 365 days.
CREATE OR REPLACE VIEW customer_deposits AS
SELECT
MOD(SEQ4(), 5) AS customer_id,
(MOD(UNIFORM(50, 500, RANDOM()), 5) + 1) * 100 AS deposit_amount,
DATEADD(DAY, -SEQ4(), CURRENT_DATE()) AS deposit_date
FROM TABLE(GENERATOR(ROWCOUNT => 365));
select * from customer_deposits;

Step 2—Run window functions for deposit comparison
Now that we have created the view, let’s use this dataset to run various window functions available in Snowflake. We will focus on highlighting customers who made a single deposit greater than the cumulative sum of their last two deposits:
SELECT
customer_id,
deposit_date,
deposit_amount,
deposit_amount > COALESCE(SUM(deposit_amount) OVER (
PARTITION BY customer_id
ORDER BY deposit_date
ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING), 0) AS hi_deposit_alert
FROM customer_deposits
ORDER BY customer_id, deposit_date DESC;

As you can see, this query generates a new column that displays either TRUE or FALSE, depending on whether a customer made a deposit greater than the sum of their last two deposits.
Step 3—Analyze deposits with cumulative sum
Snowflake provides a range of other functions for window frames, allowing for different configurations. Let’s modify the window range and use the average instead of the sum. We will specify a window that includes all deposits made prior to the current row. By calculating the average deposit within that window and comparing it to the current deposit, we can identify higher-than-normal deposits:
SELECT
customer_id,
deposit_date,
deposit_amount,
COALESCE(AVG(deposit_amount) OVER (
PARTITION BY customer_id
ORDER BY deposit_date
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS past_average_deposit,
deposit_amount > past_average_deposit AS hi_deposit_alert
FROM customer_deposits
WHERE CUSTOMER_ID = 3
ORDER BY customer_id, deposit_date DESC;

Windowing functions in Snowflake enable aggregation and utilization of information from rows beyond the current row. These functions allow aggregations across related rows, without grouping and reducing the number of output rows.
Save up to 30% on your Snowflake spend in a few minutes!
Conclusion
These five techniques cover distinct layers of Snowflake SQL capability: sequence generation, timestamp handling, approximate counting, date extraction and ordered analytics. None of them are obvious from the basic SQL docs, and all of them show up regularly in production data pipelines and analytics workloads. Mastering them puts you in a much better position to write queries that are both correct and fast.
FAQs
How do I manage timestamp data in Snowflake?
Snowflake has three timestamp variants: TIMESTAMP_NTZ (no time zone), TIMESTAMP_LTZ (local time zone) and TIMESTAMP_TZ (explicit offset stored). Use TO_DATE(), TO_TIME() and TO_TIMESTAMP() to convert strings. The TIMESTAMP_TYPE_MAPPING session parameter controls which variant the generic TIMESTAMP alias resolves to.
What does “shredding” date data mean in Snowflake?
It refers to extracting individual calendar components from a date value. Use DATE_PART(year, col), DATE_PART(month, col) and DATE_PART(day, col), or the equivalent EXTRACT syntax. For month names, TO_CHAR(col, ‘MMMM’) is cleaner than a DECODE block.
How does Snowflake handle unique counts at scale?
Use COUNT(DISTINCT …) for exact results and APPROX_COUNT_DISTINCT() for fast approximations on large datasets. The latter uses HyperLogLog with an average relative error of 1.62% per Snowflake’s documentation.
What is the difference between TIMESTAMP_NTZ, TIMESTAMP_LTZ and TIMESTAMP_TZ?
TIMESTAMP_NTZ stores no time zone information; the value is returned exactly as entered. TIMESTAMP_LTZ stores in UTC and converts to the session time zone on display. TIMESTAMP_TZ stores the value with an explicit UTC offset.
Are Snowflake sequences gap-free?
No. Snowflake sequences guarantee uniqueness but not gap-free ordering. In concurrent environments with the default NOORDER setting, values may appear out of order. Use ROW_NUMBER() if you need a gap-free sequence.
What is the default sequence ordering in Snowflake since the 2024_01 behavior change?
New sequences and auto-incremented columns default to NOORDER as of the 2024_01 behavior change bundle. If you need ascending ordered values, explicitly specify ORDER in your CREATE SEQUENCE statement.
How do I generate a calendar table in Snowflake?
Use SEQ4() with GENERATOR(ROWCOUNT => n) to produce a row per day, then apply DATE_TRUNC, DATE_PART, DATEADD and DECODE or TO_CHAR to derive calendar columns.
Can I use SQL to perform window-based analytics in Snowflake?
Yes. Snowflake supports the full ANSI window function syntax including PARTITION BY, ORDER BY, ROWS BETWEEN and RANGE BETWEEN frames. Functions include SUM, AVG, COUNT, MIN, MAX, LEAD, LAG, ROW_NUMBER, RANK and DENSE_RANK.
How can I optimize query performance in Snowflake?
Key strategies include choosing the right warehouse size, using clustering keys on large tables, partitioning by high-cardinality filter columns, using result caching for repeated queries and replacing COUNT(DISTINCT …) with APPROX_COUNT_DISTINCT() where exact precision isn’t required.
How does Snowflake handle time zone conversion?
Set TIMEZONE at the session level using ALTER SESSION SET TIMEZONE = ‘Region/City’. For TIMESTAMP_LTZ, Snowflake stores values in UTC and converts them to the session time zone on output. For TIMESTAMP_TZ, the stored offset doesn’t change when the session time zone changes.
Does Snowflake support semi-structured data types in SQL?
Yes. Snowflake’s VARIANT, OBJECT and ARRAY types store JSON, Avro and Parquet natively. You can query nested fields using colon notation (col:field::STRING) and flatten arrays with the LATERAL FLATTEN function.
Can I perform data masking using SQL in Snowflake?
Yes. Snowflake supports dynamic data masking via masking policies, which you apply to columns using ALTER TABLE … ALTER COLUMN … SET MASKING POLICY. Masking logic is written in SQL and applied at query time based on the querying role.
How does Snowflake handle JSON data?
Use PARSE_JSON() to convert a string to a VARIANT, then use colon-dot notation to access fields (col:address.city::STRING). LATERAL FLATTEN expands arrays into rows. Functions like GET, GET_PATH and JSON_EXTRACT_PATH_TEXT also work for field extraction.
Can I manage user roles and privileges using SQL in Snowflake?
Yes. Use CREATE ROLE, GRANT PRIVILEGE ON OBJECT TO ROLE and GRANT ROLE TO USER to build out your access control model. Snowflake’s role hierarchy supports fine-grained privilege management across databases, schemas, tables and views.
How does Snowflake support data retention and time travel?
Set DATA_RETENTION_TIME_IN_DAYS on databases, schemas or tables using ALTER … SET DATA_RETENTION_TIME_IN_DAYS = n. Time travel lets you query historical states using AT or BEFORE clauses, for example: SELECT * FROM my_table AT (TIMESTAMP => ‘2024-01-01 00:00:00’::TIMESTAMP).
Can I use SQL for data deduplication in Snowflake?
Yes. Common patterns include SELECT DISTINCT, ROW_NUMBER() OVER (PARTITION BY key ORDER BY updated_at DESC) to pick the latest record per key, and MERGE statements to upsert while avoiding duplicates.