Difference between View and Indexed View or Materialized


Hello, welcome to Quickpickdeal in this session, we will learn about indexed views. Before we proceed, I strongly recommend reading the two posts below in this series.

  • What is a view and the advantages of views with an example
  • Updatable views in sql server

  • What is an indexed or materialized view, and what happens when you create an index on a view?

  • A standard non-indexed view is simply a stored SQL query. When data is retrieved from this view, it's actually fetched from the underlying base tables.

  • So, a view is essentially a virtual table. By default, it doesn't store any data. However, this default behavior can be altered by creating an index on the view.

  • Once an index is created on a view, the view becomes materialized, meaning it is now capable of storing data.

  • In SQL Server, we refer to these as indexed views, while in Oracle, they are termed as materialized views.

  • Let's consider an example: Suppose I have an event organization company and I want to sell event tickets. For this purpose, I have a TicketDetail table, which contains fields such as Ticket name, ShowTime, unit price, and TicketType (Bronze, Silver, & Gold). This table stores the master ticket details. Additionally, I have a TicketSale table which records the sales of tickets, essentially storing ticket sales data.

sql server indexed view vs materialized view

Now, based on these two tables, I want you to create a view that can result in below-aggregated data.

2

So, I want you to write a view that can actually return the total sales and total transactions by ticket. 

Now, we want to create an index on this view so it can be materialized. Obviously, if you want a view to be indexed, then there are certain rules that you should follow.

Create view vWTotalSaleByTicket
with SchemaBinding
as
Select TicketName,
SUM(ISNULL((TotalQuantitySold * UnitPrice), 0)) as TotalSales,
COUNT_BIG(*) as TotalTransactions
from dbo.TicketSale
join dbo.TicketDetail
on dbo.TicketDetail.Id = dbo.TicketSale.TicketId
group by TicketName

If you examine the implementation of the view, you'll notice that we're using the 'with schema binding' option.

Now, the first rule states that the view should be created with the 'schema binding' option if you plan to create an index on it. So, if you intend to create an index on a view, it must be created with the 'schema binding' option.

The second rule is that if an aggregate function in the select list references an expression and there's a possibility for that expression to become null, then a replacement value should be specified.

If you inspect the view, you'll see that we're utilizing an aggregate function, such as SUM(). We use this because we aim to compute the total sales made.

Create view vWTotalSaleByTicket
with SchemaBinding
as
Select TicketName,
SUM(ISNULL((TotalQuantitySold * UnitPrice), 0)) as TotalSales,
COUNT_BIG(*) as TotalTransactions
from dbo.TicketSale
join dbo.TicketDetail
on dbo.TicketDetail.Id = dbo.TicketSale.TicketId
group by TicketName

We're utilizing the Sum() function to calculate total sales. To achieve this, we multiply the quantity sold by the unit price for each row in the TicketSale table.

So, we're multiplying the unit price by the quantity sold. However, if for any reason the TicketId or quantity sold is Null, this expression could result in Null.

According to the second rule, if an aggregate function in the select list employs an expression and there's a chance for that expression to yield Null, a replacement value should be specified.

In the case where this expression becomes Null, instead of returning Null to the Sum() function, I opt to provide zero as the replacement value.

SUM(ISNULL((TotalQuantitySold * UnitPrice), 0)) as TotalSales

  • Another crucial rule is that if you utilize a GROUP BY clause in your view, then the select list of the view must include a COUNT_BIG() function.

    In our case, we're grouping by the ticket, total sales, and total transactions, thus necessitating the use of a GROUP BY clause. Consequently, we need to count the total number of transactions, hence the use of COUNT_BIG().

    While you can also employ the count() function, there's a limitation: if you intend to create an index on the view, you cannot use count(); you must use COUNT_BIG().

    Moreover, base tables in the view must be referenced with a two-part name. In this view, we have two base tables: TicketSale and TicketDetail. Therefore, when referencing these tables in the view, you should utilize the two-part name, consisting of the schema name and the table name. Using a one-part name without a schema or a three-part/four-part name is not permissible. You must strictly adhere to the two-part name format: schema name followed by a dot, then the table name.

from dbo.TicketSale
join dbo.TicketDetail
on dbo.TicketDetail.Id = dbo.TicketSale.,TicketId

as you can above we are using dbo. table name. if we will not use the two-part name let’s see what happens.

3

it will give the error
“Msg 319, Level 15, State 1, Line 6
Incorrect syntax near the keyword ‘with’. If this statement is a common table expression, an XML namespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.”

These are only a few of the rules. But for the complete list of rules, please visit Microsoft.We have just created a view and we are executing the select statement on that view.

When we execute the query 'select * from vWTotalSaleByTicket', currently, this view does not store any data.

What actually happens behind the scenes when we execute 'select * from view' is that the database engine executes the select statement. This means the query has to retrieve data from the base tables and then compute the sum total transactions, among other computations, based on the defined logic in the view.

So, every time you run 'select * from vWTotalSaleByTicket', the database engine has to compute these values anew. 

Imagine the scenario where the ticket sales table is massive, as in the case of companies like Amazon, Flipkart, Walmart, etc., which deal with millions of transactions. If you have to compute these sums every time you execute a query, it could significantly impact the time taken to retrieve results.

However, if we create an index on this view, it would change the scenario.

Create Unique Clustered Index UIX_vWTotalSaleByTicket_Name
on vWTotalSaleByTicket(TicketName)

Remember, the first index you create on a view should be a unique clustered index. This choice is logical because the view, as it currently exists, doesn’t store any data. Without existing data in the view, creating a non-clustered index wouldn’t be feasible. Hence, the initial index should be a unique clustered one.

When I execute the query 'select * from vWTotalSaleByTicket', it retrieves data directly from that index. There's no need to refer back to the base tables for computation.

Every time a transaction occurs and a new row is added to the table, the view recalculates and updates accordingly. Consequently, when users issue 'select * from vWTotalSaleByTicket', the calculations are executed swiftly, significantly enhancing query performance.

Indexed views are particularly advantageous for OLAP systems in a data warehousing environment where data changes infrequently. In such systems, data is primarily utilized for reporting and analysis, often loaded on a periodic basis (e.g., weekly or nightly batch loads), followed by reprocessing.

For optimal data loading with indexed views, the recommended approach involves dropping the index views, loading the data, and then recomputing or recreating the index views. This approach reduces processing time considerably.

Indexed views are best suited for environments with infrequent data changes. However, for OLTP (Online Transaction Processing) systems, indexed views might impede performance because they necessitate continuous updates whenever new transactions occur, potentially impacting overall system performance.

Sql table script

GO
/****** Object: Table [dbo].[TicketDetail] Script Date: 09/30/2020 7:57:48 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TicketDetail](
[Id] [int] NOT NULL,
[TicketName] [varchar](50) NOT NULL,
[ShowTime] [datetime] NOT NULL,
[UnitPrice] [float] NOT NULL,
[TicketType] [varchar](50) NULL,
CONSTRAINT [PK_TicketDetail] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
) ON [PRIMARY]
GO
/****** Object: Table [dbo].[TicketSale] Script Date: 09/30/2020 7:57:48 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TicketSale](
[Id] [int] IDENTITY(1,1) NOT NULL,
[TicketId] [int] NULL,
[TotalQuantitySold] [int] NULL,
CONSTRAINT [PK_TicketSale] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
) ON [PRIMARY]
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (1, 'Dog Event Ticket', CAST('2020-08-19T00:00:00.000' AS DateTime), 20, 'Bronze')
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (2, 'Night Club Party Ticket', CAST('2020-08-19T00:00:00.000' AS DateTime), 10, 'Bronze')
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (3, 'Tomorrowland music festival', CAST('2020-08-19T00:00:00.000' AS DateTime), 15, 'Silver')
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (4, 'La Tomatina Festival', CAST('2020-08-19T00:00:00.000' AS DateTime), 20, 'gold')
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (5, 'Christmas party events', CAST('2020-08-19T00:00:00.000' AS DateTime), 50, 'Bronze')
GO
INSERT [dbo].[TicketDetail] ([Id], [TicketName], [ShowTime], [UnitPrice], [TicketType]) VALUES (6, 'New year''s eve events', CAST('2020-08-19T00:00:00.000' AS DateTime), 100, 'Bronze')
GO
SET IDENTITY_INSERT [dbo].[TicketSale] ON
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (1, 1, 4)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (2, 3, 3)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (3, 2, 4)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (4, 1, 7)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (5, 3, 5)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (6, 4, 8)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (7, 5, 9)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (8, 6, 5)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (9, 4, 1)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (10, 1, 10)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (11, 2, 40)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (12, 3, 25)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (13, 5, 12)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (14, 1, 13)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (15, 2, 10)
GO
INSERT [dbo].[TicketSale] ([Id], [TicketId], [TotalQuantitySold]) VALUES (16, 3, 4)
GO
SET IDENTITY_INSERT [dbo].[TicketSale] OFF
GO