March towards SQL Server : Day 28 – SQL DBA Interview Questions Answers – All About Indexes – Part 2


day28

Here goes another post on Indexes. I hope you read my previous post on Indexes. If not, you may quickly go through. SQL DBA interview questions and answers on Indexes Part 1

1. What are the Advantages of Indexes?

  • Query optimization: Indexes make search queries much faster.
  • Uniqueness: Indexes like primary key index and unique index help to avoid duplicate row data.
  • Text searching: Full-text indexes in SQL Server, users have the opportunity to optimize searching against even large amounts of text located in any field indexed as such.
  • Sort Operations: Can be used to avoid a sort operation by providing the data pre-sorted

2. What are the Disadvantages of Indexes?

  • Additional Disk Space
  • Insert Update and Delete statements can become slow
  • Adding indexes to the schema increases the overhead on the database because the indexes will require ongoing maintenance.

3. What is Fill factor in indexes?

Data is stored in the leaf-level pages of Index.  The percentage of space to be filled with data in a leaf level page is decided by fill factor. The remaining space left is used for future growth of data in the page. Fill factor is a number from 1 to 100. Its default value is 0, which is same as 100. So when we say fill factor is 70 means, 70% of space is filled with data and remaining 30% is vacant for future use. So higher the fill factor, more data is stored in the page. Fill factor setting is applied when we create/rebuild index.

4. What are Points to remember while using the FILLFACTOR Argument?

Below are the criteria you need to consider while choosing fill factor.

  • For static/look-up table: these types of tables have static data means data changes very rarely in the table. So we can set high value of fill factor means 100 or 0.
  • For Dynamic table: In this type of table, data get changes(inserted/updated) frequently. So we need to set low fill factor, between 80 to 90.
  • For Table with Clustered Index on Identity Column: Here the data is inserted at the end of the table always. So we can have higher value of fill factor between 95 to 100.

5. How to set Fill Factor?

Server level: A generic fill factor setting is applied at the server level for all table/index. We can set a server level default fill factor by using sp_configure with a parameter as below script.

EXEC sys.sp_configure 'fill factor', 90
GO
RECONFIGURE WITH OVERRIDE
GO

At Index/Table level: While creating/rebuilding index we can set a specific fill factor. We can use below script to set fill factor while rebuilding index.

ALTER INDEX Index_name ON [SchemaName].[TableName]
REBUILD WITH (FILLFACTOR = 80);
GO

6. What is Pad index in Index?

This option specifies index padding. When turned ON, it uses the percentage specified by FILLFACTOR is applied to the intermediate-level and root level pages of an index.

PAD_INDEX = { ON | OFF } When ON, the percentage of free space specified by FILLFACTOR is applied to the intermediate level pages of the index. When OFF or a FILLFACTOR value is not specified, the intermediate level pages are filled to near capacity leaving enough space for at least one row of the maximum size the index can have, considering the set of keys on the intermediate pages. The default is OFF.

7. Can we have indexes on Views?

Indexes can be defined on views. Indexed views are a method of storing the result set of the view in the database, thereby reducing the overhead of dynamically building the result set. An indexed view also automatically reflects modifications made to the data in the base tables after the index is created.

8. What is Index Fragmentation?

Index fragmentation is an “expected” and “unavoidable” characteristic of any OLTP environment. Fragmentation is defined as any condition which causes more than the optimal amount of disk I/O to be performed in accessing a table, or causes the disk I/Os that are performed to take longer than they optimally would.

9. What are the Causes of Index Fragmentation?

  • Insert and Update operations causing Page splits
  • Delete operations
  • Initial allocation of pages from mixed extents
  • Large row size

10. How many types of Index Fragmentation are there in SQL Server?

Various Types of Index Fragmentation

  • Internal Fragmentation
  • Logical Fragmentation
  • Extent Fragmentation

11. What is Internal Fragmentation?

When pages are less than fully used, the part of each page that is unused constitutes a form of fragmentation, since the table’s or index’s rows are no longer packed together as tightly as they could be. This is known as Internal Fragmentation.

12. How Internal Fragmentation occurs?

It usually occurs due to –

  • Random deletes resulting in empty space on data pages
  • Page-splits due to inserts or updates
  • Shrinking the row such as when updating a large value to a smaller value
  • Using a fill factor of less than 100
  • Using large row sizes

13. What is Logical Fragmentation?

SQL Server uses 8KB Pages to store data on disk. When a clustered index is created on a table, SQL Server creates a b-tree data structure for the index and links all data pages at the leaf level of the tree in a logical order by using a doubly linked list. Logical fragmentation occurs when the pages in this doubly linked list are not contiguous in the index, meaning that indexes have pages in which the logical ordering of pages, which is based on the key value, does not match the physical ordering inside the data file.

14. How Logical Fragmentation occurs?

This could happen due to –

  • Page-splits due to inserts or updates
  • Heavy deletes that can cause pages be removed from the page chain, resulting in dis-contiguous page chain

15. What is External Fragmentation?

Extent fragmentation occurs when the extents of a table or index are not contiguous with the database leaving extents from one or more indexes intermingled in the file.

16. How External Fragmentation occurs?

This can occur due to:

  • Random deletes, which could leave some of the pages in an extent unused while the extent itself is still reserved as part of the table’s space allocation. Think of it like Internal fragmentation, but in extents instead of pages
  • Deletes on ranges of contiguous rows within the table, causing one or more entire extents to become de-allocated, thus leaving a gap between the surrounding extents of the table or index
  • Interleaving of a table’s data extents with the extents of other objects

17. How to get information about the Index Fragmentation in a database?

We can use sys.dm_db_index_physical_stats DMF.

18. What all modes are available to get a report about Index fragmentation?

DMF “sys.dm_db_index_physical_stats ” has three modes –

  1. DETAILED – reads all data and index pages. Be careful with using this options since it causes the entire index be read into memory and may result in IO/Memory issues
  2. SAMPLED: reads 1% of the pages if more than 10,000 pages
  3. LIMITED: only reads the parent level of b-tree (same as DBCC SHOWCONTIG WITH FAST). Limited option doesn’t report page density, since it does not read the leaf level pages

19. How to resolve Fragmentation in Indexes?

  • Rebuild – ALTER INDEX … REBUILD (replaces DBCC REINDEX)
  • Reorganize – ALTER INDEX … REORGANIZE (replaces DBCC INDEXDEFRAG)

20. What is the difference between Index Rebuild and Index Reorganize?

Characteristic Alter Index REORGANIZE Alter Index REBUILD
Online or Offline Online Offline (unless using the Online keyword)
Address Internal Fragmentation Yes (can only raise page density) Yes
Address Logical Fragmentation Yes Yes
Transaction Atomicity Small Discrete Transactions Single Atomic Transaction
Rebuild Statistics Automatically No Yes
Parallel Execution in multi-processor machines No Yes
Untangle Indexes that have become interleaved within a data file No Yes
Transaction log space used Less More
Additional free space required in the data file No Yes

 21. What is the difference between Online and Offline operations?

Online rebuild is a process where while the index is rebuild, it can still be used. it is like part by part is copied in memory as a temporary index and once everything is done, the actual index/table is locked for sometime during which old index is dropped and newly created temporary index is renamed.
Offline is where the table/index has to be locked during the time of index rebuild. Bad about online rebuild is you need more memory but the best part is no impact or rather minimum impact on application and/or users. Good part of offline index is no extra memory needed but worst part is table/index is locked during the process of rebuild.
You should always try to rebuild index online but if it is failing then you have to go with offline rebuild.

22. How to resolve Fragmentation for a heap table?

SQL Server 2008 onwards we can use REBUILD command to remove fragmentation from heap.

ALTER TABLE…REBUILD (SQL 2008+).

23. What are missing indexes in a database?

When the query optimizer generates a query plan, it analyzes what are the best indexes for a particular filter condition. If the best indexes do not exist, the query optimizer generates a suboptimal query plan, but still stores information about these indexes. The missing indexes feature enables you to access information about these indexes so you can decide whether they should be implemented.

24. Which DMV is used to identify missing Indexes in a database?

sys.dm_db_missing_index_details

25. How can we identify Index usage history?

sys.dm_db_index_usage_stats – Returns counts of different types of index operations and the time each type of operation was last performed.

26. What is SORT_IN_TEMPDB option while creating or rebuilding an Index?

When we create or rebuild an index, by setting the SORT_IN_TEMPDB option to ON we can direct the SQL Server Database Engine to use tempdb to store the intermediate sort results that are used to build the index. Although this option increases the amount of temporary disk space that is used to create an index, the option could reduce the time that is required to create or rebuild an index when tempdb is on a set of disks different from that of the user database.

27. What is data compression in a SQL server database?

SQL Server 2012 supports row and page compression for tables and indexes. You can use the data compression feature to help compress the data inside a database, and to help reduce the size of the database. In addition to saving space, data compression can help improve performance of I/O intensive workloads because the data is stored in fewer pages and queries need to read fewer pages from disk. However, extra CPU resources are required on the database server to compress and decompress the data, while data is exchanged with the application.

28. What all database objects can be configured with Data compression?

Data compression can be configured for the following database objects:

  • A whole table that is stored as a heap.
  • A whole table that is stored as a clustered index.
  • A whole nonclustered index.
  • A whole indexed view.
  • For partitioned tables and indexes, the compression option can be configured for each partition, and the various partitions of an object do not have to have the same compression setting.

29. What are the Limitations and Restrictions of Data Compression?

  • System tables cannot be enabled for compression.
  • If the table is a heap, the rebuild operation for ONLINE mode will be single threaded. Use OFFLINE mode for a multi-threaded heap rebuild operation.
  • You cannot change the compression setting of a single partition if the table has nonaligned indexes.

30. Various types of compression in SQL Server database?

Data compression is implemented at two levels: ROW and PAGE.

31. What are the benefits of Column Stored index?

The benefits of using a non-clustered columnstore index are:

  • Only the columns needed to solve a query are fetched from disk (this is often fewer than 15% of the columns in a typical fact table)
  • It is easier to compress the data due to the redundancy of data within a column
  • Buffer hit rates are improved because data is highly compressed, and frequently accessed parts of commonly used columns remain in memory, while infrequently used parts are paged out.

References: Thanks to the all the SQL Server techies who wrote and shared the valuable information in the below blogs which helped me a lot to prepare this series of Questions. Also big thanks to Microsoft Documentation which contains each and everything about their product.

http://www.codeproject.com/Tips/830192/Fill-factor-in-SQL-server

https://robertbgoodman.wordpress.com/2013/04/02/benefit-of-column-store-indexes/


One response to “March towards SQL Server : Day 28 – SQL DBA Interview Questions Answers – All About Indexes – Part 2”