Tech Point Fundamentals

Friday, June 24, 2022

SQL Interview Questions and Answers - Part 05

SQL Interview Questions and Answers - Part 05

sql-interview-questions-answers

Are you preparing for the SQL Interview? If yes, then you are at the right place. This is the SQL Interview Questions and Answers article series. Here we will see Top 100+ SQL Interview Questions with Answers. 

Please visit our YouTube Channel for Interviews and other videos by below link:




Please read the complete Design Pattern, C#, MVC, WebAPI, and .Net Framework Interview Questions and Answers article series here.




Introduction


This is the 5th part of the SQL Interview Questions and Answers article series. Each part contains eight to ten SQL Interview Questions with Answers. Please read all the SQL Interview Questions list here.

I will highly recommend to please read the previous parts over here before continuing the current part:





SQL Interview Questions and Answers - Part 05


Q33. What is the difference between Unicode data types and Non-Unicode data types?

SQL Server supports both Unicode and Non-Unicode characters and hence supports multiple languages. Varchar, Char, and Text datatypes support Non-Unicode data and Nvarchar, Nchar, Ntext datatypes support Unicode data.



Unicode Data Types:

A Unicode character takes more bytes to store the data in the database. Unicode data types are used to store international language characters like Chinese, Japanese, Korean and Arabic, etc. SQL Server has supported Unicode since SQL Server 7.0 by providing nchar, nvarchar, ntext data types.

The difference in storing character data between Unicode and non-Unicode depends on whether non-Unicode data is stored by using double-byte character sets. All Non-East Asian languages and the Thai language store Non-Unicode characters in single bytes. 

Therefore, storing these languages as Unicode uses two times the space that is used specifying a non-Unicode code page. On the other hand, the non-Unicode code pages of many other Asian languages specify character storage in double-byte character sets (DBCS). Therefore, for these languages, there is almost no difference in storage between non-Unicode and Unicode.



Unicode data can be encoded in many different ways. Some of the common encoding formats for Unicode are UCS-2, UTF-8, UTF-16, and UTF-32. Under this mechanism, all Unicode characters are stored by using 2 bytes.

The NCHAR, NVARCHAR, NTEXT are the Unicode data types in SQL. These literals are prefixed with N also apart from single quotes i.e  N'Tech Point Fundamentals'.

Unicode data takes twice the storage as Non-Unicode data, considering this, it is important to define the right data type for the application. This disadvantage on the storage side can take a performance hit as more data pages have to be read to fetch the data.

The main disadvantage of Unicode datatype is that they are slow and require more space in the disk. Sorting lots of Unicode data can be slower than non-Unicode data because the data is stored in double bytes. 



Non-Unicode Data Types:

The Non-Unicode is exactly the opposite of Unicode. Using Non-Unicode it is easy to store languages like ‘English’ but not other Asian languages that need more bits to store correctly otherwise truncation will occur. They are best suited for US English.

The char, varchar, text are the Non-Unicode data types in SQL. These literals are enclosed in single quotes i.e 'Tech Point Fundamentals'.

The main advantages of Non-Unicode data types are performance and disk space. It takes less space to store the data in the database hence we will save a lot of hard disk space. Moving database files from one server to other also takes less time.



Q34. What is the difference between fixed-length data type and variable-length data type?

The database server supports the storage of both fixed-length and varying-length character data. 

Fixed Length Data Types:

A fixed-length data type is a specified length that cannot be changed. A fixed-length column requires the user-defined number of bytes regardless of the actual size of the data. 

The CHAR data type is of fixed length. A fixed-length string is a type of string where the length of the string is fixed. No matter how many characters constitute the string, the space is fixed.

For example, a CHAR(80) column requires 80 bytes of storage for all values, so the string “Tech Point Fundamentals” uses all the 80  bytes of storage. 








Variable Length Data Types:

The variable length refers to anything whose length can vary. A varying-length column size can be the number of bytes occupied by its data. NVARCHAR and VARCHAR and the data types are varying-length character data types.

For example, a VARCHAR(80) column reserves up to 80 bytes of storage for the column value, but the character string “Tech Point Fundamentals” uses only 23 bytes of the reserved 80 bytes.

Microsoft SQL Server 2008 and above can store up to 8000 characters as the maximum length of the string using varchar data type.




Q35. What is the difference between CHAR vs VARCHAR data types?

Both CHAR and VARCHAR data types are used to store string data types in SQL. However:

  1. VARCHAR is variable length, while CHAR is fixed length. 
  2. Since CHAR is a fixed-length string data type, any remaining space in the field is padded with blanks. 
  3. CHAR takes up 1 byte per character while VARCHAR takes up 1 byte per character + 2 bytes to hold length information. So VARCHAR only costs two “extra” bytes, when compared to CHAR. 

So, a CHAR(100) field takes up 100 bytes on the disk, regardless of the string it holds. On the other hand, if you set a VARCHAR(100) datatype = 'TechPoint', then it would take up 9 bytes plus 2 bytes i.e. or 11 bytes in all.




Q36. What is the difference between VARCHAR vs NVARCHAR data types?

Both VARCHAR and NVARCHAR data types are variable-length data types in SQL. However, the key difference between varchar and nvarchar is the way they are stored.

  1. VARCHAR is a Non-Unicode character data type which is used to store ASCII values, while NVARCHAR is a Unicode character data type which can be used to store non-ASCII characters as well.
  2. VARCHAR is used when you know that you store only English characters or Non- Unicode characters whereas NVarchar is used where you want to store Non-English characters or Unicode characters.
  3. VARCHAR takes 1 byte of space for each character whereas NVarchar takes 2 bytes of space for each character. which means NVARCHAR is twice as expensive as VARCHAR.
  4. A VARCHAR can stores a maximum of 8000 characters where as NVarchar can store a maximum of 4000 Unicode/Non-Unicode characters.
  5. VARCHAR literals are enclosed in single quotes, like 'TechPoint,' but NVARCHAR literals are prefixed with N also, for example,  N'TechPoint'.
  6. Memory allocation in VARCHAR is equal to the inserted number of characters plus two extra bytes for offset while in nvarchar memory allocation is equal to twice of the inserted number of characters plus with extra bytes for offset.

In SQL Server NVARCHAR takes more space than VARCHAR data type, almost 2x as much space as VARCHAR.  If Optional parameter value n is not specified in the variable declaration or column definition then it is considered as 1. But When optional parameter n is not specified while using the CAST/ CONVERT functions, then it is considered as 30.



Q37. What is the max size of a VARCHAR and NVARCHAR data type in SQL, Why? 

VARCHAR is a non-Unicode character data type which is used to store ASCII values, while NVARCHAR is a Unicode character data type which can be used to store non-ASCII characters as well.

The varchar is stored as regular 8-bit data(1 byte per character) and nvarchar stores data at 2 bytes per character. Due to this reason, nvarchar can hold upto 4000 characters only and it takes double the space as SQL varchar.

So a Varchar can store maximum 8000 characters where as Nvarchar can store maximum 4000 Unicode/Non-Unicode characters. The reason behind this limitation is the SQL 8KB page. Please watch the SQL 8KB page video here for more details.



So in cases when there are chances that the string length of the varchar column might exceed 8000 bytes, using varchar(8001) or anything higher will result into an error. 

The SQL Server 2005 got around this limitation of 8KB storage size and provided a workaround with varchar(max). It is a non-Unicode large variable-length character data type and can store a maximum of 2^31-1 bytes (2 GB) of non-Unicode characters.




Actually, SQL Server uses page to store data and the size of each page is 8KB. If the data to be stored is less than or equal to 8000 bytes, varchar(n) or varchar(max) stores it in-row. However, if the data exceeds the 8000 byte size then it is treated as a Large Object(LOB) and they are not stored in-row but in separate LOB pages(LOB_DATA). 

Row in such case will only have a pointer to the LOB data page where the actual data is present and SQL Server automatically assigns an over-flow indicator to the page to manipulate data rows. In nutshell, if you know the data might exceed 8000 byte, it is a better option to use varchar(max) as the data type.




Q38. What is the difference between VARCHAR(MAX) and VARCHAR(n) data types?

In practical scenarios, varchar(n) is used to store variable length value as a string, here ‘n’ denotes the string length in bytes and it can go up to 8000 characters.  We can use string length up to varchar(8000) only as this is the maximum number of characters that SQL varchar(n) data type can hold. So in cases when there are chances that the string length of the varchar column might exceed 8000 bytes, using varchar(8001) or anything higher will result into an error.

SQL Server 2005 got around this limitation of 8KB storage size and provided a workaround with varchar(max). It is a non-Unicode large variable-length character data type and can store a maximum of 2^31-1 bytes (2 GB) of non-Unicode characters.



Actually, SQL Server uses pages to store data and the size of each page is 8KB. SQL Server stores data in 8KB pages (8,192 bytes), but the page header occupies the first 96 bytes of each data page leaving 8,096 bytes for data, row overhead, and row offsets, that’s why you cannot declare a value greater than VARCHAR(8000).

If the data to be stored is less than or equal to 8000 bytes, varchar(n) or varchar(max) stores it in-row. However, if the data exceeds the 8000-byte size then it is treated as a Large Object(LOB) and they are not stored in-row but in separate LOB pages(LOB_DATA). 

The biggest limitation of VARCHAR(MAX) or NVARCHAR(MAX) is that we can not create an index on these columns, hence we can not make them as Primary Key or Unique Key as well.

Please watch the complete video here.



Q39. What is the difference between VARCHAR(MAX) and TEXT(n) data types?

Both VARCHAR(Max) and VARCHAR(n) are variable length data types, they have some major differences pertaining to storage, behavior and of course their intended usage.

The VARCHAR(Max), NVARCHAR(max), and VARBINARY(max) string data types were first introduced in SQL Server 2005 to replace the Large Object Blob (LOB) data types TEXT, NTEXT, and IMAGE respectively. All of these data types can store up to 2GB of data except NVARCHAR(max) that can store 1GB of Unicode characters.

One of the advantages of the VARCHAR(max) data type as a replacement of TEXT data type is that we can declare local variables to manipulate LOBs and even declare VARCHAR(max) parameters on functions and stored procedures. This is something that cannot be done with the TEXT data type.



The VARCHAR(max) data type can be used inside string functions such as REPLACE, CHARINDEX or LEFT instead of using READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate LOBs. However, there is an obvious limitation on VARCHAR(max) columns which is that these columns cannot be indexed. 

For read, write or update TEXT data using the SELECT, INSERT or UPDATE statements and special statements like READTEXT, WRITETEXT and UPDATETEXT  also can be used. But you can directly read, write or update VARCHAR(MAX) data using the SELECT, INSERT or UPDATE statements no special statements READTEXT, WRITETEXT and UPDATETEXT  can be used.



Q40. What is the difference between LEN() and DATALENGTH() in SQL?

LEN() function only display the length of the string; it does not specify how many bytes used by the Varchar or Nvarchar. On the other hand DATALENGTH() is especially useful with varchar, varbinary, text, image, nvarchar, and ntext data types because these data types can store variable-length data.

LEN() is used to return the number of characters in a string while DATALENGTH() is used to return the number of bytes used by any expression or column.

LEN() function does NOT include trailing spaces of string expression as part of the LEN() measurement. While DATALENGTH() is used to return the number of bytes used by any expression/column whose data type is UNICODE.



The input parameter is string expression and LEN would allow the input parameter that could be constant OR variable OR column of character OR binary data OR NULL. But It does not allow the TEXT and IMAGE data types as the input parameter. On the other hand in DATALENGTH() input parameter is an expression and it could be any data type.




The return type of the LEN() function is BIGINT if expression is of the varchar(max), nvarchar(max) or varbinary(max) data types; otherwise, INT. On the other hand in DATALENGTH() the return type is BIGINT if expression is of the varchar(max), nvarchar(max) OR varbinary(max) data types; otherwise, INT.

Please watch the LEN vs DATALENGHT video here for more details.



Q41. What is the difference between DateTime and DateTime2 data types in SQL?

  1. Datetime2 is ANSI & ISO 8601 standard SQL Compliant while DateTime does not.
  2. The date range for DateTime is 1753-01-01 thru 9999-12-31 while for datetime2 is 0001-01-01 thru 9999-12-31. So DateTime 2 has a larger date range.
  3. The time range for datetime is 00:00:00 thru 23:59:59.997 while for datetime2 is 00:00:00 thru 23:59:59.9999999.
  4. The storage size for DateTime is 8 bytes while datetime2 is 6 to 8 bytes, depending on the precision(Plus 1 byte to store the precision). So the datetime2 has less storage size when using the same number of decimal places as DateTime.
  5. The accuracy of DateTime is rounded to increments of .000, .003, or .007 seconds while DateTime has the accuracy of 100 nanoseconds. So datetime2 has higher accuracy.
  6. The datetime2 has User-defined fractional second precision but DateTime does not have. Also, the datetime2 has larger default fractional precision.



To Be Continued Part-06...


Recommended Articles






Thanks for visiting this page. Please follow and join us on LinkedInFacebookTelegramQuoraYouTubeTwitterPinterestTumbler, and VK for regular updates.

    

No comments:

Post a Comment

Please do not enter any HTML. JavaScript or spam link in the comment box.