C# 11 - Raw String Literals

C# 11 – What are Raw String Literals?

I came across the name Raw String Literal a few weeks back and wondered what this feature is. In this article, we will try to understand what it means and how it can be used.

What was the previous behavior?

A string literal is a value specified between double quotes. Since the very beginning, C# has supported two different types of string literals:

  • Quoted string literals, specified between double quotes, e.g. "Some Literal". This type of literal can contain escape sequences (e.g. \n for a new line, \t for a tab, hexadecimal character codes, etc.), and each sequence produces the corresponding character in the output. For example, "some-\t-value" renders a tab between the two hyphens.
  • Verbatim string literals, where the opening double quote is prefixed by the @ symbol, e.g. @"some-\t-value". These print all the characters literally, so \t does not render a tab character; it renders a backslash (\) followed by a t. Note that a verbatim string still begins with one double quote and ends with another, so a double quote inside it has to be written as two double quotes ("").

Since we are talking about string literals, let’s also talk about string interpolation. If a regular string literal is prefixed with a $ sign before the opening quote, we can specify a C# expression between opening and closing curly brackets. The expression is evaluated at runtime and the resulting value is inserted into the string.

For example, in the code snippet below, the expression { 2 + 3 } evaluates to 5, and the output string contains this value.
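A minimal sketch of such a snippet, covering quoted literals, verbatim literals and interpolation. The variable names and sample values are illustrative assumptions, not taken from the original listing.

```csharp
using System;

// Quoted literal: \t is an escape sequence and becomes a tab character.
string quoted = "some-\t-value";

// Verbatim literal: \t stays as a backslash followed by the letter t.
string verbatim = @"some-\t-value";

// Interpolated quoted literal: { 2 + 3 } is evaluated to 5.
string interpolated = $"Sum: { 2 + 3 }";

// Interpolated verbatim literal: the $ sign goes in front of the @ prefix.
string interpolatedVerbatim = $@"C:\temp\{ 2 + 3 }";

// Doubled curly brackets are escaped and printed as single brackets.
string escaped = $"{{ 2 + 3 }} = { 2 + 3 }";

Console.WriteLine(quoted);
Console.WriteLine(verbatim);              // some-\t-value
Console.WriteLine(interpolated);          // Sum: 5
Console.WriteLine(interpolatedVerbatim);  // C:\temp\5
Console.WriteLine(escaped);               // { 2 + 3 } = 5
```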

In the case of verbatim strings, the $ character is placed in front of the verbatim prefix @, as shown in the code snippet given above. Since C# 8.0, the reverse order (@$) is also accepted.

If the string is interpolated (prefixed with the $ sign) and we want literal opening and closing curly brackets in the output, each bracket has to be doubled, as shown on the last line of the above code snippet.

The output of the above code snippet is shown below.

[Figure: C# – Quoted Literals, Verbatim Literals and String Interpolation]

What is a “raw string literal”?

This is a new feature introduced with C# 11. What does this feature mean?

A “raw string literal” is a special type of string literal. It can be identified by three double quotes at the beginning and another three double quotes at the end of the string literal (e.g. """string literal""").

To be more precise, a raw string literal is any string literal that starts and ends with at least three double quotes. A raw string literal may therefore have more than three double quotes at its beginning and end, as long as the closing sequence has the same number of quotes as the opening one.

Within these delimiters, a single " is considered content and is included in the string. In other words, a double quote that appears between the opening and closing sets of quotes is treated as a normal character. For example, the string """ The "quotes" are included in the string """ includes the double quotes around the word quotes.

Any run of double quotes shorter than the run that opened the raw string literal is treated as content. If the string begins with three double quotes, then two consecutive double quotes anywhere in the literal are treated as content. So, if you want to print three double quotes, begin and end the raw string literal with four or more double quotes.
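The quote-counting rule can be sketched as follows. The variable names and sample sentences are assumptions made for illustration only.

```csharp
using System;

// Three-quote delimiters: single double quotes inside are plain content.
string withQuotes = """ The "quotes" are included in the string """;

// A run of two quotes is shorter than the three-quote delimiter, so it is content.
string twoQuotes = """Two quotes "" are still content""";

// To embed three quotes in a row, open and close with four quotes.
string threeQuotes = """"Three quotes """ need a four-quote delimiter"""";

Console.WriteLine(withQuotes);
Console.WriteLine(twoQuotes);
Console.WriteLine(threeQuotes);
```

Note that the first literal keeps the space after the opening quotes and the space before the closing quotes, because on a single line everything between the delimiters is content.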

Rules for Raw String Literals

Note that there are certain rules depending on whether the string is single line or multi-line.

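  • If the string literal is a single-line literal (i.e. it begins and ends on a single line in the code), then the opening quotes and closing quotes need to be placed on the same line.
  • If the string literal is multi-line (i.e. it begins on one line and spans multiple lines in the code), then the opening quotes and closing quotes must each be placed on their own line. The line break after the opening quotes and the one before the closing quotes are not part of the string.
  • If the string literal is multi-line, then the whitespace (space or tab characters) appearing to the left of the closing quotes is removed from the start of every line of the content. Every non-empty content line must therefore begin with that same whitespace; otherwise the compiler reports an error.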

The code snippet given below shows examples of the rules mentioned above.
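A minimal sketch of the single-line and multi-line rules. The XML content and the variable names are made up for illustration.

```csharp
using System;

// Single-line: opening and closing quotes sit on the same line.
string singleLine = """A "single-line" raw string literal""";

// Multi-line: opening and closing quotes each sit on their own line.
// The closing quotes are indented by four spaces, so four spaces are
// stripped from the start of every content line.
string multiLine = """
    <person>
        <name>"Raw"</name>
    </person>
    """;

Console.WriteLine(singleLine);
Console.WriteLine(multiLine);
```

In the multi-line value, the first and last lines start at column zero and the middle line keeps only its extra four spaces of indentation. Moving the closing quotes further left would leave more leading whitespace in the result.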

The output of the above program should clarify the rules further.

[Figure: Raw string literal rules – single line vs multi-line literals]

Interpolation with Raw String Literals

We have already seen what an interpolated string is: it begins with the $ character. How does it fit with raw string literals?

The documentation says that we can prefix a raw string literal with multiple $ signs. The number of $ signs determines how many curly brackets open and close an interpolated expression. If we use a single $ sign as the prefix, then the C# expression is embedded inside a single opening and a single closing curly bracket.

If two $ signs are used, then a single pair of curly brackets is treated as content, and the expression needs to be embedded inside double curly brackets, as shown in the code example given below.
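A minimal sketch of both prefixes. The variables x and y and the JSON shape are assumptions chosen for illustration.

```csharp
using System;

int x = 2, y = 3;

// One $ sign: a single pair of curly brackets marks an expression.
string single = $"""Sum: {x + y}""";

// Two $ signs: single curly brackets are content, and the expression
// goes inside double curly brackets.
string json = $$"""
    {
        "sum": {{x + y}}
    }
    """;

Console.WriteLine(single);  // Sum: 5
Console.WriteLine(json);
```

The second form is convenient for JSON, because the brackets that belong to the JSON document do not need any escaping.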

The output of the above code snippet is shown below.

[Figure: String Interpolation and Raw String Literals]

Where can this feature be useful?

While writing this article, I have been thinking about where this feature can be used. In my opinion, it is useful in applications that deal heavily with very long strings, especially custom code generators that generate XML / JSON / HTML code.

For example, suppose we have a program that works as a custom source code generator: it takes some inputs (from the user or from files / a database) and then generates code. This feature may be useful there for generating indented source code, because the string may contain double quotes, angle brackets, new lines, tabs, anything.
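As a rough sketch of that idea, the snippet below generates a small C# class from two inputs. The names className and propertyName and the generated class shape are hypothetical; a real generator would read them from user input, files or a database.

```csharp
using System;

string className = "Person";   // hypothetical input
string propertyName = "Name";  // hypothetical input

// Two $ signs keep the curly brackets of the generated code as content;
// only {{...}} is treated as an interpolated expression.
string source = $$"""
    public class {{className}}
    {
        public string {{propertyName}} { get; set; } = "";
    }
    """;

Console.WriteLine(source);
```

The generated text keeps its indentation, its curly brackets and its double quotes exactly as they appear in the template, without any escape sequences.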

Perhaps, in some cases, it may also be helpful for logging formatted string statements.

I hope you find this information helpful. Do you have ideas about where this feature can be useful for you? Let me know in the comments.

This Post Has 2 Comments

  1. Rekabis

    So a raw string literal automatically removes whitespace at the beginning of the string. And it removes it from the left side, because English reads left-to-right. Cool. What about languages like Arabic, Hebrew, Persian, Pashto, Urdu, Kashmiri and Sindhi, which read right-to-left?

    1. Manoj Choudhari

      That is a very good question! I think language-specific strings are generally supposed to be part of resource files (.resx files). As I mentioned in the article, my opinion is that this feature is useful for source code generators (XML, HTML, JSON or any other language). I am not sure if these can also be written in RTL fashion. But again, this is my opinion and I may be wrong! 🙂