Regex | Sanitize Strings (KQL, PowerShell)
I recently came across some special characters in Sentinel entities that broke my KQL query, so I decided to build a whitelist solution to sanitize strings.
My goal was to replace all chars except for
- a-z/A-Z
- 0-9
- spaces and _
With a little help from ChatGPT, I created the following expression:
[^a-zA-Z0-9 _]
Let’s break it down:
- [^a-zA-Z0-9 _] - This is a character class, denoted by square brackets [], which defines a set of characters that are to be matched.
- ^ - The caret symbol at the beginning of the character class denotes negation, meaning it matches any character that is not in the set.
- a-zA-Z0-9 - This range represents all lowercase letters (a-z), uppercase letters (A-Z), and digits (0-9).
- ' ' - This space character represents a literal space.
- _ - This underscore character represents a literal underscore.
In summary, this regular expression matches any character that is not a letter (lowercase or uppercase), a digit, a space, or an underscore. It can be used to identify any special characters or symbols present in a given string.
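To see which characters the expression flags before removing anything, a minimal PowerShell sketch like the following can help. The sample value is made up for illustration, and -cmatch is used only to keep the check case-sensitive.

```powershell
# Whitelist expression from above
$strRegex = '[^a-zA-Z0-9 _]'

# Made-up sample value, not taken from real Sentinel data
$strSample = 'srv-01.contoso.com (prod)'

# True if at least one character is not on the whitelist
$hasSpecialChars = $strSample -cmatch $strRegex

# Collect each distinct offending character in order of first appearance
$offending = [regex]::Matches($strSample, $strRegex) |
    ForEach-Object { $_.Value } |
    Select-Object -Unique

$hasSpecialChars
$offending -join ' '
```

For this sample the check is true and the offending characters are the hyphen, the dot and the two parentheses; the space is on the whitelist and is not reported.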
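If you want to whitelist more characters, add them before the closing ]. Let’s add ‘.’ for FQDNs: [^a-zA-Z0-9 _.]
If you also need the hyphen, put it last ([^a-zA-Z0-9 _.-]) or escape it, otherwise it is read as part of a range.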
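As a quick illustration of an extended whitelist, the sketch below allows both the dot and the hyphen, since hostnames often contain hyphens. The hyphen is placed last in the class so it is taken literally instead of starting a range. The variable names and the sample value are assumptions for the example.

```powershell
# Extended whitelist: '.' and '-' added; the hyphen sits last so it is taken literally
$strRegexFqdn = '[^a-zA-Z0-9 _.-]'

# Made-up sample value, not taken from real Sentinel data
$strEntity = 'srv-01.contoso.com/path?x=1'

# -replace without a replacement value removes every match
$strClean = $strEntity -replace $strRegexFqdn
$strClean
```

The slash, question mark and equals sign are removed, while the dots and the hyphen in the hostname survive.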
I let ChatGPT create the following test string:
!"#$%&'()*+,-./0123456789:;<=>?@AB CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸"
Replace special chars in KQL
let strRegex = @'[^a-zA-Z0-9 _]';
let strTest = '!"#$%&\'()*+,-./0123456789:;<=>?@AB CDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸"';
// Replace every character that is not on the whitelist with an empty string
let strSanitized = replace_regex(strTest, strRegex, @'');
print strSanitized
Output:
0123456789AB CDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
Replace special chars in PowerShell
$strRegex = '[^a-zA-Z0-9 _]'
$strTest = '!"#$%&''()*+,-./0123456789:;<=>?@AB CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¡¢£¤¥¦§¨©ª«¬®¯°±²³´µ¶·¸"'
# -replace without a replacement value removes every match
$strTest -replace $strRegex
Output:
0123456789AB CDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz
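If the same sanitization is needed in several places, the -replace call could be wrapped in a small function. This is only a sketch: ConvertTo-SanitizedString is a hypothetical helper name, not a built-in cmdlet, and the default pattern is the whitelist expression from this post.

```powershell
function ConvertTo-SanitizedString {
    # Hypothetical helper name; not a built-in cmdlet
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [AllowEmptyString()]
        [string]$InputString,

        # Negated character class describing everything that should be removed
        [string]$Pattern = '[^a-zA-Z0-9 _]'
    )
    process {
        # Runs once per pipeline item and emits the cleaned string
        $InputString -replace $Pattern
    }
}

# Made-up sample values
'User: j.doe@contoso.com', 'Host_01 (prod)' | ConvertTo-SanitizedString
```

Passing a different -Pattern, for example the FQDN variant with the dot added, changes the whitelist without touching the function body.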