Common Regular Expression patternsλ︎
Common string formats used in software development and examples of regular expressions to check their correctness.
Username Regular Expression Patternλ︎
A 8 to 24 character passwords that can include any lower case character or digit (number). Only the underscore and dash special characters can be used.
Breakdown the regex pattern:
^[a-z0-9_-]{8,24}$
^ # Start of the line
[a-z0-9_-] # Match characters and symbols in the list, a-z, 0-9 , underscore , hyphen
{8,24} # Length at least 8 characters and maximum length of 24
$ # End of the line
Password Regular Expression Patternλ︎
A password should be 8 to 24 character string with at least one digit, one upper case letter, one lower case letter and one special symbol, @#$%
.
The order of the grouping formulas does not matter
Breakdown the regex pattern:
((?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%]).{8,24})
( # Start of group
(?=.*\d) # must contains one digit from 0-9
(?=.*[a-z]) # must contains one lowercase characters
(?=.*[A-Z]) # must contains one uppercase characters
(?=.*[@#$%]) # must contains one special symbols in the list "@#$%"
. # match anything with previous condition checking
{8,24} # length at least 8 characters and maximum of 24
) # End of group
?=
means apply the assertion condition, which is meaningless by itself and works in combination with others.
Hexadecimal Color Code Regular Expression Patternλ︎
The string must start with a #
symbol , follow by a letter from a
to f
, A
to Z
or a digit from 0
to 9
with a length of exactly 3 or 6.` This regular expression pattern is very useful for the Hexadecimal web colors code checking.
Breakdown the regex pattern:
^#([A-Fa-f0-9]{3}|[A-Fa-f0-9]{6})$
^ #start of the line
# # must contain a "#" symbols
( # start of group #1
[A-Fa-f0-9]{3} # any strings in the list, with length of 3
| # ..or
[A-Fa-f0-9]{6} # any strings in the list, with length of 6
) # end of group #1
$ #end of the line
Email Regular Expression Patternλ︎
The account side of an email address starts with _A-Za-z0-9-\\+
optional follow by .[_A-Za-z0-9-]
, ending with an @
symbol.
The domain starts with A-Za-z0-9-
, follow by first level domain, e.g .org
, .io
and .[A-Za-z0-9]
optionally follow by a second level domain, e.g. .ac.uk
, .com.au
or \\.[A-Za-z]{2,}
, where second level domain must start with a dot .
and length must equal or more than 2 characters.
(re-matches
#"^[_A-Za-z0-9-]+(\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\.[A-Za-z0-9]+)*(\.[A-Za-z]{2,})$"
"jenny.jenn@jetpack.com.au")
Double escaping special characters
Double escaping of special characters is not required in the Clojure syntax.
Breakdown the regex pattern:
^[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$
^ #start of the line
[_A-Za-z0-9-]+ # must start with string in the bracket [ ], must contains one or more (+)
( # start of group #1
\\.[_A-Za-z0-9-]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+)
)* # end of group #1, this group is optional (*)
@ # must contains a "@" symbol
[A-Za-z0-9]+ # follow by string in the bracket [ ], must contains one or more (+)
( # start of group #2 - first level TLD checking
\\.[A-Za-z0-9]+ # follow by a dot "." and string in the bracket [ ], must contains one or more (+)
)* # end of group #2, this group is optional (*)
( # start of group #3 - second level TLD checking
\\.[A-Za-z]{2,} # follow by a dot "." and string in the bracket [ ], with minimum length of 2
) # end of group #3
$ #end of the line
Image File name and Extension Regular Expression Patternλ︎
A file extension name is 1 or more characters without white space, follow by dot .
and string end in jpg
or png
or gif
or bmp
. The file name extension is case-insensitive.
Change the combination (jpg|png|gif|bmp)
for other file extension.
In-line modifiers indirectly supported in ClojureScript
ClojureScript is hosted on JavaScript which does not support in-line modifier flags such as (?i)
for a case insensitive pattern.
In-line flags will be converted by the ClojureScript reader if they are the first element in the literal regular expression pattern, or if the js/RegExp
function is used to create the regular expression.
Breakdown the regex pattern:
([^\s]+(\.(?i)(jpg|png|gif|bmp))$)
( #Start of the group #1
[^\s]+ # must contains one or more anything (except white space)
( # start of the group #2
\. # follow by a dot "."
(?i) # ignore the case sensitive checking
( # start of the group #3
jpg # contains characters "jpg"
| # ..or
png # contains characters "png"
| # ..or
gif # contains characters "gif"
| # ..or
bmp # contains characters "bmp"
) # end of the group #3
) # end of the group #2
$ # end of the string
) #end of the group #1
IP Address Regular Expression Patternλ︎
An IP address comprises of 4 groups of numbers between 0 and 255, with each group separated by a dot.
Example IP address are: 192.168.0.1
, 127.0.0.1
, 192.120.240.100
(re-matches
#"^([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])\.([01]?\d\d?|2[0-4]\d|25[0-5])$"
"192.168.0.1")
Breakdown the regex pattern:
^([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.
([01]?\\d\\d?|2[0-4]\\d|25[0-5])\\.([01]?\\d\\d?|2[0-4]\\d|25[0-5])$
^ #start of the line
( # start of group #1
[01]?\\d\\d? # Can be one or two digits. If three digits appear, it must start either 0 or 1
# e.g ([0-9], [0-9][0-9],[0-1][0-9][0-9])
| # ...or
2[0-4]\\d # start with 2, follow by 0-4 and end with any digit (2[0-4][0-9])
| # ...or
25[0-5] # start with 2, follow by 5 and end with 0-5 (25[0-5])
) # end of group #2
\. # follow by a dot "."
.... # repeat with 3 time (3x)
$ #end of the line
Time Format Regular Expression Patternλ︎
Time in 12-Hour Format Regular Expression Pattern. The 12-hour clock format start between 0-12, then a semi colon, :
, follow by 00-59
. The pattern ends with am
or pm
.
Breakdown the regex pattern:
(1[012]|[1-9]):[0-5][0-9](\\s)?(?i)(am|pm)
( #start of group #1
1[012] # start with 10, 11, 12
| # or
[1-9] # start with 1,2,...9
) #end of group #1
: # follow by a semi colon (:)
[0-5][0-9] # follow by 0..5 and 0..9, which means 00 to 59
(\\s)? # follow by a white space (optional)
(?i) # next checking is case insensitive
(am|pm) # follow by am or pm
Time in 24-Hour Format Regular Expression Patternλ︎
The 24-hour clock format start between 0-23 or 00-23, then a semi colon :
and follow by 00-59.
Breakdown the regex pattern:
([01]?[0-9]|2[0-3]):[0-5][0-9]
( #start of group #1
[01]?[0-9] # start with 0-9,1-9,00-09,10-19
| # or
2[0-3] # start with 20-23
) #end of group #1
: # follow by a semi colon (:)
[0-5][0-9] # follow by 0..5 and 0..9, which means 00 to 59
Date Format Patternλ︎
Date format in the form dd/mm/yyyy
. Validating a leap year and if there is 30 or 31 days in a month is not simple though.
Breakdown the regex pattern:
(0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[012])/((19|20)\\d\\d)
( #start of group #1
0?[1-9] # 01-09 or 1-9
| # ..or
[12][0-9] # 10-19 or 20-29
| # ..or
3[01] # 30, 31
) #end of group #1
/ # follow by a "/"
( # start of group #2
0?[1-9] # 01-09 or 1-9
| # ..or
1[012] # 10,11,12
) # end of group #2
/ # follow by a "/"
( # start of group #3
(19|20)\\d\\d # 19[0-9][0-9] or 20[0-9][0-9]
) # end of group #3
HTML tag Patternλ︎
HTML code uses tags to define structure of content. HTML tag, start with an opening tag “<" , follow by double quotes "string", or single quotes 'string' but does not allow one double quotes (") "string, one single quote (') 'string or a closing tag > without single or double quotes enclosed. At last , end with a closing tag “>”
Breakdown the regex pattern:
<("[^"]*"|'[^']*'|[^'">])*>
< #start with opening tag "<"
( # start of group #1
"[^"]*" # only two double quotes are allow - "string"
| # ..or
'[^']*' # only two single quotes are allow - 'string'
| # ..or
[^'">] # cant contains one single quotes, double quotes and ">"
) # end of group #1
* # 0 or more
> #end with closing tag ">"
HTML links Regular Expression Patternλ︎
HTML A tag Regular Expression Pattern
(?i)<a([^>]+)>(.+?)</a>
( #start of group #1
?i # all checking are case insensitive
) #end of group #1
<a #start with "<a"
( # start of group #2
[^>]+ # anything except (">"), at least one character
) # end of group #2
> # follow by ">"
(.+?) # match anything
</a> # end with "</a>
Extract HTML link Regular Expression Patternλ︎
\s*(?i)href\s*=\s*(\"([^"]*\")|'[^']*'|([^'">\s]+));
\s* #can start with whitespace
(?i) # all checking are case insensive
href # follow by "href" word
\s*=\s* # allows spaces on either side of the equal sign,
( # start of group #1
"([^"]*") # only two double quotes are allow - "string"
| # ..or
'[^']*' # only two single quotes are allow - 'string'
| # ..or
([^'">]+) # cant contains one single / double quotes and ">"
) # end of group #1