I am trying to split a string on all occurrences of ' AND ' except
where it appears within quotes. I have a regular expression which
works but generates strange output in a certain case:
"text_search '(large red spear OR axe) AND wood' AND material 1".split(
  /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/ )
gives:
["", "text_search '(large red spear OR axe) AND wood'", "material 1"]
I don’t understand why my regular expression is producing the blank
entry at the beginning of the array. Can anyone lend some insight?
Thanks,
Ryan Wallace
Ryan Wallace wrote:
> "text_search '(large red spear OR axe) AND wood' AND material 1".split(
>   /(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/ )
> gives:
> ["", "text_search '(large red spear OR axe) AND wood'", "material 1"]
> I don't understand why my regular expression is producing the blank
> entry at the beginning of the array. Can anyone lend some insight?
"text_search '(large red spear OR axe) AND wood'" is what the capture
group in your regex matches. "" is what comes before the match and
"material 1" is what comes after the match. If a split regex matches at
the very beginning of a string, the first item in the returned array
will be "". Compare:
"a1b".split(/1/)
=> ["a", "b"]
"1b".split(/1/)
=> ["", "b"]
"1b".split(/(1)/)
=> ["", "1", "b"]
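If the empty leading entry is simply unwanted, one possible workaround (a minimal sketch, not the only way to do it) is to drop empty strings after splitting. The variable names here are only for illustration:

```ruby
# The string and regex are the ones from the original question.
str = "text_search '(large red spear OR axe) AND wood' AND material 1"
parts = str.split(/(?: AND )?(\S+ '.+')(?: AND )?|(?: AND )/)

# The regex matches at index 0, so split emits "" as the text before
# the first match. Rejecting empty strings removes that entry.
cleaned = parts.reject(&:empty?)
# cleaned => ["text_search '(large red spear OR axe) AND wood'", "material 1"]
```

Note that this also removes any empty strings that appear between two adjacent matches, which may or may not be what you want.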
I also notice that you have a greedy quantifier (.+) between the two '
characters in the regex. This will likely cause unwanted results when
you have more than one pair of '' in your string, because .+ will run
from the first ' to the last ' in the string.
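To illustrate the greedy-quantifier point, here is a small sketch using a made-up input with two quoted sections (the string and variable names are hypothetical, not from the original post). The non-greedy form .+? stops at the first closing quote instead of the last:

```ruby
# Hypothetical input containing two separate quoted sections.
str = "a 'x AND y' AND b 'z'"

# Greedy: .+ extends to the LAST ' in the string.
greedy_match = str[/(\S+ '.+')/, 1]
# greedy_match => "a 'x AND y' AND b 'z'"

# Non-greedy: .+? stops at the FIRST closing ' it can.
lazy_match = str[/(\S+ '.+?')/, 1]
# lazy_match => "a 'x AND y'"
```

Whether .+? (or a negated class such as [^']+) is the right fix depends on whether quotes can be nested or escaped in your input, which the original post does not say.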
HTH,
Sebastian