Here are some working patterns that I have written, ordered from least efficient to most efficient in terms of “step count”.
Step counts are based on this sample input:
(green:bar AND black:foo)
(blue:bar AND darkblue:foo)
(yellow:bar AND grey:foo)
(greengarden:bar AND red:foo)
- /(?:red|blue|green)(*SKIP)(*FAIL)|[a-z]+(?=:)/ Demo (513 steps)
- /\b(?!red:|blue:|green:)[a-z]+(?=:)/ Demo (372 steps)
- /(?<=\(|AND )(?!red:|blue:|green:)[^:]*/ Demo (319 steps)
- /(?<=\(|AND )(?:(?:red:|blue:|green:)(*SKIP)(*FAIL)|[^:]+)/ Demo (304 steps)
- /(?:\(|AND )\K(?!red:|blue:|green:)[^:]*/ Demo (291 steps)
- /(?:\(|AND )\K(?!red\b|blue\b|green\b)[^:]+/ Demo (291 steps)
- /[( ]\K(?!red\b|blue\b|green\b)[a-z]+/ Demo (172 steps)
This final pattern is the best performer because it takes full advantage of the strict format of your input data: the capitalization, the opening parentheses, and the two spaces per line.
It matches an opening parenthesis or a space, restarts the fullstring match using \K, disqualifies any substring that is wholly red, blue, or green, and then matches lowercase letters, stopping at the last one before the colon.
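To see the effect of \K in isolation, here is a minimal sketch that runs the same prefix with and without it. The one-line input and the variable names are illustrative assumptions, not part of the original question.

```php
<?php
// Illustrative input: one line in the same format as the sample data.
$line = '(yellow:bar AND grey:foo)';

// Without \K, the leading "(" or space stays in the fullstring match.
preg_match_all('/[( ][a-z]+/', $line, $withPrefix);

// With \K, the match start is reset, so only the letters are reported.
preg_match_all('/[( ]\K[a-z]+/', $line, $withoutPrefix);

var_export($withPrefix[0]);    // array ( 0 => '(yellow', 1 => ' grey', )
echo "\n";
var_export($withoutPrefix[0]); // array ( 0 => 'yellow', 1 => 'grey', )
```

Note that bar and foo are never matched in either case, because they follow a colon rather than a parenthesis or a space.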
In regex, speed gains generally come from using narrow character classes ([] and [^]) and from limiting alternatives (pipes |), “lookarounds”, and capture/non-capture groups.
My patterns deliberately avoid capture groups because they only lead to output array bloat. All of your desired “color” substrings will be found in the fullstring matches subarray ([0]) that preg_match_all() populates.
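As a rough illustration of that bloat, the sketch below compares a hypothetical capture-group variant of the final pattern with the \K version. The capture-group pattern and the variable names are my own assumptions, written only for this comparison.

```php
<?php
$string = '(green:bar AND black:foo)';

// Hypothetical capture-group variant: the color lands in group 1,
// while the fullstring match still carries the leading space.
preg_match_all('/[( ]((?!red\b|blue\b|green\b)[a-z]+)/', $string, $captured);

// \K variant: no group, so the output array has only the [0] subarray.
preg_match_all('/[( ]\K(?!red\b|blue\b|green\b)[a-z]+/', $string, $plain);

var_export(count($captured)); // 2
echo "\n";
var_export($captured[1]);     // array ( 0 => 'black', )
echo "\n";
var_export(count($plain));    // 1
echo "\n";
var_export($plain[0]);        // array ( 0 => 'black', )
```

Both versions find the same color, but the capture-group version carries an extra subarray that you would have to skip over.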
Code: (Demo)
$string = '(green:bar AND black:foo)
(blue:bar AND darkblue:foo)
(yellow:bar AND grey:foo)
(greengarden:bar AND red:foo)';
// preg_match_all() returns the number of matches; $out[0] holds the fullstring matches.
var_export(preg_match_all('/[( ]\K(?!red\b|blue\b|green\b)[a-z]+/', $string, $out) ? $out[0] : 'fail');
Output:
array (
  0 => 'black',
  1 => 'darkblue',
  2 => 'yellow',
  3 => 'grey',
  4 => 'greengarden',
)