"Playing Regex Golf with Genetic Programming" has been accepted at ACM GECCO, the top conference on Evolutionary Computation (we were also there in 2012 and 2013!).
What is Regex Golf? It is an unstructured and informal programming competition that consists of writing the shortest possible regular expression that solves a given problem. Specifically, the regex must match all the strings in one given list and must not match any string in another given list.
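To make the rules concrete, here is a minimal Python sketch of how a candidate regex could be checked against a problem instance. The function name `solves`, the two toy lists, and the use of unanchored `re.search` semantics are illustrative assumptions, not taken from the paper or from any particular challenge.

```python
import re

def solves(pattern, must_match, must_not_match):
    """True if `pattern` matches every string in must_match and none in must_not_match."""
    rx = re.compile(pattern)
    return (all(rx.search(s) for s in must_match)
            and not any(rx.search(s) for s in must_not_match))

# Hypothetical toy instance, made up for illustration.
must_match = ["foo", "food", "fool"]
must_not_match = ["bar", "baz", "of"]

print(solves("fo", must_match, must_not_match))  # True: every wanted string contains "fo"
print(solves("o", must_match, must_not_match))   # False: "o" also matches "of"
```

Among the regexes that pass this check, the golf score is simply the length of the pattern, so shorter is better.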
We have developed a player that is internally based on Genetic Programming. It takes a problem description (two lists as above) and generates a regex based on an evolutionary search in the space of all possible solutions.
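As a rough illustration of what an evolutionary search over regexes can look like, the sketch below evolves a population of candidates by mutation and selection. It is a deliberately simplified toy, not the tree-based Genetic Programming player described in the paper: candidates are just alternations of literal substrings, there is no crossover, and every name, parameter, and the fitness formula are assumptions made for this example. It also assumes that the strings to be matched are non-empty.

```python
import random
import re

def to_regex(parts):
    """Join literal fragments into one alternation, e.g. ['fo', 'ba'] -> 'fo|ba'."""
    return "|".join(re.escape(p) for p in parts)

def fitness(parts, must_match, must_not_match):
    """Correctly classified strings, minus a small penalty per regex character."""
    pattern = to_regex(parts)
    rx = re.compile(pattern)
    correct = sum(1 for s in must_match if rx.search(s))
    correct += sum(1 for s in must_not_match if not rx.search(s))
    return correct - 0.01 * len(pattern)

def random_part(must_match, rng):
    """Pick a random non-empty substring of a string that must be matched."""
    s = rng.choice(must_match)
    i = rng.randrange(len(s))
    j = rng.randrange(i + 1, len(s) + 1)
    return s[i:j]

def mutate(parts, must_match, rng):
    """Return a copy with one alternative dropped, added, or replaced."""
    child = list(parts)
    roll = rng.random()
    if roll < 0.3 and len(child) > 1:
        child.pop(rng.randrange(len(child)))
    elif roll < 0.65:
        child.append(random_part(must_match, rng))
    else:
        child[rng.randrange(len(child))] = random_part(must_match, rng)
    return child

def evolve(must_match, must_not_match, generations=200, pop_size=30, seed=0):
    """Toy evolutionary loop: keep the better half, refill with mutated copies."""
    rng = random.Random(seed)

    def score(parts):
        return fitness(parts, must_match, must_not_match)

    population = [[random_part(must_match, rng)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        survivors = population[: pop_size // 2]  # the best candidates always survive
        children = [mutate(rng.choice(survivors), must_match, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return to_regex(max(population, key=score))

# Hypothetical toy instance, made up for illustration.
must_match = ["foo", "food", "fool"]
must_not_match = ["bar", "baz", "of"]
best = evolve(must_match, must_not_match)
print(best)
```

The length penalty in `fitness` is what pushes the search toward short regexes, mirroring the golf objective; a real player would search a far richer space (character classes, quantifiers, anchors) than literal alternations.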
We considered a popular regex golf challenge proposed recently and compared the performance of our player with the best results produced by humans and with the only existing algorithm for playing automatically, an algorithm developed by Peter Norvig, Director of Google Research (!).
We rank in the worldwide top ten, in 6th and 7th place, beaten only by a few humans.
We are very proud of this result.
In the coming weeks we will make our player publicly available on the web, much like our webapp that automatically generates regexes for text extraction from examples (a problem that differs from regex golf for several reasons, which we discuss in detail in the paper).