Is there an algorithm which takes a set of strings and returns the shortest possible regular expression which generates those strings and only those strings? More generally, is there an algorithm which takes a regular expression and returns the shortest possible equivalent regular expression? If so, what’s its complexity?
For example, in the first case, we might input the strings “Händel”, “Haendel”, and “Handel”, and the algorithm would return the regular expression “H(ä|ae?)ndel” or “H(ae?|ä)ndel”. In the second case, we might input the regular expression “H(a|ae|ä)ndel” and get the same result. (The first example is equivalent to passing “Händel|Haendel|Handel” in the second case.)