Text-to-speech programs do okay on words they know, but longer words that aren't in their 'dictionary' have to be sounded out phonetically, which seems to be a really hit-or-miss operation. I wonder if one could hook up text-to-speech software to a polygraph sensor to monitor the listener's reaction to the words being read.
I know I cringe when I hear something mispronounced, and surely something in my mental wince is externally measurable. If the software detected a negative reaction to the way it pronounced a word, it could try an alternate pronunciation the next time. Granted, it would be a highly iterative process, requiring many listeners for each text sample so that the most favorable response for each word can be found, but how many people listened to Harry Potter as a book on tape?
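Here is a rough sketch of what that feedback loop might look like, just to make the idea concrete. Everything in it is made up for illustration: the `PronunciationPicker` class, the candidate spellings, and especially the "wince" score, which assumes the sensor can somehow be boiled down to one number per spoken word (0 for no reaction, 1 for a full cringe). No real speech engine or polygraph is involved.

```python
from collections import defaultdict


class PronunciationPicker:
    """Hypothetical tracker of listener reactions per candidate pronunciation."""

    def __init__(self, candidates):
        # candidates: dict mapping a word to its list of candidate pronunciations
        self.candidates = candidates
        self.wince_total = defaultdict(float)  # summed wince scores per (word, pronunciation)
        self.plays = defaultdict(int)          # number of times each pair was spoken

    def choose(self, word):
        options = self.candidates[word]
        # Give every candidate at least one hearing before judging any of them.
        for p in options:
            if self.plays[(word, p)] == 0:
                return p
        # After that, speak the candidate with the lowest average wince so far.
        return min(
            options,
            key=lambda p: self.wince_total[(word, p)] / self.plays[(word, p)],
        )

    def record(self, word, pronunciation, wince):
        # wince: assumed sensor reading, 0.0 (no reaction) to 1.0 (full cringe)
        self.plays[(word, pronunciation)] += 1
        self.wince_total[(word, pronunciation)] += wince


# Two listeners in a row, with invented sensor readings.
picker = PronunciationPicker({"Hermione": ["HER-mee-own", "her-MY-oh-nee"]})
for wince in (0.9, 0.1):
    spoken = picker.choose("Hermione")
    picker.record("Hermione", spoken, wince)

print(picker.choose("Hermione"))  # prints: her-MY-oh-nee
```

This version is purely greedy once each candidate has been heard, so one twitchy listener could lock in a bad choice. With many listeners per text you would probably want it to keep retrying the losing candidates now and then.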