Saturday, March 28, 2009

English Contractions and Expansions

I have searched online for a list like this but have never been able to find one, so I thought I had better publish my own. These are the common contractions of the English language, along with their expanded forms. Where a contraction is ambiguous, both possible expansions are listed. AIML chat bots expand all contractions using a list like this as a pre-processing step. Perhaps it will be of use to someone besides us botmasters.

("AREN'T" "ARE NOT")
("CAN'T" "CAN NOT")
("CANNOT" "CAN NOT")
("COULD'VE" "COULD HAVE")
("COULDN'T" "COULD NOT")
("DIDN'T" "DID NOT")
("DOESN'T" "DOES NOT")
("DON'T" "DO NOT")
("EVERYTHING'S" "EVERYTHING IS")
("HADN'T" "HAD NOT")
("HASN'T" "HAS NOT")
("HAVEN'T" "HAVE NOT")
("HE S" "HE IS")
("HE'D" ("HE HAD" "HE WOULD"))
("HE'LL" "HE WILL")
("HE'S" "HE IS")
("HOW'D" ("HOW HAD" "HOW WOULD"))
("HOW'S" "HOW IS")
("I'D" ("I HAD" "I WOULD"))
("I'LL" "I WILL")
("I'M" "I AM")
("I'VE" "I HAVE")
("ISN'T" "IS NOT")
("IT S" "IT IS")
("IT'D" ("IT HAD" "IT WOULD"))
("IT'LL" "IT WILL")
("IT'S" "IT IS")
("LET S" "LET US")
("LET'S" "LET US")
("MIGHT'VE" "MIGHT HAVE")
("SHE'LL" "SHE WILL")
("SHE'S" "SHE IS")
("SHOULD'VE" "SHOULD HAVE")
("SHOULDN'T" "SHOULD NOT")
("THAT S" "THAT IS")
("THAT'D" ("THAT HAD" "THAT WOULD"))
("THAT'LL" "THAT WILL")
("THAT'S" "THAT IS")
("THERE S" "THERE IS")
("THERE'LL" "THERE WILL")
("THERE'S" "THERE IS")
("THEY'D" ("THEY HAD" "THEY WOULD"))
("THEY'LL" "THEY WILL")
("THEY'RE" "THEY ARE")
("THEY'VE" "THEY HAVE")
("THIS'LL" "THIS WILL")
("WASN'T" "WAS NOT")
("WE'D" ("WE HAD" "WE WOULD"))
("WE'LL" "WE WILL")
("WE'RE" "WE ARE")
("WE'VE" "WE HAVE")
("WEREN'T" "WERE NOT")
("WHAT'D" ("WHAT HAD" "WHAT DID"))
("WHAT'LL" "WHAT WILL")
("WHAT'S" "WHAT IS")
("WHERE S" "WHERE IS")
("WHERE'S" "WHERE IS")
("WHO'S" "WHO IS")
("WHY'S" "WHY IS")
("WON'T" "WILL NOT")
("WOULD'VE" "WOULD HAVE")
("WOULDN'T" "WOULD NOT")
("YOU'D" ("YOU HAD" "YOU WOULD"))
("YOU'LL" "YOU WILL")
("YOU'RE" "YOU ARE")
("YOU'VE" "YOU HAVE")
("'TIS" "IT IS")
("'EM" "THEM")
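
For anyone who wants to try the list outside an AIML interpreter, here is a minimal sketch in Python of the pre-processing step described above. It is not how any particular AIML engine does it; the names `CONTRACTIONS` and `expand_contractions` are made up for illustration, the table holds only a handful of the entries from the list, and for ambiguous contractions it simply takes the first expansion.

```python
import re

# A small excerpt of the list above (hypothetical name). Every value is a
# list so that ambiguous contractions can keep all of their candidates.
CONTRACTIONS = {
    "AREN'T": ["ARE NOT"],
    "CAN'T": ["CAN NOT"],
    "I'D": ["I HAD", "I WOULD"],
    "I'M": ["I AM"],
    "IT'S": ["IT IS"],
    "WON'T": ["WILL NOT"],
    "'EM": ["THEM"],
}

# A token is a run of uppercase letters and apostrophes.
TOKEN = re.compile(r"[A-Z']+")


def expand_contractions(text, table=CONTRACTIONS):
    """Uppercase the input, then swap each known contraction for its
    first listed expansion. Unknown tokens are left unchanged."""
    def replace(match):
        token = match.group(0)
        expansions = table.get(token)
        return expansions[0] if expansions else token

    return TOKEN.sub(replace, text.upper())


print(expand_contractions("I'm sure it's fine, but I can't go"))
# prints: I AM SURE IT IS FINE, BUT I CAN NOT GO
```

Taking the first expansion is the naive part: "I'd go" comes out as "I HAD GO" rather than "I WOULD GO". Picking the right candidate needs some look at the following word, which this sketch does not attempt.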

