I would like to compare two human-language, user-supplied texts. Call them Text A and Text B. Each is the length of a fat novel, i.e., around 150,000 words.
Is there a tool that can help me find all exact phrase matches, anywhere in the two texts, that are $n$ words long? (For example: Text A starts with "It was the Best of Times, it was the worst of Times"; Text B has the phrase "worst of Times" buried somewhere. That should come up as a hit with the specification $n=3$, or $2 \le n \le 6$.)
Hey, thanks in advance to anyone who can give me pointers! I'm totally new to this, so sorry if this is elementary.
TextStructure, using the optional second argument "ConstituentStrings", to make a list of the constituent phrases in each text. Then I'd sort them, compare the two sorted lists. Then I'd do some serious thinking ... – High Performance Mark Oct 18 '22 at 10:31

StringCases like StringCases["hat cat dog hat cat", "hat cat"] – userrandrand Nov 29 '22 at 01:02
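
The "list the phrases in each text, then compare the lists" idea from the comments can be sketched roughly as follows. This is only a minimal sketch, not a tested solution for full-length novels: it swaps in TextWords and Partition to build overlapping $n$-word windows instead of TextStructure, and the helper names ngrams and sharedPhrases are hypothetical, made up for this illustration. It assumes that matching is case-sensitive and that punctuation should be ignored (TextWords drops it).

```wolfram
(* Hypothetical helper: every distinct run of n consecutive words in a text. *)
(* TextWords splits the string into words and discards punctuation;          *)
(* Partition[..., n, 1] slides a window of n words forward one word at a time. *)
ngrams[text_String, n_Integer?Positive] :=
  Union[Partition[TextWords[text], n, 1]]

(* Hypothetical helper: the n-word phrases that occur in both texts. *)
(* Intersection compares the word lists exactly (case-sensitive);    *)
(* StringRiffle joins each matching word list back into one phrase.  *)
sharedPhrases[a_String, b_String, n_Integer?Positive] :=
  StringRiffle /@ Intersection[ngrams[a, n], ngrams[b, n]]

(* Toy stand-ins for Text A and Text B *)
textA = "It was the Best of Times, it was the worst of Times";
textB = "Surely the worst of Times is behind us";

(* Hits for a single phrase length, n = 3 *)
sharedPhrases[textA, textB, 3]

(* Hits for a range of lengths, 2 <= n <= 6, keyed by n *)
AssociationMap[sharedPhrases[textA, textB, #] &, Range[2, 6]]
```

For the real texts, textA and textB would be loaded with something like Import["file.txt", "Text"], where the file name is a placeholder. If matches should ignore capitalization, wrapping each text in ToLowerCase before calling ngrams is one option.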