Sentence Stack is a linguistic search engine, pointed out to me by a reader of an earlier post, Searching with Ludwig, where I reviewed the features offered by Ludwig Guru. Indeed, Sentence Stack advertises itself as a free alternative to Ludwig. In this post I will try to gauge how well Sentence Stack stacks up in comparison.
The Sentence Stack homepage lists the following features:
(1) Search for a phrase or idiom in context
(2) Translate phrases to English (automatic language detection) with examples in context.
(3) Look up definition and synonyms for a single word
(4) Compare frequency of phrases
(5) Fill in the blank * search
(7) Compare frequency of single words
(9) Check grammar, spelling, and punctuation of a sentence
(10) Check proper article usage (i.e. indefinite and definite articles)
The numbering above follows the list given in Searching with Ludwig for items (1) – (5) and (7). Items (6) and (8) are missing because they are features offered by Ludwig but not by Sentence Stack (paraphrasing a sentence by substituting a synonym, and permuting a set of words to see which order is the most common); items (9) and (10) above are not offered by Ludwig. For the common features, I will test the same examples as in Searching with Ludwig to help compare the two resources.
(1) Searching for “up to here” in the sense of “had enough”: the phrase is not recognized as an idiom (no definition is given for it), and some 30 exact matches for this sequence of words in the corpus of texts are listed, each shown with the full sentence in which it appears. Next to each sentence are three icons: Listen, for hearing the sentence read aloud (the accent varies; at first I thought it was chosen according to the source, for example an American voice reading from the Washington Post and a British English voice reading from the Guardian, but this is not consistently the case); Visit Source, giving a link to the online source itself (which helps give wider context to the given sentence); and Show Context, a feature not yet implemented at the time of writing, which will presumably give some of the sentences surrounding the given one, obviating the need to follow the link to the original source.
As well as the sought-after idiom “up to here” (I think we’re about up to here with remote learning in most places in the country), sentences are given in which this word combination bears a different meaning. For example, the sentence That’s because it’s an important feature of what Trump is really up to here combines “to be up to” (be doing something) and “here”. In other words, the examples give all sentences with the given combination of words, with no division according to meaning. (This remark also applies to Ludwig.) Around a third of the sentences feature the “have it up to here” idiom; the remaining meanings include “be up to” as in to be doing something (Well, there can be no reconciliation, and that, of course, is what Democrats are up to here) and “up to here” as in up to this point (I think that means all points up to here have been covered).
An intriguing set of five “related searches” are suggested: misalliance, pick out, great finding, addressing the situation, certain time.
Making the more restrictive search “have it up to here” gives 20 sentences, which are not a subset of those obtained with the search “up to here”, and 12 of which give the meaning of the idiom – the example “We have to clean it up here a little bit better” (read with an English accent from the Chicago Tribune) achieves a near hit of “have it up to here” by melding phrasal verbs “have to” and “clean up”. Another intriguing set of five “related searches” is offered: ever-changing, throw away, biggest friend, lower priority, he says that.
(2) Upon typing /translate hygge into the search box, the Danish word is translated into English as cosiness, and 7 English sentences offered featuring cosiness or cosy (not all of which have a meaning approximate to the Danish hygge, however, e.g. He could probably see the sense in a cosy league for Europe’s aristocrats). Now that hygge has been embraced as an English word, typing hygge in the search box yields the definition of cosiness or conviviality, cf. Lexico/Google’s a quality of cosiness and comfortable conviviality that engenders a feeling of contentment or well-being (regarded as a defining characteristic of Danish culture), and four sentences featuring the word, all from Wikipedia.
(3) Entering fun into the search box produces a definition of fun as verb, noun and adjective – the same definition as given by “define fun” on Google (which draws on Oxford Languages/Lexico for its definitions), only without the markers of informality and North American that Lexico gives. This loses important information for the user, who may be surprised to see the verb form “to fun” as in to joke or tease (an informal Americanism). The same complaint applies to Ludwig (see item (3) of Searching with Ludwig), although Ludwig does give a dictionary definition different to that which can simply be obtained from Google Search.
(4) Under item (4) of Searching with Ludwig I discuss the question of whether to hyphenate such adverb plus participle phrases as “well known” and “well defined”. Entering well known /vs well-known in the search box, the frequency ratio is 47% to 53%, a list of 15 sentences for each variant is given, and from these examples the “rule” does appear to be that the hyphen is used when the phrase is a pre-modifier and omitted when it is a post-modifier. The search well defined /vs well-defined is not so successful: a ratio of 30% to 70% is reported for frequencies in the corpus, but the examples for “well-defined” (just 5 of them) all in fact lack the hyphen.
The other frequency comparison I tested on Ludwig was of high interest /vs of great interest. Sentence Stack gives 21% for the former and 79% for the latter, and lists 15 sentences for each. Five of the sentences for the former contain “of high interest rates”, which raises the question of whether the statistic of 21% includes these examples not relevant to my particular query. Indeed, entering to be of high interest /vs to be of great interest in order to eliminate these irrelevant examples produces the statistic of 8% vs 92%, very close to Ludwig’s 9% vs 91%. But even here, sentences such as People no longer have to be bribed with a high interest rate to save rather than consume sneak into the examples for “to be of high interest” (of which there are only 7, and 3 involve high interest rates rather than compelling attention). Occasionally, then, it would be helpful to force the search to limit itself to contiguous occurrences of the search words forming a phrase (ideally with provision for grammatical variation in the words, e.g. “is” for “to be”).
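The kind of strict matching I have in mind can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Sentence Stack works: the toy corpus, the hand-written list BE_FORMS and the function names are all hypothetical, and a real tool would use a lemmatizer rather than a fixed list of inflections.

```python
import re

# Hypothetical, hand-picked inflections of "to be" (an assumption for this sketch).
BE_FORMS = ["to be", "be", "is", "are", "was", "were", "been", "being"]

def phrase_pattern(phrase):
    """Compile a regex that matches the phrase only as a contiguous word sequence.

    A leading "to be" is widened to any of the forms in BE_FORMS.
    """
    words = phrase.lower().split()
    parts = []
    if words[:2] == ["to", "be"]:
        alternatives = "|".join(r"\s+".join(form.split()) for form in BE_FORMS)
        parts.append("(?:" + alternatives + ")")
        words = words[2:]
    parts.extend(re.escape(w) for w in words)
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)

def count_matches(phrase, sentences):
    """Count the sentences containing the phrase contiguously."""
    pattern = phrase_pattern(phrase)
    return sum(1 for s in sentences if pattern.search(s))

def shares(phrases, sentences):
    """Percentage share of each phrase among all matches, rounded."""
    counts = {p: count_matches(p, sentences) for p in phrases}
    total = sum(counts.values())
    return {p: (round(100 * c / total) if total else 0) for p, c in counts.items()}

# Toy corpus: the first sentence is the false hit quoted above, the rest are invented.
toy_corpus = [
    "People no longer have to be bribed with a high interest rate to save rather than consume",
    "The results are of high interest to linguists.",
    "This finding is of great interest to teachers.",
    "It was of great interest to everyone.",
    "The report will be of great interest.",
]

print(shares(["to be of high interest", "to be of great interest"], toy_corpus))
# {'to be of high interest': 25, 'to be of great interest': 75}
```

With contiguity enforced, the “bribed with a high interest rate” sentence no longer counts as a hit. The sketch does nothing about “of high interest rates”, where the words are contiguous but the sense is different; that would need a further rule or a human eye.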
(5) Entering “come * with”, in which * could for instance be “up” (“come up with” = produce something, especially when pressured or challenged), “down” (“come down with” = to begin to suffer from [an illness]), “out” (“come out with” = say something in a sudden, rude, or incautious way) or “back” (“come back with” = to make a reply or response), apportions a frequency of 72% to come up with, 7% to come back with, 4% to come out with and 18% to other words (including come through with and come in with, each represented by 1 example sentence out of the 5 sentences offered). All the example sentences have the exact word “come” (not e.g. comes or coming). Searching for “comes * with” apportions a frequency of 42% to comes up with, 7% to comes back with, 6% to comes in with, and 45% to other words: 2 of the 5 example sentences feature comes up with, 1 features comes along with, and the other two feature not phrasal verbs but the collocations comes standard with and comes equipped with. Searching for “coming * with” apportions a frequency of 56% to coming up with, 8% to coming out with, 7% to coming in with, and 30% to other words: 4 of the 5 example sentences feature coming up with, the remaining 1 featuring coming from with. It would be helpful to have an example for each of the more frequent combinations (here coming in with has no example to bring out its meaning).
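To make that last wish concrete, here is a rough Python sketch of a fill-in-the-blank search that reports, for each of the most frequent fillers, its share and one example sentence. It is my own illustration on an invented toy corpus, not a description of Sentence Stack’s implementation; the function name summarize and the sentences are hypothetical.

```python
import re
from collections import Counter

def summarize(query, sentences, top=3):
    """For a query such as "coming * with", return (filler, share %, example)
    for the `top` most frequent single words filling the *."""
    parts = [r"(\w+)" if tok == "*" else re.escape(tok) for tok in query.lower().split()]
    pattern = re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)
    counts = Counter()
    examples = {}
    for sentence in sentences:
        match = pattern.search(sentence)
        if match:
            filler = match.group(1).lower()
            counts[filler] += 1
            examples.setdefault(filler, sentence)  # keep the first example seen
    total = sum(counts.values())
    return [(f, round(100 * n / total), examples[f]) for f, n in counts.most_common(top)]

# Invented toy corpus (an assumption for this sketch).
toy_sentences = [
    "They are coming up with a new plan.",
    "She keeps coming up with excuses.",
    "He is coming in with the groceries.",
    "We are coming up with nothing.",
    "The paper is coming out with a special issue.",
]

for filler, share, example in summarize("coming * with", toy_sentences):
    print(f"coming {filler} with: {share}% e.g. {example}")
```

Like the search described above, the sketch matches the exact word form only: summarize("come * with", toy_sentences) returns an empty list for this corpus, since none of the sentences contains “come” itself.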
(7) Comparing the frequency of words is a special case of item (4), which allows phrases, but is a convenient way to compare three or more words at once. In Searching with Ludwig, item (7), I discuss the distinctions among the three variants different to/than/from. Searching for “different [to than]” in the corpus used by Sentence Stack, the proportion is roughly 2:1 in favour of “different than” (compared to Ludwig’s proportion of 3:1). For each variant, 15 example sentences are given. Throwing “from” into the mix by searching for “different [from to than]” shows “different from” to be predominant, appearing almost 4 times as often as the other two put together. What conclusions you can reach from such statistics is unclear: reading Merriam-Webster’s discussion may shed some light (albeit low candlepower) on the matter.
(9) I tried /check I go to market, and bought fresh fishes and was informed there were no errors of grammar or spelling. I went to the market and bought fresh fish would perhaps be more standard. Over twenty sentences are given that contain some combination of words in common with my input sentence (They went to the market and bought fish every day as the boats unloaded being the closest, The developer in question didn’t go to Apple with an idea, he went with a marketable product among the most remote).
(10) Checking whether an indefinite or definite article should be included is a feature especially useful for speakers whose native language dispenses with the need for them altogether (such as Czech). Entering go to [the] market reveals that the article is omitted over twice as many times as it is included: inspecting the 15 example sentences given for either option may allow the reader to glean the “rule” for when it should be used (when context allows a certain market to be pinpointed as an entity), and also led me via the sentence Together we’ve pushed boundaries on form factors, materials, packaging and go to market strategies (from Forbes) to discover what is meant by go to market strategies. This is one virtue of a resource such as Sentence Stack and Ludwig Guru, that a word or phrase you come across in one context has its scope widened out by seeing it in use in sentences gathered from elsewhere. Trying go to [the] hell reports that the article is absent 100% of the time although one example sentence It’s worth it not to go back to the hell I was in for those two years (from Wikipedia) uses the article (correctly, as the hell refers to an identifiable period of subjective experience in the past).
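The square-bracket queries used in items (7) and (10) can be read as shorthand for a set of plain phrases to be compared. The Python sketch below expands them the way I understand the syntax from my own searches: several words in brackets are alternatives, and a single word in brackets is optional. That reading, and the function name expand, are my assumptions rather than anything documented by Sentence Stack.

```python
import re

def expand(query):
    """Expand one bracketed group in a query into the plain phrases it stands for.

    Assumed reading of the syntax:
      "different [from to than]" -> one variant per listed word
      "go to [the] market"       -> one variant without and one with the word
    """
    match = re.search(r"\[([^\]]*)\]", query)
    if match is None:
        return [query]
    options = match.group(1).split()
    if len(options) == 1:
        options = ["", options[0]]  # a single bracketed word is treated as optional
    variants = []
    for option in options:
        text = query[:match.start()] + option + query[match.end():]
        variants.append(" ".join(text.split()))  # collapse any doubled spaces
    return variants

print(expand("go to [the] market"))        # ['go to market', 'go to the market']
print(expand("different [from to than]"))  # ['different from', 'different to', 'different than']
```

Each variant can then be counted in the corpus separately and the counts turned into percentages, which is presumably where figures such as “absent 100% of the time” come from.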
In conclusion, Sentence Stack appears to deliver functionality comparable to the free version of Ludwig Guru, and without the limit on the number of queries you can make in a day. It appears to draw on a wider range of sources than Ludwig, which is both positive (a larger sample of current English) and negative (some sources may not be carefully edited for English), although Wikipedia seems to be a source heavily drawn on. Sentence Stack promotes itself as a Linguistic Search Engine and not just a producer of sentence examples containing a given set of words, by providing “advanced features” such as producing examples close to but not exactly the same as a given phrase (useful for correcting an imprecisely formulated phrase); however, this consists only in removing words from the given phrase and/or allowing intermediate words: grammatical changes such as changing tense are not recognized – see (5) above. But given this is a free resource, multiple searches can make up for this drawback. Comparative frequencies of variants and “fill in the blank” are other features that distinguish the search engine from a simple corpus look-up search. Plus there is access to a dictionary (which appears to draw on the same source as Google search, i.e. Lexico), spelling and grammar check (but see (9) above), thesaurus, audio for checking pronunciation (various accents are offered, loosely associated with the origin of the sentence – see (1) above), and translation.
I never worked out how the “Related Searches” offered after each search were in fact related to the search I had made. As well as exercising one’s powers of association in this way, they do at least provide an instrument for serendipitous discovery, which, along with the broadening of the field of connotation for a given word or phrase by seeing it in use in a variety of sentences, is perhaps the main virtue of resources such as Sentence Stack and Ludwig Guru.