Thoughts for serious language learners

Correct English Search

There are two problems with using Google to check your English:

  1. The Web is full of bad English written by non-native speakers and verbally incompetent native speakers. This is a serious problem when you’re looking for correct example sentences to learn the usage of a word or when you’re trying to check if some phrase is correct, because you cannot trust the information you get from the general Web. In each case, you have to examine the source of the sentence to check if it’s trustworthy.
  2. Google routinely reports the wrong number of hits, especially for phrases. It may tell you that “I have a question for you” occurs on 1,600,000 pages, but the actual number is 473. This means you cannot trust the reported number of hits when you want to check if some phrase is correct. The only solution is to find the last search results page, but this can be hard if there are a lot of hits. (Bing used to be accurate, but now has the same issues.)

The Correct English search engine (based on Google) solves the first problem. It includes a subset of the Web — a hand-picked list of sources which are known to contain good English: online dictionaries, news sites, selected blogs and communities, Wikipedia, movie scripts, government sites, and others. Of course, the content is not 100% “pure”, but the quality is vastly better than on the general Web. Correct English contains practically no sentences written in bad English.
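The idea of searching only a hand-picked subset of the Web can be pictured as a domain allowlist. The snippet below is only a rough sketch of that idea, not the actual configuration of Correct English (which is set up inside Google's custom search tool). The names `ALLOWED_DOMAINS` and `is_allowed` and the two listed domains are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the real list of Correct English sources is not shown here.
ALLOWED_DOMAINS = {"en.wikipedia.org", "macmillandictionary.com"}

def is_allowed(url):
    """Return True if the URL's host is an allowlisted domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

results = [
    "https://www.macmillandictionary.com/dictionary/british/bull",
    "https://en.wikipedia.org/wiki/English_language",
    "http://example.com/forum/thread-123",
]

# Keep only results that come from trusted sources.
trusted = [url for url in results if is_allowed(url)]
```

Matching on the host name rather than on a raw substring keeps look-alike domains (for example a host that merely ends in the same letters) out of the results.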

Yes, a proper corpus (such as the COCA) will give you much more searching power (for one thing, it lets you specify part-of-speech wildcards), but Correct English is quicker, easier to use and includes a much larger set of sentences. It does not show the number of hits, so you’ll have to click through to the last results page if you want to find out how common something is.

Addendum

A definitive example of Google’s incorrect search result count (if Google’s own admission is not enough for you):

It is logically impossible for the first query to return more results than the second query. The second query should return more results because it includes pages which contain the words what, did, you and hear anywhere on the page, while the first query requires that they appear in that specific phrase.
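The subset argument can be checked on a toy collection of pages. This is a minimal sketch under simplifying assumptions: the three "pages" are invented sentences, and the helper names `words`, `contains_all_words` and `contains_phrase` are hypothetical, not anything Google uses. It only shows that every page matching the exact phrase must also match the all-words query.

```python
import re

def words(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z']+", text.lower())

def contains_all_words(text, query):
    """True if every word of the query occurs somewhere in the text."""
    page_words = set(words(text))
    return all(w in page_words for w in words(query))

def contains_phrase(text, phrase):
    """True if the words of the phrase occur consecutively, in order."""
    page = words(text)
    target = words(phrase)
    n = len(target)
    return any(page[i:i + n] == target for i in range(len(page) - n + 1))

# Invented example pages (assumption, for illustration only).
pages = [
    "What did you hear about the new teacher?",
    "Did you hear what she said?",
    "I did not hear you.",
]

query = "what did you hear"
phrase_hits = [p for p in pages if contains_phrase(p, query)]
word_hits = [p for p in pages if contains_all_words(p, query)]

# Every phrase hit is also a word hit, so the phrase count can never be larger.
assert set(phrase_hits) <= set(word_hits)
```

On these pages, the phrase query matches only the first page, while the all-words query matches the first two.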

20 Comments so far ↓

  • marcos rogerio machado da fonseca

    Congratulations on your site. Many people have very poor written English and now they can turn to you for correction.

  • rivo

    Thank you Antimoon team,
    Unfortunately, not all online dictionaries seem to be included, for example macmillandictionary.com. I haven’t checked the other dictionaries yet. I looked up the idiom “take the bull by the horns” and CE didn’t give any results from macmillandictionary.com, one of my favorite, most sophisticated dictionaries.
    I just wanted to honestly suggest including as many online dictionaries as possible, because their entries are at least written by verbally competent native speakers, hopefully.
    Anyway, thank you in advance for providing such a useful service (still, hit counts are important) and also for showing us that we can customize Google search, something that most users, especially language learners, don’t normally make use of.

    • Tom

      Thank you for your suggestion. Macmillan was included, but it appears there is a URL-matching bug in Google Custom Search. Hopefully I’ve managed to work around it. Please note that not all dictionary entries are indexed by Google, so the search has some gaps, unfortunately.

  • Chaya

    As an ESL teacher, I welcome this!

  • Michal

    Thank you for this useful tool. However, I must report that it is very inconvenient for me to read the search results because there is no left margin on the search results page.

  • Ed

    Thanks a lot, this is very helpful, especially when ESL students want to write something in English but are afraid of making mistakes!

  • Jamal Farajallah

    I am really interested in such skills to improve my English, especially since I want to study for an M.A. in the U.S.A.

  • Michal

    On the page http://www.google.com.au/support/websearch/bin/answer.py?answer=136861 you will find some tips that are very useful when looking for an answer to a grammar question with Google. These tips, like using * as a wildcard for a whole word, combined with Tom’s Correct English, give you a very decent language search tool that is good enough in 95% of cases.

    • Tom

      Actually, * stands for “one or more words”.

      • Michal

        Yes, you are right. I was not exact. However, it is worth adding that matches with one word in place of * will very likely come first or rank high enough on the results page. This is why it is practical to use * if you expect one unknown word in your query, * * if you expect two unknown words, and so on… It doesn’t work 100% of the time, but it’s OK.

  • Rick Schaul-Yoder

    Correct English certainly ain’t “100% pure,” as you say. The first hit that my very first search elicited was the following: “None of my friends were big fans . . . .” Sigh.

    • Tom

      Both “none of * was” and “none of * were” are commonly used in formal and informal English. That means they’re both correct, unless you’re a prescriptivist.

  • Jorge

    Thank you Tom…
    Just when we thought antimoon couldn’t get any better, you make something up and it gets better.

  • search for meaning

    In your article you say “The only solution is to find the last search results page”. What do you mean by the last search results page? For example you are googling something and let’s say you get 10 pages so that last search results page is page number 10?

    Another issue is about the number of results you get. Considering the example you mentioned, googling “I have a question for you”, I got “About 896,000,000 results”. You say that you got about 1,600,000 pages. There is a big difference… Does the number of search results differ from country to country? I do not think that the number of search results changed dramatically in a matter of days.

    Last thing, you say “but the actual number is 473” How can you know the actual number of results?

    Thanks in advance

    • Tom

      1. Yes, that’s what I meant.
      2. I think you typed it in without quotes.
      3. On the last search results page, Google tells you “In order to show you the most relevant results, we have omitted some entries very similar to the XX already displayed.” This appears to be closer to the actual number of results, at least for infrequent phrases. “I have a question for you” is a bad example; I’m sure it has more than 473 results (but I doubt it’s on 1,600,000 pages).

      See: http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=38

      Also try searching for:
      http://www.google.com/search?hl=&q=“hit+it+with+a+bang”
      and
      http://www.google.com/search?hl=&q=“hit+it+with+a+hammer”

      According to Google estimates, they both have about 300,000 results. I find it hard to believe that these phrases are about equally common in English. BTW, Google displays only 43 links for “hit it with a bang”.

  • Santiago

    Last page (with 100 results per page) says: “Page 6 of about 3,840,000 results” … I know what you mean, but this is not a valid example… What this means is: “we have limited resources so we can only show you about 500 results”

    • Tom

      Thanks for pointing that out. It’s possible that neither number can be trusted (the initial estimate and the last-page number). That would make Google completely useless for frequency estimation.

      I know that Google will never show you more than 1000 links; perhaps the limit is lower for some searches.

      However, I have found the last-page figure to be credible when searching for infrequent phrases in the Correct English Search (< 1000 results). I will be looking more closely into this.

  • Paola Maria Sciortino

    I really enjoy antimoon. I’ve found many great hints for my work. I’m an English teacher. My desire is to find an efficient method for middle school children; your research seems to be suited to adults, or at least to students who have strong willpower and basic competence in English.
    I’m looking forward to future developments in the field.

    “Do you know what is the difference between a learner and a native speaker?”
    This is an extract from one of your webpages.
    Can you tell me why, according to my competence, it sounds wrong to me?
    I would have written
    “Do you know what the difference between a learner and a native speaker is?” Meaning that one is questioning the first part of the period.

  • slabo

    This is really useful for me as a teacher. In particular, I’m looking for occurrences of a phrase or word in Movie/TV transcripts. But if I click any of the refine options, I get the following error:

    Not Found

    The requested URL /ce/correctenglish-small.png;LH:65;LP:1;LC: was not found on this server.

    I hope this is easy to fix, seems I’m the first to report the issue so maybe this is a new error.
