Mark Davies, creator of the largest freely available corpus of English, has built another tool, called WordAndPhrase.info. What does it do? You type in an English word and it shows you the following information:
- the word’s rank in COCA, the Corpus of Contemporary American English (for example, perform is the 954th most frequent word in the corpus)
- the word’s relative frequency in spoken English, fiction, magazines, newspapers and academic texts; for example, sullen (=angry and silent) is used almost exclusively in fiction writing and practically never in spoken English
- collocates, sorted by part of speech and by frequency; for example, perform often occurs together with words such as task, function, well, better, able and poorly
- example sentences containing the word
If you type * into the search box, you can also get a list of the most frequent English words.
You should be looking at a redesigned Antimoon, the project I’ve been working on for the past month or so. If it looks wrong, you may have to reload the page to get the newest styles.
If anything seems wrong after you reload the page, please let me know.
Here’s a rundown of the most important changes:
- More readable, better-looking text
- Navigation bar on every page
- Coordinated color scheme
- (articles and blog pages only) Dictionary lookup feature — double-click any word to look it up in the Cambridge Advanced Learner’s Dictionary (includes phonetic transcription, recordings and of course example sentences)
- (articles only) Google Translate link for people who have difficulty reading in English and whose languages are not included in the Translation Wiki.
- (articles and blog pages only) Buttons for sharing an article on Facebook, Twitter, etc.
- Awesome print stylesheet — if you print an article, it will look almost as good as a page from a book. No ads, no site navigation, just pure text set in a nice font.
I was wondering how many items we should add to SuperMemo per day. What average worked well for you and your friends when you were in high school?
I was a heavy SuperMemo user for about 2.5 years. In that period, I added 6,000 items to my English collection, which works out to an average of about 6 items per day. The typical scenario was that, every few days, I would sit down and add anywhere between 10 and 50 words and phrases to my English collection. In addition to that, I added an average of 4 items per day to my German collection.
Doesn’t sound impressive at all, does it? If there ever was a conference for users of spaced repetition software (SRS), I think 6,000 items would get me laughed out of the room. I personally know people who have memorized more than 20,000 English items. My friend and ex-partner at Antimoon, Michal Ryszard Wojcik, added twice as many items as me.
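For anyone who wants to check the arithmetic behind those averages, here is a rough sketch in Python. It assumes a 365-day year and treats "about 2.5 years" as exactly 2.5, so the result is only approximate (it comes out a little above 6 items per day). The function and variable names are mine.

```python
# Rough check of the daily averages quoted in this post.
# Assumptions: 365 days per year, and "about 2.5 years" taken as exactly 2.5.
DAYS_PER_YEAR = 365


def daily_average(items: int, years: float) -> float:
    """Average number of items added per day over the given period."""
    return items / (years * DAYS_PER_YEAR)


english_per_day = daily_average(6_000, 2.5)  # English collection
german_per_day = 4.0                         # German collection, as stated in the post
total_per_day = english_per_day + german_per_day

print(f"English: {english_per_day:.1f} items per day")
print(f"Both collections: {total_per_day:.1f} items per day")
```

You can plug in your own numbers to see what a given daily pace adds up to over a school year.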
Quick update for those of you who care:
After spending weeks trying to configure a replacement for Ask Antimoon, I’ve convinced myself that the project is not really worth the effort. I’ve got to face the facts here: the site never took off the way I hoped it would. The number of visitors was minuscule next to the total number of Antimoon visitors and, while there was an upward trend, it was barely detectable. I know there was a small group of people who really liked Ask Antimoon (I still get enquiries from former users). There were also some insightful discussions there and I’m sorry they’re no longer accessible. Still, setting up, maintaining and administering Ask Antimoon takes really unpleasant work. Since I suck at making money on the Interwebs, there would be zero financial payoff to that work. Of course I like doing stuff for the community, but I think my time is better spent on things other than providing and policing a discussion board.
I apologize to those of you who were waiting for the site to come back up. And no, there will be no revival of the old forum. No way I’m going back to that level of discussion.
I spent a week improving the pronunciation section of How to learn English. The main pronunciation page now contains more concrete advice.
I plan to make several updates to the pronunciation section in the near future. Here’s the first one: Learn to pronounce English words as soon as possible. The gist of the article is that you shouldn’t put off studying English pronunciation because doing so puts you at risk of developing fossilized mistakes (bad habits). The article also explains the concept of “getting it right in your head” when pronouncing English words.
I needed to find an online vowel chart for English, but I couldn’t find one I liked, so I made one myself. Here it is: English vowel chart
In case you hadn’t noticed,
it has somehow become uncool
to sound like you know what you’re talking about?
Or believe strongly in what you’re saying?
Invisible question marks and parenthetical (you know?)’s
have been attaching themselves to the ends of our sentences?
Even when those sentences aren’t, like, questions? You know?
I thought you might enjoy this typographically animated version of “Totally like whatever, you know?”, a poem by Taylor Mali (text version here). The poem satirizes the use of like, you know, and the rising, “questiony” intonation at the ends of sentences, which are found in the speech of many young people in the United States.
Have you ever wondered how many English words you know? The question is not very precise. What does it mean to ‘know’ a word? Is teacup a word or a combination of two words? How about tick off? Is game (something you play) a different word from game (wild animals)? Nevertheless, it feels good to put some kind of number on your vocabulary.
Testyourvocab.com will estimate the size of your passive English vocabulary (the words that you can understand, but not necessarily use in a sentence) by showing you a sample of words from a dictionary to determine your general level, and then another sample to get a more precise measurement. You can then compare your result with those of native speakers and non-native speakers of various ages.
The authors have also published some interesting charts based on the data they’ve collected.
P.S. In case you’re wondering, my score was 26,400.
For those of you wondering why you cannot access Ask Antimoon anymore, here’s the deal: Ask Antimoon was on a server that was set to expire on Monday. (By the way, I would like to thank Fog Creek Software for hosting the site free of charge for almost 2 years.)
For the past two weeks, I’ve been working on setting up a clone of Ask Antimoon on my own server. I made dozens of UI customizations to make the new site behave in roughly the same way as the old site (which I was generally happy with). When I finally got around to the problem of getting the actual data from the old site to the new site, it turned out that the import doesn’t work. There are bugs and missing features to deal with. Yes, the data import should have been the first thing that I tested, but I was an idiot.
To make things worse, I will be mostly unavailable for the next 10 days or so, so it might take a while before the site is back up.
If you want to learn fluent English, you should probably get about 6 hours of spoken input a week. This usually means that you need a constant supply of interesting audio/video content to listen to/watch at home.
It is not always easy to find new sources of input every week, so it is a good idea to watch and listen to episodic content. That way, rather than wonder “What movie am I going to watch today?”, you can just tune in to your favorite show regularly and get your dose of English.
With this in mind, I have decided to publish a list of episodic content (TV shows, podcasts, etc.) that I have found exceptionally entertaining or informative. I will add to this list as I discover new shows, so check back in a while.
Thanks to the great people at Comedy Central, everyone can now watch full episodes of The Daily Show with Jon Stewart and The Colbert Report online for free! (That is, everyone except people in a few countries, like the UK, in which these shows are aired on TV.)
As you know, I am a big fan of watching TV series and shows because they are a constant source of input that gives you the intensity you need to build your English. So if you have a sense of humor and you are at all interested in US politics, follow these shows for some excellent English practice.
(You probably know that you can watch all the South Park episodes online, but here’s a link just in case.)
Mark Davies, who developed the invaluable Corpus of Contemporary American English (COCA), recently launched two new language corpuses: Corpus del Español and Corpus do Português.
Once you get the hang of the query syntax and the user interface (which can be daunting at first), you can search through a large database of Spanish and Portuguese sentences to answer lots of different questions about these two languages, e.g. which preposition goes with insistir, which synonym of duro is most commonly used with trabajo, and many others.
Unlike Google, Mark’s corpuses allow you to search for all the grammatical forms of a word (just put the base form of the word in [brackets]), specify parts of speech (e.g. [v*] stands for any verb in any form), or search by proximity (e.g. find all adjectives within 5 words of ojos). They will also sort the results by frequency, which can be a real time-saver.
I have just rolled out a full IPA “keyboard” which lets you type IPA phonetic symbols for any language (not just English).
You will find it useful if you ever need to type phonetic transcriptions for a language other than English. You will also like it if you’re a phonetics geek and always wanted to transcribe tree as [tʰɹ̥ʷiː] or heel as
There are other online solutions for IPA input, but this one is easily the fastest, allowing both quick access to buttons and intuitive keyboard shortcuts. There is almost no learning curve — just hold Ctrl and press the letter that most resembles the IPA symbol you want to type; keep pressing the letter until you get the symbol you want.
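To illustrate the "keep pressing the letter" idea, here is a minimal sketch in Python. It is not the keyboard's actual code: the `CYCLES` table, the order of symbols in it, the wrap-around behavior, and the function name `symbol_for` are all assumptions made up for this example.

```python
# Hypothetical table: each base letter maps to the IPA symbols it cycles through.
# The real keyboard's symbols and their order are not given in this post.
CYCLES = {
    "a": ["ɑ", "æ", "ɐ"],
    "e": ["ə", "ɛ", "ɜ"],
    "s": ["ʃ"],
    "n": ["ŋ"],
}


def symbol_for(letter: str, presses: int) -> str:
    """Return the symbol selected after pressing Ctrl+<letter> `presses` times."""
    if presses < 1:
        raise ValueError("presses must be at least 1")
    options = CYCLES.get(letter)
    if not options:
        # No variants listed for this key: fall back to the plain letter.
        return letter
    # Assumed behavior: wrap around, so extra presses start the cycle again.
    return options[(presses - 1) % len(options)]


print(symbol_for("a", 1))  # first symbol listed for "a"
print(symbol_for("a", 2))  # second symbol listed for "a"
print(symbol_for("a", 4))  # wraps around to the first symbol again
```

The point of this design is that you only have to remember which letter a symbol looks like, not a separate shortcut for every symbol.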
You will need Windows Vista/7 or a third-party IPA font to see all the symbols. Works best with Firefox, Internet Explorer 8 and Safari on Mac.
Added May 2012: See here for a better solution to problems with the LDOCE.
It appears there is an unofficial patch which fixes many of the LDOCE’s shortcomings (thanks are due to my readers who told me about it). Here are the main fixes:
- Mousewheel scrolling works (still a little too slow for my liking).
- Keyboard scrolling is fixed for the most part (there are still occasional problems with PgUp).
- You can select any text and right-click it to copy it to the clipboard.
- The PopUp mode window can now be resized. This is probably the most convenient way to use LDOCE, as it has very few distractions. I’ve noticed that in the PopUp mode, LDOCE tries to look up every piece of text you copy to the clipboard. To turn this off, you have to right-click the QuickFind icon in the system tray (bottom right corner of the screen) and choose Exit.
- You now get the dictionary window immediately after startup rather than having to click Dictionary.
The only big problem that hasn’t been fixed is the startup time. It still takes about 10 seconds to open the dictionary.
There are two problems with using Google to check your English:
- The Web is full of bad English written by non-native speakers and verbally incompetent native speakers. This is a serious problem when you’re looking for correct example sentences to learn the usage of a word or when you’re trying to check if some phrase is correct, because you cannot trust the information you get from the general Web. In each case, you have to examine the source of the sentence to check if it’s trustworthy.
- Google routinely reports the wrong number of hits, especially for phrases. It may tell you that “I have a question for you” occurs on 1,600,000 pages, but the actual number is 473. This means you cannot trust the reported number of hits when you want to check if some phrase is correct. The only solution is to find the last search results page, but this can be hard if there are a lot of hits. (Bing used to be accurate, but now has the same issues.)
The Correct English search engine (based on Google) solves the first problem. It includes a subset of the Web — a hand-picked list of sources which are known to contain good English: online dictionaries, news sites, selected blogs and communities, Wikipedia, movie scripts, government sites, and others. Of course, the content is not 100% “pure”, but the quality is vastly better than on the general Web. Correct English contains practically no sentences written in bad English.
My friend Michał recently asked me for an opinion on Extreme English, the flagship English-learning course at SuperMemo.net. He has moved to England and is eager to improve his English.
Michał is a smart guy. He realizes that just living in England will not make him a good English speaker. As a case in point, the Polish family he is currently staying with has lived in England for five years and speaks hardly any English. They watch Polish channels on TV, they talk mostly to each other and to other Poles, and they do jobs that require few communication skills, so they don’t get enough input to make progress.
So he is simply continuing the English-learning strategy that he used in Poland. He listens to English radio, watches English TV, reads English newspapers, and develops his own SuperMemo collection. The only difference is that now his future depends on how well he can learn English. This leads to more intensity (he’s now learning for several hours a day), but also a lot of pressure.
Added May 2012: See here for a better solution to problems with the LDOCE.
In last week’s episode, I decided to say goodbye to the PC version of the Longman Dictionary of Contemporary English (LDOCE) after months of putting up with its slowness, missing features, and — most of all — random bugs. Today, I am pleased to report that I seem to have found a satisfactory solution.
Every copy of the LDOCE comes with an access code to the online version of the dictionary, available at www.longmandictionariesonline.com.
Web applications give the programmer much less freedom than native Windows applications. In this case, that’s a good thing, because it means that Longman’s developers have had much less freedom to screw up basic features like scrolling or copy-and-paste.