3

Sorry for the clickbait title - I'm assuming Afghanistan is not the 8th most used proper noun in German. However, it's listed as such in what appears to be one of the most popular German frequency dictionaries:

Routledge Frequency Dictionaries - A Frequency Dictionary of German: Core Vocabulary for Learners

So this question is more out of curiosity about the Leipzig/BYU Corpus of Contemporary German. Can anyone comment on it? That Afghanistan should be listed as the 8th most used proper noun in German makes me wonder about the corpus and this dictionary that's based off it. That the DDR is listed as the 5th most used proper noun and Schröder as one of the most used names seems to point to it being less than contemporary (and not particularly broad).

I can't believe that even a corpus limited to just world news and the time around Peter Struck's famous quote back in 2004 that "Unsere Sicherheit wird nicht nur, aber auch am Hindukusch verteidigt" would yield such a high result for Afghanistan.

Proper nouns by frequency

George Hawkins
  • 4
    Isn't it better for Skeptics.SO? – Eller Nov 30 '17 at 16:02
  • 1
    Afghanistan is under discussion as to whether it is a safe enough country for Afghan people to return to. It has been a hot topic for about 2 years, but it has been simmering since about 2005. – Janka Nov 30 '17 at 16:12
  • 8
    First, perhaps, we should know what this strange list defines as "proper nouns". Then, we should know what text corpus that list is based on. – Christian Geiselmann Nov 30 '17 at 17:58
  • 1
    @Marzipanherz Yes, it's only names. Because this is a list of names. So it's no wonder that the list of names consists only of names. – Eller Dec 01 '17 at 10:29
  • Would you please add a reference to that corpus? – mike Dec 01 '17 at 11:57
  • @mike - initially I assumed this must be a simple Google search. But actually I can't find anything much googling e.g. byu german corpus, or replacing BYU with "Brigham Young University". Makes me all the more suspicious about the whole thing! There's a bit more if one tries searching for Leipzig corpora, e.g. I could find this Leipzig top 50 words by frequency but didn't really get to anything very satisfactory. – George Hawkins Dec 01 '17 at 18:34
  • Note that Leipzig publish their own frequency dictionary but I can't find out what corpus is behind it, or whether it has less unusual proper nouns in its top 10. – George Hawkins Dec 01 '17 at 18:37
  • @GeorgeHawkins Did you look at the list from the Institut für Deutsche Sprache? It contains proper names and they are tagged, so you should be able to extract them. – mike Dec 01 '17 at 18:38
  • Many people seem unsurprised that Afghanistan might have such a high ranking. But even at the height of the war in Afghanistan or German peacekeeping involvement I find it hard to believe that a corpus that's supposed to be broad - contemporary German - would rank it so. Even if one looked at contemporary German at the time of the Kunduz airstrike I doubt that Afghanistan popped up more often than e.g. Frankreich unless you're looking at some very narrow world-news-only corpus. – George Hawkins Dec 01 '17 at 18:46
  • 1
    @GeorgeHawkins I added another source from Uni Leipzig. Have a look, should be quite interesting. – mike Dec 01 '17 at 18:58
  • 1
    By the way, the DWDS offers a nice tool to plot the frequency of words, e.g. https://www.dwds.de/r/plot?q=Afghanistan. –  Dec 01 '17 at 19:17
  • @DeeDuu - nice :) It's a pity the granularity (decades) isn't finer. So according to DWDS Frankreich has a way higher frequency than Afghanistan across all time periods, including the last two decades. Hardly surprising - but at odds with Routledge's odd list. – George Hawkins Dec 01 '17 at 19:44

4 Answers

2

A lot depends on what time period you are referring to. If the question is whether Afghanistan (a world trouble spot in current events) is a "top ten" item this year, or in any year of roughly the past decade, then it seems plausible.

If the question is whether Afghanistan is a "top ten" printed proper noun in all of history, basically since Gutenberg invented movable type, I would expect a very different answer.

Tom Au
  • As I commented on @fdb's answer, I don't dispute that Afghanistan has become far more newsworthy in recent years but I'm still sceptical it should rank so highly. I looked at 2004 and 2005 for my fdb comment, then looked further - at the height of revelations about the Kunduz airstrike in around 2009/2010 Afghanistan did rise greatly in frequency of use. But even in the news-only Leipzig corpora it never beat Frankreich (as it does in the Routledge list). Even if it did, it would seem bizarre for a frequency dictionary to be based off a corpus so subject to short-term spikes. – George Hawkins Dec 03 '17 at 13:13
  • @GeorgeHawkins: Christian Geiselmann said it best: "First, perhaps, we should know what this strange list defines as "proper nouns". Then, we should know what text corpus that list is based on." A lot depends on how you define the universe of proper nouns, and how you define the source material. – Tom Au Dec 03 '17 at 14:03
  • 1
    I didn't think the definition of proper nouns was particularly open to interpretation or that the words in the Routledge list diverged from the common understanding of the term. This Routledge frequency dictionary says it is based off the Leipzig/BYU Corpus of Contemporary German - you can find more details, that I've put together, here. – George Hawkins Dec 03 '17 at 18:40
0

A bit of searching reveals that this Frequency Dictionary was published in 2005. The high ranking of "Afghanistan" is not surprising in a book of that time.

fdb
  • I downloaded the Leipzig news-only corpora for 2004 and 2005. And even these news-focused corpora rank Frankreich far higher than Afghanistan for both years. Interestingly, Irak also ranks higher than Afghanistan for both years. I'm not disputing that Afghanistan has risen significantly in newsworthiness at times, but I do doubt that it crops up more often than countries like Frankreich (given trade, EU politics, closeness etc.) in a news setting, let alone in something claiming to be a corpus of contemporary German suitable as a study aid. – George Hawkins Dec 03 '17 at 12:13
0

I can absolutely imagine that "Afghanistan" was this important. It has a long, sad history and is a symbol for a lot of conflicts in the world. Please note that even the Russians have a long history with it. So for us Germans it was, and probably still is, a hot topic when you want to discuss themes like "First World vs. Third World", "Christianity vs. Islam", "military aid vs. humanitarian aid", "democracy vs. autocracy" and so on. Nowadays the focus is shifting to Syria or Iraq or whatever country might be in the news, but Afghanistan was still the first larger country that stirred up many people and provoked the questions now being raised by today's conflicts.

Thomas
-1

Without having access to your splendid source of information, I turned to the Duden, according to which the following nouns are the most common (in descending order):

Jahr

Uhr

Prozent

Million

Euro

Zeit

Tag

Frau

Mensch

Mann

Wikipedia maintains another list of the most common words in German, though not only nouns.

Finally, the Institut für Deutsche Sprache also maintains several lists including a frequency list.

As none of these lists contain Afghanistan in the top spots, it is safe to assume that it is not really in the top spot. This would coincide with my observation of German news that this country is not that present in the collective consciousness. It was, however, in the past, as Germany used to have troops stationed in the Hindu Kush (2001-2014), but they returned.

I'm aware that the question is about proper names. The list in the last link includes proper names; analysis is left as an exercise to the OP.

Uni Leipzig also allows the download of the corpora by year from 1995 to 2015. It also categorizes them by news, web and Wikipedia. With a little work you can look for Afghanistan there. The linked page also explains more about the format of the files and the origin of the corpus.
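As a rough sketch of that "little work": the snippet below assumes the downloaded archive contains a tab-separated word list in which the second column is the word and the last column is its count. That layout is an assumption here, so check it against the format description on the download page. The function names and the sample counts are invented purely for illustration and are not real corpus figures.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def load_word_frequencies(lines: Iterable[str]) -> Counter:
    """Parse tab-separated lines into word counts.

    Assumed layout (not verified against the real files):
    column 2 holds the word, the last column holds its frequency.
    Lines that do not fit this layout are skipped.
    """
    freqs: Counter = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue
        word, count = parts[1], parts[-1]
        if count.isdigit():
            freqs[word] += int(count)
    return freqs


def rank_countries(freqs: Counter, countries: List[str]) -> List[Tuple[str, int]]:
    """Return (country, count) pairs sorted from most to least frequent."""
    pairs = [(name, freqs.get(name, 0)) for name in countries]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy input with made-up counts, standing in for a real word list.
    sample = [
        "101\tDeutschland\t50",
        "102\tFrankreich\t30",
        "103\tAfghanistan\t10",
    ]
    freqs = load_word_frequencies(sample)
    for name, count in rank_countries(freqs, ["Afghanistan", "Frankreich", "Deutschland"]):
        print(name, count)
```

For a real run you would replace `sample` with the opened word-list file from the archive, e.g. `open(path, encoding="utf-8")`, where `path` points at whichever words file the download actually contains.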

mike
  • 6
    The question was about PROPER nouns. – fdb Dec 01 '17 at 13:11
  • 2
    Thanks for the pointer to the frequency list from the Institut für Deutsche Sprache. I did as you suggested and picked out the proper nouns - the top 10 are Deutschland, Berlin, SPD, CDU, Peter, USA, Michael, München, Thomas and Bad - so quite a different list to the one shown in my question. – George Hawkins Dec 01 '17 at 19:07
  • And thanks for the pointer to the Leipzig corpora that I failed to find earlier. I downloaded the "mixed typical" 2011 corpus and looked at it - unfortunately they don't mark proper nouns. But if I pick out just country names I see that Afghanistan comes way behind Deutschland, China, Frankreich and more. So the Routledge corpus, whatever it may be, seems quite different to this broad Leipzig one. – George Hawkins Dec 01 '17 at 19:52
  • You're welcome! Would you consider accepting the answer? – mike Dec 02 '17 at 17:00
  • @mike - the other two answers are just opinions while this includes useful links to corpora. However I think fdb's remark is fair - it would be a better answer if it didn't lead with a list of common nouns. I tried looking for the Dudenkorpus but failed to get anything more detailed. I guess your list came from the Sprachratgeber list on the Duden site (and unfortunately this list isn't long enough to allow one to pick out any proper nouns). – George Hawkins Dec 03 '17 at 12:54
  • The only pointer I could find to further statistics on the Dudenkorpus is the "Sprache in Zahlen" section of their Rechtschreibung volume but I don't have a copy to check. However, as this section is only 12 pages long (according to the index) I doubt it goes into proper nouns etc. :( – George Hawkins Dec 03 '17 at 13:00