List of 1000 most common words across all languages

What words are the most common across languages? Is there a list of 100 or 1000?







swadesh-list
















asked 20 hours ago









Lance Pollard










  • This question is really too broad for a meaningful answer. What do you mean by "words"? What do you mean by "most common across languages"? The most common word in English is "the", but German has six different forms of the definite article (der, die, das, den, dem, des): should they all be grouped together for comparison with English? And so on and so forth.

    – Draconis
    19 hours ago











  • What Draconis said. You need to define a corpus. You could calculate such a list from the fastText .vec files. However, it would bake in many assumptions, along with corpus-specific skew and noise.

    – Adam Bittlingmayer
    18 hours ago






  • I want nouns, verbs, and adjectives in every language. Red, blue, sun, moon, etc.

    – Lance Pollard
    18 hours ago
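The fastText suggestion in the comments above can be sketched roughly as follows. This is a minimal sketch, not a tested pipeline: it assumes the plain-text .vec layout (a header line with vocabulary size and dimension, then one word per line followed by its vector) and assumes the vocabulary is stored in descending frequency order. The function name `top_words_from_vec` and the file name in the comment are hypothetical.

```python
def top_words_from_vec(lines, n=1000):
    """Return the first n words from an iterable of .vec text lines.

    Assumed format: header "<vocab_size> <dim>", then one line per word,
    "<word> <v1> ... <v_dim>", with words in descending frequency order.
    """
    it = iter(lines)
    dim = int(next(it).split()[1])  # second header field is the vector size
    words = []
    for line in it:
        if len(words) >= n:
            break
        line = line.rstrip()  # some files carry a trailing space
        if not line:
            continue
        # Split off the last `dim` fields; whatever remains is the word.
        words.append(line.rsplit(" ", dim)[0])
    return words


# Hypothetical usage with a downloaded file:
# with open("cc.en.300.vec", encoding="utf-8") as f:
#     english_top = top_words_from_vec(f, n=1000)
```

As the comment warns, the resulting order reflects the corpus the vectors were trained on, and the lists for different languages would still need to be aligned by meaning.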






















2 Answers













There is no such list, but you could build one.
Frequency lists exist for many languages, often several per language (there are very many for English), so you would first need a principled way to decide which one to use. I suppose you have in mind a list with entries like day, Tag, jour, день, gün, siku, päivä, 日, where all of the words seem to mean the same thing and each one falls in the top 1000 of its own language (I don't know if they do; this is just a hypothetical example).

The problem is that the most frequent words in English (and many other languages) are things like "a, the, all, but, she", and these will not have counterparts in every language. In addition, some frequency lists treat the various forms of the verb "be", or "do" and "don't", as separate words. It would be more productive to define a subset of concrete nouns, adjectives, and verbs like "cat, dog, big, small, eat, walk" and find the N most frequent equivalents across languages. You will have to abandon the search for data in every language, but you could aim for "as many as you can get". As a precursor to this exercise, you might try to come up with the N most frequent concrete nouns and verbs of English, filtering out proper names (unless you really want them included). Then do the same thing for Khmer. Then you have to decide whether "good" and ល្អ are "the same" in meaning (the Khmer word also translates as "attractive").
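The procedure described above could be sketched along these lines. This is only an illustration under stated assumptions: `freq_lists` stands for whatever per-language frequency lists you settle on, and `concept_words` is a hand-built translation table, which is where judgment calls such as "good" versus ល្អ have to be made. Both names, and the function `rank_concepts`, are hypothetical.

```python
from statistics import mean


def rank_concepts(freq_lists, concept_words, top_n=1000):
    """Rank concepts by how many languages have their word in the top_n.

    freq_lists: {lang: [word, ...]}, most frequent word first.
    concept_words: {concept: {lang: word}}, a hand-built table (assumption).
    Returns a list of (concept, languages_covered, mean_rank), best first.
    """
    # 1-based rank of each of the top_n words, per language.
    ranks = {
        lang: {w: i for i, w in enumerate(words[:top_n], start=1)}
        for lang, words in freq_lists.items()
    }
    results = []
    for concept, by_lang in concept_words.items():
        hits = [
            ranks[lang][word]
            for lang, word in by_lang.items()
            if lang in ranks and word in ranks[lang]
        ]
        if hits:
            results.append((concept, len(hits), mean(hits)))
    # More languages covered first; ties broken by lower mean rank.
    results.sort(key=lambda r: (-r[1], r[2]))
    return results


# Toy data, invented purely for illustration.
freq_lists = {
    "en": ["the", "day", "good", "cat"],
    "de": ["der", "Tag", "gut"],
    "fr": ["le", "jour", "chat"],
}
concept_words = {
    "DAY": {"en": "day", "de": "Tag", "fr": "jour"},
    "CAT": {"en": "cat", "de": "Katze", "fr": "chat"},
    "THE": {"en": "the"},
}
for row in rank_concepts(freq_lists, concept_words, top_n=1000):
    print(row)
```

Sorting by coverage first reflects the "as many languages as you can get" goal; function words such as "the" drop down the ranking simply because few languages have a counterpart in the table.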

















answered 18 hours ago

user6726







































I think you are looking for the "Swadesh list", a list of the 100 most common concepts across languages.

https://en.wikipedia.org/wiki/Swadesh_list

















answered 9 hours ago

fdb










  • Do you have a reason to think this is based on frequency, rather than "basicness"?

    – user6726
    8 hours ago











