List of 1000 most common words across all languages
What words are the most common across languages? Is there a list of 100 or 1000?
swadesh-list
asked 20 hours ago by Lance Pollard
4
This question is really too broad for a meaningful answer. What do you mean by "words"? What do you mean by "most common across languages"? The most common word in English is "the", but German has a dozen or so different definite articles: should they all be grouped together for comparison with English? And so on and so forth.
– Draconis
19 hours ago
What Draconis said. You need to define a corpus. You could calculate such a list from the fastText .vec files. However, it would contain many assumptions, and corpus-specific skew and noise.
– Adam Bittlingmayer
18 hours ago
1
I want nouns, verbs, and adjectives in every language. Red, blue, sun, moon, etc.
– Lance Pollard
18 hours ago
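To make the fastText suggestion in the comments concrete, here is a minimal sketch of pulling a top-N word list out of a .vec file. It assumes the usual text layout (a header line with vocabulary size and dimension, then one token per line followed by its vector) and assumes the tokens are stored in descending frequency order, which should be checked for whichever file is actually used. The function name `top_words_from_vec` is made up for this illustration.

```python
from typing import Iterable, List


def top_words_from_vec(lines: Iterable[str], n: int) -> List[str]:
    """Return the first n tokens of a fastText-style .vec text file.

    Assumption: the first line is a "<vocab_size> <dimension>" header and
    the remaining lines are ordered from most to least frequent token.
    """
    it = iter(lines)
    next(it, None)  # skip the header line
    words: List[str] = []
    for line in it:
        if len(words) >= n:
            break
        # The token is everything before the first space on the line.
        token = line.split(" ", 1)[0].strip()
        if token:
            words.append(token)
    return words


# Tiny in-memory stand-in for a real file (invented data, not real vectors).
SAMPLE_VEC = [
    "3 2",
    "the 0.1 0.2",
    "of 0.3 0.4",
    "and 0.5 0.6",
]
```

With a real file, the same function can be fed an open file handle (opened with `encoding="utf-8"`) in place of `SAMPLE_VEC`. As the comment warns, the resulting list reflects the corpus the vectors were trained on, not the language as a whole.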
2 Answers
There is no such list, but you could build one.
You can find (multiple) frequency lists for many languages, and you would need a way to decide which frequency list to use (there are very many for English). I suppose you are thinking you might construct a list with entries like day, Tag, jour, день, gün, siku, päivä, 日, where all of the words seem to mean the same thing and each one ends up on the top-1000 list in its own language (I don't know whether they do; this is just a hypothetical example).
The problem is that the most frequent words in English (and many other languages) are things like "a, the, all, but, she", and these are not going to have correspondents in all languages. Plus, the various forms of the verb "be", or "do" and "don't", are each treated as separate words in some frequency lists. It would be more productive to define a subset of concrete nouns, adjectives and verbs like "cat, dog, big, small, eat, walk" and get the N most frequent equivalents across languages. You must abandon the search for data in every language, but you could go for "as many as you can get". As a precursor to this exercise, you might try to come up with the N most frequent concrete nouns and verbs of English, filtering out proper names (unless you really want proper names to be included). Then do the same thing for Khmer. Then you have to decide whether "good" and ល្អ are "the same" in meaning (the Khmer word also translates as "attractive").
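The exercise described above could be sketched roughly as follows. This is only an illustration under stated assumptions: you already have a hand-made table mapping each concept to its translation per language, plus one frequency-ordered word list per language. The function name `shared_top_concepts` and all of the toy data are invented for the example; the ranks are not real frequencies.

```python
from typing import Dict, List, Tuple


def shared_top_concepts(
    translations: Dict[str, Dict[str, str]],
    freq_lists: Dict[str, List[str]],
    top_n: int,
) -> List[str]:
    """Return concepts whose translation is in the top_n words of every
    language for which both a translation and a frequency list exist,
    ordered by mean rank (most frequent first)."""
    # Build a word -> rank table per language, limited to the top_n entries.
    ranks: Dict[str, Dict[str, int]] = {}
    for lang, words in freq_lists.items():
        table: Dict[str, int] = {}
        for position, word in enumerate(words[:top_n]):
            table.setdefault(word, position)  # keep the first occurrence
        ranks[lang] = table

    scored: List[Tuple[float, str]] = []
    for concept, by_lang in translations.items():
        # "As many languages as you can get": skip languages without a list.
        langs = [lang for lang in by_lang if lang in ranks]
        if not langs:
            continue
        if all(by_lang[lang] in ranks[lang] for lang in langs):
            mean_rank = sum(ranks[lang][by_lang[lang]] for lang in langs) / len(langs)
            scored.append((mean_rank, concept))

    scored.sort()
    return [concept for _, concept in scored]


# Toy data, invented for illustration only.
TOY_TRANSLATIONS = {
    "day": {"en": "day", "de": "Tag", "fr": "jour"},
    "cat": {"en": "cat", "de": "Katze", "fr": "chat"},
}
TOY_FREQ_LISTS = {
    "en": ["the", "day", "cat"],
    "de": ["der", "Tag", "Katze"],
    "fr": ["le", "jour", "chat"],
}
```

The code does not address the hard part, which is deciding that two words are "the same" in meaning. The translation table encodes that judgment by hand, so a case like "good" versus ល្អ has to be settled before any counting happens.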
answered 18 hours ago by user6726
I think you are looking for the "Swadesh list", a list of the 100 most common concepts across languages.
https://en.wikipedia.org/wiki/Swadesh_list
answered 9 hours ago by fdb
1
Do you have a reason to think this is based on frequency, rather than "basicness"?
– user6726
8 hours ago