Jargon request: “Canonical Form” of a word
I have zero experience with linguistics. Some friends told me that my question is a linguistics question, so I decided to give it a shot here.
Question
When designing a dictionary, people collect words into it. Naively, each word would get its own entry, but that is inefficient. Instead, we should combine words like 'book' and 'books' into a single entry, and should probably name that entry 'book', because the word 'book' is more "reduced" than the word 'books'.
In this case, I would like to say that the 'canonical form' of the word 'books' is 'book'. Is there a term in linguistics that corresponds to my 'canonical form'? And how do people deal with this dictionary-design problem? Any related terms would also be appreciated. Thank you in advance!
terminology
asked 8 hours ago
Student
1 Answer
The term you are looking for is lemma.
And yes, most linguists will indeed treat book and books as the same "word" (in the sense of word=lemma) in a dictionary, which is why the lemma is sometimes also referred to as the dictionary form of a word. This applies not only when writing a literal dictionary, but is also the common understanding of the term "word" in theoretical linguistics: "books" and "book" are generally regarded as belonging to the same lexicon entry, where "lexicon" here means not a physical lexicon you keep on your bookshelf, but an abstract idea of a language's word inventory that speakers have intuitions about.
I'm not sure exactly what you'd like to know about how this problem is dealt with. Basically, forms of a word which are mere grammatical variants (such as singular vs. plural in the case of nouns, present vs. past tense in the case of verbs, ...) but carry the same core meaning will be treated as the same "word" (in the sense of lemma). Two words of the same form but with different meanings, on the other hand, are treated as belonging to different lemmas. Take the word(s) "bow": to bend down vs. something to shoot an arrow with. Both words happen to have the same form, but it makes sense to treat them as different words.
More difficult cases include phenomena like "parliament", which could have the sense of a building or of an institution. Is this still in some way the same meaning, or should these two senses of "parliament" be treated as different entries in a lexicon? This is then more a question of where to draw the line between different "meanings", while "lemma" in the sense of "an abstraction over all grammatical variants of a word" is relatively clear-cut.
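To make the grouping above concrete, here is a minimal Python sketch of a toy lexicon. It only illustrates the idea and is not how any real dictionary is stored. The `Entry` class, the `lexicon` list, the `entries_for` helper, and the glosses are all made up for this example.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    """One lexicon entry: a lemma plus its grammatical variants (hypothetical layout)."""
    lemma: str                                    # dictionary form used as the entry name
    gloss: str                                    # short description of the core meaning
    forms: set[str] = field(default_factory=set)  # inflected forms belonging to this lemma

# Toy data: "book"/"books" share one entry, while the two meanings of "bow"
# get separate entries even though their forms overlap.
lexicon = [
    Entry("book", "bound set of pages", {"book", "books"}),
    Entry("bow", "to bend down", {"bow", "bows", "bowed", "bowing"}),
    Entry("bow", "something to shoot an arrow with", {"bow", "bows"}),
]

def entries_for(form: str) -> list[Entry]:
    """Return every entry that lists the given form among its variants."""
    return [entry for entry in lexicon if form in entry.forms]

for entry in entries_for("bow"):
    print(entry.lemma, "-", entry.gloss)
```

Looking up "books" finds the single "book" entry, whereas looking up "bow" finds two entries, one per meaning. Where to put "parliament" (one entry with two senses, or two entries) is exactly the judgment call described above, and the sketch does not settle it.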
BTW: The process of computing the lemma form for a given word (e.g. books -> book), often used in the context of computational linguistics, is called lemmatization.
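As a rough illustration of lemmatization, here is a deliberately naive Python sketch: a small lookup table for irregular forms plus one crude rule for plural "-s". The `IRREGULAR` table and the `lemmatize` function are invented for this example. Real lemmatizers rely on large dictionaries and morphological analysis, and usually need to know the part of speech as well.

```python
# Hypothetical table of irregular forms; a real system would have thousands.
IRREGULAR = {
    "mice": "mouse",
    "children": "child",
    "went": "go",
    "was": "be",
}

def lemmatize(word: str) -> str:
    """Map a word form to a guessed lemma (toy rules, English only)."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    # Crude plural rule: strip a final "s", but leave short words
    # and words ending in "ss" (e.g. "glass") alone.
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w

for form in ["books", "book", "mice", "glass"]:
    print(form, "->", lemmatize(form))
```

The rule is easy to break (it would turn "always" into "alway", for instance), which is why practical lemmatization is dictionary-driven rather than based on suffix stripping alone.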
edited 6 hours ago
answered 7 hours ago
lemontree♦