How to visualize an ordinal variable predicting a continuous outcome?
$begingroup$
What is the best way to visualize the relationship between an ordinal predictor and a continuous outcome? So far I have a scatterplot of the two, but I feel it is lacking.
In my model, I treated the ordinal predictor as interval rather than categorical. If that is not the best way to treat this type of data, I'd appreciate feedback.
data-visualization ordinal-data
$endgroup$
asked 12 hours ago
$begingroup$
1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?
$endgroup$
– Glen_b♦
11 hours ago
4 Answers
$begingroup$
The problem with this plot is that there is no way of knowing how many dots are bunched up together (overplotting). Two solutions I've seen:
Box plot
This gives you a tighter box wherever the data points are bunched up together.
Bubble chart
Not sure if this is the official name, but basically you divide the vertical axis into bins. The size of each bubble is determined by how many observations fall into that bin.
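Both ideas can be sketched in R with ggplot2. This is only a minimal illustration on simulated data, not the asker's data: the data frame `dat`, the five-level predictor, and the bin width of 5 are all assumptions made for the example. `geom_count()` sizes each point by the number of observations at that location, so binning the outcome first gives a bubble chart of the kind described above.

```r
library(ggplot2)

# Hypothetical data: a 5-level ordinal predictor and a skewed continuous outcome
set.seed(1)
n <- 500
x <- sample(1:5, n, replace = TRUE)
y <- rexp(n, rate = 1 / (2 * x))
dat <- data.frame(x = factor(x, ordered = TRUE), y = y)

# Box plot: one box per level; the box spans the interquartile range,
# so tightly bunched values give a short box
p_box <- ggplot(dat, aes(x, y)) +
  geom_boxplot()
print(p_box)

# Bubble chart: replace each outcome value by the midpoint of its bin,
# then let geom_count() map the number of observations to point size
bin_width <- 5  # assumed bin width; tune for your data
dat$y_bin <- floor(dat$y / bin_width) * bin_width + bin_width / 2
p_bubble <- ggplot(dat, aes(x, y_bin)) +
  geom_count()
print(p_bubble)
```

The bin width controls how coarse the bubbles are, so it is worth trying a few values before settling on one.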
$endgroup$
– Art
answered 11 hours ago
$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
8 hours ago
$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
8 hours ago
$begingroup$
To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:
- Square-root (or cube-root) transform your Y-axis. Both these transformations can deal with zeroes, unlike log transformations. Cube roots can also deal with negative numbers.
- Make the points a bit transparent.
- Add a little jitter to the X-axis values if the previous steps are insufficient.
As Glen_b notes, there is insufficient information right now to judge whether adding a linear regression line is meaningful.
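A rough sketch of how these suggestions could be combined in R with ggplot2 follows. The data are simulated and the names `dat` and `cbrt` are hypothetical, introduced only for this example. One detail worth knowing: in R, `(-8)^(1/3)` evaluates to `NaN`, so a cube root that handles negative values needs the sign treated separately.

```r
library(ggplot2)

# Signed cube root: works for zero and for negative values
cbrt <- function(v) sign(v) * abs(v)^(1 / 3)

# Hypothetical data: ordinal predictor coded 1-5, skewed outcome with some zeros
set.seed(2)
dat <- data.frame(x = sample(1:5, 300, replace = TRUE))
dat$y <- pmax(0, rnorm(300, mean = 2 * dat$x, sd = 5))^2

p <- ggplot(dat, aes(factor(x), cbrt(y))) +
  # horizontal jitter only, with transparent points to reveal overlap
  geom_jitter(width = 0.15, height = 0, alpha = 0.3) +
  # large point per level: the mean of the cube-root-transformed values
  stat_summary(fun = mean, geom = "point", size = 4, colour = "red") +
  labs(x = "Ordinal predictor", y = "Outcome (cube-root scale)")
print(p)
```

Note that the large points here are means taken on the transformed scale, which is not the same as transforming the raw means. For a square-root axis, ggplot2's built-in `scale_y_sqrt()` can be applied to the untransformed variable instead.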
$endgroup$
– mkt
answered 11 hours ago, edited 11 hours ago
$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
11 hours ago
$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
11 hours ago
$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
9 hours ago
$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
9 hours ago
$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
9 hours ago
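For readers who want to try the modelling idea raised in these comments, here is a hedged sketch on simulated data. It uses `glm()` with the `quasipoisson` family rather than `poisson`; that substitution is an assumption of this example, chosen because it gives the same coefficient estimates while avoiding R's warnings about non-integer responses. The data frame `dat` and its values are hypothetical.

```r
# Hypothetical data: non-negative continuous outcome with a few exact zeros
set.seed(3)
n <- 400
x <- sample(1:4, n, replace = TRUE)
y <- rgamma(n, shape = 2, scale = x)
y[sample(n, 20)] <- 0
dat <- data.frame(x = factor(x), y = y)

# Log-link model with the ordinal predictor entered as a factor
fit <- glm(y ~ x, family = quasipoisson(link = "log"), data = dat)

# Fitted mean of the outcome at each level, back on the original scale
newdat <- data.frame(x = factor(levels(dat$x), levels = levels(dat$x)))
newdat$fitted_mean <- predict(fit, newdata = newdat, type = "response")
print(newdat)
```

With the predictor entered as a factor, the fitted means coincide with the observed group means, zeros included, so the model itself has no difficulty with zeros. The difficulty arises only when plotting individual observations on a log axis.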
$begingroup$
The plot you have shown is pretty good, but I think you can improve the data-ink ratio (a concept introduced by Edward Tufte) even more by showing all the data points. You can do this by adding jitter along the x-axis.
Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for each level.
As an example, I have plotted the Titanic dataset in R, using the passenger class as the ordinal variable and the passenger age as the continuous variable.
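library(tidyverse)  # attaches ggplot2 and dplyr
library(titanic)    # provides the titanic_train data frame

# Treat passenger class as categorical and drop rows with missing Age
df <- titanic_train %>%
  filter(!is.na(Age)) %>%
  mutate(Class = factor(Pclass))

# height = 0 jitters horizontally only, so the Age values stay exact
ggplot(df, aes(Class, Age, color = Class)) +
  geom_jitter(height = 0) +
  ggtitle("Titanic passenger age vs. class")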
$endgroup$
– Pieter
answered 8 hours ago
$begingroup$
You state that one variable is ordinal, then you decide to treat it as interval. Is that reasonable? There is no way for us to know, as you have not said what the ordinal variable actually is. If you do decide to keep it as ordinal, then what to do depends on your sample size. If N is very large then I like the box plot solution. If N is not so large, then I like jitter. There are other additions you can make to the scatterplot as well - I wrote a presentation about this using SAS, but I am sure it could be duplicated in R. (If that link does not work, Googling flom, scatterplots, enhancements should find it).
But what if treating the variable as interval is not reasonable? You could come to this conclusion either substantively or by trying different codings and seeing how results change. In that case, I suggest trying optimal scaling. There is an R package optiscale that may help (I have not used this package).
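One way to try different codings and see how the results change is to fit the same outcome with the predictor entered once as a number (interval coding) and once as a factor (one mean per level), then compare the two nested models. The sketch below uses simulated data with unevenly spaced level means, which is an assumption of the example; it does not use optiscale and is not an implementation of optimal scaling.

```r
# Hypothetical data: five ordered levels whose means are not evenly spaced
set.seed(4)
n <- 300
x <- sample(1:5, n, replace = TRUE)
level_means <- c(10, 11, 15, 16, 25)
y <- level_means[x] + rnorm(n, sd = 3)

# Interval coding: a single slope, which assumes equal spacing between levels
fit_interval <- lm(y ~ x)

# Categorical coding: a separate mean for each level
fit_categorical <- lm(y ~ factor(x))

# F test of the extra 3 parameters; a small p-value suggests that the
# equal-spacing assumption does not hold for these data
cmp <- anova(fit_interval, fit_categorical)
print(cmp)
```

If the two fits are practically indistinguishable, treating the variable as interval is probably harmless for the analysis at hand; if they differ substantially, plotting the per-level means shows where the equal-spacing assumption fails.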
$endgroup$
– Peter Flom♦
answered 8 hours ago