Is there a reliable way to hide/convey a message in vocal expressions (speech, song,…)
Suppose Bob wants to convey the following simple message (an example) to Cassandra:
Walk 5 feet forward, turn 90 degrees clockwise, walk another 4 feet and dig 3 feet into the ground.
Easy, right? Now here's the twist: Bob must not state the message outright; he has to hide or convey it within other vocal expressions, such as talking or singing. Bob and Cassandra had the opportunity to agree on a code scheme beforehand, and that scheme is what I am after.
- There should not be any link to the hidden message within the words of Bob's utterance. So something like "use every first word of every second sentence" is not viable. The meaning of the actual spoken/uttered words cannot play any role within the scheme.
- They do not know beforehand whether a song or talking will be used, so both modes need to be viable. Bonus points if even random screaming could be used.
- The scheme should allow for an almost mathematical precision. There should be no doubt whether Bob meant 3 or 4 feet.
- Assume that Cassandra, the recipient, can hear the message clearly. Audio transfer is not my point; I am only looking for an encoding scheme.
- I imagine that some parameters of the human voice, or of sound waves in general, could be used. I am unsure, however, which ones. Volume shouldn't have any meaning, so amplitude is out, right? Frequency?
- Ease of use is not a primary concern. If both need to be geniuses and have absolute pitch for your idea to work, so be it. If Cassandra needs to know "oh boy, that's 120 Hz right now", so be it.
Given my requirements, goals and constraints, is there a way to use some acoustic property of the human voice as an "additional channel" to convey a second (hidden) message? How would such a mapping work?
Tags: humans, language, communication
Number of syllables sounds nice! I will think it through. Care to expand it into an answer? I also expanded my constraints thanks to your comment. – openend, 8 hours ago
I had a similar idea once. – Renan, 6 hours ago
How do you determine the "best answer" here? (Too interesting to shut down as idea generation...) – Guran, 1 hour ago
asked 8 hours ago by openend (edited 8 hours ago)
5 Answers
Steganography is the practice of hiding information in plain sight within pictures, audio files, or other carriers. It's much easier if you allow hardware in the mix: for example, encoding a message in the noise below the audible level, or adjusting the frequency of each tone by just enough that the difference can be measured but not heard.
But you want to be able to do this by just singing or talking. For singing, one possible encoding scheme for a very good singer would be vibrato. You could send numerically encoded messages by controlling how many 'beats' of vibrato you use for each phrase.
But you want to be able to use talking as well, and the encoding can't be in the words according to your criteria. So that leaves things like pitch, duration, volume, and non-word sounds like breath intakes, pauses between words, 'umm's and 'aww's, etc. Skip volume, as it's too hard to determine absolute volume and you probably want it to work for varying levels of background noise.
For more complexity, combine them. For example, taking in a breath and then saying 'um, we should go' could mean something completely different from just saying 'we should go', which could in turn be different from sighing and then saying the same thing. It's not the 'we should go' that matters, it's the patterns of speech around the words.
So, 'breath intake + um + rising tone at end of sentence' (as in a question) means one thing. 'Breath exhale + 2 ums in sentence + flat tone' means something else. Make up as many different combinations as you need to encode all the information.
The nice thing about encoding your message in the 'metadata' of talking, instead of in words, syllable lengths or anything else tied to specific words, is that you could make it work with any text. What matters is not the text itself, but how you say it or sing it. You could read the phone book this way and still get your message across.
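To make the "combinations" idea concrete, here is a minimal sketch of such a codebook in Python. The feature inventory is an assumption for illustration: the answer only names breath intake/exhale, counts of "um" and rising/flat tone, so the "none" breath and the "falling" tone are added here purely so that 3 x 3 x 3 = 27 combinations cover a space plus A-Z. Each sentence then carries one character, whatever its words are.

```python
from itertools import product

# Hypothetical feature inventory (not from the answer as such): "none" and
# "falling" are added so that 3 * 3 * 3 = 27 cue combinations exist,
# enough for a space plus the letters A-Z.
BREATHS = ("none", "inhale", "exhale")
UM_COUNTS = (0, 1, 2)
TONES = ("flat", "rising", "falling")
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# One (breath, um count, end-of-sentence tone) triple per character.
COMBOS = list(product(BREATHS, UM_COUNTS, TONES))
ENCODE = dict(zip(ALPHABET, COMBOS))
DECODE = {combo: ch for ch, combo in ENCODE.items()}


def encode(message):
    """Return the cue triple Bob has to perform for each character."""
    return [ENCODE[ch] for ch in message.upper()]


def decode(cues):
    """Turn the cue triples Cassandra observed back into text."""
    return "".join(DECODE[cue] for cue in cues)


if __name__ == "__main__":
    for ch, cue in zip("DIG", encode("dig")):
        print(ch, cue)
```

Digits would need either a larger inventory (for example a fourth feature) or spelling the numbers out; that choice is left open here.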

You could hide the message in the word order where grammar allows it (obviously this works better the more freely the language lets you reorder words).
For example, consider the sentence:
Today I'll have pizza for lunch.
You can move "today" and "for lunch" to many different positions:
I'll have pizza today for lunch.
I'll have pizza for lunch today.
For lunch I'll have pizza today.
For lunch today, I'll have pizza.
Today for lunch, I'll have pizza.
Counting the original, that gives six variants, so you can encode one of six values in the word order. Note that the hidden message is independent of the obvious message; the very same value can be hidden in sentences like
Yesterday I watched Doctor Who after work.
Sometimes I go swimming in the morning.
Now clearly, different sentences allow different numbers of possible word orders. But it should be possible to come up with a code that works with arbitrary sentences (except that sentences with a fixed word order won't be able to carry any information).
A possible coding strategy could be as follows:
For each sentence, determine the number of possible word orders; let's call it the sentence capacity. For example, the example sentence above has a capacity of 6. Then take as many sentences as needed for the product of their capacities to be at least 27 (enough to encode 26 letters and a space). These sentences form a code group.
Next, assign each sentence of the code group a number by lexicographically ordering its possible variants and numbering them starting at zero. The example sentence is last in that order, so it gets the number 5.
Then calculate the value of the code group by multiplying the number of each sentence by the capacities of all following sentences, and adding it all together.
If the resulting value is zero, it is a space; if it is between 1 and 26, it denotes a letter; and if it is larger, the encoder made an error.
For example, consider the following text:
I'll have Pizza for lunch today. I bought it this morning. For dessert I plan to eat strawberries or cherries.
The first sentence has a capacity of 6, the second has a capacity of 2, and the third has a capacity of 4 ("for dessert" can be put at the beginning or the end, and the order of strawberries and cherries can be flipped without changing the meaning).
The product of the capacities is 6×2×4=48, clearly larger than 27, while the first two sentences alone only give 12, so the code group consists of those three sentences.
The first sentence has two other possible orders preceding it in lexicographical order (the two variants starting with "for lunch"), so it gets the value 2. The second sentence is first in the list of possible word orders, so it gets the value 0. And the third sentence is preceded lexicographically only by the variant with strawberries and cherries switched, so it gets the value 1.
Thus the value of the code group is 2×2×4 + 0×4 + 1 = 17, which corresponds to a Q.
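The arithmetic of a code group is a mixed-radix number, which can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up here, and the capacities and per-sentence numbers are supplied by hand, since deciding which word orders are grammatical is left to Bob and Cassandra.

```python
# 0 = space, 1..26 = A..Z, as in the scheme described above.
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def group_value(numbers, capacities):
    """Combine per-sentence numbers into one value (mixed-radix, Horner form).

    Equivalent to multiplying each number by the capacities of all
    following sentences and summing the results.
    """
    value = 0
    for number, capacity in zip(numbers, capacities):
        if not 0 <= number < capacity:
            raise ValueError("sentence number out of range for its capacity")
        value = value * capacity + number
    return value


def decode_group(numbers, capacities):
    """Return the character encoded by one code group."""
    value = group_value(numbers, capacities)
    if value >= len(ALPHABET):
        raise ValueError("value above 26: the encoder made an error")
    return ALPHABET[value]


def split_value(value, capacities):
    """Inverse direction: which variant number each sentence must use."""
    numbers = []
    for capacity in reversed(capacities):
        value, number = divmod(value, capacity)
        numbers.append(number)
    return numbers[::-1]


if __name__ == "__main__":
    # The worked example: capacities 6, 2, 4 and sentence numbers 2, 0, 1.
    print(group_value([2, 0, 1], [6, 2, 4]))   # 17
    print(decode_group([2, 0, 1], [6, 2, 4]))  # Q
```

`split_value` is what Bob would use when composing: for a Q (value 17) with capacities 6, 2 and 4 it returns the variant numbers 2, 0 and 1.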

Morse code and syllable length: use syllable length to encode a message in Morse code, with a long syllable for a dash and a short one for a dot. Is it easy to decode if intercepted? Yes, absolutely. But you didn't mention the possibility that anyone was listening in to find a hidden message, just that it needed to be encoded.
Without any training it will be slightly noticeable, to be sure, though an excuse like 'My cadence varies when I'm nervous' could help. With training, though, you'll be able to keep the dash and dot syllables only slightly different from natural syllables, which makes the scheme perfectly viable.
answered 8 hours ago by Halfthawed
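As a rough sketch of how this could be planned and read back, the Python below turns a message into a per-character list of long/short syllables and decodes measured syllable durations. Two things are assumptions made for illustration and not part of the answer: characters are separated by a pause (modelled here as separate groups), and a syllable longer than 0.25 seconds counts as a dash.

```python
# International Morse code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
REVERSE = {code: ch for ch, code in MORSE.items()}


def syllable_plan(message):
    """Return, per character, the syllable lengths Bob has to produce."""
    return [
        ["long" if mark == "-" else "short" for mark in MORSE[ch]]
        for ch in message.upper()
        if ch in MORSE  # characters without a code here (e.g. spaces) are skipped
    ]


def decode_durations(groups, threshold=0.25):
    """Decode groups of syllable durations in seconds, one group per character.

    The threshold is a hypothetical cut-off agreed on beforehand.
    """
    decoded = []
    for group in groups:
        code = "".join("-" if d > threshold else "." for d in group)
        decoded.append(REVERSE.get(code, "?"))
    return "".join(decoded)


if __name__ == "__main__":
    print(syllable_plan("dig"))
    print(decode_durations([[0.4, 0.1, 0.1], [0.1, 0.1], [0.4, 0.4, 0.1]]))
```

With training, the gap between "long" and "short" can shrink, which in this sketch just means agreeing on a tighter threshold.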

My first thought was that you can use principles of steganography here, but you've rejected the simpler patterns in the first bullet (there should not be any link to the hidden message within the words of Bob's utterance).
So, next option. Most people speak from the mouth, not nasally. You can have Bob speak regular words nasally. Choose any two types of words, for example monosyllables and disyllables. Based on this, you can convert these sounds into Morse code for English characters.
EDIT:
I realise now that you are only looking for a way to encode the information in a uniform manner, in which case even regular speech using the right monosyllabic and disyllabic words will suffice.
answered 8 hours ago by mu 無 (edited 7 hours ago)
Sorry, I did not understand how the nasal voice would matter in this context? – openend, 8 hours ago
@openend I think I misunderstood your question the first time and took it to mean using an acoustic property in addition to regular speech. So I wanted to suggest using both nasal and regular voice, with the nasal voice used to carry the Morse scheme. But after re-reading your question, I understand now that you are only looking for a way to encode the information in a uniform manner, in which case even regular speech using the right monosyllabic and disyllabic words will suffice. – mu 無, 8 hours ago
Breath and word count
The simplest form of encoding I can come up with is this:
Wether Bob speaks or sings, pay attention to when he breathes. Count the number of words between each breath. It will be a number between one and eight. (If not, it is meaningless noice, a filler)
By combining two such numbers the scheme allows for 64 characters, more than enough for A-Z, numbers and space.
Granted, this will be very hard to encode/decode on the fly, but given some minimal preparation Bob can easily disguise any message in speech or song.
answered 1 hour ago
Guran
4,253 reputation, 1 gold badge, 13 silver badges, 27 bronze badges
$begingroup$
Steganography is the encoding and decoding of information hidden in plain sight within pictures, audio files, whatever. It's much easier if you allow hardware in the mix - for example, encoding a message in the noise below the audible level, or adjusting the frequency of each tone by just enough that the difference can be measured but not heard.
But you want to be able to do this by just singing or talking. For singing, one possible encoding scheme for a very good singer would be vibrato. You could send numerically encoded messages by controlling how many 'beats' of vibrato you use for each phrase.
But you want to be able to use talking as well, and the encoding can't be in the words according to your criteria. So that leaves things like pitch, duration, volume, and non-word sounds like breath intakes, duration between words, 'umm's and 'awws', etc. Skip volume, as it's too hard to determine absolute volume and you probably want it to work for varying levels of background noise.
For more complexity add them together. For example, taking in a breath then saying, 'um, we should go' could mean something completely different than just saying 'we should go', which could be different than sighing then saying the same thing. It's not the 'we should go' that matters, it's the patterns of speech around the words.
So, 'breath intake + um + rising tone at end of sentence' (as in a question) means one thing. 'breath exhale + 2ums in sentence + flat tone' means something else. Make up as many different combinations as you need to encode all the information.
The nice thing about encoding your message in the 'metadata' of talking instead of words or syllable lengths or something tied to specific words is that you could make it work with any text. What matters is not the text itself, but how you say it or sing it. You could read the phone book this way and still get your message across.
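To make the "make up as many combinations as you need" part concrete, here is a minimal sketch of such a codebook. The three cue types and their possible values are purely hypothetical choices of mine; any set of cues the listener can tell apart reliably would do.

```python
from itertools import product

# Hypothetical cue inventory; pick whatever your speakers can reliably produce.
BREATH = ("none", "intake", "exhale")
UMS = (0, 1, 2)                  # number of "um"s in the sentence
TONE = ("flat", "rising")        # tone at the end of the sentence

# Each combination of cues becomes one code point: 3 * 3 * 2 = 18 per sentence.
CODEBOOK = {combo: i for i, combo in enumerate(product(BREATH, UMS, TONE))}
REVERSE = {i: combo for combo, i in CODEBOOK.items()}

def encode(value):
    """Cues the speaker should wrap around any sentence to send `value`."""
    return REVERSE[value]

def decode(breath, ums, tone):
    """Value the listener reads off the observed cues."""
    return CODEBOOK[(breath, ums, tone)]
```

With this table, 'breath intake + um + rising tone' is `decode("intake", 1, "rising")`, and the spoken words never enter into it. Adding one more cue type multiplies the number of code points per sentence.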
$endgroup$
answered 4 hours ago
Dan Hanson
76 reputation, 4 bronze badges
$begingroup$
You could hide the message in the word order wherever grammar allows it (obviously, the more freely the language lets you reorder words, the better this works).
For example, consider the sentence:
Today I'll have pizza for lunch.
You can move "today" and "for lunch" to many different positions:
I'll have pizza today for lunch.
I'll have pizza for lunch today.
For lunch I'll have pizza today.
For lunch today, I'll have pizza.
Today for lunch, I'll have pizza.
So you can encode a digit between 1 and 6 in the word order. Note that the hidden message is independent of the obvious message; the very same digit can be hidden in sentences like
Yesterday I watched Doctor Who after work.
Sometimes I go swimming in the morning.
Now clearly different sentences allow different numbers of possible word orders. But it should be possible to come up with a code that works with arbitrary sentences (except that sentences with a fixed word order won't be able to carry any information).
A possible coding strategy could be as follows:
For each sentence, determine the number of possible word orders; let's call it the sentence capacity. For example, the example sentence above would have a capacity of 6. Then take consecutive sentences until the product of their capacities is at least 27 (enough to encode 26 letters and a space). These sentences form a code group.
Next, assign each sentence of the code group a number by lexicographically ordering the possible sentences and numbering them starting at zero. The example sentence is last in the order, therefore it would get the number 5.
Then, calculate the value of the code group by multiplying the number of each sentence by the capacities of all following sentences, and then adding it all together.
If the resulting value is zero, it is a space; if it is between 1 and 26, it is the corresponding letter of the alphabet; and if it is larger, then the encoder made an error.
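The last two steps amount to reading the sentence numbers as digits of a mixed-radix number. A minimal sketch, with `group_value` and `to_symbol` being names I made up for illustration:

```python
def group_value(numbers, capacities):
    """Weight each sentence number by the capacities of all following
    sentences and add everything up (Horner-style evaluation)."""
    value = 0
    for number, capacity in zip(numbers, capacities):
        if not 0 <= number < capacity:
            raise ValueError("sentence number outside its capacity")
        value = value * capacity + number
    return value

def to_symbol(value):
    """Map a code group value to a space or a letter."""
    if value == 0:
        return " "
    if 1 <= value <= 26:
        return chr(ord("A") + value - 1)
    raise ValueError("value above 26: the encoder made an error")
```

For a hypothetical group with capacities 6, 2 and 4 and sentence numbers 0, 1 and 3, `group_value([0, 1, 3], [6, 2, 4])` is 0×2×4 + 1×4 + 3 = 7, which `to_symbol` turns into a G.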
For example, consider the following text:
I'll have pizza for lunch today. I bought it this morning. For dessert I plan to eat strawberries or cherries.
The first sentence has a capacity of 6, the second has a capacity of 2, the third has a capacity of 4 ("for dessert" can be put at the beginning or end, and also the order of strawberries and cherries can be flipped without changing the meaning).
The product of the capacities is 6×2×4=48, which is at least 27, whereas the first two sentences alone give only 12, so the code group consists of all three sentences.
The first sentence has two other possible orders preceding it in lexicographical order (the two variants starting with "for lunch"), so it gets the value 2. The second sentence is first in the list of possible word orders, so it gets the value 0. And the third sentence has only the one with strawberries and cherries switched preceding it lexicographically, thus it gets the value 1.
Thus the value of the code group is 2×2×4 + 0×4 + 1 = 17, which corresponds to a Q.
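Going the other way (from the letter you want to send to the word order to pick for each sentence) is repeated division by the capacities, starting from the last sentence. A small sketch under my own assumptions: the function names are illustrative, and I sort variants case-insensitively, which the scheme would have to fix by convention.

```python
def sentence_number(chosen, variants):
    """Zero-based position of the chosen word order among all variants,
    sorted lexicographically (case-insensitive: an assumed convention)."""
    return sorted(variants, key=str.lower).index(chosen)

def numbers_for(symbol, capacities):
    """Split a symbol's value into one sentence number per sentence."""
    value = 0 if symbol == " " else ord(symbol.upper()) - ord("A") + 1
    numbers = []
    for capacity in reversed(capacities):
        value, digit = divmod(value, capacity)
        numbers.append(digit)
    if value:
        raise ValueError("capacities too small for this symbol")
    return numbers[::-1]

PIZZA = [
    "Today I'll have pizza for lunch.",
    "I'll have pizza today for lunch.",
    "I'll have pizza for lunch today.",
    "For lunch I'll have pizza today.",
    "For lunch today, I'll have pizza.",
    "Today for lunch, I'll have pizza.",
]
```

Here `numbers_for("Q", [6, 2, 4])` returns `[2, 0, 1]`, and `sentence_number("I'll have pizza for lunch today.", PIZZA)` is 2, matching the first sentence of the example text.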
$endgroup$
answered 7 mins ago
celtschk
24.1k reputation, 12 gold badges, 78 silver badges, 143 bronze badges
$begingroup$
Number of syllables sounds nice! I will think it through. Care to expand it into an answer? I also expanded my constraints thanks to your comment.
$endgroup$
– openend
8 hours ago
$begingroup$
I had a similar idea once.
$endgroup$
– Renan
6 hours ago
$begingroup$
How do you determine "best answer" here? (Too interesting to shut down as idea generation...)
$endgroup$
– Guran
1 hour ago