Extract specific characters from each line
I have a text file and I want to extract the string that comes after "OS=" on each line.
Input file lines:
A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1
K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1
Desired output:
OS=Arundo donax
OS=Setaria italica
or:
Arundo donax
Setaria italica
awk perl cat
edited 8 hours ago by Kevdog777
asked 8 hours ago by shahzad
Are there always 2 words to print after OS=, or do you want all words between OS= and OX=?
– oliv
8 hours ago
I need only two words.
– shahzad
8 hours ago
3 Answers
Use GNU grep (or compatible) with an extended regex:
grep -Eo "OS=\w+ \w+" file
or with a basic regex (which does not know about the + quantifier, so * is used instead):
grep -o "OS=\w* \w*" file
Output:
OS=Arundo donax
OS=Setaria italica
edited 8 hours ago by Stéphane Chazelas
answered 8 hours ago by pLumo
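Since \w is a GNU extension and -o may not be available in every grep, the same two-word match can also be sketched in awk (one of the question's tags). This is only an illustration, not part of the answer above: the temporary file and the `species` variable are made up for the example, and the regex assumes the two words are separated by a single space, as in the sample input.

```shell
# Sample lines from the question, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1
K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1
EOF

# match() sets RSTART and RLENGTH for the first match on the line;
# skipping 3 characters drops the "OS=" prefix from the printed text.
species=$(awk 'match($0, /OS=[^ ]+ [^ ]+/) { print substr($0, RSTART + 3, RLENGTH - 3) }' "$tmp")
rm -f "$tmp"

printf '%s\n' "$species"
```

To keep the OS= prefix in the output, print substr($0, RSTART, RLENGTH) instead.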
Thank you. It worked great.
– shahzad
8 hours ago
How can I use this if I want to extract the string starting from the 2nd word up to OS=? That is, for the input above, the desired output is "Uncharacterized protein" and "ATP-dependent DNA helicase", each on a new line.
– shahzad
6 hours ago
In Perl, two non-whitespace "words":
$ perl -lne 'print $1 if /OS=(\S+ \S+)/' input
or everything up to OX=:
$ perl -lne 'print $1 if /OS=(.*?) OX=/' input
or everything up to the next something=:
$ perl -lne 'print $1 if /OS=(.*?) (\w+)=/' input
With your sample input, they all give the same output, but the output would differ with an input like this, for example:
ABC=something here OS=foo bar doo PE=3 OX=1234
answered 8 hours ago by ilkkachu
Thank you, it worked.
– shahzad
8 hours ago
A more robust way is to use sed to parse the full value until the word containing the next = is found. That way it works on a value of any length (e.g. a value with one word or three words).
sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//'
The first block grabs everything up to OS=. The second block, the capture group (denoted by \( and \)), matches up to the next = and can be referred to in the replacement as \1. The second substitution removes the last word, which is a fragment of the next assignment.
Note: the ^ inside [] negates the set, so [^=] matches any character that is not an = sign.
edited 7 hours ago
answered 8 hours ago by A.Danischewski
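As a quick check of the "any length" claim, the sketch below feeds the sed command a one-word and a three-word value. Both input lines and the variable names are hypothetical and were made up for this illustration.

```shell
# Hypothetical lines: a one-word and a three-word OS= value.
one=$(printf '%s\n' 'X1 Some protein OS=Zea OX=1 PE=1' |
  sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//')
three=$(printf '%s\n' 'X2 Other protein OS=Homo sapiens neanderthalensis OX=2 PE=1' |
  sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//')

printf '%s\n' "$one" "$three"
```

One caveat: the second substitution always removes the last word, so if OS= happens to be the last field on a line (no key= after it), the final word of the value itself is lost.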
How can I use this if I want to extract the string starting from the 2nd word up to OS=? That is, for the input above, the desired output is "Uncharacterized protein" and "ATP-dependent DNA helicase", each on a new line.
– shahzad
6 hours ago
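The thread does not answer this follow-up. One possible approach, offered only as a sketch, is two sed substitutions: the first drops the first word, the second drops everything from " OS=" onward. It assumes fields are separated by single spaces and that " OS=" does not occur inside the description; the `desc` variable is just for illustration.

```shell
desc=$(printf '%s\n' \
  'A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1' \
  'K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1' |
  sed 's/^[^ ]* //;s/ OS=.*//')

printf '%s\n' "$desc"
```

On a real file the same sed expression can be given the file name as its argument instead of reading from a pipe.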