Extract specific characters from each line


I have a text file and I want to extract the string that follows "OS=" on each line.

Input file lines:

A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1
K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1

Desired output:

OS=Arundo donax
OS=Setaria italica

or

Arundo donax
Setaria italica

awk perl cat






edited 8 hours ago by Kevdog777
asked 8 hours ago by shahzad (new contributor)

  • Are there always 2 words to print after OS=, or do you want all words between OS= and OX=?

    – oliv
    8 hours ago

  • I need only two words.

    – shahzad
    8 hours ago

3 Answers

Use GNU grep (or compatible) with an extended regex:

grep -Eo "OS=\w+ \w+" file

or with a basic regex (standard BRE has no + quantifier, so * is used instead):

grep -o "OS=\w* \w*" file

Output:

OS=Arundo donax
OS=Setaria italica

edited 8 hours ago by Stéphane Chazelas
answered 8 hours ago by pLumo
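The question also accepts the output without the OS= prefix. As a minimal sketch of one way to get that from the grep answer above, the matches can be piped through cut, keeping what follows the =. The function name extract_os is hypothetical, and GNU grep (or a compatible grep with -o and \w) is assumed, as in the answer.

```shell
# Hypothetical helper: print the two words after "OS=" without the prefix.
# Assumes GNU grep (or compatible) for -o and \w, as in the answer above.
extract_os() {
    grep -Eo 'OS=\w+ \w+' | cut -d= -f2
}

# Sample lines from the question, fed on standard input.
printf '%s\n' \
    'A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1' \
    'K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1' |
    extract_os
# prints:
# Arundo donax
# Setaria italica
```

Since \w only matches letters, digits and underscores, a name whose first two words contain other characters (a hyphen, for instance) would be cut short or missed entirely by this pattern.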















  • Thank you. It worked great.

    – shahzad
    8 hours ago

  • How can I use this if I want to extract the string starting from the 2nd word up to OS=? I.e., for the input above, the desired output is "Uncharacterized protein" and "ATP-dependent DNA helicase", each on a new line.

    – shahzad
    6 hours ago

In Perl, two non-whitespace "words":

$ perl -lne 'print $1 if /OS=(\S+ \S+)/' input

or everything up to OX=:

$ perl -lne 'print $1 if /OS=(.*?) OX=/' input

or everything up to the next something=:

$ perl -lne 'print $1 if /OS=(.*?) (\w+)=/' input

With your sample input, they all give the same output, but the output would be different with e.g. an input like this:

ABC=something here OS=foo bar doo PE=3 OX=1234

answered 8 hours ago by ilkkachu















  • Thank you, it worked.

    – shahzad
    8 hours ago

A more robust way is to use sed to capture the full value, up to the word containing the next =. That way it works for a value of any length (e.g. a value of one word or three words).

sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//'

The first part, .*OS=, matches everything up to and including OS=. The capture group (denoted by \( and \)) then matches up to the next = and can be referred to in the replacement as \1. The second substitution removes the last word, which is a fragment of the next assignment.

Note: the ^ inside [] negates the set, so [^=] matches any character that is not an = sign.

edited 7 hours ago
answered 8 hours ago by A.Danischewski
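The two substitutions in the sed answer above can be followed step by step. The sketch below splits them into hypothetical helper functions (os_step1, os_value) and adds a guarded variant, os_value_strict, which is an addition for illustration and not part of the answer.

```shell
# Hypothetical helpers that split the sed command from the answer into steps.

# Step 1 only: keep what lies between "OS=" and the next "=".
os_step1() { sed 's/.*OS=\([^=]*\).*/\1/'; }

# Steps 1 and 2: also drop the trailing word left over from the next key.
os_value() { sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//'; }

# Variant (an addition, not from the answer): only print lines containing "OS=".
os_value_strict() { sed -n '/OS=/{s/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//;p;}'; }

line='A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1'

printf '%s\n' "$line" | os_step1   # prints: Arundo donax OX
printf '%s\n' "$line" | os_value   # prints: Arundo donax
```

Without the /OS=/ guard, a line that lacks OS= would still pass through the second substitution and be printed minus its last word, which is why the guarded variant may be preferable on mixed input.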















  • How can I use this if I want to extract the string starting from the 2nd word up to OS=? I.e., for the input above, the desired output is "Uncharacterized protein" and "ATP-dependent DNA helicase", each on a new line.

    – shahzad
    6 hours ago
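For the follow-up in the comment above (the text from the second word up to OS=), a rough sketch along the same lines as the sed answer might look like this. It is an illustration rather than something posted in the thread; the function name extract_description is made up, and it assumes the ID is the first space-delimited word and that fields are separated by single spaces.

```shell
# Hypothetical helper: print the text between the first word and " OS=".
# Assumes single spaces between fields; lines without " OS=" are skipped.
extract_description() {
    sed -n 's/^[^ ]* \(.*\) OS=.*/\1/p'
}

printf '%s\n' \
    'A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1' \
    'K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1' |
    extract_description
# prints:
# Uncharacterized protein
# ATP-dependent DNA helicase
```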

















