PDB file downloading: pymol automation vs. manual


I automated a PDB download using a PyMOL script (below):

    python
    pdb_lists = ['2I69', '2HG0']  # lots more pdbs
    for x in pdb_lists:
        cmd.fetch(x)
        cmd.select(x)
        cmd.save(x + '.pdb', x)
        cmd.delete(x)
    cmd.quit()
    python end

When the script is run within PyMOL it pulls down a collection of PDB files. However, these files gave incorrect results when I ran them through the Biopython PDB parser, whereas the same entries downloaded manually from RCSB.org parsed fine.

The manually downloaded PDB file was correct: the amino acids that cannot be observed by X-ray crystallography (e.g. because they are not sufficiently stationary) were recognised by the parser as absent, giving a protein of ~390 amino acids. The PyMOL-scripted download, however, contained all amino acid residues, i.e. even those that could not be observed in the X-ray structure, giving a protein of ~440 residues. Any ideas why the script produced a faulty PDB file, and how to correct it?







python phylogenetics pdb pymol
















asked 8 hours ago by Michael G.










  • In PyMOL 1.8+, fetch downloads mmCIF files by default. Your script therefore has to convert mmCIF to PDB when saving, which is why the result differs from a PDB file downloaded directly from wwPDB. – marcin, 8 hours ago






















1 Answer
To expand on the comment by marcin:

  1. fetch downloads files in mmCIF format by default (https://pymolwiki.org/index.php/Fetch). Not all PDB entries have PDB format files, e.g. because they have too many chains. Presumably this is why the default was changed, though mmCIF files tend to be larger and hence slower to download.

  2. When save is called with the .pdb file extension, the structure is converted to PDB format. Perhaps the converter uses the SEQRES records to define the sequence, and hence gives bad files when some residues are not resolved.
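The SEQRES explanation in point 2 is only a guess. One way to check what a given file contains is to compare the residue count declared in its SEQRES records with the number of residues that actually have ATOM coordinates. The following is a minimal sketch, assuming standard fixed-column PDB records; the function names are hypothetical and not part of PyMOL or Biopython.

```python
from collections import Counter


def count_seqres_residues(pdb_text):
    """Return {chain: residue count} as declared in the SEQRES records."""
    counts = {}
    for line in pdb_text.splitlines():
        if line.startswith("SEQRES"):
            chain = line[11]                  # column 12: chain identifier
            counts[chain] = int(line[13:17])  # columns 14-17: residues in chain
    return counts


def count_observed_residues(pdb_text):
    """Return {chain: residue count} for residues that have ATOM records."""
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # only look at the first model
        if line.startswith("ATOM"):
            chain = line[21]      # column 22: chain identifier
            res_id = line[22:27]  # columns 23-27: residue number + insertion code
            seen.add((chain, res_id))
    return dict(Counter(chain for chain, _ in seen))
```

For a file downloaded manually from RCSB, the SEQRES count should correspond to the full construct (here ~440) and the ATOM count to the resolved residues (~390). Running `count_observed_residues` on the PyMOL-saved file shows whether the extra residues really appear as ATOM records.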

To correct it you could try:

  1. Saving as .cif and reading the file in with the Biopython mmCIF parser.

  2. Passing the type=pdb argument to fetch, which forces PDB format downloads. Some PDB files may not be available.

  3. Skipping PyMOL altogether and using the Biopython structure downloader instead (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc187).
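As an illustration of option 2, the loop from the question could be adapted roughly as follows. This is a sketch rather than a tested PyMOL recipe: the helper name `fetch_and_save_as_pdb` is hypothetical, and PyMOL's `cmd` module is passed in as an argument so the function can be defined outside PyMOL.

```python
def fetch_and_save_as_pdb(pdb_ids, cmd):
    """Fetch each entry in PDB format via PyMOL and save it as <ID>.pdb.

    `cmd` is PyMOL's command module (`from pymol import cmd`); it is passed
    in so that this hypothetical helper has no import-time dependency.
    """
    saved = []
    for pdb_id in pdb_ids:
        # type="pdb" asks fetch for the PDB-format file instead of mmCIF.
        cmd.fetch(pdb_id, type="pdb")
        out_name = pdb_id + ".pdb"
        cmd.save(out_name, pdb_id)
        cmd.delete(pdb_id)  # free the object before fetching the next entry
        saved.append(out_name)
    return saved
```

Inside PyMOL this would be called as `fetch_and_save_as_pdb(['2I69', '2HG0'], cmd)`. Entries that have no PDB-format file will still fail to fetch, as noted in option 2.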

Also, if download speed is the limiting factor, consider the MMTF file format, which is binary and hence smaller. Biopython can read in and write out MMTF files.
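If the aim is simply to reproduce the manual RCSB download in a loop, the files can also be requested directly with the Python standard library, without PyMOL or Biopython. A minimal sketch follows; the `files.rcsb.org/download/<ID>.pdb` URL pattern is an assumption about RCSB's file server, and both function names are hypothetical.

```python
import urllib.request
from pathlib import Path

# Assumed URL pattern for PDB-format files on the RCSB file server.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def rcsb_pdb_url(pdb_id):
    """Build the download URL for one PDB entry (ID is upper-cased)."""
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id.upper())


def download_pdb(pdb_id, out_dir="."):
    """Download one PDB-format file and return the path it was written to.

    Raises urllib.error.HTTPError if the entry has no PDB-format file.
    """
    out_path = Path(out_dir) / (pdb_id.upper() + ".pdb")
    with urllib.request.urlopen(rcsb_pdb_url(pdb_id)) as response:
        out_path.write_bytes(response.read())
    return out_path
```

Looping `download_pdb` over the list of IDs from the question should give the same unmodified files as the manual download, which the Biopython PDB parser handled correctly.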

















answered 5 hours ago by jgreener














  • Thanks, point 2 was exactly how I solved the issue (and it worked). Let me look at the Biopython solution. I looked in detail at MMTF, but it appeared to be aimed more at PDB->MMTF conversion. – Michael G., 1 hour ago

































