PDB file downloading: pymol automation vs. manual

I automated PDB downloads with a PyMOL script (below):

python
pdb_lists = ['2I69', '2HG0']  # lots more PDB IDs
for x in pdb_lists:
    cmd.fetch(x)             # download the entry and load it as object x
    cmd.select(x)            # select the object (save below is given x directly)
    cmd.save(x + '.pdb', x)  # write object x to a file with the .pdb extension
    cmd.delete(x)            # remove the object before the next download
cmd.quit()
python end

Run within PyMOL, the script pulls down the collection of PDB files. However, the resulting files did not parse as expected with the Biopython PDB parser, whereas the same entry downloaded manually from RCSB.org parsed fine.

The manually downloaded PDB file was parsed correctly: the amino acids that cannot be observed by X-ray crystallography (e.g. because they are not sufficiently stationary) were recognised by the parser as absent, giving a protein of ~390 amino acids. The file from the PyMOL script, however, contained all amino acid residues, including those that could not be observed in the X-ray structure, giving a protein of ~440 residues. Any ideas why the script produced a PDB file that failed, and how to correct it?

asked 8 hours ago

Michael G.


In PyMOL 1.8+, fetch downloads mmCIF files by default. Your script must therefore be converting mmCIF to PDB when it saves, which is why the result differs from a PDB file downloaded directly from wwPDB.
– marcin
8 hours ago

python phylogenetics pdb pymol

1 Answer

To expand on the comment by marcin:

fetch downloads files in mmCIF format by default (https://pymolwiki.org/index.php/Fetch). Not all PDB entries have PDB-format files, e.g. because they contain too many chains. Presumably this is why the change was made, though mmCIF files tend to be larger and hence slower to download.

When save is called with the .pdb file extension, the structure is converted to PDB format. Perhaps the converter uses the SEQRES records to define the sequence, and hence gives bad files when some residues are not resolved.
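One way to check what a given PDB-format file actually contains, independent of any parser, is to compare the chain length declared in SEQRES with the number of residues that have ATOM coordinates. The following is a minimal sketch using only the standard library; the function names are made up for illustration, it reads the fixed-column fields of the PDB format directly, and it ignores HETATM records and all models after the first.

```python
def seqres_length(pdb_lines, chain_id):
    """Number of residues SEQRES declares for a chain (0 if there is no SEQRES)."""
    for line in pdb_lines:
        # chain ID is column 12, numRes is columns 14-17 of a SEQRES record
        if line.startswith("SEQRES") and line[11:12] == chain_id:
            return int(line[13:17])
    return 0


def observed_residue_count(pdb_lines, chain_id):
    """Number of distinct residues with ATOM coordinates in a chain."""
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only count the first model
        # chain ID is column 22, resSeq is columns 23-26, iCode is column 27
        if line.startswith("ATOM") and line[21:22] == chain_id:
            seen.add((line[22:26], line[26:27]))
    return len(seen)
```

Calling both functions on the lines of the manually downloaded file and of the PyMOL-saved file (read with, say, `open(path).readlines()`) should show where the two diverge. A file written by PyMOL may contain no SEQRES records at all, in which case the first function returns 0.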

To correct it you could try:

1. Saving as .cif and reading the file in with the Biopython mmCIF parser.
2. Providing the type=pdb argument to fetch, which forces PDB-format downloads. Some entries may not be available as PDB files.
3. Not doing it in PyMOL at all, and instead using the Biopython structure downloader (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc187).
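The three suggestions above could be sketched roughly as follows. This is an illustration under assumptions rather than a tested pipeline: the function names and the `out_dir` parameter are invented here, and error handling for entries that have no PDB-format file is left out. The imports sit inside the functions only so that each part can be used without the other package installed.

```python
import os


def parse_mmcif_with_biopython(structure_id, cif_path):
    """Read a file saved as .cif with Biopython's mmCIF parser."""
    from Bio.PDB import MMCIFParser  # requires Biopython

    return MMCIFParser(QUIET=True).get_structure(structure_id, cif_path)


def fetch_pdb_format_with_pymol(pdb_ids, out_dir="."):
    """Make PyMOL fetch PDB-format files instead of mmCIF, then save them."""
    from pymol import cmd  # requires PyMOL's Python module

    saved = []
    for pdb_id in pdb_ids:
        cmd.fetch(pdb_id, type="pdb")  # ask for PDB format explicitly
        path = os.path.join(out_dir, pdb_id + ".pdb")
        cmd.save(path, pdb_id)
        cmd.delete(pdb_id)
        saved.append(path)
    return saved


def fetch_pdb_format_with_biopython(pdb_ids, out_dir="."):
    """Skip PyMOL and download PDB-format files with Biopython's PDBList."""
    from Bio.PDB import PDBList  # requires Biopython

    downloader = PDBList()
    # retrieve_pdb_file returns the local path it downloads to
    return [
        downloader.retrieve_pdb_file(pdb_id, pdir=out_dir, file_format="pdb")
        for pdb_id in pdb_ids
    ]
```

For example, `fetch_pdb_format_with_biopython(['2I69', '2HG0'], out_dir='pdbs')` would download both entries from the question into a `pdbs` directory.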

Also, if download speed is the limiting factor, consider the MMTF file format, which is binary and hence smaller. Biopython can read in and write out MMTF files.

answered 5 hours ago

jgreener


Thanks, point 2 was exactly how I solved the issue (and it worked). Let me look at the Biopython solution. I looked at MMTF in detail, but it appeared to be aimed more at PDB->MMTF conversion.
– Michael G.
1 hour ago
