How to proceed when the baseline (state-of-the-art) published results claim much better performance than I can reproduce?
I am a graduate student; to finish my degree I need to build methods that outperform what is already out there. An issue I came across is that two papers reported results far better (more than 20% better) than what my reimplementation produced. This could be due to one of two reasons:
I missed something in the implementation. That is what I have been telling myself: for months I tried all possible combinations and paths. One of the methods is straightforward, yet I still could not reach their claimed performance.
I contacted the corresponding authors and no one replied, so I tried to contact the other authors.
For the first paper, one author replied and sent me the code, telling me to keep all details "confidential". Well, it turns out they are not using the data they claim in the paper, so of course their results differ from my reimplementation. And my implementation was correct.
An author of the second paper also replied. They didn't send me the code because they say it is easy to implement, but they confirmed that what I did is correct. Still, I couldn't understand why there is such a difference.
Both papers are published in journals with an impact factor below 2. Their web servers are not working.
They are not honest.
Now I am stuck: my method does outperform my reimplementations of their methods, but not their claimed results. About the first paper I can't say anything because "it is confidential"; about the second paper I can only confirm that I implemented their method correctly, for the most part (based on my chat with the authors).
I know that I probably cannot publish this part of my work, because who is going to believe a young scientist who is just starting out?
But I am not sure how the committee is going to believe me. What can I say or do? Please help me.
reproducible-research
16
What is the measure of “performance” you are referring to, and could its definition be the source of this 20% difference? As a software developer I’m used to seeing such performance gains (in runtime) just from small but significant optimization tweaks that are very platform-dependent.
– mvds
Oct 15 at 7:11
4
I can't really dispute that. But it's still proving old theories wrong that makes progress. Like when we figured out Newton's force/acceleration ideas really are insufficient when approaching c. What I'm trying to say is, don't be afraid to disagree.
– Gloweye
Oct 15 at 9:05
10
Before assuming the worst (academic misconduct), you could take into account that there may be subtle differences (optimiser? better data structures?) that caused the 20% gain. I once had a student code something, and, for fun, coded it myself. My code was a factor 1000 faster for a more complicated case. It was not just the language, but the way I coded it that gained this speedup. Of course, if their code, with the original data, does not run that fast for you, then there may be something wrong. Again, it may be an honest mistake rather than misconduct. Report your results as a side remark.
– Captain Emacs
Oct 15 at 11:20
5
3rd option: they made an honest mistake.
– daisy
Oct 15 at 14:34
5
"it turns out they they are not using the data they claim in their the paper"
Can we get some more details about this (or more emphasis of that particular part of the question)? The answers I've seen so far seem to assume that the differences might be due to an honest mistake, but If they published results on one dataset and claimed they were for another, that needs to be reported to the editor.
– Ray
Oct 16 at 13:23
edited Oct 17 at 5:40 by cag51♦
asked Oct 14 at 14:21 by Monii_80
6 Answers
There is absolutely no reason that you can't publish a paper that says: "We compared our method to methods X and Y. Since the original code was not available for X and Y, we reimplemented the methods to the best of our ability. The code for these reimplementations is available in supplementary files A and B. Our new method outperformed the reimplementations of X and Y by z%. However, it should be noted that it was not possible to reproduce the reported results for X and Y."
People who want to know will have to look at your re-implementations and decide for themselves whether they think you have correctly re-implemented the methods.
Seniority has nothing to do with it - be transparent, and the world will judge if they believe you or the people that won't release their code.
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
Honestly, I would also be tempted to write a short, slightly passive-aggressive statement to the effect that it was impossible to obtain the original implementations from the authors. This kind of thing should be frowned upon much more than it currently is.
– xLeitix
Oct 15 at 10:38
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
People can be dishonest. They can also make honest mistakes and publish bad science. Don't assume that it is you who has an inferior result. And don't assume that a doctoral committee won't believe you. If they are competent to judge you without the earlier results they should be competent to understand what you have done.
However, I have two suggestions. The first is to walk through what you have done with your advisor and/or another faculty member who is most competent to understand your work. You may, indeed, have the best results. If you can get support there, then the larger committee should be no problem. I don't think that you need to hide from your committee members the communication you received. It may be necessary to explain why you can't believe the reported results from the other paper. I don't think that "confidentiality" really applies here.
The second suggestion is a bit harder: see if you can figure out exactly where the other group failed to match their methods to their results. If you can do that, then you have much stronger evidence for your own work.
The evidence you mention here seems pretty strong to me (an outsider) that the other paper has a problem. There is no reason not to contradict it if it is incorrect, for whatever reason.
There is also another case: they found something which they interpreted as a "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. It turned out their model was not good enough, the "glitch" was a hint about that, and someone else built on their work to expand the model. They were not very happy with the outcome, but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
to finish my degree I need to build methods that outperform what is already out there
No, that is not true. You need to deliver a piece of proper scientific work and advance knowledge, and that does not depend on what direction your findings point.
Of course, things are easier and more pleasant if your implementation is better. But the actual scientific part of your thesis is to study both the old approach and yours scientifically and then conclude whether one is better (and possibly in which situations).
The difficulty in your situation is to prove that the discrepancy with the literature is not due to your incompetence or lack of hard work (=> you would deserve a bad mark) but actually due to "nature" not being as the previous papers supposed it to be.
What you can and should report is:
- that you were not able to reproduce the findings in papers 1 + 2,
- that in consequence you have been in communication with the authors,
- and, importantly, that your implementation has been confirmed as correct by private communication with the authors of paper 2, and by comparison with the (confidential) code you received from the authors of paper 1, again by private communication for that purpose.
If
Well, it turns out they are not using the data they claim in the paper, so of course their results differ from my reimplementation.
means that you got the data set they actually used and got the same results with that, then you can also report that for a related data set, the same results were obtained.
If not, it may be possible to kindly ask the authors of paper 1 + 2 whether they'd run a data set you send them and give you the results of their implementations so you can compare that to your results. You can then report (hopefully) that equal results were obtained on a different data set and thank the authors of those papers for running your data.
The last two points should make amply clear that the discrepancy is not due to a fault in your implementation - which is what counts for your thesis.
As a personal side note, I got top grade on my Diplom (≈ Master) thesis which (among other findings) found that the software implementation I was using did not work as it was supposed to. I was able to point out a plausible and probable reason for that bug (which may have been a leftover debugging "feature") - which is much harder for you as you don't have access to a running instance of their software that you can test (= study) to form and confirm or dismiss hypotheses about its behaviour.
As an addition to what @Buffy explained already about the possibility of honest mistakes in published papers:
As scientists we tend to work at the edge of what is known. Which also means that we're inherently running a high risk of not (yet) knowing/having realized important conditions and limitations of what we are doing.
We thus also run a comparatively high risk that tentative generalizations we consider may turn out to be not all that general after all. Or that we may be plain wrong and realize this only later (or not at all). I believe it is very hard for humans to be completely aware of the limitations of the conclusions we draw - possibly/probably because our brains are "hardwired" to overfit. (Which also puts us into a bad starting position for avoiding overfitting in e.g. machine learning models we build)
The take-home message from this is that we need to be careful also when reading published papers: we need to keep in mind the possibility that the paper is wrong, contains honest mistakes, or is not as directly applicable to our task at hand as we believed at first glance.
I missed something during the implementation.
I experienced something similar once when I was also implementing a reference method from the literature (related but different field). It turned out that different defaults in the preprocessing of the data caused the difference, but I found this only after I had the bright idea of trying to omit a preprocessing step. The model doesn't make much sense physically without that step, but the paper didn't mention it (neither do many papers in my field that do use the step, because it is considered necessary for physical reasons).
They are not honest.
While that is of course possible, I've seen enough honest mistakes to apply Hanlon's razor (which I first met as Murphy's razor) and not assume dishonesty or misconduct unless there are extremely strong indications for that.
Proving superiority may in any case be impossible due to limitations in the old paper.
E.g., if they report validation results based on a small number of cases, the uncertainty on those results may be so large (it cannot be excluded that the method is better than it seemed) that truly improved methods later on will not be able to demonstrate their superiority in a statistically sound manner.
Still, such a shortcoming of the old paper does not limit the scientific content or advance of your work.
6
Too bad I can give only a single +1. Publishing negative results, i.e. showing that something does not work, is very important, as it saves other scientists from going down a path that leads nowhere.
– Dohn Joe
Oct 15 at 12:21
3
Citation of Hazon's razor relevant and valued: Never attribute to malice that which is adequately explained by stupidity
– XavierStuvw
Oct 15 at 15:58
2
@XavierStuvw you mean Hanlon's razor?
– slebetman
Oct 17 at 1:03
@slebetman Thanks, I did as per the link target, but it was too late to edit when I realized that my wires were crossed. This note lends the possibility to add a corollary or identify a premise: malice and stupidity can be equally obnoxious
– XavierStuvw
Oct 17 at 8:09
You can write that you used your implementation of the competing method for your results, and that you were not able to reproduce the published results. Make your code available so people can check.
It seems that the authors of the other papers didn't publish their code, so nobody can say you should've used that.
In the first place, you should discuss this with your supervisors. The code for papers is often rushed and unfinished, and what works on one machine may not work on another for a number of reasons. The most reasonable way is to let your supervisors know that you implemented both methods, communicated with the original authors (mention only non-confidential things / say some things are confidential / ask the authors for permission to discuss the implementation with your supervisor), and yet you did not reach the claimed performance. In their senior academic capacity, they are better equipped to decide what to do with regard to the politics of the department/field/research teams, are bound to get quicker and more elaborate responses from the authors of the papers, and can handle potential fallout should anything go wrong in the process. I would not advise pursuing this matter on your own, and surely, if you have doubts about something this important to your project, it is reasonable to seek their advice, and they will understand that.
1
"ask authors for permission to discuss the implementation with your supervisor" 'confidential' rather means that the code you should be published.
– FooBar
Oct 15 at 7:05
1
I think that @jericho-jones' answer is the most appropriate one. Also, take this as a lesson. Use good coding practices. Check your work into a source code repository. Tag your working code so you can easily back out changes. Make everything reproducible, including processing any raw data. Make the raw data read-only using file permissions, deny database update/insert/delete statements, etc. Make sure your code builds every day, or every other day. Make sure a change builds and passes all tests, and reproduces already known results before checking it into your source code repository.
– Daisuke Aramaki
Oct 16 at 13:06
In addition to the other answers, you should consider publishing your re-implementation. Then any reviewers can check if they think your results are plausible or if they spot a flaw in your re-implementation.
In the first case, it is right to say "We implemented paper X, but could not reproduce the claimed efficiency" and in the second case the flaw found by the reviewer may help you to improve your re-implementation, so you achieve a similar result.
Most reviewers will not debug your code, but you made your best effort to allow anyone to verify your claim of lower efficiency, and at least your paper is as honest as possible.
If the algorithm is interesting, publishing an open source version may get some users, that point out issues with your code (or contribute improvements) as well. But make sure not to be too close to the confidential code, as the original authors may claim copyright infringement.
You may use clean-room reverse engineering with another person, or at least do it yourself: use the given code only to write down the parts missing from the paper, then reimplement from that documentation and not from the code.
6 Answers
6
active
oldest
votes
6 Answers
6
active
oldest
votes
active
oldest
votes
active
oldest
votes
There is absolutely no reason that you can't publish a paper that says "We compared our method to methods X and Y. Since code the original code was not available for X and Y, we reimplemented the methods to the best of our ability. The code for these reimplementations is available in supplementary files A and B. Our new method out performed the reimplementations of X and Y by z%. However, it should be noted that it was not possible to reproduce the reported results for X and Y. "
People who want to know will have to look at your re-implementations and decide themselves if they think you have correctly re-implemented.
Seniority has nothing to do with it - be transparent, and the world will judge if they believe you or the people that won't release their code.
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
|
show 5 more comments
There is absolutely no reason that you can't publish a paper that says "We compared our method to methods X and Y. Since code the original code was not available for X and Y, we reimplemented the methods to the best of our ability. The code for these reimplementations is available in supplementary files A and B. Our new method out performed the reimplementations of X and Y by z%. However, it should be noted that it was not possible to reproduce the reported results for X and Y. "
People who want to know will have to look at your re-implementations and decide themselves if they think you have correctly re-implemented.
Seniority has nothing to do with it - be transparent, and the world will judge if they believe you or the people that won't release their code.
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
|
show 5 more comments
There is absolutely no reason that you can't publish a paper that says "We compared our method to methods X and Y. Since code the original code was not available for X and Y, we reimplemented the methods to the best of our ability. The code for these reimplementations is available in supplementary files A and B. Our new method out performed the reimplementations of X and Y by z%. However, it should be noted that it was not possible to reproduce the reported results for X and Y. "
People who want to know will have to look at your re-implementations and decide themselves if they think you have correctly re-implemented.
Seniority has nothing to do with it - be transparent, and the world will judge if they believe you or the people that won't release their code.
There is absolutely no reason that you can't publish a paper that says "We compared our method to methods X and Y. Since code the original code was not available for X and Y, we reimplemented the methods to the best of our ability. The code for these reimplementations is available in supplementary files A and B. Our new method out performed the reimplementations of X and Y by z%. However, it should be noted that it was not possible to reproduce the reported results for X and Y. "
People who want to know will have to look at your re-implementations and decide themselves if they think you have correctly re-implemented.
Seniority has nothing to do with it - be transparent, and the world will judge if they believe you or the people that won't release their code.
answered Oct 14 at 16:39
Ian SudberyIan Sudbery
7,5591 gold badge18 silver badges25 bronze badges
7,5591 gold badge18 silver badges25 bronze badges
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
|
show 5 more comments
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
94
94
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
So much this. We need to encourage reproduction attempts in science. If OP made a mistake, then hopefully someone will find it and (kindly) point it out.
– Kathy
Oct 14 at 18:05
14
14
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
Honestly, I would also be tempted to write a short, slightly passive-agressive, statement to the extent that it was impossible to extract the original implementations from the authors. This kind of thing should be frowned upon much more than what it currently is.
– xLeitix
Oct 15 at 10:38
5
5
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
In an ideal world, this. But do check with your supervisor/colleagues/... to find out the nature of the beast: If one or more of them are relatively understated but vindictive people with actual power on the small sub-field you're specialized in, this can be bad (as they and their friends will review both your papers and applications, and reject). If they're vindictive cartoon villains, you'll reap more sympathy than rejections from journal editors etc. If they're scientists, they'll respect you more for standing up (with facts, not principles).
– user3445853
Oct 15 at 10:44
20
20
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
Indeed, make your own research fully reproducible and submit it. Make sure to inform the editor of the journal you publish in about whether you want to exclude the authors of the works in question from being referees. While they seem to be helpful now, they might bias the reviewing process. They might well not, of course, but it is better to stay on the safe side and aim for affiliations that are genuine third parties. Peer reviewing is not a perfect system, but it is the least imperfect, always perfectible and the dominant one anyhow. Your community will value your contribution.
– XavierStuvw
Oct 15 at 11:00
4
4
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
What I fear is that the paper is rejected because it lacks the "novelty factor".
– thermomagnetic condensed boson
Oct 15 at 20:20
|
show 5 more comments
People can be dishonest. They can also make honest mistakes and publish bad science. Don't assume that it is you who has an inferior result. And don't assume that a doctoral committee won't believe you. If they are competent to judge you without the earlier results they should be competent to understand what you have done.
However, I have two suggestions. The first is to walk through what you have done with your advisor and/or another faculty member who is most competent to understand your work. You may, indeed, have the best results. If you can get support there, then the larger committee should be no problem. I don't think that you need to hide the communication you got from your committee members. It may be necessary to explain why you can't believe the reported results from the other paper. I don't think that "confidentially" really applies here.
But the other is a bit harder. See if you can figure out exactly where the other group failed to match their methods to their results. If you can do that, then you have much stronger evidence for your own work.
The evidence you mention here seems pretty strong to me (an outsider) that the other paper has a problem. There is no reason not to contradict it if it is incorrect, for whatever reason.
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
add a comment
|
People can be dishonest. They can also make honest mistakes and publish bad science. Don't assume that it is you who has an inferior result. And don't assume that a doctoral committee won't believe you. If they are competent to judge you without the earlier results they should be competent to understand what you have done.
However, I have two suggestions. The first is to walk through what you have done with your advisor and/or another faculty member who is most competent to understand your work. You may, indeed, have the best results. If you can get support there, then the larger committee should be no problem. I don't think that you need to hide the communication you got from your committee members. It may be necessary to explain why you can't believe the reported results from the other paper. I don't think that "confidentially" really applies here.
But the other is a bit harder. See if you can figure out exactly where the other group failed to match their methods to their results. If you can do that, then you have much stronger evidence for your own work.
The evidence you mention here seems pretty strong to me (an outsider) that the other paper has a problem. There is no reason not to contradict it if it is incorrect, for whatever reason.
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
add a comment
|
People can be dishonest. They can also make honest mistakes and publish bad science. Don't assume that it is you who has an inferior result. And don't assume that a doctoral committee won't believe you. If they are competent to judge you without the earlier results they should be competent to understand what you have done.
However, I have two suggestions. The first is to walk through what you have done with your advisor and/or another faculty member who is most competent to understand your work. You may, indeed, have the best results. If you can get support there, then the larger committee should be no problem. I don't think that you need to hide the communication you got from your committee members. It may be necessary to explain why you can't believe the reported results from the other paper. I don't think that "confidentially" really applies here.
But the other is a bit harder. See if you can figure out exactly where the other group failed to match their methods to their results. If you can do that, then you have much stronger evidence for your own work.
The evidence you mention here seems pretty strong to me (an outsider) that the other paper has a problem. There is no reason not to contradict it if it is incorrect, for whatever reason.
People can be dishonest. They can also make honest mistakes and publish bad science. Don't assume that it is you who has an inferior result. And don't assume that a doctoral committee won't believe you. If they are competent to judge you without the earlier results they should be competent to understand what you have done.
However, I have two suggestions. The first is to walk through what you have done with your advisor and/or another faculty member who is most competent to understand your work. You may, indeed, have the best results. If you can get support there, then the larger committee should be no problem. I don't think that you need to hide the communication you got from your committee members. It may be necessary to explain why you can't believe the reported results from the other paper. I don't think that "confidentially" really applies here.
But the other is a bit harder. See if you can figure out exactly where the other group failed to match their methods to their results. If you can do that, then you have much stronger evidence for your own work.
The evidence you mention here seems pretty strong to me (an outsider) that the other paper has a problem. There is no reason not to contradict it if it is incorrect, for whatever reason.
answered Oct 14 at 14:51
BuffyBuffy
91.8k23 gold badges283 silver badges394 bronze badges
91.8k23 gold badges283 silver badges394 bronze badges
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
add a comment
|
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
There is alo another case: they found something which they interpreted as "numerical mistake" and dismissed it -- to get back to traditional results. They were not experts in numerical methods and honestly thought that what they saw was a glitch in the computation. Turned out their model was not good enough, the "glitch" was a hint about that and someone else built on their work to expand the model. They were not very happy with the outcome but were credited for the unfortunate smoothing of the result which led to the discovery. Happened in 1995 :)
– WoJ
Oct 17 at 12:29
add a comment
|
to finish my degree I need to build methods outperform what is already there
No, that is not true. You need to deliver a piece of proper scientific work and advance knowledge and that does not depend on what direction your findings point.
Of course, things are easier and more pleasant if your implementation is better. But the actual scientific part of your thesis is to study both the old and your approach scientifically and then conclude whether one is better (and possibly in which situations).
The difficulty in your situation is to proove that the discrepancy to literature is not due to your incompetence or lack of hard work (=> you deserve a bad mark) but actually due to "nature" not being as it was supposed to be by the previous paper.
What you can and should report is
- that you were not able to reproduce the findings in papers 1 + 2,
- that, in consequence, you have been in communication with the authors,
- importantly, that your implementation has been confirmed as correct by private communication with the authors of paper 2, and by comparison with (confidential) code you received from the authors of paper 1, again by private communication for that purpose.
If
Well, it turns out they are not using the data they claim in their paper, of course their results are different than my reimplementation.
means that you got the data set they actually used and obtained the same results with it, then you can also report that the same results were obtained on a related data set.
If not, it may be possible to kindly ask the authors of paper 1 + 2 whether they'd run a data set you send them and give you the results of their implementations so you can compare that to your results. You can then report (hopefully) that equal results were obtained on a different data set and thank the authors of those papers for running your data.
The last two points should make amply clear that the discrepancy is not due to a fault in your implementation - which is what counts for your thesis.
As a personal side note, I got top grade on my Diplom (≈ Master) thesis which (among other findings) found that the software implementation I was using did not work as it was supposed to. I was able to point out a plausible and probable reason for that bug (which may have been a leftover debugging "feature") - which is much harder for you as you don't have access to a running instance of their software that you can test (= study) to form and confirm or dismiss hypotheses about its behaviour.
As an addition to what @Buffy explained already about the possibility of honest mistakes in published papers:
As scientists we tend to work at the edge of what is known. Which also means that we're inherently running a high risk of not (yet) knowing/having realized important conditions and limitations of what we are doing.
We thus also run a comparatively high risk that tentative generalizations we consider may turn out to be not all that general after all. Or that we may be plain wrong and realize this only later (or not at all). I believe it is very hard for humans to be completely aware of the limitations of the conclusions we draw - possibly/probably because our brains are "hardwired" to overfit. (Which also puts us into a bad starting position for avoiding overfitting in e.g. machine learning models we build)
The take-home message from this is that we need to be careful also when reading published papers: we need to keep in mind the possibility that the paper is wrong, contains honest mistakes, or is not as directly applicable to our task at hand as we believed at first glance.
I missed something during the implementation.
I experienced something similar once when I was also implementing a reference method from the literature (a related but different field). It turned out that different defaults in the preprocessing of the data caused the difference - but I only found this after I had the bright idea of trying to omit a preprocessing step, even though the model doesn't make much sense physically without that step. The paper didn't mention any such step (neither do many papers in my field that do use it, because it is considered necessary on physical grounds).
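To make that concrete, here is a minimal, generic sketch (not the model or data from this answer; the synthetic data and the standardization step are invented for illustration) of how much a single preprocessing default can move a reported score:

    # A generic sketch: the same classifier evaluated with and without one
    # preprocessing step (standardization). Synthetic data; the size of the
    # gap depends entirely on the data and model, but it is often substantial.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X[:, :5] *= 1000.0  # put some features on a very different scale, as in raw data

    with_scaling = make_pipeline(StandardScaler(), SVC())
    without_scaling = make_pipeline(SVC())

    print("with standardization:   ", cross_val_score(with_scaling, X, y, cv=5).mean())
    print("without standardization:", cross_val_score(without_scaling, X, y, cv=5).mean())

Whether the paper's reported number was produced with or without such a step is exactly the kind of detail that often goes unstated.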
- They are not honest.
While that is of course possible, I've seen enough honest mistakes to apply Hanlon's razor (which I first met as Murphy's razor) and not assume dishonesty or misconduct unless there are extremely strong indications of it.
Proving superiority may in any case be impossible due to limitations of the old paper.
E.g., if they report validation results based on a small number of cases, the uncertainty on those results may be so large that it cannot be excluded that the method is better than it seemed, and that truly improved methods later on will not be able to demonstrate their superiority in a statistically sound manner.
Still, such a shortcoming of the old paper does not limit the scientific content or advance of your work.
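As a rough illustration of this point (the numbers are invented), a simple Wilson confidence interval shows how wide the uncertainty on a reported accuracy is when it rests on only a few dozen validation cases:

    # Uncertainty of an accuracy estimate from a small validation set.
    # 26/30 correct reads as "87% accuracy", but the 95% Wilson interval is so
    # wide that a genuinely better method may be unable to beat it soundly.
    from math import sqrt

    def wilson_interval(successes, n, z=1.96):
        """Approximate 95% Wilson score interval for a binomial proportion."""
        p = successes / n
        centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
        halfwidth = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
        return centre - halfwidth, centre + halfwidth

    print(wilson_interval(26, 30))    # roughly (0.70, 0.95)
    print(wilson_interval(260, 300))  # roughly (0.82, 0.90) with ten times the cases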
answered Oct 15 at 11:05
cbeleites
14.9k reputation, 31 silver badges, 60 bronze badges
Too bad I can give only a single +1. Publishing negative results, i.e. proving that something does not work, is very important as it saves other scientists going down a path that does lead nowhere.
– Dohn Joe
Oct 15 at 12:21
Citation of Hazon's razor relevant and valued: Never attribute to malice that which is adequately explained by stupidity
– XavierStuvw
Oct 15 at 15:58
@XavierStuvw you mean Hanlon's razor?
– slebetman
Oct 17 at 1:03
@slebetman Thanks, I did as per the link target, but it was too late to edit when I realized that my wires were crossed. This note lends the possibility to add a corollary or identify a premise: malice and stupidity can be equally obnoxious
– XavierStuvw
Oct 17 at 8:09
You can write that you used your implementation of the competing method for your results, and that you were not able to reproduce the published results. Make your code available so people can check.
It seems that the authors of the other papers didn't publish their code, so nobody can say you should've used that.
answered Oct 14 at 14:59
Lewian
4214 bronze badges
In the first place, you should discuss this with your supervisors. The code for papers is often rushed and unfinished, and what works on one machine may not work on another for a number of reasons. The most reasonable way forward is to let your supervisors know that you implemented both methods, communicated with the original authors (mention only non-confidential things, say that some things are confidential, or ask the authors for permission to discuss the implementation with your supervisor), and yet did not reach the claimed performance. In their senior academic capacity, they are better equipped to decide what to do with regard to the politics of the department/field/research teams, are likely to get quicker and more detailed responses from the authors of the papers, and can handle potential fallout should anything go wrong in the process. I would not advise pursuing this matter on your own; if you have doubts about something this important to your project, it is reasonable to seek their advice, and they will understand that.
answered Oct 14 at 14:48
Jericho Jones
49 reputation, 1 bronze badge
"ask authors for permission to discuss the implementation with your supervisor" 'confidential' rather means that the code you should be published.
– FooBar
Oct 15 at 7:05
I think that @jericho-jones answer is the most appropriate one. Also, take this as a lesson. Use good coding practices. Check your work into a source code repository. Tag your working code so you can easily back out changes. Make everything reproducible, including processing any raw data. Make the raw data read-only using file permissions, deny database update/insert/delete statements, etc. Make sure your code builds every day, or every other day. Make sure a change builds and passes all tests, and reproduces already known results before checking into your source code repository.
– Daisuke Aramaki
Oct 16 at 13:06
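A minimal sketch of the kind of regression check described in the comment above (the module name, config path, reference file and metric names are hypothetical placeholders for your own project, not an existing API):

    # test_reproduce.py -- hypothetical pytest regression check.
    # Re-runs the pipeline with a fixed seed and compares the metrics against
    # numbers recorded earlier in reference_results.json.
    import json
    import math

    from mypipeline import run_pipeline  # placeholder for your own entry point

    def test_results_match_reference():
        metrics = run_pipeline(config="configs/baseline.yaml", seed=42)
        with open("reference_results.json") as f:
            reference = json.load(f)
        for name, expected in reference.items():
            assert math.isclose(metrics[name], expected, rel_tol=1e-6), (
                f"{name}: got {metrics[name]}, reference says {expected}"
            )

Running such a check before every commit is one way to notice early when a change silently alters already-published numbers.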
In addition to the other answers, you should consider publishing your re-implementation. Then any reviewers can check if they think your results are plausible or if they spot a flaw in your re-implementation.
In the first case, it is right to say "We implemented paper X, but could not reproduce the claimed efficiency"; in the second case, the flaw found by the reviewer may help you improve your re-implementation so that you achieve a similar result.
Most reviewers will not debug your code, but you will have made your best effort to allow anyone to verify your claim of lower efficiency, and at least your paper is as honest as possible.
If the algorithm is interesting, publishing an open-source version may also attract users who point out issues with your code (or contribute improvements). But make sure not to stay too close to the confidential code, as the original authors may claim copyright infringement.
You may use clean-room reverse engineering with another person, or at least do it yourself: use the given code only to write down the parts missing from the paper, then reimplement from that documentation rather than from the code.
answered Oct 17 at 8:43
allo
2,441 reputation, 1 gold badge, 6 silver badges, 18 bronze badges
What is the measure of "performance" you are referring to, and could its definition be the source of this 20% difference? As a software developer I'm used to seeing such performance gains (in runtime) just from small but significant optimization tweaks that are very platform dependent.
– mvds
Oct 15 at 7:11
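One concrete (made-up) illustration of this point: the same set of predictions, scored under three common definitions of "performance", can already differ by roughly that much.

    # The same predictions scored under three common definitions of "performance".
    # With imbalanced classes the reported number shifts by about 20 points
    # without any change to the model at all.
    from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score

    y_true = [0] * 90 + [1] * 10   # imbalanced ground truth
    y_pred = [0] * 95 + [1] * 5    # a model that mostly predicts the majority class

    print("accuracy:         ", accuracy_score(y_true, y_pred))           # 0.95
    print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))  # 0.75
    print("macro F1:         ", f1_score(y_true, y_pred, average="macro"))  # ~0.82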
I can't really dispute that. But it's still proving the old theories wrong that makes progress. Like when we figured out Newton's force/acceleration ideas really are insufficient when approaching c. What I'm trying to say is: don't be afraid to disagree.
– Gloweye
Oct 15 at 9:05
Before assuming the worst (academic misconduct), you could take into account that there may be subtle differences (optimiser? better data structures?) that caused the 20% gain. I once had a student code something and, for fun, coded it myself. My code was a factor of 1000 faster for a more complicated case. It was not just the language, but the way I coded it that gained this speedup. Of course, if their code, with the original data, does not run that fast for you, then there may be something wrong. Again, it may be an honest mistake rather than misconduct. Report your results as a side remark.
– Captain Emacs
Oct 15 at 11:20
3rd option: they made an honest mistake.
– daisy
Oct 15 at 14:34
"it turns out they they are not using the data they claim in their the paper"
Can we get some more details about this (or more emphasis of that particular part of the question)? The answers I've seen so far seem to assume that the differences might be due to an honest mistake, but If they published results on one dataset and claimed they were for another, that needs to be reported to the editor.– Ray
Oct 16 at 13:23