Thanks

Glad the post is useful. Are you looking for the initial calling commands or the joint recalling steps? For the initial calls, both are in bcbio-nextgen (FreeBayes: https://github.com/chapmanb/bcbio-nextgen/blob/9b3e9e167cc26be203d0731c45ea13e318718567/bcbio/variation/freebayes.py#L121 and GATK: https://github.com/chapmanb/bcbio-nextgen/blob/9b3e9e167cc26be203d0731c45ea13e318718567/bcbio/variation/gatk.py#L71). For joint recalling, the GATK steps are in bcbio (https://github.com/chapmanb/bcbio-nextgen/blob/master/bcbio/variation/gatkjoint.py) and the FreeBayes steps are in bcbio.variation.recall (https://github.com/chapmanb/bcbio.variation.recall/blob/48b06869b58eebfb2380ec728605ec18e9bfac0f/src/bcbio/variation/recall/square.clj#L62). Hope this helps.

Great post as usual. Can you tell me where I might find the exact commands used for FreeBayes and GATK in this comparison? Thank you.

The difference in numbers between this and the minimal post comes down to scale: this is whole genome, while the minimal BAM preparation analysis is on exome. We’ve worked a lot on scaling in the meantime and are now able to run these reasonably fast on whole genomes. Hope this helps.

Very nice analysis. Can you tell me if Discordant(shared) means true-negative?

If it is indeed true negative, how come specificity here is very different from the “minimal bam preparation pipeline” post?

Thanks a lot

There are 50x coverage inputs from Platinum Genomes (http://www.illumina.com/platinumgenomes/) available in SRA. The specific wgets to retrieve the fastqs are here: https://github.com/chapmanb/bcbio-nextgen/blob/master/config/examples/NA12878-trio-sv-getdata.sh Thanks much

I put the final callsets from all 4 callers on an S3 bucket here:

http://bcbio.s3-website-us-east-1.amazonaws.com/jointeval/

Let me know if you have any questions. Thanks much for following up with these.
