I have read count data that I want to convert into RPKM values. For this conversion I need the gene length.
Does the gene length need to be calculated as the sum of the coding exon lengths, or are there other ways to define it?
I know that the gene length can be taken from the GENCODE v19 GTF file. Could you please tell me how that Gene_length is calculated?
My data is RNA-Seq data. I'm not sure whether I need to include UTRs in this calculation or only exons.
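To make the question concrete, here is a minimal sketch of one convention I have come across: take every feature of one type for a gene, merge the overlapping intervals, and sum the merged lengths. As far as I understand, `exon` features in a GENCODE GTF include the UTRs, while `CDS` features cover only the coding part, so the choice of feature type decides whether UTRs are counted. The function names and the toy records are my own illustration, not taken from any particular tool.

```python
import re
from collections import defaultdict


def merged_length(intervals):
    """Total bases covered by 1-based inclusive intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged block (if any) and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def gene_lengths_from_gtf(lines, feature="exon"):
    """Map gene_id -> merged length of all `feature` records (hypothetical helper)."""
    per_gene = defaultdict(list)
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature:
            continue
        match = re.search(r'gene_id "([^"]+)"', fields[8])
        if match:
            per_gene[match.group(1)].append((int(fields[3]), int(fields[4])))
    return {gene: merged_length(iv) for gene, iv in per_gene.items()}


# Toy GTF records (made up for illustration): two overlapping exons,
# a third separate exon, and one CDS record for the same gene.
toy_gtf = [
    'chr1\ttoy\texon\t100\t199\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t150\t249\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
    'chr1\ttoy\texon\t300\t349\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\tCDS\t120\t199\t.\t+\t0\tgene_id "G1"; transcript_id "T1";',
]
exon_lengths = gene_lengths_from_gtf(toy_gtf, feature="exon")
cds_lengths = gene_lengths_from_gtf(toy_gtf, feature="CDS")
```

With these toy records the merged exon length of G1 is 200 bp and the CDS length is 80 bp, which is exactly the difference I am asking about.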
On GitHub I have seen RPKM calculated from count data using the Gene_length from the GENCODE GTF file. Do you think this is the right way to calculate it?
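For reference, this is how I understand the RPKM calculation itself: reads per kilobase of gene length per million mapped reads, i.e. count × 10^9 / (total reads × gene length in bp). The sketch below assumes that the library size is simply the sum of the counts in the table, which may not match the total number of mapped reads; the function name and the numbers are made up for illustration.

```python
def rpkm(counts, lengths_bp):
    """Return RPKM per gene from raw counts and gene lengths in base pairs.

    Assumption: library size = sum of all counts passed in.
    """
    total_reads = sum(counts.values())
    if total_reads == 0:
        raise ValueError("total read count is zero; RPKM is undefined")
    return {
        gene: count * 1e9 / (total_reads * lengths_bp[gene])
        for gene, count in counts.items()
    }


# Made-up counts and lengths for three genes.
toy_counts = {"G1": 100, "G2": 300, "G3": 600}
toy_lengths = {"G1": 1000, "G2": 2000, "G3": 3000}
toy_rpkm = rpkm(toy_counts, toy_lengths)
```

Whatever length definition is correct would simply be plugged in as `lengths_bp` here, so the gene length is the only part I am unsure about.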
As for why RPKM: it's not for differential analysis. For TNBC subtyping they use microarray data, and I would like to try it with RNA-Seq data. So I'm trying out different approaches to find the right one.