Gene-combining read-through transcription in the human genome: new genes or alternative transcripts?
During the gene annotation of human chromosome 11, we identified at least 11 cases of read-through transcription in which two distinct neighboring (child) genes were combined by a unique (parent) mRNA spanning the entire length of both children. Read-through transcription is well established in higher eukaryotes: some genes have multiple polyadenylation sites, leading to transcripts that extend for variable distances (a few hundred nucleotides to several kilobases) into the 3' flanking region of the gene. However, we were able to find few instances of the gene-combining type of read-through transcription described here. Either it is a relatively rare phenomenon in the human genome, or it has not been well characterized because uniform genome-wide annotation is currently lacking. Only a few similar cases are currently described in NCBI Entrez Gene, including SEPT5-GP1BB and PRR5-ARHGAP8 (chr22) and DRG2-MYO15A (chr17).
Of the eleven cases we found on chromosome 11, only two likely result in a protein fusion product: TRIM6-TRIM34 and BSCL2-HNRPUL2. The gene (or transcript) TRIM6-TRIM34 combines the neighboring genes TRIM6 and TRIM34 and is confirmed by mRNA AB039903. In three cases, the protein product of the read-through transcript is identical or nearly identical to that of one of its children. In the remaining cases, the predicted protein of the read-through transcript is only partially identical to one of its children or is completely novel. Most of these cases were supported by only a single mRNA or a few ESTs.
The following two questions will be addressed: 1) How common are such read-through transcripts in the human genome? 2) Should they be considered alternative transcripts or separate genes? Whether these read-through transcripts affect the regulation of their children or provide additional function remains to be investigated.
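As an illustration of the parent/child definition used above, the following minimal sketch flags a transcript as a candidate read-through (parent) when it spans the full length of two neighboring, non-overlapping (child) genes on the same chromosome and strand. It is not the annotation procedure used in this work. The `Feature` type, the function names, the gene and mRNA names, and all coordinates are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    """A gene or transcript as a stranded genomic interval (hypothetical type)."""
    name: str
    chrom: str
    strand: str
    start: int
    end: int


def spans(parent: Feature, child: Feature) -> bool:
    """True if parent covers the entire length of child on the same strand."""
    return (
        parent.chrom == child.chrom
        and parent.strand == child.strand
        and parent.start <= child.start
        and parent.end >= child.end
    )


def find_read_through(
    transcripts: List[Feature], genes: List[Feature]
) -> List[Tuple[str, str, str]]:
    """Return (parent, upstream child, downstream child) name triples."""
    hits = []
    for t in transcripts:
        # Genes fully covered by this transcript, in genomic order.
        covered = sorted((g for g in genes if spans(t, g)), key=lambda g: g.start)
        # Report consecutive covered genes that do not overlap each other.
        for left, right in zip(covered, covered[1:]):
            if left.end < right.start:
                hits.append((t.name, left.name, right.name))
    return hits


# Hypothetical coordinates, not taken from any real annotation.
genes = [
    Feature("GENE_A", "chr11", "+", 1000, 5000),
    Feature("GENE_B", "chr11", "+", 6000, 9000),
    Feature("GENE_C", "chr11", "-", 9500, 12000),
]
transcripts = [
    Feature("mRNA_1", "chr11", "+", 1000, 9000),  # spans GENE_A and GENE_B
    Feature("mRNA_2", "chr11", "+", 1000, 5000),  # covers GENE_A only
]

candidates = find_read_through(transcripts, genes)
print(candidates)
```

A check of this kind only identifies candidates by coordinates. Deciding whether a candidate encodes a fusion protein, a product identical to one child, or a novel protein would still require examining the predicted reading frame, as described above.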