About 8% of the human genome is composed of Human Endogenous Retroviruses (HERVs), remnants of ancestral germline infections by exogenous retroviruses that have been vertically transmitted as Mendelian characters. The HML-6 group, a member of the class II Betaretrovirus-like elements, includes several proviral loci with increased transcriptional activity in cancer, and at least two elements known to retain an intact open reading frame (ORF) and to encode small proteins: ERVK3-1, which is expressed in various healthy tissues, and HERV-K-MEL, a small Env peptide expressed in samples of cutaneous and ocular melanoma but not in normal tissues.
Importance: We report the distribution and genetic composition of 66 HML-6 elements. We analyzed the phylogeny of the HML-6 sequences and identified two main clusters. We provide the first description of a Rec domain within the env sequence of 23 HML-6 elements. A Rec domain was also predicted within the ERVK3-1 transcript sequence, revealing that this domain is expressed in various healthy tissues. Evidence about the context of insertion and the co-localization of 19 HML-6 elements with functional human genes is also reported, including the element at locus 16p11.2, whose 5′ LTR overlaps the exon of one transcript variant of a cellular zinc-finger gene that is up-regulated and involved in hepatocellular carcinoma. The present work provides the first complete overview of the HML-6 elements in GRCh37 (hg19), describing the structure, phylogeny, and genomic context of insertion of each locus. This information allows a better understanding of the genetics of one of the most expressed HERV groups in the human genome.