UPF0488 is a protein that in humans is encoded by the C8orf33 (chromosome 8 open reading frame 33) gene. C8orf33 is a protein-coding gene whose function is currently unknown.

C8orf33
Identifiers
Aliases: C8orf33, chromosome 8 open reading frame 33
External IDs: MGI: 2152337; HomoloGene: 11320; GeneCards: C8orf33; OMA: C8orf33 - orthologs
Orthologs
Species: human; mouse
Ensembl: ENSG00000182307 (human)[1]; ENSMUSG00000063236 (mouse)[2]
RefSeq (mRNA): NM_023080 (human); NM_054099, NM_001347540 (mouse)
RefSeq (protein): NP_075568 (human); NP_001334469, NP_473440 (mouse)
Location (UCSC): Chr 8: 145.05 – 145.07 Mb (human); Chr 15: 76.83 – 76.84 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Tissue and subcellular distribution

The UPF0488 protein is expressed at low to moderate levels in most tissues, with some exceptions.[5] It is predicted to localize to the nucleus and mitochondrion, though several orthologs are also predicted to localize to the cytosol; in addition, experimental evidence suggests that human C8orf33 may localize to peroxisomes. Expression of the gene is up-regulated after lithium exposure, and C8orf33 is significantly up-regulated in breast cancer drug treatment.[6]

Post-translational modification

Several post-translational modifications, including phosphorylation, methylation, and acetylation, are predicted.[7] Annotated modified residues include N-acetylalanine, omega-N-methylarginine, and phosphoserine.[8]

Gene

This gene has 5 transcripts (splice variants) and 62 orthologues, and is a member of 1 Ensembl protein family. It is a member of the human CCDS set: CCDS34974.1.[9] Expression profiling showed that C8orf33 was over-expressed after lithium exposure.[10]

C8orf33 (UPF0488) has 31 alternatively spliced exons that combine into 13 different transcript variants; the X1 variant is the longest and appears to have the greatest identity.

Figure: Human tissue RNA sequencing of UPF0488.

Transcript

UPF0488 has 5 transcript splice variants. Among common haplotype alleles of the gene, the haplotype frequency is 96.3% for one variant site. The primary transcript is 3,593 bp, while a similar variant is 1,666 bp. The predicted mRNA secondary structures of the 5' and 3' UTRs have different fold energies. The 5' UTR consists of 54 bases and has a fold energy of -21.20, or -0.393 per base. The 3' UTR consists of 1,873 bases and has a fold energy of -646.10, or -0.345 per base.[11]
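
The per-base energies quoted above follow from dividing each fold energy by the length of the corresponding UTR. A minimal Python sketch of that arithmetic, using only the figures given in this section (the function name is illustrative, not taken from any folding tool):

```python
def energy_per_base(fold_energy: float, length: int) -> float:
    """Return the fold energy divided by the number of bases."""
    if length <= 0:
        raise ValueError("length must be positive")
    return fold_energy / length


# Fold energies and lengths as quoted in this section.
utr5 = energy_per_base(-21.20, 54)
utr3 = energy_per_base(-646.10, 1873)

print(f"5' UTR: {utr5:.3f} per base")
print(f"3' UTR: {utr3:.3f} per base")
```

Normalizing by length makes the two regions comparable: although the 3' UTR has a far larger total fold energy, its energy per base is slightly weaker than that of the 5' UTR.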

Expression

According to microarray-assessed tissue expression analysis from NCBI GEO, C8orf33 has average expression levels in most tissues, with exceptions including the thyroid gland and parathyroid gland. Expression appears to be low in the pancreas, small intestine and other digestive organs, and relatively higher in the kidney.[11]

Approximate expression patterns have also been inferred from EST sources. The Norway rat ortholog, a putative protein-coding gene, is represented by 30 ESTs from 20 cDNA libraries, with EST representation biased toward the fetus. Gene expression appears to increase in the obesity-resistant categories.

Promoter

The promoter region of C8orf33 covers 1,191 base pairs of DNA and contains over 700 potential transcription factor binding sites. Fifteen transcription factors with highly conserved binding sites across the C8orf33 promoter regions of multiple species were selected (see Annotated Promoter Section). One of these, CDF1 (Cycling DOF Factor 1), physically interacts with FKF1, and the CDF1 protein is more stable in FKF1 mutants.[12] Another, transcription factor II B (TFIIB), is a general transcription factor involved in the formation of the RNA polymerase II preinitiation complex (PIC).[13]

Protein

The isoelectric point (pI) of the UPF0488 protein is 9.16, based on an analysis of the isoelectric point according to different scales for individual proteins. This is higher than the average pI of 6.81 for the human proteome. The net charge was determined using the values given in Lehninger's Biochemistry. The precursor protein has a molecular weight of approximately 24.99 kDa. It contains repeats at residues 149 to 166 and 167 to 186, although the repeats show a high degree of degeneracy.[14]
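
An isoelectric point such as the one above is typically estimated by finding the pH at which the computed net charge of the sequence is zero. The following is a minimal sketch of that calculation, not the tool that produced the figure above. The pKa values are an assumption (a set commonly attributed to Lehninger's textbook), and the example sequence is a made-up peptide, not the UPF0488 sequence.

```python
# Assumed pKa values (commonly attributed to Lehninger); illustrative only.
POSITIVE_PKA = {"K": 10.5, "R": 12.4, "H": 6.0}
NEGATIVE_PKA = {"D": 3.86, "E": 4.25, "C": 8.33, "Y": 10.0}
N_TERM_PKA = 9.69
C_TERM_PKA = 2.34


def net_charge(sequence: str, ph: float) -> float:
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    # Basic groups contribute +1 times their protonated fraction.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    # Acidic groups contribute -1 times their deprotonated fraction.
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Bisect on pH in [0, 14] for the point of zero net charge."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge falls as pH rises, so a positive charge means pI is higher.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


example = "MAARRLKPAE"  # hypothetical peptide, not UPF0488
print(round(isoelectric_point(example), 2))
```

Different pKa scales shift the result, which is why published pI values for the same protein can differ between tools.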

UPF0488 is alanine-rich relative to other proteins and is low in all other amino acids except arginine, leucine, and proline.

Homology and evolution

The evolutionary lineage of UPF0488 can be traced back as far as invertebrates, and the protein has evolved at a rate greater than that of fibrinogen.

Graph: divergence of UPF0488 over time compared with fibrinogen and cytochrome c.

An alignment using the SDSC Biology Workbench gives a 27.7% match with Danio rerio. The ALIGN program calculates a global alignment of two sequences and gives a global alignment score of 215.[15]
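
A global alignment score of the kind reported above comes from dynamic programming over the two sequences. The sketch below uses the basic Needleman-Wunsch recurrence with a simple match, mismatch and linear gap scheme. It is not the ALIGN program, which uses a linear-space method[15] with its own scoring parameters, so its scores are not comparable to the value of 215 quoted above. The scoring values and the two short sequences are illustrative assumptions.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned against a gap
            left = curr[j - 1] + gap  # b[j-1] aligned against a gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


# Hypothetical short peptides, not real UPF0488 sequences.
print(global_alignment_score("MAARLK", "MAARLK"))
print(global_alignment_score("MAARLK", "MARLK"))
```

Only two rows of the score table are kept at a time, which is enough to obtain the score; recovering the alignment itself would need either the full table or a divide-and-conquer approach.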

The mRNA of UPF0488 has a very high level of degeneracy across organisms. Sequences with identity to the human mRNA, even at very low levels, could be identified only in closely related organisms. The protein, however, has far more distant relatives, including several invertebrates. Protein alignments for Homo sapiens UPF0488 were performed using the SDSC Biology Workbench against several taxa, including vertebrates such as Mammalia, Reptilia and Aves, and invertebrates such as Insecta. The protein sequences of UPF0488 are very highly conserved among close relatives of Homo sapiens such as Gorilla gorilla gorilla (gorilla). Protein sequence similarity decreases as divergence time (MYA) increases (table of homologs).

Function

C8orf33 activity has been found to be associated with the G protein-coupled receptor signaling pathway, neuroactive ligand-receptor interaction, the calcium signaling pathway and regulation of the actin cytoskeleton. The following substances interact with UPF0488: 7,8-dihydro-7,8-dihydroxybenzo(a)pyrene 9,10-oxide, benzo(a)pyrene, methotrexate, and vitamin E.[16][17]

Pathology

The expression of the UPF0488 gene increases after treatment with cephaloridine, a semisynthetic derivative of cephalosporin C that inhibits gluconeogenesis in both target (kidney) and non-target (liver) organs.[12]

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000182307. Ensembl, May 2017.
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000063236. Ensembl, May 2017.
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ "C8orf33 chromosome 8 open reading frame 33 [Homo sapiens]". Entrez Gene.
  6. ^ Ma C, Chen HI, Flores M, Huang Y, Chen Y (2013). "BRCA-Monet: a breast cancer specific drug treatment mode-of-action network for treatment effective prediction using large scale microarray database". BMC Systems Biology. 7 (Suppl 5): S5. doi:10.1186/1752-0509-7-S5-S5. PMC 4029357. PMID 24564956.
  7. ^ "UPF0488 protein C8orf33 [Homo sapiens]". Entrez Protein.
  8. ^ "C8orf33". Gene Cards.
  9. ^ "Gene: C8orf33 ENSG00000182307". Ensembl. European Bioinformatics Institute – European Molecular Biology Laboratory.
  10. ^ Aitchison K, Serretti A, Goldman D, Curran S, Drago A, Malhotra AK (2009). "The 8th annual pharmacogenetics in psychiatry meeting report". The Pharmacogenomics Journal. 9 (6): 358–61. doi:10.1038/tpj.2009.47. PMC 2945913. PMID 19841640.
  11. ^ a b "C8orf33 tissue". The Human Protein Atlas.
  12. ^ a b Goldstein RS, Smith PF, Tarloff JB, Contardi L, Rush GF, Hook JB (1988). "Biochemical mechanisms of cephaloridine nephrotoxicity". Life Sciences. 42 (19): 1809–16. doi:10.1016/0024-3205(88)90018-5. PMID 3285106.
  13. ^ Lewin B (2004). Genes VIII (8th ed.). Upper Saddle River, NJ: Pearson/Prentice Hall. pp. 636–637. ISBN 978-0-13-143981-8.
  14. ^ "Detection and alignment of repeats in protein sequences". Radar.
  15. ^ Myers EW, Miller W (1988). "Optimal alignments in linear space". Computer Applications in the Biosciences. 4 (1): 11–7. doi:10.1093/bioinformatics/4.1.11. PMID 3382986.
  16. ^ "C8ORF33". Comparative Toxicogenomics Database.
  17. ^ Squassina A, Manchia M, Del Zompo M (2010). "Pharmacogenomics of mood stabilizers in the treatment of bipolar disorder". Human Genomics and Proteomics. 2010: 159761. doi:10.4061/2010/159761. PMC 2958627. PMID 20981231.