Welcome to SmProt!

"Small proteins" is the general term for proteins shorter than 100 amino acids. The SmProt database contains records of small proteins encoded by genes, especially those encoded by non-coding RNA genes. The small proteins were collected from the literature and from mass spectrometry (MS) and ribosome profiling data generated in many species, including human, mouse, zebrafish, yeast, and fruitfly. SmProt also records features of each collected small protein, including its sequence, data source, genomic location, tissue localization or source cell line, and other detailed information.

How to cite? Hao Y., et al. 2017. SmProt: a database of small proteins encoded by annotated coding and non-coding RNA loci. Brief Bioinform bbx005.


Search hints: gene symbol or gene ID from a related database (e.g., NONCODE, RefSeq, ENSEMBL), cell line or tissue, PubMed ID (PMID), ORF type, or gene type.


Species             Num
human               167,785
fruitfly            39,015
C.elegans           18,357
mouse               15,581
rat                 8,128
zebrafish           2,994
yeast               1,875
E.coli              1,275

Data Sources        Num
MS data             117,099
Literature Mining   81,454
Ribosome profiling  53,459
Known Databases     2,998

ORF Types           Num
sORF                223,077
uORF                29,331
dORF                2,495
NA                  107


June 2016: 1st public release.
December 2015: Started to collect small proteins from the literature.
June 2015: Started to collect MS data sets and ribosome profiling data sets.
