Based on Protein Language Models
This tool implements MPB-EXP, which predicts protein expression levels from protein sequences alone. It provides 88 species-specific models plus one general model (MPEPE), enabling host-aware prediction.
T. Liu, Y. Zhang, Y. Li, G. Xu, H. Gao, P. Wang, T. Tu, H. Luo, N. Wu, B. Yao, B. Liu, F. Guan, H. Huang, J. Tian, Effective Gene Expression Prediction and Optimization from Protein Sequences. Adv. Sci. 2025, 12, 2407664. https://doi.org/10.1002/advs.202407664
Z. Ding, F. Guan, G. Xu, Y. Wang, Y. Yan, W. Zhang, N. Wu, B. Yao, H. Huang, T. Tuller, J. Tian, MPEPE, a predictive approach to improve protein expression in E. coli based on deep learning. Comput. Struct. Biotechnol. J. 2022, 20, 1142-1153. https://doi.org/10.1016/j.csbj.2022.02.030 (PMID: 35317239; PMCID: PMC8913310)
MP-TRANS / MP-BERT: Transformer-based protein language model backbone.
Fine-tuning: species-specific expression prediction models (88 species) + a pan-species model (MPEPE).
Data: PaxDB-derived protein abundance/expression data and curated labels.
For questions about tool conceptualization, please contact tianjian@caas.cn.
1. Select the target host species model (one of the 88 species-specific models) or the general model (MPEPE). Prepare a CSV file with the protein sequences in the column named "Seq".
2. Input format: CSV file with columns id, seq. Output format: CSV file with columns id, sequence, predicted_label, probabilities(HE-Value), host_model.
3. After submission, the prediction runs on the server and the results CSV is downloaded automatically.
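The input file described in the steps above can be prepared with a short script. The following is a minimal sketch using only Python's standard library. The names clean_sequence and write_input_csv, the file name input.csv, and the example sequences are hypothetical, and the check against the 20 standard amino-acid letters is an assumption, not a documented requirement of the server. The header follows step 2 (id, seq); if the server expects "Seq" as written in step 1, pass a different columns argument.

```python
import csv

# The 20 standard amino-acid one-letter codes. This alphabet is an assumption;
# the server's actual validation rules are not documented here.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(seq):
    """Remove whitespace, upper-case the sequence, and reject unexpected letters."""
    cleaned = "".join(seq.split()).upper()
    bad = sorted(set(cleaned) - STANDARD_AA)
    if not cleaned or bad:
        raise ValueError(f"invalid sequence (unexpected characters: {bad})")
    return cleaned


def write_input_csv(records, path, columns=("id", "seq")):
    """Write (identifier, sequence) pairs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        for identifier, seq in records:
            writer.writerow([identifier, clean_sequence(seq)])


if __name__ == "__main__":
    # Hypothetical example sequences, for illustration only.
    example = [
        ("protein_1", "MKTAYIAKQRQISFVKSHFSRQ"),
        ("protein_2", "mstnpkpqrk tkrntnrrpq"),
    ]
    write_input_csv(example, "input.csv")
```

Cleaning each sequence before writing means a malformed entry is caught locally instead of after upload.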
Note: The web version predicts only the first 10 sequences of each uploaded file.
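Given the 10-sequence limit, one way to process a larger set through the web version is to upload it as several smaller files. The sketch below splits an input CSV into files of at most 10 data rows each and repeats the header in every file. The function name split_csv, the constant WEB_LIMIT, and the "_partN" naming scheme are hypothetical choices for illustration.

```python
import csv
from pathlib import Path

WEB_LIMIT = 10  # sequences processed per uploaded file, per the note above


def split_csv(source, out_dir, batch_size=WEB_LIMIT):
    """Split a CSV into files of at most `batch_size` data rows, keeping the header."""
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with source.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:  # empty file: nothing to split
            return []
        rows = [row for row in reader if row]  # skip blank lines

    paths = []
    for start in range(0, len(rows), batch_size):
        part = out_dir / f"{source.stem}_part{start // batch_size + 1}.csv"
        with part.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(rows[start:start + batch_size])
        paths.append(part)
    return paths
```

Calling split_csv("input.csv", "batches") returns the list of written file paths, each of which can then be uploaded separately.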