MPB-EXP: Protein Expression Level Prediction Across 88 Species Based on Protein Language Models

Tool Overview

Tool Source

This tool implements MPB-EXP for predicting protein expression levels using protein sequences only. It provides 88 species-specific models plus one general model (MPEPE), enabling host-aware prediction.

Citation

T. Liu, Y. Zhang, Y. Li, G. Xu, H. Gao, P. Wang, T. Tu, H. Luo, N. Wu, B. Yao, B. Liu, F. Guan, H. Huang, J. Tian, Effective Gene Expression Prediction and Optimization from Protein Sequences. Adv. Sci. 2025, 12, 2407664. https://doi.org/10.1002/advs.202407664

Z. Ding, F. Guan, G. Xu, Y. Wang, Y. Yan, W. Zhang, N. Wu, B. Yao, H. Huang, T. Tuller, J. Tian, MPEPE, a predictive approach to improve protein expression in E. coli based on deep learning. Comput. Struct. Biotechnol. J. 2022, 20, 1142-1153. https://doi.org/10.1016/j.csbj.2022.02.030. PMID: 35317239; PMCID: PMC8913310.

Methods and Reference Materials

MP-TRANS / MP-BERT: Transformer-based protein language model backbone.
Fine-tuning: species-specific expression prediction models (88 species) + a pan-species model (MPEPE).
Data: PaxDB-derived protein abundance/expression data and curated labels.

Getting Help

For questions about the design of this tool, please contact tianjian@caas.cn.

Usage Instructions

Explanation

1. Select the target host model: either one of the 88 species-specific models or the general model (MPEPE). Then prepare a CSV file with the protein sequences in a column named "Seq".

2. Input format: a CSV file with the columns id, seq. Output format: a CSV file with the columns id, sequence, predicted_label, probabilities(HE-Value), host_model.

3. After submission, the prediction runs on the server and the results CSV is downloaded automatically.
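
Once the results file has been downloaded, it can be inspected with a few lines of Python. The following is a minimal sketch, not part of the tool itself. It assumes the column names listed in step 2 and assumes that the probabilities(HE-Value) column holds a single number per row; the rows in the example are made up for illustration and are not real tool output.

```python
import csv
import io

# Column name as listed in the output format (step 2).
HE_COLUMN = "probabilities(HE-Value)"


def load_results(handle):
    """Read a results CSV and return its rows sorted by HE-Value, highest first."""
    rows = list(csv.DictReader(handle))
    for row in rows:
        # Assumption: the HE-Value column contains one number per row.
        row[HE_COLUMN] = float(row[HE_COLUMN])
    return sorted(rows, key=lambda r: r[HE_COLUMN], reverse=True)


# Made-up example content for illustration only (labels and values are hypothetical).
example = io.StringIO(
    "id,sequence,predicted_label,probabilities(HE-Value),host_model\n"
    "p2,MSLLTEVE,0,0.12,MPEPE\n"
    "p1,MKTAYIAK,1,0.91,MPEPE\n"
)

ranked = load_results(example)
for row in ranked:
    print(row["id"], row["predicted_label"], row[HE_COLUMN])
```

For a real file, pass an open file handle instead of the in-memory example, for instance `open("results.csv", newline="")`, where `results.csv` is a placeholder name.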

Note: The web version only predicts the first 10 sequences per uploaded file.
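
Because only the first 10 sequences of each upload are predicted, a larger set has to be split across several files. The sketch below shows one way to do this under stated assumptions: the function name `write_batches`, the output file naming, and the restriction to the 20 standard amino acid letters are illustrative choices, not requirements documented by the tool. The sequence column header is a parameter because this page writes it both as "seq" and as "Seq"; check the sample file for the exact spelling.

```python
import csv

MAX_PER_FILE = 10  # web-version limit stated in the note above
# Assumption: only the 20 standard amino acid letters are accepted.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def write_batches(records, prefix, seq_column="seq", batch_size=MAX_PER_FILE):
    """Write (id, sequence) pairs to prefix_1.csv, prefix_2.csv, ... and return the paths."""
    cleaned = []
    for seq_id, seq in records:
        seq = seq.strip().upper()
        bad = set(seq) - VALID_RESIDUES
        if not seq or bad:
            raise ValueError(f"{seq_id}: empty sequence or invalid residues {sorted(bad)}")
        cleaned.append((seq_id, seq))

    paths = []
    for start in range(0, len(cleaned), batch_size):
        path = f"{prefix}_{start // batch_size + 1}.csv"
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["id", seq_column])  # header row
            writer.writerows(cleaned[start:start + batch_size])
        paths.append(path)
    return paths
```

Calling `write_batches(records, "upload")` with 23 records would produce `upload_1.csv`, `upload_2.csv`, and `upload_3.csv`, each of which can be submitted separately.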

Host Species / Model Selection and Sequence Input Prediction Stage

Please select the host species model (88 species) or MPEPE model, and upload a CSV file containing protein sequences.

Please upload a CSV file containing amino acid sequences (column: Seq).