🧪 ConoServerParser: A Flexible Shiny Application for Filtering, Exploring, and Exporting Conotoxin Sequences from ConoServer

Jun 12, 2025·By Dany Domínguez Pérez

✨ Conus Snails and Conotoxins

Cone snails, marine gastropods of the genus Conus, produce a complex mixture of venom peptides called conopeptides. These peptides act with high potency and specificity on ion channels, receptors, and transporters in the nervous system. That specificity makes conopeptides valuable tools for physiological studies of neuroreceptors and promising leads for drug development. Their structural diversity, coupled with the rapid evolution of their gene superfamilies, also makes them of interest in evolutionary biology.

With over 700 Conus species worldwide, each potentially producing up to 200 distinct peptides, the number of conopeptides discovered to date is in the thousands. Many more remain to be characterized. This diversity supports research into the co-evolution of venom repertoires and their molecular targets. Additionally, conopeptides are commonly used in neurophysiology to probe receptor isoforms, leveraging the evolutionary conservation of ion channels and receptors.

Conopeptides are classified into two broad categories:

Disulfide-rich conotoxins (two or more disulfide bonds)
Disulfide-poor peptides (one or no disulfide bond)

These peptides are typically 10–40 amino acids long. Within ConoServer, classification relies on three schemes:

Pharmacological families (based on target specificity)
Gene superfamilies (signal peptide sequence similarity)
Cysteine frameworks (arrangement of cysteine residues in the mature region)
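
As a rough illustration of how a mature sequence relates to these categories, the sketch below counts cysteines and collapses a sequence into a cysteine pattern string of the kind used to label frameworks (for example CC-C-C). It assumes that every cysteine takes part in a disulfide bond, which is a simplification, and the function names are hypothetical rather than part of ConoServer or ConoServerParser. The two example sequences are the unmodified mature sequences of α-conotoxin GI and ω-conotoxin MVIIA.

```python
import re


def cysteine_pattern(mature_seq: str) -> str:
    """Collapse a mature peptide into its cysteine pattern, e.g. 'CC-C-C'."""
    # Runs of adjacent cysteines stay together; any residues between runs
    # are represented by a single dash.
    return "-".join(re.findall(r"C+", mature_seq.upper()))


def disulfide_class(mature_seq: str) -> str:
    """Crude class call, ASSUMING all cysteines are disulfide-bonded."""
    max_bonds = mature_seq.upper().count("C") // 2
    return "disulfide-rich" if max_bonds >= 2 else "disulfide-poor"


examples = {
    "alpha-GI": "ECCNPACGRHYSC",
    "omega-MVIIA": "CKGKGAKCSRLMYDCCTGSCRSGKC",
}
for name, seq in examples.items():
    print(name, cysteine_pattern(seq), disulfide_class(seq))
```

A pattern string alone does not give the disulfide connectivity, so it should be read as a label for the cysteine arrangement only.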

🎯 Pharmacological Targets: Diversity and Significance

Conotoxins are notable for their high specificity and potency in modulating both voltage-gated and ligand-gated ion channels in neurological and muscular systems.

Examples of pharmacological families:

μ-, μO-, and δ-conotoxins: Target voltage-gated Na⁺ channels by various mechanisms.
ω-conotoxins: Block voltage-gated Ca²⁺ channels. Ziconotide (Prialt), a synthetic form of ω-conotoxin MVIIA, is a non-opioid drug for severe chronic pain.
κ- and κM-conotoxins: Target K⁺ channels.
α-conotoxins: Antagonists of nicotinic acetylcholine receptors (nAChRs).
χ-conotoxins: Block the neuronal noradrenaline transporter.
ρ-conotoxins: Act on α1-adrenoceptors.
σ-conotoxins: Target serotonin-gated 5-HT3 ion channels.
ι-conotoxins: Modulate Na⁺ channels without delayed inactivation.
γ-conotoxins: Affect neuronal pacemaker cation currents.
τ-conotoxins: Interact with somatostatin receptors.
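
For scripting purposes, the list above can be held as a simple lookup table. The sketch below is only a convenience structure built from the entries in this article; the dictionary and function names are hypothetical and it is not an official ConoServer vocabulary.

```python
# Family symbol -> target, transcribed from the list above.
PHARMACOLOGICAL_TARGETS = {
    "μ": "voltage-gated Na+ channels",
    "μO": "voltage-gated Na+ channels",
    "δ": "voltage-gated Na+ channels",
    "ω": "voltage-gated Ca2+ channels",
    "κ": "K+ channels",
    "κM": "K+ channels",
    "α": "nicotinic acetylcholine receptors (nAChRs)",
    "χ": "neuronal noradrenaline transporter",
    "ρ": "α1-adrenoceptors",
    "σ": "serotonin-gated 5-HT3 ion channels",
    "ι": "Na+ channels",
    "γ": "neuronal pacemaker cation currents",
    "τ": "somatostatin receptors",
}


def families_for_target(keyword: str) -> list[str]:
    """Return the family symbols whose target description contains keyword."""
    key = keyword.lower()
    return [
        family
        for family, target in PHARMACOLOGICAL_TARGETS.items()
        if key in target.lower()
    ]


print(families_for_target("Na+"))
print(families_for_target("K+"))
```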

ConoServer is a curated database of conopeptide sequences and structures, hosted by the Institute for Molecular Bioscience (IMB), Australia. Created in 2007, it is maintained by researchers including David Craik, Jan-Christoph Westermann and Quentin Kaas (Kaas et al., 2008; 2012).

Data sources include peer-reviewed publications and databases such as UniProtKB, NCBI Nucleotide, and PDB. Manual curation covers:

Sequence analysis and annotation
Identification of gene superfamilies and cysteine frameworks
Mapping of mature peptides
Recording of post-translational modifications

Key Features:

Multi-criteria search: Based on taxonomy, classification, geography, and sequence.
Detailed entry cards: Cover protein, nucleotide, and structural data.
Statistical reports: Automatically updated data on species, structures, classifications, and citations.
Precursor sequence tools: ORF prediction, region detection, homolog search.
Mass spectrometry tools: PTM prediction and mass list analysis.

🔍 Identified Inconsistencies and Data Issues in ConoServer

Despite the comprehensive resources provided by ConoServer, practical limitations in data handling and interoperability can emerge during routine research workflows. These challenges often hinder the seamless transition from raw database information to advanced analysis.

Specific Challenges Encountered:

1. Inconsistent FASTA Data Handling: Retrieving sequences from ConoServer often runs into FASTA formatting problems. Copying and pasting sequences and headers into text files or other tools can introduce unrecognized characters or corrupt the > symbol that marks each FASTA header. This makes it difficult to obtain clean, bulk FASTA files ready for downstream computational analysis.

2. Workflow Bottleneck with Downstream Prediction Tools: A significant challenge was using ConoServer-derived sequence data with prediction tools such as ConoPrec, which predicts mature peptides. Even with meticulously cleaned FASTA files, these tools frequently failed to process multi-sequence input, often recognizing and analyzing only the first sequence in a file. In addition, there is no option to save the resulting mature peptides as a FASTA file. Large datasets therefore require either time-consuming, one-by-one manual processing or additional scripting to convert the tabular output into sequences.
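
The scripting step mentioned in the second challenge could look roughly like the sketch below, which turns tabular prediction output into a multi-record FASTA file. The column names `name` and `mature_peptide` are assumptions made for illustration; the actual columns of a ConoPrec export may differ and would need to be adjusted.

```python
import csv
import io


def table_to_fasta(csv_text: str,
                   id_column: str = "name",
                   seq_column: str = "mature_peptide") -> str:
    """Turn tabular prediction output into multi-record FASTA text.

    The default column names are hypothetical placeholders.
    """
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        seq = (row.get(seq_column) or "").strip()
        if not seq:
            continue  # skip rows without a predicted mature peptide
        records.append(f">{row[id_column].strip()}\n{seq}\n")
    return "".join(records)


example_csv = (
    "name,mature_peptide\n"
    "seq1,ECCNPACGRHYSC\n"
    "seq2,\n"
    "seq3,CKGKGAKCSRLMYDCCTGSCRSGKC\n"
)
print(table_to_fasta(example_csv), end="")
```

Rows with an empty sequence column are skipped rather than written as empty records, since many FASTA readers reject records without a sequence.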

💡 ConoServerParser: Complementary Conotoxin FASTA Analysis Suite for ConoServer Data

Although ConoServer is comprehensive, limitations remain when preparing its data for downstream workflows. ConoServerParser, a Shiny application, was developed to streamline the analysis of data exported from ConoServer.

Common Challenges Resolved:

FASTA formatting issues: Pasting sequences may introduce encoding errors (e.g., broken headers).
Complex filtering: Lack of intuitive filtering tools and insufficient metadata in exported headers.
Batch tool compatibility: Tools like ConoPrec often fail to process multiple sequences at once.
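
To make the FASTA formatting issue concrete, here is a minimal sketch of the kind of cleanup that pasted text may need. It is not the application's actual code: it assumes the damage consists of HTML entities, fullwidth or invisible Unicode characters, and stray whitespace, and the function name is hypothetical.

```python
import html
import unicodedata


def clean_fasta_text(raw: str) -> str:
    """Normalize pasted FASTA text so every header starts with a plain '>'."""
    text = html.unescape(raw)                   # '&gt;' becomes '>'
    text = unicodedata.normalize("NFKC", text)  # fullwidth '＞' becomes '>'
    cleaned = []
    for line in text.splitlines():
        # Drop byte-order marks, zero-width spaces and other non-printable
        # characters, then trim surrounding whitespace.
        line = "".join(ch for ch in line if ch.isprintable()).strip()
        if not line:
            continue
        if line.startswith(">"):
            cleaned.append(line)
        else:
            # Sequence line: keep letters only and uppercase them.
            # Note that this also drops symbols such as '*' or '-'.
            seq = "".join(ch for ch in line if ch.isalpha()).upper()
            if seq:
                cleaned.append(seq)
    return "".join(line + "\n" for line in cleaned)


pasted = "\ufeff&gt;P1|Conus geographus\u200b\nECCNPAC GRHYSC\r\n\n＞P2\nckgk\u00a0gak\n"
print(clean_fasta_text(pasted), end="")
```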

ConoServerParser Features:

• FASTA parsing and metadata extraction: Interprets ConoServer’s pipe-delimited headers and parses fields such as ConoID, species, gene superfamily, pharmacological family, cysteine framework, and evidence level into a clean table.
• Custom FASTA export: Generates FASTA headers from user-selected fields and custom delimiters for filtered or grouped downloads, supporting fine-tuned downstream integration.
• Header preview and export configuration: Previews the final FASTA headers so that users can control the included metadata and avoid oversized headers or unnecessary fields.
• Multi-format table export: Exports sequence metadata tables as CSV, TSV, or XLSX.
• ZIP archive generation: Produces downloadable ZIP archives grouped by metadata (e.g., superfamily or pharmacological family).
• Mature peptide FASTA generation: Accepts uploaded ConoPrec CSV output and extracts well-formatted mature sequences, working around ConoPrec’s multi-sequence limitations.
• Advanced interactive filtering: Selects, filters, and excludes entries on any column through an intuitive sidebar with dropdowns and search tools.
• Dynamic visual summaries: Creates interactive plotly bar plots that compare sequence distributions across superfamilies, pharmacological families, or evidence types.
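
The header parsing, custom header, and grouped ZIP features can be pictured with the short sketch below. ConoServerParser itself is a Shiny application, so this is not its code, only an illustration of the idea. The field order in `FIELDS`, the function names, and the example records are all assumptions; real ConoServer headers may carry different or additional fields.

```python
import io
import zipfile
from collections import defaultdict

# Hypothetical field order for a pipe-delimited header.
FIELDS = ["cono_id", "species", "gene_superfamily",
          "pharmacological_family", "cysteine_framework", "evidence"]


def parse_header(header: str) -> dict:
    """Split a pipe-delimited FASTA header into named fields."""
    values = [part.strip() for part in header.lstrip(">").split("|")]
    values += [""] * (len(FIELDS) - len(values))  # pad short headers
    return dict(zip(FIELDS, values))


def build_header(meta: dict, fields: list, delimiter: str = "|") -> str:
    """Rebuild a header from selected fields, using 'NA' for empty values."""
    return ">" + delimiter.join(meta[f] or "NA" for f in fields)


def zip_by_field(records, group_field, header_fields, delimiter="|") -> bytes:
    """Write one FASTA file per group into an in-memory ZIP archive.

    records is a list of (header, sequence) tuples. Group values are used
    directly as file names, so they are assumed to be filename-safe.
    """
    groups = defaultdict(list)
    for header, seq in records:
        meta = parse_header(header)
        new_header = build_header(meta, header_fields, delimiter)
        groups[meta[group_field] or "unclassified"].append(f"{new_header}\n{seq}\n")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for group, entries in sorted(groups.items()):
            archive.writestr(f"{group}.fasta", "".join(entries))
    return buffer.getvalue()


# Made-up records for illustration only.
records = [
    (">X1|Conus geographus|A|alpha|I|protein", "ECCNPACGRHYSC"),
    (">X2|Conus magus|O1|omega|VI/VII|protein", "CKGKGAKCSRLMYDCCTGSCRSGKC"),
    (">X3|Conus sp.|A", "GCCS"),
]
archive_bytes = zip_by_field(records, "gene_superfamily", ["cono_id", "species"])
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(archive.namelist())
```

Keeping the parsed metadata as a dictionary per record makes it straightforward to filter on any field before export, which mirrors the filter-then-download flow described in the feature list.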

🧾 Summary

ConoServer offers an invaluable foundation for conopeptide research. ConoServerParser extends this utility, addressing real-world bottlenecks in data handling and exploration. Together, these tools enhance the efficiency and reproducibility of workflows in pharmacology, evolutionary biology, and peptide synthesis.

📚 References

Kaas Q, Yu R, Jin AH, Dutertre S, Craik DJ. (2012). ConoServer: updated content, knowledge, and discovery tools in the conopeptide database. Nucleic Acids Research, 40(Database issue): D325–D330.
Kaas Q, Westermann JC, Halai R, Wang CK, Craik DJ. (2008). ConoServer, a database for conopeptide sequences and structures. Bioinformatics, 24(3): 445–446.
ConoServer. Accessed May 20, 2025. https://www.conoserver.org/
Intrathecal Pain Management. (2023). In Pain Management: Anesthesia, Analgesia, and Opioid Alternatives.