Computational Proteomics

Veit Schwämmle

Assistant professor
Protein Research Group
Department of Biochemistry and Molecular Biology
University of Southern Denmark
Email

Research

As a part of the Protein Research Group, I investigate post-translational modifications (PTMs) of proteins and their crosstalk. My main research focuses on quantifying PTM crosstalk and linking it to biological functions.
I apply statistical methods, develop algorithms, and run computer simulations. With these methods, I analyze experimental data from protein mass spectrometry and integrate them with other -omics data sets.


Software

Databases

Visit CrosstalkDB, which hosts quantitative data on crosstalk between PTMs on histone proteins measured by middle-down mass spectrometry.
The Elixir Tools and Data Service Registry, bio.tools, launched in January 2015, hosts details about thousands of databases and bioinformatics tools.

Shiny apps

The Shiny framework facilitates embedding R scripts into simple GUIs.
Some of my developments are now available as simple web applications (see Shiny Apps).
Try our new tool for variance-sensitive clustering of large data sets: Launch app

Proteomics data analysis

- R scripting for statistical analysis of proteomics and PTMomics data

- Software pipelines for full analysis

- Workflow composition and comparison from tool collection

- Quantitative assessment of the behavior of protein complexes


My main research focuses on the study of post-translational modifications (PTMs) and chromatin. I am mostly interested in deciphering the complex behavior and the biological consequences of crosstalk between these modifications. With my background in theoretical physics and computational modeling, I use methods adapted from statistical physics, bioinformatics, and general computer modeling, which I have previously applied to physical, biological, and linguistic systems.
For my full publication record, see Google Scholar.

Computational Proteomics

  • Review and Tutorial: For an overview of methods used in PTMomics, see (refs).

  • Characterization of PTM crosstalk: Large-scale estimation of crosstalk between nearby residues (refs). Interplay scores provide quantitative information about negative and positive crosstalk between PTMs on a peptide (refs).

  • Statistical tests: The complexity of high-throughput mass spectrometry experiments often leads to low replicate numbers and many missing values, especially when measuring (un)modified peptides. We implemented a combination of statistical tests that yields results with sufficiently high confidence (ref); a new and improved approach is under development. Launch app

  • Data clustering: We first provided a simple way to estimate appropriate parameter values for fuzzy c-means clustering (ref; old app: Launch app). The new tool VSClust additionally considers feature variance, leading to more accurate clustering results (just accepted in Bioinformatics). Launch app

  • Standardization and community efforts: See the proposed notation of proteoforms: ProForma (ref). As part of ELIXIR DK, we annotate proteomics tools for bio.tools and are part of the proteomics use case (see also white paper, ref).
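To make the idea of an interplay score concrete, here is a minimal sketch. It assumes the score is the log2 ratio of the observed co-occurrence frequency of two PTMs to the frequency expected if they occurred independently; the exact definition in the cited work may differ. The function name and arguments are hypothetical and chosen for illustration only.

```python
import math


def interplay_score(f_ab: float, f_a: float, f_b: float) -> float:
    """Log2 ratio of observed to expected co-occurrence of two PTMs.

    Assumed definition (illustration only):
        score = log2(f_ab / (f_a * f_b))
    f_ab : relative abundance of peptides carrying both PTMs
    f_a  : relative abundance of peptides carrying PTM A (with or without B)
    f_b  : relative abundance of peptides carrying PTM B (with or without A)
    """
    if min(f_ab, f_a, f_b) <= 0:
        raise ValueError("all relative abundances must be positive")
    return math.log2(f_ab / (f_a * f_b))


# Hypothetical abundances, not measured data.
independent = interplay_score(0.25, 0.5, 0.5)   # co-occurrence as expected
positive = interplay_score(0.40, 0.5, 0.5)      # marks co-occur more often
negative = interplay_score(0.10, 0.5, 0.5)      # marks tend to exclude each other
```

Under this assumed definition, a score above zero indicates positive crosstalk, a score below zero indicates negative crosstalk, and zero corresponds to independent occurrence.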

Chromatin Biology
  • Middle-down mass spectrometry: We developed a workflow to quantify PTMs on histone tails (refs).

  • CrosstalkDB: The web server collects and analyzes input files with quantitative data from middle-down and top-down mass spectrometry, and then statistically assesses the crosstalk between the measured PTMs (refs).
    Access CrosstalkDB

  • Computational models: Taking simple rules for writing, propagating and deleting histone PTMs on chromatin, we were able to reproduce global patterns measured by ChIP-seq experiments. The implementation of crosstalk rules results in a rich spatial and temporal behavior (ref).
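As a purely illustrative toy, and not the published model, rules for writing, propagating and deleting a single mark can be sketched as a stochastic update on a one-dimensional chain of nucleosomes. The rule set, the function name, and all parameter names and values below are assumptions made for this sketch; crosstalk between different marks is not included.

```python
import random


def simulate_marks(n_sites=100, steps=1000, p_write=0.01,
                   p_spread=0.3, p_erase=0.05, seed=0):
    """Toy simulation of one histone mark on a linear chain of sites.

    At each step one random site is picked:
      - a marked site is erased with probability p_erase; otherwise it
        propagates its mark to a random neighbor with probability p_spread
      - an unmarked site is written de novo with probability p_write
    All rules and rates are hypothetical.
    """
    rng = random.Random(seed)
    marks = [False] * n_sites
    for _ in range(steps):
        i = rng.randrange(n_sites)
        if marks[i]:
            if rng.random() < p_erase:
                marks[i] = False                 # deletion
            elif rng.random() < p_spread:
                j = i + rng.choice((-1, 1))      # pick left or right neighbor
                if 0 <= j < n_sites:
                    marks[j] = True              # propagation
        elif rng.random() < p_write:
            marks[i] = True                      # de novo writing
    return marks


# Fraction of marked sites after a longer run with the default toy rates.
final_state = simulate_marks(steps=20000)
marked_fraction = sum(final_state) / len(final_state)
```

Varying the three rates changes whether marks stay sparse or form contiguous domains, which is the kind of global pattern such rule-based models aim to explore.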

Complex Systems
  • Biological evolution: See my former studies of aging, sympatric speciation and competitive cellular automata (refs).

  • Simulations: Almost anything can be simulated on the computer, including sand dunes, opinion dynamics and linguistics (refs).

  • Statistical Mechanics: See my work on generalized entropies and Fokker-Planck equations (refs).


Data analysis of proteomics data

This course presents an overview of the main methods for analyzing data from peptide mass spectrometry and other -omics data.


Click on the picture for course material.



Biostatistics in R

Two courses (BMB830 and BMB831) at the Department of Biochemistry and Molecular Biology.


Click on the picture for more information.


Biostatistics in R I

Course description: SDU web page

Lecture I: Introduction
Corresponding R script
Lecture II: Basics
Corresponding R script
Lecture III: Arrays, matrices and data frames
Corresponding R script
Lecture IV: Data manipulation
Corresponding R script
Lecture V: Visual methods
Corresponding R script
Lecture VI: Basic statistics
Corresponding R script
Lecture VII: Data modeling
Corresponding R script
Lecture VIII: Statistical tests
Corresponding R script
Lecture IX: Multivariate analysis
Corresponding R script
Lecture X: Interactivity in R
Corresponding R script
Exercises

Biostatistics in R II

Course description: SDU web page

Lecture I: Data analysis of omics data: general aspects
Lecture II: Proteomics data
Lecture III: Transcriptomics data
Lecture IV: Epigenomics data
Lecture V: Metabolomics data
Lecture VI: Data interpretation

1st Year Bachelor Project

Functional analysis of a fish oil diet.


Click on the picture for course material.

VSClust: Variance-sensitive clustering
Find the source code here.
Detection of differentially regulated features
Find the source code here.
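For readers unfamiliar with fuzzy c-means, the clustering method mentioned in connection with VSClust on this page, the following is a minimal sketch of the standard algorithm in plain Python. It is not the VSClust implementation: the variance-sensitive extension is not shown, and the function name, parameters and example points are assumptions made for illustration.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point i belongs to cluster k, and each row sums to 1.
    m > 1 is the fuzzifier; larger values give softer assignments.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Start from random memberships, normalized per point.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers are means of the points weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)))
        # Memberships follow from squared distances to the new centers.
        for i, p in enumerate(points):
            d2 = [max(sum((p[d] - c[d]) ** 2 for d in range(dim)), 1e-12)
                  for c in centers]
            inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


# Two well-separated groups of hypothetical 2-D points.
demo_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
               (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
demo_centers, demo_memberships = fuzzy_c_means(demo_points, n_clusters=2)
```

Unlike hard clustering, each point receives a graded membership in every cluster, so points that fit no cluster well can be identified by their low maximum membership.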

Elixir
(European Infrastructure for Biological Information)

The Danish node of the Elixir consortium implements and maintains the registry of software tools for the life sciences (ref). The registry provides rich annotations by basing software descriptions on the EDAM ontology. I am involved in several projects that aim to improve and extend the curation of software tools, the proteomics use case, the automatic synthesis of workflows from the registry content, and the maintenance of the EDAM ontology.


EuBIC
(European Bioinformatics Initiative)

An initiative of bioinformaticians in Europe to improve the support and coordination of training and software development in proteomics informatics.
Conference: We are organizing conferences, hackathons and workshops in Computational Proteomics.
